OpenMS
OSWFile Class Reference

This class serves for reading in and writing OpenSWATH OSW files.

#include <OpenMS/FORMAT/OSWFile.h>

Classes

struct  PercolatorFeature
 

Public Types

enum class  OSWLevel { MS1 , MS2 , TRANSITION , SIZE_OF_OSWLEVEL }
 for Percolator data read/write operations
 

Public Member Functions

 OSWFile (const std::string &filename)
 
 OSWFile (const OSWFile &rhs)=default
 
OSWFile & operator= (const OSWFile &rhs)=default
 
void read (OSWData &swath_result)
 
void readMinimal (OSWData &swath_result)
 
void readProtein (OSWData &swath_result, const Size index)
 populates the protein at position index within swath_result with peptides, unless the protein already has peptides
 
std::vector< IPFPrecursorRow > readIPFPrecursorData (const PeptidoformInferenceConfig &config) const
 Read peakgroup and precursor evidence required for peptidoform inference.
 
std::vector< IPFTransitionRow > readIPFTransitionData (const PeptidoformInferenceConfig &config) const
 Read transition-level evidence required for peptidoform inference.
 
std::vector< IPFAlignmentRow > readIPFAlignmentData (const PeptidoformInferenceConfig &config) const
 Read alignment-group membership required for optional across-run signal propagation from FEATURE_MS2_ALIGNMENT_CANDIDATE.
 
std::vector< IPFAlignmentRow > readIPFAlignmentData (double ipf_max_alignment_pep) const
 Historical overload that reads legacy alignment-group membership from FEATURE_MS2_ALIGNMENT + SCORE_ALIGNMENT.
 
void writeIPFResults (const std::string &output_filename, const std::vector< IPFResultRow > &results) const
 Write peptidoform inference results into SCORE_IPF, copying to output_filename first if requested.
 
std::vector< LevelContextInputRow > readLevelContextData (InferenceLevel level, InferenceContext context) const
 Read compact peptide-, protein-, or gene-level rows for context inference.
 
void writeLevelContextResults (const std::string &output_filename, InferenceLevel level, InferenceContext context, const std::vector< LevelContextResultRow > &results) const
 Write context inference results into SCORE_PEPTIDE / SCORE_PROTEIN / SCORE_GENE.
 
std::vector< OpenSwathExportRow > readOpenSwathExportRows (const OpenSwathExportFilterConfig &config) const
 Read filtered feature rows for user-facing OpenSWATH results and matrix exports.
 
OpenSwathFeatureScoreTable readOpenSwathFeatureScoreTable (const OpenSwathParquetExportConfig &config) const
 Read scored feature rows for OpenSWATH Parquet export.
 
OpenSwathTransitionScoreTable readOpenSwathTransitionScoreTable (const OpenSwathParquetExportConfig &config) const
 Read optional transition-level score rows for OpenSWATH Parquet export.
 
std::map< Int64, std::string > readRunBasenames () const
 Read RUN IDs and convert their filenames to user-facing basenames.
 
UInt64 getRunID () const
 

Static Public Member Functions

static void readToPIN (const std::string &filename, const OSWFile::OSWLevel osw_level, std::ostream &pin_output, const double ipf_max_peakgroup_pep, const double ipf_max_transition_isotope_overlap, const double ipf_min_transition_sn)
 Reads an OSW SQLite file and writes the data on MS1-, MS2- or transition-level to an ostream (e.g. stringstream or ofstream).
 
static void writeFromPercolator (const std::string &osw_filename, const OSWFile::OSWLevel osw_level, const std::map< std::string, PercolatorFeature > &features)
 Updates an OpenSWATH OSW SQLite file with the MS1-, MS2- or transition-level results of Percolator.
 

Static Public Attributes

static constexpr Size ALL_PROTEINS = -1
 query all proteins, not just one with a particular ID
 
static const std::array< std::string,(Size) OSWLevel::SIZE_OF_OSWLEVEL > names_of_oswlevel
 

Protected Member Functions

void readTransitions_ (OSWData &swath_result)
 
void getFullProteins_ (OSWData &swath_result, Size prot_index=ALL_PROTEINS)
 fill one (prot_index) or all proteins into swath_result
 
void readMeta_ (OSWData &data)
 set source file and sqMass run-ID
 

Private Attributes

std::string filename_
 sql file to open/write to
 
SqliteConnector conn_
 SQL connection. Stays open as long as this object lives.
 
bool has_SCOREMS2_
 database contains pyProphet's score_MS2 table with qvalues
 

Detailed Description

This class serves for reading in and writing OpenSWATH OSW files.

See OpenSwathOSWWriter for more functionality.

The reader and writer return data in a format suitable for PercolatorAdapter. OSW files have a flexible data structure. They contain all peptide query parameters of TraML/PQP files together with the detected and quantified features of OpenSwathWorkflow (feature, feature_ms1, feature_ms2 & feature_transition).

The OSWFile reader extracts the feature information from the OSW file for each level (MS1, MS2 & transition) separately and generates Percolator input files. For each of the three Percolator reports, OSWFile writer adds a table (score_ms1, score_ms2, score_transition) with the respective confidence metrics. These tables can be mapped to the corresponding feature tables, are very similar to PyProphet results and can thus be used interchangeably.

Member Enumeration Documentation

◆ OSWLevel

enum class OSWLevel
strong

for Percolator data read/write operations

Enumerator
MS1 
MS2 
TRANSITION 
SIZE_OF_OSWLEVEL 
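
The trailing SIZE_OF_OSWLEVEL enumerator is a count sentinel: it sizes the names_of_oswlevel array so that the table and the enum stay in step. The following is a minimal, self-contained sketch of that idiom. The type Level, the table level_names, the helper levelName() and the strings in the table are illustrative stand-ins, not the identifiers or values used by OpenMS.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Stand-in for OSWFile::OSWLevel; enumerator order mirrors the list above.
enum class Level { MS1, MS2, TRANSITION, SIZE_OF_LEVEL };

// The sentinel doubles as the element count of the lookup table.
// The strings are placeholders, not the values stored in names_of_oswlevel.
const std::array<std::string, static_cast<std::size_t>(Level::SIZE_OF_LEVEL)> level_names = {
  "ms1", "ms2", "transition"};

// Map an enumerator to its display name; at() rejects the sentinel itself.
const std::string& levelName(Level level)
{
  return level_names.at(static_cast<std::size_t>(level));
}
```

Adding an enumerator before the sentinel grows the array type automatically, so a forgotten table entry shows up as an empty string rather than an out-of-bounds read.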

Constructor & Destructor Documentation

◆ OSWFile() [1/2]

OSWFile ( const std::string &  filename)

opens an OSW file for reading.

Exceptions
Exception::FileNotReadable if filename does not exist

◆ OSWFile() [2/2]

OSWFile ( const OSWFile & rhs)
default

Member Function Documentation

◆ getFullProteins_()

void getFullProteins_ ( OSWData & swath_result,
Size  prot_index = ALL_PROTEINS 
)
protected

fill one (prot_index) or all proteins into swath_result

Parameters
[out] swath_result: Output data. Proteins are cleared beforehand if ALL_PROTEINS is used.
[in] prot_index: Using ALL_PROTEINS queries all proteins (could take some time)

◆ getRunID()

UInt64 getRunID ( ) const

extract the RUN::ID from the sqMass file

Exceptions
Exception::SqlOperationFailed if more than one run exists

◆ operator=()

OSWFile & operator= ( const OSWFile & rhs)
default

◆ read()

void read ( OSWData & swath_result)

Reads data from an SQLite OSW file into swath_result. Depending on the number of proteins, this could take a while.

Note
If you just want the proteins and transitions without peptides and features, use readMinimal().

◆ readIPFAlignmentData() [1/2]

std::vector< IPFAlignmentRow > readIPFAlignmentData ( const PeptidoformInferenceConfig & config) const

Read alignment-group membership required for optional across-run signal propagation from FEATURE_MS2_ALIGNMENT_CANDIDATE.

◆ readIPFAlignmentData() [2/2]

std::vector< IPFAlignmentRow > readIPFAlignmentData ( double  ipf_max_alignment_pep) const

Historical overload that reads legacy alignment-group membership from FEATURE_MS2_ALIGNMENT + SCORE_ALIGNMENT.

◆ readIPFPrecursorData()

std::vector< IPFPrecursorRow > readIPFPrecursorData ( const PeptidoformInferenceConfig & config) const

Read peakgroup and precursor evidence required for peptidoform inference.

◆ readIPFTransitionData()

std::vector< IPFTransitionRow > readIPFTransitionData ( const PeptidoformInferenceConfig & config) const

Read transition-level evidence required for peptidoform inference.

◆ readLevelContextData()

std::vector< LevelContextInputRow > readLevelContextData ( InferenceLevel  level,
InferenceContext  context 
) const

Read compact peptide-, protein-, or gene-level rows for context inference.

◆ readMeta_()

void readMeta_ ( OSWData & data)
protected

set source file and sqMass run-ID

◆ readMinimal()

void readMinimal ( OSWData & swath_result)

Reads in transitions and a list of protein names/IDs but no peptide/feature/transition mapping data (which could be very expensive). Use in conjunction with on-demand readProtein() to fully populate proteins with peptide/feature data as needed.

Note
If you read in all proteins afterwards in one go anyway, using the read() method will be faster (by about 30%).

◆ readOpenSwathExportRows()

std::vector< OpenSwathExportRow > readOpenSwathExportRows ( const OpenSwathExportFilterConfig & config) const

Read filtered feature rows for user-facing OpenSWATH results and matrix exports.

◆ readOpenSwathFeatureScoreTable()

OpenSwathFeatureScoreTable readOpenSwathFeatureScoreTable ( const OpenSwathParquetExportConfig & config) const

Read scored feature rows for OpenSWATH Parquet export.

◆ readOpenSwathTransitionScoreTable()

OpenSwathTransitionScoreTable readOpenSwathTransitionScoreTable ( const OpenSwathParquetExportConfig & config) const

Read optional transition-level score rows for OpenSWATH Parquet export.

◆ readProtein()

void readProtein ( OSWData & swath_result,
const Size  index 
)

populates the protein at position index within swath_result with peptides, unless the protein already has peptides

Internally uses the protein's ID to search for cross-referencing peptides and transitions in the OSW file.

Parameters
[in,out] swath_result: OSWData obtained from the readMinimal() method
[in] index: Index into swath_result.getProteins()[index]. Make sure the index is within the vector's size.
Exceptions
Exception::InvalidValue if the protein at index does not have any peptides present in the OSW file
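
The readMinimal() / readProtein() pair describes a load-on-demand pattern: list the proteins cheaply first, then populate individual entries only when they are needed. The sketch below illustrates that pattern with plain standard-library types. Protein, LazyStore, loadMinimal() and loadProtein() are hypothetical stand-ins and are not the OpenMS OSWData or OSWFile API; the in-memory map merely plays the role of the OSW file.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the OpenMS OSWData/OSWFile types.
struct Protein
{
  std::string id;
  std::vector<std::string> peptides; // stays empty until loaded on demand
};

class LazyStore
{
public:
  // 'backend' plays the role of the OSW file: protein ID -> peptide sequences.
  explicit LazyStore(std::map<std::string, std::vector<std::string>> backend)
    : backend_(std::move(backend)) {}

  // Cheap pass: list proteins only, similar in spirit to readMinimal().
  std::vector<Protein> loadMinimal() const
  {
    std::vector<Protein> proteins;
    for (const auto& entry : backend_) proteins.push_back({entry.first, {}});
    return proteins;
  }

  // Fill one protein on demand, similar in spirit to readProtein().
  void loadProtein(std::vector<Protein>& proteins, std::size_t index) const
  {
    Protein& p = proteins.at(index);  // throws std::out_of_range on a bad index
    if (!p.peptides.empty()) return;  // already populated: nothing to do
    const auto it = backend_.find(p.id);
    if (it == backend_.end() || it->second.empty())
    {
      throw std::invalid_argument("no peptides stored for protein " + p.id);
    }
    p.peptides = it->second;
  }

private:
  std::map<std::string, std::vector<std::string>> backend_;
};
```

A viewer that shows one protein at a time would call loadMinimal() once and loadProtein() per selection; a batch export that touches every protein is better served by a single full read, as the note on readMinimal() points out.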

◆ readRunBasenames()

std::map< Int64, std::string > readRunBasenames ( ) const

Read RUN IDs and convert their filenames to user-facing basenames.

The returned names are stripped to stem names when possible, so sample.mzML.gz becomes sample.
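
One way to picture the stripping step is to drop the directory part and then peel known extensions off the end of the filename. The sketch below does exactly that. The helper stripToStem() and its extension list are assumptions made for illustration; the rules OpenMS applies internally may differ.

```cpp
#include <array>
#include <filesystem>
#include <string>
#include <string_view>

// Hypothetical helper, not part of OpenMS: reduce a run filename to a basename
// by dropping the directory and any trailing known extensions.
std::string stripToStem(const std::string& run_filename)
{
  // Assumed extension list, for illustration only.
  static constexpr std::array<std::string_view, 4> known_ext = {".gz", ".mzML", ".mzXML", ".sqMass"};

  std::string name = std::filesystem::path(run_filename).filename().string();
  bool stripped = true;
  while (stripped) // repeat so that stacked extensions such as .mzML.gz are all removed
  {
    stripped = false;
    for (std::string_view ext : known_ext)
    {
      const bool has_ext = name.size() > ext.size() &&
                           std::string_view(name).substr(name.size() - ext.size()) == ext;
      if (has_ext)
      {
        name.erase(name.size() - ext.size());
        stripped = true;
      }
    }
  }
  return name;
}
```

Matching against a fixed list, rather than cutting at the first dot, leaves names such as run.1.mzML intact up to the recognised suffix (run.1).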

◆ readToPIN()

static void readToPIN ( const std::string &  filename,
const OSWFile::OSWLevel  osw_level,
std::ostream &  pin_output,
const double  ipf_max_peakgroup_pep,
const double  ipf_max_transition_isotope_overlap,
const double  ipf_min_transition_sn 
)
static

Reads an OSW SQLite file and writes the data on MS1-, MS2- or transition-level to an ostream (e.g. stringstream or ofstream).

◆ readTransitions_()

void readTransitions_ ( OSWData & swath_result)
protected

populate transitions of swath_result

Clears swath_result entirely (incl. proteins) before adding transitions.

◆ writeFromPercolator()

static void writeFromPercolator ( const std::string &  osw_filename,
const OSWFile::OSWLevel  osw_level,
const std::map< std::string, PercolatorFeature > &  features 
)
static

Updates an OpenSWATH OSW SQLite file with the MS1-, MS2- or transition-level results of Percolator.

◆ writeIPFResults()

void writeIPFResults ( const std::string &  output_filename,
const std::vector< IPFResultRow > &  results 
) const

Write peptidoform inference results into SCORE_IPF, copying to output_filename first if requested.

◆ writeLevelContextResults()

void writeLevelContextResults ( const std::string &  output_filename,
InferenceLevel  level,
InferenceContext  context,
const std::vector< LevelContextResultRow > &  results 
) const

Write context inference results into SCORE_PEPTIDE / SCORE_PROTEIN / SCORE_GENE.

Member Data Documentation

◆ ALL_PROTEINS

constexpr Size ALL_PROTEINS = -1
static constexpr

query all proteins, not just one with a particular ID
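
Assigning -1 to an unsigned index type yields the largest representable value, which makes it usable as an "all entries" sentinel that can never collide with a real index. The sketch below assumes Size behaves like std::size_t; the constant ALL and the helper selectedIndices() are illustrative names, not OpenMS identifiers.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for OSWFile::ALL_PROTEINS, assuming Size is unsigned like std::size_t.
constexpr std::size_t ALL = -1; // -1 converts to the largest std::size_t value

static_assert(ALL == std::numeric_limits<std::size_t>::max(), "sentinel is the maximum value");

// Hypothetical selector: return the indices a query would touch.
std::vector<std::size_t> selectedIndices(std::size_t protein_count, std::size_t index = ALL)
{
  std::vector<std::size_t> picked;
  if (index == ALL)
  {
    for (std::size_t i = 0; i < protein_count; ++i) picked.push_back(i); // every protein
  }
  else if (index < protein_count)
  {
    picked.push_back(index); // a single, valid protein
  }
  return picked; // empty when index is out of range
}
```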

◆ conn_

SqliteConnector conn_
private

SQL connection. Stays open as long as this object lives.

◆ filename_

std::string filename_
private

sql file to open/write to

◆ has_SCOREMS2_

bool has_SCOREMS2_
private

database contains pyProphet's score_MS2 table with qvalues

◆ names_of_oswlevel

const std::array<std::string, (Size)OSWLevel::SIZE_OF_OSWLEVEL> names_of_oswlevel
static