OpenMS
Reader for OpenSWATH mobilogram Parquet files (.xim).
#include <OpenMS/FORMAT/XIMParquetFile.h>
Classes | |
| struct | XIMAnalyte |
| Analyte metadata container. | |
| struct | XIMMobilogram |
| Lightweight mobilogram container for XIM parquet rows. | |
| struct | XIMRunInfo |
| Unique run information (run_id, source_file). | |
Public Member Functions | |
| XIMParquetFile (const String &filename) | |
| Construct from a single .xim file. | |
| XIMParquetFile (const std::vector< String > &filenames) | |
| Construct from multiple .xim files. | |
| XIMParquetFile (const XIMParquetFile &rhs)=default | |
| XIMParquetFile & | operator= (const XIMParquetFile &rhs)=default |
| const String & | getFilename () const |
| Return the primary filename. | |
| const std::vector< String > & | getFilenames () const |
| Return all filenames associated with this instance. | |
| void | load (std::vector< XIMMobilogram > &output) const |
| Load all mobilograms from the file(s). | |
| void | getMobilograms (std::vector< XIMMobilogram > &output, Int64 precursor_id=-1, Int64 transition_id=-1, const String &modified_sequence="", Int64 precursor_charge=-1, Int64 product_charge=-1, Int64 ms_level=-1, Int64 run_id=-1, const String &mobilogram_type="", Int64 feature_id=-1, double feature_rt=-1.0, const String &filter="") const |
| Load mobilograms with optional filtering. | |
| void | getMobilograms (std::vector< XIMMobilogram > &output, const ParquetFilter &filter) const |
| Return mobilograms using a typed filter expression. | |
| void | getMobilograms (std::vector< XIMMobilogram > &output, const ParquetFilterBuilder &filter) const |
| Return mobilograms using a typed filter builder. | |
| void | getRuns (std::vector< XIMRunInfo > &output) const |
| Return unique run metadata (run_id, source_file). | |
| void | getAnalytes (std::vector< XIMAnalyte > &output, const std::vector< String > &columns={}, bool nest_transitions=true) const |
| Return unique analyte metadata. | |
| void | getColumns (std::vector< String > &output) const |
| Return the parquet schema column names. | |
Private Member Functions | |
| void | getMobilograms_ (std::vector< XIMMobilogram > &output, const FilterExpression &extra_filter, Int64 precursor_id, Int64 transition_id, const String &modified_sequence, Int64 precursor_charge, Int64 product_charge, Int64 ms_level, Int64 run_id, const String &mobilogram_type, Int64 feature_id, double feature_rt, const String &filter) const |
Private Attributes | |
| String | filename_ |
| std::vector< String > | filenames_ |
Reader for OpenSWATH mobilogram Parquet files (.xim).
Supports loading single or multiple files and filtering on metadata columns (e.g., precursor id, transition id, annotations). Filters are applied before decoding mobility/intensity binary arrays.
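The filter argument in getMobilograms() accepts simple boolean expressions over column names, for example "PRECURSOR_ID=1 OR TRANSITION_ID in [2,3]", which combines an equality test, a list-membership test (in [...]), and OR.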
Values can be integers or strings; strings may be unquoted if they contain no spaces or commas (e.g., annotation=y3^1), otherwise use quotes.
Supported filter columns (case-insensitive): RUN_ID, SOURCE_FILE, MS_LEVEL, MOBILOGRAM_TYPE, PRECURSOR_ID, TRANSITION_ID, FEATURE_ID, FEATURE_RT, MODIFIED_SEQUENCE, PRECURSOR_CHARGE, PRODUCT_CHARGE, DETECTING_TRANSITION, PRECURSOR_DECOY, PRODUCT_DECOY, TRANSITION_ORDINAL, TRANSITION_TYPE, ANNOTATION. MOBILITY and INTENSITY are not filterable because they are stored as compressed binary arrays.
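The column and quoting rules above can be checked before a filter string is assembled. The following is a minimal sketch using only the standard library; `isFilterableColumn` and `needsQuotes` are hypothetical helpers for illustration and are not part of the OpenMS API.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Filterable columns exactly as listed in the documentation above.
static const std::array<const char*, 17> kFilterableColumns = {{
  "RUN_ID", "SOURCE_FILE", "MS_LEVEL", "MOBILOGRAM_TYPE", "PRECURSOR_ID",
  "TRANSITION_ID", "FEATURE_ID", "FEATURE_RT", "MODIFIED_SEQUENCE",
  "PRECURSOR_CHARGE", "PRODUCT_CHARGE", "DETECTING_TRANSITION",
  "PRECURSOR_DECOY", "PRODUCT_DECOY", "TRANSITION_ORDINAL",
  "TRANSITION_TYPE", "ANNOTATION"}};

// Hypothetical helper: true if `name` is a filterable column, ignoring case.
// MOBILITY and INTENSITY are deliberately absent from the list.
inline bool isFilterableColumn(std::string name)
{
  std::transform(name.begin(), name.end(), name.begin(),
                 [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
  return std::any_of(kFilterableColumns.begin(), kFilterableColumns.end(),
                     [&name](const char* column) { return name == column; });
}

// Hypothetical helper: a string value needs quotes if it contains a space or a comma.
inline bool needsQuotes(const std::string& value)
{
  return value.find_first_of(" ,") != std::string::npos;
}
```

With these helpers, `isFilterableColumn("precursor_id")` holds while `isFilterableColumn("MOBILITY")` does not, and a value such as `y3^1` can be written unquoted.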
The implementation uses an Arrow-based pipeline: the filter expression is parsed, rows are filtered on metadata columns (through a dataset scan, with a compute-filter fallback), and the mobility/intensity arrays are decoded only for the rows that remain.
These steps are implemented in helper functions in the corresponding .cpp file. Keeping the helpers in the implementation file avoids exposing Arrow types in the public header.
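The benefit of filtering before decoding can be illustrated without Arrow. The sketch below uses stand-in types (`EncodedRow`, `DecodedRow`) and a placeholder decoder, all hypothetical; it is not the OpenMS implementation and only shows that rows rejected by the metadata filter never reach the decode step.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Stand-in for one stored row: cheap metadata plus an undecoded payload.
struct EncodedRow
{
  std::int64_t precursor_id;
  std::string annotation;
  std::vector<unsigned char> mobility_blob; // placeholder for the compressed array
};

// Stand-in for a decoded mobilogram.
struct DecodedRow
{
  std::int64_t precursor_id;
  std::string annotation;
  std::vector<double> mobility;
};

// Placeholder decoder: widens bytes to doubles and counts how often it runs.
inline std::vector<double> decodeBlob(const std::vector<unsigned char>& blob,
                                      std::size_t& decode_calls)
{
  ++decode_calls;
  return std::vector<double>(blob.begin(), blob.end());
}

// Apply the metadata predicate first; decode only the rows that pass.
inline std::vector<DecodedRow> filterThenDecode(
    const std::vector<EncodedRow>& rows,
    const std::function<bool(const EncodedRow&)>& keep,
    std::size_t& decode_calls)
{
  std::vector<DecodedRow> out;
  for (const EncodedRow& row : rows)
  {
    if (!keep(row)) continue; // rejected rows are never decoded
    out.push_back({row.precursor_id, row.annotation,
                   decodeBlob(row.mobility_blob, decode_calls)});
  }
  return out;
}
```

In this sketch the number of decode calls equals the number of rows that pass the filter, not the number of rows stored.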
| struct OpenMS::XIMParquetFile::XIMAnalyte |
Analyte metadata container.
If nest_transitions is false in getAnalytes(), transition-level fields are stored in the scalar members (transition_id, product_charge, etc.). If nest_transitions is true, transition-level fields are stored in the vector members (transition_ids, product_charges, etc.), with one entry per unique transition belonging to the precursor.
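The relationship between the flat and nested layouts can be sketched with reduced stand-in structs. `FlatAnalyte`, `NestedAnalyte`, and `nestByPrecursor` are hypothetical and carry only two of the fields; the ordering by precursor id is an artifact of `std::map` in this sketch and says nothing about the order OpenMS returns.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Flat layout: one entry per precursor-transition pair (scalar members).
struct FlatAnalyte
{
  std::int64_t precursor_id;
  std::int64_t transition_id;
};

// Nested layout: one entry per precursor, transitions collected in a vector.
struct NestedAnalyte
{
  std::int64_t precursor_id;
  std::vector<std::int64_t> transition_ids;
};

// Group flat entries by precursor, keeping each transition id once.
inline std::vector<NestedAnalyte> nestByPrecursor(const std::vector<FlatAnalyte>& flat)
{
  std::map<std::int64_t, std::vector<std::int64_t>> grouped;
  for (const FlatAnalyte& a : flat)
  {
    std::vector<std::int64_t>& ids = grouped[a.precursor_id];
    if (std::find(ids.begin(), ids.end(), a.transition_id) == ids.end())
    {
      ids.push_back(a.transition_id);
    }
  }

  std::vector<NestedAnalyte> nested;
  for (const auto& entry : grouped)
  {
    nested.push_back({entry.first, entry.second});
  }
  return nested;
}
```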
| Class Members | ||
|---|---|---|
| String | annotation | |
| vector< String > | annotations | |
| Int64 | detecting_transition {0} | |
| vector< Int64 > | detecting_transitions | |
| bool | has_detecting_transition {false} | |
| bool | has_precursor_charge {false} | |
| bool | has_precursor_decoy {false} | |
| bool | has_precursor_id {false} | |
| bool | has_product_charge {false} | |
| bool | has_product_decoy {false} | |
| bool | has_transition_id {false} | |
| bool | has_transition_ordinal {false} | |
| String | modified_sequence | |
| Int64 | precursor_charge {0} | |
| Int64 | precursor_decoy {0} | |
| Int64 | precursor_id {0} | |
| Int64 | product_charge {0} | |
| vector< Int64 > | product_charges | |
| Int64 | product_decoy {0} | |
| vector< Int64 > | product_decoys | |
| Int64 | transition_id {0} | |
| vector< Int64 > | transition_ids | |
| Int64 | transition_ordinal {0} | |
| vector< Int64 > | transition_ordinals | |
| String | transition_type | |
| vector< String > | transition_types | |
| struct OpenMS::XIMParquetFile::XIMMobilogram |
Lightweight mobilogram container for XIM parquet rows.
| Class Members | ||
|---|---|---|
| String | annotation | |
| Int64 | detecting_transition {0} | |
| Int64 | feature_id {0} | |
| double | feature_rt {0.0} | |
| bool | has_detecting_transition {false} | |
| bool | has_feature_id {false} | |
| bool | has_feature_rt {false} | |
| bool | has_precursor_charge {false} | |
| bool | has_precursor_decoy {false} | |
| bool | has_precursor_id {false} | |
| bool | has_product_charge {false} | |
| bool | has_product_decoy {false} | |
| bool | has_transition_id {false} | |
| bool | has_transition_ordinal {false} | |
| vector< double > | intensity | |
| vector< double > | mobility | |
| String | mobilogram_type | |
| String | modified_sequence | |
| Int64 | ms_level {0} | |
| Int64 | precursor_charge {0} | |
| Int64 | precursor_decoy {0} | |
| Int64 | precursor_id {0} | |
| Int64 | product_charge {0} | |
| Int64 | product_decoy {0} | |
| Int64 | run_id {0} | |
| String | source_file | |
| Int64 | transition_id {0} | |
| Int64 | transition_ordinal {0} | |
| String | transition_type | |
| struct OpenMS::XIMParquetFile::XIMRunInfo |
Unique run information (run_id, source_file).
| XIMParquetFile | ( | const String & | filename | ) | explicit |
Construct from a single .xim file.
| [in] | filename | Path to an OpenSWATH mobilogram parquet file. |
| XIMParquetFile | ( | const std::vector< String > & | filenames | ) | explicit |
Construct from multiple .xim files.
| [in] | filenames | Paths to OpenSWATH mobilogram parquet files. |
| XIMParquetFile | ( | const XIMParquetFile & | rhs | ) | default |
| void getAnalytes | ( | std::vector< XIMAnalyte > & | output, |
| const std::vector< String > & | columns = {}, |
||
| bool | nest_transitions = true |
||
| ) | const |
Return unique analyte metadata.
If nest_transitions is false, each row represents a unique precursor-transition pair. If nest_transitions is true, each row represents a unique precursor with transition-level fields aggregated into vectors.
This method never decodes mobility/intensity arrays and always returns distinct entries.
| [out] | output | Output analyte metadata |
| [in] | columns | Optional list of analyte columns to return (empty for defaults) |
| [in] | nest_transitions | Aggregate transition fields per precursor |
| void getColumns | ( | std::vector< String > & | output | ) | const |
Return the parquet schema column names.
| [out] | output | Column names. |
| const String & getFilename | ( | ) | const |
Return the primary filename.
For multi-file instances this is the first file in the list.
| const std::vector< String > & getFilenames | ( | ) | const |
Return all filenames associated with this instance.
| void getMobilograms | ( | std::vector< XIMMobilogram > & | output, |
| const ParquetFilter & | filter | ||
| ) | const |
Return mobilograms using a typed filter expression.
| [out] | output | Output mobilograms |
| [in] | filter | Typed filter expression |
| void getMobilograms | ( | std::vector< XIMMobilogram > & | output, |
| const ParquetFilterBuilder & | filter | ||
| ) | const |
Return mobilograms using a typed filter builder.
| [out] | output | Output mobilograms |
| [in] | filter | Typed filter builder |
| void getMobilograms | ( | std::vector< XIMMobilogram > & | output, |
| Int64 | precursor_id = -1, |
||
| Int64 | transition_id = -1, |
||
| const String & | modified_sequence = "", |
||
| Int64 | precursor_charge = -1, |
||
| Int64 | product_charge = -1, |
||
| Int64 | ms_level = -1, |
||
| Int64 | run_id = -1, |
||
| const String & | mobilogram_type = "", |
||
| Int64 | feature_id = -1, |
||
| double | feature_rt = -1.0, |
||
| const String & | filter = "" |
||
| ) | const |
Load mobilograms with optional filtering.
| [out] | output | Output mobilograms |
| [in] | precursor_id | Optional precursor id (-1 to ignore) |
| [in] | transition_id | Optional transition id (-1 to ignore) |
| [in] | modified_sequence | Optional sequence filter (empty to ignore) |
| [in] | precursor_charge | Optional charge filter (-1 to ignore) |
| [in] | product_charge | Optional product charge filter (-1 to ignore) |
| [in] | ms_level | Optional MS level filter (-1 to ignore) |
| [in] | run_id | Optional run_id filter (-1 to ignore) |
| [in] | mobilogram_type | Optional mobilogram type filter (empty to ignore) |
| [in] | feature_id | Optional feature id filter (-1 to ignore) |
| [in] | feature_rt | Optional feature RT filter (< 0 to ignore) |
| [in] | filter | Optional filter expression on columns (e.g., "PRECURSOR_ID=1 OR TRANSITION_ID in [2,3]") |
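The "ignore" conventions in the parameter table (-1 for integer ids, an empty string for text, a negative value for feature_rt) can be sketched as a predicate over row metadata. `RowMeta`, `Criteria`, and `matches` are hypothetical names. Two assumptions are made that the documentation does not state: active criteria are combined with AND, and feature_rt is compared for exact equality (a real implementation may use a tolerance).

```cpp
#include <cstdint>
#include <string>

// Stand-in for the metadata of one row (subset of the real columns).
struct RowMeta
{
  std::int64_t precursor_id;
  std::string modified_sequence;
  double feature_rt;
};

// Criteria mirroring the documented defaults: each default means "ignore".
struct Criteria
{
  std::int64_t precursor_id = -1; // -1 to ignore
  std::string modified_sequence;  // empty to ignore
  double feature_rt = -1.0;       // < 0 to ignore
};

// Assumption: all active criteria must hold (logical AND).
inline bool matches(const RowMeta& row, const Criteria& c)
{
  if (c.precursor_id != -1 && row.precursor_id != c.precursor_id) return false;
  if (!c.modified_sequence.empty() && row.modified_sequence != c.modified_sequence) return false;
  // Assumption: exact comparison; the tolerance used by OpenMS, if any, is not documented here.
  if (c.feature_rt >= 0.0 && row.feature_rt != c.feature_rt) return false;
  return true;
}
```

A default-constructed `Criteria` matches every row, which corresponds to calling getMobilograms() with only the output argument.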
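| void getMobilograms_ | ( | std::vector< XIMMobilogram > & | output, |
| const FilterExpression & | extra_filter, |
| Int64 | precursor_id, |
| Int64 | transition_id, |
| const String & | modified_sequence, |
| Int64 | precursor_charge, |
| Int64 | product_charge, |
| Int64 | ms_level, |
| Int64 | run_id, |
| const String & | mobilogram_type, |
| Int64 | feature_id, |
| double | feature_rt, |
| const String & | filter |
| ) | const | private |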
| void getRuns | ( | std::vector< XIMRunInfo > & | output | ) | const |
Return unique run metadata (run_id, source_file).
This method never decodes mobility/intensity arrays and always returns distinct rows.
| void load | ( | std::vector< XIMMobilogram > & | output | ) | const |
Load all mobilograms from the file(s).
| [out] | output | Output mobilograms. |
| XIMParquetFile & operator= | ( | const XIMParquetFile & | rhs | ) | default |
| String filename_ | private |
| std::vector< String > filenames_ | private |