OpenMS
Writes chromatograms to a Parquet file with a PyProphet-compatible schema.
#include <OpenMS/FORMAT/DATAACCESS/MSChromatogramParquetConsumer.h>
Public Member Functions
| Return type | Member | Description |
|---|---|---|
| | MSChromatogramParquetConsumer (const String &filename, UInt64 run_id, const String &source_file, const OpenSwath::LightTargetedExperiment &transition_exp) | Construct a parquet consumer for chromatogram export. |
| | ~MSChromatogramParquetConsumer () override | Destructor flushes pending data and closes the parquet writer. |
| void | consumeSpectrum (SpectrumType &s) override | Consume a spectrum (no-op; spectra are ignored for chromatogram export). |
| void | consumeChromatogram (ChromatogramType &c) override | Consume a chromatogram and append it to the parquet output. |
| void | finalize () | Finalize and write the parquet file. |
| void | setExpectedSize (Size expectedSpectra, Size expectedChromatograms) override | Reserve storage for expected data sizes. |
| void | setExperimentalSettings (const ExperimentalSettings &exp) override | Set experimental settings (currently unused). |
Public Member Functions inherited from IMSDataConsumer
| Return type | Member | Description |
|---|---|---|
| virtual | ~IMSDataConsumer () | |
Private Attributes
| Type | Name |
|---|---|
| std::unique_ptr< MSChromatogramParquetConsumerImpl > | impl_ |
Additional Inherited Members
Public Types inherited from IMSDataConsumer
| Definition | Name |
|---|---|
| typedef MSSpectrum | SpectrumType |
| typedef MSChromatogram | ChromatogramType |
Writes chromatograms to a Parquet file with a PyProphet-compatible schema.
The schema includes precursor and transition metadata, the RT and intensity arrays, and one compression identifier per array. Each row also carries the RUN_ID, SOURCE_FILE, and MS_LEVEL columns.
The Parquet output has the following columns (one row per chromatogram):
| Column | Type | Description |
|---|---|---|
| RUN_ID | int64 | Run identifier |
| SOURCE_FILE | string | Input source filename |
| MS_LEVEL | int64 | MS level (1 for precursor traces, 2 for fragment traces) |
| PRECURSOR_ID | int64 (nullable) | Precursor id |
| TRANSITION_ID | int64 (nullable) | Transition id |
| MODIFIED_SEQUENCE | string (nullable) | Modified peptide sequence |
| PRECURSOR_CHARGE | int64 (nullable) | Precursor charge |
| PRODUCT_CHARGE | int64 (nullable) | Product charge |
| DETECTING_TRANSITION | int64 (nullable) | Detecting transition flag |
| PRECURSOR_DECOY | int64 (nullable) | Precursor decoy flag |
| PRODUCT_DECOY | int64 (nullable) | Product decoy flag |
| TRANSITION_ORDINAL | int64 (nullable) | Transition ordinal |
| TRANSITION_TYPE | string (nullable) | Transition type (e.g., y, b) |
| ANNOTATION | string (nullable) | Transition annotation (e.g., y3^1) |
| RT_DATA | binary | Compressed RT array |
| INTENSITY_DATA | binary | Compressed intensity array |
| RT_COMPRESSION | int64 | RT compression scheme id |
| INTENSITY_COMPRESSION | int64 | Intensity compression scheme id |
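For readers who consume the file from C++, the column table can be mirrored as a plain struct. The following is a minimal sketch, not part of OpenMS: the type `ChromatogramRow` and the helper `hasTransitionAnnotation` are hypothetical names introduced here, and nullable columns are modeled with `std::optional`. The source does not state which columns are null for which kind of trace, so the helper only checks for presence.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of one Parquet row (one chromatogram).
// Field names follow the column names in the table; this struct is an
// illustration and is not defined by OpenMS.
struct ChromatogramRow
{
  // Non-nullable columns
  std::int64_t run_id = 0;
  std::string source_file;
  std::int64_t ms_level = 0;

  // Nullable metadata columns
  std::optional<std::int64_t> precursor_id;
  std::optional<std::int64_t> transition_id;
  std::optional<std::string> modified_sequence;
  std::optional<std::int64_t> precursor_charge;
  std::optional<std::int64_t> product_charge;
  std::optional<std::int64_t> detecting_transition;
  std::optional<std::int64_t> precursor_decoy;
  std::optional<std::int64_t> product_decoy;
  std::optional<std::int64_t> transition_ordinal;
  std::optional<std::string> transition_type;
  std::optional<std::string> annotation;

  // Binary payloads and their compression scheme ids
  std::vector<std::uint8_t> rt_data;
  std::vector<std::uint8_t> intensity_data;
  std::int64_t rt_compression = 0;
  std::int64_t intensity_compression = 0;
};

// Returns true when the row carries a transition id.
inline bool hasTransitionAnnotation(const ChromatogramRow& row)
{
  return row.transition_id.has_value();
}
```

Keeping the nullable columns as `std::optional` makes a missing value distinct from a stored zero, which matters for flag columns such as PRECURSOR_DECOY.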
Compression identifiers:
| Column | Value | Description |
|---|---|---|
| RT_COMPRESSION | 0 | No compression (raw doubles) |
| RT_COMPRESSION | 1 | Zlib-compressed raw doubles |
| RT_COMPRESSION | 5 | MSNumpress (linear) with lossy compression |
| INTENSITY_COMPRESSION | 0 | No compression (raw doubles) |
| INTENSITY_COMPRESSION | 1 | Zlib-compressed raw doubles |
| INTENSITY_COMPRESSION | 6 | MSNumpress (short logged float) with lossy compression |
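A reader of the file has to branch on these identifiers before touching RT_DATA or INTENSITY_DATA. The sketch below shows one way to do that. The enum `CompressionId` and the functions `describeCompression`, `isLossy`, and `decodeRawDoubles` are hypothetical names, not OpenMS API. Only scheme 0 is decoded, and that decoder assumes the bytes were written in the host's byte order, which the source does not confirm.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Identifier values are taken from the table above; the enum is hypothetical.
enum class CompressionId : std::int64_t
{
  None = 0,            // raw doubles
  Zlib = 1,            // zlib-compressed raw doubles
  NumpressLinear = 5,  // listed for RT_COMPRESSION
  NumpressSlof = 6     // listed for INTENSITY_COMPRESSION
};

// Human-readable label for a compression id.
inline std::string describeCompression(std::int64_t id)
{
  switch (static_cast<CompressionId>(id))
  {
    case CompressionId::None: return "raw doubles";
    case CompressionId::Zlib: return "zlib-compressed raw doubles";
    case CompressionId::NumpressLinear: return "MSNumpress linear (lossy)";
    case CompressionId::NumpressSlof: return "MSNumpress short logged float (lossy)";
    default: return "unknown";
  }
}

// The two MSNumpress schemes are described as lossy in the table.
inline bool isLossy(std::int64_t id)
{
  return id == 5 || id == 6;
}

// Decodes a scheme-0 payload. ASSUMPTION: the payload is a packed array of
// 8-byte doubles in the host's byte order.
inline std::vector<double> decodeRawDoubles(const std::vector<std::uint8_t>& bytes)
{
  if (bytes.size() % sizeof(double) != 0)
  {
    throw std::invalid_argument("payload size is not a multiple of sizeof(double)");
  }
  std::vector<double> values(bytes.size() / sizeof(double));
  if (!bytes.empty())
  {
    std::memcpy(values.data(), bytes.data(), bytes.size());
  }
  return values;
}
```

Schemes 1, 5, and 6 need zlib and an MSNumpress decoder respectively, so they are left out of this sketch.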
MSChromatogramParquetConsumer (const String &filename, UInt64 run_id, const String &source_file, const OpenSwath::LightTargetedExperiment &transition_exp)
Construct a parquet consumer for chromatogram export.
| [in] | filename | Output parquet filename. |
| [in] | run_id | Run identifier to store with each chromatogram. |
| [in] | source_file | Source mzML filename to store with each chromatogram. |
| [in] | transition_exp | Transition metadata used to annotate chromatograms. |
~MSChromatogramParquetConsumer () [override]
Destructor flushes pending data and closes the parquet writer.
void consumeChromatogram (ChromatogramType &c) [override, virtual]
Consume a chromatogram and append it to the parquet output.
Implements IMSDataConsumer.
void consumeSpectrum (SpectrumType &s) [override, virtual]
Consume a spectrum (no-op; spectra are ignored for chromatogram export).
Implements IMSDataConsumer.
void finalize ()
Finalize and write the parquet file.
Call this explicitly to surface write errors during normal control flow.
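The reason for an explicit `finalize()` is that a destructor cannot usefully report a failed write. The following sketch illustrates that general pattern with a hypothetical stand-in class, `BufferedWriter`. It is not the OpenMS implementation, and whether the real destructor swallows errors or whether the real `finalize()` may be called twice is an assumption here, not something the source states.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a file-backed writer; NOT the OpenMS class.
class BufferedWriter
{
public:
  // fail_on_write simulates an I/O error at flush time.
  explicit BufferedWriter(bool fail_on_write) : fail_on_write_(fail_on_write) {}

  // The destructor still flushes, but it must not let exceptions escape,
  // so any write error is lost here.
  ~BufferedWriter()
  {
    try
    {
      finalize();
    }
    catch (...)
    {
      // Swallowed: there is no caller to report to during destruction.
    }
  }

  void append(const std::string& record)
  {
    if (finalized_)
    {
      throw std::logic_error("append after finalize");
    }
    pending_.push_back(record);
  }

  // Flushes pending records. Throws on write failure so the caller can react.
  // ASSUMPTION of this sketch: a second call is a harmless no-op.
  void finalize()
  {
    if (finalized_)
    {
      return;
    }
    finalized_ = true;  // set first so the destructor does not retry
    if (fail_on_write_)
    {
      throw std::runtime_error("write failed");
    }
    written_ = pending_.size();
    pending_.clear();
  }

  std::size_t written() const { return written_; }

private:
  bool fail_on_write_;
  bool finalized_ = false;
  std::size_t written_ = 0;
  std::vector<std::string> pending_;
};
```

A caller that wraps `finalize()` in a `try` block sees the failure at a point where it can still log, retry, or delete the partial file. A caller that relies on the destructor alone never learns that the output is incomplete.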
void setExpectedSize (Size expectedSpectra, Size expectedChromatograms) [override]
Reserve storage for expected data sizes.
| [in] | expectedSpectra | Expected number of spectra (ignored). |
| [in] | expectedChromatograms | Expected number of chromatograms. |
Implements IMSDataConsumer.
void setExperimentalSettings (const ExperimentalSettings &exp) [override, virtual]
Set experimental settings (currently unused).
| [in] | exp | Experimental settings to store for context. |
Implements IMSDataConsumer.
std::unique_ptr< MSChromatogramParquetConsumerImpl > impl_ [private]