OpenMS
Utility class for analyzing modification patterns in open search results.
#include <OpenMS/ANALYSIS/ID/OpenSearchModificationAnalysis.h>
Classes | |
| struct | DeltaMassEntry |
| Statistics for a single delta mass bin in the histogram. | |
| struct | DeltaMassStatistics |
| Container for delta mass statistics table. | |
| struct | FuzzyDoubleComparator |
| Comparator for approximate comparison of double values. | |
| struct | ModificationPattern |
| Stores details of a modification pattern found in the data. | |
| struct | ModificationSummary |
| Data structure for modification summary output. | |
| struct | OpenSearchAnalysisResult |
| Combined result of open search modification analysis. | |
| struct | PTMEntry |
| Statistics for a mapped PTM. | |
| struct | PTMStatistics |
| Container for PTM statistics table. | |
Public Types | |
| using | DeltaMassHistogram = std::map< double, double, FuzzyDoubleComparator > |
| Type definitions for delta mass analysis. | |
| using | DeltaMassToChargeCount = std::map< double, int, FuzzyDoubleComparator > |
Public Member Functions | |
| OpenSearchModificationAnalysis ()=default | |
| Default constructor. | |
| ~OpenSearchModificationAnalysis ()=default | |
| Destructor. | |
| std::pair< DeltaMassHistogram, DeltaMassToChargeCount > | analyzeDeltaMassPatterns (const PeptideIdentificationList &peptide_ids, bool use_smoothing=false, bool debug=false) const |
| Analyze delta mass patterns from peptide identifications. | |
| std::vector< ModificationSummary > | mapDeltaMassesToModifications (const DeltaMassHistogram &delta_mass_histogram, const DeltaMassToChargeCount &charge_histogram, PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, const String &output_file="") const |
| Map delta masses to known modifications and annotate peptides. | |
| std::vector< ModificationSummary > | analyzeModifications (PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, bool use_smoothing=false, const String &output_file="") const |
| Complete analysis workflow: analyze patterns and map to modifications. | |
| OpenSearchAnalysisResult | analyzeModificationsWithStatistics (PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, bool use_smoothing=false, const String &output_file="") const |
| Complete analysis returning structured statistics tables. | |
| DeltaMassStatistics | generateDeltaMassStatistics (const DeltaMassHistogram &histogram, const DeltaMassToChargeCount &charge_histogram, const PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true) const |
| Generate delta mass statistics table from histogram data. | |
| PTMStatistics | generatePTMStatistics (const PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true) const |
| Generate PTM statistics table with residue localization. | |
| std::map< char, int > | analyzeResidueFrequency (const PeptideIdentificationList &peptide_ids, double delta_mass, double tolerance=0.01) const |
| Analyze which amino acid residues are associated with a delta mass. | |
| void | writeDeltaMassStatistics (const DeltaMassStatistics &stats, const String &output_file) const |
| Write delta mass statistics to a TSV file. | |
| void | writePTMStatistics (const PTMStatistics &stats, const String &output_file) const |
| Write PTM statistics to a TSV file. | |
Private Member Functions | |
| void | writeModificationSummary_ (const std::vector< ModificationSummary > &modifications, const String &output_file) const |
| Write modification summary table to file. | |
| std::map< double, String, FuzzyDoubleComparator > | buildModificationMassLookup_ () const |
| Build lookup table mapping mass differences to known modifications. | |
| String | getTargetResidues_ (const String &mod_name) const |
| Get target residues for a modification name. | |
| int | countUniquePeptides_ (const PeptideIdentificationList &peptide_ids, double delta_mass, double tolerance) const |
| Count unique peptide sequences matching a delta mass. | |
Static Private Member Functions | |
| static double | gaussian_ (double x, double sigma) |
| Gaussian function for smoothing. | |
| static DeltaMassHistogram | smoothDeltaMassHistogram_ (const DeltaMassHistogram &histogram, double sigma=0.001) |
| Smooth delta mass histogram using Gaussian kernel density estimation. | |
| static DeltaMassHistogram | findPeaksInHistogram_ (const DeltaMassHistogram &histogram, double count_threshold=0.0, double snr=2.0) |
| Find peaks in delta mass histogram based on count threshold and signal-to-noise ratio. | |
Static Private Attributes | |
| static constexpr double | MAX_MOD_MAPPING_TOL_ = 0.02 |
| static constexpr double | DELTA_MASS_ZERO_THRESHOLD_ = 0.05 |
| Delta masses within this threshold of zero are considered unmodified. | |
Utility class for analyzing modification patterns in open search results.
This class provides functionality to analyze delta mass patterns from open search peptide identifications, identify common modifications, and map them to known modifications from the ModificationsDB. Originally extracted from SageAdapter.
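The class can generate two types of statistics tables:
- Delta mass statistics (DeltaMassStatistics): one entry per delta mass bin of the histogram
- PTM statistics (PTMStatistics): one entry per mapped PTM, including residue counts
These tables can be used for modification discovery in open search workflows.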
| struct OpenMS::OpenSearchModificationAnalysis::DeltaMassEntry |
Statistics for a single delta mass bin in the histogram.
| Class Members | ||
|---|---|---|
| int | count = 0 | Number of PSMs with this delta mass. |
| double | delta_mass = 0.0 | Central delta mass value. |
| bool | is_known_modification = false | Whether this maps to a known modification. |
| String | mapped_modification = "" | Name of mapped modification (if any) |
| int | num_charge_states = 0 | Number of different charge states observed. |
| double | percentage = 0.0 | Percentage of total PSMs. |
| int | unique_peptides = 0 | Number of unique peptide sequences. |
| struct OpenMS::OpenSearchModificationAnalysis::DeltaMassStatistics |
Container for delta mass statistics table.
| Class Members | ||
|---|---|---|
| vector< DeltaMassEntry > | entries | All delta mass entries. |
| double | mean_delta_mass = 0.0 | Mean delta mass (excluding unmodified) |
| double | median_delta_mass = 0.0 | Median delta mass (excluding unmodified) |
| int | modified_psms = 0 | PSMs with non-zero delta mass. |
| int | total_psms = 0 | Total number of PSMs analyzed. |
| int | unmodified_psms = 0 | PSMs with ~zero delta mass. |
| struct OpenMS::OpenSearchModificationAnalysis::ModificationPattern |
Stores details of a modification pattern found in the data.
| struct OpenMS::OpenSearchModificationAnalysis::ModificationSummary |
Data structure for modification summary output.
| Class Members | ||
|---|---|---|
| int | count | Modification rate (number of occurrences) |
| vector< double > | masses | Masses associated with the modification. |
| String | name | Modification name. |
| int | num_charge_states | Number of charge states. |
| struct OpenMS::OpenSearchModificationAnalysis::OpenSearchAnalysisResult |
Combined result of open search modification analysis.
| Class Members | ||
|---|---|---|
| DeltaMassStatistics | delta_mass_stats | Delta mass histogram statistics. |
| PTMStatistics | ptm_stats | Mapped PTM statistics. |
| vector< ModificationSummary > | summaries | Legacy modification summaries. |
| struct OpenMS::OpenSearchModificationAnalysis::PTMEntry |
Statistics for a mapped PTM.
| Class Members | ||
|---|---|---|
| int | count = 0 | Number of PSMs with this modification. |
| double | mass_deviation = 0.0 | Deviation between theoretical and observed. |
| String | name | Modification name (e.g., "Oxidation (M)") |
| int | num_charge_states = 0 | Number of different charge states observed. |
| double | observed_mass = 0.0 | Mean observed delta mass. |
| double | percentage = 0.0 | Percentage of total modified PSMs. |
| map< char, int > | residue_counts | Count per amino acid residue. |
| String | target_residues | Target residues for this modification. |
| double | theoretical_mass = 0.0 | Theoretical mass shift from ModificationsDB. |
| int | unique_peptides = 0 | Number of unique peptide sequences. |
| struct OpenMS::OpenSearchModificationAnalysis::PTMStatistics |
Container for PTM statistics table.
| Class Members | ||
|---|---|---|
| vector< PTMEntry > | entries | All PTM entries. |
| int | num_unique_modifications = 0 | Number of distinct modifications found. |
| int | total_modified_psms = 0 | Total PSMs with mapped modifications. |
| int | unknown_modification_psms = 0 | PSMs with unknown modifications. |
| using DeltaMassHistogram = std::map<double, double, FuzzyDoubleComparator> |
Type definitions for delta mass analysis.
| using DeltaMassToChargeCount = std::map<double, int, FuzzyDoubleComparator> |
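Both typedefs key a std::map on a double with an approximate comparator, so nearby delta masses fall into the same bin. The following is a minimal sketch of that idea, not the actual FuzzyDoubleComparator: the name `ApproxLess`, the helper `addObservation`, and the tolerance of 1e-4 are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for FuzzyDoubleComparator (name and eps are assumptions).
// a orders before b only if it is smaller by more than eps, so two keys that
// differ by at most eps are treated as equivalent by std::map.
struct ApproxLess
{
  double eps;
  ApproxLess(double e = 1e-4) : eps(e) { assert(e > 0.0); }
  bool operator()(double a, double b) const { return a < b - eps; }
};

using ApproxHistogram = std::map<double, double, ApproxLess>;

// Add one observation; a mass within eps of an existing key reuses that key.
inline void addObservation(ApproxHistogram& hist, double delta_mass)
{
  hist[delta_mass] += 1.0;
}
```

A tolerance-based comparator of this kind is not a strict weak ordering in general, because equivalence is not transitive (a ~ b and b ~ c do not imply a ~ c). It behaves predictably only when distinct bins are separated by much more than the tolerance.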
| OpenSearchModificationAnalysis ( ) | default |
Default constructor.
| ~OpenSearchModificationAnalysis ( ) | default |
Destructor.
| std::pair< DeltaMassHistogram, DeltaMassToChargeCount > analyzeDeltaMassPatterns | ( | const PeptideIdentificationList & | peptide_ids, |
| bool | use_smoothing = false, |
||
| bool | debug = false |
||
| ) | const |
Analyze delta mass patterns from peptide identifications.
| [in] | peptide_ids | List of peptide identifications containing delta mass information |
| [in] | use_smoothing | Whether to apply smoothing to the delta mass histogram |
| [in] | debug | Enable debug output |
| std::vector< ModificationSummary > analyzeModifications | ( | PeptideIdentificationList & | peptide_ids, |
| double | precursor_mass_tolerance = 5.0, |
||
| bool | precursor_mass_tolerance_unit_ppm = true, |
||
| bool | use_smoothing = false, |
||
| const String & | output_file = "" |
||
| ) | const |
Complete analysis workflow: analyze patterns and map to modifications.
| [in,out] | peptide_ids | List of peptide identifications (modified in-place) |
| [in] | precursor_mass_tolerance | Mass tolerance for mapping |
| [in] | precursor_mass_tolerance_unit_ppm | Whether tolerance is in ppm (true) or Da (false) |
| [in] | use_smoothing | Whether to apply smoothing to delta mass histogram |
| [in] | output_file | Optional file path for writing modification summary table |
| OpenSearchAnalysisResult analyzeModificationsWithStatistics | ( | PeptideIdentificationList & | peptide_ids, |
| double | precursor_mass_tolerance = 5.0, |
||
| bool | precursor_mass_tolerance_unit_ppm = true, |
||
| bool | use_smoothing = false, |
||
| const String & | output_file = "" |
||
| ) | const |
Complete analysis returning structured statistics tables.
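This is the main entry point for fragment index open search modification discovery. It performs a complete analysis workflow and returns an OpenSearchAnalysisResult containing:
- delta_mass_stats: delta mass histogram statistics
- ptm_stats: mapped PTM statistics
- summaries: legacy modification summaries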
| peptide_ids | List of peptide identifications (modified in-place with PTM annotations) |
| precursor_mass_tolerance | Mass tolerance for mapping delta masses to known modifications |
| precursor_mass_tolerance_unit_ppm | Whether tolerance is in ppm (true) or Da (false) |
| use_smoothing | Whether to apply Gaussian smoothing to delta mass histogram |
| output_file | Optional file path for writing CSV/TSV output tables |
| std::map< char, int > analyzeResidueFrequency | ( | const PeptideIdentificationList & | peptide_ids, |
| double | delta_mass, | ||
| double | tolerance = 0.01 |
||
| ) | const |
Analyze which amino acid residues are associated with a delta mass.
For each delta mass, examines the peptide sequences to determine which amino acids are most frequently present, helping to localize modifications.
| peptide_ids | Peptide identifications with DeltaMass meta values |
| delta_mass | Target delta mass to analyze |
| tolerance | Mass tolerance for matching |
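The residue counting described above can be sketched without any OpenMS types. The `Psm` struct and the function `residueFrequency` below are hypothetical stand-ins: the sketch assumes each identification reduces to a plain sequence string plus its delta mass, and that every occurrence of a residue in a matching sequence is counted.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified PSM: unmodified sequence plus observed delta mass.
struct Psm
{
  std::string sequence;
  double delta_mass;
};

// Count residues over all PSMs whose delta mass lies within +/- tolerance
// of the target. Every occurrence in a matching sequence is counted.
std::map<char, int> residueFrequency(const std::vector<Psm>& psms,
                                     double delta_mass,
                                     double tolerance = 0.01)
{
  assert(tolerance >= 0.0);
  std::map<char, int> counts;
  for (const Psm& psm : psms)
  {
    if (std::fabs(psm.delta_mass - delta_mass) > tolerance) continue;
    for (char residue : psm.sequence) ++counts[residue];
  }
  return counts;
}
```

Raw counts favor residues that are common in any peptide, so a real localization step would probably normalize against background amino acid frequencies. That normalization is not part of this sketch.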
| std::map< double, String, FuzzyDoubleComparator > buildModificationMassLookup_ ( ) const | private |
Build lookup table mapping mass differences to known modifications.
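| int countUniquePeptides_ ( const PeptideIdentificationList & peptide_ids, double delta_mass, double tolerance ) const | private |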
Count unique peptide sequences matching a delta mass.
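| DeltaMassHistogram findPeaksInHistogram_ ( const DeltaMassHistogram & histogram, double count_threshold = 0.0, double snr = 2.0 ) | static private |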
Find peaks in delta mass histogram based on count threshold and signal-to-noise ratio.
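One plausible reading of "count threshold and signal-to-noise ratio" is sketched below. It is not the library's implementation. The sketch assumes the noise level is the median bin count, that a peak must be a local maximum among its neighboring bins, and it uses a plain std::map in place of DeltaMassHistogram. The name `findPeaks` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <vector>

// Keep bins that exceed count_threshold, exceed snr times the median count
// (assumed noise estimate), and are not smaller than either neighbor.
std::map<double, double> findPeaks(const std::map<double, double>& hist,
                                   double count_threshold = 0.0,
                                   double snr = 2.0)
{
  assert(snr >= 0.0);
  std::map<double, double> peaks;
  if (hist.empty()) return peaks;

  // Median of all bin counts serves as the noise level.
  std::vector<double> counts;
  for (const auto& bin : hist) counts.push_back(bin.second);
  std::nth_element(counts.begin(), counts.begin() + counts.size() / 2, counts.end());
  const double noise = counts[counts.size() / 2];

  for (auto it = hist.begin(); it != hist.end(); ++it)
  {
    const double c = it->second;
    if (c <= count_threshold) continue;
    if (noise > 0.0 && c / noise <= snr) continue;
    if (it != hist.begin() && std::prev(it)->second > c) continue;
    auto next = std::next(it);
    if (next != hist.end() && next->second > c) continue;
    peaks.insert(*it);
  }
  return peaks;
}
```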
| double gaussian_ ( double x, double sigma ) | static private |
Gaussian function for smoothing.
| DeltaMassStatistics generateDeltaMassStatistics | ( | const DeltaMassHistogram & | histogram, |
| const DeltaMassToChargeCount & | charge_histogram, | ||
| const PeptideIdentificationList & | peptide_ids, | ||
| double | precursor_mass_tolerance = 5.0, |
||
| bool | precursor_mass_tolerance_unit_ppm = true |
||
| ) | const |
Generate delta mass statistics table from histogram data.
Converts the raw delta mass histogram into a structured statistics table with additional computed metrics like percentages and unique peptide counts.
| histogram | Delta mass histogram from analyzeDeltaMassPatterns() |
| charge_histogram | Charge state counts per delta mass |
| peptide_ids | Peptide identifications for computing unique peptide counts |
| precursor_mass_tolerance | Mass tolerance for grouping |
| precursor_mass_tolerance_unit_ppm | Whether tolerance is in ppm |
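The summary fields of DeltaMassStatistics (total, unmodified, and modified PSM counts, plus mean and median over modified PSMs) can be illustrated with a small standalone sketch. The struct `DeltaMassSummary` and the functions below are hypothetical. The 0.05 Da default mirrors DELTA_MASS_ZERO_THRESHOLD_, and averaging the two middle values for an even count is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical mirror of the scalar fields in DeltaMassStatistics.
struct DeltaMassSummary
{
  int total_psms;
  int unmodified_psms;
  int modified_psms;
  double mean_delta_mass;   // over modified PSMs only
  double median_delta_mass; // over modified PSMs only
};

DeltaMassSummary summarizeDeltaMasses(const std::vector<double>& delta_masses,
                                      double zero_threshold = 0.05)
{
  assert(zero_threshold >= 0.0);
  DeltaMassSummary s = {0, 0, 0, 0.0, 0.0};
  std::vector<double> modified;
  for (double dm : delta_masses)
  {
    ++s.total_psms;
    if (std::fabs(dm) < zero_threshold) ++s.unmodified_psms;
    else modified.push_back(dm);
  }
  s.modified_psms = static_cast<int>(modified.size());
  if (modified.empty()) return s;

  double sum = 0.0;
  for (double dm : modified) sum += dm;
  s.mean_delta_mass = sum / modified.size();

  std::sort(modified.begin(), modified.end());
  const std::size_t mid = modified.size() / 2;
  s.median_delta_mass = (modified.size() % 2 == 1)
                          ? modified[mid]
                          : 0.5 * (modified[mid - 1] + modified[mid]);
  return s;
}

// Share of a bin relative to all PSMs, in percent.
double percentageOf(int count, int total)
{
  return total > 0 ? 100.0 * count / total : 0.0;
}
```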
| PTMStatistics generatePTMStatistics | ( | const PeptideIdentificationList & | peptide_ids, |
| double | precursor_mass_tolerance = 5.0, |
||
| bool | precursor_mass_tolerance_unit_ppm = true |
||
| ) | const |
Generate PTM statistics table with residue localization.
Analyzes peptide identifications to generate a table of mapped PTMs including residue-specific localization analysis.
| peptide_ids | Peptide identifications with PTM annotations |
| precursor_mass_tolerance | Mass tolerance for mapping |
| precursor_mass_tolerance_unit_ppm | Whether tolerance is in ppm |
| String getTargetResidues_ ( const String & mod_name ) const | private |
Get target residues for a modification name.
| std::vector< ModificationSummary > mapDeltaMassesToModifications | ( | const DeltaMassHistogram & | delta_mass_histogram, |
| const DeltaMassToChargeCount & | charge_histogram, | ||
| PeptideIdentificationList & | peptide_ids, | ||
| double | precursor_mass_tolerance = 5.0, |
||
| bool | precursor_mass_tolerance_unit_ppm = true, |
||
| const String & | output_file = "" |
||
| ) | const |
Map delta masses to known modifications and annotate peptides.
| [in] | delta_mass_histogram | Histogram of delta masses |
| [in] | charge_histogram | Charge state counts for each delta mass |
| [in,out] | peptide_ids | List of peptide identifications to annotate (modified in-place) |
| [in] | precursor_mass_tolerance | Mass tolerance for mapping |
| [in] | precursor_mass_tolerance_unit_ppm | Whether tolerance is in ppm (true) or Da (false) |
| [in] | output_file | Optional file path for writing modification summary table |
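The mapping step can be illustrated as a nearest-match lookup against a table of known mass shifts. This is a sketch under stated assumptions. The six hard-coded monoisotopic masses stand in for ModificationsDB, the function names are hypothetical, and an empty string signals "no match within tolerance".

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Small hard-coded stand-in for a ModificationsDB lookup (monoisotopic shifts in Da).
std::map<double, std::string> buildLookup()
{
  std::map<double, std::string> lookup;
  lookup[0.984016] = "Deamidated";
  lookup[14.015650] = "Methyl";
  lookup[15.994915] = "Oxidation";
  lookup[42.010565] = "Acetyl";
  lookup[57.021464] = "Carbamidomethyl";
  lookup[79.966331] = "Phospho";
  return lookup;
}

// Return the name of the closest known modification within tolerance_da,
// or an empty string if none is close enough.
std::string matchModification(const std::map<double, std::string>& lookup,
                              double delta_mass,
                              double tolerance_da)
{
  assert(tolerance_da >= 0.0);
  std::string best;
  double best_diff = tolerance_da;
  for (const auto& entry : lookup)
  {
    const double diff = std::fabs(entry.first - delta_mass);
    if (diff <= best_diff)
    {
      best_diff = diff;
      best = entry.second;
    }
  }
  return best;
}
```

A linear scan keeps the sketch short. With a large table, std::map::lower_bound on the delta mass would narrow the search to the two neighboring entries.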
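| DeltaMassHistogram smoothDeltaMassHistogram_ ( const DeltaMassHistogram & histogram, double sigma = 0.001 ) | static private |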
Smooth delta mass histogram using Gaussian kernel density estimation.
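A Gaussian kernel smoother over a sparse histogram might look like the sketch below. It is an assumption about the general technique, not the library's code. It computes a normalized kernel-weighted average of the bin counts (so a flat histogram stays flat), and it uses a plain std::map instead of DeltaMassHistogram. The default sigma of 0.001 matches the documented default. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Normal density with mean 0 and standard deviation sigma.
double gaussian(double x, double sigma)
{
  assert(sigma > 0.0);
  const double pi = 3.14159265358979323846;
  const double z = x / sigma;
  return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Replace each bin count by the Gaussian-weighted average of all bin counts,
// with weights determined by the distance between bin masses.
std::map<double, double> smoothHistogram(const std::map<double, double>& hist,
                                         double sigma = 0.001)
{
  std::map<double, double> smoothed;
  for (const auto& target : hist)
  {
    double weighted_sum = 0.0;
    double weight_total = 0.0;
    for (const auto& source : hist)
    {
      const double w = gaussian(target.first - source.first, sigma);
      weighted_sum += w * source.second;
      weight_total += w;
    }
    // weight_total > 0 because the bin always contributes gaussian(0, sigma).
    smoothed[target.first] = weighted_sum / weight_total;
  }
  return smoothed;
}
```

The double loop is quadratic in the number of bins. For large histograms, restricting the inner loop to bins within a few sigma of the target would give nearly identical results at much lower cost.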
| void writeDeltaMassStatistics | ( | const DeltaMassStatistics & | stats, |
| const String & | output_file | ||
| ) | const |
Write delta mass statistics to a TSV file.
| stats | Delta mass statistics to write |
| output_file | Output file path |
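To illustrate the shape of such a TSV table, the sketch below writes one header row and one row per entry to any output stream. The column names and their order are assumptions derived from the DeltaMassEntry field names, not the file format actually produced. The `Entry` struct and both functions are hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, reduced version of DeltaMassEntry.
struct Entry
{
  double delta_mass;
  int count;
  double percentage;
  int unique_peptides;
  int num_charge_states;
  std::string mapped_modification;
};

// Write a tab-separated table; doubles use fixed notation with 4 decimals.
void writeTsv(std::ostream& os, const std::vector<Entry>& entries)
{
  assert(os.good());
  os << "delta_mass\tcount\tpercentage\tunique_peptides\tnum_charge_states\tmapped_modification\n";
  os << std::fixed << std::setprecision(4);
  for (const Entry& e : entries)
  {
    os << e.delta_mass << '\t' << e.count << '\t' << e.percentage << '\t'
       << e.unique_peptides << '\t' << e.num_charge_states << '\t'
       << e.mapped_modification << '\n';
  }
}

// Convenience wrapper that renders the table into a string.
std::string toTsvString(const std::vector<Entry>& entries)
{
  std::ostringstream os;
  writeTsv(os, entries);
  return os.str();
}
```

Writing to a std::ostream rather than a file path keeps the formatting testable. Passing a std::ofstream opened on the output path would produce the file.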
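| void writeModificationSummary_ ( const std::vector< ModificationSummary > & modifications, const String & output_file ) const | private |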
Write modification summary table to file.
| void writePTMStatistics | ( | const PTMStatistics & | stats, |
| const String & | output_file | ||
| ) | const |
Write PTM statistics to a TSV file.
| stats | PTM statistics to write |
| output_file | Output file path |
| constexpr double DELTA_MASS_ZERO_THRESHOLD_ = 0.05 | static constexpr private |
Delta masses within this threshold of zero are considered unmodified.
| constexpr double MAX_MOD_MAPPING_TOL_ = 0.02 | static constexpr private |
Maximum tolerance (Da) for matching delta masses to known modification masses. Prevents overly broad matching when the precursor tolerance is large (e.g. open search).
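The effect of this cap can be shown with a short sketch that converts a user tolerance to Daltons and clamps it. The function name and the `reference_mass` parameter are assumptions. This page does not state which mass a ppm tolerance is taken relative to, so the sketch leaves that choice to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Convert a tolerance to Da (if given in ppm) and clamp it to max_da.
// reference_mass is only used for the ppm conversion.
double effectiveToleranceDa(double tolerance,
                            bool is_ppm,
                            double reference_mass,
                            double max_da = 0.02)
{
  assert(tolerance >= 0.0 && max_da >= 0.0);
  const double tol_da = is_ppm ? std::fabs(reference_mass) * tolerance * 1e-6
                               : tolerance;
  return std::min(tol_da, max_da);
}
```

With a 5 ppm tolerance and a 2000 Da reference mass the result is 0.01 Da and the cap is inactive. A 0.5 Da tolerance, which is plausible in an open search, is reduced to 0.02 Da.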