OpenMS
OpenSearchModificationAnalysis Class Reference

Utility class for analyzing modification patterns in open search results. More...

#include <OpenMS/ANALYSIS/ID/OpenSearchModificationAnalysis.h>


Classes

struct  DeltaMassEntry
 Statistics for a single delta mass bin in the histogram. More...
 
struct  DeltaMassStatistics
 Container for delta mass statistics table. More...
 
struct  FuzzyDoubleComparator
 Comparator for approximate comparison of double values. More...
 
struct  ModificationPattern
 Stores details of a modification pattern found in the data. More...
 
struct  ModificationSummary
 Data structure for modification summary output. More...
 
struct  OpenSearchAnalysisResult
 Combined result of open search modification analysis. More...
 
struct  PTMEntry
 Statistics for a mapped PTM. More...
 
struct  PTMStatistics
 Container for PTM statistics table. More...
 

Public Types

using DeltaMassHistogram = std::map< double, double, FuzzyDoubleComparator >
 Type definitions for delta mass analysis.
 
using DeltaMassToChargeCount = std::map< double, int, FuzzyDoubleComparator >
 

Public Member Functions

 OpenSearchModificationAnalysis ()=default
 Default constructor.
 
 ~OpenSearchModificationAnalysis ()=default
 Destructor.
 
std::pair< DeltaMassHistogram, DeltaMassToChargeCount > analyzeDeltaMassPatterns (const PeptideIdentificationList &peptide_ids, bool use_smoothing=false, bool debug=false) const
 Analyze delta mass patterns from peptide identifications.
 
std::vector< ModificationSummary > mapDeltaMassesToModifications (const DeltaMassHistogram &delta_mass_histogram, const DeltaMassToChargeCount &charge_histogram, PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, const String &output_file="") const
 Map delta masses to known modifications and annotate peptides.
 
std::vector< ModificationSummary > analyzeModifications (PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, bool use_smoothing=false, const String &output_file="") const
 Complete analysis workflow: analyze patterns and map to modifications.
 
OpenSearchAnalysisResult analyzeModificationsWithStatistics (PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true, bool use_smoothing=false, const String &output_file="") const
 Complete analysis returning structured statistics tables.
 
DeltaMassStatistics generateDeltaMassStatistics (const DeltaMassHistogram &histogram, const DeltaMassToChargeCount &charge_histogram, const PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true) const
 Generate delta mass statistics table from histogram data.
 
PTMStatistics generatePTMStatistics (const PeptideIdentificationList &peptide_ids, double precursor_mass_tolerance=5.0, bool precursor_mass_tolerance_unit_ppm=true) const
 Generate PTM statistics table with residue localization.
 
std::map< char, int > analyzeResidueFrequency (const PeptideIdentificationList &peptide_ids, double delta_mass, double tolerance=0.01) const
 Analyze which amino acid residues are associated with a delta mass.
 
void writeDeltaMassStatistics (const DeltaMassStatistics &stats, const String &output_file) const
 Write delta mass statistics to a TSV file.
 
void writePTMStatistics (const PTMStatistics &stats, const String &output_file) const
 Write PTM statistics to a TSV file.
 

Private Member Functions

void writeModificationSummary_ (const std::vector< ModificationSummary > &modifications, const String &output_file) const
 Write modification summary table to file.
 
std::map< double, String, FuzzyDoubleComparator > buildModificationMassLookup_ () const
 Build lookup table mapping mass differences to known modifications.
 
String getTargetResidues_ (const String &mod_name) const
 Get target residues for a modification name.
 
int countUniquePeptides_ (const PeptideIdentificationList &peptide_ids, double delta_mass, double tolerance) const
 Count unique peptide sequences matching a delta mass.
 

Static Private Member Functions

static double gaussian_ (double x, double sigma)
 Gaussian function for smoothing.
 
static DeltaMassHistogram smoothDeltaMassHistogram_ (const DeltaMassHistogram &histogram, double sigma=0.001)
 Smooth delta mass histogram using Gaussian kernel density estimation.
 
static DeltaMassHistogram findPeaksInHistogram_ (const DeltaMassHistogram &histogram, double count_threshold=0.0, double snr=2.0)
 Find peaks in delta mass histogram based on count threshold and signal-to-noise ratio.
 

Static Private Attributes

static constexpr double MAX_MOD_MAPPING_TOL_ = 0.02
 
static constexpr double DELTA_MASS_ZERO_THRESHOLD_ = 0.05
 Delta masses within this threshold of zero are considered unmodified.
 

Detailed Description

Utility class for analyzing modification patterns in open search results.

This class provides functionality to analyze delta mass patterns from open search peptide identifications, identify common modifications, and map them to known modifications from the ModificationsDB. Originally extracted from SageAdapter.

The class can generate two types of statistics tables:

  1. PTM Statistics Table - Shows known modifications mapped to delta masses
  2. Delta Mass Statistics Table - Shows raw delta mass distribution analysis

These tables can be used for modification discovery in open search workflows.


Class Documentation

◆ OpenMS::OpenSearchModificationAnalysis::DeltaMassEntry

struct OpenMS::OpenSearchModificationAnalysis::DeltaMassEntry

Statistics for a single delta mass bin in the histogram.

Class Members
int count = 0 Number of PSMs with this delta mass.
double delta_mass = 0.0 Central delta mass value.
bool is_known_modification = false Whether this maps to a known modification.
String mapped_modification = "" Name of mapped modification (if any)
int num_charge_states = 0 Number of different charge states observed.
double percentage = 0.0 Percentage of total PSMs.
int unique_peptides = 0 Number of unique peptide sequences.

◆ OpenMS::OpenSearchModificationAnalysis::DeltaMassStatistics

struct OpenMS::OpenSearchModificationAnalysis::DeltaMassStatistics

Container for delta mass statistics table.

Class Members
vector< DeltaMassEntry > entries All delta mass entries.
double mean_delta_mass = 0.0 Mean delta mass (excluding unmodified)
double median_delta_mass = 0.0 Median delta mass (excluding unmodified)
int modified_psms = 0 PSMs with non-zero delta mass.
int total_psms = 0 Total number of PSMs analyzed.
int unmodified_psms = 0 PSMs with ~zero delta mass.

◆ OpenMS::OpenSearchModificationAnalysis::ModificationPattern

struct OpenMS::OpenSearchModificationAnalysis::ModificationPattern

Stores details of a modification pattern found in the data.

Class Members
double count = 0.0 Number of peptides with this modification.
vector< double > masses Masses associated with the modification.
int num_charge_states = 0 Number of different charge states observed.

◆ OpenMS::OpenSearchModificationAnalysis::ModificationSummary

struct OpenMS::OpenSearchModificationAnalysis::ModificationSummary

Data structure for modification summary output.

Class Members
int count Modification rate (number of occurrences)
vector< double > masses Masses associated with the modification.
String name Modification name.
int num_charge_states Number of charge states.

◆ OpenMS::OpenSearchModificationAnalysis::OpenSearchAnalysisResult

struct OpenMS::OpenSearchModificationAnalysis::OpenSearchAnalysisResult

Combined result of open search modification analysis.

Class Members
DeltaMassStatistics delta_mass_stats Delta mass histogram statistics.
PTMStatistics ptm_stats Mapped PTM statistics.
vector< ModificationSummary > summaries Legacy modification summaries.

◆ OpenMS::OpenSearchModificationAnalysis::PTMEntry

struct OpenMS::OpenSearchModificationAnalysis::PTMEntry

Statistics for a mapped PTM.

Class Members
int count = 0 Number of PSMs with this modification.
double mass_deviation = 0.0 Deviation between theoretical and observed.
String name Modification name (e.g., "Oxidation (M)")
int num_charge_states = 0 Number of different charge states observed.
double observed_mass = 0.0 Mean observed delta mass.
double percentage = 0.0 Percentage of total modified PSMs.
map< char, int > residue_counts Count per amino acid residue.
String target_residues Target residues for this modification.
double theoretical_mass = 0.0 Theoretical mass shift from ModificationsDB.
int unique_peptides = 0 Number of unique peptide sequences.

◆ OpenMS::OpenSearchModificationAnalysis::PTMStatistics

struct OpenMS::OpenSearchModificationAnalysis::PTMStatistics

Container for PTM statistics table.

Class Members
vector< PTMEntry > entries All PTM entries.
int num_unique_modifications = 0 Number of distinct modifications found.
int total_modified_psms = 0 Total PSMs with mapped modifications.
int unknown_modification_psms = 0 PSMs with unknown modifications.

Member Typedef Documentation

◆ DeltaMassHistogram

using DeltaMassHistogram = std::map<double, double, FuzzyDoubleComparator>

Type definitions for delta mass analysis.

◆ DeltaMassToChargeCount

using DeltaMassToChargeCount = std::map<double, int, FuzzyDoubleComparator>
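
Both typedefs key a std::map by delta mass with a tolerance-aware comparator, so nearly identical masses fall into the same bin. The following is a minimal sketch of that idea, not the OpenMS implementation: `FuzzyLess`, `Histogram`, `addObservation`, and the epsilon values are hypothetical stand-ins, and the real epsilon of FuzzyDoubleComparator is not stated on this page.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for FuzzyDoubleComparator (epsilon is an assumed value).
struct FuzzyLess
{
  double epsilon;

  explicit FuzzyLess(double eps = 1e-9) : epsilon(eps)
  {
    assert(eps >= 0.0);
  }

  // a orders before b only if it is smaller by more than epsilon,
  // so keys closer than epsilon compare as equivalent.
  bool operator()(double a, double b) const
  {
    return (b - a) > epsilon;
  }
};

// Hypothetical analogue of DeltaMassHistogram.
using Histogram = std::map<double, double, FuzzyLess>;

// Adds one observation; a mass within epsilon of an existing key
// increments that key's bin instead of creating a new one.
inline void addObservation(Histogram& histogram, double delta_mass)
{
  histogram[delta_mass] += 1.0;
}
```

Note that a tolerance-based comparator is not a strict weak ordering in general (a ≈ b and b ≈ c does not imply a ≈ c), so a map like this only behaves predictably when distinct bins are separated by clearly more than epsilon.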

Constructor & Destructor Documentation

◆ OpenSearchModificationAnalysis()

Default constructor.

◆ ~OpenSearchModificationAnalysis()

Destructor.

Member Function Documentation

◆ analyzeDeltaMassPatterns()

std::pair< DeltaMassHistogram, DeltaMassToChargeCount > analyzeDeltaMassPatterns ( const PeptideIdentificationList &  peptide_ids,
bool  use_smoothing = false,
bool  debug = false 
) const

Analyze delta mass patterns from peptide identifications.

Parameters
[in] peptide_ids  List of peptide identifications containing delta mass information
[in] use_smoothing  Whether to apply smoothing to the delta mass histogram
[in] debug  Enable debug output
Returns
Pair containing delta mass histogram and charge state counts

◆ analyzeModifications()

std::vector< ModificationSummary > analyzeModifications ( PeptideIdentificationList &  peptide_ids,
double  precursor_mass_tolerance = 5.0,
bool  precursor_mass_tolerance_unit_ppm = true,
bool  use_smoothing = false,
const String &  output_file = "" 
) const

Complete analysis workflow: analyze patterns and map to modifications.

Parameters
[in,out] peptide_ids  List of peptide identifications (modified in-place)
[in] precursor_mass_tolerance  Mass tolerance for mapping
[in] precursor_mass_tolerance_unit_ppm  Whether tolerance is in ppm (true) or Da (false)
[in] use_smoothing  Whether to apply smoothing to delta mass histogram
[in] output_file  Optional file path for writing modification summary table
Returns
List of modification summaries found

◆ analyzeModificationsWithStatistics()

OpenSearchAnalysisResult analyzeModificationsWithStatistics ( PeptideIdentificationList &  peptide_ids,
double  precursor_mass_tolerance = 5.0,
bool  precursor_mass_tolerance_unit_ppm = true,
bool  use_smoothing = false,
const String &  output_file = "" 
) const

Complete analysis returning structured statistics tables.

This is the main entry point for fragment index open search modification discovery. It performs a complete analysis workflow and returns structured tables containing:

  • Delta mass statistics (histogram of mass shifts)
  • PTM statistics (mapped modifications with residue localization)
Parameters
peptide_ids  List of peptide identifications (modified in-place with PTM annotations)
precursor_mass_tolerance  Mass tolerance for mapping delta masses to known modifications
precursor_mass_tolerance_unit_ppm  Whether tolerance is in ppm (true) or Da (false)
use_smoothing  Whether to apply Gaussian smoothing to delta mass histogram
output_file  Optional file path for writing CSV/TSV output tables
Returns
OpenSearchAnalysisResult containing delta mass and PTM statistics tables

◆ analyzeResidueFrequency()

std::map< char, int > analyzeResidueFrequency ( const PeptideIdentificationList &  peptide_ids,
double  delta_mass,
double  tolerance = 0.01 
) const

Analyze which amino acid residues are associated with a delta mass.

For the given delta mass, examines the matching peptide sequences to determine which amino acids are most frequently present, helping to localize the modification.

Parameters
peptide_ids  Peptide identifications with DeltaMass meta values
delta_mass  Target delta mass to analyze
tolerance  Mass tolerance for matching
Returns
Map from amino acid character to occurrence count
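
As an illustration of the counting described above, the sketch below tallies residues over all sequences whose delta mass lies within the tolerance of the target. It is a simplified stand-in under stated assumptions: the `Psm` record and `residueFrequency` function are hypothetical, real input would be a PeptideIdentificationList with DeltaMass meta values, and whether the actual method counts every occurrence or each residue once per peptide is not specified on this page.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical minimal PSM record: plain sequence plus observed delta mass.
struct Psm
{
  std::string sequence;
  double delta_mass;
};

// Counts every residue occurrence in sequences whose delta mass is
// within `tolerance` (Da) of `delta_mass`.
inline std::map<char, int> residueFrequency(const std::vector<Psm>& psms,
                                            double delta_mass,
                                            double tolerance = 0.01)
{
  assert(tolerance >= 0.0);
  std::map<char, int> counts;
  for (const Psm& psm : psms)
  {
    if (std::fabs(psm.delta_mass - delta_mass) > tolerance)
    {
      continue; // this PSM belongs to a different delta mass
    }
    for (char residue : psm.sequence)
    {
      ++counts[residue];
    }
  }
  return counts;
}
```

A residue that appears in nearly every matching sequence (for example M for a shift near +15.995 Da) is a plausible localization candidate, although raw counts are not corrected for background amino acid frequency.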

◆ buildModificationMassLookup_()

std::map< double, String, FuzzyDoubleComparator > buildModificationMassLookup_ ( ) const
private

Build lookup table mapping mass differences to known modifications.

◆ countUniquePeptides_()

int countUniquePeptides_ ( const PeptideIdentificationList &  peptide_ids,
double  delta_mass,
double  tolerance 
) const
private

Count unique peptide sequences matching a delta mass.

◆ findPeaksInHistogram_()

static DeltaMassHistogram findPeaksInHistogram_ ( const DeltaMassHistogram &  histogram,
double  count_threshold = 0.0,
double  snr = 2.0 
)
static private

Find peaks in delta mass histogram based on count threshold and signal-to-noise ratio.
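
The page does not say how the noise level or the peak criterion is defined. One plausible reading, sketched below purely as an assumption, keeps bins that are local maxima, meet the count threshold, and exceed `snr` times the median bin count. The function name `findPeaks`, the use of a plain std::map, and the median noise estimate are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Keeps bins that are local maxima, reach count_threshold, and are at
// least snr times the (upper) median bin count. Assumed criteria only.
inline std::map<double, double> findPeaks(const std::map<double, double>& histogram,
                                          double count_threshold = 0.0,
                                          double snr = 2.0)
{
  assert(snr >= 0.0);
  std::map<double, double> peaks;
  if (histogram.empty())
  {
    return peaks;
  }

  // Noise estimate: median of all bin counts (assumption).
  std::vector<double> counts;
  for (const auto& bin : histogram)
  {
    counts.push_back(bin.second);
  }
  const std::size_t mid = counts.size() / 2;
  std::nth_element(counts.begin(), counts.begin() + mid, counts.end());
  const double noise = counts[mid];

  for (auto it = histogram.begin(); it != histogram.end(); ++it)
  {
    const double c = it->second;
    const auto next = std::next(it);
    const bool above_left = (it == histogram.begin()) || c > std::prev(it)->second;
    const bool above_right = (next == histogram.end()) || c >= next->second;
    if (above_left && above_right && c >= count_threshold && c >= snr * noise)
    {
      peaks.insert(*it);
    }
  }
  return peaks;
}
```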

◆ gaussian_()

static double gaussian_ ( double  x,
double  sigma 
)
static private

Gaussian function for smoothing.

◆ generateDeltaMassStatistics()

DeltaMassStatistics generateDeltaMassStatistics ( const DeltaMassHistogram &  histogram,
const DeltaMassToChargeCount &  charge_histogram,
const PeptideIdentificationList &  peptide_ids,
double  precursor_mass_tolerance = 5.0,
bool  precursor_mass_tolerance_unit_ppm = true 
) const

Generate delta mass statistics table from histogram data.

Converts the raw delta mass histogram into a structured statistics table with additional computed metrics like percentages and unique peptide counts.

Parameters
histogram  Delta mass histogram from analyzeDeltaMassPatterns()
charge_histogram  Charge state counts per delta mass
peptide_ids  Peptide identifications for computing unique peptide counts
precursor_mass_tolerance  Mass tolerance for grouping
precursor_mass_tolerance_unit_ppm  Whether tolerance is in ppm
Returns
DeltaMassStatistics table with all entries and summary statistics
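
The summary fields of DeltaMassStatistics (total, modified, and unmodified PSM counts, plus mean and median delta mass excluding unmodified PSMs) can be illustrated with a small sketch. It assumes a flat list of delta masses as input and a zero threshold of 0.05 Da, matching the documented DELTA_MASS_ZERO_THRESHOLD_; the `Summary` struct, the `summarize` function, and the inclusive boundary comparison are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical analogue of the DeltaMassStatistics summary fields.
struct Summary
{
  int total_psms = 0;
  int unmodified_psms = 0;
  int modified_psms = 0;
  double mean_delta_mass = 0.0;   // excludes unmodified PSMs
  double median_delta_mass = 0.0; // excludes unmodified PSMs
};

inline Summary summarize(const std::vector<double>& delta_masses,
                         double zero_threshold = 0.05)
{
  assert(zero_threshold >= 0.0);
  Summary s;
  s.total_psms = static_cast<int>(delta_masses.size());

  // Split PSMs into unmodified (|delta| within threshold) and modified.
  std::vector<double> modified;
  for (double dm : delta_masses)
  {
    if (std::fabs(dm) <= zero_threshold)
    {
      ++s.unmodified_psms;
    }
    else
    {
      modified.push_back(dm);
    }
  }
  s.modified_psms = static_cast<int>(modified.size());
  if (modified.empty())
  {
    return s;
  }

  s.mean_delta_mass = std::accumulate(modified.begin(), modified.end(), 0.0) / modified.size();

  std::sort(modified.begin(), modified.end());
  const std::size_t n = modified.size();
  s.median_delta_mass = (n % 2 == 1) ? modified[n / 2]
                                     : 0.5 * (modified[n / 2 - 1] + modified[n / 2]);
  return s;
}
```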

◆ generatePTMStatistics()

PTMStatistics generatePTMStatistics ( const PeptideIdentificationList &  peptide_ids,
double  precursor_mass_tolerance = 5.0,
bool  precursor_mass_tolerance_unit_ppm = true 
) const

Generate PTM statistics table with residue localization.

Analyzes peptide identifications to generate a table of mapped PTMs including residue-specific localization analysis.

Parameters
peptide_ids  Peptide identifications with PTM annotations
precursor_mass_tolerance  Mass tolerance for mapping
precursor_mass_tolerance_unit_ppm  Whether tolerance is in ppm
Returns
PTMStatistics table with all mapped modifications

◆ getTargetResidues_()

String getTargetResidues_ ( const String &  mod_name) const
private

Get target residues for a modification name.

◆ mapDeltaMassesToModifications()

std::vector< ModificationSummary > mapDeltaMassesToModifications ( const DeltaMassHistogram &  delta_mass_histogram,
const DeltaMassToChargeCount &  charge_histogram,
PeptideIdentificationList &  peptide_ids,
double  precursor_mass_tolerance = 5.0,
bool  precursor_mass_tolerance_unit_ppm = true,
const String &  output_file = "" 
) const

Map delta masses to known modifications and annotate peptides.

Parameters
[in] delta_mass_histogram  Histogram of delta masses
[in] charge_histogram  Charge state counts for each delta mass
[in,out] peptide_ids  List of peptide identifications to annotate (modified in-place)
[in] precursor_mass_tolerance  Mass tolerance for mapping
[in] precursor_mass_tolerance_unit_ppm  Whether tolerance is in ppm (true) or Da (false)
[in] output_file  Optional file path for writing modification summary table
Returns
List of modification summaries found

◆ smoothDeltaMassHistogram_()

static DeltaMassHistogram smoothDeltaMassHistogram_ ( const DeltaMassHistogram &  histogram,
double  sigma = 0.001 
)
static private

Smooth delta mass histogram using Gaussian kernel density estimation.
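
A minimal sketch of Gaussian kernel smoothing over a delta mass histogram is given below. It is not the OpenMS code: `gaussianKernel` and `smoothHistogram` are hypothetical names, a plain std::map replaces DeltaMassHistogram, and normalizing by the total kernel weight is an assumption about how the smoothed counts are scaled.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Unnormalized Gaussian kernel centered at zero.
inline double gaussianKernel(double x, double sigma)
{
  assert(sigma > 0.0);
  return std::exp(-0.5 * (x / sigma) * (x / sigma));
}

// Replaces each bin's count with the kernel-weighted average of all bins.
// Quadratic in the number of bins, which is acceptable for a sketch.
inline std::map<double, double> smoothHistogram(const std::map<double, double>& histogram,
                                                double sigma = 0.001)
{
  std::map<double, double> smoothed;
  for (const auto& center : histogram)
  {
    double weighted_sum = 0.0;
    double weight_total = 0.0; // at least 1.0: the bin's own weight
    for (const auto& bin : histogram)
    {
      const double w = gaussianKernel(bin.first - center.first, sigma);
      weighted_sum += w * bin.second;
      weight_total += w;
    }
    smoothed[center.first] = weighted_sum / weight_total;
  }
  return smoothed;
}
```

With the default sigma of 0.001, bins more than a few mDa apart contribute essentially nothing to each other, so well-separated modification peaks keep their counts while jitter within a peak is averaged out.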

◆ writeDeltaMassStatistics()

void writeDeltaMassStatistics ( const DeltaMassStatistics &  stats,
const String &  output_file 
) const

Write delta mass statistics to a TSV file.

Parameters
stats  Delta mass statistics to write
output_file  Output file path

◆ writeModificationSummary_()

void writeModificationSummary_ ( const std::vector< ModificationSummary > &  modifications,
const String &  output_file 
) const
private

Write modification summary table to file.

◆ writePTMStatistics()

void writePTMStatistics ( const PTMStatistics &  stats,
const String &  output_file 
) const

Write PTM statistics to a TSV file.

Parameters
stats  PTM statistics to write
output_file  Output file path

Member Data Documentation

◆ DELTA_MASS_ZERO_THRESHOLD_

constexpr double DELTA_MASS_ZERO_THRESHOLD_ = 0.05
static constexpr private

Delta masses within this threshold of zero are considered unmodified.

◆ MAX_MOD_MAPPING_TOL_

constexpr double MAX_MOD_MAPPING_TOL_ = 0.02
static constexpr private

Maximum tolerance (Da) for matching delta masses to known modification masses. Prevents overly broad matching when the precursor tolerance is large (e.g. open search).
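
The capping behaviour described above can be sketched as follows. This is an assumption about how the cap might be applied, not the actual implementation: `effectiveMappingTolerance`, `kMaxMappingTolDa`, and the `reference_mass` parameter used for the ppm conversion are hypothetical, and only the 0.02 Da value comes from this page.

```cpp
#include <algorithm>
#include <cassert>

// Assumed cap in Da; mirrors the documented value of MAX_MOD_MAPPING_TOL_.
constexpr double kMaxMappingTolDa = 0.02;

// Converts the user tolerance to Da (ppm is taken relative to a
// hypothetical reference mass) and clamps it to the cap.
inline double effectiveMappingTolerance(double tolerance,
                                        bool unit_ppm,
                                        double reference_mass)
{
  assert(tolerance >= 0.0 && reference_mass > 0.0);
  const double tol_da = unit_ppm ? tolerance * reference_mass * 1e-6 : tolerance;
  return std::min(tol_da, kMaxMappingTolDa);
}
```

For example, 5 ppm at a reference mass of 1000 Da is 0.005 Da and passes through unchanged, whereas an open search window of 500 Da would be clamped to 0.02 Da for the purpose of matching known modification masses.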