OpenMS
CometModification Struct Reference

Helper struct that represents one Comet variable_modXX entry and supports merging compatible entries.

#include <OpenMS/ANALYSIS/ID/CometModification.h>

Public Member Functions

 CometModification ()=default
 Default constructor; leaves every member at its in-class default (mass == 0, term_distance == -1, max_mods_per_peptide == 5).
 
 CometModification (const ResidueModification *mod, int binary_grp, int max_mods)
 Build an entry from an OpenMS ResidueModification.
 
bool isMergeableWith (const CometModification &other) const
 Whether two entries can be combined into a single Comet variable_modXX line.
 
void merge (const CometModification &other)
 Merge another compatible entry into this one.
 
std::string toCometString (Size index) const
 Render this entry as a Comet variable_modXX parameter line.
 

Static Public Member Functions

static std::vector< CometModification > mergeModifications (const std::vector< CometModification > &mods)
 Greedy first-fit merge of a vector of entries.
 

Public Attributes

double mass {0.0}
 Modification mass difference (in Da).
 
std::string residues
 Residue(s) this modification applies to (e.g. "K", "KR", "n", "nKR").
 
int binary_group {0}
 Comet binary modification group (used e.g. for SILAC).
 
int max_mods_per_peptide {5}
 Maximum number of occurrences of this modification per peptide.
 
int term_distance {-1}
 Terminal distance constraint: -1 = no constraint, 0 = terminal only.
 
int nc_term {0}
 Terminal specificity: 0 = protein N-term, 1 = protein C-term, 2 = peptide N-term, 3 = peptide C-term. Only meaningful when term_distance == 0.
 
bool required {false}
 Whether this modification is required to appear in every peptide.
 

Static Public Attributes

static constexpr double MASS_TOLERANCE = 1e-6
 Absolute tolerance used by isMergeableWith when comparing the mass field.
 

Detailed Description

Helper struct that represents one Comet variable_modXX entry and supports merging compatible entries.

Comet supports specifying multiple residues in a single modification entry (e.g. "KR" for lysine and arginine), which significantly improves search performance. This struct mirrors the variable_modXX fields and bundles the rules used to collapse compatible entries.

Merging rules (verified against the .cpp implementation):

  • Two entries must have the same mass (within MASS_TOLERANCE) and the same binary_group.
  • Plain amino-acid mods (term_distance == -1) can merge with each other (e.g. "K" + "R" -> "KR").
  • Peptide N-term (nc_term == 2, term_distance == 0) and amino acids can merge (e.g. "n" + "RST" -> "nRST").
  • Peptide C-term (nc_term == 3, term_distance == 0) and amino acids can merge (e.g. "c" + "KR" -> "cKR").
  • Protein terminal mods (nc_term == 0 or 1 with term_distance == 0) only merge with another protein terminal mod of the same kind.
  • Peptide N-term and peptide C-term cannot merge with each other.

When a terminal and an amino-acid mod merge, term_distance is set to -1 and nc_term to 0 because the terminal specificity is encoded by the 'n' / 'c' character in the residue string (per Comet convention, e.g. "42.010565 nK 0 3 -1 0 0 0.0").
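
The rules above can be read as a single compatibility predicate. The following is a minimal sketch of that reading, not the OpenMS implementation: ModSketch, kMassTolerance and mergeableSketch are hypothetical names, and the sketch only looks at term_distance / nc_term, so it does not handle entries whose terminal information has already moved into the residue string by an earlier merge.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical stand-in holding only the fields the merging rules refer to.
struct ModSketch
{
  double mass{0.0};
  std::string residues;
  int binary_group{0};
  int term_distance{-1}; // -1 = no constraint, 0 = terminal only
  int nc_term{0};        // 0/1 = protein N/C-term, 2/3 = peptide N/C-term
};

constexpr double kMassTolerance = 1e-6; // mirrors MASS_TOLERANCE

bool isTerminal(const ModSketch& m) { return m.term_distance == 0; }
bool isProteinTerminal(const ModSketch& m) { return isTerminal(m) && m.nc_term <= 1; }

// Assumed reading of the documented rules (illustration only).
bool mergeableSketch(const ModSketch& a, const ModSketch& b)
{
  // Same mass (within tolerance) and same binary group are always required.
  if (std::fabs(a.mass - b.mass) > kMassTolerance) return false;
  if (a.binary_group != b.binary_group) return false;

  // Protein-terminal mods only merge with the same protein-terminal kind.
  const bool pa = isProteinTerminal(a);
  const bool pb = isProteinTerminal(b);
  if (pa || pb) return pa && pb && a.nc_term == b.nc_term;

  // Two peptide-terminal mods must sit on the same terminus.
  if (isTerminal(a) && isTerminal(b)) return a.nc_term == b.nc_term;

  // Plain residues merge with each other and with one peptide terminus.
  return true;
}
```

With this reading, "K" and "R" at the same mass are compatible, as are "n" (peptide N-term) and "K", while "n" and "c" are not.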

Constructor & Destructor Documentation

◆ CometModification() [1/2]

CometModification ( )
default

Default constructor; leaves every member at its in-class default (mass == 0, term_distance == -1, max_mods_per_peptide == 5).

◆ CometModification() [2/2]

CometModification ( const ResidueModification * mod,
int  binary_grp,
int  max_mods 
)

Build an entry from an OpenMS ResidueModification.

Copies the modification's diff-monoisotopic mass into mass and the residue origin character into residues, then sets term_distance and nc_term according to mod's term specificity:

  • PEPTIDE_C_TERM -> term_distance = 0, nc_term = 3.
  • PEPTIDE_N_TERM -> term_distance = 0, nc_term = 2.
  • PROTEIN_N_TERM -> term_distance = 0, nc_term = 0.
  • PROTEIN_C_TERM -> term_distance = 0, nc_term = 1.
  • Anywhere -> term_distance and nc_term are left at their defaults.

Terminal mods whose origin is 'X' are translated into a residue string of "n" or "c" so Comet recognises them; non-'X' terminal mods keep the original residue character.

Parameters
[in] mod         Source OpenMS modification.
[in] binary_grp  Value for binary_group.
[in] max_mods    Value for max_mods_per_peptide.
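
The mapping from term specificity to term_distance / nc_term can be sketched without OpenMS as follows. TermSpec, EntrySketch and fromSpecificity are hypothetical names introduced for illustration, and the choice of "n" for N-terminal and "c" for C-terminal 'X' origins is an assumption based on the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum standing in for the term specificity of a ResidueModification.
enum class TermSpec { Anywhere, PeptideNTerm, PeptideCTerm, ProteinNTerm, ProteinCTerm };

// Hypothetical stand-in for the fields the constructor fills from the specificity.
struct EntrySketch
{
  std::string residues;
  int term_distance{-1};
  int nc_term{0};
};

EntrySketch fromSpecificity(TermSpec spec, char origin)
{
  EntrySketch e;
  e.residues = std::string(1, origin);
  switch (spec)
  {
    case TermSpec::ProteinNTerm: e.term_distance = 0; e.nc_term = 0; break;
    case TermSpec::ProteinCTerm: e.term_distance = 0; e.nc_term = 1; break;
    case TermSpec::PeptideNTerm: e.term_distance = 0; e.nc_term = 2; break;
    case TermSpec::PeptideCTerm: e.term_distance = 0; e.nc_term = 3; break;
    case TermSpec::Anywhere: break; // keep defaults: -1 / 0
  }
  // Terminal mods with an unspecific origin 'X' become "n" or "c" (assumed mapping).
  if (e.term_distance == 0 && origin == 'X')
  {
    const bool n_side = (e.nc_term == 0 || e.nc_term == 2);
    e.residues = n_side ? "n" : "c";
  }
  return e;
}
```

For example, a peptide N-terminal mod with origin 'X' yields residues "n", whereas a peptide C-terminal mod with origin 'K' keeps "K".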

Member Function Documentation

◆ isMergeableWith()

bool isMergeableWith ( const CometModification & other) const

Whether two entries can be combined into a single Comet variable_modXX line.

Implements the merging rules listed in the detailed description of this struct. Returns false when the masses differ by more than MASS_TOLERANCE, when the binary_group values differ, when only one of the two is a protein-terminal mod (or both are, but of different kinds), or when one is a peptide N-term mod and the other a peptide C-term mod.

Parameters
[in] other  Candidate merge partner.
Returns
true if the entries can be merged.

◆ merge()

void merge ( const CometModification & other)

Merge another compatible entry into this one.

Unions other's residue characters into residues (skipping duplicates), keeps the larger max_mods_per_peptide, and ORs the required flag. When a terminal mod is merged with a non-terminal mod the terminal information moves into the residue string and term_distance / nc_term are reset to -1 / 0; same-kind terminal merges leave both fields unchanged.

Behaviour is only defined for pairs that isMergeableWith accepts; the caller is responsible for that check.

Parameters
[in] other  Compatible entry to absorb.
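
A minimal sketch of the merge step described above, assuming the pair has already passed the compatibility check. MergeSketch and mergeInto are hypothetical names; appending new residue characters in the order they appear in other is an assumption, since the description does not specify the resulting character order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical stand-in for the fields that merge() modifies.
struct MergeSketch
{
  std::string residues;
  int max_mods_per_peptide{5};
  int term_distance{-1};
  int nc_term{0};
  bool required{false};
};

// Absorb `other` into `self`; the caller must have checked compatibility first.
void mergeInto(MergeSketch& self, const MergeSketch& other)
{
  // Union of residue characters, skipping duplicates.
  for (char c : other.residues)
  {
    if (self.residues.find(c) == std::string::npos) self.residues.push_back(c);
  }
  // Keep the larger per-peptide limit and OR the required flag.
  self.max_mods_per_peptide = std::max(self.max_mods_per_peptide, other.max_mods_per_peptide);
  self.required = self.required || other.required;

  // Terminal + non-terminal: the 'n' / 'c' residue character now carries the
  // terminal information, so the numeric fields are reset to -1 / 0.
  const bool self_terminal = (self.term_distance == 0);
  const bool other_terminal = (other.term_distance == 0);
  if (self_terminal != other_terminal)
  {
    self.term_distance = -1;
    self.nc_term = 0;
  }
}
```

Merging "RST" into a peptide N-term entry "n" gives residues "nRST" with term_distance == -1 and nc_term == 0, matching the example in the merging rules.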

◆ mergeModifications()

static std::vector< CometModification > mergeModifications ( const std::vector< CometModification > &  mods)
static

Greedy first-fit merge of a vector of entries.

Walks mods in input order; each entry is merged into the first compatible existing result entry (via isMergeableWith and merge), or appended verbatim if no compatible partner is found. The relative order of the kept entries follows the input order; no entry is dropped.

Parameters
[in] mods  Input modifications.
Returns
Merged modifications.
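
The greedy first-fit strategy can be illustrated with a reduced entry type. This is a sketch under simplifying assumptions: Item, compatible, absorb and greedyFirstFit are hypothetical names, and compatibility is reduced to a mass comparison rather than the full rule set.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical reduced entry: only mass and residues.
struct Item
{
  double mass{0.0};
  std::string residues;
};

// Simplified compatibility check (mass only); the real rules are stricter.
bool compatible(const Item& a, const Item& b)
{
  return std::fabs(a.mass - b.mass) <= 1e-6;
}

// Union the residue characters of `from` into `into`.
void absorb(Item& into, const Item& from)
{
  for (char c : from.residues)
  {
    if (into.residues.find(c) == std::string::npos) into.residues.push_back(c);
  }
}

// Walk the input in order; merge into the first compatible result entry,
// otherwise append the entry unchanged.
std::vector<Item> greedyFirstFit(const std::vector<Item>& mods)
{
  std::vector<Item> result;
  for (const Item& m : mods)
  {
    bool merged = false;
    for (Item& r : result)
    {
      if (compatible(r, m))
      {
        absorb(r, m);
        merged = true;
        break; // first fit: stop at the first compatible partner
      }
    }
    if (!merged) result.push_back(m);
  }
  return result;
}
```

Given entries for "M" at one mass followed by "S", "T" and "Y" at another, the sketch returns two entries, "M" and "STY", in input order.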

◆ toCometString()

std::string toCometString ( Size  index) const

Render this entry as a Comet variable_modXX parameter line.

The output is one whitespace-separated line of the form

variable_mod<II> = <mass> <residues> <binary_group> <max_mods_per_peptide> <term_distance> <nc_term> <required> 0.0

with the II suffix being index zero-padded to two digits. The last column is hard-coded to "0.0" (neutral loss; OpenMS does not currently surface it). Numeric formatting uses std::ostream default precision (six significant digits).

Parameters
[in] index  1-based modification index used to form the variable_modXX key.
Returns
The fully-formed variable_modXX = ... line.
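
The line format described above can be reproduced with standard stream manipulators. This is a sketch, not the OpenMS code: formatLine is a hypothetical free function, and both the single-space separators and the rendering of required as 0 / 1 are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Build one "variable_modXX = ..." line from the individual fields.
std::string formatLine(std::size_t index, double mass, const std::string& residues,
                       int binary_group, int max_mods, int term_distance,
                       int nc_term, bool required)
{
  std::ostringstream os;
  // std::setw applies only to the next insertion, so only the index is zero-padded.
  os << "variable_mod" << std::setw(2) << std::setfill('0') << index
     << " = " << mass          // default stream precision: six significant digits
     << ' ' << residues
     << ' ' << binary_group
     << ' ' << max_mods
     << ' ' << term_distance
     << ' ' << nc_term
     << ' ' << (required ? 1 : 0)
     << " 0.0";                // neutral-loss column, fixed
  return os.str();
}
```

With default precision a mass of 15.994915 is written as 15.9949, so callers that need more digits would have to raise the stream precision explicitly.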

Member Data Documentation

◆ binary_group

int binary_group {0}

Comet binary modification group (used e.g. for SILAC).

◆ mass

double mass {0.0}

Modification mass difference (in Da).

◆ MASS_TOLERANCE

constexpr double MASS_TOLERANCE = 1e-6
static constexpr

Absolute tolerance used by isMergeableWith when comparing the mass field.

◆ max_mods_per_peptide

int max_mods_per_peptide {5}

Maximum number of occurrences of this modification per peptide.

◆ nc_term

int nc_term {0}

Terminal specificity: 0 = protein N-term, 1 = protein C-term, 2 = peptide N-term, 3 = peptide C-term. Only meaningful when term_distance == 0.

◆ required

bool required {false}

Whether this modification is required to appear in every peptide.

◆ residues

std::string residues

Residue(s) this modification applies to (e.g. "K", "KR", "n", "nKR").

◆ term_distance

int term_distance {-1}

Terminal distance constraint: -1 = no constraint, 0 = terminal only.