OpenMS
Helper struct that represents one Comet variable_modXX entry and supports merging compatible entries.
#include <OpenMS/ANALYSIS/ID/CometModification.h>
Public Member Functions | |
| | CometModification ()=default |
| | Default constructor; produces a zeroed entry with mass == 0. |
| | CometModification (const ResidueModification *mod, int binary_grp, int max_mods) |
| | Build an entry from an OpenMS ResidueModification. |
| bool | isMergeableWith (const CometModification &other) const |
| | Whether two entries can be combined into a single Comet variable_modXX line. |
| void | merge (const CometModification &other) |
| | Merge another compatible entry into this one. |
| std::string | toCometString (Size index) const |
| | Render this entry as a Comet variable_modXX parameter line. |
Static Public Member Functions | |
| static std::vector< CometModification > | mergeModifications (const std::vector< CometModification > &mods) |
| | Greedy first-fit merge of a vector of entries. |
Public Attributes | |
| double | mass {0.0} |
| | Modification mass difference (in Da). |
| std::string | residues |
| | Residue(s) this modification applies to (e.g. "K", "KR", "n", "nKR"). |
| int | binary_group {0} |
| | Comet binary modification group (used e.g. for SILAC). |
| int | max_mods_per_peptide {5} |
| | Maximum number of occurrences of this modification per peptide. |
| int | term_distance {-1} |
| | Terminal distance constraint: -1 = no constraint, 0 = terminal only. |
| int | nc_term {0} |
| | Terminal specificity: 0 = protein N-term, 1 = protein C-term, 2 = peptide N-term, 3 = peptide C-term. Only meaningful when term_distance == 0. |
| bool | required {false} |
| | Whether this modification is required to appear in every peptide. |
Static Public Attributes | |
| static constexpr double | MASS_TOLERANCE = 1e-6 |
| | Absolute tolerance used by isMergeableWith when comparing the mass field. |
Helper struct that represents one Comet variable_modXX entry and supports merging compatible entries.
Comet supports specifying multiple residues in a single modification entry (e.g. "KR" for lysine and arginine), which significantly improves search performance. This struct mirrors the variable_modXX fields and bundles the rules used to collapse compatible entries.
Merging rules (verified against the .cpp implementation):
- Amino-acid mods (term_distance == -1) can merge with each other (e.g. "K" + "R" -> "KR").
- Peptide N-terminal mods (nc_term == 2, term_distance == 0) and amino acids can merge (e.g. "n" + "RST" -> "nRST").
- Peptide C-terminal mods (nc_term == 3, term_distance == 0) and amino acids can merge (e.g. "c" + "KR" -> "cKR").
- Protein terminal mods (nc_term == 0 or 1 with term_distance == 0) only merge with another protein terminal mod of the same kind.
When a terminal and an amino-acid mod merge, term_distance is set to -1 and nc_term to 0 because the terminal specificity is encoded by the 'n' / 'c' character in the residue string (per Comet convention, e.g. "42.010565 nK 0 3 -1 0 0 0.0").
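The rules above can be sketched as a standalone predicate. This is a minimal illustration and not the OpenMS implementation: `ModEntry`, `kMassTolerance`, `isTerminal`, `isProteinTerminal` and `canMerge` are hypothetical names that only mirror the documented fields, and the sketch assumes that terminal status can be read from term_distance and nc_term alone (it does not inspect 'n' / 'c' characters already present in a residue string).

```cpp
#include <cmath>
#include <string>

// Hypothetical stand-in for CometModification; only the fields the rules need.
struct ModEntry
{
  double mass = 0.0;
  std::string residues;
  int binary_group = 0;
  int term_distance = -1; // -1 = no constraint, 0 = terminal only
  int nc_term = 0;        // 0/1 = protein N/C-term, 2/3 = peptide N/C-term
};

// Assumed to mirror the documented MASS_TOLERANCE value.
constexpr double kMassTolerance = 1e-6;

bool isTerminal(const ModEntry& m)
{
  return m.term_distance == 0;
}

bool isProteinTerminal(const ModEntry& m)
{
  return isTerminal(m) && (m.nc_term == 0 || m.nc_term == 1);
}

// Returns true if the two entries satisfy the documented merging rules.
bool canMerge(const ModEntry& a, const ModEntry& b)
{
  if (std::fabs(a.mass - b.mass) > kMassTolerance) return false;
  if (a.binary_group != b.binary_group) return false;

  // Protein-terminal mods only merge with the same protein-terminal kind.
  const bool pa = isProteinTerminal(a);
  const bool pb = isProteinTerminal(b);
  if (pa || pb) return pa && pb && a.nc_term == b.nc_term;

  // Two peptide-terminal mods must be on the same terminus.
  if (isTerminal(a) && isTerminal(b)) return a.nc_term == b.nc_term;

  // Amino acid + amino acid, or peptide terminus + amino acid.
  return true;
}
```

With this sketch, a "K" entry and an "R" entry of equal mass are mergeable, while a peptide N-term entry and a peptide C-term entry are not.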
| CometModification | ( | ) | default |
Default constructor; produces a zeroed entry with mass == 0.
| CometModification | ( | const ResidueModification * | mod, |
| int | binary_grp, | ||
| int | max_mods | ||
| ) |
Build an entry from an OpenMS ResidueModification.
Copies the modification's diff-monoisotopic mass into mass and the residue origin character into residues, then sets term_distance and nc_term according to mod's term specificity:
- PEPTIDE_C_TERM -> term_distance = 0, nc_term = 3.
- PEPTIDE_N_TERM -> term_distance = 0, nc_term = 2.
- PROTEIN_N_TERM -> term_distance = 0, nc_term = 0.
- PROTEIN_C_TERM -> term_distance = 0, nc_term = 1.
- Otherwise, term_distance and nc_term are left at their defaults.
Terminal mods whose origin is 'X' are translated into a residue string of "n" or "c" so Comet recognises them; non-'X' terminal mods keep the original residue character.
| [in] | mod | Source OpenMS modification. |
| [in] | binary_grp | Value for binary_group. |
| [in] | max_mods | Value for max_mods_per_peptide. |
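The mapping from term specificity to Comet fields might look like the following sketch. It does not use the real ResidueModification API: the `TermSpec` enum, the `TermFields` struct and `mapTermSpecificity` are hypothetical, and the choice of "n" for N-terminal and "c" for C-terminal 'X' origins is an assumption inferred from the description above.

```cpp
#include <string>

// Hypothetical enum standing in for the term specificity of an OpenMS modification.
enum class TermSpec { Anywhere, PeptideNTerm, PeptideCTerm, ProteinNTerm, ProteinCTerm };

// Hypothetical subset of the CometModification fields set by the constructor.
struct TermFields
{
  std::string residues;
  int term_distance = -1;
  int nc_term = 0;
};

TermFields mapTermSpecificity(TermSpec spec, char origin)
{
  TermFields f;
  f.residues = std::string(1, origin);

  switch (spec)
  {
    case TermSpec::PeptideCTerm: f.term_distance = 0; f.nc_term = 3; break;
    case TermSpec::PeptideNTerm: f.term_distance = 0; f.nc_term = 2; break;
    case TermSpec::ProteinNTerm: f.term_distance = 0; f.nc_term = 0; break;
    case TermSpec::ProteinCTerm: f.term_distance = 0; f.nc_term = 1; break;
    case TermSpec::Anywhere: break; // defaults stay in place
  }

  // Assumption: an 'X' origin on a terminal mod becomes "n" (N-terminal)
  // or "c" (C-terminal); other origins keep their residue character.
  if (f.term_distance == 0 && origin == 'X')
  {
    const bool n_side = (f.nc_term == 0 || f.nc_term == 2);
    f.residues = n_side ? "n" : "c";
  }
  return f;
}
```

Under these assumptions, a peptide N-terminal mod with origin 'X' yields residues "n", term_distance 0 and nc_term 2, whereas a peptide C-terminal mod on 'K' keeps "K" as its residue string.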
| bool isMergeableWith | ( | const CometModification & | other | ) | const |
Whether two entries can be combined into a single Comet variable_modXX line.
Implements the merging rules listed in the struct's detailed description. Returns false when the masses differ by more than MASS_TOLERANCE, when the binary_group values differ, when only one of the two is a protein-terminal mod (or they are different protein-terminal kinds), or when one is a peptide N-term and the other a peptide C-term.
| [in] | other | Candidate merge partner. |
Returns true if the entries can be merged.
| void merge | ( | const CometModification & | other | ) |
Merge another compatible entry into this one.
Unions other's residue characters into residues (skipping duplicates), keeps the larger max_mods_per_peptide, and ORs the required flag. When a terminal mod is merged with a non-terminal mod, the terminal information moves into the residue string and term_distance / nc_term are reset to -1 / 0; same-kind terminal merges leave both fields unchanged.
Behaviour is only defined for pairs that isMergeableWith accepts; the caller is responsible for that check.
| [in] | other | Compatible entry to absorb. |
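A minimal sketch of the merge step described above, using a hypothetical `ModEntry` struct and `mergeInto` function instead of the real class. It assumes new residue characters are appended in the order they appear in the other entry; the actual ordering used by OpenMS is not stated here.

```cpp
#include <algorithm>
#include <string>

// Hypothetical stand-in for the CometModification fields touched by merge().
struct ModEntry
{
  std::string residues;
  int max_mods_per_peptide = 5;
  int term_distance = -1;
  int nc_term = 0;
  bool required = false;
};

// Absorbs `other` into `self`; the caller is expected to have checked compatibility.
void mergeInto(ModEntry& self, const ModEntry& other)
{
  // Union of residue characters, skipping duplicates.
  for (char c : other.residues)
  {
    if (self.residues.find(c) == std::string::npos) self.residues.push_back(c);
  }

  self.max_mods_per_peptide = std::max(self.max_mods_per_peptide, other.max_mods_per_peptide);
  self.required = self.required || other.required;

  // Terminal + non-terminal: the 'n' / 'c' residue character now carries the
  // terminal information, so the numeric fields are reset.
  const bool self_terminal = (self.term_distance == 0);
  const bool other_terminal = (other.term_distance == 0);
  if (self_terminal != other_terminal)
  {
    self.term_distance = -1;
    self.nc_term = 0;
  }
}
```

For example, merging an "RST" amino-acid entry into a peptide N-terminal "n" entry leaves residues "nRST" with term_distance -1 and nc_term 0 in this sketch.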
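| static std::vector< CometModification > mergeModifications | ( | const std::vector< CometModification > & | mods | ) |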
Greedy first-fit merge of a vector of entries.
Walks mods in input order; each entry is merged into the first compatible existing result entry (via isMergeableWith and merge), or appended verbatim if no compatible partner is found. The relative order of the kept entries follows the input order; no entry is dropped.
| [in] | mods | Input modifications. |
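The greedy first-fit walk can be illustrated with a deliberately simplified model. `ModEntry`, `canMerge`, `mergeInto` and `greedyFirstFit` are hypothetical; compatibility here is reduced to a mass comparison, which is much weaker than the real isMergeableWith check.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical, reduced entry: mass and residues only.
struct ModEntry
{
  double mass;
  std::string residues;
};

// Simplified stand-in for isMergeableWith: equal mass within 1e-6.
bool canMerge(const ModEntry& a, const ModEntry& b)
{
  return std::fabs(a.mass - b.mass) <= 1e-6;
}

// Simplified stand-in for merge: union of residue characters.
void mergeInto(ModEntry& self, const ModEntry& other)
{
  for (char c : other.residues)
  {
    if (self.residues.find(c) == std::string::npos) self.residues.push_back(c);
  }
}

// Each input entry joins the first compatible result entry, else it is appended.
std::vector<ModEntry> greedyFirstFit(const std::vector<ModEntry>& mods)
{
  std::vector<ModEntry> result;
  for (const ModEntry& m : mods)
  {
    bool merged = false;
    for (ModEntry& r : result)
    {
      if (canMerge(r, m))
      {
        mergeInto(r, m);
        merged = true;
        break; // first fit: stop at the first compatible partner
      }
    }
    if (!merged) result.push_back(m);
  }
  return result;
}
```

Because entries are only ever merged into earlier results or appended, the output order follows the first appearance of each group in the input.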
| std::string toCometString | ( | Size | index | ) | const |
Render this entry as a Comet variable_modXX parameter line.
The output is one whitespace-separated line of the form
variable_mod<II> = <mass> <residues> <binary_group> <max_mods_per_peptide> <term_distance> <nc_term> <required> 0.0
with the II suffix being index zero-padded to two digits. The last column is hard-coded to "0.0" (neutral loss; OpenMS does not currently surface it). Numeric formatting uses std::ostream default precision (six significant digits).
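The line format can be reproduced with standard stream formatting. This is a sketch under stated assumptions, not the OpenMS code: `formatVariableMod` is a hypothetical free function, the required flag is assumed to render as 0 or 1, and std::size_t stands in for the OpenMS Size type.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical free-function version of the documented line format.
std::string formatVariableMod(std::size_t index, double mass, const std::string& residues,
                              int binary_group, int max_mods_per_peptide,
                              int term_distance, int nc_term, bool required)
{
  std::ostringstream os;
  // std::setw applies only to the next insertion, so only the index is zero-padded.
  os << "variable_mod" << std::setw(2) << std::setfill('0') << index
     << " = " << mass // default ostream precision: six significant digits
     << ' ' << residues
     << ' ' << binary_group
     << ' ' << max_mods_per_peptide
     << ' ' << term_distance
     << ' ' << nc_term
     << ' ' << (required ? 1 : 0) // assumption: rendered as 0 or 1
     << " 0.0";                   // last column fixed at 0.0
  return os.str();
}
```

With default precision, a mass of 15.994915 is written as 15.9949, so `formatVariableMod(1, 15.994915, "M", 0, 3, -1, 0, false)` yields `variable_mod01 = 15.9949 M 0 3 -1 0 0 0.0`.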
| [in] | index | 1-based modification index used to form the variable_modXX key. |
Returns the formatted variable_modXX = ... line.
| int binary_group {0} |
Comet binary modification group (used e.g. for SILAC).
| double mass {0.0} |
Modification mass difference (in Da).
| static constexpr double MASS_TOLERANCE = 1e-6 |
Absolute tolerance used by isMergeableWith when comparing the mass field.
| int max_mods_per_peptide {5} |
Maximum number of occurrences of this modification per peptide.
| int nc_term {0} |
Terminal specificity: 0 = protein N-term, 1 = protein C-term, 2 = peptide N-term, 3 = peptide C-term. Only meaningful when term_distance == 0.
| bool required {false} |
Whether this modification is required to appear in every peptide.
| std::string residues |
Residue(s) this modification applies to (e.g. "K", "KR", "n", "nKR").
| int term_distance {-1} |
Terminal distance constraint: -1 = no constraint, 0 = terminal only.