OpenMS
ProForma.h File Reference
#include <OpenMS/CONCEPT/Exception.h>
#include <OpenMS/DATASTRUCTURES/String.h>
#include <OpenMS/OpenMSConfig.h>
#include <optional>
#include <utility>
#include <variant>
#include <vector>

Classes

class ProForma
 ProForma v2 peptidoform notation parser and data structures.

struct ProForma::ConversionIssue
 Description of a conversion issue from Peptidoform to AASequence.

struct ProForma::CvAccession
 Controlled vocabulary accession for a modification.

struct ProForma::NamedMod
 Named modification with optional CV prefix hint.

struct ProForma::MassDelta
 Mass delta modification with optional source hint.

struct ProForma::FormulaTag
 Chemical formula with optional charge.

struct ProForma::GlycanComposition
 Glycan composition specification.

struct ProForma::InfoTag
 Info tag for arbitrary text annotations.

struct ProForma::PositionConstraint
 Position constraint specifying allowed residues for a modification.

struct ProForma::Label
 Label for cross-links, branches, or ambiguous grouping.

struct ProForma::Modification
 A modification with one or more alternative tags.

struct ProForma::SequenceElement
 A single amino acid with its modifications.

struct ProForma::AmbiguousRegion
 Ambiguous amino acid region.

struct ProForma::ModifiedRange
 Modified sequence range with shared modifications.

struct ProForma::UnlocalisedMod
 Unlocalised modification with optional occurrence count.

struct ProForma::LabileModification
 Labile modification that may be lost during fragmentation.

struct ProForma::GlobalModification
 Global modification applied to specific locations.

struct ProForma::IsotopeReplacement
 Isotope replacement for stable isotope labeling.

struct ProForma::AdductIon
 Adduct ion specification for charge state.

struct ProForma::Peptidoform
 A single peptidoform (one peptide chain)

struct ProForma::PeptidoformIon
 A peptidoform ion (one or more chains with optional charge)

struct ProForma::CrossLinkGroup
 Cross-link group connecting sites across chains.

class ProForma::ParseError
 Structured parse error with context for ProForma parsing.
 

Namespaces

namespace  OpenMS
 Main OpenMS namespace.
 

Class Documentation

◆ OpenMS::ProForma::ConversionIssue

struct OpenMS::ProForma::ConversionIssue

Description of a conversion issue from Peptidoform to AASequence.

Records problems encountered when attempting to convert a ProForma Peptidoform to an OpenMS AASequence representation.

Class Members
String description Human-readable description.
size_t position Position in sequence (SIZE_MAX if not position-specific)
ConversionIssueType type The type of issue.
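
The SIZE_MAX sentinel in position can be checked before a position is reported. The following is a minimal sketch using a stand-in struct, not the OpenMS type itself: std::string replaces OpenMS::String, the type member is left out because the ConversionIssueType enumerators are not listed on this page, and describe() is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch
{
  // Stand-in for ProForma::ConversionIssue (assumption: simplified, no `type` member).
  struct ConversionIssue
  {
    std::string description;
    std::size_t position = SIZE_MAX; // SIZE_MAX means "not position-specific"
  };

  // Hypothetical helper: build a message, adding the position only when one is set.
  inline std::string describe(const ConversionIssue& issue)
  {
    if (issue.position == SIZE_MAX)
    {
      return issue.description;
    }
    return issue.description + " (position " + std::to_string(issue.position) + ")";
  }
}
```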

◆ OpenMS::ProForma::CvAccession

struct OpenMS::ProForma::CvAccession

Controlled vocabulary accession for a modification.

Represents a modification specified by a CV accession number, e.g., UNIMOD:35 for Oxidation. The accession string normally contains only the identifier portion (e.g., "35" for UNIMOD:35); for GNO the full string is stored.

Class Members
String accession The accession identifier (e.g., "35" for UNIMOD:35, full string for GNO)
CvDatabase database The source database (UNIMOD, MOD, RESID, XLMOD, or GNO)
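
Because the accession holds only the identifier for most databases but the full string for GNO, code that writes a tag back out has to treat GNO separately. A hedged sketch with stand-in types follows. The enumerator names mirror the databases listed above but are assumptions, toTag() is a hypothetical helper, and the sketch assumes the stored GNO string already carries its "GNO:" prefix.

```cpp
#include <string>

namespace sketch
{
  // Stand-in enum; the real CvDatabase enumerators may be spelled differently.
  enum class CvDatabase { UNIMOD, MOD, RESID, XLMOD, GNO };

  // Stand-in for ProForma::CvAccession (std::string replaces OpenMS::String).
  struct CvAccession
  {
    std::string accession;
    CvDatabase database;
  };

  // Hypothetical helper: rebuild the bracket content for an accession.
  inline std::string toTag(const CvAccession& cv)
  {
    switch (cv.database)
    {
      case CvDatabase::UNIMOD: return "UNIMOD:" + cv.accession;
      case CvDatabase::MOD:    return "MOD:" + cv.accession;
      case CvDatabase::RESID:  return "RESID:" + cv.accession;
      case CvDatabase::XLMOD:  return "XLMOD:" + cv.accession;
      case CvDatabase::GNO:    return cv.accession; // assumed to be stored in full
    }
    return cv.accession; // unreachable for valid enum values
  }
}
```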

◆ OpenMS::ProForma::NamedMod

struct OpenMS::ProForma::NamedMod

Named modification with optional CV prefix hint.

Represents a modification specified by name, optionally with a CV prefix hint to disambiguate which database to search (e.g., "U:Oxidation" for UniMod, "M:Oxidation" for PSI-MOD).

Class Members
optional< CvDatabase > cv_hint Optional CV prefix hint (U, M, R, X, G)
String name The modification name (e.g., "Oxidation", "Phospho")
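
The prefix handling described above can be sketched as follows. This is an illustration with stand-in types, not the parser shipped in ProForma.h: parseNamedMod() is a hypothetical helper, and the letter-to-database mapping is inferred from the U, M, R, X, G hints and the database list on this page.

```cpp
#include <optional>
#include <string>

namespace sketch
{
  // Stand-in enum; the real CvDatabase enumerators may be spelled differently.
  enum class CvDatabase { UNIMOD, MOD, RESID, XLMOD, GNO };

  // Stand-in for ProForma::NamedMod (std::string replaces OpenMS::String).
  struct NamedMod
  {
    std::optional<CvDatabase> cv_hint;
    std::string name;
  };

  // Hypothetical helper: split an optional one-letter CV prefix off a name.
  inline NamedMod parseNamedMod(const std::string& tag)
  {
    NamedMod mod;
    mod.name = tag;
    if (tag.size() > 2 && tag[1] == ':')
    {
      switch (tag[0])
      {
        case 'U': mod.cv_hint = CvDatabase::UNIMOD; break;
        case 'M': mod.cv_hint = CvDatabase::MOD; break;
        case 'R': mod.cv_hint = CvDatabase::RESID; break;
        case 'X': mod.cv_hint = CvDatabase::XLMOD; break;
        case 'G': mod.cv_hint = CvDatabase::GNO; break;
        default: break; // unknown prefix: keep the whole string as the name
      }
      if (mod.cv_hint)
      {
        mod.name = tag.substr(2);
      }
    }
    return mod;
  }
}
```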

◆ OpenMS::ProForma::FormulaTag

struct OpenMS::ProForma::FormulaTag

Chemical formula with optional charge.

Represents a modification specified by chemical formula. The optional charge is specified via the :z+N suffix in ProForma (e.g., Formula:C12H20O2:z+2).

Class Members
optional< int > charge Optional charge from :z+N suffix.
String formula_string The chemical formula string (e.g., "C12H20O2")
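
To make the :z+N suffix concrete, the sketch below splits a tag such as Formula:C12H20O2:z+2 into the two fields. It uses a stand-in struct and a hypothetical parseFormulaTag() helper. It assumes an exact, case-sensitive "Formula:" prefix and does not validate the formula itself.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

namespace sketch
{
  // Stand-in for ProForma::FormulaTag (std::string replaces OpenMS::String).
  struct FormulaTag
  {
    std::optional<int> charge;
    std::string formula_string;
  };

  // Hypothetical helper: returns std::nullopt if the tag is not a formula tag
  // or the charge suffix cannot be read as an integer.
  inline std::optional<FormulaTag> parseFormulaTag(const std::string& tag)
  {
    const std::string prefix = "Formula:";
    if (tag.compare(0, prefix.size(), prefix) != 0)
    {
      return std::nullopt;
    }
    FormulaTag out;
    std::string body = tag.substr(prefix.size());
    const std::size_t z = body.rfind(":z");
    if (z != std::string::npos)
    {
      try
      {
        out.charge = std::stoi(body.substr(z + 2)); // accepts "+2" and "-1"
      }
      catch (const std::exception&)
      {
        return std::nullopt;
      }
      body.erase(z); // drop the ":z..." suffix from the formula
    }
    out.formula_string = body;
    return out;
  }
}
```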

◆ OpenMS::ProForma::GlycanComposition

struct OpenMS::ProForma::GlycanComposition

Glycan composition specification.

Represents a glycan modification as a composition of monosaccharides. Each component can be either a named monosaccharide (e.g., "Hex", "HexNAc") or a custom formula specification.

Example: Glycan:HexNAc1Hex2 -> [(HexNAc, 1), (Hex, 2)]

Class Members
typedef variant< String, FormulaTag > Monosaccharide A monosaccharide component: either a name (String) or a custom formula (FormulaTag)
vector< pair< Monosaccharide, int > > components List of (monosaccharide, count) pairs.
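
Since each component is a variant, reading a composition means checking which alternative is held. The sketch below mirrors the documented layout with stand-in types and shows one way to do that with std::get_if. totalUnits() and countNamed() are hypothetical helpers, not part of ProForma.h.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

namespace sketch
{
  // Stand-ins for the documented structs (std::string replaces OpenMS::String).
  struct FormulaTag
  {
    std::optional<int> charge;
    std::string formula_string;
  };

  struct GlycanComposition
  {
    using Monosaccharide = std::variant<std::string, FormulaTag>;
    std::vector<std::pair<Monosaccharide, int>> components;
  };

  // Hypothetical helper: total number of monosaccharide units.
  inline int totalUnits(const GlycanComposition& g)
  {
    int total = 0;
    for (const auto& component : g.components)
    {
      total += component.second;
    }
    return total;
  }

  // Hypothetical helper: count of one named monosaccharide; formula entries are skipped.
  inline int countNamed(const GlycanComposition& g, const std::string& name)
  {
    int n = 0;
    for (const auto& [sugar, count] : g.components)
    {
      const std::string* s = std::get_if<std::string>(&sugar);
      if (s != nullptr && *s == name)
      {
        n += count;
      }
    }
    return n;
  }
}
```

For the Glycan:HexNAc1Hex2 example, a composition built as [(HexNAc, 1), (Hex, 2)] gives three units in total, two of them Hex.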

◆ OpenMS::ProForma::InfoTag

struct OpenMS::ProForma::InfoTag

Info tag for arbitrary text annotations.

Represents an INFO: tag in ProForma that carries arbitrary text metadata about a modification or site. Example: INFO:provenance_data

Class Members
String text The info text content.

◆ OpenMS::ProForma::PositionConstraint

struct OpenMS::ProForma::PositionConstraint

Position constraint specifying allowed residues for a modification.

Represents a Position: tag in ProForma that constrains where a modification can be localized. It typically appears as one of the |-separated alternatives of a modification and indicates that modification's possible sites.

Example: [Oxidation|Position:M] means Oxidation can only occur at M residues

Class Members
bool c_term = false True if modification can be at C-terminus.
bool n_term = false True if modification can be at N-terminus.
vector< char > residues List of allowed amino acid residues.
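
A constraint such as Position:M could be checked against a candidate residue as sketched below. The struct is a stand-in and allowsResidue() is a hypothetical helper. How an empty residue list should be interpreted is not stated on this page, so the sketch simply reports no match in that case.

```cpp
#include <algorithm>
#include <vector>

namespace sketch
{
  // Stand-in for ProForma::PositionConstraint.
  struct PositionConstraint
  {
    bool c_term = false;
    bool n_term = false;
    std::vector<char> residues;
  };

  // Hypothetical helper: true if the residue is in the allowed list.
  inline bool allowsResidue(const PositionConstraint& pc, char amino_acid)
  {
    return std::find(pc.residues.begin(), pc.residues.end(), amino_acid) != pc.residues.end();
  }
}
```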

◆ OpenMS::ProForma::Modification

struct OpenMS::ProForma::Modification

A modification with one or more alternative tags.

In ProForma, a modification can have multiple alternatives separated by |, representing uncertainty about the exact modification. Each alternative consists of a tag and an optional label.

Example: K[Phospho|+79.97] has two alternatives

The resolved_mod field is populated by resolveModifications() and points to the ResidueModification in ModificationsDB (for the first/primary alternative).

Class Members
vector< pair< ModificationTag, optional< Label > > > alternatives Each alternative is a (tag, optional_label) pair.
const ResidueModification * resolved_mod = nullptr Resolved modification pointer, populated by resolveModifications(). Points to the ResidueModification for the first alternative, if found.
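
To illustrate how bracket content breaks into alternatives, the sketch below splits on |. It is deliberately simplified: the real structure stores (ModificationTag, optional Label) pairs, whereas the hypothetical splitAlternatives() helper here only returns the raw text of each alternative and ignores labels and nesting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch
{
  // Hypothetical helper: split the text between [ and ] into its alternatives.
  inline std::vector<std::string> splitAlternatives(const std::string& bracket_content)
  {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true)
    {
      const std::size_t bar = bracket_content.find('|', start);
      if (bar == std::string::npos)
      {
        parts.push_back(bracket_content.substr(start)); // last (or only) alternative
        break;
      }
      parts.push_back(bracket_content.substr(start, bar - start));
      start = bar + 1;
    }
    return parts;
  }
}
```

Applied to the content of K[Phospho|+79.97], this yields the two alternatives "Phospho" and "+79.97".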

◆ OpenMS::ProForma::SequenceElement

struct OpenMS::ProForma::SequenceElement

A single amino acid with its modifications.

Represents one position in the peptide sequence: the amino acid residue and zero or more modifications attached to it.

Class Members
char amino_acid Single-letter amino acid code (A-Z)
vector< Modification > modifications Modifications at this position.

◆ OpenMS::ProForma::AmbiguousRegion

struct OpenMS::ProForma::AmbiguousRegion

Ambiguous amino acid region.

Represents a region where the amino acid sequence is uncertain. ProForma notation: (?DQ) marks DQ as a stretch whose amino acid sequence is ambiguous.

Class Members
vector< SequenceElement > elements The ambiguous amino acid possibilities.

◆ OpenMS::ProForma::ModifiedRange

struct OpenMS::ProForma::ModifiedRange

Modified sequence range with shared modifications.

Represents a subsequence where one or more modifications apply to the entire range, but the exact position is uncertain.

ProForma notation: (EOSFORMS)[+19.0523] means +19.0523 applies somewhere in EOSFORMS

Class Members
vector< SequenceElement > elements The amino acids in the range.
vector< Modification > modifications Modifications applying to the entire range.

◆ OpenMS::ProForma::UnlocalisedMod

struct OpenMS::ProForma::UnlocalisedMod

Unlocalised modification with optional occurrence count.

Represents a modification that is known to exist on the peptide but whose exact position is unknown. The occurrence specifies how many instances of this modification are present.

ProForma notation: [Phospho]?PEPTIDE or [Phospho]^2?PEPTIDE

Class Members
vector< Modification > modifications The unlocalised modification(s)
optional< int > occurrence Optional occurrence count from ^N suffix.

◆ OpenMS::ProForma::LabileModification

struct OpenMS::ProForma::LabileModification

Labile modification that may be lost during fragmentation.

Labile modifications are typically lost during ionization or fragmentation and thus may not be observed in MS/MS spectra.

ProForma notation: {Glycan:Hex}PEPTIDE

Class Members
Modification modification The labile modification.

◆ OpenMS::ProForma::GlobalModification

struct OpenMS::ProForma::GlobalModification

Global modification applied to specific locations.

A global modification applies the same modification to all occurrences of specified residues or termini throughout the peptide.

ProForma notation: <[TMT6plex]@K,N-term>

Class Members
vector< String > locations Target locations ("K", "N-term", "C-term:K", etc.)
Modification modification The modification to apply.
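
The notation <[TMT6plex]@K,N-term> maps onto the two members roughly as sketched below. This is an illustration only: the stand-in struct keeps the modification as plain text instead of a Modification, parseGlobalMod() is a hypothetical helper, and isotope forms such as <13C> are rejected because they belong to IsotopeReplacement.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

namespace sketch
{
  // Simplified stand-in for ProForma::GlobalModification.
  struct GlobalModification
  {
    std::vector<std::string> locations;
    std::string modification; // raw bracket content (assumption: kept as text)
  };

  // Hypothetical helper: parse "<[mod]@loc1,loc2>".
  inline std::optional<GlobalModification> parseGlobalMod(const std::string& s)
  {
    if (s.size() < 6 || s.compare(0, 2, "<[") != 0 || s.back() != '>')
    {
      return std::nullopt;
    }
    const std::size_t at = s.rfind("]@");
    if (at == std::string::npos)
    {
      return std::nullopt;
    }
    GlobalModification out;
    out.modification = s.substr(2, at - 2);
    // Text between "]@" and the closing '>' is the comma-separated location list.
    std::istringstream locations(s.substr(at + 2, s.size() - at - 3));
    std::string location;
    while (std::getline(locations, location, ','))
    {
      out.locations.push_back(location);
    }
    return out;
  }
}
```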

◆ OpenMS::ProForma::IsotopeReplacement

struct OpenMS::ProForma::IsotopeReplacement

Isotope replacement for stable isotope labeling.

Represents global replacement of an element with a specific isotope, used for stable isotope labeling experiments.

ProForma notation: <13C> or <15N> or <D>

Class Members
String isotope The isotope specification (e.g., "13C", "15N", "D")
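
Consumers of the isotope string often need the mass number and the element separately. The sketch below shows one possible split. ParsedIsotope and splitIsotope() are hypothetical names introduced for illustration, and "D" is passed through as a symbol without a mass number.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

namespace sketch
{
  // Hypothetical result type: leading digits as mass number, the rest as element symbol.
  struct ParsedIsotope
  {
    std::optional<int> mass_number;
    std::string element;
  };

  // Hypothetical helper: split e.g. "13C" into 13 and "C".
  inline ParsedIsotope splitIsotope(const std::string& isotope)
  {
    std::size_t i = 0;
    while (i < isotope.size() && std::isdigit(static_cast<unsigned char>(isotope[i])))
    {
      ++i;
    }
    ParsedIsotope parsed;
    if (i > 0)
    {
      parsed.mass_number = std::stoi(isotope.substr(0, i));
    }
    parsed.element = isotope.substr(i);
    return parsed;
  }
}
```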

◆ OpenMS::ProForma::AdductIon

struct OpenMS::ProForma::AdductIon

Adduct ion specification for charge state.

Represents an adduct ion contributing to the charge state of a peptidoform ion. Multiple adducts can combine to give the total charge.

ProForma notation: Na:z+1 in /[Na:z+1,H:z+1]

Class Members
int charge The charge contribution of this adduct.
String formula The adduct formula (e.g., "Na", "H", "K")
optional< int > occurrence Optional occurrence count from ^N suffix.
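
How several adducts combine into a total charge can be sketched as below. The sketch assumes that charge is the contribution of a single adduct and that a missing occurrence means one copy. Neither assumption is confirmed on this page, and netCharge() is a hypothetical helper.

```cpp
#include <optional>
#include <string>
#include <vector>

namespace sketch
{
  // Stand-in for ProForma::AdductIon (std::string replaces OpenMS::String).
  struct AdductIon
  {
    int charge = 0;
    std::string formula;
    std::optional<int> occurrence;
  };

  // Hypothetical helper: sum charge * occurrence over all adducts,
  // treating an absent occurrence as 1 (assumption).
  inline int netCharge(const std::vector<AdductIon>& adducts)
  {
    int total = 0;
    for (const AdductIon& adduct : adducts)
    {
      total += adduct.charge * adduct.occurrence.value_or(1);
    }
    return total;
  }
}
```

Under these assumptions, /[Na:z+1,H:z+1] gives a net charge of +2.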

◆ OpenMS::ProForma::Peptidoform

struct OpenMS::ProForma::Peptidoform

A single peptidoform (one peptide chain)

Represents a complete peptide chain including:

  • Optional name identifier (from v2.1 extension)
  • Global modifications (<13C>, <[TMT6plex]@K>)
  • Unlocalised modifications ([Phospho]?)
  • Labile modifications ({Glycan:Hex})
  • N-terminal modifications ([Acetyl]-)
  • The amino acid sequence with modifications
  • C-terminal modifications (-[Amidated])

Class Members
vector< Modification > c_term_mods C-terminal modifications: -[Amidated].
optional< ChargeState > charge Optional per-chain charge (for chimeric spectra)
vector< GlobalModEntry > global_mods Global modifications: <13C>, <[TMT6plex]@K>
vector< LabileModification > labile_mods Labile modifications: {Glycan:Hex}.
vector< Modification > n_term_mods N-terminal modifications: [Acetyl]-.
optional< String > name Optional name from (>name) v2.1 extension.
vector< SequenceSection > sequence The sequence with modifications.
vector< UnlocalisedMod > unlocalised_mods Unlocalised modifications: [Phospho]?

◆ OpenMS::ProForma::PeptidoformIon

struct OpenMS::ProForma::PeptidoformIon

A peptidoform ion (one or more chains with optional charge)

Represents one or more peptide chains that form a single ion species. Multiple chains can be present in cross-linked or multi-chain entities.

ProForma notation: cross-linked chains are separated by //, chimeric chains by + (see is_chimeric).

Class Members
vector< Peptidoform > chains One or more peptide chains (separated by // or + in ProForma)
optional< ChargeState > charge Optional charge state specification.
bool is_chimeric = false True if chains are chimeric (+ separator), false if cross-linked (//)
optional< String > name Optional name from (>>name) v2.1 extension.

◆ OpenMS::ProForma::CrossLinkGroup

struct OpenMS::ProForma::CrossLinkGroup

Cross-link group connecting sites across chains.

Groups together all sites that share a cross-link label. Each site is identified by its chain index and position within that chain.

Derived during parsing from matching #XL labels.

Class Members
String label The cross-link label (e.g., XL1)
vector< pair< size_t, size_t > > sites (chain_index, site_index) pairs
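
Grouping sites by their shared label can be sketched as follows. This is an illustration of the idea, not the parser's actual code: LabelledSite and groupByLabel() are hypothetical, and the groups come out sorted by label because a std::map is used.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch
{
  // Stand-in for ProForma::CrossLinkGroup (std::string replaces OpenMS::String).
  struct CrossLinkGroup
  {
    std::string label;
    std::vector<std::pair<std::size_t, std::size_t>> sites; // (chain_index, site_index)
  };

  // Hypothetical input record: one labelled site seen while reading the chains.
  struct LabelledSite
  {
    std::string label;
    std::size_t chain_index;
    std::size_t site_index;
  };

  // Hypothetical helper: collect all sites that share a label into one group.
  inline std::vector<CrossLinkGroup> groupByLabel(const std::vector<LabelledSite>& sites)
  {
    std::map<std::string, CrossLinkGroup> by_label;
    for (const LabelledSite& site : sites)
    {
      CrossLinkGroup& group = by_label[site.label];
      group.label = site.label;
      group.sites.emplace_back(site.chain_index, site.site_index);
    }
    std::vector<CrossLinkGroup> groups;
    for (auto& entry : by_label)
    {
      groups.push_back(std::move(entry.second));
    }
    return groups;
  }
}
```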