|
|
| AASequence () |
| Default constructor. More...
|
|
| AASequence (const AASequence &)=default |
| Copy constructor. More...
|
|
| AASequence (AASequence &&) noexcept=default |
| Move constructor. More...
|
|
virtual | ~AASequence () |
| Destructor. More...
|
|
AASequence & | operator= (const AASequence &)=default |
| Assignment operator. More...
|
|
AASequence & | operator= (AASequence &&)=default |
| Move assignment operator. More...
|
|
bool | empty () const |
| check if sequence is empty More...
|
|
|
String | toString () const |
| returns the peptide as string with modifications embedded in brackets More...
|
|
String | toUnmodifiedString () const |
| returns the peptide as string without any modifications (e.g., "PEPTIDER") More...
|
|
String | toUniModString () const |
| returns the peptide as string with UniMod-style modifications embedded in brackets More...
|
|
String | toBracketString (bool integer_mass=true, bool mass_delta=false, const std::vector< String > &fixed_modifications=std::vector< String >()) const |
| create a TPP compatible string of the modified sequence using bracket notation. More...
|
|
void | setModification (Size index, const String &modification) |
|
void | setModification (Size index, const Residue *modification) |
| sets the modification of the amino acid at index by providing a (potentially already modified) residue More...
|
|
void | setModification (Size index, const ResidueModification *modification) |
|
void | setModification (Size index, const ResidueModification &modification) |
|
void | setModificationByDiffMonoMass (Size index, double diffMonoMass) |
|
void | setNTerminalModification (const String &modification) |
|
void | setNTerminalModification (const ResidueModification *modification) |
| sets the N-terminal modification More...
|
|
void | setNTerminalModification (const ResidueModification &mod) |
| sets the N-terminal modification (copies and adds to database if not present) More...
|
|
void | setNTerminalModificationByDiffMonoMass (double diffMonoMass, bool protein_term) |
| sets the N-terminal modification by the monoisotopic mass difference it introduces (creates a "user-defined" mod if not present) More...
|
|
const String & | getNTerminalModificationName () const |
| returns the name (ID) of the N-terminal modification, or an empty string if none is set More...
|
|
const ResidueModification * | getNTerminalModification () const |
| returns a pointer to the N-terminal modification, or zero if none is set More...
|
|
void | setCTerminalModification (const String &modification) |
|
void | setCTerminalModification (const ResidueModification *modification) |
| sets the C-terminal modification (must be present in the database) More...
|
|
void | setCTerminalModification (const ResidueModification &mod) |
| sets the C-terminal modification (copies and adds to database if not present) More...
|
|
void | setCTerminalModificationByDiffMonoMass (double diffMonoMass, bool protein_term) |
| sets the C-terminal modification by the monoisotopic mass difference it introduces (creates a "user-defined" mod if not present) More...
|
|
const String & | getCTerminalModificationName () const |
| returns the name (ID) of the C-terminal modification, or an empty string if none is set More...
|
|
const ResidueModification * | getCTerminalModification () const |
| returns a pointer to the C-terminal modification, or zero if none is set More...
|
|
const Residue & | getResidue (Size index) const |
| returns a reference to the residue at position index More...
|
|
EmpiricalFormula | getFormula (Residue::ResidueType type=Residue::Full, Int charge=0) const |
| returns the formula of the peptide More...
|
|
double | getAverageWeight (Residue::ResidueType type=Residue::Full, Int charge=0) const |
| returns the average weight of the peptide More...
|
|
double | getMonoWeight (Residue::ResidueType type=Residue::Full, Int charge=0) const |
|
double | getMZ (Int charge, Residue::ResidueType type=Residue::Full) const |
|
const Residue & | operator[] (Size index) const |
| returns a reference to the residue at the given position More...
|
|
AASequence | operator+ (const AASequence &peptide) const |
| adds the residues of the peptide More...
|
|
AASequence & | operator+= (const AASequence &) |
| adds the residues of a peptide More...
|
|
AASequence | operator+ (const Residue *residue) const |
| adds the given residue to the peptide More...
|
|
AASequence & | operator+= (const Residue *) |
| adds the given residue to the peptide More...
|
|
Size | size () const |
| returns the number of residues More...
|
|
AASequence | getPrefix (Size index) const |
| returns a peptide sequence of the first index residues More...
|
|
AASequence | getSuffix (Size index) const |
| returns a peptide sequence of the last index residues More...
|
|
AASequence | getSubsequence (Size index, UInt number) const |
| returns a peptide sequence of number residues, beginning at position index More...
|
|
void | getAAFrequencies (std::map< String, Size > &frequency_table) const |
| compute frequency table of amino acids More...
|
|
|
bool | has (const Residue &residue) const |
| returns true if the peptide contains the given residue More...
|
|
bool | hasSubsequence (const AASequence &peptide) const |
|
bool | hasPrefix (const AASequence &peptide) const |
|
bool | hasSuffix (const AASequence &peptide) const |
|
bool | hasNTerminalModification () const |
| predicate which is true if the peptide is N-term modified More...
|
|
bool | hasCTerminalModification () const |
| predicate which is true if the peptide is C-term modified More...
|
|
bool | isModified () const |
| returns true if any of the residues or termini are modified More...
|
|
bool | operator== (const AASequence &rhs) const |
| equality operator. Two sequences are equal iff all amino acids including PTMs are equal More...
|
|
bool | operator< (const AASequence &rhs) const |
| less-than operator which compares the C-term mods, sequence including PTMs and N-term mods; can be used for maps More...
|
|
bool | operator!= (const AASequence &rhs) const |
| inequality operator. Complement of equality operator. More...
|
|
|
Iterator | begin () |
|
ConstIterator | begin () const |
|
Iterator | end () |
|
ConstIterator | end () const |
|
|
std::vector< const Residue * > | peptide_ |
|
const ResidueModification * | n_term_mod_ |
|
const ResidueModification * | c_term_mod_ |
|
std::ostream & | operator<< (std::ostream &os, const AASequence &peptide) |
| writes a peptide to an output stream More...
|
|
std::istream & | operator>> (std::istream &is, const AASequence &peptide) |
| reads a peptide from an input stream More...
|
|
static AASequence | fromString (const String &s, bool permissive=true) |
| create AASequence object by parsing an OpenMS string More...
|
|
static AASequence | fromString (const char *s, bool permissive=true) |
| create AASequence object by parsing a C string (character array) More...
|
|
static String::ConstIterator | parseModRoundBrackets_ (const String::ConstIterator str_it, const String &str, AASequence &aas, const ResidueModification::TermSpecificity &specificity) |
| Parses modifications in round brackets (an identifier) More...
|
|
static String::ConstIterator | parseModSquareBrackets_ (const String::ConstIterator str_it, const String &str, AASequence &aas, const ResidueModification::TermSpecificity &specificity) |
| Parses modifications in square brackets (a mass) More...
|
|
static void | parseString_ (const String &peptide, AASequence &aas, bool permissive=true) |
|
Representation of a peptide/protein sequence.
This class represents amino acid sequences in OpenMS. An AASequence instance primarily contains a sequence of residues. The sequence is represented as a vector of pointers to instances of Residue. Each amino acid has only one instance, which is accessible using the ResidueDB instance (singleton).
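The storage scheme described above (one shared instance per amino acid, with the sequence holding pointers to those instances) can be illustrated with a small stand-alone sketch. It deliberately avoids the OpenMS headers: `ToyResidue`, `ToyResidueDB` and `toySequence` are hypothetical names invented for this illustration and are not part of the OpenMS API.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a residue; NOT the OpenMS Residue class.
struct ToyResidue
{
  char one_letter_code;
};

// Toy singleton registry holding exactly one instance per amino acid code.
class ToyResidueDB
{
public:
  static ToyResidueDB& instance()
  {
    static ToyResidueDB db; // constructed once, on first use
    return db;
  }

  // Returns the shared instance for a code, creating it on first request.
  const ToyResidue* get(char code)
  {
    auto it = residues_.find(code);
    if (it == residues_.end())
    {
      it = residues_.emplace(code, ToyResidue{code}).first;
    }
    return &it->second; // std::map nodes are stable, so the pointer stays valid
  }

private:
  ToyResidueDB() = default;
  std::map<char, ToyResidue> residues_;
};

// Builds a sequence as a vector of pointers into the registry.
std::vector<const ToyResidue*> toySequence(const std::string& letters)
{
  std::vector<const ToyResidue*> seq;
  for (char c : letters)
  {
    seq.push_back(ToyResidueDB::instance().get(c));
  }
  return seq;
}
```

With this layout, the two prolines in `toySequence("PEPTIDER")` are the same pointer, so a repeated amino acid costs one pointer per position rather than one object per position.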
To create an AASequence instance for a specific amino acid sequence, use the AASequence::fromString function. For example, AASequence::fromString(".DFPIANGER.") produces an instance of AASequence for the peptide "DFPIANGER". Note that both the N- and the C-terminus are explicitly represented by dots.
A critical property of amino acid sequences is that they can be modified, meaning that one or more amino acids are chemically altered, e.g. oxidized. This is represented via Residue instances which carry a ResidueModification object. This is also handled in the ResidueDB.
Modifications are specified either by a unique string identifier present in the ModificationsDB, given in round brackets after the modified amino acid, or by a mass in square brackets, given as the mass of the modified residue (e.g. [147]) or as a mass difference (e.g. [+16]). For example, AASequence::fromString(".DFPIAM(Oxidation)GER.") creates an instance of the peptide "DFPIAMGER" with an oxidized methionine; AASequence::fromString(".DFPIAM(UniMod:35)GER."), AASequence::fromString(".DFPIAM[+16]GER.") and AASequence::fromString(".DFPIAM[147]GER.") are all equivalent. N- and C-terminal modifications are represented by brackets to the right of the dots terminating the sequence. For example, ".(Dimethyl)DFPIAMGER." and ".DFPIAMGER.(Label:18O(2))" represent the labelling of the N- and C-terminus respectively, but ".DFPIAMGER(Phospho)." will be interpreted as a phosphorylation of the last arginine at its side chain.
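The placement rules for brackets can be made concrete with a toy tokenizer. This is only a sketch of the notation as described here, not the OpenMS parser: `ToyParsed`, `readBracket` and `splitNotation` are hypothetical names, the modification text is kept as a raw string rather than looked up in any database, and treating a bracket at the very start of the string as N-terminal even without a leading dot is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Result of the toy split; modification texts are kept verbatim.
struct ToyParsed
{
  std::string n_term_mod;
  std::vector<std::pair<char, std::string>> residues; // (letter, modification text or "")
  std::string c_term_mod;
};

// Reads a bracketed group starting at s[pos] ('(' or '['), returns its content
// without the outer brackets and advances pos past the closing bracket.
// Nested brackets of the same kind, as in "Label:18O(2)", are preserved.
std::string readBracket(const std::string& s, std::size_t& pos)
{
  const char open = s[pos];
  const char close = (open == '(') ? ')' : ']';
  const std::size_t start = pos + 1;
  int depth = 0;
  for (; pos < s.size(); ++pos)
  {
    if (s[pos] == open)
    {
      ++depth;
    }
    else if (s[pos] == close && --depth == 0)
    {
      std::string content = s.substr(start, pos - start);
      ++pos;
      return content;
    }
  }
  return s.substr(start); // unbalanced input: take the remainder
}

ToyParsed splitNotation(const std::string& s)
{
  ToyParsed out;
  std::size_t pos = 0;
  auto isOpen = [](char c) { return c == '(' || c == '['; };

  if (pos < s.size() && s[pos] == '.') ++pos; // optional leading dot
  // Assumption of this sketch: a bracket before the first residue is N-terminal.
  if (pos < s.size() && isOpen(s[pos])) out.n_term_mod = readBracket(s, pos);

  while (pos < s.size())
  {
    const char c = s[pos];
    if (c == '.')
    {
      // Trailing dot: a bracket to its right is the C-terminal modification.
      ++pos;
      if (pos < s.size() && isOpen(s[pos])) out.c_term_mod = readBracket(s, pos);
    }
    else if (isOpen(c))
    {
      // A bracket directly after a residue modifies that residue.
      std::string mod = readBracket(s, pos);
      if (!out.residues.empty()) out.residues.back().second = mod;
    }
    else
    {
      out.residues.emplace_back(c, "");
      ++pos;
    }
  }
  return out;
}
```

Under these rules, `splitNotation(".DFPIAMGER(Phospho).")` attaches "Phospho" to the final R and leaves `c_term_mod` empty, whereas `splitNotation(".DFPIAMGER.(Label:18O(2))")` leaves R unmodified and stores "Label:18O(2)" as the C-terminal modification.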
Note that there is a subtle difference between AASequence::fromString(".DFPIAM[+16]GER.") and AASequence::fromString(".DFPIAM[+15.9949]GER."): the former will try to find the first modification matching a mass difference of 16 +/- 0.5, whereas the latter will try to find the modification closest to the exact mass. The latter usually gives the intended result, while the former may not.
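The difference between the two lookup styles can be sketched with a toy table. Everything here is an assumption made for illustration: `ToyMod`, `toyTable`, `firstWithinHalfDalton` and `closestByMass` are hypothetical names, the entry "HypotheticalModA" and its mass are invented, the table order is arbitrary, and the closest-match function applies no tolerance. The real lookup in ModificationsDB may differ in all of these respects.

```cpp
#include <cmath>
#include <limits>
#include <string>
#include <vector>

// Toy modification entry; NOT the OpenMS ResidueModification class.
struct ToyMod
{
  std::string name;
  double delta_mono; // monoisotopic mass difference
};

// Hypothetical table; the first entry is invented to show the effect of ordering.
std::vector<ToyMod> toyTable()
{
  return {{"HypotheticalModA", 16.3}, {"Oxidation", 15.9949}};
}

// Integer-style lookup: first entry whose delta lies within +/- 0.5 of the query.
std::string firstWithinHalfDalton(const std::vector<ToyMod>& table, double query)
{
  for (const ToyMod& m : table)
  {
    if (std::fabs(m.delta_mono - query) <= 0.5) return m.name;
  }
  return "";
}

// Exact-style lookup: entry whose delta is closest to the query.
std::string closestByMass(const std::vector<ToyMod>& table, double query)
{
  std::string best;
  double best_diff = std::numeric_limits<double>::infinity();
  for (const ToyMod& m : table)
  {
    const double diff = std::fabs(m.delta_mono - query);
    if (diff < best_diff)
    {
      best_diff = diff;
      best = m.name;
    }
  }
  return best;
}
```

With this table, `firstWithinHalfDalton(toyTable(), 16.0)` returns the invented entry because it comes first and lies within the 0.5 window, while `closestByMass(toyTable(), 15.9949)` returns "Oxidation".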
Arbitrary/unknown amino acids (usually due to an unknown modification) can be specified using tags preceded by X: "X[weight]". This indicates a new amino acid ("X") with the specified weight, e.g. "RX[148.5]T". Note that this tag does not alter the amino acids to the left (R) or right (T). Rather, X represents an amino acid on its own. Be careful when converting such AASequence objects to an EmpiricalFormula using getFormula(), as tags will not be considered in this case (no formula exists for them). However, they do influence getMonoWeight() and getAverageWeight().
Note: For C- and N-terminal modifications, the absolute mass is assumed to be 1 (H) for the N-terminus and 17 (OH) for the C-terminus. A modification specified with an absolute mass, such as n[43]PEPTIDE, therefore translates to n[+42]PEPTIDE in relative masses. Note that there can be ambiguity in cases where the loss includes the terminal amino acids.
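The conversion from an absolute terminal mass to a relative one is a single subtraction, sketched below using the nominal masses stated in the note. `ToyTerminus`, `unmodifiedTerminusMass`, `absoluteToRelative` and `formatDelta` are hypothetical helper names for this illustration and are not OpenMS functions.

```cpp
#include <string>

enum class ToyTerminus { N, C };

// Nominal mass assumed for an unmodified terminus: 1 (H) for N, 17 (OH) for C.
int unmodifiedTerminusMass(ToyTerminus t)
{
  return t == ToyTerminus::N ? 1 : 17;
}

// Relative (delta) mass = absolute mass minus the unmodified terminus mass.
int absoluteToRelative(ToyTerminus t, int absolute_mass)
{
  return absolute_mass - unmodifiedTerminusMass(t);
}

// Formats a delta with an explicit sign for non-negative values, e.g. 42 -> "+42".
std::string formatDelta(int delta)
{
  return (delta >= 0 ? "+" : "") + std::to_string(delta);
}
```

For the example in the note, `absoluteToRelative(ToyTerminus::N, 43)` yields 42, which `formatDelta` renders as "+42", matching n[+42]PEPTIDE.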