Class for the enzymatic digestion of proteins.
#include <OpenMS/CHEMISTRY/EnzymaticDigestion.h>
Public Types | |
enum | Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY } |
When querying for valid digestion products, this determines whether the specificity of the two peptide ends is taken into account. | |
Public Member Functions | |
EnzymaticDigestion () | |
Default constructor. | |
EnzymaticDigestion (const EnzymaticDigestion &rhs) | |
Copy constructor. | |
EnzymaticDigestion & | operator= (const EnzymaticDigestion &rhs) |
Assignment operator. | |
Size | getMissedCleavages () const |
Returns the number of missed cleavages for the digestion. | |
void | setMissedCleavages (Size missed_cleavages) |
Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used. | |
String | getEnzymeName () const |
Returns the enzyme for the digestion. | |
void | setEnzyme (const String name) |
Sets the enzyme for the digestion. | |
Specificity | getSpecificity () const |
Returns the specificity for the digestion. | |
void | setSpecificity (Specificity spec) |
Sets the specificity for the digestion (default is SPEC_FULL). | |
void | digest (const AASequence &protein, std::vector< AASequence > &output) const |
Performs the enzymatic digestion of a protein. | |
void | digestUnmodifiedString (const StringView sequence, std::vector< StringView > &output, Size min_length=1, Size max_length=0) const |
Performs the enzymatic digestion of an unmodified protein String. Because it returns only references into the original string, it is much faster. max_length restricts the maximum length of reported peptides (0 = no restriction). | |
Size | peptideCount (const AASequence &protein) |
Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings. | |
bool | isValidProduct (const AASequence &protein, Size pep_pos, Size pep_length, bool methionine_cleavage=false, bool ignore_missed_cleavages=true) const |
Returns true if the peptide at position pep_pos with length pep_length within protein was generated by the current model. | |
bool | isValidProduct (const String &protein, Size pep_pos, Size pep_length, bool methionine_cleavage=false, bool ignore_missed_cleavages=true) const |
Static Public Member Functions | |
static Specificity | getSpecificityByName (const String &name) |
Static Public Attributes | |
static const std::string | NamesOfSpecificity [SIZE_OF_SPECIFICITY] |
Names of the Specificity. | |
static const std::string | UnspecificCleavage |
Name for unspecific cleavage. | |
Protected Member Functions | |
std::vector< Size > | tokenize_ (const String &protein) const |
Returns the naive cleavage site positions without specificity (including '0' as first position, but not size() as last). | |
Size | countMissedCleavages_ (const std::vector< Size > &cleavage_positions, Size pep_start, Size pep_end) const |
Protected Attributes | |
Size | missed_cleavages_ |
Number of missed cleavages. | |
Enzyme | enzyme_ |
Used enzyme. | |
Specificity | specificity_ |
Specificity of the enzyme. | |
Class for the enzymatic digestion of proteins.
Digestion can be performed using simple regular expressions, e.g. [KR] | [^P] for trypsin. Missed cleavages can also be modeled, i.e. adjacent peptides that stay joined because the enzyme malfunctioned or could not access the cleavage site. If n missed cleavages are given, all possible resulting peptides (cleaved and uncleaved) with up to n missed cleavages are returned. No random selection of just n specific missed cleavage sites is performed.
An alternative model is also available in EnzymaticDigestionLogModel.
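The missed-cleavage enumeration described above can be illustrated with a small standalone sketch. It does not use OpenMS and is not the library's implementation: `trypticStarts` and `digestSketch` are hypothetical names, and the trypsin rule is hard-coded here as "cut after K or R unless the next residue is P" instead of being read from an enzyme definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an OpenMS function): start positions of the fully
// cleaved fragments under a trypsin-like rule. Position 0 is always included.
std::vector<std::size_t> trypticStarts(const std::string& protein)
{
  std::vector<std::size_t> starts{0};
  for (std::size_t i = 0; i + 1 < protein.size(); ++i)
  {
    const bool after_kr = protein[i] == 'K' || protein[i] == 'R';
    if (after_kr && protein[i + 1] != 'P') starts.push_back(i + 1);
  }
  return starts;
}

// Enumerates every peptide that spans 1 .. max_missed + 1 adjacent fragments,
// i.e. all peptides with up to max_missed missed cleavages.
std::vector<std::string> digestSketch(const std::string& protein, std::size_t max_missed)
{
  std::vector<std::string> peptides;
  if (protein.empty()) return peptides;

  std::vector<std::size_t> bounds = trypticStarts(protein);
  bounds.push_back(protein.size()); // past-the-end boundary
  const std::size_t fragments = bounds.size() - 1;

  for (std::size_t missed = 0; missed <= max_missed && missed < fragments; ++missed)
  {
    for (std::size_t i = 0; i + missed < fragments; ++i)
    {
      peptides.push_back(protein.substr(bounds[i], bounds[i + missed + 1] - bounds[i]));
    }
  }
  return peptides;
}
```

For the sequence AKRPGRC this rule gives the fragments AK, RPGR and C. Allowing one missed cleavage adds AKRPGR and RPGRC, so every combination is reported and none is picked at random.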
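enum Specificity |
When querying for valid digestion products, this determines whether the specificity of the two peptide ends is taken into account.
EnzymaticDigestion | ( | ) |
Default constructor.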
EnzymaticDigestion | ( | const EnzymaticDigestion & | rhs | ) |
Copy constructor.
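Size countMissedCleavages_ | ( | const std::vector< Size > & | cleavage_positions, |
Size | pep_start, | ||
Size | pep_end | ||
) | const |
inline, protected |
cleavage_positions | Positions of cleavage in protein as obtained from tokenize_() |
pep_start | Index into protein sequence |
pep_end | Past-the-end index into protein sequence |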
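One plausible reading of the countMissedCleavages_() parameters is that a missed cleavage is any cleavage position lying strictly inside the half-open peptide range [pep_start, pep_end). The standalone sketch below counts positions on that assumption. `countMissedSketch` is a hypothetical name, and the actual OpenMS implementation may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts cleavage positions strictly between pep_start and pep_end.
// A position equal to pep_start or pep_end is a peptide boundary, not a
// missed cleavage (assumption made for this sketch).
std::size_t countMissedSketch(const std::vector<std::size_t>& cleavage_positions,
                              std::size_t pep_start, std::size_t pep_end)
{
  assert(pep_start <= pep_end);
  const auto inside = std::count_if(
      cleavage_positions.begin(), cleavage_positions.end(),
      [pep_start, pep_end](std::size_t pos) { return pos > pep_start && pos < pep_end; });
  return static_cast<std::size_t>(inside);
}
```

With cleavage positions {0, 2, 6}, a peptide covering [0, 6) contains one missed cleavage (position 2), while a peptide covering [2, 6) contains none.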
void digest | ( | const AASequence & | protein, |
std::vector< AASequence > & | output | ||
) | const |
Performs the enzymatic digestion of a protein.
void digestUnmodifiedString | ( | const StringView | sequence, |
std::vector< StringView > & | output, | ||
Size | min_length = 1 , |
||
Size | max_length = 0 |
||
) | const |
Performs the enzymatic digestion of an unmodified protein String. Because it returns only references into the original string, it is much faster. max_length restricts the maximum length of reported peptides (0 = no restriction).
Referenced by SimpleSearchEngine::main_(), and RNPxlSearch::main_().
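The idea of returning references into the original string, filtered by min_length and max_length, can be sketched without OpenMS by using std::string_view (C++17) in place of the library's StringView. `digestViewsSketch` is a hypothetical name, the trypsin-like rule is hard-coded, and the missed-cleavage limit is passed as an argument here, whereas the class stores it as a setting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Returns views into `sequence` for every peptide with at most `max_missed`
// missed cleavages whose length passes the filters. max_length == 0 means
// "no upper limit", mirroring the documented parameter.
std::vector<std::string_view> digestViewsSketch(std::string_view sequence,
                                                std::size_t max_missed,
                                                std::size_t min_length = 1,
                                                std::size_t max_length = 0)
{
  std::vector<std::string_view> out;
  if (sequence.empty()) return out;

  // Fragment boundaries: 0, every cut site, and the past-the-end index.
  std::vector<std::size_t> bounds{0};
  for (std::size_t i = 0; i + 1 < sequence.size(); ++i)
  {
    const bool after_kr = sequence[i] == 'K' || sequence[i] == 'R';
    if (after_kr && sequence[i + 1] != 'P') bounds.push_back(i + 1);
  }
  bounds.push_back(sequence.size());
  const std::size_t fragments = bounds.size() - 1;

  for (std::size_t i = 0; i < fragments; ++i)
  {
    for (std::size_t j = i + 1; j <= fragments && j - i - 1 <= max_missed; ++j)
    {
      const std::size_t len = bounds[j] - bounds[i];
      if (len < min_length) continue;
      if (max_length != 0 && len > max_length) continue;
      out.push_back(sequence.substr(bounds[i], len)); // no character data is copied
    }
  }
  return out;
}
```

Because each result is only a pointer and a length, the source string must outlive the returned views. The same lifetime consideration would apply to any view-based output.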
String getEnzymeName | ( | ) | const |
Returns the enzyme for the digestion.
Size getMissedCleavages | ( | ) | const |
Returns the number of missed cleavages for the digestion.
Specificity getSpecificity | ( | ) | const |
Returns the specificity for the digestion.
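static Specificity getSpecificityByName | ( | const String & | name | ) |
static |
Converts a specificity name to the corresponding enum value. Returns SIZE_OF_SPECIFICITY if name is not valid.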
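The lookup performed by getSpecificityByName() can be pictured as a linear search through the name table, with SIZE_OF_SPECIFICITY acting as the "not found" value. The sketch below is standalone. The strings "full", "semi" and "none" are assumed contents of NamesOfSpecificity, and `specificityByNameSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <string>

enum Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY };

// Assumed name table, indexed by the enum value (illustration only).
const std::string kNamesOfSpecificity[SIZE_OF_SPECIFICITY] = {"full", "semi", "none"};

// Returns the enum value whose name matches, or SIZE_OF_SPECIFICITY if none does.
Specificity specificityByNameSketch(const std::string& name)
{
  for (int i = 0; i < SIZE_OF_SPECIFICITY; ++i)
  {
    if (kNamesOfSpecificity[i] == name) return static_cast<Specificity>(i);
  }
  return SIZE_OF_SPECIFICITY;
}
```

Callers should compare the result against SIZE_OF_SPECIFICITY before passing it on, since that value is a sentinel and not a usable specificity.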
bool isValidProduct | ( | const AASequence & | protein, |
Size | pep_pos, | ||
Size | pep_length, | ||
bool | methionine_cleavage = false , |
||
bool | ignore_missed_cleavages = true |
||
) | const |
Returns true if the peptide at position pep_pos with length pep_length within protein was generated by the current model.
Referenced by IDFilter::DigestionFilter::operator()().
bool isValidProduct | ( | const String & | protein, |
Size | pep_pos, | ||
Size | pep_length, | ||
bool | methionine_cleavage = false , |
||
bool | ignore_missed_cleavages = true |
||
) | const |
Returns true if the peptide at position pep_pos with length pep_length within protein was generated by the current model.
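How the Specificity setting could influence isValidProduct() is sketched below in standalone form. The sketch assumes that SPEC_FULL requires both peptide ends to sit on cleavage sites (or protein termini), SPEC_SEMI requires at least one, and SPEC_NONE accepts any in-range peptide. It ignores the methionine_cleavage and ignore_missed_cleavages parameters, hard-codes a trypsin-like rule, and uses the hypothetical names `isBoundarySketch` and `isValidProductSketch`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY };

// True if `pos` is a protein terminus or a trypsin-like cut site
// (previous residue K or R, residue at pos not P).
bool isBoundarySketch(const std::string& protein, std::size_t pos)
{
  if (pos == 0 || pos == protein.size()) return true;
  const char prev = protein[pos - 1];
  return (prev == 'K' || prev == 'R') && protein[pos] != 'P';
}

bool isValidProductSketch(const std::string& protein, std::size_t pep_pos,
                          std::size_t pep_length, Specificity spec)
{
  // Reject empty peptides and ranges that leave the protein.
  if (pep_length == 0 || pep_pos >= protein.size()) return false;
  if (pep_length > protein.size() - pep_pos) return false;
  if (spec == SPEC_NONE) return true;

  const bool start_ok = isBoundarySketch(protein, pep_pos);
  const bool end_ok = isBoundarySketch(protein, pep_pos + pep_length);
  if (spec == SPEC_FULL) return start_ok && end_ok;
  return start_ok || end_ok; // SPEC_SEMI
}
```

In AKRPGRC, the peptide KRPGR (position 1, length 5) ends on a cleavage site but does not start on one, so under these assumptions it passes with SPEC_SEMI and fails with SPEC_FULL.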
EnzymaticDigestion& operator= | ( | const EnzymaticDigestion & | rhs | ) |
Assignment operator.
Size peptideCount | ( | const AASequence & | protein | ) |
Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings.
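The count can be derived without generating any peptides. If a protein splits into f fully cleaved fragments, there are f - k peptides that span exactly k + 1 adjacent fragments, so the total for up to m missed cleavages is the sum of f - k for k = 0 .. min(m, f - 1). The standalone sketch below computes that sum. `peptideCountSketch` is a hypothetical name that takes the fragment count directly, whereas the member function takes the protein sequence.

```cpp
#include <cassert>
#include <cstddef>

// Number of peptides obtainable from `fragments` fully cleaved fragments
// when up to `max_missed` missed cleavages are allowed.
std::size_t peptideCountSketch(std::size_t fragments, std::size_t max_missed)
{
  std::size_t total = 0;
  for (std::size_t missed = 0; missed <= max_missed && missed < fragments; ++missed)
  {
    total += fragments - missed; // peptides spanning missed + 1 adjacent fragments
  }
  return total;
}
```

For three fragments this gives 3 peptides with no missed cleavages, 5 with up to one, and 6 with up to two. Larger limits add nothing, because the whole protein is already included.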
void setEnzyme | ( | const String | name | ) |
Sets the enzyme for the digestion.
Referenced by TOPPOpenPepXLLF::main_(), SimpleSearchEngine::main_(), TOPPOpenPepXL::main_(), and RNPxlSearch::main_().
void setMissedCleavages | ( | Size | missed_cleavages | ) |
Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used.
Referenced by TOPPOpenPepXLLF::main_(), SimpleSearchEngine::main_(), TOPPOpenPepXL::main_(), and RNPxlSearch::main_().
void setSpecificity | ( | Specificity | spec | ) |
Sets the specificity for the digestion (default is SPEC_FULL).
std::vector< Size > tokenize_ | ( | const String & | protein | ) | const |
protected |
Returns the naive cleavage site positions without specificity (including '0' as first position, but not size() as last).
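Enzyme enzyme_ |
protected |
Used enzyme.
Size missed_cleavages_ |
protected |
Number of missed cleavages.
const std::string NamesOfSpecificity[SIZE_OF_SPECIFICITY] |
static |
Names of the Specificity.
Specificity specificity_ |
protected |
Specificity of the enzyme.
const std::string UnspecificCleavage |
static |
Name for unspecific cleavage.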
OpenMS / TOPP release 2.3.0 | Documentation generated on Tue Jan 9 2018 18:22:08 using doxygen 1.8.13 |