Chemistry

Peptide and protein identification in particular requires a lot of data and data structures for chemical entities. OpenMS offers classes for elements, formulas, peptides, etc. The classes described in this section can be found in the CHEMISTRY folder.

Elements

OpenMS represents chemical elements with the class Element, which stores the relevant information about one element. The elements are managed by the class ElementDB, which is implemented as a singleton: there is only one instance of the class in OpenMS. This works well because the elements do not change during execution. The data stored in an Element includes its name, symbol, atomic weight, and isotope distribution, among others.

Example (Tutorial_Element.cpp):

const ElementDB * db = ElementDB::getInstance();
Element carbon = *db->getElement("Carbon"); // .getElement("C") would also be ok
cout << carbon.getName() << " "
     << carbon.getSymbol() << " "
     << carbon.getMonoWeight() << " "
     << carbon.getAverageWeight() << endl;

Elements are accessed through the ElementDB class. Because it is implemented as a singleton, only a pointer to the single instance is available, via getInstance(). The example program writes the following output to the console.

Carbon C 12 12.0107
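
The singleton idea behind ElementDB can be sketched with the standard library alone. The names ElementInfo and ElementTable below are hypothetical stand-ins invented for this illustration and are not part of OpenMS; the sketch only assumes that a read-only table is built once and that lookups by full name or by symbol return the same record.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an element record; not the OpenMS Element class.
struct ElementInfo
{
  std::string name;
  std::string symbol;
  double average_weight;
};

// Hypothetical read-only table that exists exactly once per program.
class ElementTable
{
public:
  // Returns a pointer to the single shared instance (created on first use).
  static const ElementTable* getInstance()
  {
    static const ElementTable instance;
    return &instance;
  }

  // Looks up an element by full name or by symbol; nullptr if unknown.
  const ElementInfo* getElement(const std::string& key) const
  {
    auto it = index_.find(key);
    return it == index_.end() ? nullptr : &elements_[it->second];
  }

  // Copying is disabled so that no second instance can appear.
  ElementTable(const ElementTable&) = delete;
  ElementTable& operator=(const ElementTable&) = delete;

private:
  // Private constructor: only getInstance() can build the table.
  ElementTable() :
    elements_{{"Carbon", "C", 12.0107}, {"Hydrogen", "H", 1.00794}, {"Oxygen", "O", 15.9994}}
  {
    for (std::size_t i = 0; i < elements_.size(); ++i)
    {
      index_[elements_[i].name] = i;   // key by full name
      index_[elements_[i].symbol] = i; // and by symbol
    }
  }

  std::vector<ElementInfo> elements_;
  std::map<std::string, std::size_t> index_;
};
```

With this sketch, ElementTable::getInstance()->getElement("Carbon") and ElementTable::getInstance()->getElement("C") return the same pointer, and every call to getInstance() yields the same table.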

EmpiricalFormula

The elements described in the section above can be combined into empirical formulas. Typical applications are computing the exact weights of molecules such as peptides, and their isotopic distributions. The class supports a large number of operations, such as addition and subtraction. A simple example is given in the next few lines of code.

Example (Tutorial_EmpiricalFormula.cpp):

EmpiricalFormula methanol("CH3OH"), water("H2O");
EmpiricalFormula sum = methanol + water;
const Element * carbon = ElementDB::getInstance()->getElement("Carbon");
cout << sum << " "
<< sum.getNumberOf(carbon) << " "
<< sum.getAverageWeight() << endl;

Two instances of EmpiricalFormula are created and summed, and some information about the resulting formula is printed to the terminal. The next lines show how to create and handle an isotopic distribution of a given formula.

IsotopeDistribution iso_dist = sum.getIsotopeDistribution(3);
for (IsotopeDistribution::ConstIterator it = iso_dist.begin(); it != iso_dist.end(); ++it)
{
  cout << it->first << " " << it->second << endl;
}

The isotopic distribution can be accessed with the getIsotopeDistribution() function. The parameter of this function specifies how many isotope peaks should be reported. In our case, 3 are enough, as the following values become very small. For larger molecules, or when one wants the exact distribution, this number can be set much higher. The output of the code snippet might look like this.

O2CH6 1 50.0571
50 0.98387
51 0.0120698
52 0.00406

Residue

A residue is represented in OpenMS by the class Residue. It provides a container for the amino acids as well as some functionality. The class can provide information such as the isotope distribution of the residue and its average and monoisotopic weight. Residues can be identified by their full name, their three-letter abbreviation, or their single-letter abbreviation. A residue can also be modified, which is implemented in the Modification class. Additional, less frequently used parameters of a residue, such as the gas-phase basicity and pK values, are also available.

Example (Tutorial_Residue.cpp):

const ResidueDB * res_db = ResidueDB::getInstance();
Residue lys = *res_db->getResidue("Lysine"); // .getResidue("K") would also be ok
cout << lys.getName() << " "
<< lys.getThreeLetterCode() << " "
<< lys.getOneLetterCode() << " "
<< lys.getAverageWeight() << endl;

This small example shows how to obtain the instance of ResidueDB, in which all residues are stored. The amino acids themselves can be accessed via the getResidue function. ResidueDB reads its amino acid and modification data from share/OpenMS/CHEMISTRY/.

The output of the example would look like this:

Lysine LYS K 146.188

AASequence

This class handles amino acid sequences in OpenMS. A string of amino acid residues can be turned into an instance of AASequence to provide some commonly used operations and data. The implementation supports operations such as addition and subtraction. The average and monoisotopic weights and the isotope distribution are also accessible.

Weights, formulas and isotope distribution can be calculated depending on the charge state (additional proton count in case of positive ions) and ion type. Therefore, the class allows for a flexible handling of amino acid strings.

A very simple example of handling amino acid sequences with AASequence is given in the next few lines.

Example (Tutorial_AASequence.cpp):

AASequence seq = AASequence::fromString("DFPIANGER");
AASequence prefix(seq.getPrefix(4));
AASequence suffix(seq.getSuffix(5));
cout << seq << " "
<< prefix << " "
<< suffix << " "
<< seq.getAverageWeight() << endl;

Besides access to the prefix, suffix, and subsequences, most of the features of EmpiricalFormula and Residue described above are supported. Additionally, a number of predicates such as hasSuffix are available. The output of the code snippet looks like this.

DFPIANGER DFPI ANGER 1018.08

TheoreticalSpectrumGenerator

This class implements a simple generator that produces theoretical tandem MS spectra for a given combination of peptide and charge. Various options influence which ions occur and their intensities.

Example (Tutorial_TheoreticalSpectrumGenerator.cpp):

TheoreticalSpectrumGenerator tsg;
PeakSpectrum spec1, spec2;
AASequence peptide = AASequence::fromString("DFPIANGER");
// standard behavior is adding b- and y-ions of charge 1
Param p;
p.setValue("add_b_ions", "false", "Add peaks of b-ions to the spectrum");
tsg.setParameters(p);
tsg.getSpectrum(spec1, peptide, 1, 1);
p.setValue("add_b_ions", "true", "Add peaks of b-ions to the spectrum");
tsg.setParameters(p);
tsg.getSpectrum(spec2, peptide, 1, 2);
cout << "Spectrum 1 has " << spec1.size() << " peaks. " << endl;
cout << "Spectrum 2 has " << spec2.size() << " peaks. " << endl;

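The example shows how to put peaks of a certain type, y-ions in this case, into a spectrum: the first call disables b-ions, so spectrum 1 contains only the y-ions of charge 1. For spectrum 2, b-ions are switched back on and the charge range is extended to 2, so it holds the b- and y-ions of charges 1 and 2. Further ion types (such as a-ions) and losses are controlled by additional parameters. The TheoreticalSpectrumGenerator has many parameters, which are described in detail in the class documentation. The output of the program looks like the following two lines.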

Spectrum 1 has 8 peaks.
Spectrum 2 has 32 peaks.

OpenMS / TOPP release 2.3.0 Documentation generated on Tue Jan 9 2018 18:22:05 using doxygen 1.8.13