This tutorial gives an overview of how to use peak intensity prediction (PIP). PIP predicts the peak intensity of a peptide, relative to other peptides of the same abundance, from its sequence alone. The predicted value can in turn be used to correct peak intensities for peptide-specific instrument sensitivity in label-free quantitation.
This method is still in an early phase: A proof of concept has been conducted and published in [1]. Peak intensities can be predicted with significant correlations, but application tests are yet to come.
The sensitivity of a mass spectrometer depends on the analysed peptides, among other factors. This peptide-specific sensitivity causes peak heights of peptides with the same abundance to be generally different. PIP incorporates a model that maps peptide sequences to peptide-specific sensitivities.
The incorporated model has been trained with a Local Linear Map [2], a machine learning algorithm that uses both supervised and unsupervised learning in its training and is fast and easy to implement. Better results can be achieved with other learning architectures [3], but these are not yet implemented at this prototype stage.
The model that the PIP module uses has been trained with data from a Bruker Ultraflex MALDI-TOF instrument. Details about these data can be found in [3]. A squared Pearson correlation of 0.43 was achieved in ten-fold cross-validation, and of 0.34 across datasets from the same instrument (but with different settings and operators). The performance across instruments has not been evaluated yet, so we would be pleased if you could share your experience with applying the model incorporated in PIP to other datasets.
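To compare predictions against your own measurements with the same figure of merit, the squared Pearson correlation can be computed as in the following minimal sketch. It uses only the C++ standard library; the function name `squaredPearson` is hypothetical and not part of OpenMS, and returning 0 for a series without any spread is a choice made here for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Squared Pearson correlation (r^2) between two equally long series.
// Returns 0.0 if either series has zero variance, where r is undefined.
double squaredPearson(const std::vector<double>& a, const std::vector<double>& b)
{
  assert(a.size() == b.size() && !a.empty());
  const double n = static_cast<double>(a.size());

  // Means of both series.
  double meanA = 0.0, meanB = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;

  // Sums of cross products and of squared deviations.
  double cov = 0.0, varA = 0.0, varB = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double da = a[i] - meanA;
    const double db = b[i] - meanB;
    cov += da * db;
    varA += da * da;
    varB += db * db;
  }
  if (varA == 0.0 || varB == 0.0) return 0.0;

  // r^2 = cov^2 / (varA * varB); the common factor 1/n cancels.
  return (cov * cov) / (varA * varB);
}
```

Because the correlation is invariant to shifting and positive rescaling, the two series do not need to be on the same scale.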
At this point, it is not possible to train a model with your own data, but this is a planned feature. It is not yet known how similar peptide-specific sensitivities are across different MALDI instruments.
PIP lets you predict intensities using peptide sequences as input. The output values have been normalized to a mean of 0 and a variance of 1.
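As an illustration of what this normalization means, the sketch below rescales an arbitrary series to mean 0 and variance 1. This is not the code PIP uses internally: the function name `standardize` is hypothetical, and the use of the population variance (divisor n) is an assumption, since the source does not say which variance estimate was applied.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Rescale values to mean 0 and variance 1 (population variance, divisor n).
std::vector<double> standardize(const std::vector<double>& values)
{
  assert(!values.empty());
  const double n = static_cast<double>(values.size());

  double mean = 0.0;
  for (double v : values) mean += v;
  mean /= n;

  double variance = 0.0;
  for (double v : values) variance += (v - mean) * (v - mean);
  variance /= n;

  const double sd = std::sqrt(variance);
  std::vector<double> result;
  result.reserve(values.size());
  for (double v : values)
  {
    // A constant input has no spread; map it to 0 instead of dividing by 0.
    result.push_back(sd > 0.0 ? (v - mean) / sd : 0.0);
  }
  return result;
}
```

Applying the same rescaling to your own log-transformed intensities is one way to look at measured and predicted values on a comparable scale.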
To test PIP with data from your instrument, MALDI spectra that contain only peptides of one protein can be used:
To calculate relative peptide abundance (relative to the other peptides in the mixture) from the intensities of a peptide mixture using values predicted by PIP, perform steps 2 to 4 above. Then calculate the peptide level x = exp(log(tI) - pI). Note that quantification with an actual protein mixture has never been tested with this model.
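The formula above can be sketched in plain C++ as follows. The sketch assumes that tI is the measured intensity of a peptide and pI is the value PIP predicts for it; the function name `peptideLevels` is hypothetical and not part of OpenMS.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Peptide level x = exp(log(tI) - pI) for each peptide.
// tI: measured intensities (assumed strictly positive),
// pI: values predicted by PIP for the same peptides, in the same order.
std::vector<double> peptideLevels(const std::vector<double>& tI,
                                  const std::vector<double>& pI)
{
  assert(tI.size() == pI.size());
  std::vector<double> levels;
  levels.reserve(tI.size());
  for (std::size_t i = 0; i < tI.size(); ++i)
  {
    assert(tI[i] > 0.0); // log is only defined for positive intensities
    levels.push_back(std::exp(std::log(tI[i]) - pI[i]));
  }
  return levels;
}
```

Algebraically, x equals tI / exp(pI): a peptide with a higher predicted value has its measured intensity scaled down more strongly.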
There is a usage example for the PeakIntensityPredictor class in doc/code_examples/Tutorial_PeakIntensityPredictor.cpp.
Sequences of peptides to be predicted should be stored in a vector of AASequence instances:
Then create an instance of the model, and predict the peak intensities of the peptides:
You can output AASequence instances like normal strings:
[1] Wiebke Timm: Peak Intensity Prediction in Mass Spectra using Machine Learning Methods, PhD Thesis (2008)
[2] Helge Ritter: Learning with Self-Organizing Map, In T. Kohonen et al., eds.: Artificial Neural Networks, Elsevier Science Publishers (1991), 379-384
[3] W. Timm, A. Scherbart, S. Böcker, O. Kohlbacher, T.W. Nattkemper: Peak Intensity Prediction in MALDI-TOF Mass Spectrometry: A Machine Learning Study to support Quantitative Proteomics, BMC Bioinformatics (2008)
OpenMS / TOPP release 2.3.0 | Documentation generated on Thu Jan 4 2018 02:04:07 using doxygen 1.8.13 |