OpenMS
RTPredict

This application predicts peptide retention times or peptide separation, that is, whether a peptide is collected by the column or ends up in the flowthrough.

The methods and applications of this model are described in several publications:

Nico Pfeifer, Andreas Leinenbach, Christian G. Huber and Oliver Kohlbacher. Statistical learning of peptide retention behavior in chromatographic separations: A new kernel-based approach for computational proteomics. BMC Bioinformatics 2007, 8:468.

Nico Pfeifer, Andreas Leinenbach, Christian G. Huber and Oliver Kohlbacher. Improving Peptide Identification in Proteome Analysis by a Two-Dimensional Retention Time Filtering Approach. J. Proteome Res. 2009, 8(8):4109-15.

The input of this application is an SVM model and a file with peptide identifications (idXML or text). The SVM model file is specified by the svm_model parameter on the command line or in the INI file. This file should have been produced by the RTModel application.
For retention time prediction, the peptide sequences are extracted from the idXML or text input file and passed to the SVM. The SVM then predicts retention times according to the trained model. The predicted retention times are stored as

<userParam name="predicted_retention_time" value="<predicted retention time>" />

inside the peptide entities in the idXML output file.
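
As a minimal sketch of reading these values back, the following Python snippet collects the predicted retention time of every peptide entry in an idXML-like document. It assumes that each peptide entry is an XML element with a `sequence` attribute and that the parameter is a direct child of that element, as in the line above. The function name `predicted_rts` and the reduced sample document are hypothetical. The tag name is compared case-insensitively because the capitalization in real files may differ from the one shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced idXML-like document used only for illustration.
SAMPLE_IDXML = """<IdXML>
  <IdentificationRun>
    <PeptideIdentification>
      <PeptideHit sequence="DFPIANGER">
        <userParam name="predicted_retention_time" value="1520.5" />
      </PeptideHit>
      <PeptideHit sequence="LVNELTEFAK">
        <userParam name="predicted_retention_time" value="2310.0" />
      </PeptideHit>
      <PeptideHit sequence="AAAK" />
    </PeptideIdentification>
  </IdentificationRun>
</IdXML>"""


def predicted_rts(xml_text, param_name="predicted_retention_time"):
    """Return (sequence, predicted RT) pairs for entries that carry the parameter."""
    root = ET.fromstring(xml_text)
    results = []
    for element in root.iter():
        sequence = element.get("sequence")
        if sequence is None:
            continue  # assumption: peptide entries are the elements with 'sequence'
        for child in element:
            # Compare tag names case-insensitively and ignore a namespace prefix.
            local_tag = child.tag.rsplit("}", 1)[-1].lower()
            if local_tag == "userparam" and child.get("name") == param_name:
                results.append((sequence, float(child.get("value"))))
    return results


if __name__ == "__main__":
    for sequence, rt in predicted_rts(SAMPLE_IDXML):
        print(f"{sequence}\t{rt}")
```

Entries without the parameter (such as the third peptide in the sample) are skipped rather than reported with a placeholder value.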

For separation prediction, you have to specify two output file names: 'out_id:positive' is the file for the peptides that are predicted to be collected by the column, and 'out_id:negative' is the file for the predicted flowthrough peptides.

Retention time prediction and separation prediction cannot be combined!
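
The two modes can be kept apart when a call is assembled programmatically. The sketch below builds an argument list for either mode and refuses to mix them. It only uses the options documented on this page (`-in_id`, `-svm_model`, `-out_id:file`, `-out_id:positive`, `-out_id:negative`). The helper name `build_rtpredict_command` and all file names are hypothetical, and text input and output are deliberately left out.

```python
def build_rtpredict_command(svm_model, in_id, out_file=None,
                            out_positive=None, out_negative=None):
    """Build an RTPredict argument list for exactly one prediction mode."""
    rt_mode = out_file is not None
    separation_mode = out_positive is not None or out_negative is not None

    if rt_mode and separation_mode:
        raise ValueError("retention time and separation prediction cannot be combined")
    if not rt_mode and not separation_mode:
        raise ValueError("no output file given")

    command = ["RTPredict", "-in_id", in_id, "-svm_model", svm_model]
    if rt_mode:
        command += ["-out_id:file", out_file]
    else:
        # Separation prediction needs both output files.
        if out_positive is None or out_negative is None:
            raise ValueError("out_id:positive and out_id:negative are both required")
        command += ["-out_id:positive", out_positive,
                    "-out_id:negative", out_negative]
    return command


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(" ".join(build_rtpredict_command("model.txt", "peptides.idXML",
                                           out_file="predicted.idXML")))
```

The returned list can be passed to `subprocess.run(command, check=True)` on a system where RTPredict is installed.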

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.
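
A possible round trip for mzid data is sketched below: convert to idXML, predict, and convert the result back. The `-in` and `-out` options of IDFileConverter are not documented on this page and are assumed here. The helper name `mzid_round_trip_commands` and the intermediate file names are hypothetical.

```python
def mzid_round_trip_commands(mzid_in, svm_model, mzid_out, work_prefix="tmp"):
    """Return the three commands of an mzid -> idXML -> RTPredict -> mzid round trip."""
    idxml_in = f"{work_prefix}_input.idXML"       # hypothetical intermediate file
    idxml_out = f"{work_prefix}_predicted.idXML"  # hypothetical intermediate file
    return [
        # Assumed IDFileConverter options: -in <file> -out <file>
        ["IDFileConverter", "-in", mzid_in, "-out", idxml_in],
        ["RTPredict", "-in_id", idxml_in, "-svm_model", svm_model,
         "-out_id:file", idxml_out],
        ["IDFileConverter", "-in", idxml_out, "-out", mzid_out],
    ]


if __name__ == "__main__":
    for command in mzid_round_trip_commands("run.mzid", "model.txt", "run_rt.mzid"):
        print(" ".join(command))
```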

The command line parameters of this tool are:

RTPredict -- Predicts retention times for peptides using a model trained by RTModel.
Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/TOPP_RTPredict.html
Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  RTPredict <options>

Options (mandatory options marked with '*'):
  -in_id <file>                Peptides with precursor information (valid formats: 'idXML')
  -in_text <file>              Peptides as text-based file (valid formats: 'txt')
  -in_oligo_params <file>      Input file with additional model parameters when using the OLIGO kernel (valid
                               formats: 'paramXML')
  -in_oligo_trainset <file>    Input file with the used training dataset when using the OLIGO kernel (valid 
                               formats: 'txt')
  -svm_model <file>*           Svm model in libsvm format (can be produced by RTModel) (valid formats: 'txt')

  -total_gradient_time <time>  The time (in seconds) of the gradient (peptide RT prediction) (default: '1.0')
                               (min: '1.0e-05')

Output files in idXML format:
  -out_id:file <file>          Output file with peptide RT prediction (valid formats: 'idXML')
  -out_id:positive <file>      Output file in idXML format containing positive predictions for peptide
                               separation prediction - requires 'out_id:negative' to be present as well.
                               (valid formats: 'idXML')
  -out_id:negative <file>      Output file in idXML format containing negative predictions for peptide
                               separation prediction - requires 'out_id:positive' to be present as well.
                               (valid formats: 'idXML')

Output files in text format:
  -out_text:file <file>        Output file with predicted RT values (valid formats: 'csv')

                               
Common TOPP options:
  -ini <file>                  Use the given TOPP INI file
  -threads <n>                 Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>            Writes the default configuration file
  --help                       Shows options
  --helphelp                   Shows all options (including advanced)

INI file documentation of this tool:

+ RTPredict: Predicts retention times for peptides using a model trained by RTModel.
    version (default: 3.0.0): Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'RTPredict'
      in_id: Peptides with precursor information (input file, *.idXML)
      in_text: Peptides as text-based file (input file, *.txt)
      in_oligo_params: Input file with additional model parameters when using the OLIGO kernel (input file, *.paramXML)
      in_oligo_trainset: Input file with the used training dataset when using the OLIGO kernel (input file, *.txt)
      svm_model (required): SVM model in libsvm format (can be produced by RTModel) (input file, *.txt)
      total_gradient_time (default: 1.0): The time (in seconds) of the gradient (peptide RT prediction) (range: 1.0e-05:∞)
      max_number_of_peptides (default: 100000): The maximum number of peptides considered at once (a bigger number leads to faster results but needs more memory).
      log: Name of log file (created only when specified)
      debug (default: 0): Sets the debug level
      threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
      no_progress (default: false): Disables progress logging to command line (true, false)
      force (default: false): Overrides tool-specific checks (true, false)
      test (default: false): Enables the test mode (needed for internal use only) (true, false)
    +++ out_id: Output files in idXML format
        file: Output file with peptide RT prediction (output file, *.idXML)
        positive: Output file in idXML format containing positive predictions for peptide separation prediction - requires 'out_id:negative' to be present as well. (output file, *.idXML)
        negative: Output file in idXML format containing negative predictions for peptide separation prediction - requires 'out_id:positive' to be present as well. (output file, *.idXML)
        rewrite_peptideidentification_rtmz (default: false): Rewrites each peptideidentification's rt and mz from prediction and calculation (according to the best hit) (true, false)
    +++ out_text: Output files in text format
        file: Output file with predicted RT values (output file, *.csv)
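
The nesting of the parameter listing above (tool, instance '1', subsections such as out_id) is what produces colon-separated option names like `out_id:file`. The sketch below flattens a file written with `-write_ini` into such names. It assumes that the INI file is XML built from nested `NODE` elements containing `ITEM` elements with `name` and `value` attributes. That layout is not shown on this page, and the sample document and the function name `flatten_ini` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, reduced INI document; the NODE/ITEM layout is an assumption.
SAMPLE_INI = """<PARAMETERS>
  <NODE name="RTPredict">
    <ITEM name="version" value="3.0.0" type="string" />
    <NODE name="1">
      <ITEM name="svm_model" value="" type="input-file" />
      <ITEM name="total_gradient_time" value="1.0" type="double" />
      <NODE name="out_id">
        <ITEM name="file" value="" type="output-file" />
      </NODE>
    </NODE>
  </NODE>
</PARAMETERS>"""


def flatten_ini(xml_text):
    """Map colon-joined parameter paths to their (string) values."""
    flat = {}

    def walk(node, prefix):
        for child in node:
            name = child.get("name")
            if child.tag == "NODE":
                walk(child, prefix + [name])  # descend into a section
            elif child.tag == "ITEM":
                flat[":".join(prefix + [name])] = child.get("value")

    walk(ET.fromstring(xml_text), [])
    return flat


if __name__ == "__main__":
    for path, value in flatten_ini(SAMPLE_INI).items():
        print(f"{path} = {value!r}")
```

Values are kept as strings; converting them according to the `type` attribute is left out of this sketch.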
Todo:
This needs serious clean up! Combining certain input and output options will result in strange behaviour, especially when using text output/input.