ConsensusID

Computes a consensus from results of multiple peptide identification engines.

potential predecessor tools   →   ConsensusID   →   potential successor tools
IDPosteriorErrorProbability                         PeptideIndexer
IDFilter
IDMapper

Reference:

Nahnsen et al.: Probabilistic consensus scoring improves tandem mass spectrometry peptide identification (J. Proteome Res., 2011, PMID: 21644507).

Algorithms:

ConsensusID offers several algorithms that can aggregate results from multiple peptide identification engines ("search engines") into consensus identifications - typically one per MS2 spectrum. This works especially well for search engines that provide more than one peptide hit per spectrum, i.e. that report not just the best hit, but also a list of runner-up candidates with corresponding scores.

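The available algorithms are (see also OpenMS::ConsensusIDAlgorithm and its subclasses):

* PEPMatrix: Scoring based on posterior error probabilities (PEPs) and peptide sequence similarities (scored by a substitution matrix). Requires PEPs as scores.
* PEPIons: Scoring based on posterior error probabilities (PEPs) and fragment ion similarities ('shared peak count'). Requires PEPs as scores.
* best: For each peptide ID, use the best score of any search engine as the consensus score. Requires the same score type in all ID runs.
* worst: For each peptide ID, use the worst score of any search engine as the consensus score. Requires the same score type in all ID runs.
* average: For each peptide ID, use the average score of all search engines as the consensus. Requires the same score type in all ID runs.
* ranks: Calculates a consensus score based on the ranks of peptide IDs in the results of different search engines. The final score is in the range (0, 1], with 1 being the best score. No requirements about score types.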

PEPs for search results can be calculated using the IDPosteriorErrorProbability tool, which supports a variety of search engines.
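
The three simplest algorithms (best, worst and average, as described for the '-algorithm' option) reduce the scores that different engines assign to the same peptide to a single value. The following is a minimal sketch of that idea, not the OpenMS implementation. The function name, the data layout, the `higher_is_better` switch and the example scores are all assumptions made for illustration, and so is the choice to average only over engines that actually report the peptide.

```python
from statistics import mean


def simple_consensus(scores_by_engine, algorithm="best", higher_is_better=False):
    """Combine per-engine scores for the peptides of one spectrum.

    scores_by_engine: {engine name: {peptide sequence: score}}
    All engines are assumed to use the same score type.
    """
    # Collect every score that any engine reported for each peptide.
    combined = {}
    for hits in scores_by_engine.values():
        for peptide, score in hits.items():
            combined.setdefault(peptide, []).append(score)

    # "best" and "worst" depend on the orientation of the score type.
    pick_best = max if higher_is_better else min
    pick_worst = min if higher_is_better else max
    reducers = {"best": pick_best, "worst": pick_worst, "average": mean}
    reduce_scores = reducers[algorithm]
    return {peptide: reduce_scores(values) for peptide, values in combined.items()}


# Hypothetical PEP-like scores (lower is better) from two engines.
runs = {
    "engine_a": {"PEPTIDEK": 0.01, "PEPTLDEK": 0.20},
    "engine_b": {"PEPTIDEK": 0.05},
}
best = simple_consensus(runs, "best")
worst = simple_consensus(runs, "worst")
average = simple_consensus(runs, "average")
```

Because all three reductions compare raw scores directly, they only make sense when every ID run uses the same score type, which is the requirement stated for these algorithms.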

Note
Important: All protein-level identification results will be lost by applying ConsensusID. (It is unclear how potentially conflicting protein-level results from different search engines should be combined.) If necessary, run the PeptideIndexer tool to add protein references for peptides again.
Peptides with different post-translational modifications (PTMs), or with different site localizations of the same PTMs, are treated as different peptides by all algorithms. One qualification applies to the PEPMatrix algorithm: its similarity scoring method can only take unmodified peptide sequences into account, so PTMs are ignored during that step. The PTMs are not removed from the peptides, though, and differently modified peptides still get separate results.
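
The distinction between "separate results per modified peptide" and "similarity computed on unmodified sequences" can be illustrated with a small sketch. It assumes a notation in which modifications appear in parentheses after the residue (for example "PEPT(Phospho)IDESK"); the pattern, the helper names and the example sequences are hypothetical and do not come from the OpenMS code.

```python
import re

# Assumption: modifications are written in parentheses after the residue;
# other notations would need a different pattern.
MOD_PATTERN = re.compile(r"\([^)]*\)")


def strip_modifications(sequence):
    """Return the unmodified sequence, as used for similarity scoring only."""
    return MOD_PATTERN.sub("", sequence)


def group_by_unmodified(sequences):
    """Map each unmodified sequence to the modified forms that share it."""
    groups = {}
    for seq in sequences:
        groups.setdefault(strip_modifications(seq), []).append(seq)
    return groups


hits = ["PEPT(Phospho)IDESK", "PEPTIDES(Phospho)K", "PEPTIDESK"]
distinct_results = set(hits)        # each modified form stays a separate result
groups = group_by_unmodified(hits)  # all three share one unmodified sequence
```

In this example the three hits remain three distinct peptides, but a similarity step that only sees unmodified sequences would treat them as identical.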

File types:

Different input file types are supported: idXML, featureXML and consensusXML (see the '-in' option below).

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.
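
For idXML input, the '-rt_delta' and '-mz_delta' options (see the command line parameters below) limit how far apart two identifications may be in retention time and precursor m/z while still being assigned to the same spectrum. The sketch below shows one simple way such a grouping could work. It is a greedy simplification under assumed data structures (tuples of retention time, m/z and a label), not the procedure OpenMS uses, and the example values are made up.

```python
def group_identifications(ids, rt_delta=0.1, mz_delta=0.1):
    """Greedily group identifications that lie within both tolerances.

    ids: list of (rt, mz, label) tuples.
    Returns a list of groups, each a list of labels.
    """
    groups = []  # each entry: [reference rt, reference mz, member labels]
    for rt, mz, label in sorted(ids):
        for group in groups:
            if abs(rt - group[0]) <= rt_delta and abs(mz - group[1]) <= mz_delta:
                group[2].append(label)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append([rt, mz, [label]])
    return [members for _, _, members in groups]


# Hypothetical identifications from two engines.
ids = [
    (1200.00, 500.25, "engine_a: PEPTIDEK"),
    (1200.05, 500.26, "engine_b: PEPTIDEK"),
    (1200.05, 612.80, "engine_a: ANOTHERK"),
    (1350.40, 500.25, "engine_b: PEPTLDEK"),
]
groups = group_identifications(ids)
```

With the default tolerances of 0.1, the first two identifications fall into one group, while the other two differ too much in m/z or retention time and form groups of their own.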

Filtering:

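Generally, search results can be filtered according to various criteria using IDFilter before (or after) applying this tool. ConsensusID itself offers only a limited number of filtering options that are especially useful in its context (see the filter parameter section):

* considered_hits: The number of top hits in each ID run that are considered for consensus scoring ('0' for all hits).
* min_support: For each peptide hit from an ID run, the fraction of other ID runs that must support that hit (otherwise it is removed).
* count_empty: Count empty ID runs (i.e. those containing no peptide hit for the current spectrum) when calculating 'min_support'?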
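
The interplay of the 'min_support' and 'count_empty' filter parameters is easiest to see on a small example. The following sketch follows the wording of the parameter descriptions (support is the fraction of other ID runs that contain the same peptide) but is not taken from the OpenMS source. The function name and data layout are hypothetical, and treating the support as 0 when there are no other runs to compare against is an assumption.

```python
def filter_by_support(runs, min_support=0.0, count_empty=False):
    """Remove peptide hits that too few other ID runs support.

    runs: list of sets of peptide sequences, one set per ID run,
          all for the same spectrum.
    """
    kept = []
    for i, hits in enumerate(runs):
        others = [r for j, r in enumerate(runs) if j != i]
        if not count_empty:
            # Ignore runs that have no hit at all for this spectrum.
            others = [r for r in others if r]
        kept_hits = set()
        for peptide in hits:
            supporting = sum(1 for r in others if peptide in r)
            # Assumption: with no other runs to compare, support is 0.
            support = supporting / len(others) if others else 0.0
            if support >= min_support:
                kept_hits.add(peptide)
        kept.append(kept_hits)
    return kept


# Three ID runs for one spectrum; the third run found nothing.
runs = [{"PEPTIDEK", "PEPTLDEK"}, {"PEPTIDEK"}, set()]
strict = filter_by_support(runs, min_support=0.6, count_empty=True)
lenient = filter_by_support(runs, min_support=0.6, count_empty=False)
```

With the empty run counted, "PEPTIDEK" is supported by only one of two other runs (0.5) and is removed at a threshold of 0.6. With the empty run ignored, its support is 1.0 and it is kept.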

The command line parameters of this tool are:

ConsensusID -- Computes a consensus of peptide identifications of several identification engines.
Version: 2.3.0 Jan  9 2018, 17:46:23, Revision: 38ae115

Usage:
  ConsensusID <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in <file>*                       Input file (valid formats: 'idXML', 'featureXML', 'consensusXML')
  -out <file>*                      Output file (valid formats: 'idXML', 'featureXML', 'consensusXML')
                                    
  -rt_delta <value>                 [idXML input only] Maximum allowed retention time deviation between
                                    identifications belonging to the same spectrum. (default: '0.1' min: '0')
  -mz_delta <value>                 [idXML input only] Maximum allowed precursor m/z deviation between
                                    identifications belonging to the same spectrum. (default: '0.1' min: '0')

Options for filtering peptide hits:
  -filter:considered_hits <number>  The number of top hits in each ID run that are considered for consensus 
                                    scoring ('0' for all hits). (default: '0' min: '0')
  -filter:min_support <value>       For each peptide hit from an ID run, the fraction of other ID runs that 
                                    must support that hit (otherwise it is removed). (default: '0' min: '0'
                                    max: '1')
  -filter:count_empty               Count empty ID runs (i.e. those containing no peptide hit for the current
                                    spectrum) when calculating 'min_support'?

  -algorithm <choice>               Algorithm used for consensus scoring.
                                    * PEPMatrix: Scoring based on posterior error probabilities (PEPs) and
                                    peptide sequence similarities (scored by a substitution matrix). Requires
                                    PEPs as scores.
                                    * PEPIons: Scoring based on posterior error probabilities (PEPs) and
                                    fragment ion similarities ('shared peak count'). Requires PEPs as scores.
                                    * best: For each peptide ID, use the best score of any search engine as
                                    the consensus score. Requires the same score type in all ID runs.
                                    ...
                                    t', 'average', 'ranks')
                                    
Common TOPP options:
  -ini <file>                       Use the given TOPP INI file
  -threads <n>                      Sets the number of threads allowed to be used by the TOPP tool (default: 
                                    '1')
  -write_ini <file>                 Writes the default configuration file
  --help                            Shows options
  --helphelp                        Shows all options (including advanced)

The following configuration subsections are valid:
 - PEPIons     PEPIons algorithm parameters
 - PEPMatrix   PEPMatrix algorithm parameters

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
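
When the tool is called from a script, the options listed above can be assembled programmatically. The sketch below builds an argument list using only flags that appear in the help text; the helper name and the file names are hypothetical, and the defaults mirror the documented ones ('PEPMatrix', '0', '0').

```python
import shlex


def consensusid_command(in_file, out_file, algorithm="PEPMatrix",
                        considered_hits=0, min_support=0.0, count_empty=False):
    """Build the argument list for a ConsensusID call (does not run it)."""
    args = [
        "ConsensusID",
        "-in", in_file,
        "-out", out_file,
        "-algorithm", algorithm,
        "-filter:considered_hits", str(considered_hits),
        "-filter:min_support", str(min_support),
    ]
    if count_empty:
        # The help text lists this option as a flag without a value.
        args.append("-filter:count_empty")
    return args


# Hypothetical file names.
cmd = consensusid_command("merged.idXML", "consensus.idXML",
                          algorithm="best", considered_hits=5, min_support=0.5)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where the TOPP tools are installed.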

INI file documentation of this tool:

Legend: '*' marks a required parameter.

+ ConsensusID: Computes a consensus of peptide identifications of several identification engines.
    version = 2.3.0: Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'ConsensusID'
      in*: input file [tag: input file; formats: *.idXML, *.featureXML, *.consensusXML]
      out*: output file [tag: output file; formats: *.idXML, *.featureXML, *.consensusXML]
      rt_delta = 0.1: [idXML input only] Maximum allowed retention time deviation between identifications belonging to the same spectrum. [range: 0:∞]
      mz_delta = 0.1: [idXML input only] Maximum allowed precursor m/z deviation between identifications belonging to the same spectrum. [range: 0:∞]
      algorithm = PEPMatrix: Algorithm used for consensus scoring. [valid: PEPMatrix, PEPIons, best, worst, average, ranks]
          * PEPMatrix: Scoring based on posterior error probabilities (PEPs) and peptide sequence similarities (scored by a substitution matrix). Requires PEPs as scores.
          * PEPIons: Scoring based on posterior error probabilities (PEPs) and fragment ion similarities ('shared peak count'). Requires PEPs as scores.
          * best: For each peptide ID, use the best score of any search engine as the consensus score. Requires the same score type in all ID runs.
          * worst: For each peptide ID, use the worst score of any search engine as the consensus score. Requires the same score type in all ID runs.
          * average: For each peptide ID, use the average score of all search engines as the consensus. Requires the same score type in all ID runs.
          * ranks: Calculates a consensus score based on the ranks of peptide IDs in the results of different search engines. The final score is in the range (0, 1], with 1 being the best score. No requirements about score types.
      log: Name of log file (created only when specified)
      debug = 0: Sets the debug level
      threads = 1: Sets the number of threads allowed to be used by the TOPP tool
      no_progress = false: Disables progress logging to command line [valid: true, false]
      force = false: Overwrite tool specific checks. [valid: true, false]
      test = false: Enables the test mode (needed for internal use only) [valid: true, false]
    +++ filter: Options for filtering peptide hits
        considered_hits = 0: The number of top hits in each ID run that are considered for consensus scoring ('0' for all hits). [range: 0:∞]
        min_support = 0: For each peptide hit from an ID run, the fraction of other ID runs that must support that hit (otherwise it is removed). [range: 0:1]
        count_empty = false: Count empty ID runs (i.e. those containing no peptide hit for the current spectrum) when calculating 'min_support'? [valid: true, false]
    +++ PEPIons: PEPIons algorithm parameters
        mass_tolerance = 0.5: Maximum difference between fragment masses (in Da) for fragments to be considered 'shared' between peptides. [range: 0:∞]
        min_shared = 2: The minimal number of 'shared' fragments (between two suggested peptides) that is necessary to evaluate the similarity based on shared peak count (SPC). [range: 1:∞]
    +++ PEPMatrix: PEPMatrix algorithm parameters
        matrix = identity: Substitution matrix to use for alignment-based similarity scoring [valid: identity, PAM30MS]
        penalty = 5: Alignment gap penalty (the same value is used for gap opening and extension) [range: 1:∞]
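
The PEPIons parameters 'mass_tolerance' and 'min_shared' both refer to a shared peak count (SPC) between the fragments of two candidate peptides. The sketch below illustrates how such a count could be computed from two lists of fragment masses. It is an illustration only: the function names are hypothetical, fragment masses are assumed to be given rather than derived from sequences, and the normalization by the shorter list and the return value of 0.0 below 'min_shared' are assumptions, not the documented OpenMS behaviour.

```python
def shared_peak_count(masses_a, masses_b, mass_tolerance=0.5):
    """Count one-to-one fragment pairs whose masses differ by at most the tolerance (Da)."""
    a, b = sorted(masses_a), sorted(masses_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= mass_tolerance:
            shared += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # fragment of A is too light to match, move on in A
        else:
            j += 1  # fragment of B is too light to match, move on in B
    return shared


def spc_similarity(masses_a, masses_b, mass_tolerance=0.5, min_shared=2):
    """Similarity in [0, 1] based on the shared peak count (normalization assumed)."""
    shared = shared_peak_count(masses_a, masses_b, mass_tolerance)
    if shared < min_shared:
        return 0.0  # assumption: too few shared fragments to evaluate similarity
    return shared / min(len(masses_a), len(masses_b))


# Hypothetical fragment masses of two candidate peptides.
frags_a = [100.0, 200.0, 300.0, 400.0]
frags_b = [100.2, 250.0, 300.4, 500.0]
default_similarity = spc_similarity(frags_a, frags_b)
tight_similarity = spc_similarity(frags_a, frags_b, mass_tolerance=0.3)
```

With the default tolerance of 0.5 Da, two fragment pairs count as shared; with a tolerance of 0.3 Da only one pair remains, which is below the default 'min_shared' of 2.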

OpenMS / TOPP release 2.3.0 Documentation generated on Tue Jan 9 2018 18:22:05 using doxygen 1.8.13