OpenMS
Assigns protein/peptide identifications to features or consensus features.
potential predecessor tools | → IDMapper → | potential successor tools |
---|---|---|
MascotAdapter (or other ID engines) | | ConsensusID |
IDFilter | | MapAlignerIdentification |
The mapping is based on retention times and mass-to-charge values. Roughly, a peptide identification is assigned to a (consensus) feature if its position lies within the boundaries of the feature or close enough to the feature centroid. Peptide identifications that don't match anywhere are still recorded in the resulting map, as "unassigned peptides". Protein identifications are annotated to the whole map, i.e. not to any particular (consensus) feature.
In all cases, tolerance in RT and m/z dimension is applied according to the parameters rt_tolerance and mz_tolerance. Tolerance is understood as "plus or minus x", so the matching range is actually increased by twice the tolerance value.
If several features or consensus features overlap the position of a peptide identification (taking the allowed tolerances into account), the identification is annotated to all of them.
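The matching rule described above can be sketched in a few lines of Python. This is an illustrative model, not the OpenMS implementation: the names `Position`, `absolute_mz_tolerance`, `matches` and `assign` are hypothetical, only centroid matching is shown, and the sketch assumes that a ppm tolerance is taken relative to the feature m/z.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Position:
    rt: float  # retention time in seconds
    mz: float  # mass-to-charge ratio


def absolute_mz_tolerance(mz: float, tolerance: float, measure: str = "ppm") -> float:
    """Convert an m/z tolerance to Da; a ppm value is taken relative to `mz`."""
    if measure == "ppm":
        return mz * tolerance * 1e-6
    if measure == "Da":
        return tolerance
    raise ValueError(f"unknown mz_measure: {measure!r}")


def matches(peptide: Position, centroid: Position, rt_tolerance: float = 5.0,
            mz_tolerance: float = 20.0, mz_measure: str = "ppm") -> bool:
    """'Plus or minus x' in both dimensions around the feature centroid."""
    # Assumption: ppm is computed from the feature m/z, not the peptide m/z.
    mz_tol = absolute_mz_tolerance(centroid.mz, mz_tolerance, mz_measure)
    return (abs(peptide.rt - centroid.rt) <= rt_tolerance
            and abs(peptide.mz - centroid.mz) <= mz_tol)


def assign(peptide: Position, centroids: List[Position], **tolerances) -> List[int]:
    """Return the indices of ALL matching features; an empty list means 'unassigned'."""
    return [i for i, c in enumerate(centroids) if matches(peptide, c, **tolerances)]


peptide = Position(rt=100.0, mz=500.005)
features = [Position(103.0, 500.000), Position(96.0, 500.012), Position(120.0, 500.000)]
hits = assign(peptide, features)
print(hits if hits else "unassigned")
```

With the default tolerances, 20 ppm at m/z 500 corresponds to about ±0.01 Da, so the first two features both match and the identification is annotated to both. The third feature is 20 seconds away in RT and is rejected.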
Annotation of feature maps (featureXML input):
If all features have at least one convex hull, peptide positions are matched against the bounding boxes of the convex hulls (of individual mass traces, if available) by default. If not, the positions of the feature centroids are used. The respective coordinates of the centroids are also used for matching (in place of the corresponding ranges from the bounding boxes) if feature:use_centroid_rt or feature:use_centroid_mz are true.
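As a rough illustration of how the bounding boxes and the two centroid flags interact, consider the following sketch. The `Box` and `Feature` classes and the function names are hypothetical stand-ins rather than OpenMS types, and the m/z tolerance is given in Da to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    rt_min: float
    rt_max: float
    mz_min: float
    mz_max: float


@dataclass
class Feature:
    rt: float  # centroid RT in seconds
    mz: float  # centroid m/z
    boxes: List[Box] = field(default_factory=list)  # bounding boxes of mass-trace hulls


def within(value: float, low: float, high: float, tol: float) -> bool:
    """Range check widened by the tolerance on both sides."""
    return low - tol <= value <= high + tol


def feature_matches(pep_rt, pep_mz, feature, use_boxes, rt_tol=5.0, mz_tol=0.01,
                    use_centroid_rt=False, use_centroid_mz=True) -> bool:
    point = Box(feature.rt, feature.rt, feature.mz, feature.mz)
    for box in (feature.boxes if use_boxes else [point]):
        # A centroid flag replaces the box range in that dimension by the centroid.
        rt_low, rt_high = (feature.rt, feature.rt) if use_centroid_rt else (box.rt_min, box.rt_max)
        mz_low, mz_high = (feature.mz, feature.mz) if use_centroid_mz else (box.mz_min, box.mz_max)
        if within(pep_rt, rt_low, rt_high, rt_tol) and within(pep_mz, mz_low, mz_high, mz_tol):
            return True
    return False


def annotate(pep_rt, pep_mz, features, **options) -> List[int]:
    # Bounding boxes are only used if every feature in the map has a hull.
    use_boxes = all(f.boxes for f in features)
    return [i for i, f in enumerate(features)
            if feature_matches(pep_rt, pep_mz, f, use_boxes, **options)]


two_traces = Feature(100.0, 500.0, [Box(90.0, 110.0, 499.99, 500.01),
                                    Box(92.0, 108.0, 500.49, 500.51)])
print(annotate(112.0, 500.5, [two_traces]))                         # centroid m/z
print(annotate(112.0, 500.5, [two_traces], use_centroid_mz=False))  # mass-trace m/z ranges
```

In this toy case the peptide position lies on the second mass trace, so it matches only when the m/z ranges of the traces are used. With the centroid m/z it does not match, which is the behaviour the help text recommends when masses are computed from peptide sequences.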
Annotation of consensus maps (consensusXML input):
Peptide positions are always matched against centroid positions. By default, the consensus centroids are used. However, if consensus:use_subelements is set, the centroids of sub-features are considered instead. In this case, a peptide identification is mapped to a consensus feature if any of its sub-features matches.
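The effect of consensus:use_subelements can be sketched as follows. The function and variable names are hypothetical, positions are plain (RT, m/z) tuples, and the m/z tolerance is assumed to be given in Da.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (RT in seconds, m/z)


def close(a: Point, b: Point, rt_tol: float, mz_tol: float) -> bool:
    """True if both coordinates differ by no more than the tolerance."""
    return abs(a[0] - b[0]) <= rt_tol and abs(a[1] - b[1]) <= mz_tol


def consensus_matches(peptide: Point, centroid: Point, sub_features: List[Point],
                      use_subelements: bool = False,
                      rt_tol: float = 5.0, mz_tol: float = 0.01) -> bool:
    # Either the consensus centroid alone, or every sub-feature centroid.
    candidates = sub_features if use_subelements else [centroid]
    return any(close(peptide, c, rt_tol, mz_tol) for c in candidates)


centroid = (100.0, 500.0)
subs = [(90.0, 500.0), (110.0, 500.0)]  # e.g. the same analyte in two runs
peptide = (86.0, 500.002)

print(consensus_matches(peptide, centroid, subs))
print(consensus_matches(peptide, centroid, subs, use_subelements=True))
```

Here the identification is 14 seconds away from the consensus centroid but only 4 seconds from the first sub-feature, so it is mapped only when sub-features are considered.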
The command line parameters of this tool are:
IDMapper -- Assigns protein/peptide identifications to features or consensus features.
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_IDMapper.html
Version: 3.2.0 Nov 26 2024, 13:16:38, Revision: 962e60f
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  IDMapper <options>

Options (mandatory options marked with '*'):
  -id <file>*                        Protein/peptide identifications file (valid formats: 'mzid', 'idXML')
  -in <file>*                        Feature map/consensus map file (valid formats: 'featureXML', 'consensusXML')
  -out <file>*                       Output file (the format depends on the input file format). (valid formats: 'featureXML', 'consensusXML')
  -rt_tolerance <value>              RT tolerance (in seconds) for the matching of peptide identifications and (consensus) features.
                                     Tolerance is understood as 'plus or minus x', so the matching range increases by twice the given value. (default: '5.0') (min: '0.0')
  -mz_tolerance <value>              M/z tolerance (in ppm or Da) for the matching of peptide identifications and (consensus) features.
                                     Tolerance is understood as 'plus or minus x', so the matching range increases by twice the given value. (default: '20.0') (min: '0.0')
  -mz_measure <choice>               Unit of 'mz_tolerance'. (default: 'ppm') (valid: 'ppm', 'Da')
  -mz_reference <choice>             Source of m/z values for peptide identifications. If 'precursor', the precursor-m/z from the idXML is used. If 'peptide',
                                     masses are computed from the sequences of peptide hits; in this case, an identification matches if any of its hits matches.
                                     ('peptide' should be used together with 'feature:use_centroid_mz' to avoid false-positive matches.) (default: 'peptide') (valid: 'precursor', 'peptide')

Additional options for featureXML input:
  -feature:use_centroid_rt <choice>  Use the RT coordinates of the feature centroids for matching, instead of the RT ranges of the features/mass traces.
                                     (default: 'false') (valid: 'true', 'false')
  -feature:use_centroid_mz <choice>  Use the m/z coordinates of the feature centroids for matching, instead of the m/z ranges of the features/mass traces.
                                     (If you choose 'peptide' as 'mz_reference', you should usually set this flag to avoid false-positive matches.) (default: 'true') (valid: 'true', 'false')

Additional options for consensusXML input:
  -consensus:use_subelements         Match using RT and m/z of sub-features instead of consensus RT and m/z. A consensus feature matches if any of its sub-features matches.

Additional options for mzML input:
  -spectra:in <file>                 MS run used to annotated unidentified spectra to features or consensus features. (valid formats: 'mzML')

Common TOPP options:
  -ini <file>                        Use the given TOPP INI file
  -threads <n>                       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>                  Writes the default configuration file
  --help                             Shows options
  --helphelp                         Shows all options (including advanced)
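For scripted use, a command line can be assembled from the options listed above. The following Python sketch only builds and prints the call; the helper name `build_idmapper_command` and the file names are hypothetical, and the flags are limited to those shown in the help text.

```python
import shlex
from typing import List


def build_idmapper_command(id_file: str, in_file: str, out_file: str,
                           rt_tolerance: float = 5.0, mz_tolerance: float = 20.0,
                           mz_measure: str = "ppm",
                           mz_reference: str = "peptide") -> List[str]:
    """Assemble an IDMapper call, checking values against the documented choices."""
    if mz_measure not in ("ppm", "Da"):
        raise ValueError("mz_measure must be 'ppm' or 'Da'")
    if mz_reference not in ("precursor", "peptide"):
        raise ValueError("mz_reference must be 'precursor' or 'peptide'")
    if rt_tolerance < 0 or mz_tolerance < 0:
        raise ValueError("tolerances must be >= 0")
    return [
        "IDMapper",
        "-id", id_file,
        "-in", in_file,
        "-out", out_file,
        "-rt_tolerance", str(rt_tolerance),
        "-mz_tolerance", str(mz_tolerance),
        "-mz_measure", mz_measure,
        "-mz_reference", mz_reference,
    ]


# Hypothetical file names, for illustration only.
cmd = build_idmapper_command("peptides.idXML", "features.featureXML",
                             "annotated.featureXML",
                             mz_tolerance=10.0, mz_reference="precursor")
print(shlex.join(cmd))

# To execute the call (requires an OpenMS installation on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.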
On the peptide side, two sources for m/z values are possible (see parameter mz_reference):
1. m/z of the precursor of the MS2 spectrum that gave rise to the peptide identification;
2. theoretical masses computed from the amino acid sequences of peptide hits.
(When using theoretical masses, make sure that peptide modifications were identified correctly. OpenMS currently "forgets" mass shifts that it can't assign to modifications. If that happens, masses computed from peptide sequences will be off.)
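To make the second option concrete, the sketch below computes a theoretical m/z from an unmodified amino acid sequence using standard monoisotopic residue masses. It is a simplified stand-in for what OpenMS does internally: the function name `theoretical_mz` is hypothetical and modifications are deliberately not handled.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added once for the peptide termini
PROTON = 1.007276   # mass of a proton


def theoretical_mz(sequence: str, charge: int) -> float:
    """m/z of an unmodified peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


mz = theoretical_mz("PEPTIDE", 2)
print(f"{mz:.4f}")
```

The caveat about "forgotten" mass shifts follows directly from this calculation. A shift that is not represented in the sequence, for example an oxidation of about +15.995 Da, moves the true m/z of a doubly charged peptide by roughly 8, which is far outside a 20 ppm window around the computed value.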
The parameters rt_delta and mz_delta have been renamed to rt_tolerance and mz_tolerance. The possible values of the mz_reference parameter have also been renamed. The default value of mz_tolerance has been increased from 1 ppm to a more realistic 20 ppm.
The use_centroids parameter from previous versions has been split into two parameters, feature:use_centroid_rt and feature:use_centroid_mz. In OpenMS 1.6, peptide identifications would be matched only against monoisotopic mass traces of features if mz_reference was PeptideMass; otherwise, all mass traces would be used. This implicit behaviour has been abandoned; you can now control it explicitly with the feature:use_centroid_mz parameter. feature:use_centroid_mz does not take into account m/z deviations in the monoisotopic mass trace, but this can be compensated for by increasing mz_tolerance. The new implementation should work correctly even if the monoisotopic mass trace itself was not detected.