OpenMS
FeatureFinderCentroided

The feature detection application for quantitation (centroided).

Potential predecessor tools: PeakPickerHiRes, SeedListGenerator
Potential successor tools: FeatureLinkerUnlabeled (or another feature grouping tool), MapAlignerPoseClustering (or another alignment tool)

Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).

This module identifies "features" in an LC/MS map. A feature is a peptide in an MS sample that shows a characteristic isotope distribution. For each peptide, the algorithm computes its position in the RT and m/z dimensions and a charge estimate.

The algorithm identifies pronounced regions of the data around so-called seeds. In the next step, it iteratively fits a model of the isotope profile and the retention time to these data points. Data points with a low probability under this model are removed from the feature region. The intensity of the feature is then given by the summed intensity of the data points that remain in its region.
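The refinement loop described above can be illustrated with a deliberately simplified sketch. It is not the OpenMS implementation: it covers only the retention time dimension, estimates a Gaussian from intensity-weighted moments instead of running a real model fit, and ignores the isotope profile entirely. The names `fit_gaussian`, `refine_region`, `min_rel_prob`, and `max_rounds` are hypothetical.

```python
import math


def fit_gaussian(points):
    """Estimate RT mean and standard deviation, weighted by intensity."""
    total = sum(intensity for _, intensity in points)
    mean = sum(rt * intensity for rt, intensity in points) / total
    var = sum(intensity * (rt - mean) ** 2 for rt, intensity in points) / total
    return mean, math.sqrt(var)


def refine_region(points, min_rel_prob=0.05, max_rounds=500):
    """Repeatedly fit the toy model and drop points it considers unlikely.

    A point is dropped when the Gaussian, scaled so that its apex is 1,
    falls below min_rel_prob at the point's RT (hypothetical cutoff).
    """
    region = list(points)
    for _ in range(max_rounds):
        mean, sigma = fit_gaussian(region)
        if sigma == 0:
            break  # all remaining points share one RT; nothing left to refine
        kept = [
            (rt, intensity)
            for rt, intensity in region
            if math.exp(-0.5 * ((rt - mean) / sigma) ** 2) >= min_rel_prob
        ]
        if len(kept) == len(region):
            break  # converged: no point was removed in this round
        region = kept
    return region


def feature_intensity(region):
    """Feature intensity = sum of the intensities left in the region."""
    return sum(intensity for _, intensity in region)


# Toy elution peak around RT 100 plus one stray point far away.
points = [(96, 10), (98, 40), (100, 100), (102, 40), (104, 10), (160, 5)]
region = refine_region(points)
total = feature_intensity(region)
```

With this toy input, the stray point at RT 160 is removed in the first round, the second round removes nothing, and the remaining five intensities sum to 200.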

The "TOPP tutorial" (on https://openms.readthedocs.io/) describes how to find suitable parameters and gives details of the implemented algorithms.

Specialized tools are available for some experimental techniques: IsobaricAnalyzer.

The command line parameters of this tool are:

FeatureFinderCentroided -- Detects two-dimensional features in LC-MS data.
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_FeatureFinderCentroided.html
Version: 3.2.0 Nov 26 2024, 13:16:38, Revision: 962e60f
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.
To cite FeatureFinderCentroided:
 + Sturm M. A novel feature detection algorithm for centroided data. Dissertation, 2010-09-15, p.37 ff. doi:https://publikationen.uni-tuebingen.de/xmlui/bitstream/handle/10900/49453/pdf/Dissertation_Marc_Sturm.pdf.

Usage:
  FeatureFinderCentroided <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <file>*        Input file (valid formats: 'mzML')
  -out <file>*       Output file (valid formats: 'featureXML')
  -seeds <file>      User specified seed list (valid formats: 'featureXML')
                     
                     
Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/release/3.2.0/html/TOPP_FeatureFinderCentroided.html
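The options listed in the help output above can be assembled into command lines programmatically. The following is a minimal sketch that only uses flags shown in that output (`-in`, `-out`, `-seeds`, `-ini`, `-threads`, `-write_ini`). The file names and the helper names `write_ini_command` and `detect_command` are hypothetical, and the tool is only executed if it is found on the PATH and the example input files exist.

```python
import os
import shutil
import subprocess

TOOL = "FeatureFinderCentroided"


def write_ini_command(ini_file):
    """Command that writes the default configuration file."""
    return [TOOL, "-write_ini", ini_file]


def detect_command(in_file, out_file, seeds=None, ini_file=None, threads=1):
    """Command that runs feature detection on a centroided mzML file."""
    cmd = [TOOL, "-in", in_file, "-out", out_file]
    if seeds is not None:
        cmd += ["-seeds", seeds]      # optional user-specified seed list
    if ini_file is not None:
        cmd += ["-ini", ini_file]     # optional TOPP INI file
    cmd += ["-threads", str(threads)]
    return cmd


# Hypothetical file names, used for illustration only.
ini_cmd = write_ini_command("ffc.ini")
cmd = detect_command("sample.mzML", "sample.featureXML",
                     ini_file="ffc.ini", threads=4)
print(" ".join(cmd))

# Run only when the tool and the example inputs are actually available.
if shutil.which(TOOL) and os.path.exists("sample.mzML") and os.path.exists("ffc.ini"):
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.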

INI file documentation of this tool:

+ FeatureFinderCentroided: Detects two-dimensional features in LC-MS data.
    version (default: 3.2.0): Version of the tool that generated this parameters file.
++ 1: Instance '1' section for 'FeatureFinderCentroided'
    in (tag: input file; formats: *.mzML): input file
    out (tag: output file; formats: *.featureXML): output file
    seeds (tag: input file; formats: *.featureXML): User specified seed list
    log: Name of log file (created only when specified)
    debug (default: 0): Sets the debug level
    threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
    no_progress (default: false; allowed: true, false): Disables progress logging to command line
    force (default: false; allowed: true, false): Overrides tool-specific checks
    test (default: false; allowed: true, false): Enables the test mode (needed for internal use only)
+++ algorithm: Algorithm section
    debug (default: false; allowed: true, false): When debug mode is activated, several files with intermediate results are written to the folder 'debug' (do not use in parallel mode).
++++ intensity: Settings for the calculation of a score indicating if a peak's intensity is significant in the local environment (between 0 and 1)
    bins (default: 10; range: 1:∞): Number of bins per dimension (RT and m/z). The higher this value, the more local the intensity significance score is. This parameter should be decreased if the algorithm is used on small regions of a map.
++++ mass_trace: Settings for the calculation of a score indicating if a peak is part of a mass trace (between 0 and 1).
    mz_tolerance (default: 0.03; range: 0.0:∞): Tolerated m/z deviation of peaks belonging to the same mass trace. It should be larger than the m/z resolution of the instrument. This value must be smaller than 1/charge_high!
    min_spectra (default: 10; range: 1:∞): Number of spectra that have to show a similar peak mass in a mass trace.
    max_missing (default: 1; range: 0:∞): Number of consecutive spectra where a high mass deviation or missing peak is acceptable. This parameter should be well below 'min_spectra'!
    slope_bound (default: 0.1; range: 0.0:∞): The maximum slope of mass trace intensities when extending from the highest peak. This parameter is important to separate overlapping elution peaks. It should be increased if feature elution profiles fluctuate a lot.
++++ isotopic_pattern: Settings for the calculation of a score indicating if a peak is part of an isotopic pattern (between 0 and 1).
    charge_low (default: 1; range: 1:∞): Lowest charge to search for.
    charge_high (default: 4; range: 1:∞): Highest charge to search for.
    mz_tolerance (default: 0.03; range: 0.0:∞): Tolerated m/z deviation from the theoretical isotopic pattern. It should be larger than the m/z resolution of the instrument. This value must be smaller than 1/charge_high!
    intensity_percentage (default: 10.0; range: 0.0:100.0): Isotopic peaks that contribute more than this percentage to the overall isotope pattern intensity must be present.
    intensity_percentage_optional (default: 0.1; range: 0.0:100.0): Isotopic peaks that contribute more than this percentage to the overall isotope pattern intensity can be missing.
    optional_fit_improvement (default: 2.0; range: 0.0:100.0): Minimal percentage improvement of isotope fit to allow leaving out an optional peak.
    mass_window_width (default: 25.0; range: 1.0:200.0): Window width in Dalton for precalculation of estimated isotope distributions.
    abundance_12C (default: 98.930000000000007; range: 0.0:100.0): Rel. abundance of the light carbon. Modify if labeled.
    abundance_14N (default: 99.632000000000005; range: 0.0:100.0): Rel. abundance of the light nitrogen. Modify if labeled.
++++ seed: Settings that determine which peaks are considered a seed
    min_score (default: 0.8; range: 0.0:1.0): Minimum seed score a peak has to reach to be used as seed. The seed score is the geometric mean of intensity score, mass trace score and isotope pattern score. If your features show a large deviation from the averagine isotope distribution or from a Gaussian elution profile, lower this score.
++++ fit: Settings for the model fitting
    max_iterations (default: 500; range: 1:∞): Maximum number of iterations of the fit.
++++ feature: Settings for the features (intensity, quality assessment, ...)
    min_score (default: 0.7; range: 0.0:1.0): Feature score threshold for a feature to be reported. The feature score is the geometric mean of the average relative deviation and the correlation between the model and the observed peaks.
    min_isotope_fit (default: 0.8; range: 0.0:1.0): Minimum isotope fit of the feature before model fitting.
    min_trace_score (default: 0.5; range: 0.0:1.0): Trace score threshold. Traces below this threshold are removed after the model fitting. This parameter is important for features that overlap in m/z dimension.
    min_rt_span (default: 0.333; range: 0.0:1.0): Minimum RT span in relation to extended area that has to remain after model fitting.
    max_rt_span (default: 2.5; range: 0.5:∞): Maximum RT span in relation to extended area that the model is allowed to have.
    rt_shape (default: symmetric; allowed: symmetric, asymmetric): Choose model used for RT profile fitting. If set to symmetric, a Gaussian shape is used; if set to asymmetric, an EGH shape is used.
    max_intersection (default: 0.35; range: 0.0:1.0): Maximum allowed intersection of features.
    reported_mz (default: monoisotopic; allowed: maximum, average, monoisotopic): The mass type that is reported for features.
        'maximum' returns the m/z value of the highest mass trace.
        'average' returns the intensity-weighted average m/z value of all contained peaks.
        'monoisotopic' returns the monoisotopic m/z value derived from the fitted isotope model.
++++ user-seed: Settings for user-specified seeds.
    rt_tolerance (default: 5.0; range: 0.0:∞): Allowed RT deviation of seeds from the user-specified seed position.
    mz_tolerance (default: 1.1; range: 0.0:∞): Allowed m/z deviation of seeds from the user-specified seed position.
    min_score (default: 0.5; range: 0.0:1.0): Overwrites 'seed:min_score' for user-specified seeds. The cutoff is typically a bit lower in this case.
++++ advanced
    pseudo_rt_shift (default: 500.0; range: 1.0:∞): Pseudo RT shift used when .
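Two rules stated in the parameter descriptions above lend themselves to a short worked example: the seed score is the geometric mean of the intensity, mass trace, and isotope pattern scores and is compared against 'seed:min_score', and both 'mz_tolerance' settings must be smaller than 1/charge_high. The sketch below only restates these two rules. The function names are hypothetical, and how OpenMS computes the three input scores is not shown here.

```python
def seed_score(intensity_score, mass_trace_score, isotope_score):
    """Geometric mean of the three per-peak scores (each between 0 and 1)."""
    return (intensity_score * mass_trace_score * isotope_score) ** (1.0 / 3.0)


def is_seed(intensity_score, mass_trace_score, isotope_score, min_score=0.8):
    """A peak qualifies as a seed if its seed score reaches min_score."""
    return seed_score(intensity_score, mass_trace_score, isotope_score) >= min_score


def mz_tolerance_ok(mz_tolerance, charge_high):
    """Check the documented constraint mz_tolerance < 1/charge_high."""
    return mz_tolerance < 1.0 / charge_high


# One weak component pulls the geometric mean below the default cutoff.
strong = is_seed(0.9, 0.9, 0.9)
weak_isotope_fit = is_seed(0.9, 0.9, 0.5)

# Defaults from the parameter list: mz_tolerance 0.03, charge_high 4.
defaults_valid = mz_tolerance_ok(0.03, 4)
```

Because a geometric mean is used, a peak with a score of 0 in any of the three components can never become a seed, no matter how high the other two scores are.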

For the parameters of the algorithm section, see the algorithm's documentation:
centroided
In the following table you can find example values of the most important parameters for different instrument types.
These parameters are not valid for all instruments of that type, but can be used as a starting point for finding suitable parameters.

'centroided' algorithm:

Parameter                       Q-TOF   LTQ Orbitrap
intensity:bins                  10      10
mass_trace:mz_tolerance         0.02    0.004
isotopic_pattern:mz_tolerance   0.04    0.005
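The starting values in the table above can be kept as presets and turned into command line overrides. This sketch assumes the usual TOPP convention of addressing subsection parameters as `-algorithm:<section>:<name> <value>`; that spelling is not shown in the help output above, so verify it with `--helphelp` before relying on it. The names `PRESETS` and `algorithm_flags` are hypothetical.

```python
# Example values copied from the table above; starting points only.
PRESETS = {
    "Q-TOF": {
        "intensity:bins": 10,
        "mass_trace:mz_tolerance": 0.02,
        "isotopic_pattern:mz_tolerance": 0.04,
    },
    "LTQ Orbitrap": {
        "intensity:bins": 10,
        "mass_trace:mz_tolerance": 0.004,
        "isotopic_pattern:mz_tolerance": 0.005,
    },
}


def algorithm_flags(instrument):
    """Turn a preset into a flat list of command line arguments.

    Assumes subsection parameters are passed as '-algorithm:<path> <value>'.
    """
    flags = []
    for name, value in PRESETS[instrument].items():
        flags += [f"-algorithm:{name}", str(value)]
    return flags


orbitrap_flags = algorithm_flags("LTQ Orbitrap")
print(" ".join(orbitrap_flags))
```

The resulting list can be appended to the argument list of a FeatureFinderCentroided call. The same values can instead be written into the algorithm section of an INI file.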

The centroided algorithm requires centroided data. To create centroided data from profile data, use PeakPickerHiRes.
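The two-step workflow for profile data can be sketched as a pair of commands in which the output of the peak picker becomes the input of the feature finder. The `-in` and `-out` options of PeakPickerHiRes are an assumption based on the common TOPP interface and are not taken from this page. The file names and the helper `centroid_then_detect` are hypothetical.

```python
def centroid_then_detect(profile_mzml, centroided_mzml, feature_xml):
    """Return the two commands: peak picking first, then feature detection."""
    pick = ["PeakPickerHiRes", "-in", profile_mzml, "-out", centroided_mzml]
    detect = ["FeatureFinderCentroided", "-in", centroided_mzml, "-out", feature_xml]
    return [pick, detect]


steps = centroid_then_detect("profile.mzML", "centroided.mzML", "features.featureXML")
for step in steps:
    print(" ".join(step))
```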