OpenMS
SeedListGenerator

Application to generate seed lists for feature detection.

Potential predecessor tools: IDFilter, IDMapper, FeatureLinkerUnlabeled (or another feature grouping algorithm)
→ SeedListGenerator →
Potential successor tools: FeatureFinderCentroided

Reference:
Weisser et al.: An automated pipeline for high-throughput label-free quantitative proteomics (J. Proteome Res., 2013, PMID: 23391308).

In feature detection algorithms, an early step is generally to identify points of interest in the LC-MS map (so-called seeds) that may later be extended to features. If supported by the feature detection algorithm (currently only the "centroided" algorithm), user-supplied seed lists allow greater control over this process.

The SeedListGenerator can automatically create seed lists from a variety of sources. The lists are exported in featureXML format (suitable as input to FeatureFinder), but can be converted to or from text formats using the TextExporter (with the "-minimal" option, to convert to CSV) and FileConverter (to convert from CSV) tools.

Seed lists can be generated from the file types below. The seeds are created at the indicated positions (RT/MZ):

  • mzML: locations of MS2 precursors
  • idXML: locations of peptide identifications
  • featureXML: locations of unassigned peptide identifications
  • consensusXML: locations of consensus features that do not contain sub-features from the respective map

If the input is consensusXML, one output file per constituent map is required (in the same order as the maps in the consensusXML). For all other input types, exactly one output file is required.

What are possible use cases for custom seed lists?

  • In analyses that can take into account only features with peptide annotations, it may be useful to focus directly on certain locations in the LC-MS map: either all MS2 precursors (mzML input), or only those precursors whose fragment spectra could be matched to a peptide sequence (idXML input).
  • When additional information becomes available during an analysis, one might want to perform a second, targeted round of feature detection on the experimental data. For example, once a feature map is annotated with peptide identifications, it is possible to go back to the LC-MS map and look for features near unassigned peptides, potentially with a lower score threshold (featureXML input).
  • Similarly, when features from different experiments are aligned and grouped, the consensus map may reveal where features were missed in the initial detection round in some experiments. The locations of these "holes" in the consensus map can be compiled into seed lists for the individual experiments (consensusXML input). (Note that the resulting seed lists use the retention time scale of the consensus map, which might be different from the original time scales of the experiments if e.g. one of the MapAligner tools was used to perform retention time correction as part of the alignment process. In this case, the RT transformations from the alignment must be applied to the LC-MS maps prior to the seed list-based feature detection runs.)
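
The "holes" idea from the last use case can be sketched in a few lines of Python. This is a simplified illustration, not the tool's implementation: the `ConsensusFeature` tuple and the `seeds_from_holes` function are hypothetical names, and a consensus feature is reduced to its RT, its m/z and the set of map indices that contributed a sub-feature.

```python
from typing import NamedTuple


class ConsensusFeature(NamedTuple):
    """Hypothetical, simplified stand-in for a consensus feature."""
    rt: float        # retention time on the consensus map's time scale
    mz: float        # mass-to-charge ratio
    maps: frozenset  # indices of the maps that contributed a sub-feature


def seeds_from_holes(features, n_maps):
    """Return one seed list per map, holding the (RT, m/z) positions of all
    consensus features that contain no sub-feature from that map."""
    seeds = [[] for _ in range(n_maps)]
    for feature in features:
        for map_index in range(n_maps):
            if map_index not in feature.maps:
                seeds[map_index].append((feature.rt, feature.mz))
    return seeds


# Made-up example data for three maps.
consensus = [
    ConsensusFeature(1200.5, 445.12, frozenset({0, 1, 2})),  # complete, no hole
    ConsensusFeature(1530.0, 612.84, frozenset({0, 2})),     # missing in map 1
    ConsensusFeature(1744.2, 733.40, frozenset({1})),        # missing in maps 0 and 2
]
seed_lists = seeds_from_holes(consensus, n_maps=3)
for map_index, seeds in enumerate(seed_lists):
    print(map_index, seeds)
```

The result contains one list per constituent map, in the same order as the maps, which mirrors the one-output-file-per-map rule for consensusXML input. The seed positions stay on the consensus map's retention time scale, as described in the note above.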

Note: Currently, mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

SeedListGenerator -- Generates seed lists for feature detection.
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_SeedListGenerator.html
Version: 3.2.0 Nov 26 2024, 13:16:38, Revision: 962e60f
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  SeedListGenerator <options>

Options (mandatory options marked with '*'):
  -in <file>*            Input file (see below for details) (valid formats: 'mzML', 'idXML', 'featureXML', 
                         'consensusXML')
  -out_prefix <prefix>*  Output file prefix (valid formats: 'featureXML')
                         
  -use_peptide_mass      [idXML input only] Use the monoisotopic mass of the best peptide hit for the m/z 
                         position (default: use precursor m/z)
                         
Common TOPP options:
  -ini <file>            Use the given TOPP INI file
  -threads <n>           Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>      Writes the default configuration file
  --help                 Shows options
  --helphelp             Shows all options (including advanced)
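
As a usage illustration, the options listed above can be assembled into a command line programmatically. The following Python sketch only builds and prints the argument list; it does not run the tool. The helper name `build_seedlist_command` and the file names are hypothetical. The flags `-in`, `-out_prefix` and `-use_peptide_mass` are taken from the help text above.

```python
import shlex
from pathlib import Path

# Input formats listed in the help text, compared case-insensitively.
VALID_INPUT_SUFFIXES = {".mzml", ".idxml", ".featurexml", ".consensusxml"}


def build_seedlist_command(in_file, out_prefix, use_peptide_mass=False):
    """Build the argument list for a SeedListGenerator call (hypothetical helper)."""
    suffix = Path(in_file).suffix.lower()
    if suffix not in VALID_INPUT_SUFFIXES:
        raise ValueError(f"unsupported input format: {in_file}")
    # Assumption of this sketch: reject the flag for non-idXML input.
    # The tool itself may handle this case differently.
    if use_peptide_mass and suffix != ".idxml":
        raise ValueError("-use_peptide_mass applies to idXML input only")
    cmd = ["SeedListGenerator", "-in", str(in_file), "-out_prefix", str(out_prefix)]
    if use_peptide_mass:
        cmd.append("-use_peptide_mass")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_seedlist_command("peptides.idXML", "seeds_", use_peptide_mass=True)
print(shlex.join(cmd))
```

To actually run the tool, the list could be passed to `subprocess.run(cmd, check=True)`, assuming the OpenMS binaries are on the `PATH`.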

INI file documentation of this tool:

Parameters are listed as "name = default value: description". The parameters in and out_prefix are required (marked with '*' in the options above).

+ SeedListGenerator: Generates seed lists for feature detection.
    version = 3.2.0: Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'SeedListGenerator'
      in: Input file (see below for details). Tag: input file. Supported formats: *.mzML, *.idXML, *.featureXML, *.consensusXML
      out_prefix: Output file prefix. Tag: output prefix
      use_peptide_mass = false: [idXML input only] Use the monoisotopic mass of the best peptide hit for the m/z position (default: use precursor m/z). Allowed values: true, false
      log: Name of log file (created only when specified)
      debug = 0: Sets the debug level
      threads = 1: Sets the number of threads allowed to be used by the TOPP tool
      no_progress = false: Disables progress logging to command line. Allowed values: true, false
      force = false: Overrides tool-specific checks. Allowed values: true, false
      test = false: Enables the test mode (needed for internal use only). Allowed values: true, false
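
The use_peptide_mass option places the seed at an m/z value derived from the monoisotopic mass of the best peptide hit instead of the measured precursor m/z. Converting a neutral mass to m/z requires a charge state. The Python sketch below assumes the usual convention for protonated ions, m/z = (M + z * m_proton) / z. How the tool performs this step internally is not described on this page, so the function name `peptide_mz` and the example values are illustrative assumptions.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da (rounded)


def peptide_mz(monoisotopic_mass, charge):
    """m/z of a peptide ion carrying `charge` protons (hypothetical helper)."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass + charge * PROTON_MASS) / charge


# Made-up example: a peptide of 1000.0 Da observed with charge 2+.
print(round(peptide_mz(1000.0, 2), 4))
```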