OpenMS
InclusionExclusionListCreator

A tool for creating inclusion and/or exclusion lists for LC-MS/MS.

potential predecessor tools → InclusionExclusionListCreator → potential successor tools
  predecessors: MascotAdapter (or other ID engines), FeatureFinderCentroided
  successors:   -

Currently, this tool creates tab-delimited inclusion or exclusion lists (m/z, RT start, RT stop). The input can be peptide identifications from previous runs, a feature map, or a FASTA file with proteins. Inclusion and exclusion charges can be specified for FASTA and idXML input. For peptide identification input, if no charges are specified, only the charge state of each peptide identification is included or excluded; otherwise, all given charge states are added to the list.
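As a rough illustration of the list layout described above, the following Python sketch reads such a tab-delimited file into records. It assumes exactly the three columns named in the text (m/z, RT start, RT stop) in that order; the `ListEntry` type, the `read_list` helper and the example values are hypothetical and not part of OpenMS, and real output files may differ in detail.

```python
import csv
import io
from typing import List, NamedTuple


class ListEntry(NamedTuple):
    """One row of an inclusion/exclusion list (hypothetical helper type)."""
    mz: float
    rt_start: float
    rt_stop: float


def read_list(handle) -> List[ListEntry]:
    """Read (m/z, RT start, RT stop) rows from a tab-delimited stream."""
    entries = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        try:
            entries.append(ListEntry(float(row[0]), float(row[1]), float(row[2])))
        except ValueError:
            continue  # skip a header or other non-numeric row, if present
    return entries


# Made-up example values, not real tool output.
example = "500.25\t10.0\t12.0\n742.4\t30.5\t33.5\n"
entries = read_list(io.StringIO(example))
for entry in entries:
    print(entry.mz, entry.rt_start, entry.rt_stop)
```

To read a real file, pass an open file handle (opened with `newline=""`) instead of the in-memory example.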

InclusionExclusionListCreator offers three strategies for inclusion list creation: 'FeatureBased_LP', 'ProteinBased_LP' and 'ALL'. In ALL mode, all features are put onto the list. FeatureBased_LP, which was designed for MALDI data, maximizes the number of features in the inclusion list under two constraints: the number of precursors in each RT fraction must not exceed a maximum, and each feature is scheduled at most a fixed number of times. In this mode, the sum of normalized feature intensities is maximized, so that for each feature, high-intensity RTs are favoured over lower-intensity ones. ProteinBased_LP uses RT and detectability prediction methods to predict the features that are most likely to be identified by MS/MS. Both LP methods are described in more detail in Zerck et al.: Optimal precursor ion selection for LC-MALDI MS/MS (BMC Bioinformatics 2013).
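One way to read the FeatureBased_LP description above is as a small integer linear program. The formulation below is only a sketch reconstructed from the prose, not the exact model from Zerck et al. or the OpenMS implementation. All symbols are assumptions introduced here: x_{j,s} is 1 if feature j is scheduled in RT fraction s, \tilde{I}_{j,s} is the normalized intensity of feature j in fraction s, c is the maximal number of precursors per RT fraction, and m is the maximal number of times a feature may be scheduled.

```latex
\begin{align}
  \max_{x} \quad & \sum_{j} \sum_{s} \tilde{I}_{j,s} \, x_{j,s} \\
  \text{s.t.} \quad & \sum_{j} x_{j,s} \le c && \text{for every RT fraction } s \\
                    & \sum_{s} x_{j,s} \le m && \text{for every feature } j \\
                    & x_{j,s} \in \{0, 1\}
\end{align}
```

Under this reading, c and m would plausibly correspond to the INI parameters ms2_spectra_per_rt_bin and max_number_precursors_per_feature, but that mapping is an assumption.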

The RT window size can be specified in the RT section of the INI file, either as a relative window [rt - window_relative*rt, rt + window_relative*rt] or as an absolute window [rt - window_absolute, rt + window_absolute].
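The two window types can be sketched in a few lines of Python. The function `rt_window` is a hypothetical helper, not an OpenMS API; its default values are taken from the INI documentation of this tool (window_relative 0.05, window_absolute 90.0), and the sketch assumes that rt and the absolute window are given in the same unit.

```python
from typing import Tuple


def rt_window(rt: float,
              use_relative: bool = True,
              window_relative: float = 0.05,
              window_absolute: float = 90.0) -> Tuple[float, float]:
    """Return (RT start, RT stop) around a precursor RT.

    Relative mode: the half-width scales with rt (rt * window_relative).
    Absolute mode: the half-width is the fixed value window_absolute.
    """
    if use_relative:
        half_width = rt * window_relative
    else:
        half_width = window_absolute
    return (rt - half_width, rt + half_width)


relative = rt_window(1000.0)                      # half-width depends on rt
absolute = rt_window(1000.0, use_relative=False)  # half-width is fixed
print(relative, absolute)
```

A relative window grows with retention time, so late-eluting precursors get wider windows than early ones; an absolute window is the same size everywhere.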

RT values are written in minutes by default; seconds can be selected instead (see the 'unit' parameter in the RT section of the INI file).

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

InclusionExclusionListCreator -- Creates inclusion and/or exclusion lists.
Full documentation: http://www.openms.de/doxygen/release/3.0.0/html/TOPP_InclusionExclusionListCreator.html
Version: 3.0.0 Jul 14 2023, 11:57:33, Revision: be787e9
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  InclusionExclusionListCreator <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option

Options (mandatory options marked with '*'):
  -include <file>              Inclusion list input file in FASTA or featureXML format. (valid formats:
                               'featureXML', 'fasta')
  -exclude <file>              Exclusion list input file in featureXML, idXML or FASTA format. (valid
                               formats: 'featureXML', 'idXML', 'fasta')
  -out <file>*                 Output file (tab delimited csv file). (valid formats: 'csv')
  -rt_model <file>             RTModel file used for the rt prediction of peptides in FASTA files. (valid
                               formats: 'txt')
  -pt_model <file>             PTModel file used for the pt prediction of peptides in FASTA files (only
                               needed for inclusion_strategy ProteinBased_LP). (valid formats: 'txt')
  -inclusion_charges <charge>  List containing the charge states to be considered for the inclusion list
                               compounds, space separated. (min: '1')
  -inclusion_strategy <name>   Strategy to be used for selection (default: 'ALL') (valid:
                               'FeatureBased_LP', 'ProteinBased_LP', 'ALL')
  -exclusion_charges <charge>  List containing the charge states to be considered for the exclusion list
                               compounds (for idXML and FASTA input), space separated. (min: '1')
  -raw_data <mzMLFile>         File containing the raw data (only needed for FeatureBased_LP). (valid
                               formats: 'mzML')
                               
Common TOPP options:
  -ini <file>                  Use the given TOPP INI file
  -threads <n>                 Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>            Writes the default configuration file
  --help                       Shows options
  --helphelp                   Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Inclusion/Exclusion algorithm section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/release/3.0.0/html/TOPP_InclusionExclusionListCreator.html
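
The options above can be assembled into a command line from Python. The sketch below only builds the argument list and does not run the tool; `build_command` and all file names are hypothetical, and only flags shown in the help text above are used. List-valued options are passed as separate space-separated arguments, following the "space separated" wording of the help text.

```python
import shlex
from typing import List, Optional, Sequence

VALID_STRATEGIES = ("FeatureBased_LP", "ProteinBased_LP", "ALL")


def build_command(out: str,
                  include: Optional[str] = None,
                  exclude: Optional[str] = None,
                  inclusion_strategy: Optional[str] = None,
                  inclusion_charges: Sequence[int] = (),
                  exclusion_charges: Sequence[int] = (),
                  ini: Optional[str] = None) -> List[str]:
    """Build an argument list for InclusionExclusionListCreator."""
    if inclusion_strategy is not None and inclusion_strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown inclusion strategy: {inclusion_strategy}")
    if any(charge < 1 for charge in (*inclusion_charges, *exclusion_charges)):
        raise ValueError("charge states must be >= 1")

    cmd = ["InclusionExclusionListCreator", "-out", out]  # -out is mandatory
    if include is not None:
        cmd += ["-include", include]
    if exclude is not None:
        cmd += ["-exclude", exclude]
    if inclusion_strategy is not None:
        cmd += ["-inclusion_strategy", inclusion_strategy]
    if inclusion_charges:
        cmd += ["-inclusion_charges", *map(str, inclusion_charges)]
    if exclusion_charges:
        cmd += ["-exclusion_charges", *map(str, exclusion_charges)]
    if ini is not None:
        cmd += ["-ini", ini]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command(out="inclusion.csv",
                    include="features.featureXML",
                    inclusion_strategy="ALL")
print(shlex.join(cmd))
```

With OpenMS installed and the tool on the PATH, the list could be handed to `subprocess.run(cmd, check=True)`.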

INI file documentation of this tool:

+ InclusionExclusionListCreator: Creates inclusion and/or exclusion lists.
  version: Version of the tool that generated this parameters file. (default: '3.0.0')
++ 1: Instance '1' section for 'InclusionExclusionListCreator'
  include: Inclusion list input file in FASTA or featureXML format. (input file; valid formats: 'featureXML', 'fasta')
  exclude: Exclusion list input file in featureXML, idXML or FASTA format. (input file; valid formats: 'featureXML', 'idXML', 'fasta')
  out: Output file (tab delimited csv file). (output file; valid formats: 'csv')
  rt_model: RTModel file used for the rt prediction of peptides in FASTA files. (input file; valid formats: 'txt')
  pt_model: PTModel file used for the pt prediction of peptides in FASTA files (only needed for inclusion_strategy ProteinBased_LP). (input file; valid formats: 'txt')
  inclusion_charges: List containing the charge states to be considered for the inclusion list compounds, space separated. (default: '[]') (min: '1')
  inclusion_strategy: Strategy to be used for selection. (default: 'ALL') (valid: 'FeatureBased_LP', 'ProteinBased_LP', 'ALL')
  exclusion_charges: List containing the charge states to be considered for the exclusion list compounds (for idXML and FASTA input), space separated. (default: '[]') (min: '1')
  raw_data: File containing the raw data (only needed for FeatureBased_LP). (input file; valid formats: 'mzML')
  log: Name of log file (created only when specified).
  debug: Sets the debug level. (default: '0')
  threads: Sets the number of threads allowed to be used by the TOPP tool. (default: '1')
  no_progress: Disables progress logging to command line. (default: 'false') (valid: 'true', 'false')
  force: Overrides tool-specific checks. (default: 'false') (valid: 'true', 'false')
  test: Enables the test mode (needed for internal use only). (default: 'false') (valid: 'true', 'false')
+++ algorithm: Inclusion/Exclusion algorithm section
++++ InclusionExclusionList
  missed_cleavages: Number of missed cleavages used for protein digestion. (default: '0')
+++++ RT
  unit: Create lists with units as seconds instead of minutes. (default: 'minutes') (valid: 'minutes', 'seconds')
  use_relative: Use relative RT window, which depends on RT of precursor. (default: 'true') (valid: 'true', 'false')
  window_relative: [for RT:use_relative == true] The relative factor X for the RT exclusion window, e.g. the window is calculated as [rt - rt*X, rt + rt*X]. (default: '0.05') (min: '0.0' max: '10.0')
  window_absolute: [for RT:use_relative == false] The absolute value X for the RT exclusion window in [sec], e.g. the window is calculated as [rt - X, rt + X]. (default: '90.0') (min: '0.0')
+++++ merge
  mz_tol: Two inclusion/exclusion windows are merged when they (almost) overlap in RT (see 'rt_tol') and are close in m/z by this tolerance. Unit of this is defined in 'mz_tol_unit'. (default: '10.0') (min: '0.0')
  mz_tol_unit: Unit of 'mz_tol'. (default: 'ppm') (valid: 'ppm', 'Da')
  rt_tol: Maximal RT delta (in seconds) which would allow two windows in RT to overlap (which causes merging the windows). Two inclusion/exclusion windows are merged when they (almost) overlap in RT and are close in m/z by this tolerance (see 'mz_tol'). Unit of this param is [seconds]. (default: '1.1') (min: '0.0')
++++ PrecursorSelection
  ms2_spectra_per_rt_bin: Number of allowed MS/MS spectra in a retention time bin. (default: '5') (min: '1')
  exclude_overlapping_peaks: If true, overlapping or nearby peaks (within 'min_mz_peak_distance') are excluded for selection. (default: 'false') (valid: 'true', 'false')
+++++ Exclusion
  use_dynamic_exclusion: If true, dynamic exclusion is applied. (default: 'false') (valid: 'true', 'false')
  exclusion_time: The time (in seconds) a feature is excluded. (default: '100.0') (min: '0.0')
+++++ ProteinBasedInclusion
  max_list_size: The maximal number of precursors in the inclusion list. (default: '1000') (min: '1')
++++++ rt
  min_rt: Minimal rt in seconds. (default: '960.0') (min: '0.0')
  max_rt: Maximal rt in seconds. (default: '3840.0') (min: '0.0')
  rt_step_size: rt step size in seconds. (default: '30.0') (min: '1.0')
  rt_window_size: rt window size in seconds. (default: '100') (min: '1')
++++++ thresholds
  min_protein_id_probability: Minimal protein probability for a protein to be considered identified. (default: '0.95') (min: '0.0' max: '1.0')
  min_pt_weight: Minimal pt weight of a precursor. (default: '0.5') (min: '0.0' max: '1.0')
  min_mz: Minimal mz to be considered in protein based LP formulation. (default: '500.0') (min: '0.0')
  max_mz: Maximal mz to be considered in protein based LP formulation. (default: '5000.0') (min: '0.0')
  use_peptide_rule: Use peptide rule instead of minimal protein id probability. (default: 'false') (valid: 'true', 'false')
  min_peptide_ids: If use_peptide_rule is true, this parameter sets the minimal number of peptide ids for a protein id. (default: '2') (min: '1')
  min_peptide_probability: If use_peptide_rule is true, this parameter sets the minimal probability for a peptide to be safely identified. (default: '0.95') (min: '0.0' max: '1.0')
+++++ feature_based
  no_intensity_normalization: Flag indicating if intensities shall be scaled to be in [0,1]. This is done for each feature separately, so that the feature's maximal intensity in a spectrum is set to 1. (default: 'false') (valid: 'true', 'false')
  max_number_precursors_per_feature: The maximal number of precursors per feature. (default: '1') (min: '1')
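
The merge parameters listed above (mz_tol, mz_tol_unit, rt_tol) describe when two windows are combined. The following Python sketch shows one possible reading of that rule; it is not taken from the OpenMS source. The helper names are hypothetical, and two details are assumptions: the ppm tolerance is computed relative to the larger of the two m/z values, and "almost overlap" means the RT gap between the windows is at most rt_tol.

```python
def mz_close(mz_a: float, mz_b: float,
             mz_tol: float = 10.0, mz_tol_unit: str = "ppm") -> bool:
    """True if two m/z values lie within mz_tol (in ppm or Da)."""
    diff = abs(mz_a - mz_b)
    if mz_tol_unit == "ppm":
        # Assumption: ppm is measured relative to the larger m/z value.
        return diff <= mz_tol * 1e-6 * max(mz_a, mz_b)
    if mz_tol_unit == "Da":
        return diff <= mz_tol
    raise ValueError(f"unknown mz_tol_unit: {mz_tol_unit}")


def rt_almost_overlap(a_start: float, a_stop: float,
                      b_start: float, b_stop: float,
                      rt_tol: float = 1.1) -> bool:
    """True if two RT windows overlap or are at most rt_tol apart."""
    # The gap is negative or zero when the windows already overlap.
    gap = max(a_start, b_start) - min(a_stop, b_stop)
    return gap <= rt_tol


def should_merge(mz_a: float, window_a: tuple,
                 mz_b: float, window_b: tuple) -> bool:
    """Merge only if both the m/z and the RT criterion hold."""
    return mz_close(mz_a, mz_b) and rt_almost_overlap(*window_a, *window_b)


# Made-up example values: about 8 ppm apart, RT gap of 0.5 s.
print(should_merge(500.000, (100.0, 110.0), 500.004, (110.5, 120.0)))
```

Both criteria must hold at once: windows that are close in RT but differ in m/z by more than the tolerance stay separate, and vice versa.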