OpenSwathRTNormalizer

OpenSwathRTNormalizer finds retention time (RT) peptides in the data.

The tool takes a description of RT peptides and their normalized retention times and writes out a transformation file that describes how to transform the RT space into the normalized space.
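
As a rough illustration of what such a transformation does, the sketch below fits a straight line that maps measured RTs to normalized RTs by ordinary least squares. It assumes a linear model. The peptide values and the function name fit_linear_rt_map are made up for the example and are not part of OpenMS.

```python
def fit_linear_rt_map(measured, normalized):
    """Least-squares fit of: normalized = slope * measured + intercept."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(normalized) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, normalized))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical RT peptides: measured RT in seconds vs. normalized RT.
measured = [600.0, 1200.0, 1800.0, 2400.0]
normalized = [0.0, 25.0, 50.0, 75.0]

slope, intercept = fit_linear_rt_map(measured, normalized)

# Map a newly measured RT (1500 s) into the normalized space.
mapped = slope * 1500.0 + intercept
print(round(mapped, 3))
```

In this toy data set the four points lie exactly on a line, so a measured RT of 1500 s lands halfway between the second and third peptide in normalized space.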

The command line parameters of this tool are:

OpenSwathRTNormalizer -- This tool will take a description of RT peptides and their normalized retention time
to write out a transformation file on how to transform the RT space into the normalized space.
Version: 2.3.0 Jan  9 2018, 17:46:23, Revision: 38ae115

Usage:
  OpenSwathRTNormalizer <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in <files>*            Input files separated by blank (valid formats: 'mzML')
  -tr <file>*             Transition file with the RT peptides ('TraML' or 'csv') (valid formats: 'csv',
                          'traML')
  -out <file>*            Output file (valid formats: 'trafoXML')
  -rt_norm <file>         RT normalization file (how to map the RTs of this run to the ones stored in the 
                          library) (valid formats: 'trafoXML')
  -min_rsq <double>       Minimum r-squared of RT peptides regression (default: '0.95')
  -min_coverage <double>  Minimum relative amount of RT peptides to keep (default: '0.6')
  -estimateBestPeptides   Whether the algorithms should try to choose the best peptides based on their peak 
                          shape for normalization. Use this option you do not expect all your peptides to be
                          detected in a sample and too many 'bad' peptides enter the outlier removal step
                          (e.g. due to them being endogenous peptides or using a less curated list of
                          peptides).
                          
Common TOPP options:
  -ini <file>             Use the given TOPP INI file
  -threads <n>            Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>       Writes the default configuration file
  --help                  Shows options
  --helphelp              Shows all options (including advanced)

The following configuration subsections are valid:
 - RTNormalization     Parameters for the RTNormalization. RT normalization and outlier detection can be done
                       iteratively (by default) which removes one outlier per iteration or using the RANSAC
                       algorithm.
 - algorithm           Algorithm parameters section
 - peptideEstimation   Parameters for the peptide estimation (use -estimateBestPeptides to enable).

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
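
The options listed above can be combined into a call from a script. The following sketch only assembles the argument list from the documented options and does not run the tool. The helper name build_command and all file names are hypothetical.

```python
def build_command(in_files, tr_file, out_file, min_rsq=None, estimate_best_peptides=False):
    """Assemble an OpenSwathRTNormalizer argument list from documented options."""
    # -in, -tr and -out are the three mandatory options.
    cmd = ["OpenSwathRTNormalizer", "-in", *in_files, "-tr", tr_file, "-out", out_file]
    if min_rsq is not None:
        cmd += ["-min_rsq", str(min_rsq)]
    if estimate_best_peptides:
        # Listed in the help without an argument, so it is passed as a bare flag.
        cmd.append("-estimateBestPeptides")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command(["run1.mzML"], "rt_peptides.TraML", "run1.trafoXML", min_rsq=0.9)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run on a machine where the tool is installed.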

The algorithm parameters for OpenSwathRTNormalizer are:

Legend:

required parameter

advanced parameter

+ OpenSwathRTNormalizer: This tool will take a description of RT peptides and their normalized retention time to write out a transformation file on how to transform the RT space into the normalized space.

version (default: '2.3.0'): Version of the tool that generated this parameters file.

++ 1: Instance '1' section for 'OpenSwathRTNormalizer'

in (default: '[]'): Input files separated by blank (input file; valid formats: '*.mzML')

tr: Transition file with the RT peptides ('TraML' or 'csv') (input file; valid formats: '*.csv', '*.traML')

out: Output file (output file; valid formats: '*.trafoXML')

rt_norm: RT normalization file (how to map the RTs of this run to the ones stored in the library) (input file; valid formats: '*.trafoXML')

min_rsq (default: '0.95'): Minimum r-squared of RT peptides regression

min_coverage (default: '0.6'): Minimum relative amount of RT peptides to keep

estimateBestPeptides (default: 'false'): Whether the algorithms should try to choose the best peptides based on their peak shape for normalization. Use this option if you do not expect all your peptides to be detected in a sample and too many 'bad' peptides enter the outlier removal step (e.g. due to them being endogenous peptides or using a less curated list of peptides). (valid: 'true', 'false')

log: Name of log file (created only when specified)

debug (default: '0'): Sets the debug level

threads (default: '1'): Sets the number of threads allowed to be used by the TOPP tool

no_progress (default: 'false'): Disables progress logging to command line (valid: 'true', 'false')

force (default: 'false'): Overwrite tool specific checks. (valid: 'true', 'false')

test (default: 'false'): Enables the test mode (needed for internal use only) (valid: 'true', 'false')

+++ RTNormalization: Parameters for the RTNormalization. RT normalization and outlier detection can be done iteratively (by default), which removes one outlier per iteration, or using the RANSAC algorithm.

outlierMethod (default: 'iter_residual'): Which outlier detection method to use (valid: 'iter_residual', 'iter_jackknife', 'ransac', 'none'). Iterative methods remove one outlier at a time. The jackknife approach optimizes for maximum r-squared improvement, while 'iter_residual' removes the datapoint with the largest residual error (removal by residual is computationally cheaper, use this with lots of peptides). (valid: 'iter_residual', 'iter_jackknife', 'ransac', 'none')

useIterativeChauvenet (default: 'false'): Whether to use Chauvenet's criterion when using iterative methods. This should be used if the algorithm removes too many datapoints, but it may lead to true outliers being retained. (valid: 'true', 'false')

RANSACMaxIterations (default: '1000'): Maximum iterations for the RANSAC outlier detection algorithm.

RANSACMaxPercentRTThreshold (default: '3'): Maximum threshold in RT dimension for the RANSAC outlier detection algorithm (in percent of the total gradient). Default is set to 3%, which is around +/- 4 minutes on a 120 minute gradient.

RANSACSamplingSize (default: '10'): Sampling size of data points per iteration for the RANSAC outlier detection algorithm.
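
The iterative residual approach described for outlierMethod can be sketched as follows: fit a line, and while r-squared is below min_rsq, drop the point with the largest residual, but never go below the fraction of peptides given by min_coverage. This is a minimal sketch and not the OpenMS implementation. The function names, the data, and the behaviour when the thresholds cannot be met are assumptions.

```python
import math


def linear_fit(points):
    """Return (slope, intercept, r_squared) of a least-squares line through points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - my) ** 2 for _, y in points)
    return slope, intercept, 1.0 - ss_res / ss_tot


def remove_outliers_iter_residual(points, min_rsq=0.95, min_coverage=0.6):
    """Drop the worst point per iteration until min_rsq or the coverage limit is reached."""
    kept = list(points)
    # Assumption: min_coverage is applied to the number of input peptides.
    min_points = max(3, math.ceil(min_coverage * len(points)))
    while True:
        slope, intercept, rsq = linear_fit(kept)
        if rsq >= min_rsq or len(kept) <= min_points:
            return kept, rsq
        worst = max(kept, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))
        kept.remove(worst)


# Hypothetical (measured RT, normalized RT) pairs with one obvious outlier.
points = [(100.0, 10.0), (200.0, 20.0), (300.0, 30.0),
          (400.0, 40.0), (500.0, 50.0), (600.0, 95.0)]
kept, rsq = remove_outliers_iter_residual(points)
print(len(kept), round(rsq, 3))
```

With this toy data the first fit has an r-squared below 0.95, the pair (600.0, 95.0) has the largest residual and is removed, and the remaining five points fit a line almost perfectly.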

+++ algorithm: Algorithm parameters section

stop_report_after_feature (default: '-1'): Stop reporting after feature (ordered by quality; -1 means do not stop).

rt_extraction_window (default: '-1'): Only extract RT around this value (-1 means extract over the whole range, a value of 500 means to extract around +/- 500 s of the expected elution). For this to work, the TraML input file needs to contain normalized RT values.

rt_normalization_factor (default: '1'): The normalized RT is expected to be between 0 and 1. If your normalized RT has a different range, pass this here (e.g. if it goes from 0 to 100, set this value to 100).

quantification_cutoff (default: '0'): Cutoff in m/z below which peaks should not be used for quantification any more (valid range: 0:∞)

write_convex_hull (default: 'false'): Whether to write out all points of all features into the featureXML (valid: 'true', 'false')

add_up_spectra (default: '1'): Add up spectra around the peak apex (needs to be an odd integer) (valid range: 1:∞)

spacing_for_spectra_resampling (default: '0.005'): If spectra are to be added, use this spacing to add them up (valid range: 0:∞)

uis_threshold_sn (default: '-1'): S/N threshold to consider identification transition (set to -1 to consider all)

uis_threshold_peak_area (default: '0'): Peak area threshold to consider identification transition (set to -1 to consider all)
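
The two RT-related parameters above involve simple arithmetic, sketched below. Both helpers are hypothetical and follow one plain reading of the parameter descriptions: the normalization factor is treated as a divisor that brings library RTs into the 0 to 1 range, and the extraction window is taken as +/- the given value around the expected elution. How the tool applies these values internally is not shown here.

```python
def scale_normalized_rt(rt, rt_normalization_factor=1.0):
    """Bring a library RT into the 0..1 range (assumed: plain division)."""
    return rt / rt_normalization_factor


def extraction_range(expected_rt, rt_extraction_window=-1.0):
    """Return (start, end) in seconds, or None to extract over the whole range."""
    if rt_extraction_window < 0:
        return None  # -1 means: no restriction
    return (expected_rt - rt_extraction_window, expected_rt + rt_extraction_window)


# A library with normalized RTs from 0 to 100 uses a factor of 100.
print(scale_normalized_rt(42.0, 100.0))
# A window of 500 around an expected elution at 1200 s.
print(extraction_range(1200.0, 500.0))
```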

++++ TransitionGroupPicker

stop_after_feature (default: '-1'): Stop finding after feature (ordered by intensity; -1 means do not stop).

stop_after_intensity_ratio (default: '0.0001'): Stop after reaching intensity ratio

min_peak_width (default: '-1'): Minimal peak width (s), discard all peaks below this value (-1 means no action).

background_subtraction (default: 'none'): Try to apply a background subtraction to the peak (experimental). The background is estimated at the peak boundaries; either the smoothed or the raw chromatogram data can be used for that. (valid: 'none', 'smoothed', 'original')

recalculate_peaks (default: 'false'): Tries to get better peak picking by looking at peak consistency of all picked peaks. Tries to use the consensus (median) peak border if the variation within the picked peaks is too large. (valid: 'true', 'false')

use_precursors (default: 'false'): Use precursor chromatogram for peak picking (valid: 'true', 'false')

recalculate_peaks_max_z (default: '1'): Determines the maximal Z-Score (difference measured in standard deviations) that is considered too large for peak boundaries. If the Z-Score is above this value, the median is used for peak boundaries (default value 1.0).

minimal_quality (default: '-10000'): Peaks below this quality threshold are not considered (only used if compute_peak_quality is set).

resample_boundary (default: '15'): For computing peak quality, how many extra seconds should be sampled to the left and right of the actual peak

compute_peak_quality (default: 'false'): Tries to compute a quality value for each peakgroup and detect outlier transitions. The resulting score is centered around zero; values above 0 are generally good and values below -1 or -2 are usually bad. (valid: 'true', 'false')
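
The interplay of recalculate_peaks and recalculate_peaks_max_z can be pictured with a small sketch: each picked border is compared with the others, and a border whose Z-score exceeds the limit is replaced by the median. The function name and data are hypothetical. Computing the Z-score against the mean with the population standard deviation is an assumption, not a statement about the OpenMS code.

```python
import statistics


def consensus_borders(borders, max_z=1.0):
    """Replace borders whose Z-score exceeds max_z by the median border."""
    median = statistics.median(borders)
    mean = statistics.mean(borders)
    sd = statistics.pstdev(borders)
    if sd == 0:
        return list(borders)  # all borders agree, nothing to correct
    return [median if abs(b - mean) / sd > max_z else b for b in borders]


# Hypothetical left peak borders (s) of five transitions; the last one disagrees.
left_borders = [100.0, 101.0, 99.0, 100.0, 120.0]
corrected = consensus_borders(left_borders, max_z=1.0)
print(corrected)
```

Here only the border at 120 s is about two standard deviations away from the mean, so it is the only one replaced by the median of 100 s.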

+++++ PeakPickerMRM

sgolay_frame_length (default: '15'): The number of subsequent data points used for smoothing. This number has to be odd. If it is not, 1 will be added.

sgolay_polynomial_order (default: '3'): Order of the polynomial that is fitted.

gauss_width (default: '50'): Gaussian width in seconds, estimated peak size.

use_gauss (default: 'true'): Use Gaussian filter for smoothing (alternative is Savitzky-Golay filter) (valid: 'false', 'true')

peak_width (default: '-1'): Force a certain minimal peak_width on the data (e.g. extend the peak at least by this amount on both sides) in seconds. -1 turns this feature off.

signal_to_noise (default: '1'): Signal-to-noise threshold at which a peak will not be extended any more. Note that setting this too high (e.g. 1.0) can lead to peaks whose flanks are not fully captured. (valid range: 0:∞)

sn_win_len (default: '1000'): Signal-to-noise window length.

sn_bin_count (default: '30'): Signal-to-noise bin count.

write_sn_log_messages (default: 'true'): Write out log messages of the signal-to-noise estimator in case of sparse windows or median in rightmost histogram bin (valid: 'true', 'false')

remove_overlapping_peaks (default: 'false'): Try to remove overlapping peaks during peak picking (valid: 'false', 'true')

method (default: 'corrected'): Which method to choose for chromatographic peak-picking (OpenSWATH legacy on raw data, corrected picking on smoothed chromatogram or Crawdad on smoothed chromatogram). (valid: 'legacy', 'corrected', 'crawdad')
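
The rule stated for sgolay_frame_length (an even value is increased by 1) is easy to mirror when preparing parameter values in a script. The helper below is a hypothetical illustration of that rule only. It does not perform any smoothing.

```python
def effective_frame_length(sgolay_frame_length):
    """Return the frame length that would be used: even values are increased by 1."""
    if sgolay_frame_length % 2 == 0:
        return sgolay_frame_length + 1
    return sgolay_frame_length


def points_per_side(sgolay_frame_length):
    """Number of neighbouring data points on each side of the centre point."""
    return (effective_frame_length(sgolay_frame_length) - 1) // 2


print(effective_frame_length(15), points_per_side(15))
print(effective_frame_length(14), points_per_side(14))
```

An odd frame length guarantees a symmetric window, with the same number of data points on each side of the point being smoothed.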

++++ DIAScoring

dia_extraction_window (default: '0.05'): DIA extraction window in Th. (valid range: 0:∞)

dia_centroided (default: 'false'): Use centroided DIA data. (valid: 'true', 'false')

dia_byseries_intensity_min (default: '300'): DIA b/y series minimum intensity to consider. (valid range: 0:∞)

dia_byseries_ppm_diff (default: '10'): DIA b/y series minimal difference in ppm to consider. (valid range: 0:∞)

dia_nr_isotopes (default: '4'): DIA nr of isotopes to consider. (valid range: 0:∞)

dia_nr_charges (default: '4'): DIA nr of charges to consider. (valid range: 0:∞)

peak_before_mono_max_ppm_diff (default: '20'): DIA maximal difference in ppm to count a peak at lower m/z when searching for evidence that a peak might not be monoisotopic. (valid range: 0:∞)
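
Several of the DIA parameters above are thresholds in ppm. For readers less familiar with the unit, the sketch below shows how a mass difference in ppm is commonly computed and compared with a maximum such as the default of 20 for peak_before_mono_max_ppm_diff. The function names and m/z values are hypothetical, and the sketch does not show how the tool applies its thresholds.

```python
def ppm_difference(observed_mz, theoretical_mz):
    """Signed difference between two m/z values in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz, theoretical_mz, max_ppm=20.0):
    """True if the absolute ppm difference does not exceed max_ppm."""
    return abs(ppm_difference(observed_mz, theoretical_mz)) <= max_ppm


# Hypothetical m/z values: 0.004 Th off at m/z 500 is about 8 ppm,
# while 0.02 Th off is about 40 ppm.
print(round(ppm_difference(500.004, 500.0), 2), within_ppm(500.004, 500.0))
print(round(ppm_difference(500.02, 500.0), 2), within_ppm(500.02, 500.0))
```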

++++ EMGScoring

interpolation_step (default: '0.2'): Sampling rate for the interpolation of the model function.

tolerance_stdev_bounding_box (default: '3'): Bounding box has range [minimum of data, maximum of data] enlarged by tolerance_stdev_bounding_box times the standard deviation of the data.

max_iteration (default: '500'): Maximum number of iterations used by the Levenberg-Marquardt algorithm.

+++++ statistics

mean (default: '1'): Centroid position of the model.

variance (default: '1'): Variance of the model.
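
The bounding box described for tolerance_stdev_bounding_box amounts to a short calculation, sketched here. The function name and data are hypothetical. The sketch assumes that the box is enlarged on both sides and that the population standard deviation is meant, neither of which is stated in the parameter description.

```python
import statistics


def bounding_box(values, tolerance_stdev_bounding_box=3.0):
    """Return [min, max] of the data, widened by a multiple of its standard deviation."""
    margin = tolerance_stdev_bounding_box * statistics.pstdev(values)
    return min(values) - margin, max(values) + margin


# Hypothetical RT positions (s) of the data points of one peak.
positions = [10.0, 12.0, 14.0]
lower, upper = bounding_box(positions)
print(round(lower, 3), round(upper, 3))
```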

++++ Scores

use_shape_score (default: 'true'): Use the shape score (this score measures the similarity in shape of the transitions using a cross-correlation) (valid: 'true', 'false')

use_coelution_score (default: 'true'): Use the coelution score (this score measures the similarity in coelution of the transitions using a cross-correlation) (valid: 'true', 'false')

use_rt_score (default: 'true'): Use the retention time score (this score measures the difference in retention time) (valid: 'true', 'false')

use_library_score (default: 'true'): Use the library score (valid: 'true', 'false')

use_elution_model_score (default: 'true'): Use the elution model (EMG) score (this score fits a Gaussian model to the peak and checks the fit) (valid: 'true', 'false')

use_intensity_score (default: 'true'): Use the intensity score (valid: 'true', 'false')

use_nr_peaks_score (default: 'true'): Use the number of peaks score (valid: 'true', 'false')

use_total_xic_score (default: 'true'): Use the total XIC score (valid: 'true', 'false')

use_sn_score (default: 'true'): Use the SN (signal to noise) score (valid: 'true', 'false')

use_dia_scores (default: 'true'): Use the DIA (SWATH) scores. If turned off, will not use fragment ion spectra for scoring. (valid: 'true', 'false')

use_ms1_correlation (default: 'false'): Use the correlation scores with the MS1 elution profiles (valid: 'true', 'false')

use_sonar_scores (default: 'false'): Use the scores for SONAR scans (scanning swath) (valid: 'true', 'false')

use_ms1_fullscan (default: 'false'): Use the full MS1 scan at the peak apex for scoring (ppm accuracy of precursor and isotopic pattern) (valid: 'true', 'false')

use_uis_scores (default: 'false'): Use UIS scores for peptidoform identification (valid: 'true', 'false')

+++ peptideEstimation: Parameters for the peptide estimation (use -estimateBestPeptides to enable).

InitialQualityCutoff (default: '0.5'): The initial overall quality cutoff for a peak to be scored (range ca. -2 to 2)

OverallQualityCutoff (default: '5.5'): The overall quality cutoff for a peak to go into the retention time estimation (range ca. 0 to 10)

NrRTBins (default: '10'): Number of RT bins to use to compute coverage. This option should be used to ensure that there is a complete coverage of the RT space (this should detect cases where only a part of the RT gradient is actually covered by normalization peptides).

MinPeptidesPerBin (default: '1'): Minimal number of peptides that are required for a bin to be counted as 'covered'

MinBinsFilled (default: '8'): Minimal number of bins required to be covered
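
The coverage check controlled by NrRTBins, MinPeptidesPerBin and MinBinsFilled can be sketched as a simple histogram test. The function names and RT values are hypothetical, and binning over a fixed range with equal-width bins is an assumption about how the RT space is divided.

```python
def rt_bins_covered(rts, rt_min, rt_max, nr_rt_bins=10, min_peptides_per_bin=1):
    """Count the equal-width RT bins that contain enough peptides."""
    counts = [0] * nr_rt_bins
    width = (rt_max - rt_min) / nr_rt_bins
    for rt in rts:
        idx = int((rt - rt_min) / width)
        # Clamp so that rt == rt_max falls into the last bin.
        idx = min(max(idx, 0), nr_rt_bins - 1)
        counts[idx] += 1
    return sum(1 for c in counts if c >= min_peptides_per_bin)


def coverage_ok(rts, rt_min, rt_max, nr_rt_bins=10, min_peptides_per_bin=1, min_bins_filled=8):
    """True if at least min_bins_filled bins count as covered."""
    covered = rt_bins_covered(rts, rt_min, rt_max, nr_rt_bins, min_peptides_per_bin)
    return covered >= min_bins_filled


# Hypothetical normalized RTs on a 0..100 scale.
spread_out = [5.0, 15.0, 25.0, 35.0, 45.0, 55.0, 65.0, 75.0, 85.0]
early_only = [5.0, 8.0, 12.0, 18.0, 22.0]
print(coverage_ok(spread_out, 0.0, 100.0), coverage_ok(early_only, 0.0, 100.0))
```

The second peptide set illustrates the case named in the NrRTBins description: all peptides elute early, so only a part of the gradient is covered and the check fails.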