Central class for simulation of mass spectrometry experiments.
This implementation is an extended and rewritten version of the concepts and ideas presented in:
Ole Schulz-Trieglaff, Nico Pfeifer, Clemens Gröpl, Oliver Kohlbacher, and Knut Reinert.
LC-MSsim - A simulation software for liquid chromatography mass spectrometry data.
BMC Bioinformatics 9:423, 2008.
Name | Type | Default | Restrictions | Description |
Digestion:enzyme |
string | Trypsin |
Arg-C/P, Asp-N/B, Asp-N_ambic, Formic_acid, Lys-C, Arg-C, Chymotrypsin/P, Chymotrypsin, CNBr, Lys-N, Lys-C/P, PepsinA, TrypChymo, Trypsin/P, V8-DE, V8-E, Alpha-lytic protease, glutamyl endopeptidase, proline endopeptidase, leukocyte elastase, 2-iodobenzoate, iodosobenzoate, staphylococcal protease/D, proline-endopeptidase/HKR, Glu-C+P, PepsinA + P, cyanogen-bromide, Clostripain/P, elastase-trypsin-chymotrypsin, no cleavage, unspecific cleavage, Asp-N, Trypsin | Enzyme to use for digestion (select 'no cleavage' to skip digestion) |
Digestion:model |
string | naive |
trained, naive | The cleavage model to use for digestion. 'Trained' is based on a log likelihood model (see DOI:10.1021/pr060507u). |
Digestion:min_peptide_length |
int | 3 |
min: 1 | Minimum peptide length after digestion (shorter ones will be discarded) |
Digestion:model_trained:threshold |
float | 0.5 |
min: -2 max: 4 | Model threshold for calling a cleavage. Higher values increase the number of cleavages. -2 will give no cleavages, +4 almost full cleavage. |
Digestion:model_naive:missed_cleavages |
int | 1 |
min: 0 | Maximum number of missed cleavages considered. All possible resulting peptides will be created. |
RT:rt_column |
string | HPLC |
none, HPLC, CE | Modelling of an RT or CE column |
RT:auto_scale |
string | true |
true, false | Scale predicted RTs/MTs to the given 'total_gradient_time'? If 'true', for CE this means that 'CE:lenght_d', 'CE:length_total', and 'CE:voltage' have no influence. |
RT:total_gradient_time |
float | 2500 |
min: 1e-05 | The duration [s] of the gradient. |
RT:sampling_rate |
float | 2 |
min: 0.01 max: 60 | Time interval [s] between consecutive scans |
RT:scan_window:min |
float | 500 |
min: 0 | Start of RT Scan Window [s] |
RT:scan_window:max |
float | 1500 |
min: 1 | End of RT Scan Window [s] |
RT:variation:feature_stddev |
int | 3 |
| Standard deviation of shift in retention time [s] from predicted model (applied to every single feature independently) |
RT:variation:affine_offset |
int | 0 |
| Global offset in retention time [s] from predicted model |
RT:variation:affine_scale |
int | 1 |
| Global scaling in retention time from predicted model |
RT:column_condition:distortion |
int | 0 |
min: 0 max: 10 | Distortion of the elution profiles. Good presets are 0 for a perfect elution profile, 1 for a slightly distorted elution profile, etc. For trapping instruments (e.g. Orbitrap), distortion should be > 4. |
RT:profile_shape:width:value |
float | 9 |
min: 0 | Width of the Exponential Gaussian Hybrid distribution shape of the elution profile. This does not correspond directly to the width in [s]. |
RT:profile_shape:width:variance |
float | 1.6 |
min: 0 | Random component of the width (set to 0 to disable randomness), i.e. scale parameter for the Lorentzian variation of the variance (Note: The scale parameter has to be >= 0). |
RT:profile_shape:skewness:value |
float | 0.1 |
| Asymmetric component of the EGH. Higher absolute(!) values lead to more skewness (negative values cause fronting, positive values cause tailing). Tau parameter of the EGH, i.e. time constant of the exponential decay of the Exponential Gaussian Hybrid distribution shape of the elution profile. |
RT:profile_shape:skewness:variance |
float | 0.3 |
min: 0 | Random component of skewness (set to 0 to disable randomness), i.e. scale parameter for the Lorentzian variation of the time constant (Note: The scale parameter has to be >= 0). |
RT:HPLC:model_file |
string | examples/simulation/RTPredict.model |
| SVM model for retention time prediction |
RT:CE:pH |
float | 3 |
min: 0 max: 14 | pH of buffer |
RT:CE:alpha |
float | 0.5 |
min: 0 max: 1 | Exponent Alpha used to calculate mobility |
RT:CE:mu_eo |
float | 0 |
min: 0 max: 5 | Electroosmotic flow |
RT:CE:lenght_d |
float | 70 |
min: 0 max: 1000 | Length of capillary [cm] from injection site to MS |
RT:CE:length_total |
float | 75 |
min: 0 max: 1000 | Total length of capillary [cm] |
RT:CE:voltage |
float | 1000 |
min: 0 | Voltage applied to capillary |
Detectability:dt_simulation_on |
string | false |
true, false | Modelling of detectability enabled? This can serve as a filter to remove peptides which ionize badly, thus reducing the peptide count |
Detectability:min_detect |
float | 0.5 |
| Minimum peptide detectability accepted. Peptides with a lower score will be removed |
Detectability:dt_model_file |
string | examples/simulation/DTPredict.model |
| SVM model for peptide detectability prediction |
Ionization:esi:ionized_residues |
string list | [Arg, Lys, His] |
Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Sec, Ser, Thr, Val, Trp, Tyr | List of residues (as three-letter code) that will be considered during ESI. The N-term is always assumed to carry a charge. This parameter will be ignored during MALDI ionization. |
Ionization:esi:charge_impurity |
string list | [H+:1] |
| List of charged ions that contribute to charge with weight of occurrence (their sum is scaled to 1 internally), e.g. ['H:1'] or ['H:0.7' 'Na:0.3'], ['H:4' 'Na:1'] (which internally translates to ['H:0.8' 'Na:0.2']) |
Ionization:esi:max_impurity_set_size |
int | 3 |
| Maximal number of combinations of charge impurities allowed (each generating one feature) per charge state. E.g., assuming charge=3 and this parameter is 2, we could choose to allow '3H+, 2H+Na+' features (given certain 'charge_impurity' constraints), but not '3H+, 2H+Na+, 3Na+' |
Ionization:esi:ionization_probability |
float | 0.8 |
| Probability for the binomial distribution of the ESI charge states |
Ionization:maldi:ionization_probabilities |
float list | [0.9, 0.1] |
| List of probabilities for the different charge states during MALDI ionization (the list must sum up to 1.0) |
Ionization:mz:lower_measurement_limit |
float | 200 |
min: 0 | Lower m/z detector limit. |
Ionization:mz:upper_measurement_limit |
float | 2500 |
min: 0 | Upper m/z detector limit. |
RawSignal:enabled |
string | true |
true, false | Enable RAW signal simulation? (select 'false' if you only need feature-maps) |
RawSignal:peak_shape |
string | Gaussian |
Gaussian, Lorentzian | Peak shape used around each isotope peak. Be aware that the area under the curve is constant for both types, but the maximal height will differ (~ 2:3 = Lorentz:Gaussian) due to the wider base of the Lorentzian. |
RawSignal:resolution:value |
int | 50000 |
| Instrument resolution at 400 Th. |
RawSignal:resolution:type |
string | linear |
constant, linear, sqrt | How does resolution change with increasing m/z? QTOFs usually show 'constant' behavior, FTs have linear degradation, and on Orbitraps the resolution decreases with the square root of mass. |
RawSignal:baseline:scaling |
float | 0 |
min: 0 | Scale of baseline. Set to 0 to disable simulation of baseline. |
RawSignal:baseline:shape |
float | 0.5 |
min: 0 | The baseline is modeled by an exponential probability density function (pdf) with f(x) = shape*e^(- shape*x) |
RawSignal:mz:sampling_points |
int | 3 |
min: 2 | Number of raw data points per FWHM of the peak. |
RawSignal:contaminants:file |
string | examples/simulation/contaminants.csv |
| Contaminants file with sum formula and absolute RT interval. See 'OpenMS/examples/simulation/contaminants.txt' for details. |
RawSignal:variation:mz:error_stddev |
float | 0 |
| Standard deviation for m/z errors. Set to 0 to disable simulation of m/z errors. |
RawSignal:variation:mz:error_mean |
float | 0 |
| Average systematic m/z error (Da) |
RawSignal:variation:intensity:scale |
float | 100 |
min: 0 | Constant scale factor of the feature intensity. Set to 1.0 to get the real intensity values provided in the FASTA file. |
RawSignal:variation:intensity:scale_stddev |
float | 0 |
min: 0 | Standard deviation of peak intensity (relative to the scaled peak height). Set to 0 to get simple rescaled intensities. |
RawSignal:noise:shot:rate |
float | 0 |
min: 0 | Poisson rate of shot noise per unit m/z. Set this to 0 to disable simulation of shot noise. |
RawSignal:noise:shot:intensity-mean |
float | 1 |
| Shot noise intensity mean (exponentially distributed with given mean). |
RawSignal:noise:white:mean |
float | 0 |
| Mean value of white noise being added to each measured signal. |
RawSignal:noise:white:stddev |
float | 0 |
| Standard deviation of white noise being added to each measured signal. |
RawSignal:noise:detector:mean |
float | 0 |
| Mean value of the detector noise being added to the complete measurement. |
RawSignal:noise:detector:stddev |
float | 0 |
| Standard deviation of the detector noise being added to the complete measurement. |
RawTandemSignal:status |
string | disabled |
disabled, precursor, MS^E | Create Tandem-MS scans? |
RawTandemSignal:tandem_mode |
int | 0 |
min: 0 max: 2 | Algorithm to generate the tandem-MS spectra. 0 - fixed intensities, 1 - SVC prediction (abundant/missing), 2 - SVR prediction of peak intensity |
RawTandemSignal:svm_model_set_file |
string | examples/simulation/SvmModelSet.model |
| File containing the filenames of SVM Models for different charge variants |
RawTandemSignal:Precursor:ms2_spectra_per_rt_bin |
int | 5 |
min: 1 | Number of allowed MS/MS spectra in a retention time bin. |
RawTandemSignal:Precursor:min_mz_peak_distance |
float | 2 |
min: 0.0001 | The minimal distance (in Th) between two peaks for concurrent selection for fragmentation. Also used to define the m/z width of an exclusion window (distance +/- from m/z of precursor). If you set this lower than the isotopic envelope of a peptide, you might get multiple fragment spectra pointing to the same precursor. |
RawTandemSignal:Precursor:mz_isolation_window |
float | 2 |
min: 0 | All peaks within a mass window (in Th) of a selected peak are also selected for fragmentation. |
RawTandemSignal:Precursor:exclude_overlapping_peaks |
string | false |
true, false | If true, overlapping or nearby peaks (within 'min_mz_peak_distance') are excluded for selection. |
RawTandemSignal:Precursor:charge_filter |
int list | [2, 3] |
min: 1 max: 5 | Charges considered for MS2 fragmentation. |
RawTandemSignal:Precursor:Exclusion:use_dynamic_exclusion |
string | false |
true, false | If true, dynamic exclusion is applied. |
RawTandemSignal:Precursor:Exclusion:exclusion_time |
float | 100 |
min: 0 | The time (in seconds) a feature is excluded. |
RawTandemSignal:Precursor:ProteinBasedInclusion:max_list_size |
int | 1000 |
min: 1 | The maximal number of precursors in the inclusion list. |
RawTandemSignal:Precursor:ProteinBasedInclusion:rt:min_rt |
float | 960 |
min: 0 | Minimal rt in seconds. |
RawTandemSignal:Precursor:ProteinBasedInclusion:rt:max_rt |
float | 3840 |
min: 0 | Maximal rt in seconds. |
RawTandemSignal:Precursor:ProteinBasedInclusion:rt:rt_step_size |
float | 30 |
min: 1 | rt step size in seconds. |
RawTandemSignal:Precursor:ProteinBasedInclusion:rt:rt_window_size |
int | 100 |
min: 1 | rt window size in seconds. |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:min_protein_id_probability |
float | 0.95 |
min: 0 max: 1 | Minimal protein probability for a protein to be considered identified. |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:min_pt_weight |
float | 0.5 |
min: 0 max: 1 | Minimal pt weight of a precursor |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:min_mz |
float | 500 |
min: 0 | Minimal mz to be considered in protein based LP formulation. |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:max_mz |
float | 5000 |
min: 0 | Maximal mz to be considered in protein based LP formulation. |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:use_peptide_rule |
string | false |
true, false | Use peptide rule instead of minimal protein id probability |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:min_peptide_ids |
int | 2 |
min: 1 | If use_peptide_rule is true, this parameter sets the minimal number of peptide ids for a protein id |
RawTandemSignal:Precursor:ProteinBasedInclusion:thresholds:min_peptide_probability |
float | 0.95 |
min: 0 max: 1 | If use_peptide_rule is true, this parameter sets the minimal probability for a peptide to be safely identified |
RawTandemSignal:MS_E:add_single_spectra |
string | false |
true, false | If true, the MS2 spectra for each peptide signal are included in the output (might be a lot). They will have a meta value 'MSE_DebugSpectrum' attached, so they can be filtered out. Native MS_E spectra will have 'MSE_Spectrum' instead. |
RawTandemSignal:TandemSim:Simple:add_isotopes |
string | false |
true, false | If set to true, isotope peaks of the product ion peaks are added |
RawTandemSignal:TandemSim:Simple:max_isotope |
int | 2 |
| Defines the maximal isotopic peak which is added; add_isotopes must be set to true |
RawTandemSignal:TandemSim:Simple:add_metainfo |
string | false |
true, false | Adds the type of peaks as metainfo to the peaks, like y8+, [M-H2O+2H]++ |
RawTandemSignal:TandemSim:Simple:add_losses |
string | false |
true, false | Adds common losses to those ions expected to have them; only water and ammonia losses are considered |
RawTandemSignal:TandemSim:Simple:add_precursor_peaks |
string | false |
true, false | Adds peaks of the precursor to the spectrum, which happen to occur sometimes |
RawTandemSignal:TandemSim:Simple:add_all_precursor_charges |
string | false |
true, false | Adds precursor peaks with all charges in the given range |
RawTandemSignal:TandemSim:Simple:add_abundant_immonium_ions |
string | false |
true, false | Add most abundant immonium ions |
RawTandemSignal:TandemSim:Simple:add_first_prefix_ion |
string | false |
true, false | If set to true, first prefix ions (e.g. b1) are added |
RawTandemSignal:TandemSim:Simple:add_y_ions |
string | true |
true, false | Add peaks of y-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:add_b_ions |
string | true |
true, false | Add peaks of b-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:add_a_ions |
string | false |
true, false | Add peaks of a-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:add_c_ions |
string | false |
true, false | Add peaks of c-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:add_x_ions |
string | false |
true, false | Add peaks of x-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:add_z_ions |
string | false |
true, false | Add peaks of z-ions to the spectrum |
RawTandemSignal:TandemSim:Simple:y_intensity |
float | 1 |
| Intensity of the y-ions |
RawTandemSignal:TandemSim:Simple:b_intensity |
float | 1 |
| Intensity of the b-ions |
RawTandemSignal:TandemSim:Simple:a_intensity |
float | 1 |
| Intensity of the a-ions |
RawTandemSignal:TandemSim:Simple:c_intensity |
float | 1 |
| Intensity of the c-ions |
RawTandemSignal:TandemSim:Simple:x_intensity |
float | 1 |
| Intensity of the x-ions |
RawTandemSignal:TandemSim:Simple:z_intensity |
float | 1 |
| Intensity of the z-ions |
RawTandemSignal:TandemSim:Simple:relative_loss_intensity |
float | 0.1 |
| Intensity of loss ions, in relation to the intact ion intensity |
RawTandemSignal:TandemSim:Simple:precursor_intensity |
float | 1 |
| Intensity of the precursor peak |
RawTandemSignal:TandemSim:Simple:precursor_H2O_intensity |
float | 1 |
| Intensity of the H2O loss peak of the precursor |
RawTandemSignal:TandemSim:Simple:precursor_NH3_intensity |
float | 1 |
| Intensity of the NH3 loss peak of the precursor |
RawTandemSignal:TandemSim:SVM:add_isotopes |
string | false |
true, false | If set to true, isotope peaks of the product ion peaks are added |
RawTandemSignal:TandemSim:SVM:max_isotope |
int | 2 |
| Defines the maximal isotopic peak which is added; add_isotopes must be set to true |
RawTandemSignal:TandemSim:SVM:add_metainfo |
string | false |
true, false | Adds the type of peaks as metainfo to the peaks, like y8+, [M-H2O+2H]++ |
RawTandemSignal:TandemSim:SVM:add_first_prefix_ion |
string | false |
true, false | If set to true, first prefix ions (e.g. b1) are added |
RawTandemSignal:TandemSim:SVM:hide_y_ions |
string | false |
true, false | If true, peaks of y-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_y2_ions |
string | false |
true, false | If true, peaks of y2-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_b_ions |
string | false |
true, false | If true, peaks of b-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_b2_ions |
string | false |
true, false | If true, peaks of b2-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_a_ions |
string | false |
true, false | If true, peaks of a-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_c_ions |
string | false |
true, false | If true, peaks of c-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_x_ions |
string | false |
true, false | If true, peaks of x-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_z_ions |
string | false |
true, false | If true, peaks of z-ions are not added to the spectrum |
RawTandemSignal:TandemSim:SVM:hide_losses |
string | false |
true, false | If true, common losses (only water and ammonia losses are considered) are not added to the ions expected to have them |
RawTandemSignal:TandemSim:SVM:y_intensity |
float | 1 |
| Intensity of the y-ions |
RawTandemSignal:TandemSim:SVM:b_intensity |
float | 1 |
| Intensity of the b-ions |
RawTandemSignal:TandemSim:SVM:a_intensity |
float | 1 |
| Intensity of the a-ions |
RawTandemSignal:TandemSim:SVM:c_intensity |
float | 1 |
| Intensity of the c-ions |
RawTandemSignal:TandemSim:SVM:x_intensity |
float | 1 |
| Intensity of the x-ions |
RawTandemSignal:TandemSim:SVM:z_intensity |
float | 1 |
| Intensity of the z-ions |
RawTandemSignal:TandemSim:SVM:relative_loss_intensity |
float | 0.1 |
| Intensity of loss ions, in relation to the intact ion intensity |
Global:ionization_type |
string | ESI |
MALDI, ESI | Type of Ionization (MALDI or ESI) |
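
A few of the descriptions above contain small formulas that can be checked by hand. The following is a minimal Python sketch of three of them: the internal rescaling of the 'Ionization:esi:charge_impurity' weights, the exponential baseline pdf behind 'RawSignal:baseline:shape', and the approximate 2:3 height ratio noted for 'RawSignal:peak_shape'. It is not OpenMS code; all function names are hypothetical, and the height ratio additionally assumes that both peak shapes share the same FWHM as well as the same area.

```python
import math


def normalize_charge_impurity(entries):
    """Rescale 'ion:weight' entries so that the weights sum to 1.

    Hypothetical helper that mirrors the documented behaviour of
    'Ionization:esi:charge_impurity'; it is not part of OpenMS.
    """
    weights = {}
    for entry in entries:
        # Split at the last ':' so that ion names such as 'H+' stay intact.
        ion, weight = entry.rsplit(":", 1)
        weights[ion] = weights.get(ion, 0.0) + float(weight)
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {ion: w / total for ion, w in weights.items()}


def baseline_pdf(x, shape=0.5):
    """Exponential pdf f(x) = shape * exp(-shape * x), as in 'RawSignal:baseline:shape'."""
    return shape * math.exp(-shape * x)


def lorentz_to_gauss_height_ratio():
    """Ratio of maximal heights (Lorentzian : Gaussian).

    Assumption: both peaks have unit area and the same FWHM.
    Gaussian height   = 2 * sqrt(ln 2 / pi) / FWHM
    Lorentzian height = 2 / (pi * FWHM)
    The FWHM cancels in the ratio.
    """
    gauss_height = 2.0 * math.sqrt(math.log(2.0) / math.pi)
    lorentz_height = 2.0 / math.pi
    return lorentz_height / gauss_height


if __name__ == "__main__":
    print(normalize_charge_impurity(["H:4", "Na:1"]))  # -> {'H': 0.8, 'Na': 0.2}
    print(baseline_pdf(0.0))                           # -> 0.5
    print(round(lorentz_to_gauss_height_ratio(), 2))   # -> 0.68, i.e. roughly 2:3
```

The first call reproduces the example from the 'charge_impurity' description, where ['H:4' 'Na:1'] translates to ['H:0.8' 'Na:0.2'].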