OpenMS
FalseDiscoveryRate

Tool to estimate the false discovery rate on peptide and protein level

Potential predecessor tools: MascotAdapter (or other ID engines), PeptideIndexer
Potential successor tools: IDFilter

This TOPP tool calculates the false discovery rate (FDR) for results of target-decoy searches. The FDR calculation can be performed for proteins and/or for peptides (more precisely, peptide-spectrum matches, PSMs).

The false discovery rate is defined as the number of false discoveries (decoy hits) divided by the number of false and correct discoveries (both target and decoy hits) with a score better than a given threshold.
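
As a minimal illustration of this definition (not the tool's actual implementation), the following Python sketch counts target and decoy hits that pass a score threshold. The function name fdr_at_threshold and the (score, is_decoy) tuple layout are assumptions made for this example, and "better than the threshold" is taken to include hits exactly at the threshold.

```python
def fdr_at_threshold(hits, threshold, higher_better=True):
    """Estimate the FDR among hits scoring at least as well as `threshold`.

    `hits` is an iterable of (score, is_decoy) tuples (hypothetical layout).
    Returns decoys / (targets + decoys) for the passing hits, or 0.0 if
    no hit passes the threshold.
    """
    if higher_better:
        passing = [is_decoy for score, is_decoy in hits if score >= threshold]
    else:
        passing = [is_decoy for score, is_decoy in hits if score <= threshold]
    if not passing:
        return 0.0
    decoys = sum(passing)  # True counts as 1, False as 0
    return decoys / len(passing)


# Five hypothetical hits: three targets and two decoys.
hits = [(95.0, False), (90.0, False), (80.0, True), (70.0, False), (60.0, True)]
fdr = fdr_at_threshold(hits, 70.0)  # 1 decoy among the 4 hits scoring >= 70
```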

PeptideIndexer must be applied to the search results (idXML file) to index the data and to annotate peptide and protein hits with their target/decoy status.

Note
If no decoy hits are found, you will get a warning like this:
"FalseDiscoveryRate: #decoy sequences is zero! Setting all target sequences to q-value/FDR 0!"
Treat this as a serious concern: it indicates a possible problem with the target/decoy annotation step (PeptideIndexer), e.g. due to a misconfigured database.
By default, FalseDiscoveryRate only annotates peptides and proteins with their FDR and does not filter them. Setting FDR:PSM or FDR:protein to a maximum q-value (e.g., 0.05 corresponds to an FDR of 5%) enables filtering on the PSM and protein level, respectively; the default value of 1.0 disables filtering. Alternatively, FDR filtering can be performed in the IDFilter tool by setting score:pep and score:prot to the maximum q-value. After potential filtering, associations are automatically updated and unreferenced proteins/peptides are removed based on the advanced cleanup parameters.
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.
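
The filtering options described in the notes above are passed on the command line. The Python sketch below only assembles an argument list from options listed in this documentation (-in, -out, -FDR:PSM, -FDR:protein) and checks log text for the zero-decoy warning; it does not run the tool. The helper names and file names are hypothetical.

```python
import shlex

# Fragment of the warning quoted in the note above.
ZERO_DECOY_WARNING = "#decoy sequences is zero"


def build_fdr_command(in_file, out_file, psm_qvalue=1.0, protein_qvalue=1.0):
    """Assemble a FalseDiscoveryRate argument list (hypothetical helper).

    A q-value limit of 1.0 leaves filtering disabled, matching the
    documented defaults of -FDR:PSM and -FDR:protein.
    """
    for name, q in (("FDR:PSM", psm_qvalue), ("FDR:protein", protein_qvalue)):
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"{name} must lie in [0.0, 1.0], got {q}")
    return [
        "FalseDiscoveryRate",
        "-in", in_file,
        "-out", out_file,
        "-FDR:PSM", str(psm_qvalue),
        "-FDR:protein", str(protein_qvalue),
    ]


def has_zero_decoy_warning(log_text):
    """Return True if the captured tool output contains the zero-decoy warning."""
    return ZERO_DECOY_WARNING in log_text


# Keep PSMs with a q-value of at most 0.05; leave protein filtering disabled.
cmd = build_fdr_command("indexed.idXML", "fdr.idXML", psm_qvalue=0.05)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run, and the captured output passed to has_zero_decoy_warning to catch a misconfigured target/decoy annotation early.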

The command line parameters of this tool are:

FalseDiscoveryRate -- Estimates the false discovery rate on peptide and protein level using decoy searches.
Full documentation: http://www.openms.de/doxygen/release/3.3.0/html/TOPP_FalseDiscoveryRate.html
Version: 3.3.0 Dec 21 2024, 15:25:20, Revision: 35c5e65
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spec
   trometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  FalseDiscoveryRate <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed descript
ion or use the --helphelp option

Options (mandatory options marked with '*'):
  -in <file>*                                   Identifications from searching a target-decoy database. (vali
                                                d formats: 'idXML')
  -out <file>*                                  Identifications with annotated FDR (valid formats: 'idXML')
  -PSM <FDR level>                              Perform FDR calculation on PSM level (default: 'true') (valid
                                                : 'true', 'false')
  -peptide <FDR level>                          Perform FDR calculation on peptide level and annotates it as 
                                                meta value
                                                (Note: if set, also calculates FDR/q-value on PSM level.) 
                                                (default: 'false') (valid: 'true', 'false')
  -PSM_peptide_base_score <score name or type>  Set if you want to choose a different score than the last 
                                                calculated main score for PSM or peptide level.
  -protein <FDR level>                          Perform FDR calculation on protein level (default: 'true') 
                                                (valid: 'true', 'false')
  -proteingroup <FDR level>                     Perform FDR calculation on (indist.) protein group level, 
                                                too. Currently, this will enable protein FDR automatically 
                                                (since internals need to be in-sync) but will affect the leve
                                                l at which it filters (if enabled). (default: 'false') (valid
                                                : 'true', 'false')
  -protein_base_score <score name or type>      Set if you want to choose a different score than the last 
                                                calculated main score for protein (group) level.

FDR control:
  -FDR:PSM <fraction>                           Filter PSMs based on q-value (e.g., 0.05 = 5% FDR, disabled 
                                                for 1) (default: '1.0') (min: '0.0' max: '1.0')
  -FDR:protein <fraction>                       Filter proteins based on q-value (e.g., 0.05 = 5% FDR, disabl
                                                ed for 1) (default: '1.0') (min: '0.0' max: '1.0')

                                                
Common TOPP options:
  -ini <file>                                   Use the given TOPP INI file
  -threads <n>                                  Sets the number of threads allowed to be used by the TOPP 
                                                tool (default: '1')
  -write_ini <file>                             Writes the default configuration file
  --help                                        Shows options
  --helphelp                                    Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Parameter section for the FDR calculation algorithm

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  - http://www.openms.de/doxygen/release/3.3.0/html/TOPP_FalseDiscoveryRate.html
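
The INI file written by -write_ini is an XML document. Assuming the usual OpenMS layout of nested NODE elements containing ITEM elements with name and value attributes (an assumption to verify against a file generated by your installation), the Python sketch below flattens such a file into colon-separated parameter names in the style of FDR:PSM. The embedded sample is a hand-written fragment for illustration, not real tool output, and flatten_ini is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment mimicking the assumed INI structure (not tool output).
SAMPLE_INI = """<PARAMETERS>
  <NODE name="FalseDiscoveryRate">
    <NODE name="1">
      <ITEM name="PSM" value="true" type="string"/>
      <NODE name="FDR">
        <ITEM name="PSM" value="1.0" type="double"/>
        <NODE name="cleanup">
          <ITEM name="remove_proteins_without_psms" value="true" type="string"/>
        </NODE>
      </NODE>
      <NODE name="algorithm">
        <ITEM name="conservative" value="true" type="string"/>
      </NODE>
    </NODE>
  </NODE>
</PARAMETERS>
"""


def flatten_ini(xml_text, strip_levels=2):
    """Map colon-joined parameter paths to their string values.

    `strip_levels` drops the leading tool and instance sections
    ('FalseDiscoveryRate' and '1'); items above that depth are skipped.
    """
    params = {}

    def walk(node, path):
        for child in node:
            if child.tag == "NODE":
                walk(child, path + [child.get("name")])
            elif child.tag == "ITEM" and len(path) >= strip_levels:
                key = ":".join((path + [child.get("name")])[strip_levels:])
                params[key] = child.get("value")

    walk(ET.fromstring(xml_text), [])
    return params


params = flatten_ini(SAMPLE_INI)
for name, value in params.items():
    print(f"{name} = {value}")
```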

INI file documentation of this tool:

+ FalseDiscoveryRate: Estimates the false discovery rate on peptide and protein level using decoy searches.
  version: Version of the tool that generated this parameters file. (default: '3.3.0')
++ 1: Instance '1' section for 'FalseDiscoveryRate'
  in: Identifications from searching a target-decoy database. (input file) (valid formats: '*.idXML')
  out: Identifications with annotated FDR (output file) (valid formats: '*.idXML')
  PSM: Perform FDR calculation on PSM level (default: 'true') (valid: 'true', 'false')
  peptide: Perform FDR calculation on peptide level and annotates it as meta value (Note: if set, also calculates FDR/q-value on PSM level.) (default: 'false') (valid: 'true', 'false')
  PSM_peptide_base_score: Set if you want to choose a different score than the last calculated main score for PSM or peptide level.
  PSM_peptide_base_score_orientation: In case the score orientation cannot be inferred. (valid: 'higher_better', 'lower_better')
  protein: Perform FDR calculation on protein level (default: 'true') (valid: 'true', 'false')
  proteingroup: Perform FDR calculation on (indist.) protein group level, too. Currently, this will enable protein FDR automatically (since internals need to be in-sync) but will affect the level at which it filters (if enabled). (default: 'false') (valid: 'true', 'false')
  protein_score: The protein score used to calculate the protein FDR. If empty, the main score is used. (valid: 'MS:1001492', 'Mascot', 'OMSSA', 'SEQUEST:xcorr', 'XTandem', 'hyperscore', 'ln(hyperscore)', 'mvh', 'svm', 'E-Value', 'MS:1002053', 'MS:1002257', 'SpecEValue', 'evalue', 'expect', 'Posterior Probability', 'MS:1001493', 'Posterior Error Probability', 'pep', 'FDR', 'false discovery rate', 'fdr', 'MS:1001491', 'q-Value', 'q-value', 'qval', 'qvalue')
  protein_base_score: Set if you want to choose a different score than the last calculated main score for protein (group) level.
  protein_base_score_orientation: Set if you want to choose a different score than the last calculated main score for protein (group) level. (valid: 'higher_better', 'lower_better')
  log: Name of log file (created only when specified)
  debug: Sets the debug level (default: '0')
  threads: Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  no_progress: Disables progress logging to command line (default: 'false') (valid: 'true', 'false')
  force: Overrides tool-specific checks (default: 'false') (valid: 'true', 'false')
  test: Enables the test mode (needed for internal use only) (default: 'false') (valid: 'true', 'false')
+++ FDR: FDR control
  PSM: Filter PSMs based on q-value (e.g., 0.05 = 5% FDR, disabled for 1) (default: '1.0') (min: '0.0' max: '1.0')
  protein: Filter proteins based on q-value (e.g., 0.05 = 5% FDR, disabled for 1) (default: '1.0') (min: '0.0' max: '1.0')
++++ cleanup: Cleanup references after FDR control
  remove_proteins_without_psms: Remove proteins without PSMs (due to being decoy or below PSM FDR threshold). (default: 'true') (valid: 'true', 'false')
  remove_psms_without_proteins: Remove PSMs without proteins (due to being decoy or below protein FDR threshold). (default: 'true') (valid: 'true', 'false')
  remove_spectra_without_psms: Remove spectra without PSMs (due to being decoy or below protein FDR threshold). Caution: if remove_psms_without_proteins is false, protein level filtering does not propagate. (default: 'true') (valid: 'true', 'false')
+++ algorithm: Parameter section for the FDR calculation algorithm
  no_qvalues: If 'true' strict FDRs will be calculated instead of q-values (the default) (default: 'false') (valid: 'true', 'false')
  use_all_hits: If 'true' not only the first hit, but all are used (peptides only) (default: 'false') (valid: 'true', 'false')
  split_charge_variants: If 'true' charge variants are treated separately (for peptides of combined target/decoy searches only). (default: 'false') (valid: 'true', 'false')
  treat_runs_separately: If 'true' different search runs are treated separately (for peptides of combined target/decoy searches only). (default: 'false') (valid: 'true', 'false')
  add_decoy_peptides: If 'true' decoy peptides will be written to output file, too. The q-value is set to the closest target score. (default: 'false') (valid: 'true', 'false')
  add_decoy_proteins: If 'true' decoy proteins will be written to output file, too. The q-value is set to the closest target score. (default: 'false') (valid: 'true', 'false')
  conservative: If 'true' (D+1)/T instead of (D+1)/(T+D) is used as a formula. (default: 'true') (valid: 'true', 'false')
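
To make the 'conservative' and 'no_qvalues' parameters more concrete, the Python sketch below evaluates both formulas on a ranked hit list and derives q-values as the minimum FDR at the same or any worse rank. This is an illustration under stated assumptions, not the tool's implementation: D and T are taken to be the decoy and target counts at or above a given rank, the q-value definition is the commonly used one, returning infinity when no target has been seen is a choice made for this example, and the function names are hypothetical.

```python
import math


def fdr_estimate(targets, decoys, conservative=True):
    """Evaluate (D+1)/T if `conservative`, else (D+1)/(T+D)."""
    if targets + decoys == 0:
        raise ValueError("at least one hit is required")
    if conservative:
        # Assumption for this sketch: no targets yet means an unbounded estimate.
        return (decoys + 1) / targets if targets > 0 else math.inf
    return (decoys + 1) / (targets + decoys)


def running_fdr_and_qvalues(decoy_flags, conservative=True):
    """Return (strict FDRs, q-values) for hits ordered from best to worst score."""
    fdrs = []
    targets = decoys = 0
    for is_decoy in decoy_flags:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(fdr_estimate(targets, decoys, conservative))
    # q-value: smallest FDR reachable at this rank or any worse rank.
    qvalues = fdrs[:]
    for i in range(len(qvalues) - 2, -1, -1):
        qvalues[i] = min(qvalues[i], qvalues[i + 1])
    return fdrs, qvalues


# Hypothetical ranking: target, target, decoy, target, target.
flags = [False, False, True, False, False]
strict_fdrs, qvalues = running_fdr_and_qvalues(flags, conservative=True)
```

Unlike the strict FDRs, the q-values never decrease from a better rank to a worse one, which is why a single q-value cutoff yields a consistent filter.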