OpenMS
2.6.0
Calculates the suitability of a database which was used for a peptide identification search. Also reports the quality of LC-MS spectra.
The metric this tool uses to determine the suitability of a database is based on a de novo model. Therefore it is crucial that your workflow is set up the right way. Above you can see an example.
Most importantly the peptide identification search needs to be done with a combination of the database in question and a de novo "database".
To generate the de novo "database":
For re-ranking, all cases are checked in which a peptide hit found only in the de novo "database" scores above a peptide hit found in the actual database. In each of these cases, the cross-correlation scores of the two peptide hits are compared. If they are similar enough, the database hit is re-ranked to sit on top of the de novo hit. You can control how many of the cases with similar scores are re-ranked with the reranking_cutoff_percentile parameter.
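The re-ranking decision described above can be illustrated with a small Python sketch. It is not the tool's implementation: the Hit class, the rerank function and the max_xcorr_diff threshold are hypothetical names, and the sketch assumes that a cutoff on the cross-correlation difference has already been derived (how reranking_cutoff_percentile maps to such a cutoff is not described here). Only the best database hit is considered for promotion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    sequence: str
    xcorr: float          # cross-correlation score of the peptide hit
    from_database: bool   # False: found only in the de novo "database"


def rerank(hits: List[Hit], max_xcorr_diff: float) -> List[Hit]:
    """Order hits best-first; promote the best database hit above a
    de novo-only top hit if their xcorr scores are similar enough.

    max_xcorr_diff is an assumed, precomputed similarity cutoff.
    """
    ranked = sorted(hits, key=lambda h: h.xcorr, reverse=True)
    if not ranked or ranked[0].from_database:
        return ranked  # nothing to re-rank
    top = ranked[0]
    for i, hit in enumerate(ranked):
        if hit.from_database:
            # Only the best-scoring database hit is a candidate.
            if top.xcorr - hit.xcorr <= max_xcorr_diff:
                ranked.insert(0, ranked.pop(i))
            break
    return ranked


hits = [Hit("NOVOPEPTIDE", 3.0, False), Hit("DBPEPTIDE", 2.9, True)]
print([h.sequence for h in rerank(hits, max_xcorr_diff=0.2)])
```

With a looser cutoff more database hits are promoted, which is why disabling or tightening re-ranking tends to lower the reported suitability.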
For re-ranking to work, PeptideIndexer must have been run beforehand. It is equally important that no FDR filtering has been performed: this tool does that itself and will crash if a q-value is found. You can still control the FDR that you want to establish using the corresponding flag.
Re-ranking can be disabled with the no_rerank flag in this tool. This will probably result in an underestimated suitability, though.
The results are written directly to the console. You can also provide an optional tsv output file to which the most important results are exported.
This tool uses the metrics and algorithms first presented in:
Assessing protein sequence database suitability using de novo sequencing. Molecular & Cellular Proteomics. January 1, 2020; 19, 1: 198-208. doi:10.1074/mcp.TIR119.001752.
Richard S. Johnson, Brian C. Searle, Brook L. Nunn, Jason M. Gilmore, Molly Phillips, Chris T. Amemiya, Michelle Heck, Michael J. MacCoss.
The command line parameters of this tool are:
DatabaseSuitability -- Computes a suitability score for a database which was used for a peptide identification search. Also reports the quality of LC-MS spectra.
Full documentation:
Version: 2.6.0 Sep 30 2020, 12:54:34, Revision: c26f752
To cite OpenMS:
  Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.
To cite DatabaseSuitability:
  Richard S. Johnson, Brian C. Searle, Brook L. Nunn, Jason M. Gilmore, Molly Phillips, Chris T. Amemiya, Michelle Heck, Michael J. MacCoss. Assessing protein sequence database suitability using de novo sequencing. Molecular & Cellular Proteomics. January 1, 2020; 19, 1: 198-208. doi:10.1074/mcp.TIR119.001752.

Usage:
  DatabaseSuitability <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in_id <file>*     Input idXML file from peptide search with combined database with added de novo peptide. PeptideIndexer is needed, FDR is forbidden. (valid formats: 'idXML')
  -in_spec <file>*   Input MzML file used for the peptide identification (valid formats: 'mzML')
  -in_novo <file>*   Input idXML file containing de novo peptides (unfiltered) (valid formats: 'idXML')
  -out <file>        Optional tsv output containing database suitability information as well as spectral quality. (valid formats: 'tsv')

Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)

The following configuration subsections are valid:
  - algorithm   Parameter section for the suitability calculation algorithm

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
For more information, please consult the online documentation for this tool:
  -
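The options listed above can be assembled into a call from a script. The following is a minimal Python sketch, assuming the DatabaseSuitability executable is on the PATH. The helper build_command and all file names are hypothetical, and only options that appear in the help text are used.

```python
import shlex
from typing import List, Optional


def build_command(in_id: str, in_spec: str, in_novo: str,
                  out: Optional[str] = None) -> List[str]:
    """Assemble the argument list for a DatabaseSuitability call."""
    cmd = [
        "DatabaseSuitability",
        "-in_id", in_id,      # idXML from the combined database + de novo search
        "-in_spec", in_spec,  # mzML used for the identification
        "-in_novo", in_novo,  # idXML with unfiltered de novo peptides
    ]
    if out is not None:
        cmd += ["-out", out]  # optional tsv report
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_command("search.idXML", "spectra.mzML", "novo.idXML",
                    out="suitability.tsv")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to execute the tool. Building the arguments as a list rather than a single string avoids quoting problems with paths that contain spaces.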
INI file documentation of this tool: