OpenMS  2.7.0
GNPSExport

Export MS/MS data in .MGF format for GNPS (http://gnps.ucsd.edu).

GNPS (Global Natural Products Social Molecular Networking, http://gnps.ucsd.edu) is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry (MS/MS) data. The GNPS web platform makes it possible to search public MS/MS spectral libraries and to perform data analyses such as MS/MS molecular networking, network annotation propagation (http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006089), and DEREPLICATOR-based annotation (https://www.nature.com/articles/nchembio.2219). The GNPS manuscript is available here: https://www.nature.com/articles/nbt.3597

This tool was developed for the Feature-Based Molecular Networking (FBMN) workflow on GNPS (https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash2.jsp).

Please cite our article: Nothias, LF., Petras, D., Schmid, R. et al. Feature-based molecular networking in the GNPS analysis environment. Nat Methods 17, 905–908 (2020). https://doi.org/10.1038/s41592-020-0933-6

See the FBMN workflow documentation here (https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/)

In brief, after running an OpenMS "metabolomics" pipeline, the GNPSExport TOPP tool can be used on the consensusXML file and corresponding mzML files to generate the files needed for FBMN on GNPS. These two files are:

  • The MS/MS spectral data file (.MGF format), which is generated with the GNPSExport tool.
  • The feature quantification table (.CSV format), which is generated with the TextExporter tool.

For each consensusElement in the consensusXML file, GNPSExport produces one representative consensus MS/MS spectrum (called a peptide annotation in OpenMS jargon) and writes it to the MS/MS spectral file (.MGF file). Several modes for generating the consensus MS/MS spectrum are available and described below. Note that these parameters are defined in the GNPSExport INI parameters file.
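As a rough illustration of what one spectrum block in an MGF file looks like, the sketch below formats a single entry. The function name `format_mgf_entry` and the choice of header keys (TITLE, PEPMASS, RTINSECONDS, CHARGE, MSLEVEL) are assumptions made for this example; the exact keys written by GNPSExport are not listed in this document and may differ.

```python
from typing import Iterable, Tuple


def format_mgf_entry(title: str, precursor_mz: float, rt_seconds: float,
                     charge: int, peaks: Iterable[Tuple[float, float]]) -> str:
    """Return one MGF spectrum block as text (illustrative header keys only)."""
    lines = [
        "BEGIN IONS",
        f"TITLE={title}",
        f"PEPMASS={precursor_mz:.4f}",
        f"RTINSECONDS={rt_seconds:.2f}",
        f"CHARGE={charge}+",
        "MSLEVEL=2",
    ]
    # One "m/z intensity" pair per line, sorted by m/z.
    for mz, intensity in sorted(peaks):
        lines.append(f"{mz:.4f} {intensity:.1f}")
    lines.append("END IONS")
    return "\n".join(lines) + "\n"


# Hypothetical values, chosen only to show the layout.
entry = format_mgf_entry("feature_1", 301.1412, 125.3, 1,
                         [(153.0188, 820.0), (85.0284, 1200.5)])
print(entry)
```

An MGF file for GNPS would contain one such block per consensusElement, written one after another.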

Representative command:

GNPSExport -ini iniFile-GNPSExport.ini -in_cm filefilter.consensusXML -in_mzml inputFile0.mzML inputFile1.mzML -out GNPSExport_output.mgf
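When the tool is called from a script, the same command can be assembled programmatically. The sketch below is a minimal example that uses only the options documented on this page (-ini, -in_cm, -in_mzml, -out); the helper name `build_gnpsexport_command` is hypothetical, and the code only builds the argument list without running the tool.

```python
import shlex
from typing import List, Optional, Sequence


def build_gnpsexport_command(consensus_xml: str,
                             mzml_files: Sequence[str],
                             out_mgf: str,
                             ini_file: Optional[str] = None) -> List[str]:
    """Build the GNPSExport argument list from documented options."""
    if not mzml_files:
        raise ValueError("at least one mzML file is required")
    cmd = ["GNPSExport"]
    if ini_file is not None:
        cmd += ["-ini", ini_file]
    # The mzML files must be given in the order in which the
    # consensusXML file maps them.
    cmd += ["-in_cm", consensus_xml, "-in_mzml", *mzml_files, "-out", out_mgf]
    return cmd


cmd = build_gnpsexport_command(
    "filefilter.consensusXML",
    ["inputFile0.mzML", "inputFile1.mzML"],
    "GNPSExport_output.mgf",
    ini_file="iniFile-GNPSExport.ini",
)
print(shlex.join(cmd))
```

On a machine where OpenMS is installed and on the PATH, the resulting list could be passed to `subprocess.run(cmd, check=True)`.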

The GNPSExport TOPP tool can be run on a consensusXML file and the corresponding mzML files to generate an MS/MS spectral file (MGF format). The corresponding feature quantification table (.TXT format), which contains the LC-MS peak area intensity, is exported with the TextExporter tool.

Requirements:

  • The IDMapper has to be run on the featureXML files in order to associate MS2 scan(s) (peptide annotation) with each feature. These peptide annotations are used by the GNPSExport.
  • The FileFilter has to be run on the consensusXML file, prior to the GNPSExport, in order to remove consensusElements without MS2 scans (peptide annotation).

Parameters:

  • Binning (ms2_bin_size): Defines the binning width of fragment ions during the merging of eligible MS/MS spectra.
  • Cosine Score Threshold (merged_spectra:cos_similarity): Defines the minimum pairwise cosine similarity to the MS/MS scan with the highest precursor intensity that a scan must reach to be eligible for merging.
  • Output Type (output_type): Options for outputting GNPSExport spectral processing are:
    1. [RECOMMENDED] Merged spectra (merged_spectra): For each consensusElement, the GNPSExport will merge all the eligible MS/MS scans into one representative consensus MS/MS spectrum. Eligible MS/MS scans have a pairwise cosine similarity with the MS/MS scan of highest precursor intensity above the Cosine Score Threshold. The fragment ions of merged MS/MS scans are binned in m/z (Da) using the width defined by the Binning parameter.
    2. Most intense (most_intense): For each consensusElement, the GNPSExport will output the most intense MS/MS scan (the one with the highest precursor ion intensity) as the consensus MS/MS spectrum.
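The two output types can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the GNPSExport implementation: the helper names (`bin_peaks`, `cosine_similarity`, `export_spectrum`) are hypothetical, intensities in a bin are summed, a merged bin is reported at its centre m/z, and a scan is treated as eligible when its similarity is at or above the threshold. The default values mirror the documented defaults (bin size about 0.02 Da, cosine threshold 0.9).

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[float, float]            # (m/z, intensity)
Scan = Tuple[float, List[Peak]]       # (precursor intensity, peaks)


def bin_peaks(peaks: List[Peak], bin_size: float) -> Dict[int, float]:
    """Sum the intensities of peaks that fall into the same m/z bin."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_size)] += intensity
    return dict(binned)


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine similarity between two binned spectra."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def export_spectrum(scans: List[Scan], output_type: str,
                    bin_size: float = 0.02,
                    cos_threshold: float = 0.9) -> List[Peak]:
    """Return the representative spectrum for one consensusElement."""
    # The reference is the scan with the highest precursor intensity.
    reference = max(scans, key=lambda scan: scan[0])
    if output_type == "most_intense":
        return sorted(reference[1])
    if output_type != "merged_spectra":
        raise ValueError(f"unknown output_type: {output_type}")
    ref_binned = bin_peaks(reference[1], bin_size)
    merged: Dict[int, float] = defaultdict(float)
    for _, peaks in scans:
        binned = bin_peaks(peaks, bin_size)
        # Assumption: "eligible" means similarity >= threshold.
        if cosine_similarity(ref_binned, binned) >= cos_threshold:
            for key, value in binned.items():
                merged[key] += value
    # Assumption: report each merged bin at its centre m/z.
    return [((key + 0.5) * bin_size, value) for key, value in sorted(merged.items())]


scans = [
    (1000.0, [(100.001, 10.0), (150.005, 5.0)]),   # reference scan
    (800.0, [(100.009, 8.0), (150.005, 4.0)]),     # similar, gets merged
    (500.0, [(200.005, 9.0)]),                     # dissimilar, skipped
]
merged = export_spectrum(scans, "merged_spectra")
most_intense = export_spectrum(scans, "most_intense")
print(merged)
print(most_intense)
```

In this toy input the third scan shares no bins with the reference, so its cosine similarity is 0 and it does not contribute to the merged spectrum.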

Note that the mass accuracy and the retention time window for the pairing between MS/MS scans and an LC-MS feature or consensusElement are defined at the IDMapper tool step.
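The idea of pairing by m/z tolerance and retention time window can be illustrated with a small sketch. It is a deliberate simplification: the function `scans_for_feature` and its parameter names are hypothetical, the tolerance is expressed in Da, and the actual matching rules of IDMapper are not described in this document.

```python
from typing import List, Tuple

# (scan index, precursor m/z, retention time in seconds)
ScanInfo = Tuple[int, float, float]


def scans_for_feature(scans: List[ScanInfo], feature_mz: float, feature_rt: float,
                      mz_tolerance: float, rt_tolerance: float) -> List[int]:
    """Return the indices of MS/MS scans that fall inside both windows."""
    matched = []
    for index, precursor_mz, rt in scans:
        within_mz = abs(precursor_mz - feature_mz) <= mz_tolerance
        within_rt = abs(rt - feature_rt) <= rt_tolerance
        if within_mz and within_rt:
            matched.append(index)
    return matched


# Hypothetical scans and feature, for illustration only.
scans = [(0, 301.141, 120.0), (1, 301.144, 123.5),
         (2, 301.142, 200.0), (3, 305.200, 121.0)]
matched = scans_for_feature(scans, feature_mz=301.142, feature_rt=122.0,
                            mz_tolerance=0.01, rt_tolerance=5.0)
print(matched)
```

Scans 0 and 1 are within both windows; scan 2 is too far away in retention time and scan 3 too far away in m/z.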

A representative OpenMS-GNPS workflow would sequentially use these OpenMS TOPP tools:

  1. Input mzML files
  2. Run the FeatureFinderMetabo tool on the mzML files.
  3. Run the IDMapper tool on the featureXML and mzML files.
  4. Run the MapAlignerPoseClustering tool on the featureXML files.
  5. Run the MetaboliteAdductDecharger tool on the featureXML files.
  6. Run the FeatureLinkerUnlabeledKD or FeatureLinkerUnlabeledQT tool on the featureXML files and output a consensusXML file.
  7. Run the FileFilter on the consensusXML file to keep only consensusElements with at least one MS/MS scan (peptide identification).
  8. Run the GNPSExport on the "filtered consensusXML file" to export an .MGF file.
  9. Run the TextExporter on the "filtered consensusXML file" to export a .TXT file.
  10. Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. Instructions are here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/
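The ordering of the tool steps above can be written down as data and checked for consistency. The sketch below records, for each tool, which file types it consumes and which type it produces; the input types come from the list above, while the intermediate output types (featureXML for steps 2 to 5) are assumptions inferred from the next step's input. The helper `check_order` is hypothetical and does not run any OpenMS tool.

```python
from typing import List, Set, Tuple

# (tool name, required input file types, produced file type)
Step = Tuple[str, Set[str], str]

WORKFLOW: List[Step] = [
    ("FeatureFinderMetabo", {"mzML"}, "featureXML"),
    ("IDMapper", {"featureXML", "mzML"}, "featureXML"),
    ("MapAlignerPoseClustering", {"featureXML"}, "featureXML"),
    ("MetaboliteAdductDecharger", {"featureXML"}, "featureXML"),
    ("FeatureLinkerUnlabeledKD", {"featureXML"}, "consensusXML"),
    ("FileFilter", {"consensusXML"}, "consensusXML"),
    ("GNPSExport", {"consensusXML", "mzML"}, "mgf"),
    ("TextExporter", {"consensusXML"}, "txt"),
]


def check_order(steps: List[Step], initial: Set[str]) -> Set[str]:
    """Verify that every step finds its input types; return all available types."""
    available = set(initial)
    for name, inputs, output in steps:
        missing = inputs - available
        if missing:
            raise ValueError(f"{name} is missing inputs: {sorted(missing)}")
        available.add(output)
    return available


produced = check_order(WORKFLOW, {"mzML"})
print(sorted(produced))
```

The check only tracks file types, so it cannot tell a filtered consensusXML file from an unfiltered one; it is meant as a reading aid for the step order rather than a validator.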

The GitHub repository for that ProteoSAFe workflow and the OpenMS Python wrappers is available here: https://github.com/Bioinformatic-squad-DorresteinLab/openms-gnps-workflow

An online version of the OpenMS-GNPS pipeline for FBMN, running on the CCMS server (http://proteomics.ucsd.edu/), is available on GNPS: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms/

The command line parameters of this tool are:

GNPSExport -- Tool to export representative consensus MS/MS scan per consensusElement into a .MGF file format.
See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking_with_openms
Full documentation: http://www.openms.de/doxygen/release/2.7.0/html/TOPP_GNPSExport.html
Version: 2.7.0 Sep 13 2021, 20:58:47, Revision: 9110e58
To cite OpenMS:
  Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.
To cite GNPSExport:
  Nothias L.F. et al.. Feature-based Molecular Networking in the GNPS Analysis Environment. bioRxiv 812404 (2019). doi:10.1101/812404.

Usage:
  GNPSExport <options>

Options (mandatory options marked with '*'):
  -in_cm <file>*                                  Input consensusXML file containing only consensusElements
                                                  with "peptide" annotations. (valid formats: 'consensusXML')
  -in_mzml <files>*                               Original mzml files containing the ms2 spectra (aka peptide
                                                  annotation).
                                                  Must be in order that the consensusXML file maps the
                                                  original mzML files. (valid formats: 'mzML')
  -out <file>*                                    Output MGF file (valid formats: 'mgf')
  -output_type <choice>                           Specificity of mgf output information (default:
                                                  'most_intense' valid: 'merged_spectra', 'most_intense')
  -ms2_bin_size <num>                             Bin size (Da) for fragment ions when merging ms2 scans
                                                  (default: '0.019999999552965' min: '0.0')

Options for exporting mgf file with merged spectra per consensusElement:
  -merged_spectra:precursor_mass_tolerance <num>  Precursor mass tolerance (Da) for ms annotations (default: 
                                                  '0.5' min: '0.0')
  -merged_spectra:cos_similarity <num>            Cosine similarity threshold for merged_spectra output
                                                  (default: '0.9' min: '0.0')

Common TOPP options:
  -ini <file>                                     Use the given TOPP INI file
  -threads <n>                                    Sets the number of threads allowed to be used by the TOPP 
                                                  tool (default: '1')
  -write_ini <file>                               Writes the default configuration file
  --help                                          Shows options
  --helphelp                                      Shows all options (including advanced)

INI file documentation of this tool:

Legend: parameters tagged (required) are required; parameters tagged (advanced) are advanced.

+ GNPSExport: Tool to export representative consensus MS/MS scan per consensusElement into a .MGF file format. See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking_with_openms
  • version (value: 2.7.0): Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'GNPSExport'
    • in_cm (required): Input consensusXML file containing only consensusElements with "peptide" annotations. Restrictions: input file, *.consensusXML
    • in_mzml[] (required): Original mzml files containing the ms2 spectra (aka peptide annotation). Must be in order that the consensusXML file maps the original mzML files. Restrictions: input file, *.mzML
    • out (required): Output MGF file. Restrictions: output file, *.mgf
    • output_type (default: most_intense): Specificity of mgf output information. Restrictions: merged_spectra, most_intense
    • peptide_cutoff (advanced, default: 5): Number of most intense peptides to consider per consensus element; '-1' to consider all identifications. Restrictions: -1:∞
    • ms2_bin_size (default: 0.019999999552965): Bin size (Da) for fragment ions when merging ms2 scans. Restrictions: 0.0:∞
    • log (advanced): Name of log file (created only when specified)
    • debug (advanced, default: 0): Sets the debug level
    • threads (default: 1): Sets the number of threads allowed to be used by the TOPP tool
    • no_progress (advanced, default: false): Disables progress logging to command line. Restrictions: true, false
    • force (advanced, default: false): Overrides tool-specific checks. Restrictions: true, false
    • test (advanced, default: false): Enables the test mode (needed for internal use only). Restrictions: true, false
    +++ merged_spectra: Options for exporting mgf file with merged spectra per consensusElement
      • precursor_mass_tolerance (default: 0.5): Precursor mass tolerance (Da) for ms annotations. Restrictions: 0.0:∞
      • cos_similarity (default: 0.9): Cosine similarity threshold for merged_spectra output. Restrictions: 0.0:∞
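The nesting shown above (tool, instance '1', subsection merged_spectra) corresponds to colon-separated parameter paths such as merged_spectra:cos_similarity on the command line. The sketch below flattens a small XML fragment into such paths. The element and attribute names (PARAMETERS, NODE, ITEM, name, value) are an assumption about the INI file layout, and the fragment is hand-written for this example; a real file produced with -write_ini should be inspected before relying on this.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hand-written fragment modelled on the parameter tree above (assumed layout).
SAMPLE_INI = """<PARAMETERS>
  <NODE name="GNPSExport">
    <ITEM name="version" value="2.7.0" type="string"/>
    <NODE name="1">
      <ITEM name="output_type" value="most_intense" type="string"/>
      <ITEM name="ms2_bin_size" value="0.019999999552965" type="double"/>
      <NODE name="merged_spectra">
        <ITEM name="precursor_mass_tolerance" value="0.5" type="double"/>
        <ITEM name="cos_similarity" value="0.9" type="double"/>
      </NODE>
    </NODE>
  </NODE>
</PARAMETERS>"""


def flatten(node: ET.Element, prefix: str = "") -> Dict[str, str]:
    """Map colon-separated parameter paths to their string values."""
    params: Dict[str, str] = {}
    for child in node:
        path = prefix + child.get("name", "")
        if child.tag == "NODE":
            params.update(flatten(child, path + ":"))
        elif child.tag == "ITEM":
            params[path] = child.get("value", "")
    return params


params = flatten(ET.fromstring(SAMPLE_INI))
for path, value in params.items():
    print(path, "=", value)
```

Values are kept as strings here; converting them according to the `type` attribute is left out to keep the sketch short.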