OpenMS
GNPSExport

Export MS/MS data in .MGF format for GNPS (http://gnps.ucsd.edu).

GNPS (Global Natural Products Social Molecular Networking, http://gnps.ucsd.edu) is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry (MS/MS) data. The GNPS web platform supports spectral library search against public MS/MS spectral libraries, as well as data analyses such as MS/MS molecular networking, network annotation propagation, and Dereplicator-based annotation. The GNPS manuscript is available here: https://www.nature.com/articles/nbt.3597

This tool was developed for the Feature-Based Molecular Networking (FBMN) (https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/) and Ion Identity Molecular Networking (IIMN) (https://ccms-ucsd.github.io/GNPSDocumentation/fbmn-iin/) workflows.

Please cite: Nothias, L.-F., Petras, D., Schmid, R. et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods 17, 905–908 (2020).

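In brief, after running an OpenMS metabolomics pipeline, the GNPSExport TOPP tool can be used on the consensusXML file and the mzML files to generate the files needed for FBMN and IIMN. Those files are the MS/MS spectral file (.MGF), the feature quantification table (.txt) and, optionally, the supplementary pairs table for IIMN (.csv) and the meta value file (.tsv).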

A representative OpenMS-GNPS workflow would use the following OpenMS TOPP tools sequentially:

  • Input mzML files
  • Run the FeatureFinderMetabo tool on the mzML files.
  • Run MetaboliteAdductDecharger on the featureXML files (optional, for Ion Identity Molecular Networking).
  • Run the MapAlignerPoseClustering tool on the featureXML files.
    MapAlignerPoseClustering -in FFM_inputFile0.featureXML FFM_inputFile1.featureXML -out MapAlignerPoseClustering_inputFile0.featureXML MapAlignerPoseClustering_inputFile1.featureXML -trafo_out MapAlignerPoseClustering_inputFile0.trafoXML MapAlignerPoseClustering_inputFile1.trafoXML
  • Run the MapRTTransformer tool on the mzML files to transform retention times based on the feature map alignment by MapAlignerPoseClustering.
    MapRTTransformer -in inputFile0.mzML -out MapRTTransformer_inputFile0.mzML -trafo_in MapAlignerPoseClustering_inputFile0.trafoXML
    MapRTTransformer -in inputFile1.mzML -out MapRTTransformer_inputFile1.mzML -trafo_in MapAlignerPoseClustering_inputFile1.trafoXML
  • Run the IDMapper tool on the featureXML and mzML files.
    IDMapper -id emptyfile.idXML -in MapAlignerPoseClustering_inputFile0.featureXML -spectra:in MapRTTransformer_inputFile0.mzML -out IDMapper_inputFile0.featureXML
    IDMapper -id emptyfile.idXML -in MapAlignerPoseClustering_inputFile1.featureXML -spectra:in MapRTTransformer_inputFile1.mzML -out IDMapper_inputFile1.featureXML
  • Run the MetaboliteAdductDecharger tool on the featureXML files.
  • Run the FeatureLinkerUnlabeledKD tool (or FeatureLinkerUnlabeledQT) on the featureXML files to produce a consensusXML file.
    FeatureLinkerUnlabeledKD -in IDMapper_inputFile0.featureXML IDMapper_inputFile1.featureXML -out FeatureLinkerUnlabeledKD.consensusXML
  • Run the FileFilter tool on the consensusXML file to keep only consensusElements with at least one MS/MS scan (peptide identification).
    FileFilter -id:remove_unannotated_features -in FeatureLinkerUnlabeledKD.consensusXML -out FileFilter.consensusXML
  • Run the GNPSExport tool on the filtered consensusXML file to export an .MGF file. For each consensusElement in the consensusXML file, GNPSExport produces one representative consensus MS/MS spectrum (named peptide annotation in OpenMS jargon), which is appended to the MS/MS spectral file (.MGF file). Note that the parameters for the spectral file generation are defined in the GNPSExport INI parameters file, available here: https://ccms-ucsd.github.io/GNPSDocumentation/openms_gnpsexport/GNPSExport.ini
    GNPSExport -in_cm FileFilter.consensusXML -in_mzml MapRTTransformer_inputFile0.mzML MapRTTransformer_inputFile1.mzML -out GNPSExport_output.mgf -out_quantification FeatureQuantificationTable.txt -out_pairs SupplementaryPairsTable.csv -out_meta_values MetaValues.tsv
  • Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. Instructions can be found here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/
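
The command lines in the list above can be chained in a small script. The following is a minimal dry-run sketch in POSIX shell, not an official pipeline: it assumes the FeatureFinderMetabo outputs are already named FFM_<sample>.featureXML and that emptyfile.idXML exists, it leaves out the optional MetaboliteAdductDecharger step, and the run helper and the DRY_RUN switch are hypothetical names introduced only for this illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the alignment-to-export steps for two runs.
# Assumes FFM_<sample>.featureXML and emptyfile.idXML already exist.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 prints the commands, 0 also runs them

run() {
  # Print the command line; execute it only when DRY_RUN is 0.
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

samples="inputFile0 inputFile1"

# Build each file list once so every tool receives the samples in the same order.
ffm=""; aligned=""; trafos=""; mzmls=""; mapped=""
for s in $samples; do
  ffm="${ffm:+$ffm }FFM_${s}.featureXML"
  aligned="${aligned:+$aligned }MapAlignerPoseClustering_${s}.featureXML"
  trafos="${trafos:+$trafos }MapAlignerPoseClustering_${s}.trafoXML"
  mzmls="${mzmls:+$mzmls }MapRTTransformer_${s}.mzML"
  mapped="${mapped:+$mapped }IDMapper_${s}.featureXML"
done

# The unquoted list variables rely on word splitting (file names without spaces).
run MapAlignerPoseClustering -in $ffm -out $aligned -trafo_out $trafos
for s in $samples; do
  run MapRTTransformer -in "${s}.mzML" -out "MapRTTransformer_${s}.mzML" \
    -trafo_in "MapAlignerPoseClustering_${s}.trafoXML"
  run IDMapper -id emptyfile.idXML -in "MapAlignerPoseClustering_${s}.featureXML" \
    -spectra:in "MapRTTransformer_${s}.mzML" -out "IDMapper_${s}.featureXML"
done
run FeatureLinkerUnlabeledKD -in $mapped -out FeatureLinkerUnlabeledKD.consensusXML
run FileFilter -id:remove_unannotated_features \
  -in FeatureLinkerUnlabeledKD.consensusXML -out FileFilter.consensusXML
run GNPSExport -in_cm FileFilter.consensusXML -in_mzml $mzmls \
  -out GNPSExport_output.mgf -out_quantification FeatureQuantificationTable.txt \
  -out_pairs SupplementaryPairsTable.csv -out_meta_values MetaValues.tsv
```

Deriving every file list from the same sample list keeps the mzML files passed to -in_mzml in the same order as the featureXML files passed to the feature linker, which is the ordering GNPSExport expects. Setting DRY_RUN=0 would execute the tools, provided OpenMS is on the PATH.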

The GitHub page for the ProteoSAFe workflow and the OpenMS Python wrappers is available here: https://github.com/Bioinformatic-squad-DorresteinLab/openms-gnps-workflow

An online version of the OpenMS-GNPS pipeline for FBMN running on the CCMS server (http://proteomics.ucsd.edu/) is available here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms/

The command line parameters of this tool are:

GNPSExport -- Export representative consensus MS/MS scan per consensusElement into a .MGF file format.
See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_GNPSExport.html
Version: 3.2.0 Sep 18 2024, 16:00:56, Revision: e231942
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.
To cite GNPSExport:
 + Nothias L.F. et al.. Feature-based Molecular Networking in the GNPS Analysis Environment. bioRxiv 812404 
   (2019). doi:10.1101/812404.

Usage:
  GNPSExport <options>

Options (mandatory options marked with '*'):
  -in_cm <file>*                          Input consensusXML file containing only consensusElements with
                                          "peptide" annotations. (valid formats: 'consensusXML')
  -in_mzml <files>*                       Original mzml files containing the ms2 spectra (aka peptide
                                          annotation).
                                          Must be in order that the consensusXML file maps the original mzML
                                          files. (valid formats: 'mzML')
  -out <file>*                            Output MGF file. (valid formats: 'mgf')
  -out_quantification <file>*             Output feature quantification table. (valid formats: 'txt')
  -out_pairs <file>                       Output supplementary pairs table for IIMN. (valid formats: 'csv')
  -out_meta_values <file>                 Output meta value file. (valid formats: 'tsv')
                                          
  -output_type <choice>                   Specificity of mgf output information (default: 'most_intense') 
                                          (valid: 'merged_spectra', 'most_intense')
  -peptide_cutoff <number>                Number of most intense peptides to consider per consensus element; 
                                          '-1' to consider all identifications. (default: '5') (min: '-1')
  -ms2_bin_size <value>                   Bin size (Da) for fragment ions when merging ms2 scans. (default: 
                                          '0.019999999552965') (min: '0.0')

Options for exporting mgf file with merged spectra per consensusElement:
  -merged_spectra:cos_similarity <value>  Cosine similarity threshold for merged_spectra output. (default: 
                                          '0.9') (min: '0.0')

                                          
Common TOPP options:
  -ini <file>                             Use the given TOPP INI file
  -threads <n>                            Sets the number of threads allowed to be used by the TOPP tool
                                          (default: '1')
  -write_ini <file>                       Writes the default configuration file
  --help                                  Shows options
  --helphelp                              Shows all options (including advanced)
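
As an illustration of how the output options above combine, the following sketch assembles a GNPSExport call and adds the cosine threshold only when merged spectra are requested. It is a hedged example rather than a recommended configuration: the gnpsexport_cmd helper is hypothetical, the file names are taken from the workflow example, and the function only prints the command line so that it can be inspected before anything is run.

```shell
#!/bin/sh
# Sketch: assemble a GNPSExport call for a chosen -output_type.
set -eu

gnpsexport_cmd() {
  # $1 = output type, $2 = cosine threshold (arguments of this hypothetical helper).
  output_type="$1"; cos="$2"
  case "$output_type" in
    most_intense|merged_spectra) ;;   # the two valid choices listed in the help
    *) echo "unsupported output_type: $output_type" >&2; return 1 ;;
  esac
  set -- GNPSExport -in_cm FileFilter.consensusXML \
    -in_mzml MapRTTransformer_inputFile0.mzML MapRTTransformer_inputFile1.mzML \
    -out GNPSExport_output.mgf -out_quantification FeatureQuantificationTable.txt \
    -output_type "$output_type" -peptide_cutoff 5 -ms2_bin_size 0.02
  # The cosine similarity threshold is an option of the merged_spectra output.
  if [ "$output_type" = "merged_spectra" ]; then
    set -- "$@" -merged_spectra:cos_similarity "$cos"
  fi
  printf '%s\n' "$*"
}

gnpsexport_cmd merged_spectra 0.9
```

The printed line can be pasted into a terminal or passed to a wrapper script once the file names match the actual data.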

INI file documentation of this tool:

Legend: '*' marks a required parameter, '(advanced)' marks an advanced parameter.

+ GNPSExport: Export representative consensus MS/MS scan per consensusElement into a .MGF file format.
  See the documentation on https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms
    version (value: '3.2.0'): Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'GNPSExport'
      in_cm*: Input consensusXML file containing only consensusElements with "peptide" annotations. (input file; *.consensusXML)
      in_mzml[]*: Original mzml files containing the ms2 spectra (aka peptide annotation). Must be in order that the consensusXML file maps the original mzML files. (input file; *.mzML)
      out*: Output MGF file. (output file; *.mgf)
      out_quantification*: Output feature quantification table. (output file; *.txt)
      out_pairs: Output supplementary pairs table for IIMN. (output file; *.csv)
      out_meta_values: Output meta value file. (output file; *.tsv)
      output_type (default: 'most_intense'): specificity of mgf output information (valid: 'merged_spectra', 'most_intense')
      peptide_cutoff (default: '5'): Number of most intense peptides to consider per consensus element; '-1' to consider all identifications. (range: -1:∞)
      ms2_bin_size (default: '0.019999999552965'): Bin size (Da) for fragment ions when merging ms2 scans. (range: 0.0:∞)
      log (advanced): Name of log file (created only when specified)
      debug (default: '0') (advanced): Sets the debug level
      threads (default: '1'): Sets the number of threads allowed to be used by the TOPP tool
      no_progress (default: 'false') (advanced): Disables progress logging to command line (valid: 'true', 'false')
      force (default: 'false') (advanced): Overrides tool-specific checks (valid: 'true', 'false')
      test (default: 'false') (advanced): Enables the test mode (needed for internal use only) (valid: 'true', 'false')
    +++ merged_spectra: Options for exporting mgf file with merged spectra per consensusElement
        cos_similarity (default: '0.9'): Cosine similarity threshold for merged_spectra output. (range: 0.0:∞)
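
The parameters documented above can be stored in an INI file written with -write_ini and passed back with -ini. The sketch below shows one way to change a value in such a file from the shell. It rests on the assumption that each parameter is stored as an XML ITEM element with name and value attributes in that order; the two-line fragment is a simplified stand-in rather than real tool output, and set_ini_value is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: change one parameter value in a TOPP INI file with sed.
set -eu

ini="$(mktemp)"
# Simplified stand-in; a real file would come from: GNPSExport -write_ini GNPSExport.ini
cat > "$ini" <<'EOF'
<ITEM name="output_type" value="most_intense" type="string" />
<ITEM name="peptide_cutoff" value="5" type="int" />
EOF

set_ini_value() {
  # $1 = INI file, $2 = parameter name, $3 = new value (hypothetical helper).
  # Replaces only the value attribute of the ITEM whose name matches $2.
  sed "s|\(<ITEM name=\"$2\" value=\"\)[^\"]*|\1$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

set_ini_value "$ini" output_type merged_spectra
result="$(grep 'name="output_type"' "$ini")"
printf '%s\n' "$result"
rm -f "$ini"
```

With OpenMS installed, the edited file would then be supplied as GNPSExport -ini GNPSExport.ini together with the input and output file options. For anything beyond a single value, an XML-aware editor is the safer choice.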