Converts between different MS file formats.
potential predecessor tools | → FileConverter → | potential successor tools
GenericWrapper (e.g. for calling external converters) | | any tool operating on the output format
any vendor software exporting supported formats (e.g. mzML) | |
The main use of this tool is to convert data from external sources to the formats used by OpenMS/TOPP. Most importantly, data from MS experiments in a number of different formats can be converted to mzML, the canonical file format used by OpenMS/TOPP for experimental data. (mzML is the PSI-approved format and supports traceability of analysis steps.)
Thermo raw files can be converted to mzML using the ThermoRawFileParser provided in the THIRDPARTY folder. On Windows, a recent .NET framework needs to be installed. On Linux and macOS, the mono runtime needs to be present and accessible via the -NET_executable parameter. The path to the ThermoRawFileParser can be set via the -ThermoRaw_executable option.
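As a hedged illustration of the platform rule above, the following Python sketch assembles (but does not run) a FileConverter call for a Thermo raw file. The flag names -NET_executable and -ThermoRaw_executable are taken from the paragraph above; the helper name build_raw_command, the file names, and the mono path are hypothetical. The INI documentation lists both parameters under the RawToMzML subsection, so the installed version may expect a different spelling on the command line; checking FileConverter --helphelp first is advisable.

```python
import platform
from typing import List, Optional


def build_raw_command(raw_file: str, mzml_file: str, parser_exe: str,
                      net_exe: Optional[str] = None,
                      system: Optional[str] = None) -> List[str]:
    """Assemble an argument list for converting a Thermo raw file to mzML.

    `system` defaults to the current platform name; it is a parameter only
    so that the platform rule can be exercised on any machine.
    """
    system = system or platform.system()
    cmd = ["FileConverter", "-in", raw_file, "-out", mzml_file,
           "-ThermoRaw_executable", parser_exe]
    if system != "Windows":
        # Linux and macOS need the mono runtime to start the parser.
        if net_exe is None:
            raise ValueError("a mono executable is required on Linux and macOS")
        cmd += ["-NET_executable", net_exe]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, shown for illustration only.
    print(" ".join(build_raw_command(
        "sample.raw", "sample.mzML", "THIRDPARTY/ThermoRawFileParser.exe",
        net_exe="/usr/bin/mono", system="Linux")))
```

The resulting list could be handed to subprocess.run once the paths have been verified on the target machine.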
For MaxQuant-flavoured mzXML the use of the advanced option '-force_MaxQuant_compatibility' is recommended.
Many different format conversions are supported, and some may be more useful than others. Depending on the file formats involved, information can be lost during conversion, e.g. when converting featureXML to mzData. In such cases a warning is shown.
The input and output file types are determined from the file extensions or from the first few lines of the files. If file type determination is not possible, the input or output file type has to be given explicitly.
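The extension-based half of this rule can be sketched in Python as follows. This is a minimal illustration, not the tool's own logic: it only looks at file extensions (FileConverter can additionally inspect file content), and the names guess_type and build_command are hypothetical. The type lists are copied from the valid values of -in_type and -out_type in the usage output.

```python
from pathlib import Path
from typing import List, Optional

# Valid values as listed for -in_type and -out_type in the usage output.
INPUT_TYPES = ["mzML", "mzXML", "mgf", "raw", "cachedMzML", "mzData", "dta",
               "dta2d", "featureXML", "consensusXML", "ms2", "fid", "tsv",
               "peplist", "kroenik", "edta", "oms"]
OUTPUT_TYPES = ["mzML", "mzXML", "cachedMzML", "mgf", "featureXML",
                "consensusXML", "edta", "mzData", "dta2d", "csv", "sqmass",
                "oms"]


def guess_type(path: str, valid_types: List[str]) -> Optional[str]:
    """Return the type whose name matches the file extension, ignoring case."""
    ext = Path(path).suffix.lstrip(".").lower()
    for name in valid_types:
        if name.lower() == ext:
            return name
    return None


def build_command(in_file: str, out_file: str,
                  in_type: Optional[str] = None,
                  out_type: Optional[str] = None) -> List[str]:
    """Assemble a FileConverter call, requiring explicit types when the
    extension alone does not identify the format."""
    cmd = ["FileConverter", "-in", in_file, "-out", out_file]
    for flag, path, explicit, valid in (
        ("-in_type", in_file, in_type, INPUT_TYPES),
        ("-out_type", out_file, out_type, OUTPUT_TYPES),
    ):
        if explicit is not None:
            if explicit not in valid:
                raise ValueError(f"{explicit!r} is not valid for {flag}")
            cmd += [flag, explicit]
        elif guess_type(path, valid) is None:
            raise ValueError(f"cannot infer type of {path!r}; pass {flag}")
    return cmd
```

For example, build_command("spectra.dat", "out.mzML", in_type="mgf") adds -in_type mgf because the .dat extension does not identify the input format.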
Conversion with the same output as input format is supported. In some cases, this can be helpful to remove errors from files (e.g. the index), to update file formats to new versions, or to check whether information is lost upon reading or writing.
Some information about the supported input types: mzML, mzXML, mzData, mgf, dta2d, dta, featureXML, consensusXML, ms2, fid/XMASS, tsv, peplist, kroenik, edta, sqmass, oms
Note: See IDFileConverter for similar functionality for protein/peptide identification file formats.
The command line parameters of this tool are:
FileConverter -- Converts between different MS file formats.
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_FileConverter.html
Version: 3.2.0 Nov 26 2024, 13:16:38, Revision: 962e60f
To cite OpenMS:
+ Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.
Usage:
FileConverter <options>
Options (mandatory options marked with '*'):
-in <file>* Input file to convert. (valid formats: 'mzML', 'mzXML', 'mgf', 'raw', 'cachedMzML', 'mzData', 'dta', 'dta2d', 'featureXML', 'consensusXML', 'ms2', 'fid', 'tsv', 'peplist', 'kroenik', 'edta', 'oms')
-in_type <type> Input file type -- default: determined from file extension or content
(valid: 'mzML', 'mzXML', 'mgf', 'raw', 'cachedMzML', 'mzData', 'dta', 'dta2d', 'featureXML', 'consensusXML', 'ms2', 'fid', 'tsv', 'peplist', 'kroenik', 'edta', 'oms')
-out <file>* Output file (valid formats: 'mzML', 'mzXML', 'cachedMzML', 'mgf', 'featureXML', 'consensusXML', 'edta', 'mzData', 'dta2d', 'csv', 'sqmass', 'oms')
-out_type <type> Output file type -- default: determined from file extension or content
Note: that not all conversion paths work or make sense. (valid: 'mzML', 'mzXML', 'cachedMzML', 'mgf', 'featureXML', 'consensusXML', 'edta', 'mzData', 'dta2d', 'csv', 'sqmass', 'oms')
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default: '1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
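The -write_ini and -ini options listed above suggest a two-step workflow: write the default configuration once, edit it, then reuse it for each conversion. The following Python sketch only assembles the two argument lists; the helper names and file names are hypothetical, and it assumes that options given on the command line can be combined with -ini as shown.

```python
from typing import List


def write_default_ini(ini_file: str) -> List[str]:
    """Argument list that asks FileConverter to write its default INI file."""
    return ["FileConverter", "-write_ini", ini_file]


def convert_with_ini(ini_file: str, in_file: str, out_file: str,
                     threads: int = 1) -> List[str]:
    """Argument list for a conversion that reads settings from an INI file."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return ["FileConverter", "-ini", ini_file,
            "-in", in_file, "-out", out_file,
            "-threads", str(threads)]


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(write_default_ini("FileConverter.ini")))
    print(" ".join(convert_with_ini("FileConverter.ini",
                                    "run1.mzXML", "run1.mzML", threads=4)))
```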
INI file documentation of this tool:
+ FileConverter: Converts between different MS file formats.
  version (default: 3.2.0)
    Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'FileConverter'
    in (no default)
      Input file to convert.
      Tags: input file
      Restrictions: *.mzML, *.mzXML, *.mgf, *.raw, *.cachedMzML, *.mzData, *.dta, *.dta2d, *.featureXML, *.consensusXML, *.ms2, *.fid, *.tsv, *.peplist, *.kroenik, *.edta, *.oms
    in_type (no default)
      Input file type -- default: determined from file extension or content
      Restrictions: mzML, mzXML, mgf, raw, cachedMzML, mzData, dta, dta2d, featureXML, consensusXML, ms2, fid, tsv, peplist, kroenik, edta, oms
    UID_postprocessing (default: ensure)
      unique ID post-processing for output data.
      'none' keeps current IDs even if invalid.
      'ensure' keeps current IDs but reassigns invalid ones.
      'reassign' assigns new unique IDs.
      Restrictions: none, ensure, reassign
    out (no default)
      Output file
      Tags: output file
      Restrictions: *.mzML, *.mzXML, *.cachedMzML, *.mgf, *.featureXML, *.consensusXML, *.edta, *.mzData, *.dta2d, *.csv, *.sqmass, *.oms
    out_type (no default)
      Output file type -- default: determined from file extension or content
      Note: that not all conversion paths work or make sense.
      Restrictions: mzML, mzXML, cachedMzML, mgf, featureXML, consensusXML, edta, mzData, dta2d, csv, sqmass, oms
    TIC_DTA2D (default: false)
      Export the TIC instead of the entire experiment in mzML/mzData/mzXML -> DTA2D conversions.
      Restrictions: true, false
    MGF_compact (default: false)
      Use a more compact format when writing MGF (no zero-intensity peaks, limited number of decimal places)
      Restrictions: true, false
    force_MaxQuant_compatibility (default: false)
      [mzXML output only] Make sure that MaxQuant can read the mzXML and set the msManufacturer to 'Thermo Scientific'.
      Restrictions: true, false
    force_TPP_compatibility (default: false)
      [mzML output only] Make sure that TPP parsers can read the mzML and the precursor ion m/z in the file (otherwise it will be set to zero by the TPP).
      Restrictions: true, false
    convert_to_chromatograms (default: false)
      [mzML output only] Assumes that the provided spectra represent data in SRM mode or targeted MS1 mode and converts them to chromatogram data.
      Restrictions: true, false
    write_scan_index (default: true)
      Append an index when writing mzML or mzXML files. Some external tools might rely on it.
      Restrictions: true, false
    lossy_compression (default: false)
      Use numpress compression to achieve optimally small file size using linear compression for m/z domain and slof for intensity and float data arrays (attention: may cause small loss of precision; only for mzML data).
      Restrictions: true, false
    lossy_mass_accuracy (default: -1.0)
      Desired (absolute) m/z accuracy for lossy compression (e.g. use 0.0001 for a mass accuracy of 0.2 ppm at 500 m/z, default uses -1.0 for maximal accuracy).
    process_lowmemory (default: false)
      Whether to process the file on the fly without loading the whole file into memory first (only for conversions of mzXML/mzML to mzML).
      Note: this flag will prevent conversion from spectra to chromatograms.
      Restrictions: true, false
    log (no default)
      Name of log file (created only when specified)
    debug (default: 0)
      Sets the debug level
    threads (default: 1)
      Sets the number of threads allowed to be used by the TOPP tool
    no_progress (default: false)
      Disables progress logging to command line
      Restrictions: true, false
    force (default: false)
      Overrides tool-specific checks
      Restrictions: true, false
    test (default: false)
      Enables the test mode (needed for internal use only)
      Restrictions: true, false
    +++ RawToMzML: Options for converting raw files to mzML (uses ThermoRawFileParser)
      NET_executable (no default)
        The .NET framework executable. Only required on linux and mac.
        Tags: input file, is_executable
      ThermoRaw_executable (default: ThermoRawFileParser.exe)
        The ThermoRawFileParser executable.
        Tags: input file, is_executable
        Restrictions: *.exe
      no_peak_picking (default: false)
        Disables vendor peak picking for raw files.
        Restrictions: true, false
      no_zlib_compression (default: false)
        Disables zlib compression for raw file conversion. Enables compatibility with some tools that do not support compressed input files, e.g. X!Tandem.
        Restrictions: true, false
      include_noise (default: false)
        Include noise data in mzML output.
        Restrictions: true, false
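The description of lossy_mass_accuracy relates an absolute m/z tolerance to a relative one (0.0001 at 500 m/z corresponds to 0.2 ppm). The arithmetic behind that statement can be sketched as follows; the function names are hypothetical, and the code only performs the unit conversion, not any compression.

```python
def absolute_to_ppm(abs_accuracy: float, mz: float) -> float:
    """Convert an absolute m/z accuracy to parts per million at a given m/z."""
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return abs_accuracy / mz * 1e6


def ppm_to_absolute(ppm: float, mz: float) -> float:
    """Convert a ppm accuracy at a given m/z to an absolute m/z accuracy."""
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return ppm * mz / 1e6


if __name__ == "__main__":
    # 0.0001 / 500 * 1e6 is approximately 0.2 ppm, matching the description.
    print(absolute_to_ppm(0.0001, 500.0))
    # The same absolute accuracy is a tighter relative bound at higher m/z.
    print(absolute_to_ppm(0.0001, 1000.0))
```

Because the setting is absolute, the relative accuracy it implies varies across the m/z range: the ppm value is larger at low m/z and smaller at high m/z.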