OpenMS
Identifies peptides in MS/MS spectra via Mascot.
potential predecessor tools | → MascotAdapter → | potential successor tools |
---|---|---|
any signal-/preprocessing tool (in mzML format) | | IDFilter or any protein/peptide processing tool |
This wrapper application retrieves peptide identifications for MS/MS spectra. It uses a local installation of the Mascot server to generate the identifications. A second wrapper, MascotAdapterOnline, performs identifications by communicating with a Mascot server over the network, so it does not have to run on the same machine as Mascot.
The minimum version of Mascot supported by this wrapper is 2.1.
This wrapper can be executed in three different modes:
1. The whole process of ProteinIdentification via Mascot is executed. The input file is an mzData file containing the MS/MS spectra for which identifications are to be found. The results are written to an idXML output file. This mode is selected by default.
2. Only the first part of the ProteinIdentification process is performed: the MS/MS data is transformed into Mascot Generic Format (mgf), which can be used directly with Mascot. The Mascot process is then called from within the cgi directory of the Mascot installation; consult your Mascot reference manual for further details. This mode is selected by the -mascot_in option in the command line.
3. Only the second part of the ProteinIdentification process is performed: the output file of the Mascot server is translated into idXML. This mode is selected by the -mascot_out option in the command line.
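If your Mascot server is installed on the same computer as the TOPP applications, MascotAdapter can be executed in mode 1. Otherwise, the Mascot engine has to be executed manually, assisted by mode 2 and mode 3. The ProteinIdentification steps are then: run MascotAdapter with -mascot_in to write the mgf file, run Mascot manually on that file, and run MascotAdapter with -mascot_out to translate the Mascot results into idXML.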
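The split workflow can be sketched as a small Python helper that assembles the two MascotAdapter calls around the manual Mascot run. This is a minimal sketch, not part of OpenMS: the function name and the file names are hypothetical, and it assumes that -mascot_in and -mascot_out are plain switches that take no value. The Mascot call itself is left out because its exact form depends on your installation.

```python
import shlex


def split_workflow_commands(spectra, mgf, mascot_results, idxml):
    """Return the MascotAdapter calls for mode 2 and mode 3 as argument lists.

    All four file names are supplied by the caller; nothing is executed here.
    """
    # Mode 2: convert the MS/MS spectra (mzData) into Mascot Generic Format.
    to_mgf = ["MascotAdapter", "-mascot_in", "-in", spectra, "-out", mgf]
    # Between the two calls, Mascot is run manually on the mgf file and
    # produces the results file that mode 3 reads.
    # Mode 3: translate the Mascot results file into idXML.
    to_idxml = ["MascotAdapter", "-mascot_out", "-in", mascot_results, "-out", idxml]
    return to_mgf, to_idxml


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    first, last = split_workflow_commands(
        "spectra.mzData", "spectra.mgf", "results.mascotXML", "ids.idXML"
    )
    print(shlex.join(first))
    print("# run Mascot manually on spectra.mgf here")
    print(shlex.join(last))
```

Each returned list can be passed to `subprocess.run` on a machine where the TOPP tools are installed.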
For mode 1 you have to specify the directory in which the Mascot server is installed. This is done by setting the option mascot_directory in the ini file. You also have to specify a folder in which the user has write permissions. This is done by setting the option temp_data_directory in the ini file. Two temporary files are created in this directory during execution and deleted when execution ends.
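The two directory requirements for mode 1 can be checked before the tool is started. The following Python sketch is an illustration under stated assumptions: the function name and file names are hypothetical, and the directories are passed as the -mascot_directory and -temp_data_directory options from the command line help rather than through an ini file.

```python
import os


def mode1_command(spectra, idxml, mascot_directory, temp_data_directory):
    """Build the mode 1 call after checking both directories.

    Raises FileNotFoundError if either directory is missing and
    PermissionError if the temporary directory is not writable.
    """
    if not os.path.isdir(mascot_directory):
        raise FileNotFoundError(f"Mascot directory not found: {mascot_directory}")
    if not os.path.isdir(temp_data_directory):
        raise FileNotFoundError(f"temp directory not found: {temp_data_directory}")
    # MascotAdapter writes two temporary files here, so write access is needed.
    if not os.access(temp_data_directory, os.W_OK):
        raise PermissionError(f"temp directory not writable: {temp_data_directory}")
    return [
        "MascotAdapter",
        "-in", spectra,
        "-out", idxml,
        "-mascot_directory", mascot_directory,
        "-temp_data_directory", temp_data_directory,
    ]
```

Failing early in this way gives a clearer message than a failed search when the temporary folder is misconfigured.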
The command line parameters of this tool are:
    MascotAdapter -- Annotates MS/MS spectra using Mascot.
    Full documentation: http://www.openms.de/doxygen/release/3.3.0/html/TOPP_MascotAdapter.html
    Version: 3.3.0 Dec 21 2024, 15:25:20, Revision: 35c5e65
    To cite OpenMS:
     + Pfeuffer, J., Bielow, C., Wein, S. et al.. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

    Usage:
      MascotAdapter <options>

    Options (mandatory options marked with '*'):
      -in <file>*  Input file in mzData format. Note: In mode 'mascot_out' a Mascot results file (.mascotXML) is read (valid formats: 'mzData', 'mascotXML')
      -out <file>*  Output file in idXML format. Note: In mode 'mascot_in' Mascot generic format is written. (valid formats: 'idXML', 'mgf')
      -out_type <type>  Output file type (for TOPPAS) (valid: 'idXML', 'mgf')
      -instrument <i>  The instrument that was used to measure the spectra (default: 'Default')
      -precursor_mass_tolerance <tol>  The precursor mass tolerance (default: '2.0')
      -peak_mass_tolerance <tol>  The peak mass tolerance (default: '1.0')
      -taxonomy <tax>  The taxonomy (default: 'All entries') (valid: 'All entries', '. . Archaea (Archaeobacteria)', '. . Eukaryota (eucaryotes)', '. . . . Alveolata (alveolates)', '. . . . . . Plasmodium falciparum (malaria parasite)', '. . . . . . Other Alveolata', '. . . . Metazoa (Animals)', '. . . . . . Caenorhabditis elegans', '. . . . . . Drosophila (fruit flies)', '. . . . . . Chordata (vertebrates and relatives)', '. . . . . . . . bony vertebrates', '. . . . . . . . . . lobe-finned fish and tetrapod clade', '. . . . . . . . . . . . Mammalia (mammals)', '. . . . . . . . . . . . . . Primates', '. . ... ilable')
      -modifications <mods>  The modifications i.e. Carboxymethyl (C)
      -variable_modifications <mods>  The variable modifications i.e. Carboxymethyl (C)
      -charges [1+ 2+ ...]  The different charge states (default: '[1+ 2+ 3+]')
      -db <name>  The database to search in (default: 'MSDB')
      -hits <num>  The number of hits to report (default: 'AUTO')
      -cleavage <enz>  The enzyme descriptor to the enzyme used for digestion. (Trypsin is default, None would be best for peptide input or unspecific digestion, for more please refer to your mascot server). (default: 'Trypsin') (valid: 'Trypsin', 'Arg-C', 'Asp-N', 'Asp-N_ambic', 'Chymotrypsin', 'CNBr', 'CNBr+Trypsin', 'Formic_acid', 'Lys-C', 'Lys-C/P', 'PepsinA', 'Tryp-CNBr', 'TrypChymo', 'Trypsin/P', 'V8-DE', 'V8-E', 'semiTrypsin', 'LysC+AspN', 'None')
      -missed_cleavages <num>  Number of allowed missed cleavages (default: '0') (min: '0')
      -sig_threshold <num>  Significance threshold (default: '0.05')
      -pep_homol <num>  Peptide homology threshold (default: '1.0')
      -pep_ident <num>  Peptide ident threshold (default: '1.0')
      -pep_rank <num>  Peptide rank (default: '1')
      -prot_score <num>  Protein score (default: '1.0')
      -pep_score <num>  Peptide score (default: '1.0')
      -pep_exp_z <num>  Peptide expected charge (default: '1')
      -show_unassigned <num>  Show_unassigned (default: '1')
      -first_dim_rt <num>  Additional information which is added to every peptide identification as metavalue if set > 0 (default: '0.0')
      -boundary <string>  MIME boundary for mascot output format
      -mass_type <type>  Mass type (default: 'Monoisotopic') (valid: 'Monoisotopic', 'Average')
      -mascot_directory <dir>  The directory in which mascot is located
      -temp_data_directory <dir>  A directory in which some temporary files can be stored

    Common TOPP options:
      -ini <file>  Use the given TOPP INI file
      -threads <n>  Sets the number of threads allowed to be used by the TOPP tool (default: '1')
      -write_ini <file>  Writes the default configuration file
      --help  Shows options
      --helphelp  Shows all options (including advanced)
INI file documentation of this tool:
You can specify the Mascot parameters precursor_mass_tolerance (the peptide mass tolerance), peak_mass_tolerance (the MS/MS tolerance), taxonomy (restriction to a certain subset of the database), modifications, variable_modifications, charges (the possible charge variants), db (database where the peptides are searched in), hits (number of hits), cleavage (the cleavage enzyme), missed_cleavages (number of missed cleavages) and mass_type (Monoisotopic or Average) via the ini file.
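The same search parameters also appear as command line options in the help output above. As a hedged illustration, the Python sketch below turns a dictionary of these parameters into command line arguments. The helper name is hypothetical, and it assumes that list-valued options such as charges are passed as separate space-separated tokens after the option name; check this against your OpenMS version.

```python
# Parameter names taken from the description above.
SEARCH_PARAMETERS = {
    "precursor_mass_tolerance", "peak_mass_tolerance", "taxonomy",
    "modifications", "variable_modifications", "charges", "db",
    "hits", "cleavage", "missed_cleavages", "mass_type",
}


def search_args(params):
    """Convert a {name: value} mapping into a flat list of CLI arguments."""
    args = []
    for name, value in params.items():
        if name not in SEARCH_PARAMETERS:
            raise ValueError(f"unknown Mascot search parameter: {name}")
        args.append(f"-{name}")
        if isinstance(value, (list, tuple)):
            # Assumption: list options take one token per element.
            args.extend(str(item) for item in value)
        else:
            args.append(str(value))
    return args


if __name__ == "__main__":
    # Values are the defaults shown in the help output, except missed_cleavages.
    print(search_args({
        "precursor_mass_tolerance": 2.0,
        "peak_mass_tolerance": 1.0,
        "charges": ["1+", "2+", "3+"],
        "db": "MSDB",
        "cleavage": "Trypsin",
        "missed_cleavages": 1,
        "mass_type": "Monoisotopic",
    }))
```

Rejecting unknown names keeps typos from being passed silently to the tool.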
Known problems with Mascot server execution:
Error message: "FATAL_ERROR: M00327 The ms-monitor daemon/service is not running, please start it."