OpenMS  2.5.0
Digestor

Digests a protein database in-silico.

potential predecessor tools   →   Digestor   →   potential successor tools
none (FASTA input)                               IDFilter (peptide blacklist)

This application digests a protein database in silico and reports all peptides produced by the selected cleavage enzyme, subject to the missed-cleavage and peptide-length limits.
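
To make the idea of an in-silico digest concrete, here is a minimal Python sketch. It is not the OpenMS implementation: the cleavage rule (cut after K or R unless the next residue is P) is a common simplification of trypsin specificity and is an assumption here, and the names `digest` and `cleavage_points` are hypothetical. The keyword defaults mirror the documented defaults of `-missed_cleavages`, `-min_length` and `-max_length`.

```python
import re

# Assumption: simplified trypsin rule, cleave C-terminal of K or R unless
# the following residue is P. OpenMS may define its enzymes differently.
CLEAVAGE = re.compile(r"[KR](?!P)")


def cleavage_points(sequence):
    """Return the sorted fragment boundaries of `sequence`, including both ends."""
    points = [0]
    points += [match.end() for match in CLEAVAGE.finditer(sequence)]
    if points[-1] != len(sequence):
        points.append(len(sequence))
    return points


def digest(sequence, missed_cleavages=1, min_length=6, max_length=40):
    """Return all peptides within the length limits, in order of start position."""
    points = cleavage_points(sequence)
    last = len(points) - 1
    peptides = []
    for i in range(last):
        # Spanning k internal cleavage sites corresponds to k missed cleavages.
        for j in range(i + 1, min(i + 1 + missed_cleavages, last) + 1):
            peptide = sequence[points[i]:points[j]]
            if min_length <= len(peptide) <= max_length:
                peptides.append(peptide)
    return peptides
```

Calling `digest(protein_sequence)` on each entry of a FASTA file would yield a peptide list comparable in spirit to the tool's output, although details such as duplicate handling may differ.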

The output can be used, for example, as blacklist input to IDFilter in order to remove certain peptides.
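
The blacklist idea can be illustrated with a short Python sketch. It assumes the digest was written as FASTA and that blacklisting means exact sequence matching; how IDFilter actually compares peptides (for example with respect to modifications) is not described here. Both function names are hypothetical.

```python
def read_fasta_sequences(lines):
    """Yield one sequence per FASTA record, ignoring the header text."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def remove_blacklisted(peptides, blacklist):
    """Return the peptides that do not occur in the blacklist (exact match)."""
    blocked = set(blacklist)
    return [peptide for peptide in peptides if peptide not in blocked]
```

In practice the lines would come from an open file handle, e.g. `read_fasta_sequences(open("peptides.fasta"))`, where the file name is only a placeholder.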

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

Digestor -- Digests a protein database in-silico.
Full documentation: http://www.openms.de/documentation/UTILS_Digestor.html
Version: 2.5.0-nightly-2020-03-06 Mar  7 2020, 01:22:16, Revision: 84b1398
To cite OpenMS:
  Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  Digestor <options>

Options (mandatory options marked with '*'):
  -in <file>*                  Input file (valid formats: 'fasta')
  -out <file>*                 Output file (peptides) (valid formats: 'idXML', 'fasta')
  -out_type <type>             Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid:
                               'idXML', 'fasta')
  -missed_cleavages <number>   The number of allowed missed cleavages (default: '1' min: '0')
  -min_length <number>         Minimum length of peptide (default: '6')
  -max_length <number>         Maximum length of peptide (default: '40')
  -enzyme <string>             The type of digestion enzyme (default: 'Trypsin' valid: 'Arg-C/P',
                               '2-iodobenzoate', 'iodosobenzoate', 'staphylococcal protease/D',
                               'proline-endopeptidase/HKR', 'Glu-C+P', 'PepsinA + P', 'cyanogen-bromide',
                               'Clostripain/P', 'elastase-trypsin-chymotrypsin', 'no cleavage',
                               'unspecific cleavage', 'Trypsin', 'Chymotrypsin', 'Chymotrypsin/P', 'V8-E',
                               'leukocyte elastase', 'proline endopeptidase', 'glutamyl endopeptidase',
                               'Alpha-lytic protease', 'Asp-N', 'Asp-N/B', 'Lys-N', 'Lys-C/P', 'PepsinA',
                               'CNBr', 'Formic_acid', 'Lys-C', 'TrypChymo', 'Trypsin/P', 'V8-DE', 'Arg-C',
                               'Asp-N_ambic')

Options for FASTA output files:
  -FASTA:ID <option>           Identifier to use for each peptide: copy from parent protein (parent); a
                               consecutive number (number); parent ID + consecutive number (both)
                               (default: 'parent' valid: 'parent', 'number', 'both')
  -FASTA:description <option>  Keep or remove the (possibly lengthy) FASTA header description. Keeping it 
                               can increase resulting FASTA file significantly. (default: 'remove' valid:
                               'remove', 'keep')

                               
Common UTIL options:
  -ini <file>                  Use the given TOPP INI file
  -threads <n>                 Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>            Writes the default configuration file
  --help                       Shows options
  --helphelp                   Shows all options (including advanced)
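
As an illustration of how the options above combine, the following Python sketch assembles a Digestor command line as an argument list. Only flags shown in the help text are used; the helper name `build_digestor_command` and the file names in the usage comment are hypothetical, and the enzyme name is passed through without checking it against the list of valid values.

```python
VALID_FASTA_IDS = ("parent", "number", "both")


def build_digestor_command(in_file, out_file, enzyme="Trypsin",
                           missed_cleavages=1, min_length=6, max_length=40,
                           fasta_id=None):
    """Return an argv list for Digestor (illustrative helper, not part of OpenMS)."""
    if missed_cleavages < 0:
        # The help text documents a minimum of 0 for this option.
        raise ValueError("missed_cleavages must be >= 0")
    command = [
        "Digestor",
        "-in", in_file,
        "-out", out_file,
        "-missed_cleavages", str(missed_cleavages),
        "-min_length", str(min_length),
        "-max_length", str(max_length),
        "-enzyme", enzyme,
    ]
    if fasta_id is not None:
        if fasta_id not in VALID_FASTA_IDS:
            raise ValueError("fasta_id must be one of %s" % (VALID_FASTA_IDS,))
        command += ["-FASTA:ID", fasta_id]
    return command


# Usage (assumes the Digestor executable is on PATH):
#   import subprocess
#   subprocess.run(build_digestor_command("proteins.fasta", "peptides.idXML"), check=True)
```

Passing the arguments as a list avoids shell quoting problems with enzyme names that contain spaces, such as 'PepsinA + P'.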

INI file documentation of this tool:

Legend: '*' marks a required parameter.

+ Digestor   Digests a protein database in-silico.
  version              Version of the tool that generated this parameters file. (value:
                       '2.5.0-nightly-2020-03-06')
  ++ 1   Instance '1' section for 'Digestor'
    in*                input file (tag: input file; supported formats: '*.fasta')
    out*               Output file (peptides) (tag: output file; supported formats: '*.idXML', '*.fasta')
    out_type           Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid:
                       'idXML', 'fasta')
    missed_cleavages   The number of allowed missed cleavages (default: '1' range: 0:∞)
    min_length         Minimum length of peptide (default: '6')
    max_length         Maximum length of peptide (default: '40')
    enzyme             The type of digestion enzyme (default: 'Trypsin' valid: Arg-C/P, 2-iodobenzoate,
                       iodosobenzoate, staphylococcal protease/D, proline-endopeptidase/HKR, Glu-C+P,
                       PepsinA + P, cyanogen-bromide, Clostripain/P, elastase-trypsin-chymotrypsin,
                       no cleavage, unspecific cleavage, Trypsin, Chymotrypsin, Chymotrypsin/P, V8-E,
                       leukocyte elastase, proline endopeptidase, glutamyl endopeptidase,
                       Alpha-lytic protease, Asp-N, Asp-N/B, Lys-N, Lys-C/P, PepsinA, CNBr, Formic_acid,
                       Lys-C, TrypChymo, Trypsin/P, V8-DE, Arg-C, Asp-N_ambic)
    log                Name of log file (created only when specified)
    debug              Sets the debug level (default: '0')
    threads            Sets the number of threads allowed to be used by the TOPP tool (default: '1')
    no_progress        Disables progress logging to command line (default: 'false' valid: 'true', 'false')
    force              Overwrite tool specific checks. (default: 'false' valid: 'true', 'false')
    test               Enables the test mode (needed for internal use only) (default: 'false' valid:
                       'true', 'false')
    +++ FASTA   Options for FASTA output files
      ID               Identifier to use for each peptide: copy from parent protein (parent); a
                       consecutive number (number); parent ID + consecutive number (both) (default:
                       'parent' valid: 'parent', 'number', 'both')
      description      Keep or remove the (possibly lengthy) FASTA header description. Keeping it can
                       increase resulting FASTA file significantly. (default: 'remove' valid: 'remove',
                       'keep')
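
The effect of the FASTA ID and description settings on peptide headers can be sketched in Python as follows. This is only an illustration of the three documented ID modes: the function name `peptide_header` is hypothetical, and the underscore used to join parent ID and number in 'both' mode is an assumption, since the exact format written by Digestor is not stated here.

```python
def peptide_header(parent_id, parent_description, index,
                   id_mode="parent", description="remove", separator="_"):
    """Build a FASTA header line for one peptide (illustrative only)."""
    if id_mode == "parent":
        identifier = parent_id
    elif id_mode == "number":
        identifier = str(index)
    elif id_mode == "both":
        # Assumption: the real separator between ID and number may differ.
        identifier = "%s%s%d" % (parent_id, separator, index)
    else:
        raise ValueError("id_mode must be 'parent', 'number' or 'both'")
    if description not in ("remove", "keep"):
        raise ValueError("description must be 'remove' or 'keep'")
    if description == "keep" and parent_description:
        return ">%s %s" % (identifier, parent_description)
    return ">%s" % identifier
```

With 'keep', every peptide repeats its parent's description, which is why the documentation warns that the resulting FASTA file can grow considerably.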