OpenMS
Digestor

Digests a protein database in-silico.

potential predecessor tools    → Digestor →    potential successor tools
none (FASTA input)                             IDFilter (peptide blacklist)

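This application digests a protein database in silico and reports all peptides that the chosen cleavage enzyme produces, subject to the allowed number of missed cleavages and the minimum and maximum peptide length.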
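To make the digestion step concrete, here is a minimal Python sketch of an in-silico digest. It is not the OpenMS implementation. It assumes the commonly used trypsin rule (cleave C-terminal to K or R unless the next residue is P), and the function name `digest` and the example sequence are hypothetical. The keyword arguments mirror the tool's `-missed_cleavages`, `-min_length` and `-max_length` defaults.

```python
import re


def digest(sequence, missed_cleavages=1, min_length=6, max_length=40):
    """Return peptides from a simplified tryptic digest of `sequence`.

    Assumed rule (not taken from OpenMS): cleave after K or R,
    except when the following residue is P.
    """
    # Positions directly after each cleavable residue.
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    # Fragment boundaries; a site at the very end would yield an empty fragment.
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]

    peptides = []
    for i in range(len(bounds) - 1):
        # Joining up to `missed_cleavages` additional fragments
        # models cleavage sites the enzyme skipped.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptide = sequence[bounds[i]:bounds[j]]
            if min_length <= len(peptide) <= max_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence; min_length is lowered so short fragments are shown.
    for peptide in digest("AAAKGGGRPCCCKDDD", missed_cleavages=1, min_length=3):
        print(peptide)
```

With `missed_cleavages=0` each fragment is reported on its own; every additional allowed missed cleavage also reports one more level of joined neighbouring fragments.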

The output can be used, for example, as blacklist input to IDFilter in order to remove the digested peptides from identification results.
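The blacklist idea can be sketched in a few lines of Python. This only illustrates the concept and assumes exact string matching on unmodified sequences; IDFilter's actual matching rules may differ. The helper names `read_fasta_sequences` and `remove_blacklisted` and the sample data are hypothetical.

```python
def read_fasta_sequences(lines):
    """Collect the sequences of FASTA-formatted lines into a set."""
    sequences, current = set(), []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous entry.
            if current:
                sequences.add("".join(current))
            current = []
        elif line:
            current.append(line)
    if current:
        sequences.add("".join(current))
    return sequences


def remove_blacklisted(identified, blacklist):
    """Keep only identified peptides that are not in the blacklist."""
    return [peptide for peptide in identified if peptide not in blacklist]


if __name__ == "__main__":
    # Hypothetical digest output in FASTA form.
    digest_fasta = [">contaminant_1", "GGGRPCCCK", ">contaminant_2", "AAAKGG", "GRPCCCK"]
    blacklist = read_fasta_sequences(digest_fasta)
    print(remove_blacklisted(["GGGRPCCCK", "LVNELTEFAK"], blacklist))
```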

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

Digestor -- Digests a protein database in-silico.
Full documentation: http://www.openms.de/doxygen/release/3.1.0/html/TOPP_Digestor.html
Version: 3.1.0 Oct 18 2023, 10:27:18, Revision: 17a07f8
To cite OpenMS:
 + Rost HL, Sachsenberg T, Aiche S, Bielow C et al.. OpenMS: a flexible open-source software platform for 
   mass spectrometry data analysis. Nat Meth. 2016; 13, 9: 741-748. doi:10.1038/nmeth.3959.

Usage:
  Digestor <options>

Options (mandatory options marked with '*'):
  -in <file>*                  Input file (valid formats: 'fasta')
  -out <file>*                 Output file (peptides) (valid formats: 'idXML', 'fasta')
  -out_type <type>             Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid:
                                'idXML', 'fasta')
  -missed_cleavages <number>   The number of allowed missed cleavages (default: '1') (min: '0')
  -min_length <number>         Minimum length of peptide (default: '6')
  -max_length <number>         Maximum length of peptide (default: '40')
  -enzyme <string>             The type of digestion enzyme (default: 'Trypsin') (valid: 'Asp-N/B',
                               'Asp-N_ambic', 'Chymotrypsin', 'Chymotrypsin/P', 'CNBr', 'Formic_acid',
                               'Lys-C', 'Lys-N', 'Lys-C/P', 'PepsinA', 'TrypChymo', 'Trypsin/P', 'V8-DE',
                               'V8-E', 'Alpha-lytic protease', 'leukocyte elastase', 'proline endopeptidase',
                               'glutamyl endopeptidase', '2-iodobenzoate', 'iodosobenzoate',
                               'staphylococcal protease/D', 'proline-endopeptidase/HKR', 'Glu-C+P',
                               'PepsinA + P', 'cyanogen-bromide', 'Clostripain/P',
                               'elastase-trypsin-chymotrypsin', 'Arg-C/P', 'Asp-N', 'Arg-C', 'Trypsin',
                               'unspecific cleavage', 'no cleavage')

Options for FASTA output files:
  -FASTA:ID <option>           Identifier to use for each peptide: copy from parent protein (parent);
                               a consecutive number (number); parent ID + consecutive number (both)
                               (default: 'parent') (valid: 'parent', 'number', 'both')
  -FASTA:description <option>  Keep or remove the (possibly lengthy) FASTA header description. Keeping it 
                               can increase resulting FASTA file significantly. (default: 'remove') (valid: 
                               'remove', 'keep')

                               
Common TOPP options:
  -ini <file>                  Use the given TOPP INI file
  -threads <n>                 Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>            Writes the default configuration file
  --help                       Shows options
  --helphelp                   Shows all options (including advanced)
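As an illustration of how the options above combine, the following Python sketch assembles a Digestor command line using only flags listed in the help text. The helper name `build_digestor_command` and the file names are hypothetical, and the sketch only prints the command; running it requires an OpenMS installation with Digestor on the `PATH`.

```python
import shlex


def build_digestor_command(fasta_in, out, enzyme="Trypsin",
                           missed_cleavages=1, min_length=6, max_length=40):
    """Build an argument list for Digestor from the documented options."""
    return [
        "Digestor",
        "-in", str(fasta_in),        # mandatory: FASTA input
        "-out", str(out),            # mandatory: idXML or FASTA output
        "-enzyme", enzyme,
        "-missed_cleavages", str(missed_cleavages),
        "-min_length", str(min_length),
        "-max_length", str(max_length),
    ]


if __name__ == "__main__":
    # Hypothetical file names.
    cmd = build_digestor_command("proteins.fasta", "peptides.idXML",
                                 enzyme="Lys-C", missed_cleavages=2)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to execute the tool. Keeping the arguments as a list avoids shell quoting problems with enzyme names that contain spaces, such as 'unspecific cleavage'.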

INI file documentation of this tool:

+ Digestor: Digests a protein database in-silico.
    version (default: '3.1.0'): Version of the tool that generated this parameters file.
  ++ 1: Instance '1' section for 'Digestor'
       in: input file (tag: input file) (valid formats: '*.fasta')
       out: Output file (peptides) (tag: output file) (valid formats: '*.idXML', '*.fasta')
       out_type: Set this if you cannot control the filename of 'out', e.g., in TOPPAS. (valid: 'idXML', 'fasta')
       missed_cleavages (default: '1'): The number of allowed missed cleavages (range: 0:∞)
       min_length (default: '6'): Minimum length of peptide
       max_length (default: '40'): Maximum length of peptide
       enzyme (default: 'Trypsin'): The type of digestion enzyme (valid: 'Asp-N/B', 'Asp-N_ambic', 'Chymotrypsin', 'Chymotrypsin/P', 'CNBr', 'Formic_acid', 'Lys-C', 'Lys-N', 'Lys-C/P', 'PepsinA', 'TrypChymo', 'Trypsin/P', 'V8-DE', 'V8-E', 'Alpha-lytic protease', 'leukocyte elastase', 'proline endopeptidase', 'glutamyl endopeptidase', '2-iodobenzoate', 'iodosobenzoate', 'staphylococcal protease/D', 'proline-endopeptidase/HKR', 'Glu-C+P', 'PepsinA + P', 'cyanogen-bromide', 'Clostripain/P', 'elastase-trypsin-chymotrypsin', 'Arg-C/P', 'Asp-N', 'Arg-C', 'Trypsin', 'unspecific cleavage', 'no cleavage')
       log: Name of log file (created only when specified)
       debug (default: '0'): Sets the debug level
       threads (default: '1'): Sets the number of threads allowed to be used by the TOPP tool
       no_progress (default: 'false'): Disables progress logging to command line (valid: 'true', 'false')
       force (default: 'false'): Overrides tool-specific checks (valid: 'true', 'false')
       test (default: 'false'): Enables the test mode (needed for internal use only) (valid: 'true', 'false')
    +++ FASTA: Options for FASTA output files
          ID (default: 'parent'): Identifier to use for each peptide: copy from parent protein (parent); a consecutive number (number); parent ID + consecutive number (both) (valid: 'parent', 'number', 'both')
          description (default: 'remove'): Keep or remove the (possibly lengthy) FASTA header description. Keeping it can increase resulting FASTA file significantly. (valid: 'remove', 'keep')
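The three identifier modes of the FASTA `ID` option can be pictured with a short Python sketch. It only illustrates the idea: the function name `peptide_id`, the underscore separator used for `both`, and numbering from 1 are assumptions and are not taken from the Digestor output format.

```python
def peptide_id(parent_id, number, mode="parent"):
    """Return a peptide identifier for the given mode.

    parent: copy the parent protein's identifier
    number: use a consecutive number
    both:   parent identifier plus consecutive number
            (the '_' separator is an assumption)
    """
    if mode == "parent":
        return parent_id
    if mode == "number":
        return str(number)
    if mode == "both":
        return f"{parent_id}_{number}"
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Hypothetical parent identifier and peptides.
    peptides = ["GGGRPCCCK", "AAAKGGGRPCCCK"]
    for mode in ("parent", "number", "both"):
        for number, peptide in enumerate(peptides, start=1):
            print(f">{peptide_id('protein_1', number, mode)}")
            print(peptide)
```

Note that with `parent`, all peptides from the same protein share one identifier, so `number` or `both` may be preferable when downstream tools need unique entries.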