OpenMS
DatabaseFilter

The DatabaseFilter tool filters a protein database in FASTA format according to one or more filtering criteria.

The resulting database is written as output. Depending on the reporting method (method="whitelist" or "blacklist"), only entries that passed all filters ("whitelist") or that failed at least one filter ("blacklist") are retained.

Implemented filter criteria:

accession: Filter database according to the set of protein accessions contained in an identification file (idXML, mzIdentML)
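
The whitelist/blacklist behaviour described above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the helper names (parse_fasta, accession_of, filter_database) are hypothetical, and it assumes that a protein's accession is the first whitespace-delimited token of its FASTA header and that the set of accessions has already been read from the identification file.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# One FASTA entry: (header line without the leading '>', sequence).
FastaEntry = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[FastaEntry]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def accession_of(header: str) -> str:
    """Assumption: the accession is the first whitespace-delimited token."""
    tokens = header.split()
    return tokens[0] if tokens else ""


def filter_database(entries: Iterable[FastaEntry],
                    accessions: Set[str],
                    method: str = "whitelist") -> List[FastaEntry]:
    """Keep entries found in `accessions` (whitelist) or not found (blacklist)."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError("method must be 'whitelist' or 'blacklist'")
    keep_if_found = method == "whitelist"
    return [entry for entry in entries
            if (accession_of(entry[0]) in accessions) == keep_if_found]
```

With a single filter criterion, "passed all filters" reduces to "accession is present in the identification file", which is why one set-membership test per entry is enough in this sketch.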

The command line parameters of this tool are:

DatabaseFilter -- Filters a protein database (FASTA format) based on identified proteins
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_DatabaseFilter.html
Version: 3.2.0 Sep 18 2024, 16:00:56, Revision: e231942
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.

Usage:
  DatabaseFilter <options>

Options (mandatory options marked with '*'):
  -in <file>*        Input FASTA file, containing a protein database. (valid formats: 'fasta')
  -id <file>*        Input file containing identified peptides and proteins. (valid formats: 'idXML', 'mzid')

  -method <choice>   Switch between white-/blacklisting of protein IDs (default: 'whitelist')
                     (valid: 'whitelist', 'blacklist')
  -out <file>*       Output FASTA file where the reduced database will be written to.
                     (valid formats: 'fasta')
                     
Common TOPP options:
  -ini <file>        Use the given TOPP INI file
  -threads <n>       Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>  Writes the default configuration file
  --help             Shows options
  --helphelp         Shows all options (including advanced)
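
As an illustration of how the options listed above fit together, the following Python sketch assembles a DatabaseFilter call and runs it if the executable is found on the PATH. The function names and the file names in the usage line are hypothetical; only the flags -in, -id, -out, -method and -threads are taken from the help text above.

```python
import shutil
import subprocess
from typing import List


def build_command(fasta_in: str, id_file: str, fasta_out: str,
                  method: str = "whitelist", threads: int = 1) -> List[str]:
    """Build the argument list for a DatabaseFilter call (hypothetical helper)."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError("method must be 'whitelist' or 'blacklist'")
    return [
        "DatabaseFilter",
        "-in", fasta_in,          # input protein database (FASTA)
        "-id", id_file,           # identifications (idXML or mzid)
        "-out", fasta_out,        # reduced database (FASTA)
        "-method", method,
        "-threads", str(threads),
    ]


def run_filter(fasta_in: str, id_file: str, fasta_out: str,
               method: str = "whitelist", threads: int = 1) -> None:
    """Run DatabaseFilter; raises if the tool is missing or exits non-zero."""
    cmd = build_command(fasta_in, id_file, fasta_out, method, threads)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("DatabaseFilter was not found on PATH")
    subprocess.run(cmd, check=True)
```

For example, run_filter("db.fasta", "ids.idXML", "db_filtered.fasta", method="blacklist") would write every database entry whose accession does not occur in ids.idXML, assuming those files exist.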

INI file documentation of this tool:

Legend: '*' marks a required parameter; '-' marks an empty cell.

+DatabaseFilter: Filters a protein database (FASTA format) based on identified proteins
  version: 3.2.0 (Version of the tool that generated this parameters file.)
  ++1: Instance '1' section for 'DatabaseFilter'

    Name         Default    Description                                                       Type         Restrictions
    in*          -          Input FASTA file, containing a protein database.                  input file   *.fasta
    id*          -          Input file containing identified peptides and proteins.           input file   *.idXML, *.mzid
    method       whitelist  Switch between white-/blacklisting of protein IDs                 -            whitelist, blacklist
    out*         -          Output FASTA file where the reduced database will be written to.  output file  *.fasta
    log          -          Name of log file (created only when specified)                    -            -
    debug        0          Sets the debug level                                              -            -
    threads      1          Sets the number of threads allowed to be used by the TOPP tool    -            -
    no_progress  false      Disables progress logging to command line                         -            true, false
    force        false      Overrides tool-specific checks                                    -            true, false
    test         false      Enables the test mode (needed for internal use only)              -            true, false