OpenMS
SearchEngineBase Class Reference

Base class for TOPP search-engine adapters. More...

#include <OpenMS/APPLICATIONS/SearchEngineBase.h>


Public Member Functions

 SearchEngineBase ()=delete
 Default construction is disabled; a tool name and description are required.
 
 SearchEngineBase (const SearchEngineBase &)=delete
 Copy construction is disabled.
 
 SearchEngineBase (const std::string &name, const std::string &description, bool official=true, const std::vector< Citation > &citations={}, bool toolhandler_test=true)
 Construct the adapter (signature mirrors TOPPBase to forward arguments verbatim).
 
 ~SearchEngineBase () override
 Destructor.
 
std::string getRawfileName (int ms_level=2) const
 Resolve the -in argument and verify the spectra match a search engine's expectations.
 
std::string getDBFilename (const std::string &db="") const
 Resolve the search-database path.
 
virtual void registerPeptideIndexingParameter_ (Param peptide_indexing_parameter)
 Register the -reindex switch and a hidden PeptideIndexing parameter sub-tree.
 
virtual SearchEngineBase::ExitCodes reindex_ (std::vector< ProteinIdentification > &protein_identifications, PeptideIdentificationList &peptide_identifications) const
 Optionally re-run PeptideIndexing against the parsed identifications.
 
- Public Member Functions inherited from TOPPBase
 TOPPBase ()=delete
 No default constructor.
 
 TOPPBase (const TOPPBase &)=delete
 No default copy constructor.
 
 TOPPBase (const std::string &name, const std::string &description, bool official=true, const std::vector< Citation > &citations={}, bool toolhandler_test=true)
 Constructor.
 
virtual ~TOPPBase ()
 Destructor.
 
ExitCodes main (int argc, const char **argv)
 Main routine of all TOPP applications.
 
std::string getToolPrefix () const
 Returns the prefix used to identify the tool.
 
std::string getDocumentationURL () const
 Returns a link to the documentation of the tool (accessible on our servers and only after inclusion in the nightly branch or a release).
 

Additional Inherited Members

- Public Types inherited from TOPPBase
enum  ExitCodes {
  EXECUTION_OK , INPUT_FILE_NOT_FOUND , INPUT_FILE_NOT_READABLE , INPUT_FILE_CORRUPT ,
  INPUT_FILE_EMPTY , CANNOT_WRITE_OUTPUT_FILE , ILLEGAL_PARAMETERS , MISSING_PARAMETERS ,
  UNKNOWN_ERROR , EXTERNAL_PROGRAM_ERROR , PARSE_ERROR , INCOMPATIBLE_INPUT_DATA ,
  INTERNAL_ERROR , UNEXPECTED_RESULT , EXTERNAL_PROGRAM_NOTFOUND
}
 Exit codes. More...
 
- Static Public Member Functions inherited from TOPPBase
static void setMaxNumberOfThreads (int num_threads)
 Sets the maximal number of usable threads.
 
- Static Public Attributes inherited from TOPPBase
static const char * TAG_OUTPUT_FILE = "output file"
 
static const char * TAG_INPUT_FILE = "input file"
 
static const char * TAG_OUTPUT_DIR = "output dir"
 
static const char * TAG_OUTPUT_PREFIX = "output prefix"
 
static const char * TAG_ADVANCED = "advanced"
 
static const char * TAG_REQUIRED = "required"
 
static const Citation cite_openms
 The latest and greatest OpenMS citation.
 
- Protected Member Functions inherited from TOPPBase
virtual void registerOptionsAndFlags_ ()=0
 Sets the valid command line options (with argument) and flags (without argument).
 
std::string getParamArgument_ (const Param::ParamEntry &entry) const
 Utility function that determines a suitable argument value for the given Param::ParamEntry.
 
std::vector< ParameterInformation > paramToParameterInformation_ (const Param &param) const
 Translates the given parameter object into a vector of ParameterInformation that can be used for command line parsing.
 
ParameterInformation paramEntryToParameterInformation_ (const Param::ParamEntry &entry, const std::string &argument="", const std::string &full_name="") const
 Transforms a ParamEntry object to command line parameter (ParameterInformation).
 
void registerParamSubsectionsAsTOPPSubsections_ (const Param &param)
 
void registerFullParam_ (const Param &param)
 Register command line parameters for all entries in a Param object.
 
void registerStringOption_ (const std::string &name, const std::string &argument, const std::string &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a string option.
 
void setValidStrings_ (const std::string &name, const std::vector< std::string > &strings)
 Sets the valid strings for a string option or a whole string list.
 
void setValidStrings_ (const std::string &name, const std::string vstrings[], int count)
 Sets the valid strings for a string option or a whole string list.
 
void registerInputFile_ (const std::string &name, const std::string &argument, const std::string &default_value, const std::string &description, bool required=true, bool advanced=false, const StringList &tags=StringList())
 Registers an input file option.
 
void registerOutputFile_ (const std::string &name, const std::string &argument, const std::string &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers an output file option.
 
void registerOutputPrefix_ (const std::string &name, const std::string &argument, const std::string &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers an output file prefix used for tools with multiple file output.
 
void registerOutputDir_ (const std::string &name, const std::string &argument, const std::string &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers an output directory used for tools with multiple output files which are not an output file list, i.e. do not correspond to the number of input files.
 
void setValidFormats_ (const std::string &name, const std::vector< std::string > &formats, const bool force_OpenMS_format=true)
 Sets the formats for an input/output file option or for all members of an input/output file list.
 
void registerDoubleOption_ (const std::string &name, const std::string &argument, double default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a double option.
 
void setMinInt_ (const std::string &name, Int min)
 Sets the minimum value for the integer parameter name (which can also be a list of integers).
 
void setMaxInt_ (const std::string &name, Int max)
 Sets the maximum value for the integer parameter name (which can also be a list of integers).
 
void setMinFloat_ (const std::string &name, double min)
 Sets the minimum value for the floating point parameter name (which can also be a list of floating point values).
 
void setMaxFloat_ (const std::string &name, double max)
 Sets the maximum value for the floating point parameter name (which can also be a list of floating point values).
 
void registerIntOption_ (const std::string &name, const std::string &argument, Int default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers an integer option.
 
void registerIntList_ (const std::string &name, const std::string &argument, const IntList &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a list of integers option.
 
void registerDoubleList_ (const std::string &name, const std::string &argument, const DoubleList &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a list of doubles option.
 
void registerStringList_ (const std::string &name, const std::string &argument, const StringList &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a list of strings option.
 
void registerInputFileList_ (const std::string &name, const std::string &argument, const StringList &default_value, const std::string &description, bool required=true, bool advanced=false, const StringList &tags=StringList())
 Registers a list of input files option.
 
void registerOutputFileList_ (const std::string &name, const std::string &argument, const StringList &default_value, const std::string &description, bool required=true, bool advanced=false)
 Registers a list of output files option.
 
void registerFlag_ (const std::string &name, const std::string &description, bool advanced=false)
 Registers a flag.
 
void registerSubsection_ (const std::string &name, const std::string &description)
 Registers an allowed subsection in the INI file (usually from OpenMS algorithms).
 
void registerTOPPSubsection_ (const std::string &name, const std::string &description)
 Registers an allowed subsection in the INI file originating from the TOPP tool itself.
 
void addEmptyLine_ ()
 Adds an empty line between registered variables in the documentation.
 
std::string getStringOption_ (const std::string &name) const
 Returns the value of a previously registered string option (use getOutputDirOption() for output directories)
 
std::string getOutputDirOption (const std::string &name) const
 Returns the value of a previously registered output_dir option.
 
double getDoubleOption_ (const std::string &name) const
 Returns the value of a previously registered double option.
 
Int getIntOption_ (const std::string &name) const
 Returns the value of a previously registered integer option.
 
StringList getStringList_ (const std::string &name) const
 Returns the value of a previously registered StringList.
 
IntList getIntList_ (const std::string &name) const
 Returns the value of a previously registered IntList.
 
DoubleList getDoubleList_ (const std::string &name) const
 Returns the value of a previously registered DoubleList.
 
bool getFlag_ (const std::string &name) const
 Returns the value of a previously registered flag.
 
const ParameterInformation & findEntry_ (const std::string &name) const
 Finds the entry in the parameters_ array that has the name name.
 
Param const & getParam_ () const
 Return all parameters relevant to this TOPP tool.
 
void checkParam_ (const Param &param, const std::string &filename, const std::string &location) const
 Checks top-level entries of param according to the information during registration.
 
void fileParamValidityCheck_ (const StringList &param_value, const std::string &param_name, const ParameterInformation &p) const
 Checks if the files of an input file list exist.
 
void fileParamValidityCheck_ (std::string &param_value, const std::string &param_name, const ParameterInformation &p) const
 Checks if an input file exists (respecting the flags).
 
void checkIfIniParametersAreApplicable_ (const Param &ini_params)
 Checks if the parameters of the provided ini file are applicable to this tool.
 
void printUsage_ ()
 Prints the tool-specific command line options and appends the common options.
 
virtual ExitCodes main_ (int argc, const char **argv)=0
 The actual "main" method. main_() is invoked by main().
 
void writeLogInfo_ (const std::string &text) const
 Writes a string to the log file and to OPENMS_LOG_INFO.
 
void writeLogWarn_ (const std::string &text) const
 Writes a string to the log file and to OPENMS_LOG_WARN.
 
void writeLogError_ (const std::string &text) const
 Writes a string to the log file and to OPENMS_LOG_ERROR.
 
void writeDebug_ (const std::string &text, UInt min_level) const
 Writes a string to the log file and to OPENMS_LOG_DEBUG if the debug level is at least min_level.
 
void writeDebug_ (const std::string &text, const Param &param, UInt min_level) const
 Writes a std::string followed by a Param to the log file and to OPENMS_LOG_DEBUG if the debug level is at least min_level.
 
ExitCodes runExternalProcess_ (const std::string &executable, const std::vector< std::string > &arguments, const std::string &workdir="", const std::map< std::string, std::string > &env={}) const
 Runs an external process via ExternalProcess and prints its stderr output on failure or if debug_level > 4.
 
ExitCodes runExternalProcess_ (const std::string &executable, const std::vector< std::string > &arguments, std::string &proc_stdout, std::string &proc_stderr, const std::string &workdir="", const std::map< std::string, std::string > &env={}) const
 
const std::string & getIniLocation_ () const
 Returns the location of the ini file where parameters are taken from. E.g. if the command line was TOPPTool -instance 17, then this will be "TOPPTool:17:". Note the ':' at the end.
 
const std::string & toolName_ () const
 Returns the tool name.
 
void inputFileReadable_ (const std::string &filename, const std::string &param_name) const
 Checks if an input file exists, is readable and is not empty.
 
void outputFileWritable_ (const std::string &filename, const std::string &param_name) const
 Checks if an output file is writable.
 
bool parseRange_ (const std::string &text, double &low, double &high) const
 Parses a range string ([a]:[b]) into two variables (doubles)
 
bool parseRange_ (const std::string &text, Int &low, Int &high) const
 Parses a range string ([a]:[b]) into two variables (integers)
 
void addDataProcessing_ (ConsensusMap &map, const DataProcessing &dp) const
 Data processing setter for consensus maps.
 
void addDataProcessing_ (FeatureMap &map, const DataProcessing &dp) const
 Data processing setter for feature maps.
 
void addDataProcessing_ (PeakMap &map, const DataProcessing &dp) const
 Data processing setter for peak maps.
 
DataProcessing getProcessingInfo_ (DataProcessing::ProcessingAction action) const
 Returns the data processing information.
 
DataProcessing getProcessingInfo_ (const std::set< DataProcessing::ProcessingAction > &actions) const
 Returns the data processing information.
 
template<typename Writer >
void writeToolDescription_ (Writer &writer, std::string write_type, std::string fileExtension)
 Helper function avoiding repeated code between CTD, JSON and CWL.
 
- Protected Attributes inherited from TOPPBase
std::string version_
 Version string (if empty, the OpenMS/TOPP version is printed)
 
std::string verboseVersion_
 Version string including additional revision/date time information. Note: This differs from version_ only if not provided by the user.
 
bool official_
 Flag indicating if this is an official TOPP tool.
 
std::vector< Citation > citations_
 Papers, specific for this tool (will be shown in '--help')
 
bool toolhandler_test_
 Enable the ToolHandler tests.
 
ProgressLogger::LogType log_type_
 Type of progress logging.
 
bool test_mode_
 Test mode.
 
Int debug_level_
 Debug level set by -debug.
 
- Static Protected Attributes inherited from TOPPBase
static std::string topp_ini_file_
 .TOPP.ini file for storing system default parameters
 

Detailed Description

Base class for TOPP search-engine adapters.

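Sits between TOPPBase and concrete search-engine adapters (CometAdapter, MSGFPlusAdapter, ...) and bundles a few conventions that every adapter shares: resolving and checking the -in spectra file (getRawfileName), resolving the search database (getDBFilename), and registering and optionally running PeptideIndexing on the parsed identifications (registerPeptideIndexingParameter_, reindex_).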

Constructor & Destructor Documentation

◆ SearchEngineBase() [1/3]

SearchEngineBase ( )
delete

Default construction is disabled; a tool name and description are required.

◆ SearchEngineBase() [2/3]

SearchEngineBase ( const SearchEngineBase &  )
delete

Copy construction is disabled.

◆ SearchEngineBase() [3/3]

SearchEngineBase ( const std::string &  name,
const std::string &  description,
bool  official = true,
const std::vector< Citation > &  citations = {},
bool  toolhandler_test = true 
)

Construct the adapter (signature mirrors TOPPBase to forward arguments verbatim).

Parameters
[in]  name  Tool name.
[in]  description  One-line tool description.
[in]  official  If true, the tool is treated as an official TOPP tool: the name is cross-checked against the TOPP-tool list and a warning is printed if it is missing.
[in]  citations  Citations associated with this TOPP tool; printed during --help.
[in]  toolhandler_test  Whether to check that the tool is registered with the ToolHandler. Disable for unit tests only.
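
The forwarding pattern described above can be sketched without OpenMS. All class names here (CitationSketch, ToolBaseSketch, SearchEngineBaseSketch, MyEngineAdapterSketch) are hypothetical stand-ins, not OpenMS types; only the shape of the constructor signature is taken from this page.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for Citation.
struct CitationSketch { std::string authors, title; };

// Hypothetical stand-in for TOPPBase: only the constructor shape is modelled.
class ToolBaseSketch
{
public:
  ToolBaseSketch() = delete;
  ToolBaseSketch(const ToolBaseSketch&) = delete;
  ToolBaseSketch(const std::string& name, const std::string& description, bool official = true,
                 const std::vector<CitationSketch>& citations = {}, bool toolhandler_test = true)
    : name_(name), description_(description), official_(official),
      citations_(citations), toolhandler_test_(toolhandler_test) {}
  virtual ~ToolBaseSketch() = default;

  const std::string& name() const { return name_; }
  bool official() const { return official_; }

private:
  std::string name_, description_;
  bool official_;
  std::vector<CitationSketch> citations_;
  bool toolhandler_test_;
};

// Hypothetical stand-in for SearchEngineBase: same signature, arguments forwarded unchanged.
class SearchEngineBaseSketch : public ToolBaseSketch
{
public:
  SearchEngineBaseSketch(const std::string& name, const std::string& description, bool official = true,
                         const std::vector<CitationSketch>& citations = {}, bool toolhandler_test = true)
    : ToolBaseSketch(name, description, official, citations, toolhandler_test) {}
};

// Hypothetical concrete adapter: supplies the tool name and description once.
class MyEngineAdapterSketch : public SearchEngineBaseSketch
{
public:
  MyEngineAdapterSketch()
    : SearchEngineBaseSketch("MyEngineAdapter", "Wraps a hypothetical search engine.", false) {}
};
```

Because the intermediate class repeats the base signature, a concrete adapter can be written against either base with the same argument list.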

◆ ~SearchEngineBase()

~SearchEngineBase ( )
override

Destructor.

Member Function Documentation

◆ getDBFilename()

std::string getDBFilename ( const std::string &  db = "") const

Resolve the search-database path.

Returns db if non-empty, otherwise the value of the -database parameter. If the resulting path is not directly readable, the OpenMS database search paths (see OpenMS::File::findDatabase) are scanned.

Parameters
[in]  db  Optional explicit database name; used in place of the -database parameter when non-empty (occasionally required for special database formats, e.g. OMSSA).
Returns
Resolved database path (relative or absolute).
Exceptions
OpenMS::Exception::FileNotFound  when the name cannot be resolved against the OpenMS database paths.
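
The lookup order described above can be sketched as a free function. This is an illustration only, not the OpenMS implementation: resolveDatabase, the search_dirs list, and the injected is_readable predicate are hypothetical, and std::runtime_error stands in for OpenMS::Exception::FileNotFound.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the documented lookup; `is_readable` is injected
// so that the sketch does not touch the real file system.
std::string resolveDatabase(const std::string& db,
                            const std::string& database_param,
                            const std::vector<std::string>& search_dirs,
                            const std::function<bool(const std::string&)>& is_readable)
{
  // An explicit, non-empty `db` takes precedence over the -database parameter.
  const std::string name = db.empty() ? database_param : db;
  if (is_readable(name)) return name;

  // Otherwise try each search directory in order (assumed to be plain path prefixes).
  for (const std::string& dir : search_dirs)
  {
    const std::string candidate = dir + "/" + name;
    if (is_readable(candidate)) return candidate;
  }
  // Stand-in for OpenMS::Exception::FileNotFound.
  throw std::runtime_error("database not found: " + name);
}
```

A directly readable path is returned as supplied, which is why the documented result may be relative or absolute.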

◆ getRawfileName()

std::string getRawfileName ( int  ms_level = 2) const

Resolve the -in argument and verify the spectra match a search engine's expectations.

Reads the -in parameter, identifies the file format and – if it is mzML – inspects the per-MS-level centroid metadata to make sure spectra of MS level ms_level are centroided. mzML profile spectra trigger an IllegalArgument exception unless the -force flag is set (in which case the call only logs a warning and proceeds). Format-specific shortcuts: MGF and Bruker TDF data are accepted without further inspection; for any other format only a general warning is issued.

Parameters
[in]  ms_level  MS level whose spectra are checked for their peak type (centroided/profile).
Returns
-in path (relative or absolute as supplied).
Exceptions
OpenMS::Exception::FileEmpty  when the mzML file has no spectra of the requested level.
OpenMS::Exception::IllegalArgument  when the mzML file contains profile spectra at this level (or no centroided spectra) and the -force flag is not set.
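
The decision rules above can be summarised in a small stand-alone sketch. Everything in it (Format, Check, LevelSummary, checkInput) is hypothetical and simplified: spectra of unknown peak type are not modelled, std::runtime_error stands in for FileEmpty, and std::invalid_argument stands in for IllegalArgument.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-ins; none of these names exist in OpenMS.
enum class Format { MzML, MGF, BrukerTDF, Other };
enum class Check { Accepted, AcceptedWithWarning };

// Number of spectra found at the requested MS level, split by peak type.
struct LevelSummary
{
  std::size_t centroided;
  std::size_t profile;
};

Check checkInput(Format format, const LevelSummary& level, bool force)
{
  // MGF and Bruker TDF are accepted without further inspection.
  if (format == Format::MGF || format == Format::BrukerTDF) return Check::Accepted;
  // Any other non-mzML format only yields a general warning.
  if (format != Format::MzML) return Check::AcceptedWithWarning;

  // mzML: there must be spectra at the requested level (stand-in for FileEmpty).
  if (level.centroided + level.profile == 0)
    throw std::runtime_error("no spectra at the requested MS level");

  // mzML: profile spectra are rejected unless -force is set (stand-in for IllegalArgument).
  if (level.profile > 0)
  {
    if (!force) throw std::invalid_argument("profile spectra at the requested MS level");
    return Check::AcceptedWithWarning;
  }
  return Check::Accepted;
}
```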

◆ registerPeptideIndexingParameter_()

virtual void registerPeptideIndexingParameter_ ( Param  peptide_indexing_parameter)
virtual

Register the -reindex switch and a hidden PeptideIndexing parameter sub-tree.

Adds a -reindex string option ("true"/"false", default "true") plus the PeptideIndexing parameter tree under a PeptideIndexing: prefix. The supplied peptide_indexing_parameter is patched in two ways before being registered: missing_decoy_action is forced to "warn", and a fixed list of advanced options (decoy_string, decoy_string_position, missing_decoy_action, enzyme:name, enzyme:specificity, write_protein_sequence, write_protein_description, keep_unreferenced_proteins, unmatched_action, aaa_max, mismatches_max, IL_equivalent) are tagged "advanced".

Parameters
[in]  peptide_indexing_parameter  Defaults for the indexer; search-engine adapters can customise these (e.g. non-tryptic digestion) before passing them in.
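
The two patches and the prefixing described above can be modelled on a plain map. This is a sketch under stated assumptions: EntrySketch, ParamSketch and registerIndexingSketch are hypothetical and do not use the OpenMS Param class, and the advanced flag of the reindex entry itself is not specified on this page (false here is an assumption).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a parameter entry: a value plus an "advanced" tag.
struct EntrySketch { std::string value; bool advanced; };
using ParamSketch = std::map<std::string, EntrySketch>;

ParamSketch registerIndexingSketch(ParamSketch indexer_defaults)
{
  // Patch 1: missing_decoy_action is forced to "warn".
  indexer_defaults["missing_decoy_action"].value = "warn";

  // Patch 2: the fixed list of options named in the description is tagged "advanced".
  static const char* const advanced_names[] = {
    "decoy_string", "decoy_string_position", "missing_decoy_action", "enzyme:name",
    "enzyme:specificity", "write_protein_sequence", "write_protein_description",
    "keep_unreferenced_proteins", "unmatched_action", "aaa_max", "mismatches_max",
    "IL_equivalent"};
  for (const char* name : advanced_names)
  {
    auto it = indexer_defaults.find(name);
    if (it != indexer_defaults.end()) it->second.advanced = true;
  }

  // Register -reindex (default "true") and the patched tree under the PeptideIndexing: prefix.
  ParamSketch registered;
  registered["reindex"] = EntrySketch{"true", false}; // advanced=false is an assumption
  for (const auto& kv : indexer_defaults)
  {
    registered["PeptideIndexing:" + kv.first] = kv.second;
  }
  return registered;
}
```

Entries that are not in the fixed list keep whatever tag the adapter supplied.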

◆ reindex_()

virtual SearchEngineBase::ExitCodes reindex_ ( std::vector< ProteinIdentification > &  protein_identifications,
PeptideIdentificationList &  peptide_identifications 
) const
virtual

Optionally re-run PeptideIndexing against the parsed identifications.

A no-op when the -reindex parameter is anything other than "true". When enabled, runs PeptideIndexing with the adapter's PeptideIndexing: sub-tree merged onto the indexer's defaults, against the database resolved via getDBFilename.

Parameters
[in,out]  protein_identifications  Protein hits; updated in place.
[in,out]  peptide_identifications  Peptide hits; updated in place.
Returns
EXECUTION_OK on success or when -reindex is disabled; INPUT_FILE_EMPTY when the indexer reports DATABASE_EMPTY; UNEXPECTED_RESULT when the indexer reports UNEXPECTED_RESULT; UNKNOWN_ERROR for any other non-OK indexer exit code (excluding PEPTIDE_IDS_EMPTY, which is treated as success).
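
The return-value mapping above can be written out as a small table in code. The enums and mapReindexResult are hypothetical stand-ins that mirror only the names mentioned on this page; OTHER_FAILURE is a placeholder for any indexer exit code not listed here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the indexer's exit codes named above.
enum class IndexerExit { EXECUTION_OK, DATABASE_EMPTY, PEPTIDE_IDS_EMPTY, UNEXPECTED_RESULT, OTHER_FAILURE };
// Hypothetical stand-in for the subset of tool exit codes that reindex_ is documented to return.
enum class ToolExit { EXECUTION_OK, INPUT_FILE_EMPTY, UNEXPECTED_RESULT, UNKNOWN_ERROR };

ToolExit mapReindexResult(const std::string& reindex_param, IndexerExit indexer_result)
{
  // Anything other than "true" makes the call a no-op; the indexer result is ignored.
  if (reindex_param != "true") return ToolExit::EXECUTION_OK;

  switch (indexer_result)
  {
    case IndexerExit::EXECUTION_OK:
    case IndexerExit::PEPTIDE_IDS_EMPTY: // treated as success
      return ToolExit::EXECUTION_OK;
    case IndexerExit::DATABASE_EMPTY:
      return ToolExit::INPUT_FILE_EMPTY;
    case IndexerExit::UNEXPECTED_RESULT:
      return ToolExit::UNEXPECTED_RESULT;
    case IndexerExit::OTHER_FAILURE:
      return ToolExit::UNKNOWN_ERROR;
  }
  return ToolExit::UNKNOWN_ERROR; // unreachable for valid enum values
}
```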