OpenMS
ConsensusMapMergerAlgorithm Class Reference

Merges identification data in ConsensusMaps.

#include <OpenMS/ANALYSIS/ID/ConsensusMapMergerAlgorithm.h>


Public Member Functions

 ConsensusMapMergerAlgorithm ()
 
void mergeProteinsAcrossFractionsAndReplicates (ConsensusMap &cmap, const ExperimentalDesign &exp_design) const
 
void mergeAllIDRuns (ConsensusMap &cmap) const
 
void mergeProteinIDRuns (ConsensusMap &cmap, const std::map< unsigned, unsigned > &mapIdx_to_new_protIDRun) const
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.
 
virtual ~DefaultParamHandler ()
 Destructor.
 
DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.
 
void setParameters (const Param &param)
 Sets the parameters.
 
const Param & getParameters () const
 Non-mutable access to the parameters.
 
const Param & getDefaults () const
 Non-mutable access to the default parameters.
 
const String & getName () const
 Non-mutable access to the name.
 
void setName (const String &name)
 Mutable access to the name.
 
const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.
 
virtual ~ProgressLogger ()
 Destructor.
 
 ProgressLogger (const ProgressLogger &other)
 Copy constructor.
 
ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.
 
void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!
 
LogType getLogType () const
 Returns the type of progress log being used.
 
void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.
 
void setProgress (SignedSize value) const
 Sets the current progress.
 
void endProgress (UInt64 bytes_processed=0) const
 
void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Private Types

using hash_type = std::size_t(*)(const ProteinHit &)
 
using equal_type = bool(*)(const ProteinHit &, const ProteinHit &)
 

Private Member Functions

bool checkOldRunConsistency_ (const std::vector< ProteinIdentification > &protRuns, const String &experiment_type) const
 
bool checkOldRunConsistency_ (const std::vector< ProteinIdentification > &protRuns, const ProteinIdentification &ref, const String &experiment_type) const
 Same as the overload without ref, but checks against a specific reference run.
 

Static Private Member Functions

static size_t accessionHash_ (const ProteinHit &p)
 
static bool accessionEqual_ (const ProteinHit &p1, const ProteinHit &p2)
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum LogType { CMD , GUI , NONE }
 Possible log types.
 
- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.
 
- Protected Member Functions inherited from DefaultParamHandler
virtual void updateMembers_ ()
 This method is used to update extra member variables at the end of the setParameters() method.
 
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 
- Static Protected Member Functions inherited from ProgressLogger
static String logTypeToFactoryName_ (LogType type)
 Returns the name of the factory product used for this log type.
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!
 
std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!
 
String error_name_
 Name that is displayed in error messages during the parameter checking.
 
bool check_defaults_
 If this member is set to false, no parameter checking is done.
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when the defaults are empty.
 
- Protected Attributes inherited from ProgressLogger
LogType type_
 
time_t last_invoke_
 
ProgressLoggerImpl * current_logger_
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

Merges identification data in ConsensusMaps.

Todo:

This class could be merged with the general IDMergerAlgorithm in the future, since the two share a lot of logic. For that, IDMergerAlgorithm would need additional methods to produce multiple runs as output. It would also need to store an extended mapping internally to distribute the PeptideIDs to the right output run according to origin and label. It should also offer non-copying/moving overloads for inserting PeptideIDs, since we probably do not want to distribute the PeptideIDs to the features again. In general, detaching IDs from features would be of great help here.

Untested for TMT/iTRAQ data, where you usually have one identification run per file, but a single file may contain multiple multiplexed conditions that you might want to split for inference. Problem: there is only one PeptideIdentification object per feature, and it is representative for all "sub maps" (in this case the labels/reporter ions). A lookup is therefore necessary to check whether the reporter ion had non-zero intensity. If so, the peptide ID needs to be duplicated for every new (condition-based) IdentificationRun it is supposed to be used in, according to the mapping.

Member Typedef Documentation

◆ equal_type

using equal_type = bool (*)(const ProteinHit&, const ProteinHit&)
private

◆ hash_type

using hash_type = std::size_t (*)(const ProteinHit&)
private

Constructor & Destructor Documentation

◆ ConsensusMapMergerAlgorithm()

Member Function Documentation

◆ accessionEqual_()

static bool accessionEqual_ ( const ProteinHit & p1,
const ProteinHit & p2
)
inline static private

◆ accessionHash_()

static size_t accessionHash_ ( const ProteinHit & p)
inline static private
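
The private aliases hash_type and equal_type are plain function-pointer types whose signatures match accessionHash_ and accessionEqual_. This page does not say how they are used. One plausible use, assumed here, is as the hasher and key-equality arguments of a hash set that treats two protein hits as the same entry when their accessions agree. The sketch below uses a hypothetical stand-in struct instead of the real OpenMS ProteinHit, and the helper countUniqueAccessions is invented for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for OpenMS::ProteinHit; only the accession matters here.
struct ProteinHit
{
  std::string accession;
  double score;
};

// Same shapes as the private typedefs documented above.
using hash_type = std::size_t (*)(const ProteinHit&);
using equal_type = bool (*)(const ProteinHit&, const ProteinHit&);

// Hashes a hit by its accession only (assumed behaviour of accessionHash_).
static std::size_t accessionHash(const ProteinHit& p)
{
  return std::hash<std::string>{}(p.accession);
}

// Two hits are equal if their accessions match (assumed behaviour of accessionEqual_).
static bool accessionEqual(const ProteinHit& p1, const ProteinHit& p2)
{
  return p1.accession == p2.accession;
}

// Illustrative helper: inserts all hits into an accession-keyed set and
// reports how many distinct accessions remain.
std::size_t countUniqueAccessions(const std::vector<ProteinHit>& hits)
{
  std::unordered_set<ProteinHit, hash_type, equal_type> unique(
      hits.size() + 1, accessionHash, accessionEqual);
  for (const ProteinHit& h : hits)
  {
    unique.insert(h); // a hit with an already known accession is ignored
  }
  return unique.size();
}
```

Because the hasher and the comparison are function pointers rather than default-constructible functors, they have to be passed to the set's constructor explicitly, together with an initial bucket count.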

◆ checkOldRunConsistency_() [1/2]

bool checkOldRunConsistency_ ( const std::vector< ProteinIdentification > & protRuns,
const ProteinIdentification & ref,
const String & experiment_type
) const
private

Same as the overload without ref, but checks against a specific reference run.

◆ checkOldRunConsistency_() [2/2]

bool checkOldRunConsistency_ ( const std::vector< ProteinIdentification > & protRuns,
const String & experiment_type
) const
private

Checks the consistency of search engines and settings across runs before merging. Uses the first run as the reference and compares all other runs to it.

Returns
Whether all runs are the same. TODO: return a merged RunDescription about what to put in the new runs (e.g. for SILAC)
Exceptions
BaseException for disagreeing settings
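
The relationship between the two checkOldRunConsistency_ overloads can be sketched as follows. This is a minimal illustration, not the OpenMS implementation: RunSettings is a hypothetical, reduced stand-in for the settings stored in a ProteinIdentification, the function name allRunsMatch is invented, and the sketch only returns a flag, whereas the real method is documented to throw for disagreeing settings.

```cpp
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for the search settings stored in a
// ProteinIdentification; the real class holds many more fields.
struct RunSettings
{
  std::string search_engine;
  std::string engine_version;
  std::string database;
};

// Field-by-field comparison of two runs' settings.
static bool sameSettings(const RunSettings& a, const RunSettings& b)
{
  return a.search_engine == b.search_engine
      && a.engine_version == b.engine_version
      && a.database == b.database;
}

// Compares every run against an explicit reference run.
bool allRunsMatch(const std::vector<RunSettings>& runs, const RunSettings& ref)
{
  for (const RunSettings& run : runs)
  {
    if (!sameSettings(run, ref)) return false;
  }
  return true;
}

// Uses the first run as the reference; an empty list is trivially consistent.
bool allRunsMatch(const std::vector<RunSettings>& runs)
{
  if (runs.empty()) return true;
  return allRunsMatch(runs, runs.front());
}
```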

◆ mergeAllIDRuns()

void mergeAllIDRuns ( ConsensusMap & cmap) const

Similar to mergeProteinsAcrossFractionsAndReplicates, but merges every ID run into one big run. Proteins are inserted only once, but peptides stay unfiltered, i.e. they might occur in several PeptideIdentifications afterwards.

Note
Groups are not carried over during merging.
Exceptions
MissingInformationException for e.g. missing map_indices in PeptideIDs
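
The documented effect of merging all runs (each protein kept once, peptide identifications left unfiltered) can be illustrated with a small sketch. All types and the function mergeAll below are hypothetical stand-ins, not OpenMS classes, and the sketch assumes that every peptide identification simply ends up pointing at the single merged run.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: a run lists protein accessions, and a peptide
// identification remembers which run it belongs to.
struct IdRun
{
  std::vector<std::string> accessions;
};

struct PeptideId
{
  std::string sequence;
  std::size_t run;
};

struct MergedResult
{
  IdRun run;
  std::vector<PeptideId> peptides;
};

// Merges all runs into one: accessions are deduplicated (first occurrence
// wins), while peptide IDs are all kept and re-pointed to run 0.
MergedResult mergeAll(const std::vector<IdRun>& runs,
                      const std::vector<PeptideId>& peptides)
{
  MergedResult out;
  std::unordered_set<std::string> seen;
  for (const IdRun& r : runs)
  {
    for (const std::string& acc : r.accessions)
    {
      if (seen.insert(acc).second) out.run.accessions.push_back(acc);
    }
  }
  out.peptides = peptides; // no filtering: duplicates across runs remain
  for (PeptideId& p : out.peptides) p.run = 0;
  return out;
}
```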

◆ mergeProteinIDRuns()

void mergeProteinIDRuns ( ConsensusMap & cmap,
const std::map< unsigned, unsigned > & mapIdx_to_new_protIDRun
) const

Takes a ConsensusMap and a mapping from the ConsensusMap column index (the sub map index stored in the map_index meta value) to the new ProteinIdentification run index, and merges the runs according to this mapping.

Exceptions
MissingInformationException for e.g. missing map_indices in PeptideIDs
Todo:
Do we need to consider the old IDRun identifier in addition to the sub map index?
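
To make the role of mapIdx_to_new_protIDRun concrete, the sketch below resolves a list of sub map indices (as they would be read from the map_index meta value of each peptide ID) to new run indices. The function assignRuns is hypothetical and works on plain integers rather than a ConsensusMap. It throws std::out_of_range for an unmapped index, which only stands in for the MissingInformation exception documented above.

```cpp
#include <map>
#include <stdexcept>
#include <vector>

// For each peptide ID, represented only by its sub map index, look up the
// index of the merged run it should be assigned to.
std::vector<unsigned> assignRuns(
    const std::vector<unsigned>& map_indices,
    const std::map<unsigned, unsigned>& mapIdx_to_new_protIDRun)
{
  std::vector<unsigned> new_runs;
  new_runs.reserve(map_indices.size());
  for (unsigned idx : map_indices)
  {
    auto it = mapIdx_to_new_protIDRun.find(idx);
    if (it == mapIdx_to_new_protIDRun.end())
    {
      // Stand-in for the documented exception on missing information.
      throw std::out_of_range("no target run for this map index");
    }
    new_runs.push_back(it->second);
  }
  return new_runs;
}
```

With the mapping {0 -> 0, 1 -> 0, 2 -> 1}, columns 0 and 1 would be merged into run 0 and column 2 would form run 1.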

◆ mergeProteinsAcrossFractionsAndReplicates()

void mergeProteinsAcrossFractionsAndReplicates ( ConsensusMap & cmap,
const ExperimentalDesign & exp_design
) const

Takes a ConsensusMap (with usually one IdentificationRun per column [= sub map]) and merges the runs into one IdentificationRun per condition (unique entries in the Sample section when removing replicate columns), while reassociating the PeptideHits accordingly. It does not make sense to have every protein duplicated, and IdentificationRuns are used to guide inference methods on which identifications to perform inference on.

Note
Constructs the mapping based on the experimental design and then uses mergeProteinIDRuns.
Groups are not carried over during merging.
Exceptions
MissingInformationException for e.g. missing map_indices in PeptideIDs
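
The note above says that a column-to-run mapping is built from the experimental design and then handed to mergeProteinIDRuns. A minimal sketch of such a construction is shown below. It assumes the design has already been reduced to one condition label per ConsensusMap column (with replicate and fraction information removed). The function buildRunMapping and the string labels are hypothetical and do not use the real ExperimentalDesign class.

```cpp
#include <map>
#include <string>
#include <vector>

// Assigns one new run index per distinct condition, in order of first
// appearance, and maps every column index to the run of its condition.
std::map<unsigned, unsigned> buildRunMapping(
    const std::vector<std::string>& condition_per_column)
{
  std::map<std::string, unsigned> run_of_condition;
  std::map<unsigned, unsigned> mapping;
  for (unsigned col = 0; col < condition_per_column.size(); ++col)
  {
    // emplace only inserts if the condition is new; the candidate run index
    // is the number of conditions seen so far.
    auto inserted = run_of_condition.emplace(
        condition_per_column[col],
        static_cast<unsigned>(run_of_condition.size()));
    mapping[col] = inserted.first->second;
  }
  return mapping;
}
```

The result has the same shape as the mapIdx_to_new_protIDRun argument of mergeProteinIDRuns, so replicates and fractions of one condition end up sharing a single run index.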