OpenMS
MapAlignmentAlgorithmIdentification Class Reference

A map alignment algorithm based on peptide identifications from MS2 spectra.

#include <OpenMS/ANALYSIS/MAPMATCHING/MapAlignmentAlgorithmIdentification.h>


Public Member Functions

 MapAlignmentAlgorithmIdentification ()
 Default constructor.

 ~MapAlignmentAlgorithmIdentification () override
 Destructor.

template<typename DataType >
void setReference (DataType &data)

template<typename DataType >
void align (std::vector< DataType > &data, std::vector< TransformationDescription > &transformations, Int reference_index=-1)
 Align feature maps, consensus maps, peak maps, or peptide identifications.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void setLogger (ProgressLoggerImpl *logger)
 Sets the logger to be used for progress logging.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress (UInt64 bytes_processed=0) const

void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Protected Types

typedef std::map< String, DoubleList > SeqToList
 Type to store retention times given for individual peptide sequences.

typedef std::map< String, double > SeqToValue
 Type to store one representative retention time per peptide sequence.
 

Protected Member Functions

void computeMedians_ (SeqToList &rt_data, SeqToValue &medians, bool sorted=false)
 Compute the median retention time for each peptide sequence.

bool getRetentionTimes_ (std::vector< PeptideIdentification > &peptides, SeqToList &rt_data)
 Collect retention time data from peptide IDs.

bool getRetentionTimes_ (IdentificationData &id_data, SeqToList &rt_data)
 Collect retention time data from spectrum matches.

bool getRetentionTimes_ (PeakMap &experiment, SeqToList &rt_data)
 Collect retention time data from peptide IDs annotated to spectra.

template<typename MapType >
bool getRetentionTimes_ (MapType &features, SeqToList &rt_data)
 Collect retention time data from peptide IDs contained in feature maps or consensus maps.

void computeTransformations_ (std::vector< SeqToList > &rt_data, std::vector< TransformationDescription > &transforms, bool sorted=false)
 Compute retention time transformations from RT data grouped by peptide sequence.

void checkParameters_ (const Size runs)
 Check that parameter values are valid.

void getReference_ ()
 Get reference retention times.

IdentificationData::ScoreTypeRef handleIdDataScoreType_ (const IdentificationData &id_data)
 Helper function to find/define the score type for processing IdentificationData.

- Protected Member Functions inherited from DefaultParamHandler
virtual void updateMembers_ ()
 This method is used to update extra member variables at the end of the setParameters() method.

void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 

Protected Attributes

Int reference_index_
 Index of input file to use as reference (if any)

SeqToValue reference_
 Reference retention times (per peptide sequence)

Size min_run_occur_
 Minimum number of runs a peptide must occur in.

bool use_feature_rt_ {}
 Use feature RT instead of RT from best peptide ID in the feature?

bool use_adducts_ {}
 Consider differently adducted IDs as different?

double min_score_
 Minimum score to reach for a peptide to be considered.

bool score_cutoff_ {}
 Actually apply the score cut-off defined above (min_score_)? Needed since it is hard for a user to define a score that filters nothing.

String score_type_
 Score type to use for filtering.

bool(* better_ )(double, double) = [](double, double) {return true;}
 Comparison function for scores: is one score better than another? The default accepts every pair.
 
- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no checking of parameters is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_
 

Private Member Functions

 MapAlignmentAlgorithmIdentification (const MapAlignmentAlgorithmIdentification &)
 Copy constructor intentionally not implemented -> private.

MapAlignmentAlgorithmIdentification & operator= (const MapAlignmentAlgorithmIdentification &)
 Assignment operator intentionally not implemented -> private.
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum LogType { CMD , GUI , NONE }
 Possible log types.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.
 
- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

A map alignment algorithm based on peptide identifications from MS2 spectra.

PeptideIdentification instances are grouped by sequence of the respective best-scoring PeptideHit and retention time data is collected (PeptideIdentification::getRT()). ID groups with the same sequence in different maps represent points of correspondence between the maps and form the basis of the alignment. Only the best PSM per spectrum is considered as the correct identification.
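The grouping step can be pictured with a small standalone sketch. It does not use the OpenMS API: `Hit`, `Identification` and `groupBySequence` are hypothetical stand-ins for PeptideHit, PeptideIdentification and the internal collection logic, and it assumes that a higher score is better.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for OpenMS::PeptideHit and OpenMS::PeptideIdentification.
struct Hit { std::string sequence; double score; };
struct Identification { double rt; std::vector<Hit> hits; };

using SeqToList = std::map<std::string, std::vector<double>>;

// Group retention times by the sequence of each ID's best-scoring hit.
// Assumption: higher scores are better. IDs without hits are skipped.
SeqToList groupBySequence(const std::vector<Identification>& ids)
{
  SeqToList rt_data;
  for (const Identification& id : ids)
  {
    if (id.hits.empty()) continue;
    auto best = std::max_element(id.hits.begin(), id.hits.end(),
      [](const Hit& a, const Hit& b) { return a.score < b.score; });
    assert(best != id.hits.end()); // holds because the hit list is non-empty
    rt_data[best->sequence].push_back(id.rt);
  }
  return rt_data;
}
```

Sequences that appear as keys in the results of several maps are the points of correspondence described above.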

Each map is aligned to a reference retention time scale. This time scale can either come from a reference file (reference parameter) or be computed as a consensus of the input maps (median retention times over all maps of the ID groups). The maps are then aligned to this scale as follows:
The median retention time of each ID group in a map is mapped to the reference retention time of this group. Cubic spline smoothing is used to convert this mapping to a smooth function. Retention times in the map are transformed to the consensus scale by applying this function.
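As a rough illustration of the consensus case, the sketch below derives a reference scale from per-map medians and pairs each map's medians with it. It is standalone C++ rather than OpenMS code: `median`, `consensusReference` and `anchorPoints` are hypothetical helper names, the reference is assumed to be the median of the per-map medians, and the spline smoothing step is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using SeqToValue = std::map<std::string, double>;

// Median of a non-empty list (sorts a copy; averages the two middle values for even sizes).
double median(std::vector<double> values)
{
  assert(!values.empty());
  std::sort(values.begin(), values.end());
  const std::size_t n = values.size();
  return (n % 2 == 1) ? values[n / 2] : 0.5 * (values[n / 2 - 1] + values[n / 2]);
}

// Consensus scale (assumed): for each sequence, the median of its per-map median RTs.
SeqToValue consensusReference(const std::vector<SeqToValue>& map_medians)
{
  std::map<std::string, std::vector<double>> collected;
  for (const SeqToValue& one_map : map_medians)
  {
    for (const auto& entry : one_map) collected[entry.first].push_back(entry.second);
  }
  SeqToValue reference;
  for (const auto& entry : collected) reference[entry.first] = median(entry.second);
  return reference;
}

// Anchor points (map RT, reference RT) for one map; these would feed the smoothing model.
std::vector<std::pair<double, double>> anchorPoints(const SeqToValue& map_median,
                                                    const SeqToValue& reference)
{
  std::vector<std::pair<double, double>> points;
  for (const auto& entry : map_median)
  {
    auto ref = reference.find(entry.first);
    if (ref != reference.end()) points.emplace_back(entry.second, ref->second);
  }
  return points;
}
```

Each map gets its own list of anchor points, so each map ends up with its own transformation.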

Parameters of this class are:

| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| score_type | string | | | Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically. |
| score_cutoff | string | false | true, false | Use only IDs above a score cut-off (parameter 'min_score') for alignment? |
| min_score | float | 0.05 | | If 'score_cutoff' is 'true': Minimum score for an ID to be considered. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
| min_run_occur | int | 2 | min: 2 | Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
| max_rt_shift | float | 0.5 | min: 0.0 | Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale. |
| use_unassigned_peptides | string | true | true, false | Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used. |
| use_feature_rt | string | false | true, false | When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'. |
| use_adducts | string | true | true, false | If IDs contain adducts, treat differently adducted variants of the same molecule as different. |

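The three interpretations of 'max_rt_shift' can be spelled out in a few lines. This is a sketch of the documented rule only, not the library's code; `resolveMaxRtShift` and its arguments are hypothetical names.

```cpp
#include <cassert>
#include <limits>

// Turn a 'max_rt_shift' setting into an absolute tolerance in seconds.
//   0      -> no limit (filter disabled)
//   > 1    -> the value itself, in seconds
//   <= 1   -> fraction of the range of the reference RT scale
double resolveMaxRtShift(double max_rt_shift, double ref_rt_min, double ref_rt_max)
{
  assert(max_rt_shift >= 0.0 && ref_rt_max >= ref_rt_min);
  if (max_rt_shift == 0.0) return std::numeric_limits<double>::infinity();
  if (max_rt_shift > 1.0) return max_rt_shift;
  return max_rt_shift * (ref_rt_max - ref_rt_min);
}
```

With the default of 0.5 and a reference scale spanning 100 s to 3100 s, peptides whose median RT deviates from the reference by more than 1500 s would be treated as outliers.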

Member Typedef Documentation

◆ SeqToList

typedef std::map<String, DoubleList> SeqToList
protected

Type to store retention times given for individual peptide sequences.

◆ SeqToValue

typedef std::map<String, double> SeqToValue
protected

Type to store one representative retention time per peptide sequence.

Constructor & Destructor Documentation

◆ MapAlignmentAlgorithmIdentification() [1/2]

Default constructor.

◆ ~MapAlignmentAlgorithmIdentification()

Destructor.

◆ MapAlignmentAlgorithmIdentification() [2/2]

Copy constructor intentionally not implemented -> private.

Member Function Documentation

◆ align()

void align ( std::vector< DataType > &  data,
std::vector< TransformationDescription > &  transformations,
Int  reference_index = -1 
)
inline

Align feature maps, consensus maps, peak maps, or peptide identifications.

Parameters
  data: Vector of input data (FeatureMap, ConsensusMap, PeakMap or vector<PeptideIdentification>) that should be aligned.
  transformations: Vector of RT transformations that will be computed.
  reference_index: Index in data of the reference to align to, if any.
Exceptions
  Exception::MissingInformation: Not enough suitable RT data to perform alignment.

◆ checkParameters_()

void checkParameters_ ( const Size  runs)
protected

Check that parameter values are valid.

Currently only 'min_run_occur' is checked.

Parameters
  runs: Number of runs (input files) to be aligned.

◆ computeMedians_()

void computeMedians_ ( SeqToList &  rt_data,
SeqToValue &  medians,
bool  sorted = false 
)
protected

Compute the median retention time for each peptide sequence.

Parameters
  rt_data: Lists of RT values for different peptide sequences (input, will be sorted).
  medians: Median RT values for the peptide sequences (output).
  sorted: Are RT lists already sorted?
Exceptions
  Exception::IllegalArgument: If the input list is empty.
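The contract described here (sort unless told otherwise, one median per sequence, error on an empty list) can be sketched in standalone C++. This is not the OpenMS implementation: `std::invalid_argument` stands in for Exception::IllegalArgument, and clearing the output map and averaging the two middle values for even-sized lists are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using SeqToList = std::map<std::string, std::vector<double>>;
using SeqToValue = std::map<std::string, double>;

// Compute one median RT per sequence. Lists are sorted in place unless 'sorted' is true.
void computeMedians(SeqToList& rt_data, SeqToValue& medians, bool sorted = false)
{
  medians.clear(); // assumption: previous output is discarded
  for (auto& entry : rt_data)
  {
    std::vector<double>& rts = entry.second;
    if (rts.empty()) throw std::invalid_argument("empty RT list for " + entry.first);
    if (!sorted) std::sort(rts.begin(), rts.end());
    else assert(std::is_sorted(rts.begin(), rts.end())); // caller promised sorted input
    const std::size_t n = rts.size();
    medians[entry.first] = (n % 2 == 1) ? rts[n / 2] : 0.5 * (rts[n / 2 - 1] + rts[n / 2]);
  }
}
```

Passing `sorted = true` only skips the sort; it is worth doing when the collecting function already reports sorted lists.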

◆ computeTransformations_()

void computeTransformations_ ( std::vector< SeqToList > &  rt_data,
std::vector< TransformationDescription > &  transforms,
bool  sorted = false 
)
protected

Compute retention time transformations from RT data grouped by peptide sequence.

Parameters
  rt_data: Lists of RT values for different peptide sequences, per dataset (input, will be sorted).
  transforms: Resulting transformations, per dataset (output).
  sorted: Are RT lists already sorted?

◆ getReference_()

void getReference_ ( )
protected

Get reference retention times.

If a reference file is supplied via the reference parameter, extract retention time information and store it in reference_.

◆ getRetentionTimes_() [1/4]

bool getRetentionTimes_ ( IdentificationData &  id_data,
SeqToList &  rt_data
)
protected

Collect retention time data from spectrum matches.

Parameters
  id_data: Input identification data.
  rt_data: Lists of RT values for different spectrum matches (output).
Returns
Are the RTs already sorted? (Here: false)

◆ getRetentionTimes_() [2/4]

bool getRetentionTimes_ ( MapType &  features,
SeqToList &  rt_data
)
inline protected

Collect retention time data from peptide IDs contained in feature maps or consensus maps.

The following global flags (mutually exclusive) influence the processing:
Depending on use_unassigned_peptides, unassigned peptide IDs are used in addition to IDs annotated to features.
Depending on use_feature_rt, feature retention times are used instead of peptide retention times.
Depending on score_cutoff and min_score, only peptide IDs that reach the minimum score are used. Whether a higher score is better is determined from the first peptide ID encountered, so make sure the score orientation is the same for all IDs. The score cut-off currently has no effect in combination with use_feature_rt.

Parameters
  features: Input features for RT data.
  rt_data: Lists of RT values for different peptide sequences (output).
Returns
Are the RTs already sorted? (Here: true)

References MSExperiment::begin(), and MSExperiment::end().

◆ getRetentionTimes_() [3/4]

bool getRetentionTimes_ ( PeakMap &  experiment,
SeqToList &  rt_data
)
protected

Collect retention time data from peptide IDs annotated to spectra.

Parameters
  experiment: Input map for RT data.
  rt_data: Lists of RT values for different peptide sequences (output).
Returns
Are the RTs already sorted? (Here: false)

◆ getRetentionTimes_() [4/4]

bool getRetentionTimes_ ( std::vector< PeptideIdentification > &  peptides,
SeqToList &  rt_data
)
protected

Collect retention time data from peptide IDs.

Parameters
  peptides: Input peptide IDs (lists of peptide hits will be sorted).
  rt_data: Lists of RT values for different peptide sequences (output).
Returns
Are the RTs already sorted? (Here: false)

◆ handleIdDataScoreType_()

IdentificationData::ScoreTypeRef handleIdDataScoreType_ ( const IdentificationData &  id_data)
protected

Helper function to find/define the score type for processing IdentificationData.

Returns
Reference to the score type denoted by algorithm parameter "score_type"

◆ operator=()

Assignment operator intentionally not implemented -> private.

◆ setReference()

void setReference ( DataType &  data)
inline

Member Data Documentation

◆ better_

bool(* better_) (double, double) = [](double, double) {return true;}
protected

Comparison function for scores: is one score better than another? The default accepts every pair.
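A function pointer of this shape lets the score orientation be chosen once and applied uniformly. The sketch below shows one way such a pointer could be used together with a cut-off; it is an assumption-laden illustration, not the class's code. `ScoreCompare`, `chooseComparison` and `passesCutoff` are hypothetical names, and letting ties pass the cut-off is an assumption.

```cpp
#include <cassert>

// Same shape as the better_ member: a plain function pointer comparing two scores.
using ScoreCompare = bool (*)(double, double);

// Pick a comparison depending on score orientation (captureless lambdas convert to pointers).
ScoreCompare chooseComparison(bool higher_score_better)
{
  if (higher_score_better) return [](double score, double cutoff) { return score >= cutoff; };
  return [](double score, double cutoff) { return score <= cutoff; };
}

// Apply the cut-off only if it is enabled; otherwise every score passes.
bool passesCutoff(double score, double min_score, bool use_cutoff, ScoreCompare better)
{
  assert(better != nullptr);
  return !use_cutoff || better(score, min_score);
}
```

A default that always returns true, as in the member's initializer, makes every ID pass until a real comparison is installed.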

◆ min_run_occur_

Size min_run_occur_
protected

Minimum number of runs a peptide must occur in.
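The effect of this threshold can be sketched as a filter over per-run RT data. This is standalone C++ under stated assumptions, not the OpenMS implementation: `sequencesInEnoughRuns` is a hypothetical helper, and a reference run (which the parameter description counts as well) is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using SeqToList = std::map<std::string, std::vector<double>>;

// Return the sequences that have RT data in at least 'min_run_occur' runs.
std::set<std::string> sequencesInEnoughRuns(const std::vector<SeqToList>& runs,
                                            std::size_t min_run_occur)
{
  assert(min_run_occur >= 2); // mirrors the documented restriction "min: 2"
  std::map<std::string, std::size_t> run_counts;
  for (const SeqToList& run : runs)
  {
    for (const auto& entry : run)
    {
      if (!entry.second.empty()) ++run_counts[entry.first];
    }
  }
  std::set<std::string> kept;
  for (const auto& entry : run_counts)
  {
    if (entry.second >= min_run_occur) kept.insert(entry.first);
  }
  return kept;
}
```

Raising the threshold shrinks the set of anchor peptides to those observed most consistently across runs.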

◆ min_score_

double min_score_
protected

Minimum score to reach for a peptide to be considered.

◆ reference_

SeqToValue reference_
protected

Reference retention times (per peptide sequence)

◆ reference_index_

Int reference_index_
protected

Index of input file to use as reference (if any)

◆ score_cutoff_

bool score_cutoff_ {}
protected

Actually apply the score cut-off defined by min_score_? Needed since it is hard for a user to define a score that filters nothing.

◆ score_type_

String score_type_
protected

Score type to use for filtering.

◆ use_adducts_

bool use_adducts_ {}
protected

Consider differently adducted IDs as different?

◆ use_feature_rt_

bool use_feature_rt_ {}
protected

Use feature RT instead of RT from best peptide ID in the feature?