OpenMS
FeatureFindingMetabo Class Reference

Method for the assembly of mass traces belonging to the same isotope pattern, i.e., that are compatible in retention times, mass-to-charge ratios, and isotope abundances.

#include <OpenMS/FILTERING/DATAREDUCTION/FeatureFindingMetabo.h>


Public Member Functions

 FeatureFindingMetabo ()
 Default constructor.

 ~FeatureFindingMetabo () override
 Default destructor.

void run (std::vector< MassTrace > &input_mtraces, FeatureMap &output_featmap, std::vector< std::vector< OpenMS::MSChromatogram > > &output_chromatograms)
 Main method of FeatureFindingMetabo.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 
- Public Member Functions inherited from ProgressLogger
 ProgressLogger ()
 Constructor.

virtual ~ProgressLogger ()
 Destructor.

 ProgressLogger (const ProgressLogger &other)
 Copy constructor.

ProgressLogger & operator= (const ProgressLogger &other)
 Assignment operator.

void setLogType (LogType type) const
 Sets the progress log that should be used. The default type is NONE!

LogType getLogType () const
 Returns the type of progress log being used.

void startProgress (SignedSize begin, SignedSize end, const String &label) const
 Initializes the progress display.

void setProgress (SignedSize value) const
 Sets the current progress.

void endProgress (UInt64 bytes_processed=0) const

void nextProgress () const
 Increments the progress by 1 (according to the range begin-end).
 

Protected Member Functions

void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.

- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 

Private Member Functions

std::vector< const Element * > elementsFromString_ (const std::string &elements_string) const
 Parses a string of element symbols into a vector of Elements.

Range getTheoreticIsotopicMassWindow_ (const std::vector< Element const * > &alphabet, int peakOffset) const

double computeCosineSim_ (const std::vector< double > &, const std::vector< double > &) const
 Computes the cosine similarity between two vectors.

int isLegalIsotopePattern_ (const FeatureHypothesis &feat_hypo) const
 Compare intensities of feature hypothesis with model.

void loadIsotopeModel_ (const String &)

double scoreMZ_ (const MassTrace &, const MassTrace &, Size isotopic_position, Size charge, Range isotope_window) const
 Perform mass-to-charge scoring of two mass traces.

double scoreMZByExpectedMean_ (Size iso_pos, Size charge, const double diff_mz, double mt_variances) const
 Score isotope m/z distance based on the expected m/z distances using the C13-C12 or Kenar method.

double scoreMZByExpectedRange_ (Size charge, const double diff_mz, double mt_variances, Range isotope_window) const
 Score isotope m/z distance based on an expected isotope window calculated from a set of expected elements.

double scoreRT_ (const MassTrace &, const MassTrace &) const
 Perform retention time scoring of two mass traces.

double computeAveragineSimScore_ (const std::vector< double > &intensities, const double &molecular_weight) const
 Perform intensity scoring using the averagine model (for peptides only).

void findLocalFeatures_ (const std::vector< const MassTrace * > &candidates, double total_intensity, std::vector< FeatureHypothesis > &output_hypotheses) const
 Identify groupings of mass traces based on a set of reasonable candidates.
 

Private Attributes

svm_model * isotope_filt_svm_ = nullptr
 SVM parameters.

std::vector< double > svm_feat_centers_

std::vector< double > svm_feat_scales_

double local_rt_range_
 Parameter values.
 
double local_mz_range_
 
Size charge_lower_bound_
 
Size charge_upper_bound_
 
double chrom_fwhm_
 
bool report_summed_ints_
 
bool enable_RT_filtering_
 
String isotope_filtering_model_
 
bool use_smoothed_intensities_
 
bool use_mz_scoring_C13_
 
bool use_mz_scoring_by_element_range_
 
bool report_convex_hulls_
 
bool report_chromatograms_
 
bool remove_single_traces_
 
std::vector< const Element * > elements_
 

Additional Inherited Members

- Public Types inherited from ProgressLogger
enum  LogType { CMD , GUI , NONE }
 Possible log types.

- Static Public Member Functions inherited from DefaultParamHandler
static void writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
 Writes all parameters to meta values.

- Static Protected Member Functions inherited from ProgressLogger
static String logTypeToFactoryName_ (LogType type)
 Return the name of the factory product used for this log type.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no parameter checking is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.

- Protected Attributes inherited from ProgressLogger
LogType type_

time_t last_invoke_

ProgressLoggerImpl * current_logger_

- Static Protected Attributes inherited from ProgressLogger
static int recursion_depth_
 

Detailed Description

Method for the assembly of mass traces belonging to the same isotope pattern, i.e., that are compatible in retention times, mass-to-charge ratios, and isotope abundances.

In FeatureFindingMetabo, mass traces detected by the MassTraceDetection method and afterwards split into individual chromatographic peaks by the ElutionPeakDetection method are assembled into composite features if they are compatible with respect to RTs, m/z ratios, and isotopic intensities. To this end, feature hypotheses are formulated exhaustively based on the set of mass traces detected within a local RT and m/z region. These feature hypotheses are scored by their similarity to real metabolite isotope patterns. The score is derived from independent models for retention time shifts and m/z differences between isotopic mass traces. Hypotheses with correct or false isotopic abundances are distinguished by an SVM model. Mass traces that could not be assembled, and low-intensity metabolites for which only the monoisotopic mass trace is observed, are left in the resulting FeatureMap as singletons with the undefined charge state of 0.

Reference: Kenar et al., doi: 10.1074/mcp.M113.031278

Parameters of this class are:

| Name | Type | Default | Restrictions | Description |
|---|---|---|---|---|
| local_rt_range | float | 10.0 | | RT range in which to look for coeluting mass traces |
| local_mz_range | float | 6.5 | | m/z range in which to look for isotopic mass traces |
| charge_lower_bound | int | 1 | | Lowest charge state to consider |
| charge_upper_bound | int | 3 | | Highest charge state to consider |
| chrom_fwhm | float | 5.0 | | Expected chromatographic peak width (in seconds). |
| report_summed_ints | string | false | false, true | Set to true for a feature intensity summed up over all traces rather than using monoisotopic trace intensity alone. |
| enable_RT_filtering | string | true | false, true | Require sufficient overlap in RT while assembling mass traces. Disable for direct injection data. |
| isotope_filtering_model | string | metabolites (5% RMS) | metabolites (2% RMS), metabolites (5% RMS), peptides, none | Remove/score candidate assemblies based on isotope intensities. SVM isotope models for metabolites were trained with either 2% or 5% RMS error. For peptides, an averagine cosine scoring is used. Select the appropriate noise model according to the quality of measurement or MS device. |
| mz_scoring_13C | string | false | false, true | Use the 13C isotope peak position (~1.003355 Da) as the expected shift in m/z for isotope mass traces (highly recommended for lipidomics!). Disable for general metabolites (as described in Kenar et al. 2014, MCP). |
| use_smoothed_intensities | string | true | false, true | Use LOWESS intensities instead of raw intensities. |
| report_convex_hulls | string | false | false, true | Augment each reported feature with the convex hull of the underlying mass traces (increases featureXML file size considerably). |
| report_chromatograms | string | false | false, true | Adds a chromatogram for each reported feature (output in mzML). |
| remove_single_traces | string | false | false, true | Remove unassembled traces (single traces). |
| mz_scoring_by_elements | string | false | false, true | Use the m/z range of the assumed elements to detect isotope peaks. An expected m/z range is computed from the isotopes of the assumed elements. If enabled, this ignores 'mz_scoring_13C'. |
| elements | string | CHNOPS | | Elements assumed to be present in the sample (this influences isotope detection). |


Constructor & Destructor Documentation

◆ FeatureFindingMetabo()

Default constructor.

◆ ~FeatureFindingMetabo()

~FeatureFindingMetabo ( )
override

Default destructor.

Member Function Documentation

◆ computeAveragineSimScore_()

double computeAveragineSimScore_ ( const std::vector< double > &  intensities,
const double &  molecular_weight 
) const
private

Perform intensity scoring using the averagine model (for peptides only)

Compare the isotopic intensity distribution with the theoretical one expected for peptides under the averagine model, by computing the cosine similarity between the two distributions.

◆ computeCosineSim_()

double computeCosineSim_ ( const std::vector< double > &  ,
const std::vector< double > &   
) const
private

Computes the cosine similarity between two vectors.

The cosine similarity is the cosine of the angle between two vectors, i.e., the normalized dot product of the two vectors.

See also https://en.wikipedia.org/wiki/Cosine_similarity
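As an illustration of the definition above, here is a minimal standalone sketch. The function name `cosineSimilarity` and the choice to return 0.0 for a zero-length vector are assumptions of this example; it is not the OpenMS implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0.0 if either vector has zero norm (a convention chosen for this sketch).
double cosineSimilarity(const std::vector<double>& a, const std::vector<double>& b)
{
  if (a.size() != b.size())
  {
    throw std::invalid_argument("vectors must have the same size");
  }
  double dot = 0.0;
  double norm_a = 0.0;
  double norm_b = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  if (norm_a == 0.0 || norm_b == 0.0)
  {
    return 0.0;
  }
  return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```

Two intensity profiles that differ only by a constant factor give a similarity of 1, which is why the measure suits comparing peak shapes of traces with different abundances.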

◆ elementsFromString_()

std::vector<const Element*> elementsFromString_ ( const std::string &  elements_string) const
private

Parses a string of element symbols into a vector of Elements.

Parameters
elements_string: string of element symbols without whitespace or commas, e.g. CHNOPSCl
Returns
vector of Elements
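The input format can be illustrated with a small standalone tokenizer. This sketch only splits the string into symbols and returns them as strings; the lookup of `Element` objects is left out, and the name `splitElementSymbols` is hypothetical. It assumes each symbol is one uppercase letter followed by zero or more lowercase letters.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Splits a concatenated element string, e.g. "CHNOPSCl",
// into individual symbols: C, H, N, O, P, S, Cl.
std::vector<std::string> splitElementSymbols(const std::string& elements_string)
{
  std::vector<std::string> symbols;
  for (char c : elements_string)
  {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isupper(uc))
    {
      symbols.push_back(std::string(1, c)); // an uppercase letter starts a new symbol
    }
    else if (std::islower(uc) && !symbols.empty())
    {
      symbols.back().push_back(c); // a lowercase letter extends the current symbol
    }
    else
    {
      throw std::invalid_argument("unexpected character in element string");
    }
  }
  return symbols;
}
```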

◆ findLocalFeatures_()

void findLocalFeatures_ ( const std::vector< const MassTrace * > &  candidates,
double  total_intensity,
std::vector< FeatureHypothesis > &  output_hypotheses 
) const
private

Identify groupings of mass traces based on a set of reasonable candidates.

Takes a set of reasonable candidates for mass trace grouping and checks all combinations of charge and isotopic positions on the candidates. It is assumed that candidates[0] is the monoisotopic trace.

The resulting possible groupings are appended to output_hypotheses.
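The enumeration over charges and isotopic positions can be sketched as follows. This is a simplified illustration, not the OpenMS code: it works on bare m/z values, accepts the first candidate within a fixed tolerance, and omits the RT, m/z, and intensity scoring that the class applies. The names `Hypothesis`, `enumerateHypotheses`, `isotope_spacing`, and `mz_tol` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Hypothesis
{
  int charge = 0;
  std::vector<std::size_t> trace_indices; // index 0 is the monoisotopic trace
};

// candidate_mz[0] is assumed to be the monoisotopic m/z. For every charge state,
// walk isotopic positions 1, 2, ... and take the first candidate that lies within
// mz_tol of the expected position; stop at the first missing position.
std::vector<Hypothesis> enumerateHypotheses(const std::vector<double>& candidate_mz,
                                            int charge_lower, int charge_upper,
                                            double isotope_spacing, double mz_tol)
{
  std::vector<Hypothesis> result;
  if (candidate_mz.empty())
  {
    return result;
  }
  for (int z = std::max(1, charge_lower); z <= charge_upper; ++z)
  {
    Hypothesis h;
    h.charge = z;
    h.trace_indices.push_back(0);
    for (int pos = 1;; ++pos)
    {
      const double expected = candidate_mz[0] + pos * isotope_spacing / z;
      bool found = false;
      for (std::size_t i = 1; i < candidate_mz.size(); ++i)
      {
        if (std::fabs(candidate_mz[i] - expected) <= mz_tol)
        {
          h.trace_indices.push_back(i);
          found = true;
          break;
        }
      }
      if (!found)
      {
        break;
      }
    }
    if (h.trace_indices.size() > 1)
    {
      result.push_back(h); // keep only hypotheses with at least one isotopic trace
    }
  }
  return result;
}
```

With candidates at m/z 300.0, 300.50168, and 301.00336 and a spacing of 1.003355, the sketch yields one hypothesis for charge 1 (two traces) and one for charge 2 (three traces); choosing between such competing hypotheses is what the scoring step is for.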

◆ getTheoreticIsotopicMassWindow_()

Range getTheoreticIsotopicMassWindow_ ( const std::vector< Element const * > &  alphabet,
int  peakOffset 
) const
private

Calculate the maximal and minimal mass defects of isotopes for a given set of elements.

Parameters
alphabet: chemical alphabet (elements which are expected to be present)
peakOffset: integer distance between isotope peak and monoisotopic peak (minimum: 1)
Returns
an interval which should contain the isotopic peak. This interval is relative to the monoisotopic peak.

◆ isLegalIsotopePattern_()

int isLegalIsotopePattern_ ( const FeatureHypothesis &  feat_hypo) const
private

Compare intensities of feature hypothesis with model.

Use a pre-trained SVM model to evaluate the intensity distribution of a given feature hypothesis. The model is trained on the monoisotopic and the first three isotopic traces of each feature and uses the scaled ratios between the traces as input.

Reference: Kenar et al., doi: 10.1074/mcp.M113.031278

Parameters
feat_hypo: a feature hypothesis containing mass traces
Returns
0 for 'no'; 1 for 'yes'; -1 if only a single mass trace exists

◆ loadIsotopeModel_()

void loadIsotopeModel_ ( const String & )
private

◆ run()

void run ( std::vector< MassTrace > &  input_mtraces,
FeatureMap &  output_featmap,
std::vector< std::vector< OpenMS::MSChromatogram > > &  output_chromatograms 
)

Main method of FeatureFindingMetabo.

◆ scoreMZ_()

double scoreMZ_ ( const MassTrace & ,
const MassTrace & ,
Size  isotopic_position,
Size  charge,
Range  isotope_window 
) const
private

Perform mass-to-charge scoring of two mass traces.

Scores two mass traces based on their m/z and the hypothesis that one trace is an isotopic trace of the other. The isotopic position (which trace it is) and the charge for the hypothesis are given as additional parameters. The scoring is described in Kenar et al. and is based on a random sample of 115 000 compounds drawn from a comprehensive set of 24 million putative sum formulas, whose isotopic distributions were accurately calculated. From this sample, a theoretical mu and sigma are calculated as:

mu = 1.000857 * j + 0.001091 u
sigma = 0.0016633 * j - 0.0004751 u

where j is the isotopic peak considered. A similarity score based on agreement with the model is then computed.

Reference: Kenar et al., doi: 10.1074/mcp.M113.031278

An alternative scoring was added which tests whether the isotope m/z distances lie in an expected m/z window. This window is computed from a given set of elements.
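The model-based score can be sketched as a Gaussian-shaped similarity around the expected distance. The sketch takes mu = 1.000857 * j + 0.001091 and sigma = 0.0016633 * j - 0.0004751 (the linear form assumed here to match Kenar et al.), divides both by the charge, and cuts off at three standard deviations. The division by charge, the cutoff, the way the trace variance is combined, and all function names are assumptions of this example rather than a statement of the library's code.

```cpp
#include <cmath>

// Expected m/z distance of isotopic peak j (j >= 1) from the monoisotopic peak.
double expectedMzShift(int j, int charge)
{
  return (1.000857 * j + 0.001091) / charge;
}

// Expected spread of that distance (assumed linear model, see lead-in).
double expectedMzSigma(int j, int charge)
{
  return (0.0016633 * j - 0.0004751) / charge;
}

// Similarity in [0, 1]: 1 at the expected distance, 0 beyond 3 combined sigma.
// trace_variance is the m/z variance contributed by the two traces themselves.
double scoreMzDistance(int j, int charge, double diff_mz, double trace_variance)
{
  const double mu = expectedMzShift(j, charge);
  const double sd = expectedMzSigma(j, charge);
  const double sigma = std::sqrt(sd * sd + trace_variance);
  const double deviation = (diff_mz - mu) / sigma;
  if (std::fabs(deviation) >= 3.0)
  {
    return 0.0;
  }
  return std::exp(-0.5 * deviation * deviation);
}
```

For the first isotopic peak of a singly charged ion, the expected distance is about 1.001948, so an observed distance of 1.05 scores 0 while an exact match scores 1.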

◆ scoreMZByExpectedMean_()

double scoreMZByExpectedMean_ ( Size  iso_pos,
Size  charge,
const double  diff_mz,
double  mt_variances 
) const
private

Score isotope m/z distance based on the expected m/z distances using the C13-C12 or Kenar method.

Parameters
iso_pos
charge
diff_mz
mt_variances
Returns

◆ scoreMZByExpectedRange_()

double scoreMZByExpectedRange_ ( Size  charge,
const double  diff_mz,
double  mt_variances,
Range  isotope_window 
) const
private

Score isotope m/z distance based on an expected isotope window calculated from a set of expected elements.

Parameters
charge
diff_mz
mt_variances: m/z variance between the two mass traces being compared
isotope_window
Returns

◆ scoreRT_()

double scoreRT_ ( const MassTrace & ,
const MassTrace &  
) const
private

Perform retention time scoring of two mass traces.

Computes the similarity of the two peak shapes using cosine similarity (see computeCosineSim_) if certain conditions are fulfilled. Mainly, the overlap between the two peaks at FWHM needs to exceed a threshold. The threshold is set at 0.7 (i.e. 70 % overlap), as also described in Kenar et al.

Note
This only works for equally sampled mass traces, i.e. they need to come from the same map (not for SRM measurements, for example).

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ charge_lower_bound_

Size charge_lower_bound_
private

◆ charge_upper_bound_

Size charge_upper_bound_
private

◆ chrom_fwhm_

double chrom_fwhm_
private

◆ elements_

std::vector<const Element*> elements_
private

◆ enable_RT_filtering_

bool enable_RT_filtering_
private

◆ isotope_filt_svm_

svm_model* isotope_filt_svm_ = nullptr
private

SVM parameters.

◆ isotope_filtering_model_

String isotope_filtering_model_
private

◆ local_mz_range_

double local_mz_range_
private

◆ local_rt_range_

double local_rt_range_
private

Parameter values.

◆ remove_single_traces_

bool remove_single_traces_
private

◆ report_chromatograms_

bool report_chromatograms_
private

◆ report_convex_hulls_

bool report_convex_hulls_
private

◆ report_summed_ints_

bool report_summed_ints_
private

◆ svm_feat_centers_

std::vector<double> svm_feat_centers_
private

◆ svm_feat_scales_

std::vector<double> svm_feat_scales_
private

◆ use_mz_scoring_by_element_range_

bool use_mz_scoring_by_element_range_
private

◆ use_mz_scoring_C13_

bool use_mz_scoring_C13_
private

◆ use_smoothed_intensities_

bool use_smoothed_intensities_
private