OpenMS  2.5.0
EmgGradientDescent Class Reference

Compute the area, background and shape metrics of a peak.

#include <OpenMS/MATH/MISC/EmgGradientDescent.h>

Inheritance diagram for EmgGradientDescent:
DefaultParamHandler

Public Member Functions

 EmgGradientDescent ()
 Constructor.
 
 ~EmgGradientDescent ()=default
 Destructor.
 
void getDefaultParameters (Param &params)
 
template<typename PeakContainerT >
void fitEMGPeakModel (const PeakContainerT &input_peak, PeakContainerT &output_peak, const double left_pos=0.0, const double right_pos=0.0) const
 Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.
 
UInt estimateEmgParameters (const std::vector< double > &xs, const std::vector< double > &ys, double &best_h, double &best_mu, double &best_sigma, double &best_tau) const
 The implementation of the gradient descent algorithm for the EMG peak model.
 
void applyEstimatedParameters (const std::vector< double > &xs, const double h, const double mu, const double sigma, const double tau, std::vector< double > &out_xs, std::vector< double > &out_ys) const
 Compute the EMG function on a set of points.
 
- Public Member Functions inherited from DefaultParamHandler
 DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.
 
 DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.
 
virtual ~DefaultParamHandler ()
 Destructor.
 
virtual DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.
 
virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.
 
void setParameters (const Param &param)
 Sets the parameters.
 
const Param & getParameters () const
 Non-mutable access to the parameters.
 
const Param & getDefaults () const
 Non-mutable access to the default parameters.
 
const String & getName () const
 Non-mutable access to the name.
 
void setName (const String &name)
 Mutable access to the name.
 
const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 

Protected Member Functions

void updateMembers_ () override
 This method is used to update extra member variables at the end of the setParameters() method.
 
void extractTrainingSet (const std::vector< double > &xs, const std::vector< double > &ys, std::vector< double > &TrX, std::vector< double > &TrY) const
 Given a peak, extract a training set to be used with the gradient descent algorithm.
 
double computeMuMaxDistance (const std::vector< double > &xs) const
 Compute the boundary for the mean (`mu`) parameter in gradient descent.
 
double computeInitialMean (const std::vector< double > &xs, const std::vector< double > &ys) const
 Compute an estimation of the mean of a peak.
 
- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.
 

Private Member Functions

void iRpropPlus (const double prev_diff_E_param, double &diff_E_param, double &param_lr, double &param_update, double &param, const double current_E, const double previous_E) const
 Apply the iRprop+ algorithm for gradient descent.
 
double Loss_function (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
 Compute the cost given by loss function E.
 
double E_wrt_h (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
 Compute the cost given by the partial derivative of the loss function E, with respect to `h` (the amplitude).
 
double E_wrt_mu (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
 Compute the cost given by the partial derivative of the loss function E, with respect to `mu` (the mean).
 
double E_wrt_sigma (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
 Compute the cost given by the partial derivative of the loss function E, with respect to `sigma` (the standard deviation).
 
double E_wrt_tau (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const
 Compute the cost given by the partial derivative of the loss function E, with respect to `tau` (the exponent relaxation time).
 
double compute_z (const double x, const double mu, const double sigma, const double tau) const
 Compute EMG's z parameter.
 
double emg_point (const double x, const double h, const double mu, const double sigma, const double tau) const
 Compute the EMG function on a single point.
 

Private Attributes

const double PI = OpenMS::Constants::PI
 Alias for OpenMS::Constants::PI.
 
UInt print_debug_
 
UInt max_gd_iter_
 Maximum number of gradient descent iterations in `fitEMGPeakModel()`.
 
bool compute_additional_points_
 

Friends

class EmgGradientDescent_friend
 To test private and protected methods.
 

Additional Inherited Members

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.
 
Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!
 
std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!
 
String error_name_
 Name that is displayed in error messages during the parameter checking.
 
bool check_defaults_
 If this member is set to false, no checking of parameters is done.
 
bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when defaults are empty.
 

Detailed Description

Compute the area, background and shape metrics of a peak.

The area computation is performed in integratePeak() and it supports integration by simple sum of the intensity, integration by Simpson's rule implementations for an odd number of unequally spaced points or integration by the trapezoid rule.

The background computation is performed in estimateBackground() and it supports three different approaches to baseline correction, namely computing a rectangular shape under the peak based on the minimum value of the peak borders (vertical_division_min), a rectangular shape based on the maximum value of the peak borders (vertical_division_max) or a trapezoidal shape based on a straight line between the peak borders (base_to_base).

Peak shape metrics are computed in calculatePeakShapeMetrics() and multiple metrics are supported.

The containers supported by the methods are MSChromatogram and MSSpectrum.

Constructor & Destructor Documentation

◆ EmgGradientDescent()

Constructor.

◆ ~EmgGradientDescent()

~EmgGradientDescent ( )
default

Destructor.

Member Function Documentation

◆ applyEstimatedParameters()

void applyEstimatedParameters ( const std::vector< double > &  xs,
const double  h,
const double  mu,
const double  sigma,
const double  tau,
std::vector< double > &  out_xs,
std::vector< double > &  out_ys 
) const

Compute the EMG function on a set of points.

If class parameter `compute_additional_points` is `"true"`, the algorithm will detect which side of the peak is cut off and add points to it.

Parameters
[in]  xs  Positions
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
[out]  out_xs  The output positions
[out]  out_ys  The output intensities

Referenced by EmgGradientDescent_friend::applyEstimatedParameters().

◆ compute_z()

double compute_z ( const double  x,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute EMG's z parameter.

The value of z decides which formula is used to compute the EMG function. Values of z in each of the following ranges use a different EMG formula to avoid numerical instability and potential numerical overflow: (-inf, 0), [0, 6.71e7], (6.71e7, +inf).

Reference: Kalambet, Y.; Kozmin, Y.; Mikhailova, K.; Nagaev, I.; Tikhonov, P. (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics. 25 (7): 352.

Parameters
[in]  x  Position
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed parameter z

Referenced by EmgGradientDescent_friend::compute_z().
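
The range selection described above can be sketched as follows. This is a minimal illustration, not the OpenMS implementation: it assumes the definition z = (sigma / tau - (x - mu) / sigma) / sqrt(2) from the cited Kalambet et al. paper, and the names `EmgBranch`, `sketch_compute_z` and `sketch_select_branch` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical labels for the three z ranges listed above.
enum class EmgBranch { Negative, Moderate, Large };

// Assumed definition (Kalambet et al., 2011):
// z = (sigma / tau - (x - mu) / sigma) / sqrt(2)
double sketch_compute_z(double x, double mu, double sigma, double tau)
{
  assert(sigma > 0.0 && tau > 0.0); // the expression is undefined otherwise
  return (sigma / tau - (x - mu) / sigma) / std::sqrt(2.0);
}

// Map z to the range that decides which EMG formula is evaluated.
EmgBranch sketch_select_branch(double z)
{
  if (z < 0.0) return EmgBranch::Negative;     // (-inf, 0)
  if (z <= 6.71e7) return EmgBranch::Moderate; // [0, 6.71e7]
  return EmgBranch::Large;                     // (6.71e7, +inf)
}
```

With this definition, points far to the right of the mean (where the exponential tail dominates) give negative z, while a very small `tau` relative to `sigma` gives large positive z.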

◆ computeInitialMean()

double computeInitialMean ( const std::vector< double > &  xs,
const std::vector< double > &  ys 
) const
protected

Compute an estimation of the mean of a peak.

The method computes the middle point on different levels of intensity of the peak. The returned mean is the average of these middle points.

Exceptions
Exception::SizeUnderflow  if the input is empty
Parameters
[in]  xs  Positions
[in]  ys  Intensities
Returns
The peak's estimated mean

Referenced by EmgGradientDescent_friend::computeInitialMean().
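
One way to read the "middle points at different intensity levels" idea is sketched below. The chosen levels (60% to 90% of the apex) and the name `sketch_initial_mean` are assumptions made for illustration; the levels used by OpenMS are not stated here, and a standard exception stands in for Exception::SizeUnderflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: averages the peak's middle points taken at several
// fractions of the apex intensity. The fractions are illustrative only.
double sketch_initial_mean(const std::vector<double>& xs, const std::vector<double>& ys)
{
  if (xs.empty()) throw std::invalid_argument("input is empty");
  assert(xs.size() == ys.size());
  const double apex = *std::max_element(ys.begin(), ys.end());
  const std::vector<double> fractions{0.6, 0.7, 0.8, 0.9}; // assumed levels
  double sum = 0.0;
  for (const double f : fractions)
  {
    const double level = f * apex;
    std::size_t left = 0; // first point at or above the level
    while (left + 1 < ys.size() && ys[left] < level) ++left;
    std::size_t right = ys.size() - 1; // last point at or above the level
    while (right > 0 && ys[right] < level) --right;
    sum += 0.5 * (xs[left] + xs[right]); // middle point at this level
  }
  return sum / static_cast<double>(fractions.size());
}
```

For a symmetric peak every middle point coincides with the apex position; for a tailing peak the lower levels pull the estimate towards the tail.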

◆ computeMuMaxDistance()

double computeMuMaxDistance ( const std::vector< double > &  xs) const
protected

Compute the boundary for the mean (`mu`) parameter in gradient descent.

Together with the value returned by computeInitialMean(), this method decides the minimum and maximum value that `mu` can assume during iterations of the gradient descent algorithm. The value is based on the width of the peak.

Parameters
[in]  xs  Positions
Returns
The maximum distance from the precomputed initial mean in the gradient descent algorithm

Referenced by EmgGradientDescent_friend::computeMuMaxDistance().

◆ E_wrt_h()

double E_wrt_h ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the cost given by the partial derivative of the loss function E, with respect to `h` (the amplitude)

Needed by the gradient descent algorithm.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed cost
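
As a hedged illustration of what this derivative looks like: assume E is the mean squared error over N points (the normalisation constant used in the code may differ) and write f for the EMG function. The chain rule gives the general shape, and because f is proportional to h in every EMG formula, the inner derivative is simply f / h. The derivatives with respect to `mu`, `sigma` and `tau` follow the same pattern with a different inner term.

```latex
E = \frac{1}{N} \sum_{i=1}^{N} \bigl( f(x_i; h, \mu, \sigma, \tau) - y_i \bigr)^2,
\qquad
\frac{\partial E}{\partial h}
  = \frac{2}{N} \sum_{i=1}^{N} \bigl( f(x_i; h, \mu, \sigma, \tau) - y_i \bigr)\,
    \frac{f(x_i; h, \mu, \sigma, \tau)}{h}
```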

◆ E_wrt_mu()

double E_wrt_mu ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the cost given by the partial derivative of the loss function E, with respect to `mu` (the mean)

Needed by the gradient descent algorithm.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed cost

◆ E_wrt_sigma()

double E_wrt_sigma ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the cost given by the partial derivative of the loss function E, with respect to `sigma` (the standard deviation)

Needed by the gradient descent algorithm.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed cost

◆ E_wrt_tau()

double E_wrt_tau ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the cost given by the partial derivative of the loss function E, with respect to `tau` (the exponent relaxation time)

Needed by the gradient descent algorithm.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed cost

◆ emg_point()

double emg_point ( const double  x,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the EMG function on a single point.

Parameters
[in]  x  Position
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The estimated intensity for the given input point

Referenced by EmgGradientDescent_friend::emg_point().
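
A piecewise evaluation in the spirit of the three z ranges documented for compute_z() might look like the sketch below. It follows the formulas in the Kalambet et al. paper as the author of this example understands them and is not the OpenMS code. The names `sketch_erfcx` and `sketch_emg_point` are hypothetical, and the switch to an asymptotic series at z = 25 is an assumption added to keep exp(z^2) from overflowing.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: exp(z^2) * erfc(z) for z >= 0. The direct product
// overflows for large z, so an asymptotic series is used beyond z = 25.
double sketch_erfcx(double z)
{
  assert(z >= 0.0);
  if (z < 25.0) return std::exp(z * z) * std::erfc(z);
  const double inv = 1.0 / (2.0 * z * z);
  const double sqrt_pi = std::sqrt(std::acos(-1.0));
  return (1.0 - inv + 3.0 * inv * inv) / (z * sqrt_pi);
}

// Sketch of the EMG function at one point, one formula per z range.
double sketch_emg_point(double x, double h, double mu, double sigma, double tau)
{
  assert(sigma > 0.0 && tau > 0.0);
  const double sqrt_half_pi = std::sqrt(std::acos(-1.0) / 2.0);
  const double u = (x - mu) / sigma;
  const double z = (sigma / tau - u) / std::sqrt(2.0);
  if (z < 0.0)
  {
    // the exponent is negative here, so exp() cannot overflow
    return h * (sigma / tau) * sqrt_half_pi
           * std::exp(0.5 * (sigma / tau) * (sigma / tau) - (x - mu) / tau)
           * std::erfc(z);
  }
  if (z <= 6.71e7)
  {
    // same function, rewritten with the scaled complementary error function
    return h * std::exp(-0.5 * u * u) * (sigma / tau) * sqrt_half_pi * sketch_erfcx(z);
  }
  // limiting form for very large z
  return h * std::exp(-0.5 * u * u) / (1.0 - (x - mu) * tau / (sigma * sigma));
}
```

The first two branches are algebraically identical, so the curve is continuous at z = 0; the split only changes which intermediate quantities are formed in floating point.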

◆ estimateEmgParameters()

UInt estimateEmgParameters ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
double &  best_h,
double &  best_mu,
double &  best_sigma,
double &  best_tau 
) const

The implementation of the gradient descent algorithm for the EMG peak model.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[out]  best_h  `h` (amplitude) parameter
[out]  best_mu  `mu` (mean) parameter
[out]  best_sigma  `sigma` (standard deviation) parameter
[out]  best_tau  `tau` (exponent relaxation time) parameter
Returns
The number of iterations necessary to reach the best values for the parameters

◆ extractTrainingSet()

void extractTrainingSet ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
std::vector< double > &  TrX,
std::vector< double > &  TrY 
) const
protected

Given a peak, extract a training set to be used with the gradient descent algorithm.

The algorithm tries to select only those points that can help in finding the optimal parameters with gradient descent. The decision of which points to skip is based on the derivatives between consecutive points.

It first selects all those points whose intensity is below a certain value (`intensity_threshold`). Then, the derivatives of all the remaining points are computed. Based on the results, the algorithm selects those points that present a high enough derivative. Once a low value is found, the algorithm stops taking points from that side. It then repeats the same procedure on the other side of the peak. The goal is to limit the inclusion of saturated or spurious points near the peak apex during training.

Exceptions
Exception::SizeUnderflow  if the input has fewer than 2 elements
Parameters
[in]  xs  Positions
[in]  ys  Intensities
[out]  TrX  Extracted training set positions
[out]  TrY  Extracted training set intensities

Referenced by EmgGradientDescent_friend::extractTrainingSet().

◆ fitEMGPeakModel()

void fitEMGPeakModel ( const PeakContainerT &  input_peak,
PeakContainerT &  output_peak,
const double  left_pos = 0.0,
const double  right_pos = 0.0 
) const

Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.

The method can recover the actual peak area of saturated or cutoff peaks, and it can fine-tune the peak area of well-acquired peaks. The output is a reconstruction of the input peak. Additional points are often added so that the reconstructed peak has similar intensities at its boundary points.

Metadata will be added to the output peak, containing the optimal parameters for the EMG peak model. This information will be found in a `FloatDataArray` of name "emg_parameters", with the parameters being saved in the following order (from index 0 to 3): amplitude `h`, mean `mu`, standard deviation `sigma`, exponent relaxation time `tau`.

If `left_pos` and `right_pos` are passed, then only that part of the peak is taken into consideration.

Note
All optimal gradient descent parameters are currently hard-coded to allow for a simplified user interface.
Cutoff peak: the intensities of the left and right baselines are not equal.
Saturated peak: the maximum intensity of the peak is lower than expected due to saturation of the detector.

Inspired by the results found in: Kalambet, Y.; Kozmin, Y.; Mikhailova, K.; Nagaev, I.; Tikhonov, P. (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics. 25 (7): 352.

Template Parameters
PeakContainerT  Either an MSChromatogram or an MSSpectrum
Parameters
[in]  input_peak  Input peak
[out]  output_peak  Output peak
[in]  left_pos  RT or MZ value of the first point of interest
[in]  right_pos  RT or MZ value of the last point of interest

Referenced by PeakIntegrator::EMGPreProcess_().

◆ getDefaultParameters()

void getDefaultParameters ( Param &  params)

◆ iRpropPlus()

void iRpropPlus ( const double  prev_diff_E_param,
double &  diff_E_param,
double &  param_lr,
double &  param_update,
double &  param,
const double  current_E,
const double  previous_E 
) const
private

Apply the iRprop+ algorithm for gradient descent.

Reference: Christian Igel and Michael Hüsken. Improving the Rprop Learning Algorithm. Second International Symposium on Neural Computation (NC 2000), pp. 115-121, ICSC Academic Press, 2000

Parameters
[in]  prev_diff_E_param  The cost of the partial derivative of E with respect to the given parameter, at the previous iteration of gradient descent
[in,out]  diff_E_param  The cost of the partial derivative of E with respect to the given parameter, at the current iteration
[in,out]  param_lr  The learning rate for the given parameter
[in,out]  param_update  The amount to add/remove to/from `param`
[in,out]  param  The parameter for which the algorithm tries speeding the convergence to a minimum
[in]  current_E  The current cost E
[in]  previous_E  The previous cost E

Referenced by EmgGradientDescent_friend::iRpropPlus().
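
For readers unfamiliar with iRprop+, a single update step for one parameter can be sketched as below, following the rule described by Igel and Hüsken. The argument order mirrors the signature above, but the function name `sketch_irprop_plus` and the constants (growth factor 1.2, shrink factor 0.5, step bounds) are assumptions; the values hard-coded in OpenMS may differ.

```cpp
#include <algorithm>
#include <cassert>

// Sketch of one iRprop+ step for a single parameter.
void sketch_irprop_plus(double prev_grad, double& grad, double& lr,
                        double& update, double& param,
                        double current_E, double previous_E)
{
  const double eta_plus = 1.2, eta_minus = 0.5; // assumed growth / shrink factors
  const double lr_min = 1e-6, lr_max = 50.0;    // assumed step size bounds
  assert(lr > 0.0);
  const double sign = (grad > 0.0) - (grad < 0.0);
  if (prev_grad * grad > 0.0)
  {
    // same gradient sign as before: enlarge the step and move downhill
    lr = std::min(lr * eta_plus, lr_max);
    update = -sign * lr;
    param += update;
  }
  else if (prev_grad * grad < 0.0)
  {
    // sign change: a minimum was overshot, so shrink the step
    lr = std::max(lr * eta_minus, lr_min);
    if (current_E > previous_E) param -= update; // undo the last move
    grad = 0.0; // forces the neutral branch on the next call
  }
  else
  {
    update = -sign * lr;
    param += update;
  }
}
```

Only the sign of the gradient is used, never its magnitude, which makes the step size insensitive to the very different scales of `h`, `mu`, `sigma` and `tau`.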

◆ Loss_function()

double Loss_function ( const std::vector< double > &  xs,
const std::vector< double > &  ys,
const double  h,
const double  mu,
const double  sigma,
const double  tau 
) const
private

Compute the cost given by loss function E.

Needed by the gradient descent algorithm. The mean squared error is used as the loss function E.

Parameters
[in]  xs  Positions
[in]  ys  Intensities
[in]  h  Amplitude
[in]  mu  Mean
[in]  sigma  Standard deviation
[in]  tau  Exponent relaxation time
Returns
The computed cost

Referenced by EmgGradientDescent_friend::Loss_function().
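
A mean squared error over a set of points can be sketched as follows. The callable `model` stands in for the EMG function evaluated with fixed `h`, `mu`, `sigma` and `tau`; the name `sketch_mse` and the division by N (rather than, say, 2N) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Mean squared error between observed intensities and a model curve.
double sketch_mse(const std::vector<double>& xs, const std::vector<double>& ys,
                  const std::function<double(double)>& model)
{
  assert(!xs.empty() && xs.size() == ys.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    const double diff = model(xs[i]) - ys[i]; // residual at point i
    sum += diff * diff;
  }
  return sum / static_cast<double>(xs.size());
}
```

Passing the model as a callable keeps the sketch independent of the EMG formula itself, so the same function can score any candidate parameter set.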

◆ updateMembers_()

void updateMembers_ ( )
override protected virtual

This method is used to update extra member variables at the end of the setParameters() method.

Also call it at the end of the derived classes' copy constructor and assignment operator.

The default implementation is empty.

Reimplemented from DefaultParamHandler.

Friends And Related Function Documentation

◆ EmgGradientDescent_friend

friend class EmgGradientDescent_friend
friend

To test private and protected methods.

Member Data Documentation

◆ compute_additional_points_

bool compute_additional_points_
private

Whether additional points should be added when fitting the EMG peak model; particularly useful with cutoff peaks.

◆ max_gd_iter_

UInt max_gd_iter_
private

Maximum number of gradient descent iterations in `fitEMGPeakModel()`.

◆ PI

const double PI = OpenMS::Constants::PI
private

Alias for OpenMS::Constants::PI.

◆ print_debug_

UInt print_debug_
private

Level of debug information to print to the terminal. Valid values are: 0, 1, 2. Higher values mean more information.