Compute the area, background and shape metrics of a peak. More...
#include <OpenMS/MATH/MISC/EmgGradientDescent.h>
Public Member Functions | |
EmgGradientDescent () | |
Constructor. More... | |
~EmgGradientDescent () override=default | |
Destructor. More... | |
void | getDefaultParameters (Param &params) |
template<typename PeakContainerT > | |
void | fitEMGPeakModel (const PeakContainerT &input_peak, PeakContainerT &output_peak, const double left_pos=0.0, const double right_pos=0.0) const |
Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model. More... | |
UInt | estimateEmgParameters (const std::vector< double > &xs, const std::vector< double > &ys, double &best_h, double &best_mu, double &best_sigma, double &best_tau) const |
The implementation of the gradient descent algorithm for the EMG peak model. More... | |
void | applyEstimatedParameters (const std::vector< double > &xs, const double h, const double mu, const double sigma, const double tau, std::vector< double > &out_xs, std::vector< double > &out_ys) const |
Compute the EMG function on a set of points. More... | |
Public Member Functions inherited from DefaultParamHandler | |
DefaultParamHandler (const String &name) | |
Constructor with name that is displayed in error messages. More... | |
DefaultParamHandler (const DefaultParamHandler &rhs) | |
Copy constructor. More... | |
virtual | ~DefaultParamHandler () |
Destructor. More... | |
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
Assignment operator. More... | |
virtual bool | operator== (const DefaultParamHandler &rhs) const |
Equality operator. More... | |
void | setParameters (const Param &param) |
Sets the parameters. More... | |
const Param & | getParameters () const |
Non-mutable access to the parameters. More... | |
const Param & | getDefaults () const |
Non-mutable access to the default parameters. More... | |
const String & | getName () const |
Non-mutable access to the name. More... | |
void | setName (const String &name) |
Mutable access to the name. More... | |
const std::vector< String > & | getSubsections () const |
Non-mutable access to the registered subsections. More... | |
Protected Member Functions | |
void | updateMembers_ () override |
This method is used to update extra member variables at the end of the setParameters() method. More... | |
void | extractTrainingSet (const std::vector< double > &xs, const std::vector< double > &ys, std::vector< double > &TrX, std::vector< double > &TrY) const |
Given a peak, extract a training set to be used with the gradient descent algorithm. More... | |
double | computeMuMaxDistance (const std::vector< double > &xs) const |
Compute the boundary for the mean (mu ) parameter in gradient descent. More... | |
double | computeInitialMean (const std::vector< double > &xs, const std::vector< double > &ys) const |
Compute an estimation of the mean of a peak. More... | |
Protected Member Functions inherited from DefaultParamHandler | |
void | defaultsToParam_ () |
Updates the parameters after the defaults have been set in the constructor. More... | |
Private Member Functions | |
void | iRpropPlus (const double prev_diff_E_param, double &diff_E_param, double &param_lr, double &param_update, double &param, const double current_E, const double previous_E) const |
Apply the iRprop+ algorithm for gradient descent. More... | |
double | Loss_function (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
Compute the cost given by loss function E. More... | |
double | E_wrt_h (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
Compute the cost given by the partial derivative of the loss function E, with respect to h (the amplitude) More... | |
double | E_wrt_mu (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
Compute the cost given by the partial derivative of the loss function E, with respect to mu (the mean) More... | |
double | E_wrt_sigma (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
Compute the cost given by the partial derivative of the loss function E, with respect to sigma (the standard deviation) More... | |
double | E_wrt_tau (const std::vector< double > &xs, const std::vector< double > &ys, const double h, const double mu, const double sigma, const double tau) const |
Compute the cost given by the partial derivative of the loss function E, with respect to tau (the exponent relaxation time) More... | |
double | compute_z (const double x, const double mu, const double sigma, const double tau) const |
Compute EMG's z parameter. More... | |
double | emg_point (const double x, const double h, const double mu, const double sigma, const double tau) const |
Compute the EMG function on a single point. More... | |
Private Attributes | |
const double | PI = OpenMS::Constants::PI |
Alias for OpenMS::Constants::PI. More... | |
UInt | print_debug_ |
UInt | max_gd_iter_ |
Maximum number of gradient descent iterations in fitEMGPeakModel() More... | |
bool | compute_additional_points_ |
Friends | |
class | EmgGradientDescent_friend |
To test private and protected methods. More... | |
Additional Inherited Members | |
Static Public Member Functions inherited from DefaultParamHandler | |
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
Writes all parameters to meta values. More... | |
Protected Attributes inherited from DefaultParamHandler | |
Param | param_ |
Container for current parameters. More... | |
Param | defaults_ |
Container for default parameters. This member should be filled in the constructor of derived classes! More... | |
std::vector< String > | subsections_ |
Container for registered subsections. This member should be filled in the constructor of derived classes! More... | |
String | error_name_ |
Name that is displayed in error messages during the parameter checking. More... | |
bool | check_defaults_ |
If this member is set to false, no parameter checking is done. More... | |
bool | warn_empty_defaults_ |
If this member is set to false, no warning is emitted when defaults are empty. More... | |
Compute the area, background and shape metrics of a peak.
The area computation is performed in integratePeak(). It supports three integration methods: a simple sum of the intensities, Simpson's rule (implemented for an odd number of unequally spaced points), and the trapezoid rule.
The background computation is performed in estimateBackground(). It supports three approaches to baseline correction: a rectangular shape under the peak based on the minimum value of the peak borders (vertical_division_min), a rectangular shape based on the maximum value of the peak borders (vertical_division_max), or a trapezoidal shape based on a straight line between the peak borders (base_to_base).
Peak shape metrics are computed in calculatePeakShapeMetrics() and multiple metrics are supported.
The containers supported by the methods are MSChromatogram and MSSpectrum.
Constructor.
|
override default |
Destructor.
void applyEstimatedParameters | ( | const std::vector< double > & | xs, |
const double | h, | ||
const double | mu, | ||
const double | sigma, | ||
const double | tau, | ||
std::vector< double > & | out_xs, | ||
std::vector< double > & | out_ys | ||
) | const |
Compute the EMG function on a set of points.
If the class parameter compute_additional_points is "true", the algorithm detects which side of the peak is cut off and adds points to it.
[in] | xs | Positions |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
[out] | out_xs | The output positions |
[out] | out_ys | The output intensities |
Referenced by EmgGradientDescent_friend::applyEstimatedParameters().
|
private |
Compute EMG's z parameter.
The value of z determines which formula is used to compute the EMG function. Each of the following ranges of z uses a different EMG formula to avoid numerical instability and potential overflow: (-inf, 0), [0, 6.71e7], (6.71e7, +inf).
Reference: Kalambet, Y.; Kozmin, Y.; Mikhailova, K.; Nagaev, I.; Tikhonov, P. (2011). "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function". Journal of Chemometrics. 25 (7): 352.
[in] | x | Position |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
Referenced by EmgGradientDescent_friend::compute_z().
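The three ranges of z can be illustrated with a minimal, self-contained sketch based on the formulas in the cited Kalambet et al. paper. This is not the OpenMS implementation: the function names computeZSketch, erfcxNonNegativeSketch and emgPointSketch are hypothetical, and the switch to an asymptotic series at z >= 25 is an assumption made here to keep the middle branch finite using only the standard library.

```cpp
#include <cassert>
#include <cmath>

const double kPiSketch = 3.14159265358979323846;

// z = (sigma / tau - (x - mu) / sigma) / sqrt(2)
double computeZSketch(double x, double mu, double sigma, double tau)
{
  assert(sigma > 0.0 && tau > 0.0); // the EMG model needs positive sigma and tau
  return (sigma / tau - (x - mu) / sigma) / std::sqrt(2.0);
}

// Scaled complementary error function exp(z^2) * erfc(z), for z >= 0.
// Assumption: for z >= 25 an asymptotic series replaces the direct product,
// because exp(z^2) would eventually overflow while erfc(z) underflows.
double erfcxNonNegativeSketch(double z)
{
  if (z < 25.0) return std::exp(z * z) * std::erfc(z);
  const double inv_z2 = 1.0 / (z * z);
  return (1.0 - 0.5 * inv_z2 + 0.75 * inv_z2 * inv_z2) / (z * std::sqrt(kPiSketch));
}

// EMG value at x, choosing the formula from the range that z falls into.
double emgPointSketch(double x, double h, double mu, double sigma, double tau)
{
  const double z = computeZSketch(x, mu, sigma, tau);
  const double d = (x - mu) / sigma;
  const double scale = h * sigma / tau * std::sqrt(kPiSketch / 2.0);
  if (z < 0.0)
  { // (-inf, 0): erfc(z) stays between 1 and 2, the exponential decays
    const double r = sigma / tau;
    return scale * std::exp(0.5 * r * r - (x - mu) / tau) * std::erfc(z);
  }
  if (z <= 6.71e7)
  { // [0, 6.71e7]: Gaussian factor times the scaled erfc
    return scale * std::exp(-0.5 * d * d) * erfcxNonNegativeSketch(z);
  }
  // (6.71e7, +inf): Gaussian-like limit for very small tau
  return h * std::exp(-0.5 * d * d) / (1.0 + (x - mu) * tau / (sigma * sigma));
}
```

The first two branches are algebraically identical; they differ only in which factor carries the large or small exponent, which is what keeps each one stable in its own range.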
|
protected |
Compute an estimation of the mean of a peak.
The method computes the middle point on different levels of intensity of the peak. The returned mean is the average of these middle points.
Exception::SizeUnderflow | if the input is empty |
[in] | xs | Positions |
[in] | ys | Intensities |
Referenced by EmgGradientDescent_friend::computeInitialMean().
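The idea of averaging middle points taken at several intensity levels can be sketched as follows. The function name initialMeanSketch and the chosen levels (60% to 90% of the maximum intensity) are assumptions for illustration only, and std::invalid_argument stands in for Exception::SizeUnderflow so the example has no OpenMS dependency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Average of the midpoints between the leftmost and rightmost samples that
// reach each intensity level. The levels are hypothetical, not OpenMS values.
double initialMeanSketch(const std::vector<double>& xs, const std::vector<double>& ys)
{
  if (xs.empty() || xs.size() != ys.size())
  {
    throw std::invalid_argument("xs and ys must be non-empty and of equal size");
  }
  const double max_y = *std::max_element(ys.begin(), ys.end());
  const std::vector<double> levels = {0.6, 0.7, 0.8, 0.9};
  double sum_of_midpoints = 0.0;
  for (const double level : levels)
  {
    const double threshold = level * max_y;
    // leftmost sample at or above the threshold
    std::size_t left = 0;
    while (left + 1 < ys.size() && ys[left] < threshold) ++left;
    // rightmost sample at or above the threshold
    std::size_t right = ys.size() - 1;
    while (right > left && ys[right] < threshold) --right;
    sum_of_midpoints += 0.5 * (xs[left] + xs[right]);
  }
  return sum_of_midpoints / static_cast<double>(levels.size());
}
```

For a symmetric peak every midpoint coincides with the apex position; for a tailing peak the lower levels pull the estimate toward the tail.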
|
protected |
Compute the boundary for the mean (mu) parameter in gradient descent.
Together with the value returned by computeInitialMean(), this method decides the minimum and maximum value that mu can assume during iterations of the gradient descent algorithm. The value is based on the width of the peak.
[in] | xs | Positions |
Referenced by EmgGradientDescent_friend::computeMuMaxDistance().
|
private |
Compute the cost given by the partial derivative of the loss function E, with respect to h (the amplitude).
Needed by the gradient descent algorithm.
[in] | xs | Positions |
[in] | ys | Intensities |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
|
private |
Compute the cost given by the partial derivative of the loss function E, with respect to mu (the mean).
Needed by the gradient descent algorithm.
[in] | xs | Positions |
[in] | ys | Intensities |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
|
private |
Compute the cost given by the partial derivative of the loss function E, with respect to sigma (the standard deviation).
Needed by the gradient descent algorithm.
[in] | xs | Positions |
[in] | ys | Intensities |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
|
private |
Compute the cost given by the partial derivative of the loss function E, with respect to tau (the exponent relaxation time).
Needed by the gradient descent algorithm.
[in] | xs | Positions |
[in] | ys | Intensities |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
|
private |
Compute the EMG function on a single point.
[in] | x | Position |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
Referenced by EmgGradientDescent_friend::emg_point().
UInt estimateEmgParameters | ( | const std::vector< double > & | xs, |
const std::vector< double > & | ys, | ||
double & | best_h, | ||
double & | best_mu, | ||
double & | best_sigma, | ||
double & | best_tau | ||
) | const |
The implementation of the gradient descent algorithm for the EMG peak model.
[in] | xs | Positions |
[in] | ys | Intensities |
[out] | best_h | h (amplitude) parameter |
[out] | best_mu | mu (mean) parameter |
[out] | best_sigma | sigma (standard deviation) parameter |
[out] | best_tau | tau (exponent relaxation time) parameter |
|
protected |
Given a peak, extract a training set to be used with the gradient descent algorithm.
The algorithm tries to select only those points that can help in finding the optimal parameters with gradient descent. The decision of which points to skip is based on the derivatives between consecutive points.
It first selects all those points whose intensity is below a certain value (intensity_threshold). Then, the derivatives of all the remaining points are computed. Based on the results, the algorithm selects those points that present a high enough derivative. Once a low value is found, the algorithm stops taking points from that side. It then repeats the same procedure on the other side of the peak. The goal is to limit the inclusion of saturated or spurious points near the peak apex during training.
Exception::SizeUnderflow | if the input has fewer than 2 elements |
[in] | xs | Positions |
[in] | ys | Intensities |
[out] | TrX | Extracted training set positions |
[out] | TrY | Extracted training set intensities |
Referenced by EmgGradientDescent_friend::extractTrainingSet().
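The selection procedure described above can be sketched roughly as below. This is an illustration under stated assumptions, not the OpenMS code: the name extractTrainingSetSketch, the explicit intensity_threshold and derivative_threshold arguments, the use of the slope between consecutive points as the derivative, and std::invalid_argument in place of Exception::SizeUnderflow are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

void extractTrainingSetSketch(const std::vector<double>& xs, const std::vector<double>& ys,
                              double intensity_threshold, double derivative_threshold,
                              std::vector<double>& TrX, std::vector<double>& TrY)
{
  const std::size_t n = xs.size();
  if (n < 2 || ys.size() != n) throw std::invalid_argument("need at least 2 matching points");

  // Step 1: keep every low-intensity point.
  std::vector<bool> keep(n, false);
  for (std::size_t i = 0; i < n; ++i) keep[i] = ys[i] < intensity_threshold;

  // Step 2: walk in from the left, keeping points while the slope is steep.
  for (std::size_t i = 1; i < n; ++i)
  {
    if (keep[i]) continue;
    const double slope = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]);
    if (std::fabs(slope) < derivative_threshold) break; // flat region: stop on this side
    keep[i] = true;
  }

  // Step 3: same procedure from the right (i runs from n - 2 down to 0).
  for (std::size_t i = n - 1; i-- > 0;)
  {
    if (keep[i]) continue;
    const double slope = (ys[i] - ys[i + 1]) / (xs[i] - xs[i + 1]);
    if (std::fabs(slope) < derivative_threshold) break;
    keep[i] = true;
  }

  TrX.clear();
  TrY.clear();
  for (std::size_t i = 0; i < n; ++i)
  {
    if (keep[i]) { TrX.push_back(xs[i]); TrY.push_back(ys[i]); }
  }
}
```

On a flat-topped (saturated) peak, the walk from each side stops at the first plateau point it meets, so the interior of the plateau is left out of the training set.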
void fitEMGPeakModel | ( | const PeakContainerT & | input_peak, |
PeakContainerT & | output_peak, | ||
const double | left_pos = 0.0 , |
||
const double | right_pos = 0.0 |
||
) | const |
Fit the given peak (either MSChromatogram or MSSpectrum) to the EMG peak model.
The method can recover the actual peak area of saturated or cut-off peaks. It can also fine-tune the peak area of well-acquired peaks. The output is a reconstruction of the input peak. Additional points are often added so that the intensities at the two peak boundaries are similar.
Metadata will be added to the output peak, containing the optimal parameters for the EMG peak model. This information will be found in a FloatDataArray of name "emg_parameters", with the parameters being saved in the following order (from index 0 to 3): amplitude h, mean mu, standard deviation sigma, exponent relaxation time tau.
If left_pos and right_pos are passed, then only that part of the peak is taken into consideration.
Inspired by the results found in: Yuri Kalambet, Yuri Kozmin, Ksenia Mikhailova, Igor Nagaev, Pavel Tikhonov. "Reconstruction of chromatographic peaks using the exponentially modified Gaussian function".
PeakContainerT | Either a MSChromatogram or a MSSpectrum |
[in] | input_peak | Input peak |
[out] | output_peak | Output peak |
[in] | left_pos | RT or MZ value of the first point of interest |
[in] | right_pos | RT or MZ value of the last point of interest |
Referenced by PeakIntegrator::EMGPreProcess_().
void getDefaultParameters | ( | Param & | params | ) |
|
private |
Apply the iRprop+ algorithm for gradient descent.
Reference: Christian Igel and Michael Hüsken. Improving the Rprop Learning Algorithm. Second International Symposium on Neural Computation (NC 2000), pp. 115-121, ICSC Academic Press, 2000
[in] | prev_diff_E_param | The cost of the partial derivative of E with respect to the given parameter, at the previous iteration of gradient descent |
[in,out] | diff_E_param | The cost of the partial derivative of E with respect to the given parameter, at the current iteration |
[in,out] | param_lr | The learning rate for the given parameter |
[in,out] | param_update | The amount to add/remove to/from param |
[in,out] | param | The parameter for which the algorithm tries speeding the convergence to a minimum |
[in] | current_E | The current cost E |
[in] | previous_E | The previous cost E |
Referenced by EmgGradientDescent_friend::iRpropPlus().
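A minimal sketch of one iRprop+ update, following the rule described in the cited Igel and Hüsken paper and reusing the parameter names documented above. The function names iRpropPlusSketch and minimizeQuadraticSketch and the constants (step factors 1.2 and 0.5, step bounds 1e-6 and 50) are assumptions chosen for illustration; the values used by OpenMS may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical constants, typical of the Rprop literature.
const double kEtaPlus = 1.2;
const double kEtaMinus = 0.5;
const double kStepMin = 1e-6;
const double kStepMax = 50.0;

inline double signOf(double v) { return static_cast<double>((v > 0.0) - (v < 0.0)); }

void iRpropPlusSketch(const double prev_diff_E_param, double& diff_E_param, double& param_lr,
                      double& param_update, double& param,
                      const double current_E, const double previous_E)
{
  const double product = prev_diff_E_param * diff_E_param;
  if (product > 0.0)
  { // same gradient sign: grow the step and move against the gradient
    param_lr = std::min(param_lr * kEtaPlus, kStepMax);
    param_update = -signOf(diff_E_param) * param_lr;
    param += param_update;
  }
  else if (product < 0.0)
  { // sign change: a minimum was overshot, so shrink the step
    param_lr = std::max(param_lr * kEtaMinus, kStepMin);
    if (current_E > previous_E) param -= param_update; // undo the last step only if E got worse
    diff_E_param = 0.0; // forces the plain-step branch on the next call
  }
  else
  { // first iteration or right after a sign change: plain step
    param_update = -signOf(diff_E_param) * param_lr;
    param += param_update;
  }
}

// Demo driver: minimise E(w) = (w - target)^2 starting from w = 0.
double minimizeQuadraticSketch(double target, int iterations)
{
  double w = 0.0, lr = 0.1, update = 0.0, prev_g = 0.0;
  double prev_E = (w - target) * (w - target);
  for (int i = 0; i < iterations; ++i)
  {
    const double E = (w - target) * (w - target);
    double g = 2.0 * (w - target);
    iRpropPlusSketch(prev_g, g, lr, update, w, E, prev_E);
    prev_g = g; // may have been reset to 0 by the update
    prev_E = E;
  }
  return w;
}
```

Only the sign of the derivative is used, never its magnitude, which makes the update insensitive to the very different scales of h, mu, sigma and tau.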
|
private |
Compute the cost given by loss function E.
Needed by the gradient descent algorithm. The mean squared error is used as the loss function E.
[in] | xs | Positions |
[in] | ys | Intensities |
[in] | h | Amplitude |
[in] | mu | Mean |
[in] | sigma | Standard deviation |
[in] | tau | Exponent relaxation time |
Referenced by EmgGradientDescent_friend::Loss_function().
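As an illustration of how the loss and one of its partial derivatives relate, the sketch below computes the mean squared error for a generic peak model and compares the analytic derivative with respect to h against a central finite difference. All names are hypothetical, and a plain Gaussian stands in for the EMG function to keep the example short. The analytic form relies only on the model being linear in h, which also holds for the EMG formulas, so that the derivative of the model with respect to h is model / h.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using ModelSketch = std::function<double(double, double, double, double, double)>;

// Stand-in model (assumption): Gaussian with amplitude h; tau is ignored.
double gaussianModelSketch(double x, double h, double mu, double sigma, double /*tau*/)
{
  const double d = (x - mu) / sigma;
  return h * std::exp(-0.5 * d * d);
}

// E = (1/N) * sum_i (f(x_i) - y_i)^2
double meanSquaredErrorSketch(const ModelSketch& f, const std::vector<double>& xs,
                              const std::vector<double>& ys,
                              double h, double mu, double sigma, double tau)
{
  assert(!xs.empty() && xs.size() == ys.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    const double diff = f(xs[i], h, mu, sigma, tau) - ys[i];
    sum += diff * diff;
  }
  return sum / static_cast<double>(xs.size());
}

// dE/dh = (2/N) * sum_i (f(x_i) - y_i) * f(x_i) / h, valid when f is linear in h.
double analyticDiffHSketch(const ModelSketch& f, const std::vector<double>& xs,
                           const std::vector<double>& ys,
                           double h, double mu, double sigma, double tau)
{
  assert(h != 0.0 && !xs.empty() && xs.size() == ys.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < xs.size(); ++i)
  {
    const double fx = f(xs[i], h, mu, sigma, tau);
    sum += 2.0 * (fx - ys[i]) * fx / h;
  }
  return sum / static_cast<double>(xs.size());
}

// Central finite difference of E with respect to h, used as a cross-check.
double numericalDiffHSketch(const ModelSketch& f, const std::vector<double>& xs,
                            const std::vector<double>& ys,
                            double h, double mu, double sigma, double tau)
{
  const double eps = 1e-6;
  return (meanSquaredErrorSketch(f, xs, ys, h + eps, mu, sigma, tau) -
          meanSquaredErrorSketch(f, xs, ys, h - eps, mu, sigma, tau)) / (2.0 * eps);
}
```

The same finite-difference comparison can be applied to mu, sigma and tau when checking hand-derived partial derivatives.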
|
override protected virtual |
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
|
friend |
To test private and protected methods.
|
private |
Whether additional points should be added when fitting the EMG peak model. Particularly useful with cut-off peaks.
|
private |
Maximum number of gradient descent iterations in fitEMGPeakModel()
|
private |
Alias for OpenMS::Constants::PI.
|
private |
Level of debug information to print to the terminal. Valid values are 0, 1 and 2. Higher values mean more information.