OpenMS
|
A map alignment algorithm based on spectrum similarity (dynamic programming). More...
#include <OpenMS/ANALYSIS/MAPMATCHING/MapAlignmentAlgorithmSpectrumAlignment.h>
Classes | |
class | Compare |
Inner class necessary for using the sort algorithm. More... | |
Public Member Functions | |
MapAlignmentAlgorithmSpectrumAlignment () | |
Default constructor. More... | |
~MapAlignmentAlgorithmSpectrumAlignment () override | |
Destructor. More... | |
virtual void | align (std::vector< PeakMap > &, std::vector< TransformationDescription > &) |
Align peak maps. More... | |
Public Member Functions inherited from DefaultParamHandler | |
DefaultParamHandler (const String &name) | |
Constructor with name that is displayed in error messages. More... | |
DefaultParamHandler (const DefaultParamHandler &rhs) | |
Copy constructor. More... | |
virtual | ~DefaultParamHandler () |
Destructor. More... | |
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
Assignment operator. More... | |
virtual bool | operator== (const DefaultParamHandler &rhs) const |
Equality operator. More... | |
void | setParameters (const Param &param) |
Sets the parameters. More... | |
const Param & | getParameters () const |
Non-mutable access to the parameters. More... | |
const Param & | getDefaults () const |
Non-mutable access to the default parameters. More... | |
const String & | getName () const |
Non-mutable access to the name. More... | |
void | setName (const String &name) |
Mutable access to the name. More... | |
const std::vector< String > & | getSubsections () const |
Non-mutable access to the registered subsections. More... | |
Public Member Functions inherited from ProgressLogger | |
ProgressLogger () | |
Constructor. More... | |
virtual | ~ProgressLogger () |
Destructor. More... | |
ProgressLogger (const ProgressLogger &other) | |
Copy constructor. More... | |
ProgressLogger & | operator= (const ProgressLogger &other) |
Assignment Operator. More... | |
void | setLogType (LogType type) const |
Sets the progress log that should be used. The default type is NONE! More... | |
LogType | getLogType () const |
Returns the type of progress log being used. More... | |
void | startProgress (SignedSize begin, SignedSize end, const String &label) const |
Initializes the progress display. More... | |
void | setProgress (SignedSize value) const |
Sets the current progress. More... | |
void | endProgress (UInt64 bytes_processed=0) const |
void | nextProgress () const |
Increments the progress by 1 (according to range begin-end). More... |
Private Member Functions | |
MapAlignmentAlgorithmSpectrumAlignment (const MapAlignmentAlgorithmSpectrumAlignment &) | |
Copy constructor is not implemented -> private. More... | |
MapAlignmentAlgorithmSpectrumAlignment & | operator= (const MapAlignmentAlgorithmSpectrumAlignment &) |
Assignment operator is not implemented -> private. More... | |
void | prepareAlign_ (const std::vector< MSSpectrum * > &pattern, PeakMap &aligned, std::vector< TransformationDescription > &transformation) |
Prepares the sequences for the alignment and internally calls the main alignment function. More... |
void | msFilter_ (PeakMap &peakmap, std::vector< MSSpectrum * > &spectrum_pointer_container) |
Filters the spectra to retain only MS level 1. More... |
bool | insideBand_ (Size i, Size j, Size n, Size m, Int k_) |
Tests whether cell (i, j) of the grid is inside the band. More... |
Int | bestk_ (const std::vector< MSSpectrum * > &pattern, std::vector< MSSpectrum * > &aligned, std::map< Size, std::map< Size, float > > &buffer, bool column_row_orientation, Size xbegin, Size xend, Size ybegin, Size yend) |
Calculates the size of the band for the alignment of two given sequences. More... |
float | scoreCalculation_ (Size i, Size j, Size patternbegin, Size alignbegin, const std::vector< MSSpectrum * > &pattern, std::vector< MSSpectrum * > &aligned, std::map< Size, std::map< Size, float > > &buffer, bool column_row_orientation) |
Calculates the score of two given MSSpectra; internally calls scoring_. More... |
float | scoring_ (const MSSpectrum &a, MSSpectrum &b) |
Returns the score of two given MSSpectra by calling the score function. More... |
void | affineGapalign_ (Size xbegin, Size ybegin, Size xend, Size yend, const std::vector< MSSpectrum * > &pattern, std::vector< MSSpectrum * > &aligned, std::vector< int > &xcoordinate, std::vector< float > &ycoordinate, std::vector< int > &xcoordinatepattern) |
Alignment with affine gap costs. More... |
void | bucketFilter_ (const std::vector< MSSpectrum * > &pattern, std::vector< MSSpectrum * > &aligned, std::vector< Int > &xcoordinate, std::vector< float > &ycoordinate, std::vector< Int > &xcoordinatepattern) |
Prepares the data points from which the spline function is later constructed. More... |
void | debugFileCreator_ (const std::vector< MSSpectrum * > &pattern, std::vector< MSSpectrum * > &aligned) |
Creates files for the debugging. More... | |
void | debugscoreDistributionCalculation_ (float score) |
Rounds the score of two spectra; only needed for debugging. More... |
void | updateMembers_ () override |
This method is used to update extra member variables at the end of the setParameters() method. More... | |
Private Attributes | |
float | gap_ |
Represents the gap cost for opening or closing a gap in the alignment. More... |
float | e_ |
Extension cost after a gap is open. More... | |
PeakSpectrumCompareFunctor * | c1_ |
Pointer to the selected scoring function. More... |
float | cutoffScore_ |
The minimal score to be counted as a mismatch (range 0.0 - 1.0). More... |
Size | bucketsize_ |
Defines the size of one bucket. More... | |
Size | anchorPoints_ |
Defines the amount of anchor points which are selected within one bucket. More... | |
bool | debug_ |
Debug mode flag (default: false). More... |
float | mismatchscore_ |
Represents the cost of a mismatch in the alignment. More... |
float | threshold_ |
The minimum score for counting as a match (1 - cutoffScore_). More... |
std::vector< std::vector< float > > | debugmatrix_ |
Container holding the score of the matchmatrix and also the insertmatrix. More... | |
std::vector< std::vector< float > > | debugscorematrix_ |
Container holding only the scores of the spectra. More... |
std::vector< std::pair< float, float > > | debugtraceback_ |
Container holding the path of the traceback. More... | |
std::vector< float > | scoredistribution_ |
Container holding the score of each cell (matchmatrix, insertmatrix, traceback). More... |
Additional Inherited Members | |
Public Types inherited from ProgressLogger | |
enum | LogType { CMD , GUI , NONE } |
Possible log types. More... | |
Static Public Member Functions inherited from DefaultParamHandler | |
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
Writes all parameters to meta values. More... | |
Protected Member Functions inherited from DefaultParamHandler | |
void | defaultsToParam_ () |
Updates the parameters after the defaults have been set in the constructor. More... | |
Static Protected Member Functions inherited from ProgressLogger | |
static String | logTypeToFactoryName_ (LogType type) |
Return the name of the factory product used for this log type. More... | |
Protected Attributes inherited from DefaultParamHandler | |
Param | param_ |
Container for current parameters. More... | |
Param | defaults_ |
Container for default parameters. This member should be filled in the constructor of derived classes! More... | |
std::vector< String > | subsections_ |
Container for registered subsections. This member should be filled in the constructor of derived classes! More... | |
String | error_name_ |
Name that is displayed in error messages during the parameter checking. More... | |
bool | check_defaults_ |
If this member is set to false, no checking of parameters is done. More... |
bool | warn_empty_defaults_ |
If this member is set to false, no warning is emitted when defaults are empty. More... |
Protected Attributes inherited from ProgressLogger | |
LogType | type_ |
time_t | last_invoke_ |
ProgressLoggerImpl * | current_logger_ |
Static Protected Attributes inherited from ProgressLogger | |
static int | recursion_depth_ |
A map alignment algorithm based on spectrum similarity (dynamic programming).
Parameters of this class are:

Name | Type | Default | Restrictions | Description |
---|---|---|---|---|
gapcost | float | 1.0 | min: 0.0 | This parameter is the cost of opening a gap in the alignment. A gap means that one spectrum cannot be aligned directly to another spectrum in the map. This happens when the similarity of the two spectra is too low or absent. Think of it as an insertion or deletion of a spectrum in the map (similar to sequence alignment). Gaps are necessary for aligning: opening a gap makes it possible for another spectrum to be aligned correctly with a higher score than without the gap. However, opening a gap is a negative event and must carry a penalty, so a gap should only be opened if the benefits outweigh the downsides. The parameter is given as a positive number; the implementation converts it to a negative number. |
affinegapcost | float | 0.5 | min: 0.0 | This parameter controls the cost of extending an already open gap. The affine gap cost rests on the assumption that one long run of connected gaps is better than gaps interspersed with matches (gap match gap match etc.). Therefore the penalty for extending a gap should generally be lower than the normal gap cost. If the result of the alignment shows high compression, it is a good idea to lower either the affine gap cost or the gap opening cost. |
cutoff_score | float | 0.7 | min: 0.0 max: 1.0 | This parameter defines the threshold used to filter spectra; the spectra that pass are strong candidates for determining the interval of a sub-alignment. Only pairs of spectra with a score greater than or equal to the threshold are selected. |
bucketsize | int | 100 | min: 1 | Defines the number of buckets. The buckets quantize the interval of the points that define the main alignment (match points). These points have to be filtered to reduce the number of points used for calculating a smoother spline curve. |
anchorpoints | int | 100 | min: 1 max: 100 | Defines the percentage of match points that are selected from one bucket. The highest-scoring pairs are selected first. Reducing the number of match points helps to get a smoother spline curve. |
debug | string | false | true, false | Activates the debug mode; files are written whose names start with a debug prefix. |
mismatchscore | float | -5.0 | max: 0.0 | Defines the score of two spectra if they have no similarity to each other. |
scorefunction | string | SteinScottImproveScore | SteinScottImproveScore, ZhangSimilarityScore | The score function is the core of an alignment. The success of an alignment depends mostly on the selected score function. The score function returns the similarity of two spectra. The scores later determine which traceback paths are possible. There are multiple spectra similarity scores available. |
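The defaults and restrictions in the table above can be summarized in a small stand-alone structure. This is a minimal sketch, not OpenMS API: the struct `SpectrumAlignmentParams` and the helper `validate()` are hypothetical names introduced only for illustration, and `std::invalid_argument` stands in for whatever error the library raises.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical mirror of the parameter table (not an OpenMS type).
// Each default matches the "Default" column.
struct SpectrumAlignmentParams
{
  float gapcost = 1.0f;         // min: 0.0
  float affinegapcost = 0.5f;   // min: 0.0
  float cutoff_score = 0.7f;    // min: 0.0, max: 1.0
  int bucketsize = 100;         // min: 1
  int anchorpoints = 100;       // min: 1, max: 100 (percent)
  bool debug = false;           // "true" / "false" in the table
  float mismatchscore = -5.0f;  // max: 0.0
  std::string scorefunction = "SteinScottImproveScore";
};

// Throws std::invalid_argument if a value violates the "Restrictions" column.
inline void validate(const SpectrumAlignmentParams& p)
{
  if (p.gapcost < 0.0f) throw std::invalid_argument("gapcost must be >= 0.0");
  if (p.affinegapcost < 0.0f) throw std::invalid_argument("affinegapcost must be >= 0.0");
  if (p.cutoff_score < 0.0f || p.cutoff_score > 1.0f)
    throw std::invalid_argument("cutoff_score must be within [0.0, 1.0]");
  if (p.bucketsize < 1) throw std::invalid_argument("bucketsize must be >= 1");
  if (p.anchorpoints < 1 || p.anchorpoints > 100)
    throw std::invalid_argument("anchorpoints must be within [1, 100]");
  if (p.mismatchscore > 0.0f) throw std::invalid_argument("mismatchscore must be <= 0.0");
  if (p.scorefunction != "SteinScottImproveScore" && p.scorefunction != "ZhangSimilarityScore")
    throw std::invalid_argument("unknown scorefunction");
}
```

In the class itself, parameters are passed through the inherited setParameters() method rather than through a structure like this one.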
Default constructor.
|
override |
Destructor.
|
private |
Copy constructor is not implemented -> private.
|
private |
Alignment with affine gap costs.
This alignment is based on the Needleman-Wunsch algorithm. To improve the time complexity, a banded version, known as k-alignment, was implemented. To save space, the alignment is calculated only from position xbegin to xend of one sequence and from ybegin to yend of the other. The result of the alignment is stored in the second argument. The first sequence is used as a template for the alignment.
xbegin | coordinate for the beginning of the template sequence. |
ybegin | coordinate for the beginning of the aligned sequence. |
xend | coordinate for the end of the template sequence. |
yend | coordinate for the end of the aligned sequence. |
pattern | template map. |
aligned | map to be aligned. |
xcoordinate | stores the positions of the anchor points |
ycoordinate | stores the retention times of the anchor points |
xcoordinatepattern | stores the reference positions of the anchor points from the pattern |
Exception::OutOfRange | if an out-of-bounds access occurs in pattern or aligned |
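The recurrences behind an affine-gap global alignment can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of this class: it works on a precomputed similarity matrix, omits the band restriction and the traceback, and the name `affineGapScore` is hypothetical. Gap costs are passed as positive numbers and subtracted, in line with the description of the gapcost parameter.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Global alignment score with affine gap costs (Gotoh-style recurrences).
// sim[i][j] is the similarity of pattern element i and aligned element j.
// gap_open and gap_extend are positive costs. Hypothetical helper, unbanded.
inline float affineGapScore(const std::vector<std::vector<float>>& sim,
                            float gap_open, float gap_extend)
{
  const std::size_t n = sim.size();
  const std::size_t m = (n == 0) ? 0 : sim[0].size();
  const float neg_inf = -std::numeric_limits<float>::infinity();

  // M: cell ends in a match, X: gap skipping a pattern element,
  // Y: gap skipping an aligned element.
  std::vector<std::vector<float>> M(n + 1, std::vector<float>(m + 1, neg_inf));
  std::vector<std::vector<float>> X = M;
  std::vector<std::vector<float>> Y = M;

  M[0][0] = 0.0f;
  // Leading gaps: one opening cost plus an extension cost per further element.
  for (std::size_t i = 1; i <= n; ++i) X[i][0] = -gap_open - (i - 1) * gap_extend;
  for (std::size_t j = 1; j <= m; ++j) Y[0][j] = -gap_open - (j - 1) * gap_extend;

  for (std::size_t i = 1; i <= n; ++i)
  {
    for (std::size_t j = 1; j <= m; ++j)
    {
      M[i][j] = sim[i - 1][j - 1] +
                std::max({M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]});
      // Opening a gap costs gap_open; extending an open one costs gap_extend.
      X[i][j] = std::max({M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open});
      Y[i][j] = std::max({M[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open});
    }
  }
  return std::max({M[n][m], X[n][m], Y[n][m]});
}
```

A banded variant would evaluate only the cells that pass the band test, which reduces the work from all n * m cells to a strip around the diagonal.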
|
virtual |
Align peak maps.
|
private |
Calculates the size of the band for the alignment of two given sequences.
This function calculates the size of the band for the alignment. It takes three samples from the aligned sequence and tries to find the high-score pairs (matching against the template sequence). The high-score pair with the worst distance determines the size of k.
pattern | vector of pointers of the template sequence |
aligned | vector of pointers of the aligned sequence |
buffer | holds the calculated score of index i,j. |
column_row_orientation | indicates the order of the matrix |
xbegin | indicates the beginning of the template sequence |
xend | indicates the end of the template sequence |
ybegin | indicates the beginning of the aligned sequence |
yend | indicates the end of the aligned sequence |
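The idea of deriving k from a few sampled high-score pairs can be sketched like this. Several details are assumptions made for illustration only: the three samples are taken at the start, middle, and end of the aligned sequence, the best partner is the pattern index with the highest score, and distance is the absolute index difference. The name `estimateBand` and the callback-based scoring are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>

// Estimates a band size k from three sampled positions (hypothetical sketch).
// score(i, j) returns the similarity of pattern index i and aligned index j.
inline int estimateBand(std::size_t pattern_size, std::size_t aligned_size,
                        const std::function<float(std::size_t, std::size_t)>& score)
{
  if (pattern_size == 0 || aligned_size == 0) return 0;

  // Assumed sample positions: first, middle, and last aligned index.
  const std::size_t samples[3] = {0, aligned_size / 2, aligned_size - 1};
  long long k = 0;
  for (std::size_t j : samples)
  {
    // Find the pattern index that scores highest against sample j.
    std::size_t best_i = 0;
    float best = score(0, j);
    for (std::size_t i = 1; i < pattern_size; ++i)
    {
      const float s = score(i, j);
      if (s > best)
      {
        best = s;
        best_i = i;
      }
    }
    // The pair that lies farthest from the diagonal determines k.
    const long long dist =
        std::llabs(static_cast<long long>(best_i) - static_cast<long long>(j));
    if (dist > k) k = dist;
  }
  return static_cast<int>(k);
}
```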
|
private |
Prepares the data points from which the spline function is later constructed.
This function reduces the number of data values for the next step. The reduction is done by distributing the data points over a number of buckets. Within each bucket, only a defined number of points is selected and written back as data points. The selection within a bucket is done by score.
pattern | template map. |
aligned | map to be aligned. |
xcoordinate | stores the positions of the anchor points |
ycoordinate | stores the retention times of the anchor points |
xcoordinatepattern | stores the reference positions of the anchor points from the pattern |
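A bucket-based reduction of this kind might look as follows. This is a sketch under assumptions, not the class's code: buckets are formed from consecutive runs of `bucket_size` points, `percent` of each bucket is kept (at least one point), and the type `MatchPoint` and function `bucketFilter` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical anchor candidate: a position and the score of its spectrum pair.
struct MatchPoint
{
  int position;
  float score;
};

// Keeps the best-scoring `percent` of the points in each bucket of
// `bucket_size` consecutive points; output stays ordered by position.
inline std::vector<MatchPoint> bucketFilter(const std::vector<MatchPoint>& points,
                                            std::size_t bucket_size, std::size_t percent)
{
  std::vector<MatchPoint> kept;
  if (bucket_size == 0) return kept;

  for (std::size_t begin = 0; begin < points.size(); begin += bucket_size)
  {
    const std::size_t end = std::min(begin + bucket_size, points.size());
    std::vector<MatchPoint> bucket(points.begin() + static_cast<std::ptrdiff_t>(begin),
                                   points.begin() + static_cast<std::ptrdiff_t>(end));

    // Number of points to keep: at least one, at most the whole bucket.
    std::size_t keep = std::max<std::size_t>(1, bucket.size() * percent / 100);
    keep = std::min(keep, bucket.size());

    // Move the `keep` highest scores to the front, then drop the rest.
    std::partial_sort(bucket.begin(), bucket.begin() + static_cast<std::ptrdiff_t>(keep),
                      bucket.end(),
                      [](const MatchPoint& a, const MatchPoint& b) { return a.score > b.score; });
    bucket.erase(bucket.begin() + static_cast<std::ptrdiff_t>(keep), bucket.end());

    // Restore position order inside the bucket before appending.
    std::sort(bucket.begin(), bucket.end(),
              [](const MatchPoint& a, const MatchPoint& b) { return a.position < b.position; });
    kept.insert(kept.end(), bucket.begin(), bucket.end());
  }
  return kept;
}
```

Keeping fewer, higher-scoring points per bucket gives the later spline fit less noise to follow, which is the stated purpose of the reduction.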
|
private |
Creates files for the debugging.
This function is only active if the debug_ flag is true. It creates the following files:
Debugscoreheatmap.r contains the scores of the spectra against each other from the alignment, as well as the traceback. DebugRscript is the R script that reads this data, so both files only work with R. Start R and type main(location of debugscoreheatmap.r). The output is a heatmap of each sub-alignment. Debugtraceback.txt shows the path of the traceback and can be plotted with gnuplot.
pattern | template map. |
aligned | map to be aligned. |
|
private |
Rounds the score of two spectra; only needed for debugging.
This function rounds the score of two spectra. This is necessary for some functions in debug mode.
Tests whether cell (i, j) of the grid is inside the band.
The function returns true if the cell satisfies the condition -k <= i - j <= k + n - m; otherwise it returns false.
i | coordinate i |
j | coordinate j |
n | size of column |
m | size of row |
k_ | size of k_ |
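The band condition can be written as a small stand-alone function. This is a minimal sketch of the inequality stated above; the free function `insideBand` is a hypothetical stand-in for the private member. Signed arithmetic is used because i - j and n - m can be negative, which unsigned Size values cannot represent.

```cpp
#include <cstddef>

// Returns true if cell (i, j) satisfies -k <= i - j <= k + n - m.
inline bool insideBand(std::size_t i, std::size_t j,
                       std::size_t n, std::size_t m, int k)
{
  // Convert to a signed type before subtracting to avoid unsigned wrap-around.
  const long long diff = static_cast<long long>(i) - static_cast<long long>(j);
  const long long skew = static_cast<long long>(n) - static_cast<long long>(m);
  const long long band = static_cast<long long>(k);
  return -band <= diff && diff <= band + skew;
}
```

For n = m = 10 and k = 2, cell (5, 5) is inside the band, while cells (5, 8) and (8, 5) are outside it.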
|
private |
Filters the spectra to retain only MS level 1.
The alignment works only on MSLevel 1 data, so a filter has to be run.
peakmap | map which has to be filtered |
spectrum_pointer_container | output container, where pointers of the MSSpectrum are saved (only with MS level 1) |
Exception::IllegalArgument | is thrown if no spectra are contained in peakmap |
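The filtering step can be illustrated with plain types. This is a sketch under assumptions: `Spectrum` is a hypothetical stand-in for MSSpectrum, `std::invalid_argument` stands in for Exception::IllegalArgument, and clearing the output container first is a choice made for this example.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical minimal spectrum: MS level and retention time only.
struct Spectrum
{
  int ms_level;
  double rt;
};

// Collects pointers to all MS level 1 spectra of `map` in `out`.
// Throws if the map contains no spectra at all.
inline void filterMsLevel1(std::vector<Spectrum>& map, std::vector<Spectrum*>& out)
{
  if (map.empty())
  {
    throw std::invalid_argument("no spectra contained in map");
  }
  out.clear();  // assumption: the output container is rebuilt from scratch
  for (Spectrum& s : map)
  {
    if (s.ms_level == 1)
    {
      out.push_back(&s);
    }
  }
}
```

Because the container stores pointers, the map must not be resized while the pointers are still in use.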
|
private |
Assignment operator is not implemented -> private.
|
private |
Prepares the sequences for the alignment and internally calls the main alignment function.
This function takes two MSExperiments. The first must already be filtered so that the sequence contains only MS level 1 spectra. The second need not fulfill this restriction; it is filtered automatically. From these two arguments, a pre-calculation finds corresponding data points (at most 4) that are used to build alignment blocks. After the alignment, a re-transformation is done so that the new retention times appear in the original data.
pattern | template map. |
aligned | map which has to be aligned. |
transformation | container for rebuilding the alignment only by specific data-points |
|
private |
Calculates the score of two given MSSpectra; internally calls scoring_.
This function calculates the score of two MSSpectra, which are selected by the matrix coordinates i and j. To map these coordinates to the right indices in the sequences, the beginning of each sequence is also passed to the function. A flag indicates the labeling of the axes. The buffer matrix stores the scoring results, so when the band expands, known scores are only looked up.
i | an index in the matrix. |
j | an index in the matrix. |
patternbegin | indicates the beginning of the template sequence |
alignbegin | indicates the beginning of the aligned sequence |
pattern | vector of pointers of the template sequence |
aligned | vector of pointers of the aligned sequence |
buffer | holds the calculated score of index i,j. |
column_row_orientation | indicates the order of the matrix |
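The role of the buffer can be shown with a small memoization sketch. It uses the same nested-map layout as the buffer parameter but leaves out the sequence offsets and the orientation flag; the name `cachedScore` and the callback-based scoring are assumptions made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <map>

// Same nested-map layout as the `buffer` parameter: buffer[i][j] -> score.
using ScoreBuffer = std::map<std::size_t, std::map<std::size_t, float>>;

// Returns the score of cell (i, j). The score is computed only on the first
// request and stored; later requests for the same cell are pure lookups.
inline float cachedScore(std::size_t i, std::size_t j, ScoreBuffer& buffer,
                         const std::function<float(std::size_t, std::size_t)>& score)
{
  const auto row = buffer.find(i);
  if (row != buffer.end())
  {
    const auto cell = row->second.find(j);
    if (cell != row->second.end())
    {
      return cell->second;  // already known, no rescoring
    }
  }
  const float s = score(i, j);
  buffer[i][j] = s;
  return s;
}
```

When the band is widened and the alignment is recomputed, only cells that were never scored before trigger a call to the scoring function.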
|
private |
Returns the score of two given MSSpectra by calling the score function.
|
overrideprivatevirtual |
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
|
private |
Defines the amount of anchor points which are selected within one bucket.
|
private |
Defines the size of one bucket.
|
private |
Pointer to the selected scoring function.
|
private |
The minimal score to be counted as a mismatch (range 0.0 - 1.0).
|
private |
Debug mode flag (default: false).
|
private |
Container holding the score of the matchmatrix and also the insertmatrix.
|
private |
Container holding only the scores of the spectra.
|
private |
Container holding the path of the traceback.
|
private |
Extension cost after a gap is open.
|
private |
Represents the gap cost for opening or closing a gap in the alignment.
|
private |
Represents the cost of a mismatch in the alignment.
|
private |
Container holding the score of each cell (matchmatrix, insertmatrix, traceback).
|
private |
The minimum score for counting as a match (1 - cutoffScore_).