OpenMS
Loading...
Searching...
No Matches
NeedlemanWunsch Class Reference

Global-alignment similarity score for two amino-acid sequences using the Needleman-Wunsch algorithm. More...

#include <OpenMS/ANALYSIS/SEQUENCE/NeedlemanWunsch.h>

Collaboration diagram for NeedlemanWunsch:
[legend]

Public Types

enum class  ScoringMatrix { identity , PAM30MS , SIZE_OF_SCORINGMATRIX }
 Substitution matrices bundled with this class. More...
 

Public Member Functions

 NeedlemanWunsch (ScoringMatrix matrix, int penalty)
 Construct with a chosen scoring matrix and gap penalty.
 
 NeedlemanWunsch ()=default
 Default constructor.
 
 ~NeedlemanWunsch ()=default
 Destructor.
 
int align (const std::string &seq1, const std::string &seq2)
 Compute the Needleman-Wunsch similarity score of seq1 vs. seq2.
 
void setMatrix (const ScoringMatrix &matrix)
 Select the scoring matrix used by align.
 
void setMatrix (const std::string &matrix)
 Select the scoring matrix by name; the accepted names are listed in NamesOfScoringMatrices.
 
void setPenalty (const int penalty)
 Set the linear gap penalty subtracted per gap step.
 
ScoringMatrix getMatrix () const
 Currently selected scoring matrix.
 
int getPenalty () const
 Currently configured gap penalty.
 

Static Public Attributes

static const std::vector< std::string > NamesOfScoringMatrices
 Lookup table for setMatrix(const std::string&); current entries: "identity", "PAM30MS".
 

Private Attributes

int gap_penalty_ = 5
 Linear gap penalty.
 
ScoringMatrix my_matrix_ = ScoringMatrix::PAM30MS
 Currently selected scoring matrix.
 
std::vector< int > first_row_ {}
 First row of the two-row rolling buffer used by align.
 
std::vector< int > second_row_ {}
 Second row of the two-row rolling buffer used by align.
 

Detailed Description

Global-alignment similarity score for two amino-acid sequences using the Needleman-Wunsch algorithm.

Computes the score (not the alignment itself) of the best global alignment of two amino-acid sequences against a substitution matrix plus a linear gap penalty. The implementation uses a two-row rolling buffer.

Note
The implementation indexes the substitution matrix by c - 'A', so the input strings must contain only uppercase ASCII letters in [A-Z]. Characters J, O, and U have INT16_MAX rows/columns in both bundled matrices and effectively forbid any alignment involving them. Lowercase or non-ASCII characters lead to out-of-bounds reads.

Member Enumeration Documentation

◆ ScoringMatrix

enum class ScoringMatrix
strong

Substitution matrices bundled with this class.

Enumerator
identity 

+1 on the diagonal, 0 elsewhere; J / O / U columns and rows are forbidden (INT16_MAX).

PAM30MS 

PAM30 matrix extended for MS-style substitutions (default).

SIZE_OF_SCORINGMATRIX 

Number of valid matrices.

Constructor & Destructor Documentation

◆ NeedlemanWunsch() [1/2]

NeedlemanWunsch ( ScoringMatrix  matrix,
int  penalty 
)

Construct with a chosen scoring matrix and gap penalty.

Parameters
[in]matrixScoring matrix to use.
[in]penaltyLinear gap penalty (subtracted per gap step).

◆ NeedlemanWunsch() [2/2]

NeedlemanWunsch ( )
default

Default constructor.

Uses ScoringMatrix::PAM30MS and a gap penalty of 5.

◆ ~NeedlemanWunsch()

~NeedlemanWunsch ( )
default

Destructor.

Member Function Documentation

◆ align()

int align ( const std::string &  seq1,
const std::string &  seq2 
)

Compute the Needleman-Wunsch similarity score of seq1 vs. seq2.

Uses the currently selected scoring matrix and gap penalty. The sequences must contain only uppercase [A-Z] characters; see the class brief for the constraints.

Parameters
[in]seq1First amino-acid sequence.
[in]seq2Second amino-acid sequence.
Returns
Score of the best global alignment.

◆ getMatrix()

ScoringMatrix getMatrix ( ) const

Currently selected scoring matrix.

Returns
The scoring matrix passed to the last setMatrix call (or the default PAM30MS if setMatrix has not been called).

◆ getPenalty()

int getPenalty ( ) const

Currently configured gap penalty.

Returns
The penalty passed to the last setPenalty call (or the default 5 if setPenalty has not been called).

◆ setMatrix() [1/2]

void setMatrix ( const ScoringMatrix matrix)

Select the scoring matrix used by align.

Parameters
[in]matrixScoring matrix to use.

◆ setMatrix() [2/2]

void setMatrix ( const std::string &  matrix)

Select the scoring matrix by name; the accepted names are listed in NamesOfScoringMatrices.

Parameters
[in]matrixMatrix name (currently "identity" or "PAM30MS").
Exceptions
Exception::IllegalArgumentwhen matrix is not one of the names in NamesOfScoringMatrices. The error message lists the valid names.

◆ setPenalty()

void setPenalty ( const int  penalty)

Set the linear gap penalty subtracted per gap step.

Parameters
[in]penaltyGap penalty (typically positive); the algorithm subtracts penalty from the score for each gap step.

Member Data Documentation

◆ first_row_

std::vector<int> first_row_ {}
private

First row of the two-row rolling buffer used by align.

◆ gap_penalty_

int gap_penalty_ = 5
private

Linear gap penalty.

◆ my_matrix_

ScoringMatrix my_matrix_ = ScoringMatrix::PAM30MS
private

Currently selected scoring matrix.

◆ NamesOfScoringMatrices

const std::vector<std::string> NamesOfScoringMatrices
static

Lookup table for setMatrix(const std::string&); current entries: "identity", "PAM30MS".

◆ second_row_

std::vector<int> second_row_ {}
private

Second row of the two-row rolling buffer used by align.