QTCluster Class Reference

A representation of a QT cluster used for feature grouping.

#include <OpenMS/DATASTRUCTURES/QTCluster.h>

Public Member Functions

 QTCluster (GridFeature *center_point, Size num_maps, double max_distance, bool use_IDs, Int x_coord, Int y_coord)
 Detailed constructor.

virtual ~QTCluster ()
 Destructor.

GridFeature * getCenterPoint ()
 Returns the cluster center.

double getCenterRT () const
 Returns the RT value of the cluster center.

double getCenterMZ () const
 Returns the m/z value of the cluster center.

Int getXCoord () const
 Returns the x coordinate in the grid.

Int getYCoord () const
 Returns the y coordinate in the grid.

Size size () const
 Returns the size of the cluster (number of elements, incl. center)

bool operator< (QTCluster &cluster)
 Compare by quality.

void add (GridFeature *element, double distance)
 Adds a new element/neighbor to the cluster.

void getElements (OpenMSBoost::unordered_map< Size, GridFeature *> &elements)
 Gets the clustered elements.

bool update (const OpenMSBoost::unordered_map< Size, GridFeature *> &removed)
 Updates the cluster after the indicated data points are removed.

double getQuality ()
 Returns the cluster quality.

const std::set< AASequence > & getAnnotations ()
 Returns the set of peptide sequences annotated to the cluster center.

void setInvalid ()
 Sets current cluster as invalid (also frees some memory)

bool isInvalid () const
 Whether current cluster is invalid.

void initializeCluster ()
 Has to be called before adding elements (calling QTCluster::add)

void finalizeCluster ()
 Has to be called after adding elements (after calling QTCluster::add one or multiple times)

OpenMSBoost::unordered_map< Size, std::vector< GridFeature * > > getAllNeighbors ()
 Get all current neighbors.
 

Private Types

typedef std::multimap< double, GridFeature * > NeighborListType
 
typedef OpenMSBoost::unordered_map< Size, NeighborListType > NeighborMapMulti
 
typedef std::pair< double, GridFeature * > NeighborPairType
 
typedef OpenMSBoost::unordered_map< Size, NeighborPairType > NeighborMap
 

Private Member Functions

 QTCluster ()
 Base constructor (not accessible)

void computeQuality_ ()
 Computes the quality of the cluster.

double optimizeAnnotations_ ()
 Finds the optimal annotation (peptide sequences) for the cluster.
 

Private Attributes

GridFeature * center_point_
 Pointer to the cluster center.

NeighborMap neighbors_
 Map that keeps track of the best current feature for each map.

NeighborMapMulti * tmp_neighbors_
 Temporary map tracking *all* neighbors.

double max_distance_
 Maximum distance of a point that can still belong to the cluster.

Size num_maps_
 Number of input maps.

double quality_
 Quality of the cluster.

bool changed_
 Has the cluster changed (if yes, quality needs to be recomputed)?

bool use_IDs_
 Keep track of peptide IDs and use them for matching?

bool valid_
 Whether current cluster is valid.

bool collect_annotations_
 Whether initial collection of all neighbors is needed.

bool finalized_
 Whether the cluster has been finalized (if true, no new elements are accepted)

Int x_coord_
 x coordinate in the grid cell

Int y_coord_
 y coordinate in the grid cell

std::set< AASequence > annotations_
 Set of annotations of the cluster.
 

Detailed Description

A representation of a QT cluster used for feature grouping.

Ultimately, a cluster represents a group of corresponding features (or consensus features) from different input maps (feature maps or consensus maps).

Clusters are defined by their center points (one feature each). A cluster also stores a number of potential cluster elements (other features) from different input maps, together with their distances to the cluster center. Every feature that satisfies certain constraints with respect to the cluster center is a potential cluster element. However, since a feature group can only contain one feature from each input map, only the "best" (i.e. closest to the cluster center) such feature is considered a true cluster element. To save memory, only the "best" element for each map is stored inside a cluster.
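
The "one best element per map" rule can be illustrated with a minimal sketch. It assumes a hypothetical `Feature` stand-in for GridFeature and a free function `keepIfBest` (neither is part of OpenMS), and it uses `std::unordered_map` in place of `OpenMSBoost::unordered_map`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for GridFeature; only the map index matters here.
struct Feature
{
  std::size_t map_index;
};

// Mirrors the NeighborPairType / NeighborMap typedefs, using std:: containers.
using NeighborPair = std::pair<double, const Feature*>;
using NeighborMap = std::unordered_map<std::size_t, NeighborPair>;

// Keep `candidate` only if it is the first feature seen for its map or is
// closer to the center than the feature currently stored for that map.
// Returns true if the stored neighbor changed.
bool keepIfBest(NeighborMap& best, const Feature* candidate, double distance)
{
  assert(candidate != nullptr);
  auto it = best.find(candidate->map_index);
  if (it == best.end())
  {
    best.emplace(candidate->map_index, NeighborPair(distance, candidate));
    return true;
  }
  if (distance < it->second.first)
  {
    it->second = NeighborPair(distance, candidate);
    return true;
  }
  return false;
}
```

With this layout the memory per cluster is bounded by the number of input maps, regardless of how many candidate features fall within the distance threshold.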

The QT clustering algorithm has the characteristic of initially producing all possible, overlapping clusters. Iteratively, the best cluster is then extracted and the clustering is recomputed for the remaining points.

In our implementation, multiple rounds of clustering are not necessary. Instead, the clustering is updated in each iteration. This is the reason for storing all potential cluster elements: When a certain cluster is finalized, its elements have to be removed from the remaining clusters, and affected clusters change their composition. (Note that clusters can also be invalidated by this, if the cluster center is being removed.)
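
The extract-and-update loop described above can be sketched as follows. This is a simplified illustration, not the QTClusterFinder code: features are plain integer ids, `SimpleCluster` and `extractGroups` are hypothetical names, and cluster size is used as a stand-in for the quality measure.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical simplified cluster: features are plain integer ids.
struct SimpleCluster
{
  int center;
  std::set<int> members; // other elements, excluding the center
  bool valid;

  std::size_t size() const { return members.size() + 1; }
};

// Repeatedly extract the largest valid cluster, then update the remaining
// clusters in place instead of re-clustering from scratch.
std::vector<std::set<int>> extractGroups(std::vector<SimpleCluster> clusters)
{
  std::vector<std::set<int>> groups;
  while (true)
  {
    SimpleCluster* best = nullptr;
    for (SimpleCluster& c : clusters)
    {
      if (c.valid && (best == nullptr || c.size() > best->size())) best = &c;
    }
    if (best == nullptr) break; // no valid cluster left

    std::set<int> removed = best->members;
    removed.insert(best->center);
    groups.push_back(removed);
    best->valid = false;

    for (SimpleCluster& c : clusters)
    {
      if (!c.valid) continue;
      if (removed.count(c.center) > 0)
      {
        c.valid = false; // center was consumed: the cluster is invalidated
        continue;
      }
      for (int id : removed) c.members.erase(id);
    }
  }
  return groups;
}
```

Every iteration invalidates at least the extracted cluster, so the loop terminates after at most as many rounds as there are clusters.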

The quality of a cluster is the normalized average distance to the cluster center for present and missing cluster elements. The distance value for missing elements (if the cluster contains no feature from a certain input map) is the user-defined threshold that marks the maximum allowed radius of a cluster.
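
One possible reading of this definition is sketched below. The function name `normalizedAverageDistance` is hypothetical, and two details are assumptions rather than facts from this page: the average is taken over the maps other than the center's own map, and the result is normalized by dividing by the maximum distance. The actual computeQuality_ may scale or invert the value differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Normalized average distance to the center over all non-center maps.
// `distances` holds one entry per map that contributed an element;
// every other non-center map is charged `max_distance`.
double normalizedAverageDistance(const std::vector<double>& distances,
                                 std::size_t num_maps, double max_distance)
{
  assert(num_maps >= 2 && max_distance > 0.0);
  const std::size_t other_maps = num_maps - 1; // assumption: center's map excluded
  assert(distances.size() <= other_maps);

  double total = 0.0;
  for (double d : distances) total += d;
  // Missing maps count with the maximum allowed cluster radius.
  total += static_cast<double>(other_maps - distances.size()) * max_distance;

  return total / (static_cast<double>(other_maps) * max_distance);
}
```

Under these assumptions the value lies between 0 (every map contributes a feature at distance zero) and 1 (no other map contributes a feature).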

When adding elements to the cluster, the client must call initializeCluster first and finalizeCluster after adding the last element. After finalizeCluster, no further elements may be added through the add function unless the client calls initializeCluster again.
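
The required call order can be made explicit with a small guard, sketched here on a hypothetical `GuardedCluster` class. How QTCluster itself reacts to an out-of-order call (assertion, exception, or undefined behavior) is not stated on this page; throwing `std::logic_error` and starting in the finalized state are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container that enforces the initialize/add/finalize order.
class GuardedCluster
{
public:
  void initializeCluster() { finalized_ = false; }

  void add(double distance)
  {
    if (finalized_) throw std::logic_error("add() called on a finalized cluster");
    assert(distance >= 0.0);
    distances_.push_back(distance);
  }

  void finalizeCluster() { finalized_ = true; }

  std::size_t size() const { return distances_.size() + 1; } // incl. center

private:
  bool finalized_ = true; // starts closed: initializeCluster() must come first
  std::vector<double> distances_;
};
```

A caller would bracket a batch of add calls with initializeCluster and finalizeCluster, and repeat the bracket for any later batch.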

See also
QTClusterFinder

Member Typedef Documentation

◆ NeighborListType

typedef std::multimap<double, GridFeature*> NeighborListType
private

◆ NeighborMap

typedef OpenMSBoost::unordered_map<Size, NeighborPairType> NeighborMap
private

◆ NeighborMapMulti

typedef OpenMSBoost::unordered_map<Size, NeighborListType> NeighborMapMulti
private

◆ NeighborPairType

typedef std::pair<double, GridFeature*> NeighborPairType
private

Constructor & Destructor Documentation

◆ QTCluster() [1/2]

QTCluster ( )
private

Base constructor (not accessible)

◆ QTCluster() [2/2]

QTCluster ( GridFeature *  center_point,
Size  num_maps,
double  max_distance,
bool  use_IDs,
Int  x_coord,
Int  y_coord 
)

Detailed constructor.

Parameters
center_point  Pointer to the center point
num_maps  Number of input maps
max_distance  Maximum allowed distance of two points
use_IDs  Use peptide annotations?

◆ ~QTCluster()

virtual ~QTCluster ( )
virtual

Destructor.

Member Function Documentation

◆ add()

void add ( GridFeature *  element,
double  distance 
)

Adds a new element/neighbor to the cluster.

Note
There is no check whether the element/neighbor already exists in the cluster!
Parameters
element  The element to be added
distance  Distance of the element to the center point

◆ computeQuality_()

void computeQuality_ ( )
private

Computes the quality of the cluster.

◆ finalizeCluster()

void finalizeCluster ( )

Has to be called after adding elements (after calling QTCluster::add one or multiple times)

◆ getAllNeighbors()

OpenMSBoost::unordered_map<Size, std::vector<GridFeature*> > getAllNeighbors ( )

Get all current neighbors.

◆ getAnnotations()

const std::set<AASequence>& getAnnotations ( )

Returns the set of peptide sequences annotated to the cluster center.

◆ getCenterMZ()

double getCenterMZ ( ) const

Returns the m/z value of the cluster center.

◆ getCenterPoint()

GridFeature* getCenterPoint ( )

Returns the cluster center.

◆ getCenterRT()

double getCenterRT ( ) const

Returns the RT value of the cluster center.

◆ getElements()

void getElements ( OpenMSBoost::unordered_map< Size, GridFeature *> &  elements)

Gets the clustered elements.

◆ getQuality()

double getQuality ( )

Returns the cluster quality.

◆ getXCoord()

Int getXCoord ( ) const

Returns the x coordinate in the grid.

◆ getYCoord()

Int getYCoord ( ) const

Returns the y coordinate in the grid.

◆ initializeCluster()

void initializeCluster ( )

Has to be called before adding elements (calling QTCluster::add)

◆ isInvalid()

bool isInvalid ( ) const
inline

Whether current cluster is invalid.

◆ operator<()

bool operator< ( QTCluster &  cluster)

Compare by quality.
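
Both getQuality and operator< are declared non-const here, and the class keeps a changed_ flag. A plausible explanation, offered as an assumption and not confirmed by this page, is that the quality is cached and recomputed on demand. The hypothetical `LazyQuality` class below sketches that pattern; it uses the sum of distances as a placeholder for the real quality measure.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical class: caches a value and recomputes it only after a change.
class LazyQuality
{
public:
  void add(double distance)
  {
    assert(distance >= 0.0);
    distances_.push_back(distance);
    changed_ = true; // cached quality is now stale
  }

  // Non-const because it may refresh the cache.
  double getQuality()
  {
    if (changed_)
    {
      quality_ = std::accumulate(distances_.begin(), distances_.end(), 0.0);
      changed_ = false;
      ++recomputations;
    }
    return quality_;
  }

  // Takes a non-const reference so that `other` can refresh its cache too.
  bool operator<(LazyQuality& other) { return getQuality() < other.getQuality(); }

  int recomputations = 0; // exposed only to make the caching observable

private:
  std::vector<double> distances_;
  double quality_ = 0.0;
  bool changed_ = false;
};
```

Repeated comparisons between unchanged objects then reuse the cached value instead of recomputing it.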

◆ optimizeAnnotations_()

double optimizeAnnotations_ ( )
private

Finds the optimal annotation (peptide sequences) for the cluster.

The optimal annotation is the one that results in the best quality. It is stored in annotations_.

This function is only needed when peptide ids are used and the current center point does not have any peptide id associated with it. In this case, it is not clear which peptide id the current cluster should use. The function thus iterates through all possible peptide ids and selects the one producing the best cluster.

This function needs access to all possible neighbors for this cluster and thus can only be run when tmp_neighbors_ is filled (which is during the filling of a cluster). The function thus cannot be called after finalizing the cluster.

Returns
The total distance between cluster elements and the center.
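
The search described above can be sketched as an exhaustive loop over candidate annotations. Everything named here is hypothetical (`Neighbor`, `bestAnnotation`, a single string per annotation instead of a set of AASequence), and the sketch assumes that unannotated neighbors are compatible with every candidate and that maps without a compatible neighbor are charged the maximum distance.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical neighbor record; an empty annotation means "unannotated".
struct Neighbor
{
  std::size_t map_index;
  double distance;
  std::string annotation;
};

// Try every annotation seen among the neighbors, keep the closest compatible
// neighbor per map, and return the annotation with the lowest total distance.
std::pair<std::string, double> bestAnnotation(const std::vector<Neighbor>& neighbors,
                                              std::size_t num_other_maps,
                                              double max_distance)
{
  std::set<std::string> candidates;
  candidates.insert(""); // also consider using no annotation at all
  for (const Neighbor& n : neighbors) candidates.insert(n.annotation);

  std::pair<std::string, double> best("", 0.0);
  bool first = true;
  for (const std::string& candidate : candidates)
  {
    std::map<std::size_t, double> closest; // map index -> smallest distance
    for (const Neighbor& n : neighbors)
    {
      if (!n.annotation.empty() && n.annotation != candidate) continue;
      auto it = closest.find(n.map_index);
      if (it == closest.end() || n.distance < it->second) closest[n.map_index] = n.distance;
    }
    assert(closest.size() <= num_other_maps);
    // Maps without a compatible neighbor are charged the maximum distance.
    double total = static_cast<double>(num_other_maps - closest.size()) * max_distance;
    for (const auto& entry : closest) total += entry.second;
    if (first || total < best.second)
    {
      best = std::make_pair(candidate, total);
      first = false;
    }
  }
  return best;
}
```

The returned pair carries the winning annotation together with its total distance, which corresponds to the return value documented for optimizeAnnotations_.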

◆ setInvalid()

void setInvalid ( )

Sets current cluster as invalid (also frees some memory)

Note
Do not attempt to use the cluster again once it is invalid; some internal data structures have been cleared.

◆ size()

Size size ( ) const

Returns the size of the cluster (number of elements, incl. center)

◆ update()

bool update ( const OpenMSBoost::unordered_map< Size, GridFeature *> &  removed)

Updates the cluster after the indicated data points are removed.

Parameters
removed  The data points to be removed from the cluster
Returns
Whether the cluster composition has changed due to the update
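
A minimal sketch of such an update step is shown below, using std:: containers and a hypothetical `Feature` stand-in and `dropRemoved` function. It only drops stored neighbors that match an entry of `removed`; whether the real class also handles a removed center inside update, or substitutes a next-best neighbor, is not stated in this section and is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

struct Feature { int id; }; // hypothetical stand-in for GridFeature

using NeighborMap = std::unordered_map<std::size_t, std::pair<double, const Feature*>>;
using RemovedMap = std::unordered_map<std::size_t, const Feature*>;

// Drop every stored neighbor that appears in `removed` (same map index and
// same pointer). Returns true if at least one neighbor was dropped.
bool dropRemoved(NeighborMap& neighbors, const RemovedMap& removed)
{
  bool changed = false;
  for (const auto& entry : removed)
  {
    assert(entry.second != nullptr);
    auto it = neighbors.find(entry.first);
    if (it != neighbors.end() && it->second.second == entry.second)
    {
      neighbors.erase(it);
      changed = true;
    }
  }
  return changed;
}
```

The boolean result lets the caller decide whether a cached quality value needs to be recomputed.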

Member Data Documentation

◆ annotations_

std::set<AASequence> annotations_
private

Set of annotations of the cluster.

The set of peptide sequences that is compatible with the cluster center and results in the best cluster quality.

◆ center_point_

GridFeature* center_point_
private

Pointer to the cluster center.

◆ changed_

bool changed_
private

Has the cluster changed (if yes, quality needs to be recomputed)?

◆ collect_annotations_

bool collect_annotations_
private

Whether initial collection of all neighbors is needed.

This variable stores whether we need to collect all annotations first before we can decide upon the best set of cluster points. This is usually only necessary if the center point does not have an annotation but we want to use ids.

◆ finalized_

bool finalized_
private

Whether the cluster has been finalized (if true, no new elements are accepted)

◆ max_distance_

double max_distance_
private

Maximum distance of a point that can still belong to the cluster.

◆ neighbors_

NeighborMap neighbors_
private

Map that keeps track of the best current feature for each map.

◆ num_maps_

Size num_maps_
private

Number of input maps.

◆ quality_

double quality_
private

Quality of the cluster.

◆ tmp_neighbors_

NeighborMapMulti* tmp_neighbors_
private

Temporary map tracking *all* neighbors.

For each input run, a multimap which contains pointers to all neighboring elements and the respective distance.
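
Keying the per-map multimap by distance means the closest neighbor is always the first entry. The sketch below illustrates how such a structure could be collapsed to one best neighbor per map; `Feature` and `closestPerMap` are hypothetical names, std:: containers replace OpenMSBoost::unordered_map, and the real finalization step may do more (for example, take annotations into account).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <unordered_map>
#include <utility>

struct Feature { int id; }; // hypothetical stand-in for GridFeature

using NeighborList = std::multimap<double, const Feature*>;
using NeighborMapMulti = std::unordered_map<std::size_t, NeighborList>;
using NeighborMap = std::unordered_map<std::size_t, std::pair<double, const Feature*>>;

// Reduce "all neighbors per map" to "closest neighbor per map".
NeighborMap closestPerMap(const NeighborMapMulti& all)
{
  NeighborMap best;
  for (const auto& entry : all)
  {
    if (entry.second.empty()) continue;
    // A multimap is ordered by key, so begin() holds the smallest distance.
    assert(entry.second.begin()->second != nullptr);
    best[entry.first] = *entry.second.begin();
  }
  return best;
}
```

Maps whose neighbor list is empty simply do not appear in the result.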

◆ use_IDs_

bool use_IDs_
private

Keep track of peptide IDs and use them for matching?

◆ valid_

bool valid_
private

Whether current cluster is valid.

◆ x_coord_

Int x_coord_
private

x coordinate in the grid cell

◆ y_coord_

Int y_coord_
private

y coordinate in the grid cell


OpenMS / TOPP release 2.3.0 Documentation generated on Tue Jan 9 2018 18:22:12 using doxygen 1.8.13