OpenMS
SpectrumLookup Class Reference

Helper class for looking up spectra based on different attributes.

#include <OpenMS/METADATA/SpectrumLookup.h>


Public Member Functions

 SpectrumLookup ()
 Constructor.
 
virtual ~SpectrumLookup ()
 Destructor.
 
bool empty () const
 Check if any spectra were set.
 
template<typename SpectrumContainer >
void readSpectra (const SpectrumContainer &spectra, const std::string &scan_regexp=default_scan_regexp)
 Read and index spectra for later look-up.
 
Size findByRT (double rt) const
 Look up spectrum by retention time (RT).
 
Size findByNativeID (const std::string &native_id) const
 Look up spectrum by native ID.
 
Size findByIndex (Size index, bool count_from_one=false) const
 Look up spectrum by index (position in the vector of spectra).
 
Size findByScanNumber (Size scan_number) const
 Look up spectrum by scan number (extracted from the native ID).
 
Size findByReference (const std::string &spectrum_ref) const
 Look up spectrum by reference.
 
void addReferenceFormat (const std::string &regexp)
 Register a possible format for a spectrum reference.
 

Static Public Member Functions

static Int extractScanNumber (const std::string &native_id, const boost::regex &scan_regexp, bool no_error=false)
 Extract the scan number from the native ID of a spectrum.
 
static Int extractScanNumber (const std::string &native_id, const std::string &native_id_type_accession)
 Extract the scan number from the native ID using a CV accession.
 
static std::string getRegExFromNativeID (const std::string &native_id)
 Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber.
 
static bool isNativeID (const std::string &id)
 Simple prefix check if a spectrum identifier id is a nativeID from a vendor file.
 

Public Attributes

std::vector< boost::regex > reference_formats
 Possible formats of spectrum references, defined as regular expressions.
 
double rt_tolerance
 Tolerance for look-up by retention time.
 

Static Public Attributes

static const std::string & default_scan_regexp
 Default regular expression for extracting scan numbers from spectrum native IDs.
 

Protected Member Functions

void addEntry_ (Size index, double rt, Int scan_number, const std::string &native_id)
 Add a look-up entry for a spectrum.
 
Size findByRegExpMatch_ (const std::string &spectrum_ref, const std::string &regexp, const boost::smatch &match) const
 Look up spectrum by regular expression match.
 
void setScanRegExp_ (const std::string &scan_regexp)
 Set the regular expression for extracting scan numbers from spectrum native IDs.
 

Protected Attributes

Size n_spectra_
 Number of spectra.
 
boost::regex scan_regexp_
 Regular expression to extract scan numbers.
 
std::vector< std::string > regexp_name_list_
 Named groups in vector format.
 
std::map< double, Size > rts_
 Mapping: RT -> spectrum index.
 
std::map< std::string, Size > ids_
 Mapping: native ID -> spectrum index.
 
std::map< Size, Size > scans_
 Mapping: scan number -> spectrum index.
 

Static Protected Attributes

static const std::string & regexp_names_
 Named groups recognized in regular expression.
 

Private Member Functions

 SpectrumLookup (const SpectrumLookup &)
 Copy constructor (not implemented)
 
SpectrumLookup & operator= (const SpectrumLookup &)
 Assignment operator (not implemented).
 

Detailed Description

Helper class for looking up spectra based on different attributes.

This class provides functions for looking up spectra that are stored in a vector (e.g. MSExperiment::getSpectra()) by index, retention time, native ID, scan number (extracted from the native ID), or by a reference string containing any of the previous information ("spectrum reference").

Spectrum reference formats
Formats for spectrum references are defined by regular expressions that must contain at least one named group (i.e. "(?<GROUP>...)") referring to usable information. The following named groups are recognized and can be used to look up spectra:
  • INDEX0: spectrum index, i.e. position in the vector of spectra, counting from zero
  • INDEX1: spectrum index, i.e. position in the vector of spectra, counting from one
  • ID: spectrum native ID
  • SCAN: scan number (extracted from the native ID)
  • RT: retention time
For example, if the format of a spectrum reference is "scan=123", where 123 is the scan number, the expression "scan=(?<SCAN>\\d+)" can be used to extract that number, allowing look-up of the corresponding spectrum.
Reference formats are registered via addReferenceFormat(). Several possible formats can be added and will be tried in order by the function findByReference().
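The ordered matching described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the OpenMS implementation: std::regex has no named groups, so the hypothetical ReferenceFormat struct stores the field name next to the pattern, and parseReference() is a made-up helper that returns the field and captured text of the first format that matches.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registered reference format. std::regex does
// not support named groups such as (?<SCAN>...), so the field name is stored
// next to the pattern and refers to the first capture group.
struct ReferenceFormat
{
  std::string field;  // "INDEX0", "INDEX1", "ID", "SCAN" or "RT"
  std::regex pattern; // must contain one capture group
};

// Try the formats in registration order. Returns the field name and the
// captured text of the first format that matches, or std::nullopt if no
// format matches.
std::optional<std::pair<std::string, std::string>>
parseReference(const std::vector<ReferenceFormat>& formats, const std::string& ref)
{
  for (const ReferenceFormat& format : formats)
  {
    std::smatch match;
    if (std::regex_search(ref, match, format.pattern))
    {
      return std::make_pair(format.field, match[1].str());
    }
  }
  return std::nullopt;
}

// Usage sketch: two formats, tried in order.
inline void parseReferenceExample()
{
  const std::vector<ReferenceFormat> formats = {
    {"SCAN", std::regex("scan=(\\d+)")},
    {"INDEX0", std::regex("index=(\\d+)")}};
  const auto hit = parseReference(formats, "controllerType=0 scan=123");
  assert(hit && hit->first == "SCAN" && hit->second == "123");
  assert(!parseReference(formats, "spectrum_7"));
}
```

The returned field name would then decide which look-up (by index, native ID, scan number or RT) is applied to the captured value.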
Native ID Parsing
For standalone parsing of spectrum native IDs (without spectrum lookup), use SpectrumNativeIDParser directly:
// Extract scan number from native ID using CV accession
Int scan = SpectrumNativeIDParser::extractScanNumber("scan=42", "MS:1000768");
// Check if a string is a native ID
if (SpectrumNativeIDParser::isNativeID(spectrum_id))
{
  std::string regex = SpectrumNativeIDParser::getRegExFromNativeID(spectrum_id);
  // use regex for further processing...
}
See also
SpectrumMetaDataLookup, SpectrumNativeIDParser

Constructor & Destructor Documentation

◆ SpectrumLookup() [1/2]

SpectrumLookup ( )

Constructor.

◆ ~SpectrumLookup()

virtual ~SpectrumLookup ( )
virtual

Destructor.

◆ SpectrumLookup() [2/2]

SpectrumLookup ( const SpectrumLookup & )
private

Copy constructor (not implemented)

Member Function Documentation

◆ addEntry_()

void addEntry_ ( Size  index,
double  rt,
Int  scan_number,
const std::string &  native_id 
)
protected

Add a look-up entry for a spectrum.

Parameters
[in]  index        Spectrum index (position in the vector)
[in]  rt           Retention time
[in]  scan_number  Scan number
[in]  native_id    Native ID

◆ addReferenceFormat()

void addReferenceFormat ( const std::string &  regexp)

Register a possible format for a spectrum reference.

Parameters
[in]  regexp  Regular expression defining the format
Exceptions
Exception::IllegalArgument  if regexp does not contain any of the recognized named groups

The regular expression defining the reference format must contain one or more of the recognized named groups defined in SpectrumLookup::regexp_names_.

◆ empty()

bool empty ( ) const

Check if any spectra were set.

◆ extractScanNumber() [1/2]

static Int extractScanNumber ( const std::string &  native_id,
const boost::regex &  scan_regexp,
bool  no_error = false 
)
static

Extract the scan number from the native ID of a spectrum.

Parameters
[in]  native_id    Spectrum native ID
[in]  scan_regexp  Regular expression to use (must contain the named group "?<SCAN>")
[in]  no_error     Suppress the exception on failure
Exceptions
Exception::ParseError  if the scan number could not be extracted (unless no_error is set)
Returns
Scan number of the spectrum (or -1 on failure to extract)
Deprecated:
Use SpectrumNativeIDParser::extractScanNumber() instead for better discoverability.
See also
SpectrumNativeIDParser::extractScanNumber()
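
The documented contract (return the scan number, or throw unless no_error is set, in which case -1 is returned) can be illustrated with a standard-library sketch. The function name extractScanNumberSketch() is hypothetical, std::runtime_error stands in for Exception::ParseError, and the scan number is read from the first capture group because std::regex lacks the named group "?<SCAN>".

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Illustrative re-creation of the documented contract, not OpenMS code.
// The scan number is expected in the first capture group of scan_regexp.
int extractScanNumberSketch(const std::string& native_id,
                            const std::regex& scan_regexp,
                            bool no_error = false)
{
  std::smatch match;
  if (std::regex_search(native_id, match, scan_regexp) &&
      match.size() > 1 && match[1].matched)
  {
    try
    {
      return std::stoi(match[1].str());
    }
    catch (const std::exception&)
    {
      // Captured text was not a valid int: treated as a failed extraction.
    }
  }
  if (no_error)
  {
    return -1;
  }
  // std::runtime_error is a stand-in for Exception::ParseError.
  throw std::runtime_error("could not extract scan number from '" + native_id + "'");
}

// Usage sketch.
inline void extractScanNumberExample()
{
  const std::regex scan_re("scan=(\\d+)");
  assert(extractScanNumberSketch("controllerType=0 controllerNumber=1 scan=42", scan_re) == 42);
  assert(extractScanNumberSketch("index=7", scan_re, true) == -1);
}
```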

◆ extractScanNumber() [2/2]

static Int extractScanNumber ( const std::string &  native_id,
const std::string &  native_id_type_accession 
)
static

Extract the scan number from the native ID using a CV accession.

Parameters
[in]  native_id                 Spectrum native ID
[in]  native_id_type_accession  CV accession specifying the native ID format
Returns
Scan number of the spectrum (or -1 on failure to extract)
Deprecated:
Use SpectrumNativeIDParser::extractScanNumber() instead for better discoverability.
See also
SpectrumNativeIDParser::extractScanNumber()

◆ findByIndex()

Size findByIndex ( Size  index,
bool  count_from_one = false 
) const

Look up spectrum by index (position in the vector of spectra).

Parameters
[in]  index           Index to look up
[in]  count_from_one  Do indexes start counting at one (default: zero)?
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched
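
The effect of count_from_one can be shown with a small stand-alone sketch. It assumes that the returned value is the zero-based position and that an out-of-range index is rejected; findByIndexSketch() and its n_spectra parameter are hypothetical, and std::out_of_range stands in for Exception::ElementNotFound.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: validate an index against the number of indexed
// spectra and return its zero-based position.
std::size_t findByIndexSketch(std::size_t index, std::size_t n_spectra,
                              bool count_from_one = false)
{
  // With count_from_one, index 0 wraps around to the largest std::size_t
  // value (well-defined for unsigned types) and fails the range check below.
  const std::size_t adjusted = count_from_one ? index - 1 : index;
  if (adjusted >= n_spectra)
  {
    // Stand-in for Exception::ElementNotFound.
    throw std::out_of_range("no spectrum with the requested index");
  }
  return adjusted;
}

// Usage sketch: the same spectrum addressed with both conventions.
inline void findByIndexExample()
{
  assert(findByIndexSketch(2, 5) == 2);       // zero-based
  assert(findByIndexSketch(3, 5, true) == 2); // one-based
}
```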

◆ findByNativeID()

Size findByNativeID ( const std::string &  native_id) const

Look up spectrum by native ID.

Parameters
[in]  native_id  Native ID to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ findByReference()

Size findByReference ( const std::string &  spectrum_ref) const

Look up spectrum by reference.

Parameters
[in]  spectrum_ref  Spectrum reference to parse
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Exception::ParseError       if the reference could not be parsed (no reference format matched)
Returns
Index of the spectrum that matched

The regular expressions in SpectrumLookup::reference_formats are matched against the spectrum reference in order. The first one that matches is used to look up the spectrum.

◆ findByRegExpMatch_()

Size findByRegExpMatch_ ( const std::string &  spectrum_ref,
const std::string &  regexp,
const boost::smatch &  match 
) const
protected

Look up spectrum by regular expression match.

Parameters
[in]  spectrum_ref  Spectrum reference that was parsed
[in]  regexp        Regular expression used for parsing
[in]  match         Regular expression match
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ findByRT()

Size findByRT ( double  rt) const

Look up spectrum by retention time (RT).

Parameters
[in]  rt  Retention time to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

There is a tolerance for matching of RT values defined by SpectrumLookup::rt_tolerance. The spectrum with the closest match within that tolerance is returned (if any).
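
One possible way to implement a closest-match search within a tolerance on a sorted map (RT -> spectrum index) is sketched below. It is not taken from the OpenMS sources: findByRTSketch() is a hypothetical free function, the map and tolerance are passed in explicitly, and std::out_of_range stands in for Exception::ElementNotFound.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>

// Return the index of the spectrum whose RT is closest to rt, considering
// only entries within [rt - rt_tolerance, rt + rt_tolerance].
std::size_t findByRTSketch(const std::map<double, std::size_t>& rts,
                           double rt, double rt_tolerance)
{
  if (rt_tolerance < 0.0)
  {
    throw std::invalid_argument("rt_tolerance must not be negative");
  }
  // First entry inside the window and first entry past the window.
  const auto lower = rts.lower_bound(rt - rt_tolerance);
  const auto upper = rts.upper_bound(rt + rt_tolerance);
  if (lower == upper)
  {
    // Stand-in for Exception::ElementNotFound.
    throw std::out_of_range("no spectrum within the RT tolerance");
  }
  // Scan the window for the smallest absolute RT difference.
  auto best = lower;
  for (auto it = lower; it != upper; ++it)
  {
    if (std::fabs(it->first - rt) < std::fabs(best->first - rt))
    {
      best = it;
    }
  }
  return best->second;
}

// Usage sketch: three spectra at RT 10, 20 and 30.
inline void findByRTExample()
{
  const std::map<double, std::size_t> rts = {{10.0, 0}, {20.0, 1}, {30.0, 2}};
  assert(findByRTSketch(rts, 19.6, 0.5) == 1);
}
```

In this sketch a tie between two equally distant RT values resolves to the smaller RT, which is a choice of the example rather than documented behaviour.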

◆ findByScanNumber()

Size findByScanNumber ( Size  scan_number) const

Look up spectrum by scan number (extracted from the native ID).

Parameters
[in]  scan_number  Scan number to look up
Exceptions
Exception::ElementNotFound  if no matching spectrum was found
Returns
Index of the spectrum that matched

◆ getRegExFromNativeID()

static std::string getRegExFromNativeID ( const std::string &  native_id)
static

Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber.

Parameters
[in]  native_id  Native ID string to analyze
Returns
Regular expression string with named group
Deprecated:
Use SpectrumNativeIDParser::getRegExFromNativeID() instead for better discoverability.
See also
SpectrumNativeIDParser::getRegExFromNativeID()

◆ isNativeID()

static bool isNativeID ( const std::string &  id)
static

Simple prefix check if a spectrum identifier id is a nativeID from a vendor file.

Parameters
[in]  id  Spectrum identifier to check
Returns
True if the string matches a known native ID prefix pattern
Deprecated:
Use SpectrumNativeIDParser::isNativeID() instead for better discoverability.
See also
SpectrumNativeIDParser::isNativeID()

◆ operator=()

SpectrumLookup & operator= ( const SpectrumLookup & )
private

Assignment operator (not implemented).

◆ readSpectra()

template<typename SpectrumContainer >
void readSpectra ( const SpectrumContainer &  spectra,
const std::string &  scan_regexp = default_scan_regexp 
)
inline

Read and index spectra for later look-up.

Template Parameters
SpectrumContainer  Spectrum container class, must support size and operator[]
Parameters
[in]  spectra      Container of spectra
[in]  scan_regexp  Regular expression for matching scan numbers in spectrum native IDs (must contain the named group "?<SCAN>")
Exceptions
Exception::IllegalArgument  if scan_regexp does not contain "?<SCAN>" (and is not empty)

Spectra are indexed by retention time, native ID and scan number. In all cases it is expected that the value for each spectrum will be unique. Setting scan_regexp to the empty string ("") disables extraction of scan numbers; look-ups by scan number will fail in that case.
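
The indexing step described above can be pictured with the following stand-alone sketch. SpectrumStub, LookupIndexSketch and indexSpectra() are hypothetical names; the scan number is taken from the first capture group (std::regex has no named groups), and keeping the first entry while printing a warning for duplicate keys is an assumption of the example, not a statement about the OpenMS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Minimal stand-in for a spectrum: only the attributes that are indexed.
struct SpectrumStub
{
  double rt;
  std::string native_id;
};

// Hypothetical index holding the three documented mappings.
struct LookupIndexSketch
{
  std::map<double, std::size_t> rts;       // RT -> spectrum index
  std::map<std::string, std::size_t> ids;  // native ID -> spectrum index
  std::map<std::size_t, std::size_t> scans; // scan number -> spectrum index
};

// Build the index. An empty scan_regexp disables scan number extraction.
// Assumes the first capture group of scan_regexp matches digits only.
LookupIndexSketch indexSpectra(const std::vector<SpectrumStub>& spectra,
                               const std::string& scan_regexp)
{
  LookupIndexSketch index;
  const bool use_scans = !scan_regexp.empty();
  std::regex scan_re;
  if (use_scans) scan_re = std::regex(scan_regexp);

  for (std::size_t i = 0; i < spectra.size(); ++i)
  {
    const SpectrumStub& spectrum = spectra[i];
    // emplace() leaves an existing key untouched and reports it via .second.
    if (!index.rts.emplace(spectrum.rt, i).second)
    {
      std::cerr << "Warning: duplicate RT at spectrum " << i << '\n';
    }
    if (!index.ids.emplace(spectrum.native_id, i).second)
    {
      std::cerr << "Warning: duplicate native ID at spectrum " << i << '\n';
    }
    std::smatch match;
    if (use_scans && std::regex_search(spectrum.native_id, match, scan_re) &&
        match.size() > 1 && match[1].matched)
    {
      const std::size_t scan = std::stoul(match[1].str());
      if (!index.scans.emplace(scan, i).second)
      {
        std::cerr << "Warning: duplicate scan number at spectrum " << i << '\n';
      }
    }
  }
  return index;
}

// Usage sketch: with an empty scan_regexp no scan numbers are indexed.
inline void indexSpectraExample()
{
  const std::vector<SpectrumStub> spectra = {{10.0, "scan=1"}, {20.0, "scan=2"}};
  const LookupIndexSketch index = indexSpectra(spectra, "scan=(\\d+)");
  assert(index.scans.at(2) == 1);
  assert(indexSpectra(spectra, "").scans.empty());
}
```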

References SpectrumSettings::getNativeID(), MSSpectrum::getRT(), and OPENMS_LOG_WARN.

◆ setScanRegExp_()

void setScanRegExp_ ( const std::string &  scan_regexp)
protected

Set the regular expression for extracting scan numbers from spectrum native IDs.

Parameters
[in]  scan_regexp  Regular expression to use (must contain the named group "?<SCAN>")

Member Data Documentation

◆ default_scan_regexp

const std::string& default_scan_regexp
static

Default regular expression for extracting scan numbers from spectrum native IDs.

◆ ids_

std::map<std::string, Size> ids_
protected

Mapping: native ID -> spectrum index.

◆ n_spectra_

Size n_spectra_
protected

Number of spectra.

◆ reference_formats

std::vector<boost::regex> reference_formats

Possible formats of spectrum references, defined as regular expressions.

◆ regexp_name_list_

std::vector<std::string> regexp_name_list_
protected

Named groups in vector format.

◆ regexp_names_

const std::string& regexp_names_
static protected

Named groups recognized in regular expression.

◆ rt_tolerance

double rt_tolerance

Tolerance for look-up by retention time.

◆ rts_

std::map<double, Size> rts_
protected

Mapping: RT -> spectrum index.

◆ scan_regexp_

boost::regex scan_regexp_
protected

Regular expression to extract scan numbers.

◆ scans_

std::map<Size, Size> scans_
protected

Mapping: scan number -> spectrum index.