OpenMS
2.8.0
Representation of spectrum identification results and associated data. More...
#include <OpenMS/METADATA/ID/IdentificationData.h>
Classes | |
struct | ModifyMultiIndexAddProcessingStep |
Helper functor for adding processing steps to elements in a boost::multi_index_container structure. More... | |
struct | ModifyMultiIndexAddScore |
Helper functor for adding scores to elements in a boost::multi_index_container structure. More... | |
struct | ModifyMultiIndexRemoveParentMatches |
Helper functor for removing invalid parent matches from elements in a boost::multi_index_container structure. More... | |
struct | RefTranslator |
Structure that maps references of corresponding objects after copying. More... | |
Public Member Functions | |
IdentificationData () | |
Default constructor. More... | |
IdentificationData (const IdentificationData &other) | |
Copy constructor. More... | |
IdentificationData (IdentificationData &&other) noexcept | |
Move constructor. More... | |
InputFileRef | registerInputFile (const InputFile &file) |
Register an input file. More... | |
ProcessingSoftwareRef | registerProcessingSoftware (const ProcessingSoftware &software) |
Register data processing software. More... | |
SearchParamRef | registerDBSearchParam (const DBSearchParam &param) |
Register database search parameters. More... | |
ProcessingStepRef | registerProcessingStep (const ProcessingStep &step) |
Register a data processing step. More... | |
ProcessingStepRef | registerProcessingStep (const ProcessingStep &step, SearchParamRef search_ref) |
Register a database search step with associated parameters. More... | |
ScoreTypeRef | registerScoreType (const ScoreType &score) |
Register a score type. More... | |
ObservationRef | registerObservation (const Observation &obs) |
Register an observation (e.g. MS2 spectrum or feature) More... | |
ParentSequenceRef | registerParentSequence (const ParentSequence &parent) |
Register a parent sequence (e.g. protein or intact RNA) More... | |
void | registerParentGroupSet (const ParentGroupSet &groups) |
Register a grouping of parent sequences (e.g. protein inference result) More... | |
IdentifiedPeptideRef | registerIdentifiedPeptide (const IdentifiedPeptide &peptide) |
Register an identified peptide. More... | |
IdentifiedCompoundRef | registerIdentifiedCompound (const IdentifiedCompound &compound) |
Register an identified compound (small molecule) More... | |
IdentifiedOligoRef | registerIdentifiedOligo (const IdentifiedOligo &oligo) |
Register an identified RNA oligonucleotide. More... | |
AdductRef | registerAdduct (const AdductInfo &adduct) |
Register an adduct. More... | |
ObservationMatchRef | registerObservationMatch (const ObservationMatch &match) |
Register an observation match (e.g. peptide-spectrum match) More... | |
MatchGroupRef | registerObservationMatchGroup (const ObservationMatchGroup &group) |
Register a group of observation matches that belong together. More... | |
const InputFiles & | getInputFiles () const |
Return the registered input files (immutable) More... | |
const ProcessingSoftwares & | getProcessingSoftwares () const |
Return the registered data processing software (immutable) More... | |
const ProcessingSteps & | getProcessingSteps () const |
Return the registered data processing steps (immutable) More... | |
const DBSearchParams & | getDBSearchParams () const |
Return the registered database search parameters (immutable) More... | |
const DBSearchSteps & | getDBSearchSteps () const |
Return the registered database search steps (immutable) More... | |
const ScoreTypes & | getScoreTypes () const |
Return the registered score types (immutable) More... | |
const Observations & | getObservations () const |
Return the registered observations (immutable) More... | |
const ParentSequences & | getParentSequences () const |
Return the registered parent sequences (immutable) More... | |
const ParentGroupSets & | getParentGroupSets () const |
Return the registered parent sequence groupings (immutable) More... | |
const IdentifiedPeptides & | getIdentifiedPeptides () const |
Return the registered identified peptides (immutable) More... | |
const IdentifiedCompounds & | getIdentifiedCompounds () const |
Return the registered compounds (immutable) More... | |
const IdentifiedOligos & | getIdentifiedOligos () const |
Return the registered identified oligonucleotides (immutable) More... | |
const Adducts & | getAdducts () const |
Return the registered adducts (immutable) More... | |
const ObservationMatches & | getObservationMatches () const |
Return the registered observation matches (immutable) More... | |
const ObservationMatchGroups & | getObservationMatchGroups () const |
Return the registered groups of observation matches (immutable) More... | |
void | addScore (ObservationMatchRef match_ref, ScoreTypeRef score_ref, double value) |
Add a score to an observation match (e.g. PSM) More... | |
void | setCurrentProcessingStep (ProcessingStepRef step_ref) |
Set a data processing step that will apply to all subsequent "register..." calls. More... | |
ProcessingStepRef | getCurrentProcessingStep () |
Return the current processing step (set via setCurrentProcessingStep()). More... | |
void | clearCurrentProcessingStep () |
Cancel the effect of setCurrentProcessingStep(). More... | |
std::vector< ObservationMatchRef > | getBestMatchPerObservation (ScoreTypeRef score_ref, bool require_score=false) const |
Return the best match for each observation, according to a given score type. More... | |
std::pair< ObservationMatchRef, ObservationMatchRef > | getMatchesForObservation (ObservationRef obs_ref) const |
Get range of matches (cf. equal_range ) for a given observation. More... | |
ScoreTypeRef | findScoreType (const String &score_name) const |
Look up a score type by name. More... | |
void | calculateCoverages (bool check_molecule_length=false) |
Calculate sequence coverages of parent sequences. More... | |
void | cleanup (bool require_observation_match=true, bool require_identified_sequence=true, bool require_parent_match=true, bool require_parent_group=false, bool require_match_group=false) |
Clean up the data structure after filtering parts of it. More... | |
bool | empty () const |
Return whether the data structure is empty (no data) More... | |
RefTranslator | merge (const IdentificationData &other) |
Merge in data from another instance. More... | |
void | swap (IdentificationData &other) |
Swap contents with a second instance. More... | |
void | clear () |
Clear all contents. More... | |
template<class ScoredProcessingResults > | |
ScoreTypeRef | pickScoreType (const ScoredProcessingResults &container, bool all_elements=false, bool any_score=false) const |
void | setMetaValue (const ObservationMatchRef ref, const String &key, const DataValue &value) |
Set a meta value on a stored observation match. More... | |
void | setMetaValue (const ObservationRef ref, const String &key, const DataValue &value) |
Set a meta value on a stored observation. More... | |
void | setMetaValue (const IdentifiedMolecule &var, const String &key, const DataValue &value) |
Set a meta value on a stored identified molecule (variant) More... | |
void | setMetaValue (const String &name, const DataValue &value) |
Sets the DataValue corresponding to a name. More... | |
void | setMetaValue (UInt index, const DataValue &value) |
Sets the DataValue corresponding to an index. More... | |
Public Member Functions inherited from MetaInfoInterface | |
MetaInfoInterface () | |
Constructor. More... | |
MetaInfoInterface (const MetaInfoInterface &rhs) | |
Copy constructor. More... | |
MetaInfoInterface (MetaInfoInterface &&) noexcept | |
Move constructor. More... | |
~MetaInfoInterface () | |
Destructor. More... | |
MetaInfoInterface & | operator= (const MetaInfoInterface &rhs) |
Assignment operator. More... | |
MetaInfoInterface & | operator= (MetaInfoInterface &&) noexcept |
Move assignment operator. More... | |
void | swap (MetaInfoInterface &rhs) |
Swap contents. More... | |
bool | operator== (const MetaInfoInterface &rhs) const |
Equality operator. More... | |
bool | operator!= (const MetaInfoInterface &rhs) const |
Inequality operator. More... | |
const DataValue & | getMetaValue (const String &name, const DataValue &default_value=DataValue::EMPTY) const |
Returns the value corresponding to a string, or a default value (default: DataValue::EMPTY) if not found. More... | |
const DataValue & | getMetaValue (UInt index, const DataValue &default_value=DataValue::EMPTY) const |
Returns the value corresponding to an index, or a default value (default: DataValue::EMPTY) if not found. More... | |
bool | metaValueExists (const String &name) const |
Returns whether an entry with the given name exists. More... | |
bool | metaValueExists (UInt index) const |
Returns whether an entry with the given index exists. More... | |
void | setMetaValue (const String &name, const DataValue &value) |
Sets the DataValue corresponding to a name. More... | |
void | setMetaValue (UInt index, const DataValue &value) |
Sets the DataValue corresponding to an index. More... | |
void | removeMetaValue (const String &name) |
Removes the DataValue corresponding to name if it exists. More... | |
void | removeMetaValue (UInt index) |
Removes the DataValue corresponding to index if it exists. More... | |
void | addMetaValues (const MetaInfoInterface &from) |
Copies all meta values from another object to this one. More... | |
void | getKeys (std::vector< String > &keys) const |
Fills the given vector with a list of all keys for which a value is set. More... | |
void | getKeys (std::vector< UInt > &keys) const |
Fills the given vector with a list of all keys for which a value is set. More... | |
bool | isMetaEmpty () const |
Returns if the MetaInfo is empty. More... | |
void | clearMetaInfo () |
Removes all meta values. More... | |
Protected Member Functions | |
void | checkScoreTypes_ (const std::map< ScoreTypeRef, double > &scores) const |
Helper function to check if all score types are valid. More... | |
void | checkAppliedProcessingSteps_ (const AppliedProcessingSteps &steps_and_scores) const |
Helper function to check if all applied processing steps are valid. More... | |
void | checkParentMatches_ (const ParentMatches &matches, MoleculeType expected_type) const |
Helper function to check if all parent matches are valid. More... | |
void | mergeScoredProcessingResults_ (ScoredProcessingResult &result, const ScoredProcessingResult &other, const RefTranslator &trans) |
Helper function to merge scored processing results while updating references (to processing steps and score types) More... | |
template<typename ContainerType , typename ElementType > | |
ContainerType::iterator | insertIntoMultiIndex_ (ContainerType &container, const ElementType &element) |
Helper function for adding entries (derived from ScoredProcessingResult) to a boost::multi_index_container structure. More... | |
template<typename ContainerType , typename ElementType > | |
ContainerType::iterator | insertIntoMultiIndex_ (ContainerType &container, const ElementType &element, AddressLookup &lookup) |
Variant of insertIntoMultiIndex_() that also updates a look-up table of valid references (addresses) More... | |
template<typename RefType , typename ContainerType > | |
void | setMetaValue_ (const RefType ref, const String &key, const DataValue &value, ContainerType &container, const AddressLookup &lookup=AddressLookup()) |
Helper function to add a meta value to an element in a multi-index container. More... | |
Protected Member Functions inherited from MetaInfoInterface | |
void | createIfNotExists_ () |
Creates the MetaInfo object if it does not exist. More... | |
Static Protected Member Functions | |
template<typename RefType , typename ContainerType > | |
static bool | isValidReference_ (RefType ref, ContainerType &container) |
Check whether a reference points to an element in a container. More... | |
template<typename RefType > | |
static bool | isValidHashedReference_ (RefType ref, const AddressLookup &lookup) |
Check validity of a reference based on a look-up table of addresses. More... | |
template<typename ContainerType , typename PredicateType > | |
static void | removeFromSetIf_ (ContainerType &container, PredicateType predicate) |
Remove elements from a set (or ordered multi_index_container) if they fulfill a predicate. More... | |
template<typename ContainerType > | |
static void | removeFromSetIfNotHashed_ (ContainerType &container, const AddressLookup &lookup) |
Remove elements from a set (or ordered multi_index_container) if they don't occur in a look-up table. More... | |
template<typename ContainerType > | |
static void | updateAddressLookup_ (const ContainerType &container, AddressLookup &lookup) |
Recreate the address look-up table for a container. More... | |
Friends | |
class | IDFilter |
class | MapAlignmentTransformer |
Additional Inherited Members | |
Static Public Member Functions inherited from MetaInfoInterface | |
static MetaInfoRegistry & | metaRegistry () |
Returns a reference to the MetaInfoRegistry. More... | |
Representation of spectrum identification results and associated data.
This class provides capabilities for storing spectrum identification results from different types of experiments/molecules (proteomics: peptides/proteins, metabolomics: small molecules, "nucleomics": RNA).
The class design has the following goals:
The following important subordinate classes are provided to represent different types of data:
Class | Represents | Key | Proteomics example | Corresponding legacy class |
---|---|---|---|---|
ProcessingStep | Information about a data processing step that was applied (e.g. input files, software used, parameters) | Combined information | Mascot search | ProteinIdentification |
Observation | A search query (with identifier, RT, m/z) from an input file, i.e. an MS2 spectrum or feature (for accurate mass search) | File/Identifier | MS2 spectrum | PeptideIdentification |
ParentSequence | An entry in a FASTA file with associated information (sequence, coverage, etc.) | Accession | Protein | ProteinHit |
IdentifiedPeptide/-Oligo/-Compound | An identified molecule of the respective type | Sequence (or identifier for a compound) | Peptide | PeptideHit |
ObservationMatch | A match between a query (Observation), identified molecule (Identified...), and optionally adduct | Combination of query/molecule/adduct references | Peptide-spectrum match (PSM) | PeptideIdentification/PeptideHit |
To populate an IdentificationData instance with data, "register..." functions are used. These functions return "references" (implemented as iterators) that can be used to refer to stored data items and thus form connections. For example, a protein can be stored using registerParentSequence, which returns a corresponding reference. This reference can be used to build an IdentifiedPeptide object that references the protein. An identified peptide referencing a protein can only be registered if that protein has been registered already, to ensure data consistency. Given the identified peptide, information about the associated protein can be retrieved efficiently by simply dereferencing the reference.
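The registration pattern described above can be illustrated with a minimal, self-contained sketch. It does not use OpenMS itself: `MiniIdData`, the simplified `ParentSequence` and `IdentifiedPeptide` structs, and `endOfParents()` are hypothetical stand-ins, and a plain `std::set` replaces the multi-index containers of the real class. The point is only to show references implemented as iterators and the consistency check on registration.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the OpenMS data classes.
struct ParentSequence
{
  std::string accession; // key
  std::string sequence;
  bool operator<(const ParentSequence& rhs) const { return accession < rhs.accession; }
};
using ParentSequences = std::set<ParentSequence>;
using ParentSequenceRef = ParentSequences::const_iterator; // a "reference" is an iterator

struct IdentifiedPeptide
{
  std::string sequence; // key
  std::vector<ParentSequenceRef> parents; // references to registered proteins
  bool operator<(const IdentifiedPeptide& rhs) const { return sequence < rhs.sequence; }
};
using IdentifiedPeptides = std::set<IdentifiedPeptide>;
using IdentifiedPeptideRef = IdentifiedPeptides::const_iterator;

class MiniIdData
{
public:
  // Store a parent sequence and hand back a reference to the stored copy.
  ParentSequenceRef registerParentSequence(const ParentSequence& parent)
  {
    return parents_.insert(parent).first;
  }

  // Only accept peptides whose parent references point into this container.
  IdentifiedPeptideRef registerIdentifiedPeptide(const IdentifiedPeptide& peptide)
  {
    for (ParentSequenceRef ref : peptide.parents)
    {
      if (!isValidReference(ref))
      {
        throw std::invalid_argument("peptide refers to an unregistered parent sequence");
      }
    }
    return peptides_.insert(peptide).first;
  }

  // Past-the-end iterator, used here as an example of an invalid reference.
  ParentSequenceRef endOfParents() const { return parents_.end(); }

private:
  // Linear scan: a reference is valid if it equals an iterator of the container.
  bool isValidReference(ParentSequenceRef ref) const
  {
    for (auto it = parents_.begin(); it != parents_.end(); ++it)
    {
      if (it == ref) return true;
    }
    return false;
  }

  ParentSequences parents_;
  IdentifiedPeptides peptides_;
};
```

In this sketch, following a peptide to its protein is a plain dereference, e.g. `pep_ref->parents.front()->accession`.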
To ensure non-redundancy, many data types have a "key" (see table above) to which a uniqueness constraint applies. This means only one item of such a type with a given key can be stored in an IdentificationData object. If items with an existing key are registered subsequently, attempts are made to merge new information (e.g. additional scores) into the existing entry. The details of this merging are handled in the merge function in each data class.
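As a rough illustration of the key/merge behaviour, the following standalone sketch stores items in a `std::map` keyed by sequence and merges scores when a key is registered again. `PeptideStore`, `registerPeptide` and the simplified `IdentifiedPeptide` are hypothetical names, and the policy that a newly registered score value overwrites an existing one of the same name is an assumption of the sketch, not a statement about OpenMS.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified data class with a key and a merge function.
struct IdentifiedPeptide
{
  std::string sequence;                 // key (uniqueness constraint)
  std::map<std::string, double> scores; // score name -> value

  // Merge information from another entry with the same key.
  // Assumption: on a name clash, the newer value wins.
  void merge(const IdentifiedPeptide& other)
  {
    for (const auto& score : other.scores)
    {
      scores[score.first] = score.second;
    }
  }
};

class PeptideStore
{
public:
  // Insert a new entry, or merge into the existing entry with the same key.
  const IdentifiedPeptide& registerPeptide(const IdentifiedPeptide& peptide)
  {
    auto result = store_.emplace(peptide.sequence, peptide);
    if (!result.second) // key already present
    {
      result.first->second.merge(peptide);
    }
    return result.first->second;
  }

  std::size_t size() const { return store_.size(); }

private:
  std::map<std::string, IdentifiedPeptide> store_;
};
```

Registering `PEPTIDE` twice with different scores leaves a single entry that carries both scores.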
using AddressLookup = std::unordered_set<uintptr_t> |
IdentificationData | ( | ) |
inline |
Default constructor.
IdentificationData | ( | const IdentificationData & | other | ) |
Copy constructor.
Copy-constructing is expensive due to the necessary "rewiring" of references. Use the move constructor where possible.
IdentificationData | ( | IdentificationData && | other | ) |
inline noexcept |
Move constructor.
void addScore | ( | ObservationMatchRef | match_ref, |
ScoreTypeRef | score_ref, | ||
double | value | ||
) |
Add a score to an observation match (e.g. PSM)
void calculateCoverages | ( | bool | check_molecule_length = false | ) |
Calculate sequence coverages of parent sequences.
Referenced by NucleicAcidSearchEngine::main_().
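void checkAppliedProcessingSteps_ | ( | const AppliedProcessingSteps & | steps_and_scores | ) | const |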
protected |
Helper function to check if all applied processing steps are valid.
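void checkParentMatches_ | ( | const ParentMatches & | matches, |
MoleculeType | expected_type | ||
) | const |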
protected |
Helper function to check if all parent matches are valid.
void checkScoreTypes_ | ( | const std::map< ScoreTypeRef, double > & | scores | ) | const |
protected |
Helper function to check if all score types are valid.
void cleanup | ( | bool | require_observation_match = true, |
bool | require_identified_sequence = true, | ||
bool | require_parent_match = true, | ||
bool | require_parent_group = false, | ||
bool | require_match_group = false | ||
) |
Clean up the data structure after filtering parts of it.
Make sure there are no invalid references or "orphan" data entries.
require_observation_match | Remove identified molecules, observations and adducts that aren't part of observation matches? |
require_identified_sequence | Remove parent sequences (proteins/RNAs) that aren't referenced by identified peptides/oligonucleotides? |
require_parent_match | Remove identified peptides/oligonucleotides that don't reference a parent sequence (protein/RNA)? |
require_parent_group | Remove parent sequences that aren't part of parent sequence groups? |
require_match_group | Remove observation matches that aren't part of match groups? |
Referenced by IDFilter::filterObservationMatchesByFunctor(), and NucleicAcidSearchEngine::postProcessHits_().
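The effect of two of the flags can be sketched on a toy data model. This is not the OpenMS implementation: `MiniData`, `Peptide` and the free function `cleanup` are hypothetical, parent sequences are identified by accession strings instead of iterators, and the order of the steps is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: peptides refer to parents by accession.
struct Peptide
{
  std::string sequence;
  std::vector<std::string> parent_accessions;
};

struct MiniData
{
  std::set<std::string> parents; // registered parent accessions
  std::vector<Peptide> peptides;
};

void cleanup(MiniData& data,
             bool require_identified_sequence = true,
             bool require_parent_match = true)
{
  // 1. Drop references to parents that are no longer stored (invalid references).
  for (Peptide& pep : data.peptides)
  {
    auto& acc = pep.parent_accessions;
    acc.erase(std::remove_if(acc.begin(), acc.end(),
                             [&data](const std::string& a) { return data.parents.count(a) == 0; }),
              acc.end());
  }
  // 2. Optionally remove peptides that do not reference any parent.
  if (require_parent_match)
  {
    auto& peps = data.peptides;
    peps.erase(std::remove_if(peps.begin(), peps.end(),
                              [](const Peptide& p) { return p.parent_accessions.empty(); }),
               peps.end());
  }
  // 3. Optionally remove parents that no remaining peptide references ("orphans").
  if (require_identified_sequence)
  {
    std::set<std::string> used;
    for (const Peptide& pep : data.peptides)
    {
      used.insert(pep.parent_accessions.begin(), pep.parent_accessions.end());
    }
    data.parents = used; // every accession in "used" is a stored parent after step 1
  }
}
```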
void clear | ( | ) |
Clear all contents.
void clearCurrentProcessingStep | ( | ) |
Cancel the effect of setCurrentProcessingStep().
bool empty | ( | ) | const |
Return whether the data structure is empty (no data)
ScoreTypeRef findScoreType | ( | const String & | score_name | ) | const |
Look up a score type by name.
Returns a reference to the score type if found; otherwise getScoreTypes().end().
Referenced by NucleicAcidSearchEngine::calculateAndFilterFDR_().
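Because a failed lookup is signalled by a past-the-end iterator, callers need to compare the result against `end()` before dereferencing it. A minimal standalone sketch of that pattern follows; the simplified `ScoreType` struct and the free function `findScoreType` taking the container as an argument are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified score type keyed by name.
struct ScoreType
{
  std::string name;
  bool higher_better;
  bool operator<(const ScoreType& rhs) const { return name < rhs.name; }
};
using ScoreTypes = std::set<ScoreType>;
using ScoreTypeRef = ScoreTypes::const_iterator;

// Look up a score type by name; returns types.end() if there is no match.
ScoreTypeRef findScoreType(const ScoreTypes& types, const std::string& score_name)
{
  return std::find_if(types.begin(), types.end(),
                      [&score_name](const ScoreType& t) { return t.name == score_name; });
}

// Safe use: check against end() before dereferencing.
bool isHigherBetter(const ScoreTypes& types, const std::string& score_name, bool fallback)
{
  ScoreTypeRef ref = findScoreType(types, score_name);
  return (ref == types.end()) ? fallback : ref->higher_better;
}
```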
const Adducts & getAdducts | ( | ) | const |
inline |
Return the registered adducts (immutable)
std::vector<ObservationMatchRef> getBestMatchPerObservation | ( | ScoreTypeRef | score_ref, |
bool | require_score = false | ||
) | const |
Return the best match for each observation, according to a given score type.
score_ref | Score type to use |
require_score | Exclude matches without score of this type, even if they are the only matches for their observations? |
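The role of `require_score` can be sketched with a small standalone function. The `Match` struct and `bestMatchPerObservation` are hypothetical simplifications: observations and molecules are plain strings, a single score is stored per match, and higher scores are assumed to be better.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified observation match with at most one score.
struct Match
{
  std::string observation; // e.g. spectrum identifier
  std::string molecule;    // e.g. peptide sequence
  bool has_score;
  double score;            // only meaningful if has_score is true
};

// Return the best match per observation (assumption: higher score is better).
std::vector<Match> bestMatchPerObservation(const std::vector<Match>& matches,
                                           bool require_score = false)
{
  std::map<std::string, Match> best; // observation -> best match so far
  for (const Match& m : matches)
  {
    if (require_score && !m.has_score) continue; // exclude unscored matches
    auto it = best.find(m.observation);
    if (it == best.end())
    {
      best.emplace(m.observation, m);
    }
    else if (m.has_score && (!it->second.has_score || m.score > it->second.score))
    {
      it->second = m; // scored beats unscored, higher beats lower
    }
  }
  std::vector<Match> result;
  for (const auto& entry : best) result.push_back(entry.second);
  return result;
}
```

With `require_score = false`, an observation whose only match lacks the score still appears in the result; with `require_score = true` it is dropped.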
ProcessingStepRef getCurrentProcessingStep | ( | ) |
Return the current processing step (set via setCurrentProcessingStep()).
If no current processing step has been set, processing_steps.end() is returned.
Referenced by NucleicAcidSearchEngine::main_(), and NucleicAcidSearchEngine::postProcessHits_().
const DBSearchParams & getDBSearchParams | ( | ) | const |
inline |
Return the registered database search parameters (immutable)
Referenced by NucleicAcidSearchEngine::main_().
const DBSearchSteps & getDBSearchSteps | ( | ) | const |
inline |
Return the registered database search steps (immutable)
const IdentifiedCompounds & getIdentifiedCompounds | ( | ) | const |
inline |
Return the registered compounds (immutable)
const IdentifiedOligos & getIdentifiedOligos | ( | ) | const |
inline |
Return the registered identified oligonucleotides (immutable)
Referenced by NucleicAcidSearchEngine::main_().
const IdentifiedPeptides & getIdentifiedPeptides | ( | ) | const |
inline |
Return the registered identified peptides (immutable)
const InputFiles & getInputFiles | ( | ) | const |
inline |
Return the registered input files (immutable)
Referenced by NucleicAcidSearchEngine::postProcessHits_().
std::pair<ObservationMatchRef, ObservationMatchRef> getMatchesForObservation | ( | ObservationRef | obs_ref | ) | const |
Get range of matches (cf. equal_range) for a given observation.
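The returned pair follows the usual `equal_range` convention: it is a half-open iterator range, and an empty range (`first == second`) means the observation has no matches. A standalone sketch using `std::multimap` in place of the real container; the type aliases and both free functions are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in: observation identifier -> matched molecule.
using Matches = std::multimap<std::string, std::string>;
using MatchRef = Matches::const_iterator;

// Half-open range [first, second) of all matches for one observation.
std::pair<MatchRef, MatchRef> getMatchesForObservation(const Matches& matches,
                                                       const std::string& observation)
{
  return matches.equal_range(observation);
}

// Example consumer: count the matches in the range.
std::size_t countMatches(const Matches& matches, const std::string& observation)
{
  std::pair<MatchRef, MatchRef> range = getMatchesForObservation(matches, observation);
  return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```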
const ObservationMatches & getObservationMatches | ( | ) | const |
inline |
Return the registered observation matches (immutable)
Referenced by NucleicAcidSearchEngine::calculateAndFilterFDR_(), and NucleicAcidSearchEngine::generateLFQInput_().
const ObservationMatchGroups & getObservationMatchGroups | ( | ) | const |
inline |
Return the registered groups of observation matches (immutable)
const Observations & getObservations | ( | ) | const |
inline |
Return the registered observations (immutable)
Referenced by NucleicAcidSearchEngine::calculateAndFilterFDR_(), and NucleicAcidSearchEngine::main_().
const ParentGroupSets & getParentGroupSets | ( | ) | const |
inline |
Return the registered parent sequence groupings (immutable)
const ParentSequences & getParentSequences | ( | ) | const |
inline |
Return the registered parent sequences (immutable)
Referenced by NucleicAcidSearchEngine::main_().
const ProcessingSoftwares & getProcessingSoftwares | ( | ) | const |
inline |
Return the registered data processing software (immutable)
const ProcessingSteps & getProcessingSteps | ( | ) | const |
inline |
Return the registered data processing steps (immutable)
const ScoreTypes & getScoreTypes | ( | ) | const |
inline |
Return the registered score types (immutable)
Referenced by NucleicAcidSearchEngine::postProcessHits_().
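template<typename ContainerType , typename ElementType >
ContainerType::iterator insertIntoMultiIndex_ | ( | ContainerType & | container, |
const ElementType & | element | ||
) |
inline protected |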
Helper function for adding entries (derived from ScoredProcessingResult) to a boost::multi_index_container structure.
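template<typename ContainerType , typename ElementType >
ContainerType::iterator insertIntoMultiIndex_ | ( | ContainerType & | container, |
const ElementType & | element, | ||
AddressLookup & | lookup | ||
) |
inline protected |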
Variant of insertIntoMultiIndex_() that also updates a look-up table of valid references (addresses)
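template<typename RefType >
static bool isValidHashedReference_ | ( | RefType | ref, |
const AddressLookup & | lookup | ||
) |
inline static protected |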
Check validity of a reference based on a look-up table of addresses.
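template<typename RefType , typename ContainerType >
static bool isValidReference_ | ( | RefType | ref, |
ContainerType & | container | ||
) |
inline static protected |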
Check whether a reference points to an element in a container.
RefTranslator merge | ( | const IdentificationData & | other | ) |
Merge in data from another instance.
Can be used to make a deep copy by calling merge() on an empty object. The returned translation table allows updating of references that are held externally.
other | Instance to merge in. |
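The idea of a translation table can be shown on a single container. In this standalone sketch, `Parents`, `ParentRef`, the simplified `RefTranslator` (a map from element addresses in `other` to iterators in the merged container) and the free function `merge` are all hypothetical; the real RefTranslator covers every reference type of the class.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins: a container of parent accessions and its reference type.
using Parents = std::set<std::string>;
using ParentRef = Parents::const_iterator;

// Maps the address of an element in "other" to the corresponding reference
// in the merged container.
using RefTranslator = std::map<const std::string*, ParentRef>;

// Merge "other" into "target" and record where each element ended up.
RefTranslator merge(Parents& target, const Parents& other)
{
  RefTranslator trans;
  for (ParentRef it = other.begin(); it != other.end(); ++it)
  {
    // insert() returns the existing element if the key is already present
    ParentRef merged = target.insert(*it).first;
    trans[&(*it)] = merged;
  }
  return trans;
}
```

A reference held into `other` can then be updated by looking up the address of the element it points to, e.g. `trans.at(&(*old_ref))`.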
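void mergeScoredProcessingResults_ | ( | ScoredProcessingResult & | result, |
const ScoredProcessingResult & | other, | ||
const RefTranslator & | trans | ||
) |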
protected |
Helper function to merge scored processing results while updating references (to processing steps and score types)
result | Instance that gets updated |
other | Instance to merge into result |
trans | Mapping of corresponding references between other and result |
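template<class ScoredProcessingResults >
ScoreTypeRef pickScoreType | ( | const ScoredProcessingResults & | container, |
bool | all_elements = false, | ||
bool | any_score = false | ||
) | const |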
inline |
Pick a score type for operations (e.g. filtering) on a container of scored processing results (e.g. input matches, identified peptides, ...).
If all_elements is false, only the first element with a score will be considered (which is sufficient if all elements were processed in the same way). If all_elements is true, the score type supported by the highest number of elements will be chosen.
If any_score is false, only the primary score from the most recent processing step (that assigned a score) is taken into account. If any_score is true, all score types assigned across all elements are considered (this implies all_elements = true).
container | Container with elements derived from ScoredProcessingResult |
all_elements | Consider all elements? |
any_score | Consider any score (or just primary/most recent ones)? |
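The interplay of the two flags can be sketched as follows. This is a simplified standalone model, not the OpenMS code: each element is just a list of score type names with the primary score of the most recent scoring step first, the result is a name rather than a `ScoreTypeRef`, an empty string stands in for `getScoreTypes().end()`, and ties are broken arbitrarily (alphabetically).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical element: names of its score types, primary/most recent first.
using Element = std::vector<std::string>;

std::string pickScoreType(const std::vector<Element>& container,
                          bool all_elements = false, bool any_score = false)
{
  if (any_score) all_elements = true; // any_score implies all_elements
  std::map<std::string, int> counts;  // score type -> number of elements supporting it
  for (const Element& element : container)
  {
    if (element.empty()) continue;                // element without scores
    if (!all_elements) return element.front();    // first scored element decides
    if (any_score)
    {
      for (const std::string& name : element) ++counts[name];
    }
    else
    {
      ++counts[element.front()];                  // primary score only
    }
  }
  // Choose the score type supported by the highest number of elements.
  std::string best; // stays empty if there were no scores
  int best_count = 0;
  for (const auto& entry : counts)
  {
    if (entry.second > best_count)
    {
      best = entry.first;
      best_count = entry.second;
    }
  }
  return best;
}
```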
Returns a reference to the chosen score type (or getScoreTypes().end() if there were no scores).
AdductRef registerAdduct | ( | const AdductInfo & | adduct | ) |
Register an adduct.
Referenced by NucleicAcidSearchEngine::main_().
SearchParamRef registerDBSearchParam | ( | const DBSearchParam & | param | ) |
Register database search parameters.
Referenced by NucleicAcidSearchEngine::main_().
IdentifiedCompoundRef registerIdentifiedCompound | ( | const IdentifiedCompound & | compound | ) |
Register an identified compound (small molecule)
IdentifiedOligoRef registerIdentifiedOligo | ( | const IdentifiedOligo & | oligo | ) |
Register an identified RNA oligonucleotide.
Referenced by NucleicAcidSearchEngine::postProcessHits_().
IdentifiedPeptideRef registerIdentifiedPeptide | ( | const IdentifiedPeptide & | peptide | ) |
Register an identified peptide.
InputFileRef registerInputFile | ( | const InputFile & | file | ) |
Register an input file.
Referenced by NucleicAcidSearchEngine::main_().
ObservationRef registerObservation | ( | const Observation & | obs | ) |
Register an observation (e.g. MS2 spectrum or feature)
Referenced by NucleicAcidSearchEngine::postProcessHits_().
ObservationMatchRef registerObservationMatch | ( | const ObservationMatch & | match | ) |
Register an observation match (e.g. peptide-spectrum match)
Referenced by NucleicAcidSearchEngine::postProcessHits_().
MatchGroupRef registerObservationMatchGroup | ( | const ObservationMatchGroup & | group | ) |
Register a group of observation matches that belong together.
void registerParentGroupSet | ( | const ParentGroupSet & | groups | ) |
Register a grouping of parent sequences (e.g. protein inference result)
ParentSequenceRef registerParentSequence | ( | const ParentSequence & | parent | ) |
Register a parent sequence (e.g. protein or intact RNA)
Referenced by NucleicAcidSearchEngine::main_().
ProcessingSoftwareRef registerProcessingSoftware | ( | const ProcessingSoftware & | software | ) |
Register data processing software.
Referenced by NucleicAcidSearchEngine::main_().
ProcessingStepRef registerProcessingStep | ( | const ProcessingStep & | step | ) |
Register a data processing step.
Referenced by NucleicAcidSearchEngine::main_().
ProcessingStepRef registerProcessingStep | ( | const ProcessingStep & | step, |
SearchParamRef | search_ref | ||
) |
Register a database search step with associated parameters.
ScoreTypeRef registerScoreType | ( | const ScoreType & | score | ) |
Register a score type.
Referenced by NucleicAcidSearchEngine::main_().
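template<typename ContainerType , typename PredicateType >
static void removeFromSetIf_ | ( | ContainerType & | container, |
PredicateType | predicate | ||
) |
inline static protected |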
Remove elements from a set (or ordered multi_index_container) if they fulfill a predicate.
Referenced by IDFilter::filterObservationMatchesByFunctor().
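template<typename ContainerType >
static void removeFromSetIfNotHashed_ | ( | ContainerType & | container, |
const AddressLookup & | lookup | ||
) |
inline static protected |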
Remove elements from a set (or ordered multi_index_container) if they don't occur in a look-up table.
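How the two removal helpers and the address look-up table fit together can be sketched with standard containers. The functions below are free-standing approximations written for this example; in particular, passing an iterator (rather than an element) to the predicate is an assumption of the sketch. Only the `AddressLookup` alias is taken from this page.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_set>

// As documented above: a set of element addresses.
using AddressLookup = std::unordered_set<std::uintptr_t>;

// Erase every element for which predicate(iterator) is true.
// Node-based containers keep other iterators valid during erase().
template <typename ContainerType, typename PredicateType>
void removeFromSetIf(ContainerType& container, PredicateType predicate)
{
  for (auto it = container.begin(); it != container.end();)
  {
    if (predicate(it))
    {
      it = container.erase(it);
    }
    else
    {
      ++it;
    }
  }
}

// Erase every element whose address is missing from the look-up table.
template <typename ContainerType>
void removeFromSetIfNotHashed(ContainerType& container, const AddressLookup& lookup)
{
  removeFromSetIf(container, [&lookup](typename ContainerType::iterator it) {
    return lookup.count(reinterpret_cast<std::uintptr_t>(&(*it))) == 0;
  });
}
```

A caller would first collect the addresses of all elements that are still referenced, then prune the container against that table.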
void setCurrentProcessingStep | ( | ProcessingStepRef | step_ref | ) |
Set a data processing step that will apply to all subsequent "register..." calls.
This step will be appended to the list of processing steps for all relevant elements that are registered subsequently (unless it is already the last entry in the list). If a score type without a software reference is registered, the software reference of this processing step will be applied. Effective until clearCurrentProcessingStep() is called.
Referenced by NucleicAcidSearchEngine::main_().
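The "append unless already last" rule can be illustrated with a minimal standalone sketch. `MiniIdData` and the simplified `Observation` are hypothetical, and processing steps are represented by integer ids instead of `ProcessingStepRef` iterators.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified element carrying its list of processing steps.
struct Observation
{
  std::string id;
  std::vector<int> steps; // ids of applied processing steps, oldest first
};

class MiniIdData
{
public:
  void setCurrentProcessingStep(int step)
  {
    current_step_ = step;
    has_current_step_ = true;
  }

  void clearCurrentProcessingStep() { has_current_step_ = false; }

  // Registers a copy of the observation; while a current step is set, it is
  // appended to the step list unless it is already the last entry.
  Observation registerObservation(Observation obs)
  {
    if (has_current_step_ && (obs.steps.empty() || obs.steps.back() != current_step_))
    {
      obs.steps.push_back(current_step_);
    }
    observations_.push_back(obs);
    return obs;
  }

private:
  bool has_current_step_ = false;
  int current_step_ = 0;
  std::vector<Observation> observations_;
};
```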
void setMetaValue | ( | const IdentifiedMolecule & | var, |
const String & | key, | ||
const DataValue & | value | ||
) |
Set a meta value on a stored identified molecule (variant)
void setMetaValue | ( | const ObservationMatchRef | ref, |
const String & | key, | ||
const DataValue & | value | ||
) |
Set a meta value on a stored observation match.
void setMetaValue | ( | const ObservationRef | ref, |
const String & | key, | ||
const DataValue & | value | ||
) |
Set a meta value on a stored observation.
void setMetaValue | ( | const String & | name, |
const DataValue & | value | ||
) |
Sets the DataValue corresponding to a name.
void setMetaValue | ( | UInt | index, |
const DataValue & | value | ||
) |
Sets the DataValue corresponding to an index.
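template<typename RefType , typename ContainerType >
void setMetaValue_ | ( | const RefType | ref, |
const String & | key, | ||
const DataValue & | value, | ||
ContainerType & | container, | ||
const AddressLookup & | lookup = AddressLookup() | ||
) |
inline protected |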
Helper function to add a meta value to an element in a multi-index container.
void swap | ( | IdentificationData & | other | ) |
Swap contents with a second instance.
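template<typename ContainerType >
static void updateAddressLookup_ | ( | const ContainerType & | container, |
AddressLookup & | lookup | ||
) |
inline static protected |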
Recreate the address look-up table for a container.
|
protected |
Reference to the current data processing step (see setCurrentProcessingStep())
|
protected |
Suppress validity checks in register... calls?
This is useful in situations where validity is already guaranteed (e.g. copying).
|
protected |
Referenced by IDFilter::filterObservationMatchesByFunctor().