OpenMS
MzIdentMLDOMHandler Class Reference

XML DOM handler for MzIdentMLFile.

#include <OpenMS/FORMAT/HANDLERS/MzIdentMLDOMHandler.h>

Classes

struct AnalysisSoftware
 Struct to hold the used analysis software for that file.

struct DatabaseInput
 Struct to hold the information from the DatabaseInput xml tag.

struct DBSequence
 Struct to hold the information from the DBSequence xml tag.

struct ModificationParam
 Struct to hold the information from the ModificationParam xml tag.

struct PeptideEvidence
 Struct to hold the PeptideEvidence information.

struct SpectrumIdentification
 Struct to hold the information from the SpectrumIdentification xml tag.

struct SpectrumIdentificationProtocol
 Struct to hold the information from the SpectrumIdentificationProtocol xml tag.
 

Constructors and destructor

const ProgressLogger & logger_
 Progress logger.
 
ControlledVocabulary cv_
 Controlled vocabulary (psi-ms from OpenMS/share/OpenMS/CV/psi-ms.obo)

ControlledVocabulary unimod_
 Controlled vocabulary for modifications (unimod from OpenMS/share/OpenMS/CV/unimod.obo)

std::vector< ProteinIdentification > * pro_id_ = nullptr
 Internal +w Identification Item for proteins.

std::vector< PeptideIdentification > * pep_id_ = nullptr
 Internal +w Identification Item for peptides.

const std::vector< ProteinIdentification > * cpro_id_ = nullptr
 Internal -w Identification Item for proteins.

const std::vector< PeptideIdentification > * cpep_id_ = nullptr
 Internal -w Identification Item for peptides.

const String schema_version_
 Internal version keeping.

 MzIdentMLDOMHandler (const std::vector< ProteinIdentification > &pro_id, const std::vector< PeptideIdentification > &pep_id, const String &version, const ProgressLogger &logger)
 Constructor for a write-only handler for internal identification structures.

 MzIdentMLDOMHandler (std::vector< ProteinIdentification > &pro_id, std::vector< PeptideIdentification > &pep_id, const String &version, const ProgressLogger &logger)
 Constructor for a read-only handler for internal identification structures.

virtual ~MzIdentMLDOMHandler ()
 Destructor.

void readMzIdentMLFile (const std::string &mzid_file)
 Provides the functionality of reading a mzid with a handler object.

void writeMzIdentMLFile (const std::string &mzid_file)
 Provides the functionality to write a mzid with a handler object.

ControlledVocabulary::CVTerm getChildWithName_ (const String &parent_accession, const String &name) const
 Looks up a child CV term of parent_accession with the name name. If no such term is found, an empty term is returned.
 

Helper functions to build a DOM tree from the internal id structures

XMLCh * xml_root_tag_ptr_
 
XMLCh * xml_cvparam_tag_ptr_
 
XMLCh * xml_name_attr_ptr_
 
xercesc::XercesDOMParser mzid_parser_
 
std::unique_ptr< XMLHandler > xml_handler_ = nullptr
 
String search_engine_
 
String search_engine_version_
 
std::map< String, AnalysisSoftware > as_map_
 mapping AnalysisSoftware id -> AnalysisSoftware

std::map< String, String > sr_map_
 mapping sourcefile id -> sourcefile location

std::map< String, String > sd_map_
 mapping spectradata id -> spectradata location

std::map< String, DatabaseInput > db_map_
 mapping database id -> DatabaseInput

std::map< String, SpectrumIdentification > si_map_
 mapping SpectrumIdentification id -> SpectrumIdentification (id refs)

std::map< String, size_t > si_pro_map_
 mapping SpectrumIdentificationList id -> index to ProteinIdentification in pro_id_

std::map< String, SpectrumIdentificationProtocol > sp_map_
 mapping SpectrumIdentificationProtocol id -> SpectrumIdentificationProtocol

std::map< String, AASequence > pep_map_
 mapping Peptide id -> Sequence

std::map< String, PeptideEvidence > pe_ev_map_
 mapping PeptideEvidence id -> PeptideEvidence

std::map< String, String > pv_db_map_
 mapping PeptideEvidence id -> DBSequence id

std::multimap< String, String > p_pv_map_
 mapping Peptide id -> PeptideEvidence id, multiple PeptideEvidences can have equivalent Peptides.

std::map< String, DBSequence > db_sq_map_
 mapping DBSequence id -> Sequence

std::list< std::list< String > > hit_pev_
 writing help only

bool xl_ms_search_
 is true when reading a file containing Cross-Linking MS search results

std::map< String, String > xl_id_donor_map_
 mapping Peptide id -> crosslink donor value

std::map< String, String > xl_id_acceptor_map_
 mapping peptide id of acceptor peptide -> crosslink acceptor value

std::map< String, SignedSize > xl_donor_pos_map_
 mapping donor value -> cross-link modification location

std::map< String, SignedSize > xl_acceptor_pos_map_
 mapping acceptor value -> cross-link modification location

std::map< String, double > xl_mass_map_
 mapping Peptide id -> cross-link mass

std::map< String, String > xl_mod_map_
 mapping peptide id -> cross-linking reagent name
 
void buildCvList_ (xercesc::DOMElement *cvElements)
 
void buildAnalysisSoftwareList_ (xercesc::DOMElement *analysisSoftwareElements)
 
void buildSequenceCollection_ (xercesc::DOMElement *sequenceCollectionElements)
 
void buildAnalysisCollection_ (xercesc::DOMElement *analysisCollectionElements)
 
void buildAnalysisProtocolCollection_ (xercesc::DOMElement *protocolElements)
 
void buildInputDataCollection_ (xercesc::DOMElement *inputElements)
 
void buildEnclosedCV_ (xercesc::DOMElement *parentElement, const String &encel, const String &acc, const String &name, const String &cvref)
 
void buildAnalysisDataCollection_ (xercesc::DOMElement *analysisElements)
 
 MzIdentMLDOMHandler ()
 
 MzIdentMLDOMHandler (const MzIdentMLDOMHandler &rhs)
 
MzIdentMLDOMHandler & operator= (const MzIdentMLDOMHandler &rhs)
 

Helper functions to build the internal id structures from the DOM tree

std::pair< CVTermList, std::map< String, DataValue > > parseParamGroup_ (xercesc::DOMNodeList *paramGroup)
 First: CVparams, Second: userParams (independent of each other)
 
CVTerm parseCvParam_ (xercesc::DOMElement *param)
 
std::pair< String, DataValue > parseUserParam_ (xercesc::DOMElement *param)
 
void parseAnalysisSoftwareList_ (xercesc::DOMNodeList *analysisSoftwareElements)
 
void parseDBSequenceElements_ (xercesc::DOMNodeList *dbSequenceElements)
 
void parsePeptideElements_ (xercesc::DOMNodeList *peptideElements)
 
AASequence parsePeptideSiblings_ (xercesc::DOMElement *peptide)
 
void parsePeptideEvidenceElements_ (xercesc::DOMNodeList *peptideEvidenceElements)
 
void parseSpectrumIdentificationElements_ (xercesc::DOMNodeList *spectrumIdentificationElements)
 
void parseSpectrumIdentificationProtocolElements_ (xercesc::DOMNodeList *spectrumIdentificationProtocolElements)
 
void parseInputElements_ (xercesc::DOMNodeList *inputElements)
 
void parseSpectrumIdentificationListElements_ (xercesc::DOMNodeList *spectrumIdentificationListElements)
 
void parseSpectrumIdentificationItemSetXLMS (std::set< String >::const_iterator set_it, std::multimap< String, int > xl_val_map, xercesc::DOMElement *element_res, const String &spectrumID)
 
void parseSpectrumIdentificationItemElement_ (xercesc::DOMElement *spectrumIdentificationItemElement, PeptideIdentification &spectrum_identification, String &spectrumIdentificationList_ref)
 
void parseProteinDetectionHypothesisElement_ (xercesc::DOMElement *proteinDetectionHypothesisElement, ProteinIdentification &protein_identification)
 
void parseProteinAmbiguityGroupElement_ (xercesc::DOMElement *proteinAmbiguityGroupElement, ProteinIdentification &protein_identification)
 
void parseProteinDetectionListElements_ (xercesc::DOMNodeList *proteinDetectionListElements)
 
static ProteinIdentification::SearchParameters findSearchParameters_ (std::pair< CVTermList, std::map< String, DataValue > > as_params)
 

Detailed Description

XML DOM handler for MzIdentMLFile.

In read-mode, this class will parse an MzIdentML XML file and append the input identifications to the provided PeptideIdentifications and ProteinIdentifications.

Note
Do not use this class directly; it is only needed by MzIdentMLFile.
The DOM and STREAM handlers for MzIdentML share the same interface for legacy id structures.
Only once an object of this class has been destroyed is it guaranteed that all data has been appended to the appropriate containers. Do not access the data before that.
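
The note about destruction suggests scoping the handler so that it is gone before the containers are read. The following is a minimal sketch of that usage pattern only. ToyHandler, read and loadIds are hypothetical stand-ins invented for illustration; they are not part of OpenMS and do not reflect how MzIdentMLDOMHandler is implemented.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in (not OpenMS): a handler whose output container is
// only guaranteed to be complete after the handler has been destroyed.
class ToyHandler
{
public:
  explicit ToyHandler(std::vector<std::string>& out) : out_(&out) {}

  // Flush everything that is still buffered when the handler goes away.
  ~ToyHandler()
  {
    for (std::string& id : pending_) out_->push_back(std::move(id));
  }

  ToyHandler(const ToyHandler&) = delete;
  ToyHandler& operator=(const ToyHandler&) = delete;

  // Pretend to parse a file: the result is buffered, not yet visible to the caller.
  void read(const std::string& file)
  {
    assert(!file.empty()); // precondition of this sketch
    pending_.push_back("parsed:" + file);
  }

private:
  std::vector<std::string>* out_;
  std::vector<std::string> pending_;
};

// Scope the handler so that it is destroyed before the results are used.
std::vector<std::string> loadIds(const std::string& file)
{
  std::vector<std::string> ids;
  {
    ToyHandler handler(ids);
    handler.read(file);
    // ids may still be incomplete at this point
  } // handler destroyed here; ids is now complete
  return ids;
}
```

In practice MzIdentMLFile takes care of this scoping, which is one reason the class is not meant to be used directly.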

Class Documentation

◆ OpenMS::Internal::MzIdentMLDOMHandler::AnalysisSoftware

struct OpenMS::Internal::MzIdentMLDOMHandler::AnalysisSoftware

Struct to hold the used analysis software for that file.

Class Members
String name
String version

◆ OpenMS::Internal::MzIdentMLDOMHandler::DatabaseInput

struct OpenMS::Internal::MzIdentMLDOMHandler::DatabaseInput

Struct to hold the information from the DatabaseInput xml tag.

Class Members
DateTime date
String location
String name
String version

◆ OpenMS::Internal::MzIdentMLDOMHandler::DBSequence

struct OpenMS::Internal::MzIdentMLDOMHandler::DBSequence

Struct to hold the information from the DBSequence xml tag.

Class Members
String accession
CVTermList cvs
String database_ref
String sequence

◆ OpenMS::Internal::MzIdentMLDOMHandler::ModificationParam

struct OpenMS::Internal::MzIdentMLDOMHandler::ModificationParam

Struct to hold the information from the ModificationParam xml tag.

Class Members
String fixed_mod
long double mass_delta
CVTermList modification_param_cvs
String residues
CVTermList specificities

◆ OpenMS::Internal::MzIdentMLDOMHandler::PeptideEvidence

struct OpenMS::Internal::MzIdentMLDOMHandler::PeptideEvidence

Struct to hold the PeptideEvidence information.

Class Members
bool idec
char post
char pre
int start
int stop

◆ OpenMS::Internal::MzIdentMLDOMHandler::SpectrumIdentification

struct OpenMS::Internal::MzIdentMLDOMHandler::SpectrumIdentification

Struct to hold the information from the SpectrumIdentification xml tag.

Class Members
String search_database_ref
String spectra_data_ref
String spectrum_identification_list_ref
String spectrum_identification_protocol_ref

◆ OpenMS::Internal::MzIdentMLDOMHandler::SpectrumIdentificationProtocol

struct OpenMS::Internal::MzIdentMLDOMHandler::SpectrumIdentificationProtocol

Struct to hold the information from the SpectrumIdentificationProtocol xml tag.

Class Members
String enzyme
long double fragment_tolerance
CVTermList modification_parameter
CVTermList parameter_cvs
map< String, DataValue > parameter_ups
long double precursor_tolerance
CVTerm searchtype
CVTermList threshold_cvs
map< String, DataValue > threshold_ups

Constructor & Destructor Documentation

◆ MzIdentMLDOMHandler() [1/4]

MzIdentMLDOMHandler ( const std::vector< ProteinIdentification > &  pro_id,
const std::vector< PeptideIdentification > &  pep_id,
const String &  version,
const ProgressLogger &  logger
)

Constructor for a write-only handler for internal identification structures.

◆ MzIdentMLDOMHandler() [2/4]

MzIdentMLDOMHandler ( std::vector< ProteinIdentification > &  pro_id,
std::vector< PeptideIdentification > &  pep_id,
const String &  version,
const ProgressLogger &  logger
)

Constructor for a read-only handler for internal identification structures.

◆ ~MzIdentMLDOMHandler()

virtual ~MzIdentMLDOMHandler ( )
virtual

Destructor.

◆ MzIdentMLDOMHandler() [3/4]

MzIdentMLDOMHandler ( )
private

◆ MzIdentMLDOMHandler() [4/4]

MzIdentMLDOMHandler ( const MzIdentMLDOMHandler &  rhs)
private

Member Function Documentation

◆ buildAnalysisCollection_()

void buildAnalysisCollection_ ( xercesc::DOMElement *  analysisCollectionElements)
protected

◆ buildAnalysisDataCollection_()

void buildAnalysisDataCollection_ ( xercesc::DOMElement *  analysisElements)
protected

◆ buildAnalysisProtocolCollection_()

void buildAnalysisProtocolCollection_ ( xercesc::DOMElement *  protocolElements)
protected

◆ buildAnalysisSoftwareList_()

void buildAnalysisSoftwareList_ ( xercesc::DOMElement *  analysisSoftwareElements)
protected

◆ buildCvList_()

void buildCvList_ ( xercesc::DOMElement *  cvElements)
protected

◆ buildEnclosedCV_()

void buildEnclosedCV_ ( xercesc::DOMElement *  parentElement,
const String &  encel,
const String &  acc,
const String &  name,
const String &  cvref
)
protected

◆ buildInputDataCollection_()

void buildInputDataCollection_ ( xercesc::DOMElement *  inputElements)
protected

◆ buildSequenceCollection_()

void buildSequenceCollection_ ( xercesc::DOMElement *  sequenceCollectionElements)
protected

◆ findSearchParameters_()

static ProteinIdentification::SearchParameters findSearchParameters_ ( std::pair< CVTermList, std::map< String, DataValue > >  as_params)
static protected

◆ getChildWithName_()

ControlledVocabulary::CVTerm getChildWithName_ ( const String &  parent_accession,
const String &  name
) const
protected

Looks up a child CV term of parent_accession with the name name. If no such term is found, an empty term is returned.
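
The "empty term if nothing is found" contract can be illustrated with a small, self-contained sketch. ToyTerm, ToyCV, exampleCV and childWithName are hypothetical names, and the accessions are made up; this does not use the OpenMS ControlledVocabulary API and is not the actual implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical toy types (not OpenMS): a term and a parent -> children table.
struct ToyTerm
{
  std::string accession;
  std::string name;
};

using ToyCV = std::map<std::string, std::vector<ToyTerm>>; // parent accession -> child terms

// Build a tiny vocabulary with invented accessions for the example.
ToyCV exampleCV()
{
  ToyCV cv;
  cv["TOY:0001"].push_back(ToyTerm{"TOY:0002", "child a"});
  cv["TOY:0001"].push_back(ToyTerm{"TOY:0003", "child b"});
  return cv;
}

// Return the child of parent_accession called name, or a default-constructed
// (empty) term when the parent is unknown or has no child with that name.
ToyTerm childWithName(const ToyCV& cv, const std::string& parent_accession, const std::string& name)
{
  auto parent = cv.find(parent_accession);
  if (parent != cv.end())
  {
    for (const ToyTerm& term : parent->second)
    {
      if (term.name == name) return term;
    }
  }
  return ToyTerm{};
}
```

A caller of such a function would check whether the returned accession is empty before using the term.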

◆ operator=()

MzIdentMLDOMHandler& operator= ( const MzIdentMLDOMHandler &  rhs)
private

◆ parseAnalysisSoftwareList_()

void parseAnalysisSoftwareList_ ( xercesc::DOMNodeList *  analysisSoftwareElements)
protected

◆ parseCvParam_()

CVTerm parseCvParam_ ( xercesc::DOMElement *  param)
protected

◆ parseDBSequenceElements_()

void parseDBSequenceElements_ ( xercesc::DOMNodeList *  dbSequenceElements)
protected

◆ parseInputElements_()

void parseInputElements_ ( xercesc::DOMNodeList *  inputElements)
protected

◆ parseParamGroup_()

std::pair<CVTermList, std::map<String, DataValue> > parseParamGroup_ ( xercesc::DOMNodeList *  paramGroup)
protected

First: CVparams, Second: userParams (independent of each other)
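
As a rough illustration of the returned pair, the sketch below separates cvParam and userParam entries into the first and second member. It works on a plain vector rather than a xercesc::DOMNodeList; ToyParam, ToyParamGroup and splitParamGroup are hypothetical names, and the real function returns CVTermList and DataValue types rather than strings.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened view of one child element of a param group.
struct ToyParam
{
  std::string tag;   // element name, e.g. "cvParam" or "userParam"
  std::string name;  // value of the name attribute
  std::string value; // value of the value attribute
};

// first: cvParams in document order, second: userParams keyed by name.
using ToyParamGroup = std::pair<std::vector<ToyParam>, std::map<std::string, std::string>>;

ToyParamGroup splitParamGroup(const std::vector<ToyParam>& nodes)
{
  ToyParamGroup group;
  for (const ToyParam& node : nodes)
  {
    if (node.tag == "cvParam")
    {
      group.first.push_back(node);
    }
    else if (node.tag == "userParam")
    {
      group.second[node.name] = node.value;
    }
    // any other element is ignored in this sketch
  }
  return group;
}
```

The two halves are filled independently of each other, which matches the "independent of each other" remark above.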

◆ parsePeptideElements_()

void parsePeptideElements_ ( xercesc::DOMNodeList *  peptideElements)
protected

◆ parsePeptideEvidenceElements_()

void parsePeptideEvidenceElements_ ( xercesc::DOMNodeList *  peptideEvidenceElements)
protected

◆ parsePeptideSiblings_()

AASequence parsePeptideSiblings_ ( xercesc::DOMElement *  peptide)
protected

◆ parseProteinAmbiguityGroupElement_()

void parseProteinAmbiguityGroupElement_ ( xercesc::DOMElement *  proteinAmbiguityGroupElement,
ProteinIdentification &  protein_identification
)
protected

◆ parseProteinDetectionHypothesisElement_()

void parseProteinDetectionHypothesisElement_ ( xercesc::DOMElement *  proteinDetectionHypothesisElement,
ProteinIdentification &  protein_identification
)
protected

◆ parseProteinDetectionListElements_()

void parseProteinDetectionListElements_ ( xercesc::DOMNodeList *  proteinDetectionListElements)
protected

◆ parseSpectrumIdentificationElements_()

void parseSpectrumIdentificationElements_ ( xercesc::DOMNodeList *  spectrumIdentificationElements)
protected

◆ parseSpectrumIdentificationItemElement_()

void parseSpectrumIdentificationItemElement_ ( xercesc::DOMElement *  spectrumIdentificationItemElement,
PeptideIdentification &  spectrum_identification,
String &  spectrumIdentificationList_ref
)
protected

◆ parseSpectrumIdentificationItemSetXLMS()

void parseSpectrumIdentificationItemSetXLMS ( std::set< String >::const_iterator  set_it,
std::multimap< String, int >  xl_val_map,
xercesc::DOMElement *  element_res,
const String &  spectrumID
)
protected

◆ parseSpectrumIdentificationListElements_()

void parseSpectrumIdentificationListElements_ ( xercesc::DOMNodeList *  spectrumIdentificationListElements)
protected

◆ parseSpectrumIdentificationProtocolElements_()

void parseSpectrumIdentificationProtocolElements_ ( xercesc::DOMNodeList *  spectrumIdentificationProtocolElements)
protected

◆ parseUserParam_()

std::pair<String, DataValue> parseUserParam_ ( xercesc::DOMElement *  param)
protected

◆ readMzIdentMLFile()

void readMzIdentMLFile ( const std::string &  mzid_file)

Provides the functionality of reading a mzid with a handler object.

◆ writeMzIdentMLFile()

void writeMzIdentMLFile ( const std::string &  mzid_file)

Provides the functionality to write a mzid with a handler object.

Member Data Documentation

◆ as_map_

std::map<String, AnalysisSoftware> as_map_
private

mapping AnalysisSoftware id -> AnalysisSoftware

◆ cpep_id_

const std::vector<PeptideIdentification>* cpep_id_ = nullptr
protected

Internal -w Identification Item for peptides.

◆ cpro_id_

const std::vector<ProteinIdentification>* cpro_id_ = nullptr
protected

Internal -w Identification Item for proteins.

◆ cv_

ControlledVocabulary cv_
protected

Controlled vocabulary (psi-ms from OpenMS/share/OpenMS/CV/psi-ms.obo)

◆ db_map_

std::map<String, DatabaseInput> db_map_
private

mapping database id -> DatabaseInput

◆ db_sq_map_

std::map<String, DBSequence> db_sq_map_
private

mapping DBSequence id -> Sequence

◆ hit_pev_

std::list<std::list<String> > hit_pev_
private

writing help only

◆ logger_

const ProgressLogger& logger_
protected

Progress logger.

◆ mzid_parser_

xercesc::XercesDOMParser mzid_parser_
private

◆ p_pv_map_

std::multimap<String, String> p_pv_map_
private

mapping Peptide id -> PeptideEvidence id, multiple PeptideEvidences can have equivalent Peptides.
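
Because one Peptide id can map to several PeptideEvidence ids, resolving a peptide to its database sequences means walking a range of the multimap and then following each evidence id through a second map (compare pv_db_map_). The sketch below shows that lookup with plain std::string keys. dbSequencesForPeptide is a hypothetical free function written for illustration, not a member of this class.

```cpp
#include <map>
#include <string>
#include <vector>

// Collect the DBSequence ids reachable from one Peptide id.
//   p_pv : Peptide id         -> PeptideEvidence id (one to many)
//   pv_db: PeptideEvidence id -> DBSequence id
std::vector<std::string> dbSequencesForPeptide(
  const std::multimap<std::string, std::string>& p_pv,
  const std::map<std::string, std::string>& pv_db,
  const std::string& peptide_id)
{
  std::vector<std::string> result;
  // equal_range yields all entries stored under peptide_id
  auto range = p_pv.equal_range(peptide_id);
  for (auto it = range.first; it != range.second; ++it)
  {
    auto db = pv_db.find(it->second);
    if (db != pv_db.end()) result.push_back(db->second);
    // evidence ids without a DBSequence entry are skipped in this sketch
  }
  return result;
}
```

An unknown Peptide id simply produces an empty range, so the function returns an empty vector instead of failing.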

◆ pe_ev_map_

std::map<String, PeptideEvidence> pe_ev_map_
private

mapping PeptideEvidence id -> PeptideEvidence

◆ pep_id_

std::vector<PeptideIdentification>* pep_id_ = nullptr
protected

Internal +w Identification Item for peptides.

◆ pep_map_

std::map<String, AASequence> pep_map_
private

mapping Peptide id -> Sequence

◆ pro_id_

std::vector<ProteinIdentification>* pro_id_ = nullptr
protected

Internal +w Identification Item for proteins.

◆ pv_db_map_

std::map<String, String> pv_db_map_
private

mapping PeptideEvidence id -> DBSequence id

◆ schema_version_

const String schema_version_
protected

Internal version keeping.

◆ sd_map_

std::map<String, String> sd_map_
private

mapping spectradata id -> spectradata location

◆ search_engine_

String search_engine_
private

◆ search_engine_version_

String search_engine_version_
private

◆ si_map_

std::map<String, SpectrumIdentification> si_map_
private

mapping SpectrumIdentification id -> SpectrumIdentification (id refs)

◆ si_pro_map_

std::map<String, size_t> si_pro_map_
private

mapping SpectrumIdentificationList id -> index to ProteinIdentification in pro_id_

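◆ sp_map_

std::map<String, SpectrumIdentificationProtocol> sp_map_
private

mapping SpectrumIdentificationProtocol id -> SpectrumIdentificationProtocol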

◆ sr_map_

std::map<String, String> sr_map_
private

mapping sourcefile id -> sourcefile location

◆ unimod_

ControlledVocabulary unimod_
protected

Controlled vocabulary for modifications (unimod from OpenMS/share/OpenMS/CV/unimod.obo)

◆ xl_acceptor_pos_map_

std::map<String, SignedSize> xl_acceptor_pos_map_
private

mapping acceptor value -> cross-link modification location

◆ xl_donor_pos_map_

std::map<String, SignedSize> xl_donor_pos_map_
private

mapping donor value -> cross-link modification location

◆ xl_id_acceptor_map_

std::map<String, String> xl_id_acceptor_map_
private

mapping peptide id of acceptor peptide -> crosslink acceptor value

◆ xl_id_donor_map_

std::map<String, String> xl_id_donor_map_
private

mapping Peptide id -> crosslink donor value

◆ xl_mass_map_

std::map<String, double> xl_mass_map_
private

mapping Peptide id -> cross-link mass

◆ xl_mod_map_

std::map<String, String> xl_mod_map_
private

mapping peptide id -> cross-linking reagent name

◆ xl_ms_search_

bool xl_ms_search_
private

is true when reading a file containing Cross-Linking MS search results

◆ xml_cvparam_tag_ptr_

XMLCh* xml_cvparam_tag_ptr_
private

◆ xml_handler_

std::unique_ptr<XMLHandler> xml_handler_ = nullptr
private

◆ xml_name_attr_ptr_

XMLCh* xml_name_attr_ptr_
private

◆ xml_root_tag_ptr_

XMLCh* xml_root_tag_ptr_
private