OpenMS  2.4.0
PeptideAndProteinQuant Class Reference

Helper class for peptide and protein quantification based on feature data annotated with IDs.

#include <OpenMS/ANALYSIS/QUANTITATION/PeptideAndProteinQuant.h>

Inherits from DefaultParamHandler.

Classes

struct PeptideData
 Quantitative and associated data for a peptide.

struct ProteinData
 Quantitative and associated data for a protein.

struct Statistics
 Statistics for processing summary.
 

Public Types

typedef std::map< UInt64, double > SampleAbundances
 Mapping: sample ID -> abundance.

typedef std::map< AASequence, PeptideData > PeptideQuant
 Mapping: peptide sequence (modified) -> peptide data.

typedef std::map< String, ProteinData > ProteinQuant
 Mapping: protein accession -> protein data.
 

Public Member Functions

PeptideAndProteinQuant ()
 Constructor.

~PeptideAndProteinQuant () override
 Destructor.

void readQuantData (FeatureMap &features)
 Read quantitative data from a feature map.

void readQuantData (ConsensusMap &consensus)
 Read quantitative data from a consensus map.

void readQuantData (std::vector< ProteinIdentification > &proteins, std::vector< PeptideIdentification > &peptides)
 Read quantitative data from identification results (for quantification via spectral counting).

void quantifyPeptides (const std::vector< PeptideIdentification > &peptides=std::vector< PeptideIdentification >())
 Compute peptide abundances.

void quantifyProteins (const ProteinIdentification &proteins=ProteinIdentification())
 Compute protein abundances.

const Statistics & getStatistics ()
 Get summary statistics.

const PeptideQuant & getPeptideResults ()
 Get peptide abundance data.

const ProteinQuant & getProteinResults ()
 Get protein abundance data.
 
- Public Member Functions inherited from DefaultParamHandler
DefaultParamHandler (const String &name)
 Constructor with name that is displayed in error messages.

DefaultParamHandler (const DefaultParamHandler &rhs)
 Copy constructor.

virtual ~DefaultParamHandler ()
 Destructor.

virtual DefaultParamHandler & operator= (const DefaultParamHandler &rhs)
 Assignment operator.

virtual bool operator== (const DefaultParamHandler &rhs) const
 Equality operator.

void setParameters (const Param &param)
 Sets the parameters.

const Param & getParameters () const
 Non-mutable access to the parameters.

const Param & getDefaults () const
 Non-mutable access to the default parameters.

const String & getName () const
 Non-mutable access to the name.

void setName (const String &name)
 Mutable access to the name.

const std::vector< String > & getSubsections () const
 Non-mutable access to the registered subsections.
 

Private Member Functions

PeptideHit getAnnotation_ (std::vector< PeptideIdentification > &peptides)
 Get the "canonical" annotation (a single peptide hit) of a feature/consensus feature from the associated list of peptide identifications.

void quantifyFeature_ (const FeatureHandle &feature, const PeptideHit &hit)
 Gather quantitative information from a feature.

template<typename T >
void orderBest_ (const std::map< T, SampleAbundances > &abundances, std::vector< T > &result)
 Order keys (charges/peptides for peptide/protein quantification) according to how many samples they allow to quantify, breaking ties by total abundance.

void normalizePeptides_ ()
 Normalize peptide abundances across samples by (multiplicative) scaling to equal medians.

String getAccession_ (const std::set< String > &pep_accessions, std::map< String, String > &accession_to_leader)
 Get the "canonical" protein accession from the list of protein accessions of a peptide.

void countPeptides_ (std::vector< PeptideIdentification > &peptides)
 Count the number of identifications (best hits only) of each peptide sequence.

void updateMembers_ () override
 Clear all data when parameters are set.
 

Private Attributes

Statistics stats_
 Processing statistics for output in the end.

PeptideQuant pep_quant_
 Peptide quantification data.

ProteinQuant prot_quant_
 Protein quantification data.
 

Additional Inherited Members

- Protected Member Functions inherited from DefaultParamHandler
void defaultsToParam_ ()
 Updates the parameters after the defaults have been set in the constructor.

- Protected Attributes inherited from DefaultParamHandler
Param param_
 Container for current parameters.

Param defaults_
 Container for default parameters. This member should be filled in the constructor of derived classes!

std::vector< String > subsections_
 Container for registered subsections. This member should be filled in the constructor of derived classes!

String error_name_
 Name that is displayed in error messages during the parameter checking.

bool check_defaults_
 If this member is set to false, no checking of the parameters is done.

bool warn_empty_defaults_
 If this member is set to false, no warning is emitted when the defaults are empty.
 

Detailed Description

Helper class for peptide and protein quantification based on feature data annotated with IDs.

This class is used by ProteinQuantifier. See there for further documentation.

Parameters of this class are:

Name | Type | Default | Restrictions | Description
top | int | 3 | min: 0 | Calculate protein abundance from this number of proteotypic peptides (most abundant first; '0' for all)
average | string | median | median, mean, weighted_mean, sum | Averaging method used to compute protein abundances from peptide abundances
include_all | string | false | true, false | Include results for proteins with fewer proteotypic peptides than indicated by 'top' (no effect if 'top' is 0 or 1)
filter_charge | string | false | true, false | Distinguish between charge states of a peptide. For peptides, abundances will be reported separately for each charge; for proteins, abundances will be computed based only on the most prevalent charge of each peptide. By default, abundances are summed over all charge states.
consensus:normalize | string | false | true, false | Scale peptide abundances so that medians of all samples are equal
consensus:fix_peptides | string | false | true, false | Use the same peptides for protein quantification across all samples. With 'top 0', all peptides that occur in every sample are considered. Otherwise ('top N'), the N peptides that occur in the most samples (independently of each other) are selected, breaking ties by total abundance (there is no guarantee that the best co-occurring peptides are chosen!).
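The interplay of 'top', 'average' and 'include_all' can be sketched for a single protein in a single sample as follows. This is a minimal, self-contained illustration and not the OpenMS implementation: the function name summarizeProtein is hypothetical, the median of an even number of values is assumed to be the mean of the two middle values, and 'weighted_mean' is left out because the weighting scheme is not described on this page.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenMS): combine the peptide abundances of
// one protein in one sample into a single protein abundance.
// Returns false if no abundance should be reported for the protein.
bool summarizeProtein(std::vector<double> peptide_abundances, std::size_t top,
                      const std::string& average, bool include_all,
                      double& result)
{
  if (peptide_abundances.empty()) return false;

  // 'top': most abundant peptides first; 0 means "use all peptides".
  std::sort(peptide_abundances.begin(), peptide_abundances.end(),
            std::greater<double>());
  if (top > 0)
  {
    // 'include_all': report proteins with fewer than 'top' peptides only on request.
    if (peptide_abundances.size() < top && !include_all) return false;
    if (peptide_abundances.size() > top) peptide_abundances.resize(top);
  }

  const std::size_t n = peptide_abundances.size();
  const double sum = std::accumulate(peptide_abundances.begin(),
                                     peptide_abundances.end(), 0.0);
  if (average == "sum")
  {
    result = sum;
  }
  else if (average == "mean")
  {
    result = sum / static_cast<double>(n);
  }
  else if (average == "median")
  {
    // Assumption: even-sized input uses the mean of the two middle values.
    result = (n % 2 == 1)
               ? peptide_abundances[n / 2]
               : 0.5 * (peptide_abundances[n / 2 - 1] + peptide_abundances[n / 2]);
  }
  else
  {
    // "weighted_mean" is intentionally not sketched here.
    throw std::invalid_argument("unsupported averaging method: " + average);
  }
  return true;
}
```

With the defaults ('top' 3, 'average' median), four peptide abundances 1, 5, 3, 4 would be reduced to the three largest values before the median is taken.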


Member Typedef Documentation

◆ PeptideQuant

typedef std::map<AASequence, PeptideData> PeptideQuant

Mapping: peptide sequence (modified) -> peptide data.

◆ ProteinQuant

typedef std::map<String, ProteinData> ProteinQuant

Mapping: protein accession -> protein data.

◆ SampleAbundances

typedef std::map<UInt64, double> SampleAbundances

Mapping: sample ID -> abundance.

Constructor & Destructor Documentation

◆ PeptideAndProteinQuant()

Constructor.

◆ ~PeptideAndProteinQuant()

~PeptideAndProteinQuant ( )
inline override

Destructor.

Member Function Documentation

◆ countPeptides_()

void countPeptides_ ( std::vector< PeptideIdentification > &  peptides)
private

Count the number of identifications (best hits only) of each peptide sequence.

The peptide hits in peptides are sorted by score in the process.
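The counting step can be illustrated with plain standard-library types. The sketch below is an assumption-laden stand-in, not the OpenMS code: ScoredHit, HitList and countBestHits are hypothetical names, higher scores are assumed to be better, and identifications without hits are assumed to be skipped.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for PeptideHit / PeptideIdentification.
struct ScoredHit
{
  std::string sequence;
  double score;
};
typedef std::vector<ScoredHit> HitList;

// Count how often each sequence is the best hit of an identification.
// The hits of every identification are sorted in place (best first).
std::map<std::string, std::size_t> countBestHits(std::vector<HitList>& ids)
{
  std::map<std::string, std::size_t> counts;
  for (HitList& hits : ids)
  {
    if (hits.empty()) continue; // assumption: nothing to count
    // Assumption: a higher score is better.
    std::stable_sort(hits.begin(), hits.end(),
                     [](const ScoredHit& a, const ScoredHit& b)
                     { return a.score > b.score; });
    ++counts[hits.front().sequence];
  }
  return counts;
}
```

The input is taken by non-const reference because, as stated above, sorting the hits is a side effect of counting.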

◆ getAccession_()

String getAccession_ ( const std::set< String > &  pep_accessions,
std::map< String, String > &  accession_to_leader 
)
private

Get the "canonical" protein accession from the list of protein accessions of a peptide.

Parameters
pep_accessionsProtein accessions of a peptide
accession_to_leaderCaptures information about indistinguishable proteins (maps accession to accession of group leader)

If there is no information about indistinguishable proteins (from protXML) available, a canonical accession exists only for proteotypic peptides - it's the single accession for the respective peptide.

Otherwise, a peptide has a canonical accession if it maps only to proteins of one indistinguishable group. In this case, the canonical accession is that of the group leader.

If there is no canonical accession, the empty string is returned.
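The three cases above can be written down compactly. The following is a sketch using std::string in place of String; the name canonicalAccession is hypothetical, and the treatment of an accession that is missing from accession_to_leader (here: no canonical accession) is an assumption that the actual implementation may handle differently.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical re-statement of the rules described for getAccession_().
std::string canonicalAccession(
    const std::set<std::string>& pep_accessions,
    const std::map<std::string, std::string>& accession_to_leader)
{
  if (accession_to_leader.empty())
  {
    // No information about indistinguishable proteins:
    // only proteotypic peptides (exactly one accession) qualify.
    if (pep_accessions.size() == 1) return *pep_accessions.begin();
    return "";
  }

  std::string leader;
  for (const std::string& accession : pep_accessions)
  {
    std::map<std::string, std::string>::const_iterator pos =
        accession_to_leader.find(accession);
    // Assumption: an accession without group information rules out a
    // canonical accession.
    if (pos == accession_to_leader.end()) return "";
    if (leader.empty())
    {
      leader = pos->second;
    }
    else if (leader != pos->second)
    {
      return ""; // peptide maps to more than one group
    }
  }
  return leader; // empty if pep_accessions was empty
}
```

For example, with groups {P1, P2} (leader P1) and {P3}, a peptide mapping to P1 and P2 yields "P1", while a peptide mapping to P2 and P3 yields the empty string.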

◆ getAnnotation_()

PeptideHit getAnnotation_ ( std::vector< PeptideIdentification > &  peptides)
private

Get the "canonical" annotation (a single peptide hit) of a feature/consensus feature from the associated list of peptide identifications.

Only the best-scoring peptide hit of each ID in peptides is taken into account. The hits of each ID must already be sorted! If there's more than one ID and the best hits are not identical by sequence, or if there's no peptide ID, an empty peptide hit (for "ambiguous/no annotation") is returned. Protein accessions from identical peptide hits are accumulated.
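A simplified version of this rule, using plain structs instead of PeptideHit and PeptideIdentification, might look as follows. The names SimpleHit, SimpleId and canonicalAnnotation are hypothetical, and skipping identifications that contain no hits is an assumption not stated above.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for PeptideHit / PeptideIdentification.
struct SimpleHit
{
  std::string sequence;             // empty means "ambiguous/no annotation"
  std::set<std::string> accessions; // protein accessions of the hit
};
typedef std::vector<SimpleHit> SimpleId; // hits must be sorted, best first

SimpleHit canonicalAnnotation(const std::vector<SimpleId>& ids)
{
  SimpleHit result;
  bool found = false;
  for (const SimpleId& id : ids)
  {
    if (id.empty()) continue; // assumption: IDs without hits are ignored
    const SimpleHit& best = id.front();
    if (!found)
    {
      result = best;
      found = true;
    }
    else if (best.sequence == result.sequence)
    {
      // Identical best hits: accumulate their protein accessions.
      result.accessions.insert(best.accessions.begin(), best.accessions.end());
    }
    else
    {
      return SimpleHit(); // conflicting best hits: ambiguous annotation
    }
  }
  return result; // still empty if there was no usable ID
}
```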

◆ getPeptideResults()

const PeptideQuant& getPeptideResults ( )

Get peptide abundance data.

◆ getProteinResults()

const ProteinQuant& getProteinResults ( )

Get protein abundance data.

◆ getStatistics()

const Statistics& getStatistics ( )

Get summary statistics.

◆ normalizePeptides_()

void normalizePeptides_ ( )
private

Normalize peptide abundances across samples by (multiplicative) scaling to equal medians.
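Multiplicative scaling to equal medians can be sketched as below. The data layout (one vector of peptide abundances per sample ID) and the names medianOf and scaleToEqualMedians are hypothetical, and using the largest per-sample median as the common target is an assumption; this page only states that the medians end up equal.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Median of a non-empty vector (mean of the two middle values if even-sized).
double medianOf(std::vector<double> values)
{
  std::sort(values.begin(), values.end());
  const std::size_t n = values.size();
  return (n % 2 == 1) ? values[n / 2]
                      : 0.5 * (values[n / 2 - 1] + values[n / 2]);
}

// Scale every sample by one factor so that all sample medians are equal.
void scaleToEqualMedians(std::map<std::uint64_t, std::vector<double> >& abundances)
{
  std::map<std::uint64_t, double> medians;
  double target = 0.0; // assumption: largest median is the common target
  for (const auto& sample : abundances)
  {
    if (sample.second.empty()) continue;
    const double m = medianOf(sample.second);
    medians[sample.first] = m;
    target = std::max(target, m);
  }
  for (auto& sample : abundances)
  {
    std::map<std::uint64_t, double>::const_iterator pos = medians.find(sample.first);
    if (pos == medians.end() || pos->second <= 0.0) continue; // cannot scale
    const double factor = target / pos->second;
    for (double& abundance : sample.second) abundance *= factor;
  }
}
```

Because each sample is multiplied by a single factor, abundance ratios between peptides within one sample are preserved.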

◆ orderBest_()

void orderBest_ ( const std::map< T, SampleAbundances > &  abundances,
std::vector< T > &  result 
)
inline private

Order keys (charges/peptides for peptide/protein quantification) according to how many samples they allow to quantify, breaking ties by total abundance.

The keys of abundances are stored ordered in result, best first.
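The ordering criterion can be sketched with standard containers only. The function below is a hypothetical re-statement (orderBest without the trailing underscore, std::uint64_t in place of UInt64), written from the description above rather than taken from the OpenMS sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

typedef std::map<std::uint64_t, double> SampleAbundances; // sample ID -> abundance

// Order keys by the number of samples they quantify (more is better),
// breaking ties by total abundance (higher is better).
template <typename T>
void orderBest(const std::map<T, SampleAbundances>& abundances,
               std::vector<T>& result)
{
  typedef std::pair<std::pair<std::size_t, double>, T> Ranked;
  std::vector<Ranked> ranked;
  for (const auto& entry : abundances)
  {
    double total = 0.0;
    for (const auto& sample : entry.second) total += sample.second;
    ranked.push_back(
        Ranked(std::make_pair(entry.second.size(), total), entry.first));
  }
  // Pairs compare lexicographically: sample count first, then total abundance.
  std::stable_sort(ranked.begin(), ranked.end(),
                   [](const Ranked& a, const Ranked& b)
                   { return a.first > b.first; });
  result.clear();
  for (const Ranked& r : ranked) result.push_back(r.second);
}
```

With charge states as keys, a charge seen in two samples is ranked ahead of a charge seen in one sample, even if the latter has the higher total abundance.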

◆ quantifyFeature_()

void quantifyFeature_ ( const FeatureHandle &  feature,
const PeptideHit &  hit 
)
private

Gather quantitative information from a feature.

Store quantitative information from feature in member pep_quant_, based on the peptide annotation in hit. If hit is empty ("ambiguous/no annotation"), nothing is stored.

◆ quantifyPeptides()

void quantifyPeptides ( const std::vector< PeptideIdentification > &  peptides = std::vector< PeptideIdentification >())

Compute peptide abundances.

Based on quantitative data for individual charge states (in member pep_quant_), overall abundances for peptides are computed (and stored again in pep_quant_).

Quantitative data must first be read via readQuantData().

Optional (peptide-level) protein inference information (e.g. from Fido or ProteinProphet) can be supplied via peptides. In that case, peptide-to-protein associations - the basis for protein-level quantification - will also be read from peptides!

◆ quantifyProteins()

void quantifyProteins ( const ProteinIdentification &  proteins = ProteinIdentification())

Compute protein abundances.

Peptide abundances must be computed first with quantifyPeptides(). Optional protein inference information (e.g. from Fido or ProteinProphet) can be supplied via proteins.

◆ readQuantData() [1/3]

void readQuantData ( FeatureMap &  features)

Read quantitative data from a feature map.

Parameters should be set before using this method, as setting parameters will clear all results.

◆ readQuantData() [2/3]

void readQuantData ( ConsensusMap &  consensus)

Read quantitative data from a consensus map.

Parameters should be set before using this method, as setting parameters will clear all results.

◆ readQuantData() [3/3]

void readQuantData ( std::vector< ProteinIdentification > &  proteins,
std::vector< PeptideIdentification > &  peptides 
)

Read quantitative data from identification results (for quantification via spectral counting).

Parameters should be set before using this method, as setting parameters will clear all results.

◆ updateMembers_()

void updateMembers_ ( )
override private virtual

Clear all data when parameters are set.

Reimplemented from DefaultParamHandler.

Member Data Documentation

◆ pep_quant_

PeptideQuant pep_quant_
private

Peptide quantification data.

◆ prot_quant_

ProteinQuant prot_quant_
private

Protein quantification data.

◆ stats_

Statistics stats_
private

Processing statistics for output in the end.