OpenMS
#include <OpenMS/FEATUREFINDER/FeatureFinderIdentificationAlgorithm.h>
Classes | |
| struct | FeatureCompare |
| comparison functor for features More... | |
| struct | FeatureFilterPeptides |
| predicate for filtering features by assigned peptides: More... | |
| struct | FeatureFilterQuality |
| predicate for filtering features by overall quality: More... | |
| struct | IMStats |
| Ion mobility statistics for a peptide in a specific RT region and charge state. More... | |
| struct | PeptideCompare |
| comparison functor for (unassigned) peptide IDs More... | |
| struct | RTRegion |
| region in RT in which a peptide elutes: More... | |
Public Member Functions | |
| FeatureFinderIdentificationAlgorithm () | |
| default constructor More... | |
| void | run (PeptideIdentificationList peptides, const std::vector< ProteinIdentification > &proteins, PeptideIdentificationList peptides_ext, std::vector< ProteinIdentification > proteins_ext, FeatureMap &features, const FeatureMap &seeds=FeatureMap(), const String &spectra_file="") |
| void | runOnCandidates (FeatureMap &features) |
| PeakMap & | getMSData () |
| const PeakMap & | getMSData () const |
| void | setMSData (const PeakMap &ms_data) |
| set the MS data used for feature detection More... | |
| void | setMSData (PeakMap &&ms_data) |
| PeakMap & | getChromatograms () |
| const PeakMap & | getChromatograms () const |
| ProgressLogger & | getProgressLogger () |
| const ProgressLogger & | getProgressLogger () const |
| TargetedExperiment & | getLibrary () |
| const TargetedExperiment & | getLibrary () const |
Public Member Functions inherited from DefaultParamHandler | |
| DefaultParamHandler (const String &name) | |
| Constructor with name that is displayed in error messages. More... | |
| DefaultParamHandler (const DefaultParamHandler &rhs) | |
| Copy constructor. More... | |
| virtual | ~DefaultParamHandler () |
| Destructor. More... | |
| DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
| Assignment operator. More... | |
| virtual bool | operator== (const DefaultParamHandler &rhs) const |
| Equality operator. More... | |
| void | setParameters (const Param &param) |
| Sets the parameters. More... | |
| const Param & | getParameters () const |
| Non-mutable access to the parameters. More... | |
| const Param & | getDefaults () const |
| Non-mutable access to the default parameters. More... | |
| const String & | getName () const |
| Non-mutable access to the name. More... | |
| void | setName (const String &name) |
| Mutable access to the name. More... | |
| const std::vector< String > & | getSubsections () const |
| Non-mutable access to the registered subsections. More... | |
Protected Types | |
| typedef FeatureFinderAlgorithmPickedHelperStructs::MassTrace | MassTrace |
| typedef FeatureFinderAlgorithmPickedHelperStructs::MassTraces | MassTraces |
| typedef std::multimap< double, PeptideIdentification * > | RTMap |
| mapping: RT (not necessarily unique) -> pointer to peptide More... | |
| typedef std::map< Int, std::pair< RTMap, RTMap > > | ChargeMap |
| mapping: charge -> internal/external: (RT -> pointer to peptide) More... | |
| typedef std::map< AASequence, ChargeMap > | PeptideMap |
| mapping: sequence -> charge -> internal/external ID information More... | |
| typedef std::map< String, std::pair< RTMap, RTMap > > | PeptideRefRTMap |
| mapping: peptide ref. -> int./ext.: (RT -> pointer to peptide) More... | |
Protected Member Functions | |
| void | updateMembers_ () override |
| This method is used to update extra member variables at the end of the setParameters() method. More... | |
| void | generateTransitions_ (const String &peptide_id, double mz, Int charge, const IsotopeDistribution &iso_dist) |
| generate transitions (isotopic traces) for a peptide ion and add them to the library: More... | |
| void | addPeptideRT_ (TargetedExperiment::Peptide &peptide, double rt) const |
| void | getRTRegions_ (ChargeMap &peptide_data, std::vector< RTRegion > &rt_regions, bool clear_IDs=true) const |
| get regions in which a peptide elutes (ideally only one) by clustering RT elution times More... | |
| IMStats | getRTRegionIMStats_ (const RTRegion &r) |
| Calculate ion mobility statistics for peptide identifications in an RT region. More... | |
| void | calculateGlobalIMStats_ () |
| Calculate global IM statistics from MS data and peptide identifications. More... | |
| void | annotateFeaturesFinalizeAssay_ (FeatureMap &features, std::map< Size, std::vector< PeptideIdentification * > > &feat_ids, RTMap &rt_internal) |
| void | annotateFeatures_ (FeatureMap &features, PeptideRefRTMap &ref_rt_map) |
| annotate identified features with m/z, isotope probabilities, etc. More... | |
| void | ensureConvexHulls_ (Feature &feature) const |
| void | postProcess_ (FeatureMap &features, bool with_external_ids) |
| void | validateSVMParameters_ () const |
| Helper functions for run() More... | |
| void | initializeFeatureFinder_ () |
| double | calculateRTWindow_ (double rt_uncertainty) const |
| void | removeSeedPseudoIDs_ (FeatureMap &features) |
| std::pair< double, double > | calculateRTBounds_ (double rt_min, double rt_max) const |
| Calculate RT bounds with optional tolerance expansion. More... | |
| void | statistics_ (const FeatureMap &features) const |
| some statistics on detected features More... | |
| void | createAssayLibrary_ (const PeptideMap::iterator &begin, const PeptideMap::iterator &end, PeptideRefRTMap &ref_rt_map, bool clear_IDs=true) |
| void | addPeptideToMap_ (PeptideIdentification &peptide, PeptideMap &peptide_map, bool external=false) |
| void | filterFeatures_ (FeatureMap &features, bool classified) |
| void | runSingleGroup_ (PeptideIdentificationList peptides, const std::vector< ProteinIdentification > &proteins, PeptideIdentificationList peptides_ext, std::vector< ProteinIdentification > proteins_ext, FeatureMap &features, const FeatureMap &seeds, const String &spectra_file) |
| Size | addSeeds_ (PeptideIdentificationList &peptides, const FeatureMap &seeds) |
| Size | addOffsetPeptides_ (PeptideIdentificationList &peptides, double offset) |
| template<typename It > | |
| std::vector< std::pair< It, It > > | chunk_ (It range_from, It range_to, const std::ptrdiff_t batch_size) |
Protected Member Functions inherited from DefaultParamHandler | |
| void | defaultsToParam_ () |
| Updates the parameters after the defaults have been set in the constructor. More... | |
Static Protected Member Functions | |
| static bool | isSeedPseudoHit_ (const PeptideHit &hit) |
| Helper function to check if a peptide hit is a seed pseudo-ID. More... | |
Protected Attributes | |
| PeptideMap | peptide_map_ |
| Size | n_internal_peps_ |
| number of internal peptides More... | |
| Size | n_external_peps_ |
| number of external peptides More... | |
| Size | batch_size_ |
| number of peptides to process at the same time during chromatogram extraction More... | |
| double | rt_window_ |
| RT window width. More... | |
| double | mz_window_ |
| m/z window width More... | |
| bool | mz_window_ppm_ |
| m/z window width is given in PPM (not Da)? More... | |
| double | mapping_tolerance_ |
| RT tolerance for mapping IDs to features. More... | |
| double | isotope_pmin_ |
| min. isotope probability for peptide assay More... | |
| Size | n_isotopes_ |
| number of isotopes for peptide assay More... | |
| double | rt_quantile_ |
| double | peak_width_ |
| double | min_peak_width_ |
| double | signal_to_noise_ |
| String | elution_model_ |
| double | svm_min_prob_ |
| StringList | svm_predictor_names_ |
| String | svm_xval_out_ |
| double | svm_quality_cutoff |
| Size | svm_n_parts_ |
| number of partitions for SVM cross-validation More... | |
| Size | svm_n_samples_ |
| number of samples for SVM training More... | |
| String | candidates_out_ |
| Size | debug_level_ |
| struct OpenMS::FeatureFinderIdentificationAlgorithm::FeatureFilterQuality | feature_filter_quality_ |
| struct OpenMS::FeatureFinderIdentificationAlgorithm::FeatureFilterPeptides | feature_filter_peptides_ |
| struct OpenMS::FeatureFinderIdentificationAlgorithm::PeptideCompare | peptide_compare_ |
| struct OpenMS::FeatureFinderIdentificationAlgorithm::FeatureCompare | feature_compare_ |
| PeakMap | ms_data_ |
| input LC-MS data More... | |
| PeakMap | chrom_data_ |
| accumulated chromatograms (XICs) More... | |
| TargetedExperiment | library_ |
| assays for peptides (cleared per chunk during processing) More... | |
| TargetedExperiment | output_library_ |
| accumulated assays for output (populated from library_ before clearing) More... | |
| bool | quantify_decoys_ |
| double | add_mass_offset_peptides_ {0.0} |
| non-zero if, for every feature, an additional offset feature should be extracted More... | |
| bool | use_psm_cutoff_ |
| double | psm_score_cutoff_ |
| PeptideIdentificationList | unassignedIDs_ |
| const double | seed_rt_window_ = 60.0 |
| extraction window used for seeds (smaller than rt_window_ as we know the exact apex positions) More... | |
| std::map< double, std::pair< Size, Size > > | svm_probs_internal_ |
| SVM probability -> number of pos./neg. features (for FDR calculation): More... | |
| std::multiset< double > | svm_probs_external_ |
| SVM probabilities for "external" features (for FDR calculation): More... | |
| Size | n_internal_features_ |
| internal feature counter (for FDR calculation) More... | |
| Size | n_external_features_ |
| std::map< String, double > | isotope_probs_ |
| isotope probabilities of transitions More... | |
| std::map< String, IMStats > | im_stats_ |
| Ion mobility statistics per peptide reference (peptide sequence/charge:region) More... | |
| IMStats | global_im_stats_ |
| Global ion mobility statistics from all peptide identifications. More... | |
| MRMFeatureFinderScoring | feat_finder_ |
| OpenSWATH feature finder. More... | |
| Internal::FFIDAlgoExternalIDHandler | external_id_handler_ |
| Handler for external peptide IDs. More... | |
| ProgressLogger | prog_log_ |
Protected Attributes inherited from DefaultParamHandler | |
| Param | param_ |
| Container for current parameters. More... | |
| Param | defaults_ |
| Container for default parameters. This member should be filled in the constructor of derived classes! More... | |
| std::vector< String > | subsections_ |
| Container for registered subsections. This member should be filled in the constructor of derived classes! More... | |
| String | error_name_ |
| Name that is displayed in error messages during the parameter checking. More... | |
| bool | check_defaults_ |
| If this member is set to false, no parameter checking is done. More... | |
| bool | warn_empty_defaults_ |
| If this member is set to false, no warning is emitted when defaults are empty. More... | |
Additional Inherited Members | |
Static Public Member Functions inherited from DefaultParamHandler | |
| static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
| Writes all parameters to meta values. More... | |
| struct OpenMS::FeatureFinderIdentificationAlgorithm::IMStats |
Ion mobility statistics for a peptide in a specific RT region and charge state.
This structure stores statistical measures of ion mobility values collected from peptide identifications within a single RT region. These statistics are used for:
All values default to -1.0 to indicate missing/unavailable IM data.
| struct OpenMS::FeatureFinderIdentificationAlgorithm::RTRegion |
mapping: charge -> internal/external: (RT -> pointer to peptide)
|
protected |
|
protected |
|
protected |
mapping: sequence -> charge -> internal/external ID information
|
protected |
mapping: peptide ref. -> int./ext.: (RT -> pointer to peptide)
|
protected |
mapping: RT (not necessarily unique) -> pointer to peptide
default constructor
|
protected |
|
protected |
|
protected |
CAUTION: This method stores a pointer to the given peptide reference internally. Make sure the referenced peptide stays valid until this object is destroyed.
|
protected |
|
protected |
annotate identified features with m/z, isotope probabilities, etc.
|
protected |
|
protected |
Calculate global IM statistics from MS data and peptide identifications.
Uses MSExperiment::getMinMobility()/getMaxMobility() to get the full IM range from raw data (min/max), and calculates median from peptide identifications for robust central tendency. Must be called BEFORE addSeeds_() to ensure global statistics are based only on identified peptides.
Seeds may or may not have IM annotation depending on the feature finder. Seeds with IM annotation use their own IM value; seeds without IM are extracted across the full IM range of the dataset.
|
protected |
Calculate RT bounds with optional tolerance expansion.
|
protected |
|
inlineprotected |
Chunks an iterator range (allowing advance and distance) into batches of size batch_size. Last batch might be smaller.
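The chunking described above can be illustrated with a minimal stand-alone sketch. The function name `chunkRange` is hypothetical and this is not the OpenMS implementation; it only assumes, as the description states, that the iterator supports `std::advance` and `std::distance`, and that the last batch may be smaller than `batch_size`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-alone helper (not the OpenMS member chunk_):
// splits [range_from, range_to) into consecutive sub-ranges of at most
// batch_size elements. The last sub-range may be smaller.
template <typename It>
std::vector<std::pair<It, It> > chunkRange(It range_from, It range_to,
                                           const std::ptrdiff_t batch_size)
{
  assert(batch_size > 0); // a non-positive batch size would never terminate
  std::vector<std::pair<It, It> > chunks;
  std::ptrdiff_t remaining = std::distance(range_from, range_to);
  while (remaining > 0)
  {
    // take a full batch, or whatever is left at the end of the range
    const std::ptrdiff_t step = remaining < batch_size ? remaining : batch_size;
    It chunk_end = range_from;
    std::advance(chunk_end, step);
    chunks.push_back(std::make_pair(range_from, chunk_end));
    range_from = chunk_end;
    remaining -= step;
  }
  return chunks;
}
```

Each returned pair is a half-open range, so a caller can iterate from `first` to `second` of every chunk and process one batch at a time.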
|
protected |
Creates an assay library out of the peptide sequences and their RT elution windows. The PeptideMap is mutable since it is cleared on the go. Set clear_IDs to false to keep IDs in the internal charge maps (only needed for debugging purposes).
|
protected |
|
protected |
|
protected |
generate transitions (isotopic traces) for a peptide ion and add them to the library:
| PeakMap& getChromatograms | ( | ) |
| const PeakMap& getChromatograms | ( | ) | const |
| TargetedExperiment& getLibrary | ( | ) |
| const TargetedExperiment& getLibrary | ( | ) | const |
| PeakMap& getMSData | ( | ) |
Referenced by NuXLRTPrediction::train().
| const PeakMap& getMSData | ( | ) | const |
| ProgressLogger& getProgressLogger | ( | ) |
| const ProgressLogger& getProgressLogger | ( | ) | const |
Calculate ion mobility statistics for peptide identifications in an RT region.
Computes median, min, and max IM values from peptide identifications within the given RT region (across all charge states). Individual IDs lacking IM annotation are skipped (with warning), and statistics are calculated from the remaining IDs with valid IM data. The median is used for robust central tendency estimation and is more resistant to outliers than the mean.
Seeds from untargeted feature finders may or may not have an IM meta value set, depending on the feature finder. If IM is annotated on the seed, it is used for targeted extraction. If not, the seed is extracted across the full IM range (ChromatogramExtractor disables IM filtering when ion_mobility < 0).
Note: RT region boundaries are determined from ALL IDs (including those without IM), so this only affects IM statistics calculation, not RT extraction.
| r | RT region containing peptide identifications grouped by charge state |
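As an illustration of the statistics described above, the following stand-alone sketch computes median, minimum and maximum while skipping entries without IM data. It is not the OpenMS code: the type `ImSummary` and the function `summarizeIM` are hypothetical, missing values are assumed to be encoded as negative numbers (mirroring the -1.0 default of IMStats), and averaging the two middle values for an even count is an assumed convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary type; all fields default to -1.0 ("no IM data").
struct ImSummary
{
  double median = -1.0;
  double min = -1.0;
  double max = -1.0;
};

// Hypothetical helper: summarizes IM values, ignoring negative entries,
// which this sketch treats as "no IM annotation".
ImSummary summarizeIM(std::vector<double> values)
{
  values.erase(std::remove_if(values.begin(), values.end(),
                              [](double v) { return v < 0.0; }),
               values.end());
  ImSummary s;
  if (values.empty()) return s; // nothing usable: keep the -1.0 defaults

  std::sort(values.begin(), values.end());
  s.min = values.front();
  s.max = values.back();
  const std::size_t n = values.size();
  // assumed convention: mean of the two middle values for an even count
  s.median = (n % 2 == 1) ? values[n / 2]
                          : 0.5 * (values[n / 2 - 1] + values[n / 2]);
  assert(s.min <= s.median && s.median <= s.max);
  return s;
}
```

A single outlier moves the maximum but leaves the median untouched, which is the robustness property the description refers to.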
|
protected |
get regions in which a peptide elutes (ideally only one) by clustering RT elution times
|
protected |
|
staticprotected |
Helper function to check if a peptide hit is a seed pseudo-ID.
|
protected |
|
protected |
| void run | ( | PeptideIdentificationList | peptides, |
| const std::vector< ProteinIdentification > & | proteins, | ||
| PeptideIdentificationList | peptides_ext, | ||
| std::vector< ProteinIdentification > | proteins_ext, | ||
| FeatureMap & | features, | ||
| const FeatureMap & | seeds = FeatureMap(), |
||
| const String & | spectra_file = "" |
||
| ) |
Main method for the actual feature finding. External IDs (peptides_ext, proteins_ext) may be empty, in which case no machine learning or FDR estimation will be performed. Optional seeds, e.g. from untargeted FeatureFinders, can be added with seeds. Results will be written to features. Note: The primaryMSRunPath of features will be updated to the primaryMSRunPath stored in the MSExperiment. If that path is not a valid and readable mzML, spectra_file will be annotated as a fall-back. Caution: peptide IDs will be shrunk to the best hit, FFid metavalues will be added, and potential seed IDs will be added.
FAIMS data is handled automatically: if the MS data contains multiple FAIMS compensation voltages, each CV group is processed independently (with peptide IDs filtered by FAIMS_CV) and results are combined with FAIMS_CV annotation on features. IDs without FAIMS_CV annotation are included in all groups for backward compatibility. For multi-FAIMS data, getLibrary() returns an empty library since each FAIMS group has its own assay library.
Referenced by NuXLRTPrediction::train().
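The FAIMS grouping rule described above (IDs are filtered by compensation voltage, and IDs without a FAIMS_CV annotation are included in every group) can be sketched in isolation as follows. This is an illustrative assumption, not the OpenMS implementation: `ToyId` and `groupByFaimsCV` are hypothetical names, and compensation voltages are compared for exact equality for simplicity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a peptide identification.
struct ToyId
{
  std::string sequence;
  bool has_cv;     // false: no FAIMS_CV annotation on this ID
  double faims_cv; // only meaningful if has_cv is true
};

// Hypothetical helper: assigns IDs to the compensation-voltage groups found
// in the data. Unannotated IDs go into all groups; annotated IDs only into
// the group with a matching CV (IDs with a CV not in 'cvs' are dropped).
std::map<double, std::vector<ToyId> > groupByFaimsCV(const std::vector<ToyId>& ids,
                                                     const std::vector<double>& cvs)
{
  std::map<double, std::vector<ToyId> > groups;
  for (double cv : cvs) groups[cv]; // create one (initially empty) group per CV

  for (const ToyId& id : ids)
  {
    for (auto& group : groups)
    {
      if (!id.has_cv || id.faims_cv == group.first)
      {
        group.second.push_back(id);
      }
    }
  }
  return groups;
}
```

Each group would then be processed independently and the per-group results merged, which is why a single shared assay library is not meaningful for multi-FAIMS data.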
| void runOnCandidates | ( | FeatureMap & | features | ) |
|
protected |
Core processing logic for a single dataset (non-FAIMS data or a single FAIMS group). Called by run() either directly or once for each FAIMS CV group.
| void setMSData | ( | const PeakMap & | ms_data | ) |
set the MS data used for feature detection
| void setMSData | ( | PeakMap && | ms_data | ) |
|
protected |
some statistics on detected features
|
overrideprotectedvirtual |
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
|
protected |
Helper functions for run()
|
protected |
non-zero if, for every feature, an additional offset feature should be extracted
|
protected |
number of peptides to process at the same time during chromatogram extraction
|
protected |
|
protected |
accumulated chromatograms (XICs)
|
protected |
|
protected |
|
protected |
Handler for external peptide IDs.
|
protected |
OpenSWATH feature finder.
|
protected |
|
protected |
|
protected |
|
protected |
Global ion mobility statistics from all peptide identifications.
Calculated from peptide identifications BEFORE seeds are added (ensuring we only learn from real IDs with IM annotation). Provides context for the typical IM range in the dataset.
Ion mobility statistics per peptide reference (peptide sequence/charge:region)
Maps from full peptide reference (e.g., "PEPTIDE/2:1") to IM statistics. Populated during createAssayLibrary_() and used during annotateFeatures_() to add IM_median, IM_min, and IM_max meta-values to features.
|
protected |
min. isotope probability for peptide assay
|
protected |
isotope probabilities of transitions
|
protected |
assays for peptides (cleared per chunk during processing)
|
protected |
RT tolerance for mapping IDs to features.
|
protected |
|
protected |
input LC-MS data
|
protected |
m/z window width
|
protected |
m/z window width is given in PPM (not Da)?
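To make the ppm/Da distinction concrete, the sketch below converts a window width to Dalton for a given m/z. The function `windowWidthInDa` is hypothetical and only illustrates the standard ppm definition (parts per million of the m/z value). How the algorithm applies the window around the target m/z is not specified here.

```cpp
#include <cassert>

// Hypothetical helper: returns the window width in Da.
// If window_is_ppm is true, 'window' is interpreted relative to 'mz'
// (1 ppm = mz * 1e-6); otherwise it is already an absolute width in Da.
double windowWidthInDa(double window, bool window_is_ppm, double mz)
{
  assert(mz > 0.0); // a relative width is only defined for positive m/z
  return window_is_ppm ? mz * window * 1e-6 : window;
}
```

For example, a 10 ppm window at m/z 500 corresponds to roughly 0.005 Da, whereas the same 10 ppm at m/z 1000 corresponds to roughly 0.01 Da, so a ppm window widens with increasing m/z.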
|
protected |
external feature counter (for FDR calculation)
|
protected |
number of external peptides
|
protected |
internal feature counter (for FDR calculation)
|
protected |
number of internal peptides
|
protected |
number of isotopes for peptide assay
|
protected |
accumulated assays for output (populated from library_ before clearing)
|
protected |
RT window width.
|
protected |
extraction window used for seeds (smaller than rt_window_ as we know the exact apex positions)
|
protected |
|
protected |
|
protected |
number of partitions for SVM cross-validation
|
protected |
number of samples for SVM training
|
protected |
|
protected |
SVM probabilities for "external" features (for FDR calculation):
SVM probability -> number of pos./neg. features (for FDR calculation):