OpenMS
`#include <OpenMS/CONCEPT/ProgressLogger.h>`
`#include <OpenMS/DATASTRUCTURES/DefaultParamHandler.h>`
`#include <OpenMS/ANALYSIS/ID/FragmentIndex.h>`
`#include <OpenMS/ANALYSIS/ID/OpenSearchModificationAnalysis.h>`
`#include <OpenMS/CHEMISTRY/EnzymaticDigestion.h>`
`#include <OpenMS/CHEMISTRY/ModifiedPeptideGenerator.h>`
`#include <OpenMS/FORMAT/FASTAFile.h>`
`#include <OpenMS/KERNEL/MSExperiment.h>`
`#include <OpenMS/METADATA/PeptideIdentificationList.h>`
`#include <algorithm>`
`#include <iosfwd>`
`#include <map>`
`#include <string>`
`#include <utility>`
`#include <vector>`
Classes

| Kind | Name | Description |
|---|---|---|
| class | ProSEAlgorithm | Fragment-index-based peptide database search algorithm (experimental). |
| struct | ProSEAlgorithm::RunStatistics | Per-run identification statistics for the end-of-search report. |
| struct | ProSEAlgorithm::SharedSearchStats | Configuration, database and fragment-index facts shared across all input files of one ProSE invocation. |
| struct | ProSEAlgorithm::SearchResult | Comprehensive search result including modification analysis. |
| struct | ProSEAlgorithm::MultiFileSearchResult | Multi-file search result bundle. |
| struct | ProSEAlgorithm::SearchContext | Prepared per-database state shared across multiple spectrum files. |
| struct | ProSEAlgorithm::AnnotatedHit_ | Slimmer structure, used because storing all scored candidates in PeptideHit objects takes too much space. |
| struct | ProSEAlgorithm::DecoyStrategy_ | Resolved decoy handling for one concrete input database. |
| struct | ProSEAlgorithm::CalibrationResult_ | Result of a calibration pass. |
Namespaces

| Kind | Name | Description |
|---|---|---|
| namespace | OpenMS | Main OpenMS namespace. |
| struct OpenMS::ProSEAlgorithm::RunStatistics |
Per-run identification statistics for the end-of-search report.
Populated by collectRunStatistics_() once a single spectrum file has been searched (post-FDR), plus a few fields captured at well-defined points during search() (target/decoy counts pre-FDR, achieved q-value, timing). All counts refer to one input file. Cross-file/shared facts (database, fragment index, configuration) live in SharedSearchStats instead.
Class Members

| Type | Member | Description |
|---|---|---|
| double | achieved_psm_fdr = -1.0 | max retained q-value after FDR (<0 = n/a) |
| map< Int, Size > | charge_histogram | precursor charge -> PSM count |
| Size | decoy_psms = 0 | decoy PSMs in the final IDs (after FDR, if applied) |
| bool | fdr_applied = false | true if PSM-level FDR filtering ran |
| double | frag_err_mad = 0.0 | |
| double | frag_err_median = 0.0 | |
| double | frag_err_recommended = 0.0 | |
| bool | frag_tol_valid = false | true if fragment-error estimate present |
| double | hyperscore_max = 0.0 | |
| double | hyperscore_median = 0.0 | |
| double | hyperscore_min = 0.0 | |
| string | input_file | spectrum file this run searched (basename or path) |
| Size | matched_spectra = 0 | spectra with >=1 retained PSM in the final IDs (after FDR, if applied) |
| map< Size, Size > | missed_cleavage_histogram | missed cleavages -> PSM count |
| Size | ms2_spectra = 0 | number of MS2 spectra in the input |
| double | prec_err_mad = 0.0 | |
| double | prec_err_median = 0.0 | |
| double | prec_err_recommended = 0.0 | |
| bool | prec_tol_valid = false | true if precursor-error estimate present |
| bool | score_stats_valid = false | true if the hyperscore_* fields are meaningful |
| double | seconds_calibration = 0.0 | calibration pass wall time (0 if disabled) |
| double | seconds_fdr = 0.0 | FDR filtering wall time (0 if not applied) |
| double | seconds_search = 0.0 | scoring + post-processing wall time |
| Size | target_psms = 0 | target PSMs in the final IDs (after FDR, if applied) |
| Size | unique_peptides = 0 | distinct peptide sequences among top hits |
| Size | unique_proteins = 0 | distinct protein accessions among top hits |
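
Several members above are only meaningful when a companion flag or sentinel says so (`achieved_psm_fdr < 0` means "not available", `score_stats_valid` guards the `hyperscore_*` fields). A minimal sketch of how a caller might respect those guards when printing a report is shown below. `RunStatsSketch` is a simplified stand-in that copies a few field names from the table, and `summarizeRun` is a hypothetical helper; neither is part of OpenMS.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Simplified stand-in for ProSEAlgorithm::RunStatistics (subset of fields only).
struct RunStatsSketch
{
  std::string input_file;
  std::size_t ms2_spectra = 0;
  std::size_t matched_spectra = 0;
  bool fdr_applied = false;
  double achieved_psm_fdr = -1.0; // < 0 means "not available"
  bool score_stats_valid = false;
  double hyperscore_median = 0.0;
};

// Hypothetical helper: builds a one-line summary and skips every field
// whose validity flag or sentinel marks it as not meaningful.
std::string summarizeRun(const RunStatsSketch& s)
{
  std::ostringstream out;
  out << s.input_file << ": " << s.matched_spectra << "/" << s.ms2_spectra
      << " spectra matched";
  if (s.fdr_applied && s.achieved_psm_fdr >= 0.0)
  {
    out << ", max q-value " << s.achieved_psm_fdr;
  }
  if (s.score_stats_valid)
  {
    out << ", median hyperscore " << s.hyperscore_median;
  }
  return out.str();
}
```

Checking the flag before reading the value avoids reporting the default `0.0` as if it were a measured statistic.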
| struct OpenMS::ProSEAlgorithm::SharedSearchStats |
Configuration, database and fragment-index facts shared across all input files of one ProSE invocation.
Computed once (the fragment index is built once and reused), so these costs/counts must NOT be summed per file. Populated by the multi-file searchWithModificationAnalysis() overloads.
Class Members

| Type | Member | Description |
|---|---|---|
| bool | calibration_enabled = false | |
| bool | chunked = false | |
| string | database_file | FASTA path (empty for in-memory db) |
| Size | db_decoy_proteins = 0 | decoy entries in the searched (augmented) db |
| Size | db_target_proteins = 0 | target entries in the searched (augmented) db |
| string | decoy_mode | "generated", "external", or "none (target-only)" |
| string | enzyme | |
| vector< string > | fixed_mods | |
| double | fragment_tol = 0.0 | |
| string | fragment_tol_unit | |
| Size | indexed_fragments = 0 | theoretical fragments in the index (summed over chunks) |
| Size | indexed_peptides = 0 | peptides in the fragment index (summed over chunks) |
| vector< string > | ion_series | |
| Int | max_charge = 0 | |
| Int | min_charge = 0 | |
| Size | missed_cleavages = 0 | |
| bool | open_search = false | |
| double | precursor_tol_lower = 0.0 | |
| string | precursor_tol_unit | |
| double | precursor_tol_upper = 0.0 | |
| double | protein_fdr_threshold = 0.0 | |
| double | psm_fdr_threshold = 0.0 | |
| double | seconds_index_build = 0.0 | decoy generation + fragment index build wall time |
| double | seconds_total = 0.0 | whole-search wall time (set by the caller) |
| bool | snes_mode = false | |
| vector< string > | variable_mods | |
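
Because the index is built once, a report that adds up timings should count `seconds_index_build` a single time and only sum the per-file fields. The sketch below illustrates that accounting with simplified stand-in structs; `SharedTimingSketch`, `RunTimingSketch` and `accountedSeconds` are hypothetical names, and the real `seconds_total` is set by the caller rather than derived this way.

```cpp
#include <vector>

// Stand-in for the timing field of ProSEAlgorithm::SharedSearchStats.
struct SharedTimingSketch
{
  double seconds_index_build = 0.0; // paid once per invocation
};

// Stand-in for the timing fields of ProSEAlgorithm::RunStatistics.
struct RunTimingSketch
{
  double seconds_calibration = 0.0;
  double seconds_search = 0.0;
  double seconds_fdr = 0.0;
};

// Hypothetical helper: shared cost is added once, per-file costs are summed.
double accountedSeconds(const SharedTimingSketch& shared,
                        const std::vector<RunTimingSketch>& runs)
{
  double total = shared.seconds_index_build; // NOT multiplied by runs.size()
  for (const RunTimingSketch& r : runs)
  {
    total += r.seconds_calibration + r.seconds_search + r.seconds_fdr;
  }
  return total;
}
```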
| struct OpenMS::ProSEAlgorithm::SearchResult |
Comprehensive search result including modification analysis.
This structure contains all outputs from an open search; see the members listed below.
Class Members

| Type | Member | Description |
|---|---|---|
| ExitCodes | exit_code = ExitCodes::EXECUTION_OK | |
| bool | is_open_search = false | |
| OpenSearchAnalysisResult | modification_analysis | |
| PeptideIdentificationList | peptide_ids | |
| vector< ProteinIdentification > | protein_ids | |
| RunStatistics | stats | |
| struct OpenMS::ProSEAlgorithm::MultiFileSearchResult |
Multi-file search result bundle.
Returned by the file-list searchWithModificationAnalysis() overloads. Holds one SearchResult per input file (in per_file, in input order) and a single aggregate result whose peptide_ids are the concatenation of all per-file PSMs and whose modification_analysis is computed once on the pooled set of PSMs.
Special cases for aggregate:
- Single input file: aggregate is left almost empty (only is_open_search and exit_code are set), because a pooled aggregate over one file would just duplicate per_file[0] and re-run modification analysis on the same PSMs. Callers should use per_file[0] for the result in this case.
- aggregate.exit_code is set to the first non-OK per-file exit code, so callers can inspect it without walking the per_file vector.
- The aggregate's protein_ids template is taken from the first successful per-file result (search parameters are identical across files by construction), with the primary MS run path overwritten to list every input file.
Class Members

| Type | Member | Description |
|---|---|---|
| SearchResult | aggregate | |
| bool | decoy_is_prefix = true | Position of decoy_string (true = prefix, false = suffix). |
| string | decoy_string | Effective decoy marker resolved from the shared database, for a caller-side merged-PSM protein-FDR step (e.g. the ProSE TOPP tool's -out_merged path). Empty when the search was target-only (decoys=ignore). |
| bool | have_decoys = false | True when the searched databases contained decoys (FDR possible). |
| vector< SearchResult > | per_file | |
| SharedSearchStats | shared | Configuration, database and fragment-index facts shared across all input files (the index is built once and reused), for the end-of-search report. |
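
The special cases for aggregate suggest a small amount of caller-side logic: with exactly one input file, read per_file[0], otherwise read aggregate. A minimal sketch under that reading follows. All types are simplified stand-ins, `HYPOTHETICAL_ERROR` is an invented enumerator (only `EXECUTION_OK` appears in the documentation above), and `n_psms` replaces the real `peptide_ids` list for brevity.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for ExitCodes; only EXECUTION_OK is taken from the documentation.
enum class ExitCodeSketch { EXECUTION_OK, HYPOTHETICAL_ERROR };

// Stand-in for ProSEAlgorithm::SearchResult.
struct SearchResultSketch
{
  ExitCodeSketch exit_code = ExitCodeSketch::EXECUTION_OK;
  std::size_t n_psms = 0; // placeholder for peptide_ids.size()
};

// Stand-in for ProSEAlgorithm::MultiFileSearchResult.
struct MultiFileResultSketch
{
  std::vector<SearchResultSketch> per_file;
  SearchResultSketch aggregate;
};

// With a single input file the aggregate is almost empty, so use per_file[0].
const SearchResultSketch& resultToReport(const MultiFileResultSketch& r)
{
  return r.per_file.size() == 1 ? r.per_file.front() : r.aggregate;
}

// Mirrors the documented rule for aggregate.exit_code:
// the first non-OK per-file exit code, or EXECUTION_OK if every file succeeded.
ExitCodeSketch firstNonOkExitCode(const std::vector<SearchResultSketch>& per_file)
{
  for (const SearchResultSketch& r : per_file)
  {
    if (r.exit_code != ExitCodeSketch::EXECUTION_OK) return r.exit_code;
  }
  return ExitCodeSketch::EXECUTION_OK;
}
```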
| struct OpenMS::ProSEAlgorithm::SearchContext |
Prepared per-database state shared across multiple spectrum files.
Holds the (decoy-augmented) protein database and the built FragmentIndex so that searching N spectrum files against the same FASTA pays the index build cost only once. Construct via prepareContext() and pass to the context-taking search() overload.
Class Members

| Type | Member | Description |
|---|---|---|
| vector< FASTAEntry > | db | |
| bool | decoy_is_prefix = true | Position of decoy_string (true = prefix, false = suffix). |
| string | decoy_string | Effective decoy marker carried by |
| FragmentIndex | fragment_index | |
| bool | have_decoys = false | True when |
| bool | release_fragment_index_after_scoring = false | When true, the context-taking search() overload will release |
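
The prepare-once, search-many pattern described for SearchContext can be illustrated with a toy model. The sketch below is not the OpenMS API: `ContextSketch`, `prepareContextSketch` and `searchSketch` are hypothetical stand-ins, a sorted vector of strings plays the role of the FragmentIndex, and the release behaviour is only an assumption inferred from the member name `release_fragment_index_after_scoring`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for ProSEAlgorithm::SearchContext.
struct ContextSketch
{
  std::vector<std::string> db;             // stand-in for vector<FASTAEntry>
  std::vector<std::string> fragment_index; // stand-in for FragmentIndex
  bool release_fragment_index_after_scoring = false;
};

// Builds the (toy) index once; this is the expensive step to avoid repeating.
ContextSketch prepareContextSketch(const std::vector<std::string>& peptides)
{
  ContextSketch ctx;
  ctx.db = peptides;
  ctx.fragment_index = peptides;
  std::sort(ctx.fragment_index.begin(), ctx.fragment_index.end());
  return ctx;
}

// Scores one "file" of queries against the shared index and returns the hit count.
std::size_t searchSketch(ContextSketch& ctx, const std::vector<std::string>& queries)
{
  std::size_t hits = 0;
  for (const std::string& q : queries)
  {
    if (std::binary_search(ctx.fragment_index.begin(), ctx.fragment_index.end(), q))
    {
      ++hits;
    }
  }
  if (ctx.release_fragment_index_after_scoring)
  {
    // Assumed behaviour: free the index memory once scoring is done.
    ctx.fragment_index.clear();
    ctx.fragment_index.shrink_to_fit();
  }
  return hits;
}
```

In this model a caller would build the context once, loop over the input files, and set the release flag only before the last search, since the index is no longer usable afterwards.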
| struct OpenMS::ProSEAlgorithm::DecoyStrategy_ |
Resolved decoy handling for one concrete input database.
Produced by resolveDecoyStrategy_() and consumed by buildDecoyAugmentedDB_() and the downstream PeptideIndexing / FDR steps, so the same decoys that are searched are also the ones scored.
Class Members

| Type | Member | Description |
|---|---|---|
| string | decoy_string | effective marker for PeptideIndexing + protein FDR |
| bool | generate {false} | reverse target proteins to synthesise decoys |
| bool | have_decoys {false} | searched DB will contain decoys (FDR possible) |
| bool | is_prefix {true} | position of decoy_string |
| bool | strip_existing {false} | drop pre-existing decoy entries before searching |
| bool | strip_is_prefix {true} | position of strip_string |
| string | strip_string | marker of pre-existing decoys to strip |
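
To make the marker fields concrete, the sketch below shows one way a decoy marker could be tested as a prefix or suffix of an accession, and how a decoy sequence could be synthesised by reversing a target sequence (the `generate` case). The function names and the example markers are assumptions for illustration; the actual logic in resolveDecoyStrategy_() and buildDecoyAugmentedDB_() is not shown in this documentation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `accession` carries `decoy_string`
// at the front (is_prefix == true) or at the end (is_prefix == false).
bool hasDecoyMarker(const std::string& accession,
                    const std::string& decoy_string,
                    bool is_prefix)
{
  if (decoy_string.empty() || accession.size() < decoy_string.size())
  {
    return false; // an empty marker means target-only, nothing can match
  }
  const std::size_t pos = is_prefix ? 0 : accession.size() - decoy_string.size();
  return accession.compare(pos, decoy_string.size(), decoy_string) == 0;
}

// Hypothetical helper: reversed copy of a target protein sequence.
std::string makeReversedDecoy(std::string sequence)
{
  std::reverse(sequence.begin(), sequence.end());
  return sequence;
}
```

Using the same marker and position for both generation and later detection is what keeps the searched decoys and the scored decoys identical.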
| struct OpenMS::ProSEAlgorithm::CalibrationResult_ |
Result of a calibration pass.
Holds the estimated precursor and fragment tolerances computed from confident PSMs during the calibration pass. When success is false, the tolerance values are undefined and should not be used.
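
Since the tolerances are undefined when success is false, a caller should fall back to the configured values in that case. A minimal sketch of that guard follows. Only `success` is named in the documentation; `precursor_tol`, `fragment_tol`, `TolerancesSketch` and `effectiveTolerances` are hypothetical names chosen for illustration.

```cpp
// Stand-in for ProSEAlgorithm::CalibrationResult_.
// Only `success` is documented; the tolerance field names are assumptions.
struct CalibrationSketch
{
  bool success = false;
  double precursor_tol = 0.0;
  double fragment_tol = 0.0;
};

struct TolerancesSketch
{
  double precursor;
  double fragment;
};

// Hypothetical helper: use calibrated tolerances only after a successful pass,
// otherwise keep the user-configured ones.
TolerancesSketch effectiveTolerances(const CalibrationSketch& calibration,
                                     const TolerancesSketch& configured)
{
  if (!calibration.success)
  {
    return configured; // calibrated values are undefined, do not read them
  }
  return TolerancesSketch{calibration.precursor_tol, calibration.fragment_tol};
}
```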