OpenMS
Generates, from a set of FASTA files, a 2D data structure that stores the theoretical masses of all b and y ions of all peptides generated from the FASTA input. The data structure is built such that on one axis the fragments are sorted by their own mass and on the other axis by the mass of their precursor/protein. The fragment index (FI) has two modes: bottom-up and top-down. In the latter, digestion is skipped and the fragments refer directly to the mass of the proteins instead of digested peptides. More...
#include <OpenMS/ANALYSIS/ID/FragmentIndex.h>
Classes | |
| struct | Fragment |
| One entry in the fragment index. More... | |
| struct | Hit |
| struct | IonOffsets |
| Precomputed ion-type mass offsets (from Residue::getInternalTo*Ion formulas) More... | |
| struct | ModSlot |
| A candidate modification slot for a specific peptide. More... | |
| struct | Peptide |
| Compact descriptor of a peptide instance held by the FragmentIndex. More... | |
| struct | SpectrumMatch |
| Match between a query peak and an entry in the DB. More... | |
| struct | SpectrumMatchesTopN |
| Container for SpectrumMatch. Also keeps count of the total number of candidates and the total number of matches. More... | |
| struct | VarModEntry |
| Entry in the per-AA variable modification lookup table. More... | |
Public Member Functions | |
| FragmentIndex () | |
| Default constructor. | |
| ~FragmentIndex () override=default | |
| Default destructor. | |
| bool | isBuild () const |
| Indicates whether the fragment index has been built. | |
| const std::vector< Peptide > & | getPeptides () const |
| Returns a reference to the internal peptide container. | |
| Size | getNumFragments () const noexcept |
| Number of theoretical fragments stored in the index (0 before build()). | |
| void | build (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) |
| Given a set of Fasta files, builds the Fragment Index datastructure (FID). First all fragments are sorted by their own mass. Next they are placed in buckets. The min-fragment mass is stored for each bucket, whereupon the fragments are sorted within the buckets by their originating precursor mass. | |
| void | clear () |
| Deletes the fragment index and sets is_build_ to false. | |
| std::pair< size_t, size_t > | getPeptidesInMassWindow (float precursor_mass, const std::pair< float, float > &window) const |
Public Member Functions inherited from DefaultParamHandler | |
| DefaultParamHandler (const std::string &name) | |
| Constructor with name that is displayed in error messages. | |
| DefaultParamHandler (const DefaultParamHandler &rhs) | |
| Copy constructor. | |
| virtual | ~DefaultParamHandler () |
| Destructor. | |
| DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
| Assignment operator. | |
| virtual bool | operator== (const DefaultParamHandler &rhs) const |
| Equality operator. | |
| void | setParameters (const Param &param) |
| Sets the parameters. | |
| const Param & | getParameters () const |
| Non-mutable access to the parameters. | |
| const Param & | getDefaults () const |
| Non-mutable access to the default parameters. | |
| const std::string & | getName () const |
| Non-mutable access to the name. | |
| void | setName (const std::string &name) |
| Mutable access to the name. | |
| const std::vector< std::string > & | getSubsections () const |
| Non-mutable access to the registered subsections. | |
Static Public Member Functions | |
| static bool | isOpenSearchMode (double lower_magnitude, double upper_magnitude, bool unit_ppm) noexcept |
Static Public Member Functions inherited from DefaultParamHandler | |
| static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const std::string &key_prefix="") |
| Writes all parameters to meta values. | |
SNES (Speedy Non-specific Enzyme Search) bit encoding | |
When the index is built in SNES mode (isSnesMode), a Peptide entry represents a mother peptide: the longest peptide anchored at one terminus of the protein. Bit 31 of the modification bitmask (SNES_KIND_BIT_MASK) marks a Single-C mother when set; SNES_SLOT_MASK selects bits 0..30. |
| enum class | SnesAnchor { NONE , PROT_NTERM , PROT_CTERM } |
| static constexpr uint32_t | SNES_KIND_BIT_MASK = 1u << 31 |
| bit 31; set = Single-C mother | |
| static constexpr uint32_t | SNES_SLOT_MASK = ~SNES_KIND_BIT_MASK |
| bits 0..30 in SNES mode | |
| bool | is_build_ {false} |
| true, if the database has been populated with fragments | |
| std::array< double, 128 > | fixed_mod_deltas_ {} |
| Per-AA fixed modification delta mass (0.0 if no fixed mod applies) | |
| std::array< const ResidueModification *, 128 > | fixed_mod_ptrs_ {} |
| Per-AA fixed modification pointer (nullptr if none) | |
| double | fixed_nterm_delta_ {0.0} |
| Fixed N-terminal mod delta (0 if none) | |
| double | fixed_cterm_delta_ {0.0} |
| Fixed C-terminal mod delta (0 if none) | |
| const ResidueModification * | fixed_nterm_mod_ptr_ {nullptr} |
| const ResidueModification * | fixed_cterm_mod_ptr_ {nullptr} |
| std::array< std::vector< VarModEntry >, 128 > | variable_mod_table_ {} |
| Per-AA variable modification table: for each ASCII char, list of possible variable mods. | |
| std::vector< VarModEntry > | variable_nterm_mods_ |
| Pure N-terminal variable mods (not residue-specific) | |
| std::vector< VarModEntry > | variable_cterm_mods_ |
| Pure C-terminal variable mods (not residue-specific) | |
| bool | mod_tables_initialized_ {false} |
| bool | is_snes_mode_ {false} |
| bool | snes_enabled_ {false} |
| std::vector< double > | snes_sigma_delta_set_ |
| std::vector< double > | snes_sigma_delta_set_with_prot_nterm_ |
| std::vector< double > | snes_sigma_delta_set_with_prot_cterm_ |
| std::vector< Peptide > | fi_peptides_ |
| vector of all (digested) peptides | |
| std::vector< Fragment > | fi_fragments_ |
| vector of all theoretical fragments (b- and y- ions) | |
| std::vector< uint32_t > | protein_lengths_ |
| float | fragment_min_mz_ |
| smallest fragment mz | |
| float | fragment_max_mz_ |
| largest fragment mz | |
| size_t | min_ion_index_ {0} |
| skip ions below this index (0=all, 2=skip b1/b2/y1/y2) | |
| size_t | bucketsize_ |
| number of fragments per outer node | |
| std::vector< float > | bucket_min_mz_ |
| vector of the smallest fragment mz of each bucket | |
| double | precursor_mass_tolerance_lower_ {20.0} |
| positive magnitude, effective lower bound is -lower | |
| double | precursor_mass_tolerance_upper_ {20.0} |
| positive magnitude, effective upper bound is +upper | |
| bool | precursor_mass_tolerance_unit_ppm_ {true} |
| float | fragment_mz_tolerance_ |
| bool | fragment_mz_tolerance_unit_ppm_ {true} |
| static constexpr size_t | MAX_MOD_SLOTS = 32 |
| max variable mod slots per peptide (uint32_t bitmask) | |
| static std::array< double, 128 > | residue_mass_table_ |
| static std::once_flag | mass_table_once_flag_ |
| static IonOffsets | ion_offsets_ |
| bool | add_b_ions_ |
| bool | add_y_ions_ |
| bool | add_a_ions_ |
| bool | add_c_ions_ |
| bool | add_x_ions_ |
| bool | add_z_ions_ |
| std::string | digestion_enzyme_ |
| EnzymaticDigestion::Specificity | enzyme_specificity_ {EnzymaticDigestion::SPEC_FULL} |
| 'full' (default), 'semi' (semi-tryptic), or 'none' (e.g. immunopeptidomics) | |
| size_t | missed_cleavages_ |
| number of missed cleavages | |
| float | peptide_min_mass_ |
| float | peptide_max_mass_ |
| size_t | peptide_min_length_ |
| size_t | peptide_max_length_ |
| StringList | modifications_fixed_ |
| Modifications that are applied to all peptides. | |
| StringList | modifications_variable_ |
| Variable modifications; all possible combinations are created. | |
| size_t | max_variable_mods_per_peptide_ |
| uint16_t | min_matched_peaks_ |
| PSMs with fewer hits are discarded. | |
| int16_t | min_isotope_error_ |
| Minimal possible isotope error. | |
| int16_t | max_isotope_error_ |
| Maximal possible isotope error (both only used for closed search) | |
| uint16_t | min_precursor_charge_ |
| minimal possible precursor charge (usually 1) | |
| uint16_t | max_precursor_charge_ |
| maximal possible precursor charge | |
| uint16_t | max_fragment_charge_ |
| The maximal possible charge of the fragments. | |
| uint32_t | max_processed_hits_ |
| The number of PSMs that will be used; the rest are filtered out. | |
| bool | isSnesMode () const noexcept |
| std::vector< Hit > | query (const Peak1D &peak, const std::pair< size_t, size_t > &peptide_idx_range, uint16_t peak_charge) |
| Queries one peak. | |
| void | querySpectrum (const MSSpectrum &spectrum, SpectrumMatchesTopN &sms) |
| Queries one complete experimental spectrum against the database. Loops over all precursor charges, starting at min_precursor_charge and going up to max_precursor_charge. All peaks are queried once per precursor charge, each time with the corresponding precursor mass. | |
| void | querySpectrum (const MSSpectrum &spectrum, const std::vector< FASTAFile::FASTAEntry > &fasta_entries, SpectrumMatchesTopN &sms) |
| Query a spectrum against the fragment index with FASTA context. | |
| AASequence | reconstructModifiedSequence (const Peptide &peptide, const std::vector< FASTAFile::FASTAEntry > &fasta_entries) const |
| Reconstruct a fully modified AASequence from a Peptide's bitmask. | |
| int | realizeSNESLength (const Peptide &mother, const std::vector< FASTAFile::FASTAEntry > &fasta_entries, double target_mh_plus, double tolerance_lower_magnitude, double tolerance_upper_magnitude, bool tolerance_ppm) const |
| Find the realized sub-peptide length of a SNES mother that best matches the observed precursor mass. | |
| AASequence | reconstructRealizedSubSequence (const Peptide &mother, const std::vector< FASTAFile::FASTAEntry > &fasta_entries, size_t realized_length, uint32_t subset_bitmask=0) const |
| static bool | isSingleCMother (uint32_t mod_bitmask) noexcept |
| static bool | isSingleNMother (uint32_t mod_bitmask) noexcept |
| void | updateMembers_ () override |
| This method is used to update extra member variables at the end of the setParameters() method. | |
| void | generatePeptides (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) |
| Generates all peptides from the given FASTA entries. If bottom-up is set to false, digestion is skipped; if it is set to true, the digestion enzyme can be set in the parameters. Additionally introduces fixed and variable modifications for restrictive PSM search. | |
| void | generateSNESMothers_ (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) |
| SNES-mode peptide enumeration: emit Single-N + Single-C mother peptides. | |
| void | initModificationTables_ () |
| size_t | buildModSlots_ (const char *sequence, size_t seq_len, ModSlot *out_slots, bool is_protein_nterm=false, bool is_protein_cterm=false) const |
| std::vector< double > | computeSnesSigmaDeltaSet_ (bool include_prot_nterm_mods, bool include_prot_cterm_mods) const |
| void | generateFragmentsLightweight_ (std::vector< Fragment > &fragments, const char *sequence, size_t seq_len, UInt32 peptide_idx, double n_term_mod_mass, double c_term_mod_mass, const double *residue_mod_masses) const |
| void | generateFragmentsForSeries_ (std::vector< Fragment > &fragments, const char *sequence, size_t seq_len, UInt32 peptide_idx, double n_term_mod_mass, double c_term_mod_mass, const double *residue_mod_masses, bool add_b, bool add_a, bool add_c, bool add_y, bool add_x, bool add_z) const |
| static void | initResidueMassTable_ () |
| void | querySpectrumSNES_ (const MSSpectrum &spectrum, const std::vector< FASTAFile::FASTAEntry > &fasta_entries, SpectrumMatchesTopN &sms) |
| SNES-mode spectrum query (MetaMorpheus-style: byte-count + b-ion filter). | |
| void | queryPeaks (SpectrumMatchesTopN &candidates, const MSSpectrum &spectrum, const std::pair< size_t, size_t > &candidates_range, const int16_t isotope_error, const uint16_t precursor_charge) |
| Queries the peaks of a given experimental spectrum against a set range of potential peptides, with a given isotope error and precursor charge. Hits are transferred into a PSM list. Technically an adapter between query(...) and openSearch(...)/searchDifferentPrecursorRanges(...). | |
| void | searchDifferentPrecursorRanges (const MSSpectrum &spectrum, float precursor_mass, SpectrumMatchesTopN &sms, uint16_t charge) |
| In a closed search, loops over all isotope errors and, in each iteration, over all peaks with queryPeaks. | |
| void | trimHits (SpectrumMatchesTopN &init_hits) const |
| Places the k largest elements at the front of the input array. Neither the k largest elements nor the remaining elements are sorted among themselves. | |
| bool | isOpenSearchMode_ () const noexcept |
| Instance delegate — same rule, reads the member bounds. | |
| std::pair< float, float > | computeMassWindow_ (float precursor_mass) const |
Additional Inherited Members | |
Protected Member Functions inherited from DefaultParamHandler | |
| void | defaultsToParam_ () |
| Updates the parameters after the defaults have been set in the constructor. | |
Protected Attributes inherited from DefaultParamHandler | |
| Param | param_ |
| Container for current parameters. | |
| Param | defaults_ |
| Container for default parameters. This member should be filled in the constructor of derived classes! | |
| std::vector< std::string > | subsections_ |
| Container for registered subsections. This member should be filled in the constructor of derived classes! | |
| std::string | error_name_ |
| Name that is displayed in error messages during the parameter checking. | |
| bool | check_defaults_ |
| If this member is set to false, no parameter checking is done. | |
| bool | warn_empty_defaults_ |
| If this member is set to false, no warning is emitted when defaults are empty. | |
Generates, from a set of FASTA files, a 2D data structure that stores the theoretical masses of all b and y ions of all peptides generated from the FASTA input. The data structure is built such that on one axis the fragments are sorted by their own mass and on the other axis by the mass of their precursor/protein. The fragment index (FI) has two modes: bottom-up and top-down. In the latter, digestion is skipped and the fragments refer directly to the mass of the proteins instead of digested peptides.
| struct OpenMS::FragmentIndex::IonOffsets |
Precomputed ion-type mass offsets (from Residue::getInternalTo*Ion formulas)
| Class Members | ||
|---|---|---|
| double | a_offset {0.0} | |
| double | b_offset {0.0} | |
| double | c_offset {0.0} | |
| double | x_offset {0.0} | |
| double | y_offset {0.0} | |
| double | z_offset {0.0} | |
| struct OpenMS::FragmentIndex::SpectrumMatch |
Match between a query peak and an entry in the DB.
| struct OpenMS::FragmentIndex::VarModEntry |
Entry in the per-AA variable modification lookup table.
| Class Members | ||
|---|---|---|
| double | delta_mass | mass delta from this modification |
| const ResidueModification * | mod_ptr | pointer to the modification (for AASequence reconstruction) |
| TermSpecificity | term_spec | where this mod can be applied |
| enum class SnesAnchor | strong |
SNES v1.1: constrain bin-walk hits to mothers with a specific protein anchor. Used to gate walks that enumerate PROTEIN_N_TERM / PROTEIN_C_TERM variable mods.
| Enumerator | |
|---|---|
| NONE | no anchor restriction (baseline walks) |
| PROT_NTERM | mother must have sequence_.first == 0 |
| PROT_CTERM | mother must have sequence_.first + sequence_.second == protein length |
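The enumerator conditions above can be read as a simple predicate. The following is a minimal sketch, not the library's code: `passesAnchor` is a hypothetical helper, and the `span` pair (start offset, length) merely stands in for the `sequence_` member mentioned in the table.

```cpp
#include <cstdint>
#include <utility>

// Mirrors the documented enumerators; everything else is illustrative.
enum class SnesAnchor { NONE, PROT_NTERM, PROT_CTERM };

// Hypothetical helper: span = (start offset in the protein, length).
inline bool passesAnchor(SnesAnchor anchor,
                         const std::pair<std::uint32_t, std::uint32_t>& span,
                         std::uint32_t protein_length)
{
  switch (anchor)
  {
    case SnesAnchor::NONE:       return true;                 // baseline walks
    case SnesAnchor::PROT_NTERM: return span.first == 0;      // starts at residue 0
    case SnesAnchor::PROT_CTERM: return span.first + span.second == protein_length;
  }
  return false;
}
```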
| FragmentIndex | ( | ) |
Default constructor.
Initializes an empty FragmentIndex. Call build() before using any query functions. After clear(), the index returns to this unbuilt state.
Thread-safety: constructing the object is thread-safe as long as the instance is not shared across threads before initialization completes.
| ~FragmentIndex () | override, default |
Default destructor.
Releases owned memory. If the index was built, all internal buffers and fragment buckets are freed. No exceptions are thrown.
| void build | ( | const std::vector< FASTAFile::FASTAEntry > & | fasta_entries | ) |
Given a set of Fasta files, builds the Fragment Index datastructure (FID). First all fragments are sorted by their own mass. Next they are placed in buckets. The min-fragment mass is stored for each bucket, whereupon the fragments are sorted within the buckets by their originating precursor mass.
| [in] | fasta_entries | The FASTA entries used to build the index. |
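The three build steps described above (global sort by fragment mass, bucketing with a stored minimum, per-bucket sort by precursor) can be sketched in a few lines. This is an illustration under stated assumptions, not the OpenMS implementation: `Frag`, `BucketedIndex` and `buildBucketedIndex` are hypothetical names, and the sketch assumes the peptide list is already sorted by precursor mass, so that ordering by `peptide_idx` equals ordering by precursor mass.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for FragmentIndex::Fragment.
struct Frag
{
  float mz;                 // theoretical fragment m/z
  std::size_t peptide_idx;  // index into a peptide list sorted by precursor mass
};

struct BucketedIndex
{
  std::vector<Frag> fragments;       // stored bucket by bucket
  std::vector<float> bucket_min_mz;  // smallest fragment m/z of each bucket
};

BucketedIndex buildBucketedIndex(std::vector<Frag> fragments, std::size_t bucket_size)
{
  BucketedIndex index;
  if (bucket_size == 0) return index;  // nothing sensible to build

  // Step 1: sort all fragments by their own m/z.
  std::sort(fragments.begin(), fragments.end(),
            [](const Frag& a, const Frag& b) { return a.mz < b.mz; });

  // Steps 2 and 3: cut into buckets, remember each bucket's minimum m/z,
  // then re-sort inside the bucket by precursor (via the peptide index).
  for (std::size_t begin = 0; begin < fragments.size(); begin += bucket_size)
  {
    const std::size_t end = std::min(begin + bucket_size, fragments.size());
    index.bucket_min_mz.push_back(fragments[begin].mz);
    std::sort(fragments.begin() + static_cast<std::ptrdiff_t>(begin),
              fragments.begin() + static_cast<std::ptrdiff_t>(end),
              [](const Frag& a, const Frag& b) { return a.peptide_idx < b.peptide_idx; });
  }
  index.fragments = std::move(fragments);
  return index;
}
```

After this layout, a lookup needs one binary search over `bucket_min_mz` to find the relevant buckets and one binary search per bucket to find the admissible precursors.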
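| size_t buildModSlots_ (const char *sequence, size_t seq_len, ModSlot *out_slots, bool is_protein_nterm=false, bool is_protein_cterm=false) const | protected |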
Scan a peptide sequence to find all variable modification slots. Returns the number of slots written to out_slots (at most MAX_MOD_SLOTS). Deterministic ordering: N-term pure-terminal mods, then left-to-right residue mods (ANYWHERE + position-specific terminal), then C-term pure-terminal mods.
| sequence | raw amino acid character array |
| seq_len | length of the sequence |
| out_slots | output array for modification slots (must have space for MAX_MOD_SLOTS entries) |
| is_protein_nterm | true if this peptide starts at protein position 0 |
| is_protein_cterm | true if this peptide ends at the last protein residue |
| void clear | ( | ) |
Deletes the fragment index and sets is_build_ to false.
| std::pair< float, float > computeMassWindow_ (float precursor_mass) const | private |
Compute the signed mass window {lo, hi} around a precursor_mass, converting ppm → Da if the unit is ppm. lo is negative (or zero), hi is positive (or zero). This is the only place where positive member magnitudes become signed offsets.
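A minimal sketch of the conversion described above, assuming the ppm tolerance is taken relative to the precursor mass itself. `signedMassWindow` is a hypothetical free function; the documented member reads the magnitudes and the unit from the instance instead of taking them as arguments.

```cpp
#include <utility>

// Hypothetical free-function version of the documented behaviour.
std::pair<float, float> signedMassWindow(float precursor_mass,
                                         double lower_magnitude,
                                         double upper_magnitude,
                                         bool unit_ppm)
{
  // ppm -> Da: one ppm of the precursor mass is mass * 1e-6 (assumption).
  const double scale = unit_ppm ? static_cast<double>(precursor_mass) * 1e-6 : 1.0;
  // Positive magnitudes become signed offsets: lo <= 0 <= hi.
  return {static_cast<float>(-lower_magnitude * scale),
          static_cast<float>(upper_magnitude * scale)};
}
```

For a precursor of 1000 Da and a symmetric 20 ppm tolerance this yields roughly {-0.02, +0.02} Da.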
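| std::vector< double > computeSnesSigmaDeltaSet_ (bool include_prot_nterm_mods, bool include_prot_cterm_mods) const | protected |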
Enumerate distinct Σ values achievable by any subset of configured variable mods with popcount ≤ max_variable_mods_per_peptide_. Configuration-global; does not consider per-peptide residue inventory. Per-peptide applicability is enforced at query-time subset enumeration.
| include_prot_nterm_mods | include mods with PROTEIN_N_TERM specificity |
| include_prot_cterm_mods | include mods with PROTEIN_C_TERM specificity |
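The enumeration of distinct Σ values can be sketched as a bitmask walk over the configured mod deltas. This is an assumption-laden illustration: `subsetDeltaSums` is a hypothetical name, each configured mod is used at most once per subset, the empty subset (Σ = 0) is included, and sums are deduplicated by exact floating-point comparison, which a production implementation may well replace by rounding or binning.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical sketch: distinct sums over subsets with at most max_mods members.
std::vector<double> subsetDeltaSums(const std::vector<double>& mod_deltas,
                                    std::size_t max_mods)
{
  const std::size_t n = mod_deltas.size();
  if (n >= 32) return {};  // the uint32_t mask below cannot address more mods

  std::set<double> sums;   // ordered and deduplicated (exact comparison)
  for (std::uint32_t mask = 0; mask < (1u << n); ++mask)
  {
    std::size_t popcount = 0;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      if (mask & (1u << i))
      {
        ++popcount;
        sum += mod_deltas[i];
      }
    }
    if (popcount <= max_mods) sums.insert(sum);
  }
  return std::vector<double>(sums.begin(), sums.end());
}
```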
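| void generateFragmentsForSeries_ (std::vector< Fragment > &fragments, const char *sequence, size_t seq_len, UInt32 peptide_idx, double n_term_mod_mass, double c_term_mod_mass, const double *residue_mod_masses, bool add_b, bool add_a, bool add_c, bool add_y, bool add_x, bool add_z) const | protected |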
Fragment generation with explicit per-call ion-series selection.
Called by the SNES mother path to restrict a Single-N mother to b-ions and a Single-C mother to y-ions regardless of the class-level add_*_ions_ flags. generateFragmentsLightweight_ forwards to this function after packing the class flags; both share a single implementation.
| [out] | fragments | Output vector to append Fragment entries to |
| [in] | sequence | Raw amino acid string (no modifications) |
| [in] | seq_len | Length of sequence |
| [in] | peptide_idx | Index of this peptide in fi_peptides_ |
| [in] | n_term_mod_mass | Mass delta from N-terminal modification (0 if none) |
| [in] | c_term_mod_mass | Mass delta from C-terminal modification (0 if none) |
| [in] | residue_mod_masses | Per-residue modification mass deltas (nullptr if none; array of seq_len doubles) |
| [in] | add_b | Emit b-ions (prefix). |
| [in] | add_a | Emit a-ions (prefix). |
| [in] | add_c | Emit c-ions (prefix). |
| [in] | add_y | Emit y-ions (suffix). |
| [in] | add_x | Emit x-ions (suffix). |
| [in] | add_z | Emit z-ions (suffix). |
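| void generateFragmentsLightweight_ (std::vector< Fragment > &fragments, const char *sequence, size_t seq_len, UInt32 peptide_idx, double n_term_mod_mass, double c_term_mod_mass, const double *residue_mod_masses) const | protected |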
Lightweight fragment generation: compute b/y ion m/z directly from amino acid chars. Bypasses AASequence::fromString and TheoreticalSpectrumGenerator. Uses the class-level add_b_ions_ / add_y_ions_ / ... flags for the ion series selection. See generateFragmentsForSeries_ for the explicit-flag variant used by the SNES mother path.
| [out] | fragments | Output vector to append Fragment entries to |
| [in] | sequence | Raw amino acid string (no modifications) |
| [in] | seq_len | Length of sequence |
| [in] | peptide_idx | Index of this peptide in fi_peptides_ |
| [in] | n_term_mod_mass | Mass delta from N-terminal modification (0 if none) |
| [in] | c_term_mod_mass | Mass delta from C-terminal modification (0 if none) |
| [in] | residue_mod_masses | Per-residue modification mass deltas (nullptr if none; array of seq_len doubles) |
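To illustrate what computing b/y ion m/z directly from amino acid characters means, here is a small standalone sketch. It is not the OpenMS routine: `residueMass`, `IonMz` and `singlyChargedBY` are hypothetical names, the residue table covers only five amino acids, modifications are ignored, and the offsets are the textbook ones (b = prefix + proton, y = suffix + water + proton) rather than values taken from `IonOffsets`.

```cpp
#include <cstddef>
#include <vector>

constexpr double PROTON = 1.007276466;   // proton mass in Da
constexpr double WATER  = 18.010564684;  // monoisotopic H2O mass in Da

// Tiny monoisotopic residue table for the example; 0.0 for anything else.
inline double residueMass(char aa)
{
  switch (aa)
  {
    case 'G': return 57.021464;
    case 'A': return 71.037114;
    case 'S': return 87.032028;
    case 'P': return 97.052764;
    case 'V': return 99.068414;
    default:  return 0.0;
  }
}

struct IonMz
{
  std::vector<double> b;  // b1 .. b(n-1), singly charged
  std::vector<double> y;  // y1 .. y(n-1), singly charged
};

IonMz singlyChargedBY(const char* sequence, std::size_t seq_len)
{
  IonMz ions;
  double prefix = 0.0;  // running sum from the N-terminus
  for (std::size_t i = 0; i + 1 < seq_len; ++i)
  {
    prefix += residueMass(sequence[i]);
    ions.b.push_back(prefix + PROTON);
  }
  double suffix = 0.0;  // running sum from the C-terminus
  for (std::size_t i = seq_len; i > 1; --i)
  {
    suffix += residueMass(sequence[i - 1]);
    ions.y.push_back(suffix + WATER + PROTON);
  }
  return ions;
}
```

Because only running sums over a character array are needed, no sequence object or spectrum generator has to be constructed per peptide.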
| void generatePeptides (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) | protected |
Generates all peptides from the given FASTA entries. If bottom-up is set to false, digestion is skipped; if it is set to true, the digestion enzyme can be set in the parameters. Additionally introduces fixed and variable modifications for restrictive PSM search.
| [in] | fasta_entries |
| void generateSNESMothers_ (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) | protected |
SNES-mode peptide enumeration: emit Single-N + Single-C mother peptides.
Called instead of the usual enzymatic-digestion path when is_snes_mode_ is true. For each protein, emits:
- a Single-N mother, with SNES_KIND_BIT_MASK cleared in the mod_bitmask; indexed with b-ion series only.
- a Single-C mother, with SNES_KIND_BIT_MASK set; indexed with y-ion series only.

All sub-peptides of the mother (down to the configured min_length) can be realized from a single mother record, which is what gives SNES its memory and speed win over naïve O(L^2) non-specific enumeration.
v1 restriction: variable modifications are disabled in SNES mode. A warning is emitted at build time if any variable modification is configured. Fixed modifications (both residue-specific and terminal) are fully supported.
| [in] | fasta_entries | Protein database (same semantics as generatePeptides). |
| Size getNumFragments () const noexcept | inline, noexcept |
Number of theoretical fragments stored in the index (0 before build()).
| const std::vector< Peptide > & getPeptides | ( | ) | const |
Returns a reference to the internal peptide container.
Provides read-only access to all peptides currently held by the index, typically populated during build().
Preconditions: The vector may be empty if build() has not been called yet. Thread-safety: read-only view; safe to access concurrently as long as no thread mutates the index (e.g., build()/clear()).
| std::pair< size_t, size_t > getPeptidesInMassWindow | ( | float | precursor_mass, |
| const std::pair< float, float > & | window | ||
| ) | const |
Return the [begin_idx, end_idx) peptide index range such that fi_peptides_[i].precursor_mz_ ∈ [precursor_mass + window.first, precursor_mass + window.second] for all i in the returned range.
| [in] | precursor_mass | The mono-charged precursor mass (M+H). |
| [in] | window | Signed absolute offsets around the precursor mass. By convention window.first is <= 0 and window.second is >= 0 (produced by computeMassWindow_). A reversed window trivially returns an empty range; no diagnostic is emitted. No hidden tolerance is added. |
Returns the half-open index range [begin_idx, end_idx) into fi_peptides_.
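The range lookup can be sketched with two binary searches, assuming the peptides are stored sorted by precursor mass. `indexRangeInWindow` is a hypothetical function operating on a plain vector of masses; it is not the member's actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: sorted_masses must be sorted in ascending order.
std::pair<std::size_t, std::size_t> indexRangeInWindow(
    const std::vector<float>& sorted_masses,
    float precursor_mass,
    const std::pair<float, float>& window)
{
  const float lo = precursor_mass + window.first;   // window.first is <= 0 by convention
  const float hi = precursor_mass + window.second;  // window.second is >= 0 by convention
  if (lo > hi) return {0, 0};                       // reversed window: empty range

  // First element >= lo, then first element > hi (closed interval [lo, hi]).
  const auto first = std::lower_bound(sorted_masses.begin(), sorted_masses.end(), lo);
  const auto last = std::upper_bound(first, sorted_masses.end(), hi);
  return {static_cast<std::size_t>(first - sorted_masses.begin()),
          static_cast<std::size_t>(last - sorted_masses.begin())};
}
```

Using `upper_bound` for the right edge keeps the upper limit inclusive, matching the closed interval stated in the description, while the returned index pair stays half-open.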
| void initModificationTables_ () | protected |
Build per-AA modification lookup tables from modifications_fixed_ and modifications_variable_. Called once at the start of generatePeptides().
| static void initResidueMassTable_ () | static, protected |
| bool isBuild | ( | ) | const |
Indicates whether the fragment index has been built.
Thread-safety: read-only and can be called concurrently with other read-only methods. Must not race with build()/clear() on the same instance.
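| static bool isOpenSearchMode (double lower_magnitude, double upper_magnitude, bool unit_ppm) noexcept | inline, static, noexcept |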
Shared auto-detection: open-search iff max(lower, upper) > threshold (1000 ppm or 1 Da). Strict >: exactly 1000 ppm stays closed. This is the single source of truth for the open-search auto-detection rule and is reused by ProSEAlgorithm and the TOPP tool.
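The rule as stated can be written down directly. The sketch below uses a hypothetical free function `isOpenSearch` with the same parameter order as the documented static member; the thresholds are the ones quoted above.

```cpp
#include <algorithm>

// Hypothetical free-function rendering of the documented rule.
inline bool isOpenSearch(double lower_magnitude, double upper_magnitude, bool unit_ppm) noexcept
{
  const double threshold = unit_ppm ? 1000.0 : 1.0;  // 1000 ppm or 1 Da
  // Strict comparison: a tolerance exactly at the threshold stays closed.
  return std::max(lower_magnitude, upper_magnitude) > threshold;
}
```

Note that an asymmetric tolerance counts as open as soon as either side exceeds the threshold.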
| bool isOpenSearchMode_ () const noexcept | inline, private, noexcept |
Instance delegate — same rule, reads the member bounds.
| static bool isSingleCMother (uint32_t mod_bitmask) noexcept | inline, static, noexcept |
Returns true if mod_bitmask is a Single-C (C-anchored) mother. Only meaningful for peptides from an SNES-built index.
| static bool isSingleNMother (uint32_t mod_bitmask) noexcept | inline, static, noexcept |
Returns true if mod_bitmask is a Single-N (N-anchored) mother.
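The bit layout behind these two predicates can be sketched as follows. The two mask constants are quoted from the member list; the function bodies are an assumption derived from the note that bit 31 set means Single-C (so bit 31 clear is taken to mean Single-N), and `slotBits` is a hypothetical helper.

```cpp
#include <cstdint>

constexpr std::uint32_t SNES_KIND_BIT_MASK = 1u << 31;            // bit 31; set = Single-C mother
constexpr std::uint32_t SNES_SLOT_MASK = ~SNES_KIND_BIT_MASK;     // bits 0..30 in SNES mode

// Assumed bodies: the kind is decided by bit 31 alone.
constexpr bool isSingleCMother(std::uint32_t mod_bitmask) noexcept
{
  return (mod_bitmask & SNES_KIND_BIT_MASK) != 0;
}

constexpr bool isSingleNMother(std::uint32_t mod_bitmask) noexcept
{
  return (mod_bitmask & SNES_KIND_BIT_MASK) == 0;
}

// Hypothetical helper: strip the kind bit to get the remaining 31 bits.
constexpr std::uint32_t slotBits(std::uint32_t mod_bitmask) noexcept
{
  return mod_bitmask & SNES_SLOT_MASK;
}
```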
| bool isSnesMode () const noexcept | inline, noexcept |
Returns true if snes_enabled is set to true and peptide:enzyme_specificity is none. snes_enabled defaults to false in v1 (opt-in), so specific/semi-specific searches and non-specific searches without the flag produce the same fragment index as the pre-SNES code path.
| std::vector< Hit > query | ( | const Peak1D & | peak, |
| const std::pair< size_t, size_t > & | peptide_idx_range, | ||
| uint16_t | peak_charge | ||
| ) |
Queries one peak.
| [in] | peak | The queried peak |
| [in] | peptide_idx_range | The range of precursors/peptides the peak could potentially belong to |
| [in] | peak_charge | The charge of the peak. Is used to calculate the mass from the mz |
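A single-peak lookup against a bucketed layout can be sketched as below. This is an illustration only: `Frag` and `queryPeak` are hypothetical, the fragment tolerance is passed in as an explicit m/z interval, charge handling is left out, and the layout is assumed to be the one described for build() (buckets ordered by m/z, fragments inside a bucket ordered by peptide index).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Frag
{
  float mz;
  std::size_t peptide_idx;
};

// Hypothetical sketch of one peak query against a bucketed fragment list.
std::vector<Frag> queryPeak(const std::vector<Frag>& fragments,
                            const std::vector<float>& bucket_min_mz,
                            std::size_t bucket_size,
                            float mz_lo, float mz_hi,
                            const std::pair<std::size_t, std::size_t>& peptide_idx_range)
{
  std::vector<Frag> hits;
  if (bucket_min_mz.empty() || bucket_size == 0 || mz_lo > mz_hi) return hits;

  // Start at the last bucket whose minimum is below mz_lo; it may still
  // contain fragments at or above mz_lo.
  const auto it = std::lower_bound(bucket_min_mz.begin(), bucket_min_mz.end(), mz_lo);
  std::size_t bucket = (it == bucket_min_mz.begin())
                         ? 0
                         : static_cast<std::size_t>(it - bucket_min_mz.begin()) - 1;

  for (; bucket < bucket_min_mz.size() && bucket_min_mz[bucket] <= mz_hi; ++bucket)
  {
    const auto begin = fragments.begin() + static_cast<std::ptrdiff_t>(bucket * bucket_size);
    const auto end = fragments.begin() +
      static_cast<std::ptrdiff_t>(std::min((bucket + 1) * bucket_size, fragments.size()));

    // Inside a bucket, jump to the first admissible peptide index.
    auto frag = std::lower_bound(begin, end, peptide_idx_range.first,
                                 [](const Frag& f, std::size_t idx) { return f.peptide_idx < idx; });
    for (; frag != end && frag->peptide_idx < peptide_idx_range.second; ++frag)
    {
      if (frag->mz >= mz_lo && frag->mz <= mz_hi) hits.push_back(*frag);
    }
  }
  return hits;
}
```

The precursor restriction is what makes the second sort axis pay off: only the slice of each bucket that belongs to admissible peptides is scanned.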
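| void queryPeaks (SpectrumMatchesTopN &candidates, const MSSpectrum &spectrum, const std::pair< size_t, size_t > &candidates_range, const int16_t isotope_error, const uint16_t precursor_charge) | private |
Queries the peaks of a given experimental spectrum against a set range of potential peptides, with a given isotope error and precursor charge. Hits are transferred into a PSM list. Technically an adapter between query(...) and openSearch(...)/searchDifferentPrecursorRanges(...).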
| [out] | candidates | The n best Spectrum matches |
| [in] | spectrum | The queried experimental spectrum |
| [in] | candidates_range | The range of precursors/peptides the peptide could potentially belong to |
| [in] | isotope_error | The applied isotope error |
| [in] | precursor_charge | The applied precursor charge |
| void querySpectrum | ( | const MSSpectrum & | spectrum, |
| const std::vector< FASTAFile::FASTAEntry > & | fasta_entries, | ||
| SpectrumMatchesTopN & | sms | ||
| ) |
Query a spectrum against the fragment index with FASTA context.
Required when FragmentIndex is in SNES mode with variable modifications; the FASTA is needed to realize sub-peptide sequences and apply variable mods. Non-SNES and SNES-without-var-mods paths ignore the fasta_entries argument.
| [in] | spectrum | Experimental spectrum with a single precursor. |
| [in] | fasta_entries | The FASTA database passed to build(). |
| [out] | sms | Accumulated candidate matches. |
| void querySpectrum | ( | const MSSpectrum & | spectrum, |
| SpectrumMatchesTopN & | sms | ||
| ) |
Queries one complete experimental spectrum against the database. Loops over all precursor charges, starting at min_precursor_charge and going up to max_precursor_charge. All peaks are queried once per precursor charge, each time with the corresponding precursor mass.
| [in] | spectrum | experimental spectrum |
| [out] | sms | The n best Spectrum matches |
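| void querySpectrumSNES_ (const MSSpectrum &spectrum, const std::vector< FASTAFile::FASTAEntry > &fasta_entries, SpectrumMatchesTopN &sms) | private |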
SNES-mode spectrum query (MetaMorpheus-style: byte-count + b-ion filter).
Implements the Rolfs/Smith 2020 inverted-index search strategy as executed in MetaMorpheus's NonSpecificEnzymeSearchEngine. Replaces the pre-SNES searchDifferentPrecursorRanges flow with a two-phase design:
- … mother_mass >= P - tol and admitted the top half of the index as candidates.
- … M_sub+H+ = b_k + water, so the mother's b_k ion falls at M_obs+H+ − water when the realized length matches).
- … M_sub+H+ = y_k exactly). Every mother with a fragment in the target bin that has the correct kind (Single-N vs Single-C) and whose byte-score meets fragment:min_matched_ions is emitted as a candidate.

This design produces a candidate set sized like a single fragment-bin lookup (dozens, not thousands), matching MetaMorpheus's algorithmic scalability. The byte table is reused across calls via thread_local.
Only called when isSnesMode is true; otherwise the pre-SNES searchDifferentPrecursorRanges path is used.
| [in] | spectrum | Experimental spectrum with a single precursor. |
| [in] | fasta_entries | Source database passed to build(); required for realizing sub-peptides and applying variable mods inside the SNES v1.1 subset-enumeration post-pass. |
| [out] | sms | Accumulated candidate matches, ordered by insertion (caller runs full-score and top-N selection downstream). |
| int realizeSNESLength | ( | const Peptide & | mother, |
| const std::vector< FASTAFile::FASTAEntry > & | fasta_entries, | ||
| double | target_mh_plus, | ||
| double | tolerance_lower_magnitude, | ||
| double | tolerance_upper_magnitude, | ||
| bool | tolerance_ppm | ||
| ) | const |
Find the realized sub-peptide length of a SNES mother that best matches the observed precursor mass.
Scans realizable lengths k in [peptide_min_length_, mother.sequence_.second] from the appropriate terminus (left-to-right for Single-N mothers, right-to-left for Single-C mothers), computing the cumulative residue mass plus fixed modifications. Returns the length whose realized (M+H)+ mass is closest to target_mh_plus within the given tolerance, or -1 if no length satisfies the tolerance.
Must only be called in SNES mode; returns -1 immediately otherwise.
| [in] | mother | A Peptide representing a Single-N or Single-C mother. |
| [in] | fasta_entries | Source database (same one passed to build()). |
| [in] | target_mh_plus | Observed (M+H)+ mass after isotope-error correction. |
| [in] | tolerance_lower_magnitude | Positive tolerance magnitude on the low side (realized_mass - target >= -tolerance_lower_magnitude). |
| [in] | tolerance_upper_magnitude | Positive tolerance magnitude on the high side (realized_mass - target <= +tolerance_upper_magnitude). |
| [in] | tolerance_ppm | True if the magnitudes are in ppm. |
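The length scan can be sketched as a cumulative-mass walk. The sketch makes several assumptions that are not taken from the source: `bestRealizedLength` is a hypothetical function, the residue masses are supplied already ordered from the anchored terminus outward, fixed modifications are omitted, the realized (M+H)+ is computed as residue sum + water + proton, and ppm tolerances are taken relative to the target mass.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch: returns the best-matching length k, or -1 if none fits.
int bestRealizedLength(const std::vector<double>& residue_masses,
                       std::size_t min_length,
                       double target_mh_plus,
                       double tol_lower_magnitude,
                       double tol_upper_magnitude,
                       bool tol_ppm)
{
  constexpr double WATER = 18.010564684;   // monoisotopic H2O, Da
  constexpr double PROTON = 1.007276466;   // proton mass, Da
  const double scale = tol_ppm ? target_mh_plus * 1e-6 : 1.0;  // ppm -> Da (assumption)

  int best_length = -1;
  double best_error = std::numeric_limits<double>::max();
  double cumulative = 0.0;
  for (std::size_t k = 1; k <= residue_masses.size(); ++k)
  {
    cumulative += residue_masses[k - 1];
    if (k < min_length) continue;  // too short to be realized

    const double diff = (cumulative + WATER + PROTON) - target_mh_plus;
    if (diff < -tol_lower_magnitude * scale || diff > tol_upper_magnitude * scale) continue;

    if (std::fabs(diff) < best_error)
    {
      best_error = std::fabs(diff);
      best_length = static_cast<int>(k);
    }
  }
  return best_length;
}
```

Since the cumulative mass grows monotonically with k, a production version could stop scanning once the realized mass exceeds the upper bound; the sketch keeps the full loop for clarity.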
| AASequence reconstructModifiedSequence | ( | const Peptide & | peptide, |
| const std::vector< FASTAFile::FASTAEntry > & | fasta_entries | ||
| ) | const |
Reconstruct a fully modified AASequence from a Peptide's bitmask.
Used for result output - only called for final hits (not in the build hot path). Applies fixed modifications, then uses the bitmask to determine which variable modifications are active at which positions.
| [in] | peptide | The Peptide descriptor with mod_bitmask_ |
| [in] | fasta_entries | The FASTA database used during build() |
| AASequence reconstructRealizedSubSequence | ( | const Peptide & | mother, |
| const std::vector< FASTAFile::FASTAEntry > & | fasta_entries, | ||
| size_t | realized_length, | ||
| uint32_t | subset_bitmask = 0 |
||
| ) | const |
Reconstruct a realized SNES sub-peptide as an AASequence.
| mother | the SNES mother Peptide entry |
| fasta_entries | the FASTA entries used to build the index |
| realized_length | the length of the realized sub-peptide (from realizeSNESLength) |
| subset_bitmask | SNES v1.1: active slots from buildModSlots_(seq_ptr, realized_length, ...) to apply as variable modifications. 0 = unmodified (backward compatible). |
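| void searchDifferentPrecursorRanges (const MSSpectrum &spectrum, float precursor_mass, SpectrumMatchesTopN &sms, uint16_t charge) | private |
In a closed search, loops over all isotope errors and, in each iteration, over all peaks with queryPeaks.
In an open search, applies a precursor-mass window.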
| [in] | spectrum | experimental query-spectrum |
| [in] | precursor_mass | The mass of the precursor (mz * charge) |
| [out] | sms | The Top m SpectrumMatches |
| [in] | charge | Applied charge |
| void trimHits (SpectrumMatchesTopN &init_hits) const | private |
Places the k largest elements at the front of the input array. Neither the k largest elements nor the remaining elements are sorted among themselves.
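The partitioning behaviour described here is what `std::nth_element` provides. The sketch below is an assumption about how such a trim could look, not the member's code: `Match` and `partitionTopK` are hypothetical, and the number of matched peaks is assumed to be the ranking key.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a spectrum match with a score-like counter.
struct Match
{
  std::uint32_t peptide_idx;
  std::uint32_t num_matched;
};

// Moves the k best matches to the front; neither part is sorted internally.
void partitionTopK(std::vector<Match>& hits, std::size_t k)
{
  if (k >= hits.size()) return;  // everything already counts as "top k"
  std::nth_element(hits.begin(),
                   hits.begin() + static_cast<std::ptrdiff_t>(k),
                   hits.end(),
                   [](const Match& a, const Match& b) { return a.num_matched > b.num_matched; });
}
```

This runs in linear time on average, which is cheaper than a full sort when only the top k entries are kept; a caller that wants to drop the rest can follow up with `hits.resize(k)`.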
| void updateMembers_ () override | override, protected, virtual |
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
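| bool add_a_ions_ | private |
| bool add_b_ions_ | private |
| bool add_c_ions_ | private |
| bool add_x_ions_ | private |
| bool add_y_ions_ | private |
| bool add_z_ions_ | private |
| std::vector< float > bucket_min_mz_ | protected |
vector of the smallest fragment mz of each bucket
| size_t bucketsize_ | protected |
number of fragments per outer node
| std::string digestion_enzyme_ | private |
| EnzymaticDigestion::Specificity enzyme_specificity_ {EnzymaticDigestion::SPEC_FULL} | private |
'full' (default), 'semi' (semi-tryptic), or 'none' (e.g. immunopeptidomics)
| std::vector< Fragment > fi_fragments_ | protected |
vector of all theoretical fragments (b- and y- ions)
| std::vector< Peptide > fi_peptides_ | protected |
vector of all (digested) peptides
| double fixed_cterm_delta_ {0.0} | protected |
Fixed C-terminal mod delta (0 if none)
| const ResidueModification * fixed_cterm_mod_ptr_ {nullptr} | protected |
| std::array< double, 128 > fixed_mod_deltas_ {} | protected |
Per-AA fixed modification delta mass (0.0 if no fixed mod applies)
| std::array< const ResidueModification *, 128 > fixed_mod_ptrs_ {} | protected |
Per-AA fixed modification pointer (nullptr if none)
| double fixed_nterm_delta_ {0.0} | protected |
Fixed N-terminal mod delta (0 if none)
| const ResidueModification * fixed_nterm_mod_ptr_ {nullptr} | protected |
| float fragment_max_mz_ | protected |
largest fragment mz
| float fragment_min_mz_ | protected |
smallest fragment mz
| float fragment_mz_tolerance_ | protected |
| bool fragment_mz_tolerance_unit_ppm_ {true} | protected |
| static IonOffsets ion_offsets_ | static, protected |
| bool is_build_ {false} | protected |
true, if the database has been populated with fragments
| bool is_snes_mode_ {false} | protected |
SNES mode state. Set in updateMembers_ from the snes_enabled parameter. When true, generatePeptides dispatches to generateSNESMothers_ and the fragment-index query layer switches to the one-sided lookup. When false, no SNES code path is active and the index behaves identically to the original precursor-window-based implementation.
| static std::once_flag mass_table_once_flag_ | static, protected |
| uint16_t max_fragment_charge_ | private |
The maximal possible charge of the fragments.
| int16_t max_isotope_error_ | private |
Maximal possible isotope error (both only used for closed search)
| static constexpr size_t MAX_MOD_SLOTS = 32 | static, constexpr, protected |
max variable mod slots per peptide (uint32_t bitmask)
| uint16_t max_precursor_charge_ | private |
maximal possible precursor charge
| uint32_t max_processed_hits_ | private |
The number of PSMs that will be used; the rest are filtered out.
| size_t max_variable_mods_per_peptide_ | private |
| size_t min_ion_index_ {0} | protected |
skip ions below this index (0=all, 2=skip b1/b2/y1/y2)
| int16_t min_isotope_error_ | private |
Minimal possible isotope error.
| uint16_t min_matched_peaks_ | private |
PSMs with fewer hits are discarded.
| uint16_t min_precursor_charge_ | private |
minimal possible precursor charge (usually 1)
| size_t missed_cleavages_ | private |
number of missed cleavages
| bool mod_tables_initialized_ {false} | protected |
| StringList modifications_fixed_ | private |
Modifications that are applied to all peptides.
| StringList modifications_variable_ | private |
Variable modifications; all possible combinations are created.
| size_t peptide_max_length_ | private |
| float peptide_max_mass_ | private |
| size_t peptide_min_length_ | private |
| float peptide_min_mass_ | private |
| double precursor_mass_tolerance_lower_ {20.0} | protected |
positive magnitude, effective lower bound is -lower
| bool precursor_mass_tolerance_unit_ppm_ {true} | protected |
| double precursor_mass_tolerance_upper_ {20.0} | protected |
positive magnitude, effective upper bound is +upper
| std::vector< uint32_t > protein_lengths_ | protected |
Protein lengths indexed by protein_idx, populated at build() time. Used by SNES v1.1 to gate PROTEIN_C_TERM variable-mod bin walks.
| static std::array< double, 128 > residue_mass_table_ | static, protected |
Precomputed residue mass lookup table: ASCII char -> internal monoisotopic mass (Da). Indexed by single-letter amino acid code (e.g., 'A'=65). Entries for non-AA chars are 0.
| bool snes_enabled_ {false} | protected |
User-facing SNES opt-in switch (parameter "snes_enabled"). Only takes effect when the configured enzyme specificity is SPEC_NONE; specific and semi-specific searches ignore it. Exposed as a separate member so the parameter can be set/queried independently of the derived is_snes_mode_ state (which captures the combined decision specificity && snes_enabled).
| static constexpr uint32_t SNES_KIND_BIT_MASK = 1u << 31 | static, constexpr |
bit 31; set = Single-C mother
| std::vector< double > snes_sigma_delta_set_ | protected |
SNES v1.1: precomputed distinct Σ_delta values for bin-walk targets. Baseline set excludes protein-term-only variable mods.
| std::vector< double > snes_sigma_delta_set_with_prot_cterm_ | protected |
SNES v1.1: Σ values including PROTEIN_C_TERM-only variable mods. Used only for Single-C mothers anchored at the protein C-terminus.
| std::vector< double > snes_sigma_delta_set_with_prot_nterm_ | protected |
SNES v1.1: Σ values including PROTEIN_N_TERM-only variable mods. Used only for Single-N mothers anchored at protein position 0.
| static constexpr uint32_t SNES_SLOT_MASK = ~SNES_KIND_BIT_MASK | static, constexpr |
bits 0..30 in SNES mode
| std::vector< VarModEntry > variable_cterm_mods_ | protected |
Pure C-terminal variable mods (not residue-specific)
| std::array< std::vector< VarModEntry >, 128 > variable_mod_table_ {} | protected |
Per-AA variable modification table: for each ASCII char, list of possible variable mods.
| std::vector< VarModEntry > variable_nterm_mods_ | protected |
Pure N-terminal variable mods (not residue-specific)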