Builds, from a set of FASTA files, a 2D data structure that stores the theoretical masses of all b and y ions of all peptides generated from those files. The data structure is built such that along one axis the fragments are sorted by their own mass and along the other axis by the mass of their precursor/protein. The fragment index (FI) has two modes: bottom-up and top-down. In the latter, digestion is skipped and the fragments reference the mass of the intact protein directly instead of that of a digested peptide.
#include <OpenMS/ANALYSIS/ID/FragmentIndex.h>
|
| void | updateMembers_ () override |
| | This method is used to update extra member variables at the end of the setParameters() method.
|
| |
| void | generatePeptides (const std::vector< FASTAFile::FASTAEntry > &fasta_entries) |
| | Generates all peptides from the given FASTA entries. If bottom-up is set to false, digestion is skipped. If it is set to true, the digestion enzyme can be set in the parameters. Additionally introduces fixed and variable modifications for restrictive PSM search.
|
| |
Protected Member Functions inherited from DefaultParamHandler |
| void | defaultsToParam_ () |
| | Updates the parameters after the defaults have been set in the constructor.
|
| |
|
| void | queryPeaks (SpectrumMatchesTopN &candidates, const MSSpectrum &spectrum, const std::pair< size_t, size_t > &candidates_range, const int16_t isotope_error, const uint16_t precursor_charge) |
| | Queries the peaks of a given experimental spectrum against a set range of potential peptides, for a given isotope error and precursor charge. Hits are transferred into a PSM list. Technically an adapter between query(...) and openSearch(...)/searchDifferentPrecursorRanges(...).
|
| |
| void | searchDifferentPrecursorRanges (const MSSpectrum &spectrum, float precursor_mass, SpectrumMatchesTopN &sms, uint16_t charge) |
| | In closed search, loops over all isotope errors. In each iteration, loops over all peaks with queryPeaks().
|
| |
| void | trimHits (SpectrumMatchesTopN &init_hits) const |
| | Places the k largest elements at the front of the input array. Neither the k largest elements nor the remaining elements are sorted among themselves.
|
| |
| bool | isOpenSearchMode_ () const |
| | Helper function to determine if open search should be used based on tolerance.
|
| |
◆ OpenMS::FragmentIndex::SpectrumMatch
| struct OpenMS::FragmentIndex::SpectrumMatch |
Match between a query peak and an entry in the DB.
| Class Members |
|
int16_t |
isotope_error_ {} |
The isotope error used for the performed search. |
|
uint32_t |
num_matched_ {} |
Number of peak-fragment hits. |
|
size_t |
peptide_idx_ {} |
The index of the peptide this match belongs to. |
|
uint16_t |
precursor_charge_ {} |
The precursor charge used for the performed search. |
◆ FragmentIndex()
Default constructor.
Initializes an empty FragmentIndex. Call build() before using any query functions. After clear(), the index returns to this unbuilt state.
Thread-safety: constructing the object is thread-safe as long as the instance is not shared across threads before initialization completes.
◆ ~FragmentIndex()
Default destructor.
Releases owned memory. If the index was built, all internal buffers and fragment buckets are freed. No exceptions are thrown.
◆ build()
Given a set of FASTA files, builds the fragment index data structure (FID). First, all fragments are sorted by their own mass. Next, they are placed into buckets and the minimum fragment mass of each bucket is stored. Finally, the fragments within each bucket are sorted by their originating precursor mass.
- Parameters
-
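The build steps described above can be sketched with standard containers. This is a minimal illustration, not the OpenMS implementation: `SketchFragment`, `SketchIndex` and `buildSketchIndex` are hypothetical names, and the sketch assumes that `peptide_idx` refers to a peptide list already sorted by precursor mass, so that ordering by `peptide_idx` is equivalent to ordering by precursor mass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified fragment record (not the OpenMS type).
struct SketchFragment
{
  float fragment_mz;         // theoretical fragment m/z
  std::uint32_t peptide_idx; // index into a peptide list sorted by precursor mass (assumption)
};

// Hypothetical container for the bucketed fragments.
struct SketchIndex
{
  std::vector<SketchFragment> fragments; // fragments, grouped into consecutive buckets
  std::vector<float> bucket_min_mz;      // smallest fragment m/z of each bucket
  std::size_t bucketsize = 0;            // number of fragments per bucket
};

SketchIndex buildSketchIndex(std::vector<SketchFragment> fragments, std::size_t bucketsize)
{
  assert(bucketsize > 0);

  // Step 1: sort all fragments by their own m/z.
  std::sort(fragments.begin(), fragments.end(),
            [](const SketchFragment& a, const SketchFragment& b) { return a.fragment_mz < b.fragment_mz; });

  SketchIndex index;
  index.bucketsize = bucketsize;

  // Step 2: cut the sorted array into buckets of at most `bucketsize` fragments.
  for (std::size_t start = 0; start < fragments.size(); start += bucketsize)
  {
    const std::size_t stop = std::min(start + bucketsize, fragments.size());

    // Step 3: remember the smallest m/z of the bucket (the first element after step 1).
    index.bucket_min_mz.push_back(fragments[start].fragment_mz);

    // Step 4: re-sort inside the bucket by peptide index (i.e. by precursor mass).
    std::sort(fragments.begin() + static_cast<std::ptrdiff_t>(start),
              fragments.begin() + static_cast<std::ptrdiff_t>(stop),
              [](const SketchFragment& a, const SketchFragment& b) { return a.peptide_idx < b.peptide_idx; });
  }

  index.fragments = std::move(fragments);
  return index;
}
```

The per-bucket minimum allows a binary search over buckets by fragment m/z, while the inner ordering allows a second binary search by precursor inside each bucket.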
◆ clear()
Deletes the fragment index and sets is_build_ to false.
◆ generatePeptides()
Generates all peptides from the given FASTA entries. If bottom-up is set to false, digestion is skipped. If it is set to true, the digestion enzyme can be set in the parameters. Additionally introduces fixed and variable modifications for restrictive PSM search.
- Parameters
-
◆ getPeptides()
| const std::vector< Peptide > & getPeptides |
( |
| ) |
const |
Returns a reference to the internal peptide container.
Provides read-only access to all peptides currently held by the index, typically populated during build().
- Returns
- const reference to the internal std::vector of Peptide.
Preconditions: The vector may be empty if build() has not been called yet. Thread-safety: read-only view; safe to access concurrently as long as no thread mutates the index (e.g., build()/clear()).
◆ getPeptidesInPrecursorRange()
| std::pair< size_t, size_t > getPeptidesInPrecursorRange |
( |
float |
precursor_mass, |
|
|
const std::pair< float, float > & |
window |
|
) |
| |
Returns the index range of all candidate peptides/proteins, so that a vector can be created that fits exactly that range (saves some memory).
- Parameters
-
| [in] | precursor_mass | The mono-charged precursor mass (M+H) |
| [in] | window | Defines the lower and upper bound for the precursor mass. For closed search it only contains the tolerance. In case of open search it contains both the tolerance and the open-search window |
- Returns
- a pair of indexes defining all possible peptides which the current peak could hit
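A range lookup of this kind can be sketched as two binary searches over a mass-sorted list. The function name `peptideRangeSketch` is hypothetical, and the sketch assumes that `window.first` and `window.second` are signed offsets added to the precursor mass (for example `{-1.0f, 1.0f}`) and that the returned range is half-open; the actual conventions of FragmentIndex may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open index range [first, second) of all masses inside
// [precursor_mass + window.first, precursor_mass + window.second].
// Assumption: sorted_masses is in ascending order.
std::pair<std::size_t, std::size_t> peptideRangeSketch(const std::vector<float>& sorted_masses,
                                                       float precursor_mass,
                                                       const std::pair<float, float>& window)
{
  assert(std::is_sorted(sorted_masses.begin(), sorted_masses.end()));

  const float lower = precursor_mass + window.first;
  const float upper = precursor_mass + window.second;

  // First mass that is not smaller than the lower bound.
  const auto first = std::lower_bound(sorted_masses.begin(), sorted_masses.end(), lower);
  // First mass that is larger than the upper bound.
  const auto last = std::upper_bound(first, sorted_masses.end(), upper);

  return {static_cast<std::size_t>(first - sorted_masses.begin()),
          static_cast<std::size_t>(last - sorted_masses.begin())};
}
```

A caller can then allocate a scoring vector of size `second - first` rather than one entry per peptide in the whole database.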
◆ isBuild()
Indicates whether the fragment index has been built.
- Returns
- true if build() has completed successfully and the index is ready for queries; false otherwise (e.g., after construction or after clear()).
Thread-safety: read-only and can be called concurrently with other read-only methods. Must not race with build()/clear() on the same instance.
◆ isOpenSearchMode_()
| bool isOpenSearchMode_ |
( |
| ) |
const |
|
inlineprivate |
Helper function to determine if open search should be used based on tolerance.
◆ query()
| std::vector< Hit > query |
( |
const Peak1D & |
peak, |
|
|
const std::pair< size_t, size_t > & |
peptide_idx_range, |
|
|
uint16_t |
peak_charge |
|
) |
| |
Queries one peak.
- Parameters
-
| [in] | peak | The queried peak |
| [in] | peptide_idx_range | The range of precursors/peptides the peak could potentially belong to |
| [in] | peak_charge | The charge of the peak. Is used to calculate the mass from the mz |
- Returns
- a vector of Hits (within peptide_idx_range and matching fragment_mz_), each containing the index of the matched peptide and the mass of the hit
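The two-level lookup implied by the bucket layout can be sketched as follows. All names (`SketchFragment`, `SketchHit`, `querySketch`) are hypothetical. The sketch assumes an absolute tolerance in m/z units, a half-open peptide index range, and a query value that is already an m/z, so the charge handling of the real query() is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct SketchFragment { float fragment_mz; std::uint32_t peptide_idx; };
struct SketchHit { std::uint32_t peptide_idx; float fragment_mz; };

// Assumptions: `fragments` is split into consecutive buckets of `bucketsize`
// entries, bucket_min_mz[i] is the smallest m/z of bucket i, and every bucket
// is sorted by peptide_idx.
std::vector<SketchHit> querySketch(const std::vector<SketchFragment>& fragments,
                                   const std::vector<float>& bucket_min_mz,
                                   std::size_t bucketsize,
                                   float query_mz,
                                   float tolerance,
                                   const std::pair<std::size_t, std::size_t>& peptide_idx_range)
{
  assert(bucketsize > 0);
  std::vector<SketchHit> hits;
  if (bucket_min_mz.empty()) return hits;

  const float lo = query_mz - tolerance;
  const float hi = query_mz + tolerance;

  // Outer search: the bucket just before the first bucket whose minimum is >= lo
  // may still contain fragments >= lo, so start one bucket earlier.
  const auto it = std::lower_bound(bucket_min_mz.begin(), bucket_min_mz.end(), lo);
  std::size_t bucket = static_cast<std::size_t>(it - bucket_min_mz.begin());
  if (bucket > 0) --bucket;

  for (; bucket < bucket_min_mz.size() && bucket_min_mz[bucket] <= hi; ++bucket)
  {
    const std::size_t start = bucket * bucketsize;
    const std::size_t stop = std::min(start + bucketsize, fragments.size());
    const auto begin_it = fragments.begin() + static_cast<std::ptrdiff_t>(start);
    const auto end_it = fragments.begin() + static_cast<std::ptrdiff_t>(stop);

    // Inner search: jump to the first fragment whose peptide is inside the range.
    auto frag = std::lower_bound(begin_it, end_it, peptide_idx_range.first,
                                 [](const SketchFragment& f, std::size_t idx) { return f.peptide_idx < idx; });

    // Scan the candidate peptides and keep fragments inside the m/z tolerance.
    for (; frag != end_it && frag->peptide_idx < peptide_idx_range.second; ++frag)
    {
      if (frag->fragment_mz >= lo && frag->fragment_mz <= hi)
      {
        hits.push_back({frag->peptide_idx, frag->fragment_mz});
      }
    }
  }
  return hits;
}
```

Restricting the inner scan to the peptide index range is what makes a narrow precursor window cheap: most fragments in a bucket are skipped without being compared.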
◆ queryPeaks()
| void queryPeaks |
( |
SpectrumMatchesTopN & |
candidates, |
|
|
const MSSpectrum & |
spectrum, |
|
|
const std::pair< size_t, size_t > & |
candidates_range, |
|
|
const int16_t |
isotope_error, |
|
|
const uint16_t |
precursor_charge |
|
) |
| |
|
private |
Queries the peaks of a given experimental spectrum against a set range of potential peptides, for a given isotope error and precursor charge. Hits are transferred into a PSM list. Technically an adapter between query(...) and openSearch(...)/searchDifferentPrecursorRanges(...).
- Parameters
-
| [out] | candidates | The n best Spectrum matches |
| [in] | spectrum | The queried experimental spectrum |
| [in] | candidates_range | The range of precursors/peptides the peptide could potentially belong to |
| [in] | isotope_error | The applied isotope error |
| [in] | precursor_charge | The applied precursor charge |
◆ querySpectrum()
Queries one complete experimental spectrum against the database. Loops over all precursor charges, starting at min_precursor_charge and going up to max_precursor_charge. All peaks are queried repeatedly, once for each precursor charge and its corresponding precursor mass.
- Parameters
-
| [in] | spectrum | experimental spectrum |
| [out] | sms | The n best Spectrum matches |
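The charge loop described above can be sketched as an enumeration of (charge, precursor mass) candidates. `precursorCandidatesSketch` is a hypothetical helper. Following the parameter description of searchDifferentPrecursorRanges(), the sketch takes the precursor mass as mz * charge; any proton-mass correction is deliberately left out and would be an additional assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Lists one (charge, precursor mass) candidate per charge state in
// [min_charge, max_charge]. The mass is computed as mz * charge (assumption
// taken from the documented parameter description).
std::vector<std::pair<std::uint16_t, float>> precursorCandidatesSketch(float precursor_mz,
                                                                       std::uint16_t min_charge,
                                                                       std::uint16_t max_charge)
{
  assert(min_charge >= 1 && min_charge <= max_charge);

  std::vector<std::pair<std::uint16_t, float>> candidates;
  for (unsigned z = min_charge; z <= max_charge; ++z)
  {
    candidates.emplace_back(static_cast<std::uint16_t>(z), precursor_mz * static_cast<float>(z));
  }
  return candidates;
}
```

Each candidate would then be searched separately, which is why every peak of the spectrum ends up being queried once per charge state.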
◆ searchDifferentPrecursorRanges()
In closed search, loops over all isotope errors. In each iteration, loops over all peaks with queryPeaks().
In open search, applies a precursor-mass window instead.
- Parameters
-
| [in] | spectrum | experimental query-spectrum |
| [in] | precursor_mass | The mass of the precursor (mz * charge) |
| [out] | sms | The Top m SpectrumMatches |
| [in] | charge | Applied charge |
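For the closed-search case, the loop over isotope errors can be sketched as below. `isotopeCorrectedMassesSketch` is a hypothetical helper. The sketch assumes that each isotope error shifts the precursor mass by the C13-C12 mass difference and that the shift is subtracted from the observed mass; the sign convention used by FragmentIndex is not stated in this documentation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Mass difference between the C13 and C12 isotopes in Da.
constexpr double kC13C12MassDiff = 1.0033548378;

// Lists one (isotope error, corrected precursor mass) pair per isotope error
// in [min_isotope_error, max_isotope_error]. Subtracting the shift is an assumption.
std::vector<std::pair<std::int16_t, double>> isotopeCorrectedMassesSketch(double precursor_mass,
                                                                          std::int16_t min_isotope_error,
                                                                          std::int16_t max_isotope_error)
{
  assert(min_isotope_error <= max_isotope_error);

  std::vector<std::pair<std::int16_t, double>> candidates;
  for (int e = min_isotope_error; e <= max_isotope_error; ++e)
  {
    candidates.emplace_back(static_cast<std::int16_t>(e), precursor_mass - e * kC13C12MassDiff);
  }
  return candidates;
}
```

Each corrected mass would be turned into a peptide index range and passed on to the per-peak query, so the cost of a closed search grows linearly with the number of isotope errors considered.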
◆ trimHits()
Places the k largest elements at the front of the input array. Neither the k largest elements nor the remaining elements are sorted among themselves.
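The behaviour described here matches what `std::nth_element` provides: a partition around the k-th element without fully sorting either side. The sketch below uses hypothetical names (`SketchMatch`, `trimSketch`) and assumes that matches are ranked by their number of matched peaks; whether trimHits() uses this algorithm or this ranking is not stated in the documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified match record (not the OpenMS type).
struct SketchMatch
{
  std::uint32_t num_matched; // number of peak-fragment hits
  std::size_t peptide_idx;   // peptide this match belongs to
};

// Moves the k matches with the most matched peaks to the front of `hits`.
// The order inside the first k and inside the remainder is unspecified.
void trimSketch(std::vector<SketchMatch>& hits, std::size_t k)
{
  if (k >= hits.size()) return; // nothing to partition

  std::nth_element(hits.begin(), hits.begin() + static_cast<std::ptrdiff_t>(k), hits.end(),
                   [](const SketchMatch& a, const SketchMatch& b) { return a.num_matched > b.num_matched; });
}
```

Selecting rather than sorting runs in linear time on average, which matters when only a small top-k slice of a long candidate list is kept. A caller that wants to drop the remainder can truncate the vector to k entries afterwards.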
◆ updateMembers_()
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
◆ add_a_ions_
◆ add_b_ions_
◆ add_c_ions_
◆ add_x_ions_
◆ add_y_ions_
◆ add_z_ions_
◆ bucket_min_mz_
| std::vector<float> bucket_min_mz_ |
|
protected |
vector of the smallest fragment m/z of each bucket
◆ bucketsize_
number of fragments per outer node
◆ digestion_enzyme_
| std::string digestion_enzyme_ |
|
private |
◆ fi_fragments_
vector of all theoretical fragments (b- and y- ions)
◆ fi_peptides_
vector of all (digested) peptides
◆ fragment_max_mz_
◆ fragment_min_mz_
◆ fragment_mz_tolerance_
| float fragment_mz_tolerance_ |
|
protected |
◆ fragment_mz_tolerance_unit_ppm_
| bool fragment_mz_tolerance_unit_ppm_ {true} |
|
protected |
◆ is_build_
true, if the database has been populated with fragments
◆ max_fragment_charge_
| uint16_t max_fragment_charge_ |
|
private |
The maximal possible charge of the fragments.
◆ max_isotope_error_
| int16_t max_isotope_error_ |
|
private |
Maximal possible isotope error (the minimal and maximal isotope errors are both only used for closed search)
◆ max_precursor_charge_
| uint16_t max_precursor_charge_ |
|
private |
maximal possible precursor charge
◆ max_processed_hits_
| uint32_t max_processed_hits_ |
|
private |
The number of PSMs that will be used; the rest are filtered out.
◆ max_variable_mods_per_peptide_
| size_t max_variable_mods_per_peptide_ |
|
private |
◆ min_isotope_error_
| int16_t min_isotope_error_ |
|
private |
Minimal possible isotope error.
◆ min_matched_peaks_
| uint16_t min_matched_peaks_ |
|
private |
PSMs with fewer hits are discarded.
◆ min_precursor_charge_
| uint16_t min_precursor_charge_ |
|
private |
minimal possible precursor charge (usually 1)
◆ missed_cleavages_
number of missed cleavages
◆ modifications_fixed_
Modifications that are present on all peptides.
◆ modifications_variable_
Variable modifications; all possible combinations are created.
◆ open_precursor_window_lower_
| float open_precursor_window_lower_ |
|
private |
Defines the lower bound of the precursor-mass range.
◆ open_precursor_window_upper_
| float open_precursor_window_upper_ |
|
private |
Defines the upper bound of the precursor-mass range.
◆ peptide_max_length_
| size_t peptide_max_length_ |
|
private |
◆ peptide_max_mass_
◆ peptide_min_length_
| size_t peptide_min_length_ |
|
private |
◆ peptide_min_mass_
◆ precursor_mz_tolerance_
| float precursor_mz_tolerance_ |
|
protected |
◆ precursor_mz_tolerance_unit_ppm_
| bool precursor_mz_tolerance_unit_ppm_ {true} |
|
protected |