OpenMS  2.6.0
Quality Control

Goal: Check the quality of the data (supports label-free workflows and IsobaricAnalyzer output).

The QualityControl TOPP tool computes and collects the data needed to calculate QC metrics for assessing the quality of LC-MS data. Which metrics are collected depends on the input data provided (see section 'Metrics' below). New metavalues are added to the existing data, and the information is written out in mzTab format. This mzTab file can then be processed with your own scripts or with the R package (see PTXQC).
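
As a starting point for such a custom script, the following minimal Python sketch splits mzTab text into its sections and pairs a header line with its data rows. It relies only on the general mzTab layout (tab-separated lines that start with a section code such as MTD, PSH or PSM). The example content, including the optional column name opt_global_example, is made up for illustration and is not actual QualityControl output.

```python
from collections import defaultdict


def split_mztab_sections(text):
    """Group mzTab lines by their leading section code (e.g. MTD, PSH, PSM)."""
    sections = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip the blank lines that separate sections
        fields = line.split("\t")
        sections[fields[0]].append(fields[1:])
    return dict(sections)


def rows_as_dicts(sections, header_code, row_code):
    """Pair a header line (e.g. PSH) with its data rows (e.g. PSM)."""
    header = sections[header_code][0]
    return [dict(zip(header, row)) for row in sections.get(row_code, [])]


# Hypothetical mini mzTab content, for illustration only.
example = "\n".join([
    "MTD\tmzTab-version\t1.0.0",
    "",
    "PSH\tsequence\tretention_time\topt_global_example",
    "PSM\tPEPTIDEK\t1200.5\t1",
    "PSM\tANOTHERR\t1350.0\t0",
])

sections = split_mztab_sections(example)
psms = rows_as_dicts(sections, "PSH", "PSM")
print(psms[0]["sequence"], psms[0]["retention_time"])
```

For a real file, read its content with open(path).read() and pass it to split_mztab_sections; all values are kept as strings and need to be converted where numbers are expected.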

Workflow

An example workflow can be found in OpenMS/share/OpenMS/examples/TOPPAS/QualityControl.toppas.

For data from IsobaricAnalyzer, provide only the consensusXML as input to QualityControl; no FeatureXMLs or TrafoXMLs are required. The raw mzML file can optionally be added as input.

Metrics

The following overview lists, for each included metric, what it does, which input it requires, and what it adds to the data or returns.

Input Data

PostFDR FeatureXML: A FeatureXML after FDR filtering.
Contaminants Fasta file: A Fasta file containing contaminant proteins.
Raw mzML file: An unchanged mzML file.
InternalCalibration MzML file: An MzML file after internal calibration.
TrafoXML file: The RT alignment function as obtained from a MapAligner.


Contaminants
Required input data: Contaminants Fasta file, PostFDR FeatureXML
Description:
The Contaminants metric takes the contaminants database and digests its protein sequences with the digestion enzyme given in the featureXML. It then checks whether each peptide sequence in the featureXML (including the unassigned PeptideIdentifications) is present in the contaminants database.

Changes in files:
Metavalue:

  • 'is_contaminant' set to '1' if the peptide is found in the contaminant database, otherwise to '0'

Other Output:
Returns:

  • Contaminant ratio of all peptides
  • Contaminant ratio of all assigned peptides
  • Contaminant ratio of all unassigned peptides
  • Intensity ratio of all contaminants in the assigned peptides
  • Number of empty features
  • Number of all found features
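
The digest-and-lookup idea described above can be sketched in Python as follows. This is a simplified illustration, not the tool's implementation: it assumes a plain trypsin rule (cleave after K or R unless followed by P), whereas the tool takes the enzyme from the featureXML, and the function names and sequences are hypothetical.

```python
def tryptic_digest(protein, max_missed=0):
    """Digest a protein with a simplified trypsin rule:
    cleave after K or R unless the next residue is P."""
    cuts = [0]
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(protein))
    pieces = [protein[a:b] for a, b in zip(cuts, cuts[1:])]

    # Join neighbouring pieces to allow up to max_missed missed cleavages.
    peptides = set()
    for start in range(len(pieces)):
        last = min(start + max_missed, len(pieces) - 1)
        for end in range(start, last + 1):
            peptides.add("".join(pieces[start:end + 1]))
    return peptides


def flag_contaminants(observed_peptides, contaminant_proteins, max_missed=0):
    """Return one 1/0 flag per observed peptide plus the contaminant ratio."""
    database = set()
    for protein in contaminant_proteins:
        database |= tryptic_digest(protein, max_missed)
    flags = [1 if pep in database else 0 for pep in observed_peptides]
    ratio = sum(flags) / len(flags) if flags else 0.0
    return flags, ratio


# Made-up sequences: three of the four peptides stem from the contaminant.
flags, ratio = flag_contaminants(["AAK", "CCR", "LLLLK", "DDK"], ["AAKCCRDDK"])
print(flags, ratio)
```

The flags correspond to the 'is_contaminant' metavalue, and the ratio is the count-based analogue of the "contaminant ratio of all peptides"; the intensity ratio would need feature intensities in addition.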


FragmentMassError
Required input data: PostFDR FeatureXML, raw mzML file

Description:
The FragmentMassError metric computes a list of fragment mass errors, in ppm and Da, for each annotated MS2 spectrum. Each error is the mass delta between an observed peak and its theoretical counterpart.

Changes in files:
Metavalue:
  • 'fragment_mass_error_ppm' set to the fragment mass error in parts per million
  • 'fragment_mass_error_da' set to the fragment mass error in Dalton
Other Output:
Returns:
  • Average and variance of fragment mass errors in ppm
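
The two error units can be illustrated with a short Python sketch. It assumes that observed and theoretical fragment peaks have already been matched pairwise, which the metric has to do itself, and it uses the population variance; whether the tool reports population or sample variance is not stated here. All numbers are made up.

```python
from statistics import mean, pvariance


def fragment_mass_errors(observed_mz, theoretical_mz):
    """Return per-peak errors in Da and in ppm for already matched peak pairs."""
    errors_da = [obs - theo for obs, theo in zip(observed_mz, theoretical_mz)]
    errors_ppm = [
        (obs - theo) / theo * 1e6 for obs, theo in zip(observed_mz, theoretical_mz)
    ]
    return errors_da, errors_ppm


# Hypothetical matched peaks of one annotated MS2 spectrum.
observed = [500.005, 1000.0]
theoretical = [500.0, 1000.01]

errors_da, errors_ppm = fragment_mass_errors(observed, theoretical)
average_ppm = mean(errors_ppm)
variance_ppm = pvariance(errors_ppm)  # assumption: population variance
print(errors_da, errors_ppm, average_ppm, variance_ppm)
```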

MissedCleavages
Required input data: PostFDR FeatureXML

Description:
The MissedCleavages metric counts the number of missed cleavages per PeptideIdentification in a given FeatureMap and returns an aggregate statistic (observed counts). Additionally, the first PeptideHit of each PeptideIdentification in the FeatureMap is augmented with metavalues.

Changes in files:
Metavalue:
  • 'missed_cleavages'
Other Output:
Returns:
  • Frequency map of missed cleavages as key/value pairs.
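
A frequency map of this kind can be sketched in Python as below. The counting rule is an assumption made for illustration (trypsin: an internal K or R that is not followed by P counts as one missed cleavage); the tool uses the enzyme recorded with the identifications. The peptide sequences are made up.

```python
from collections import Counter


def count_missed_cleavages(peptide):
    """Count internal K/R residues that are not followed by P.
    The C-terminal residue is excluded because it is the regular cleavage site."""
    return sum(
        1
        for i in range(len(peptide) - 1)
        if peptide[i] in "KR" and peptide[i + 1] != "P"
    )


peptides = ["AAK", "AAKCCR", "AKPDDR", "KKAAR"]

# One value per peptide, analogous to the 'missed_cleavages' metavalue.
per_peptide = [count_missed_cleavages(p) for p in peptides]

# Key/value pairs: number of missed cleavages -> number of peptides.
frequency_map = dict(Counter(per_peptide))
print(per_peptide, frequency_map)
```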

MS2IdentificationRate
Required input data: PostFDR FeatureXML, raw mzML file

Description:
The MS2IdentificationRate metric calculates the MS2 identification rate as follows: the number of all PeptideIdentifications is counted and divided by the total number of MS2 spectra.

Changes in files:
This metric does not change anything in the data.
Other Output:
Returns:
  • Number of PeptideIdentifications
  • Number of MS2 spectra
  • Ratio of #pepID/#MS2
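
The ratio itself is simple; a minimal Python sketch with made-up numbers follows. The function name is hypothetical, and rejecting an input without MS2 spectra is a choice of this sketch, not documented tool behaviour.

```python
def ms2_identification_rate(n_peptide_ids, ms_levels):
    """Divide the number of PeptideIdentifications by the number of MS2 spectra."""
    n_ms2 = sum(1 for level in ms_levels if level == 2)
    if n_ms2 == 0:
        raise ValueError("no MS2 spectra found, the rate is undefined")
    return n_peptide_ids, n_ms2, n_peptide_ids / n_ms2


# Hypothetical run: MS levels of six spectra, three of which were identified.
n_ids, n_ms2, rate = ms2_identification_rate(3, [1, 2, 2, 2, 1, 2])
print(n_ids, n_ms2, rate)
```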

MzCalibration
Required input data: PostFDR FeatureXML
Optional input data: InternalCalibration MzML file

Description:
This metric adds new metavalues to the first (best) hit of each PeptideIdentification. It can be used without an MzML file, but then only the uncalibrated m/z error (ppm) is reported. For full functionality, a PeakMap/MSExperiment generated by InternalCalibration that contains the original m/z values from before m/z calibration has to be provided.

Changes in files:
Metavalues:
  • 'mz_raw' set to m/z value of original experiment
  • 'mz_ref' set to m/z value of calculated reference
  • 'uncalibrated_mz_error_ppm' set to uncalibrated m/z error in parts per million
  • 'calibrated_mz_error_ppm' set to calibrated m/z error in parts per million
Other Output:
No additional output.

RTAlignment
Required input data: PostFDR FeatureXML, trafoXML file

Description:
The RTAlignment metric determines the retention time before and after the alignment. These two values are added as metavalues to the PeptideIdentification.

Changes in files:
Metavalues:
  • 'rt_align' set to retention time after alignment
  • 'rt_raw' set to retention time before alignment
Other Output:
No additional output.

TIC
Required input data: raw mzML file

Description:
The TIC metric calculates the total ion count of an MSExperiment if a bin size in RT seconds greater than 0 is given. All MS1 abundances within a bin are summed up.

Changes in files:
This metric does not change anything in the data.
Other Output:
Returns:
  • TIC chromatograms
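
The binning described above can be sketched in Python as follows. The input layout (tuples of retention time, MS level and intensities) and the function name are assumptions of this sketch. It simply rejects bin sizes that are not greater than 0; what the tool does in that case is not covered here.

```python
import math
from collections import defaultdict


def binned_tic(spectra, bin_size):
    """Sum all MS1 intensities per RT bin.

    spectra: iterable of (rt_seconds, ms_level, intensities) tuples.
    Returns a list of (bin_start_rt, summed_intensity), sorted by RT.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be greater than 0")
    bins = defaultdict(float)
    for rt, ms_level, intensities in spectra:
        if ms_level != 1:
            continue  # only MS1 abundances contribute
        bins[math.floor(rt / bin_size)] += sum(intensities)
    return [(index * bin_size, total) for index, total in sorted(bins.items())]


# Made-up spectra: three MS1 scans and one MS2 scan.
spectra = [
    (0.5, 1, [10.0, 20.0]),
    (1.5, 1, [5.0]),
    (1.7, 2, [100.0]),
    (2.2, 1, [1.0, 1.0]),
]
chromatogram = binned_tic(spectra, bin_size=2.0)
print(chromatogram)
```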

TopNoverRT
Required input data: PostFDR FeatureXML, raw mzML file

Description:
The TopNoverRT metric calculates the ScanEventNumber (the number of the MS2 scan after the preceding MS1 scan) and adds it as the new metavalue 'ScanEventNumber' to the PeptideIdentifications. It finds all unidentified MS2 spectra and adds corresponding 'empty' PeptideIdentifications without sequence as placeholders to the unassigned PeptideIdentification list. Furthermore, it adds the metavalue 'identified' to the PeptideIdentification.

Changes in files:
Metavalues:
  • 'ScanEventNumber' set to the calculated value
  • 'identified' set to '+' or '-'
If provided:
  • 'FWHM' set to RT peak width for all assigned PeptideIdentifications
  • 'ion_injection_time' set to injection time from MS2 spectrum
  • 'activation_method' set to activation method from MS2 spectrum
  • 'total_ion_count' set to summed intensity from MS2 spectrum
  • 'base_peak_intensity' set to highest intensity from MS2 spectrum
Additionally:
  • Adds empty PeptideIdentifications
Other Output:
No additional output.
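
The ScanEventNumber and 'identified' values described for TopNoverRT can be illustrated with a small Python sketch. It assumes that the ScanEventNumber is the 1-based position of an MS2 scan after the most recent MS1 scan; the input layout and function name are hypothetical, and the data are made up.

```python
def annotate_ms2_scans(ms_levels, identified_indices):
    """Return (spectrum_index, scan_event_number, identified) for each MS2 scan.

    scan_event_number restarts at 1 after every MS1 scan (assumption).
    identified is '+' if the spectrum has an identification, else '-'.
    """
    annotations = []
    event_number = 0
    for index, level in enumerate(ms_levels):
        if level == 1:
            event_number = 0  # a new MS1 scan starts a new cycle
            continue
        event_number += 1
        flag = "+" if index in identified_indices else "-"
        annotations.append((index, event_number, flag))
    return annotations


# Two cycles: MS1 followed by three MS2 scans, then MS1 followed by two.
ms_levels = [1, 2, 2, 2, 1, 2, 2]
annotations = annotate_ms2_scans(ms_levels, identified_indices={1, 3, 5})
print(annotations)
```

The entries flagged '-' correspond to the unidentified MS2 spectra for which the metric adds empty placeholder PeptideIdentifications.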