Signal processing (Smoothing, baseline reduction, calibration)

OpenMS offers several filters that reduce the noise and baseline which disturb LC-MS measurements. These filters work spectrum-wise and can therefore be applied to a whole raw data map as well as to a single raw spectrum. Every filter provides a function "filter" for processing a single raw data container (e.g. PeakSpectrum) and a function "filterExperiment" for processing a collection of raw data containers (e.g. PeakMap). Both functions can be invoked either with an input container and an output container, or with iterators that define a range on the input container along with an output container. The classes described in this section can be found in the FILTERING folder.

Baseline filters

Baseline reduction can be performed by the TopHatFilter. The top-hat filter is a morphological filter which uses the basic morphological operations "erosion" and "dilation" to remove the baseline in raw data. Because both operations are implemented as described by van Herk, the top-hat filter expects equally spaced raw data points. If your data is not uniform yet, please use the LinearResampler to generate equally spaced data. In the example listing in this section, the top-hat filter is selected through the "method" parameter of the class MorphologicalFilter.

The TopHatFilter removes signal structures in the raw data which are broader than the size of the structuring element.
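To make the erosion/dilation description concrete, here is a minimal standalone sketch of a top-hat filter on equally spaced intensities. It does not use OpenMS and is not the van Herk algorithm: the helper names morph and topHat are hypothetical, the structuring element is assumed to be flat with an odd number of points, and each window is searched naively.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sliding-window minimum (erosion) or maximum (dilation) with a flat
// structuring element of `width` points, centred on each sample.
// Windows are clipped at the borders. Naive O(n * width) version.
std::vector<double> morph(const std::vector<double>& in, std::size_t width, bool take_min)
{
  assert(width % 2 == 1); // assumption of this sketch: odd width
  const std::size_t half = width / 2;
  std::vector<double> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    const std::size_t lo = (i >= half) ? i - half : 0;
    const std::size_t hi = std::min(in.size() - 1, i + half);
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(lo);
    const auto last = in.begin() + static_cast<std::ptrdiff_t>(hi + 1);
    out[i] = take_min ? *std::min_element(first, last)
                      : *std::max_element(first, last);
  }
  return out;
}

// Top-hat = signal minus its morphological opening (erosion, then dilation).
// Structures broader than the structuring element survive the opening and
// are therefore subtracted; narrower ones (the peaks) remain.
std::vector<double> topHat(const std::vector<double>& in, std::size_t width)
{
  const std::vector<double> opened = morph(morph(in, width, true), width, false);
  std::vector<double> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    out[i] = in[i] - opened[i];
  }
  return out;
}
```

With a constant baseline of 5 and a three-point peak, topHat(signal, 5) removes the baseline and keeps the peak, whereas a one-point structuring element treats the peak itself as baseline and removes it as well.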

The following example (Tutorial_MorphologicalFilter.cpp) shows how to instantiate a top-hat filter, set the length of the structuring element and remove the baseline in a raw LC-MS map.

int main(int argc, const char** argv)
{
  if (argc < 2) return 1;
  // the path to the data should be given on the command line
  String tutorial_data_path(argv[1]);

  // load the raw LC-MS map
  PeakMap exp;
  MzMLFile mzml_file;
  mzml_file.load(tutorial_data_path + "/data/Tutorial_MorphologicalFilter.mzML", exp);

  // structuring element of length 1.0 Thomson (m/z units), top-hat method
  Param parameters;
  parameters.setValue("struc_elem_length", 1.0);
  parameters.setValue("struc_elem_unit", "Thomson");
  parameters.setValue("method", "tophat");

  // apply the filter to every spectrum of the map
  MorphologicalFilter mf;
  mf.setParameters(parameters);
  mf.filterExperiment(exp);

  return 0;
} //end of main

Note
In order to remove the baseline, the width of the structuring element should be greater than the width of a peak.

Smoothing filters

We offer two smoothing filters to reduce noise in LC-MS measurements.

Gaussian filter

The class GaussFilter implements a Gaussian filter. The wider the kernel, the smoother the signal, but the more detail is lost.
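The trade-off between kernel width and detail can be illustrated with a small standalone sketch. It does not use OpenMS; the functions gaussianKernel and smooth are hypothetical, and the parameters sigma and half (both in units of data points) are assumptions of this sketch, not the meaning of the "gaussian_width" parameter of GaussFilter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Build a normalised, truncated Gaussian kernel with 2 * half + 1 taps.
// `sigma` is given in units of data points (assumption of this sketch).
std::vector<double> gaussianKernel(double sigma, std::size_t half)
{
  assert(sigma > 0.0);
  std::vector<double> k(2 * half + 1);
  double sum = 0.0;
  for (std::size_t i = 0; i < k.size(); ++i)
  {
    const double x = static_cast<double>(i) - static_cast<double>(half);
    k[i] = std::exp(-0.5 * x * x / (sigma * sigma));
    sum += k[i];
  }
  for (double& v : k) v /= sum; // weights sum to 1
  return k;
}

// Convolve equally spaced intensities with the kernel. At the borders the
// result is renormalised over the taps that fall inside the signal.
std::vector<double> smooth(const std::vector<double>& in, const std::vector<double>& k)
{
  const std::ptrdiff_t half = static_cast<std::ptrdiff_t>(k.size() / 2);
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
  std::vector<double> out(in.size(), 0.0);
  for (std::ptrdiff_t i = 0; i < n; ++i)
  {
    double acc = 0.0;
    double wsum = 0.0;
    for (std::ptrdiff_t j = -half; j <= half; ++j)
    {
      const std::ptrdiff_t idx = i + j;
      if (idx < 0 || idx >= n) continue;
      const double w = k[static_cast<std::size_t>(j + half)];
      acc += w * in[static_cast<std::size_t>(idx)];
      wsum += w;
    }
    out[static_cast<std::size_t>(i)] = acc / wsum;
  }
  return out;
}
```

Smoothing an isolated spike with a larger sigma yields a lower and broader result than with a smaller sigma, while a constant signal passes through unchanged.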

The following example (Tutorial_GaussFilter.cpp) shows how to smooth a raw data map. The Gaussian kernel width is set to 1 m/z.

int main(int argc, const char** argv)
{
  if (argc < 2) return 1;
  // the path to the data should be given on the command line
  String tutorial_data_path(argv[1]);

  // load the raw data map
  PeakMap exp;
  MzMLFile mzml_file;
  mzml_file.load(tutorial_data_path + "/data/Tutorial_GaussFilter.mzML", exp);

  // Gaussian kernel width of 1.0 m/z
  GaussFilter g;
  Param param;
  param.setValue("gaussian_width", 1.0);
  g.setParameters(param);

  // smooth every spectrum of the map
  g.filterExperiment(exp);

  return 0;
} //end of main

Note
Use a Gaussian filter kernel which has approximately the same width as your mass peaks.

Savitzky Golay filter

The Savitzky Golay filter is implemented in two ways: SavitzkyGolaySVDFilter and SavitzkyGolayQRFilter. Both filters produce the same result, but in most cases the SavitzkyGolaySVDFilter has a better run time. The Savitzky Golay filter works only on equally spaced data. If your data is not uniform, use the LinearResampler to generate equally spaced data. The degree of smoothing depends on two parameters: the frame size and the order of the polynomial used for smoothing. The frame size corresponds to the number of filter coefficients, so the width of the smoothing interval is given by frame_size * spacing of the raw data. The bigger the frame size or the smaller the order, the smoother the signal, and the more detail is lost.
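As an illustration of the frame size and polynomial order, the following standalone sketch applies the textbook 5-point Savitzky Golay kernel for polynomial order 2 (equivalently 3), with coefficients (-3, 12, 17, 12, -3) / 35. It does not use OpenMS, which computes the coefficients for an arbitrary frame size and order; the function names are hypothetical and border points are simply copied.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Width of the smoothing interval: frame size times the spacing of the data.
double smoothingWidth(std::size_t frame_size, double spacing)
{
  assert(spacing > 0.0);
  return static_cast<double>(frame_size) * spacing;
}

// Savitzky Golay smoothing with a fixed frame size of 5 and polynomial
// order 2 (or 3). The first and last two points are copied unchanged,
// which is a simplification of this sketch.
std::vector<double> savitzkyGolay5(const std::vector<double>& in)
{
  static const std::array<double, 5> c = {{-3.0, 12.0, 17.0, 12.0, -3.0}};
  std::vector<double> out(in);
  if (in.size() < 5) return out;
  for (std::size_t i = 2; i + 2 < in.size(); ++i)
  {
    double acc = 0.0;
    for (std::size_t j = 0; j < 5; ++j)
    {
      acc += c[j] * in[i + j - 2];
    }
    out[i] = acc / 35.0;
  }
  return out;
}
```

Because the kernel fits a local polynomial, data that already follow a parabola are reproduced unchanged, whereas an isolated spike is flattened. With a frame size of 21 and a spacing of 0.01 m/z, smoothingWidth gives a smoothing interval of about 0.21 m/z.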

The following example (Tutorial_SavitzkyGolayFilter.cpp) shows how to smooth a single spectrum with a Savitzky Golay filter. Note that the listing instantiates the class SavitzkyGolayFilter rather than SavitzkyGolaySVDFilter or SavitzkyGolayQRFilter. The single raw data spectrum is loaded and resampled to uniform data with a spacing of 0.01 m/z. The frame size of the Savitzky Golay filter is set to 21 data points and the polynomial order is set to 3. Afterwards the filter is applied to the resampled spectrum.

int main(int argc, const char** argv)
{
  if (argc < 2) return 1;
  // the path to the data should be given on the command line
  String tutorial_data_path(argv[1]);

  // load a single raw spectrum
  PeakSpectrum spectrum;
  DTAFile dta_file;
  dta_file.load(tutorial_data_path + "/data/Tutorial_SavitzkyGolayFilter.dta", spectrum);

  // resample to equally spaced data (spacing 0.01 m/z)
  LinearResampler lr;
  Param param_lr;
  param_lr.setValue("spacing", 0.01);
  lr.setParameters(param_lr);
  lr.raster(spectrum);

  // frame size of 21 data points, polynomial order 3
  SavitzkyGolayFilter sg;
  Param param_sg;
  param_sg.setValue("frame_length", 21);
  param_sg.setValue("polynomial_order", 3);
  sg.setParameters(param_sg);

  // smooth the resampled spectrum
  sg.filter(spectrum);

  return 0;
} //end of main

Calibration

OpenMS offers methods for external and internal calibration of raw or peak data.

Internal Calibration

The InternalCalibration uses reference masses for calibration. Each spectrum must contain at least two reference masses; otherwise it is not calibrated. The data to be calibrated can be raw data or already picked data. If the data is raw, a peak picking step is necessary. For the important peak picking parameters, have a look at the Peak picking section.
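One simple way to use reference masses, shown here as a standalone sketch, is to fit a linear correction from the observed m/z of the calibrant peaks to their reference masses and apply it to all other m/z values. This is an assumed model chosen for illustration and not necessarily what InternalCalibration does; LinearFit, fitCalibration and calibrate are hypothetical names. A straight line has two unknowns, which is one way to see why at least two reference masses are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear calibration function: calibrated = slope * observed + intercept.
struct LinearFit
{
  double slope;
  double intercept;
};

// Ordinary least-squares fit of the reference masses against the observed
// m/z values. Requires at least two calibrant peaks with distinct m/z.
LinearFit fitCalibration(const std::vector<double>& observed,
                         const std::vector<double>& reference)
{
  assert(observed.size() == reference.size());
  assert(observed.size() >= 2);
  const double n = static_cast<double>(observed.size());

  // means of both series (centring keeps the sums well conditioned)
  double mean_x = 0.0;
  double mean_y = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i)
  {
    mean_x += observed[i];
    mean_y += reference[i];
  }
  mean_x /= n;
  mean_y /= n;

  double sxx = 0.0;
  double sxy = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i)
  {
    const double dx = observed[i] - mean_x;
    sxx += dx * dx;
    sxy += dx * (reference[i] - mean_y);
  }
  assert(sxx > 0.0); // observed values must not all be identical

  LinearFit fit;
  fit.slope = sxy / sxx;
  fit.intercept = mean_y - fit.slope * mean_x;
  return fit;
}

// Apply the fitted correction to a single m/z value.
double calibrate(const LinearFit& fit, double mz)
{
  return fit.slope * mz + fit.intercept;
}
```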

The following example (Tutorial_InternalCalibration.cpp) shows how to use the InternalCalibration for raw data. First the data and reference masses are loaded.

Then we set the important peak picking parameters and run the internal calibration:

TOF Calibration

The TOFCalibration uses calibrant spectra to convert a spectrum containing time-of-flight (TOF) values into one with m/z values. For the calibrant spectra, the expected masses need to be known, as well as the calibration constants (determined by the instrument) that convert the TOF values of the calibrant spectra into m/z. Using the TOF and m/z values of the calibrant spectra, a quadratic curve is fitted first. The remaining error is then estimated by a spline curve fit. The quadratic function and the splines together determine the calibration equation for the conversion of the experimental data.
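The first step, the quadratic curve fit, can be sketched in standalone form as a least-squares fit of a second-order polynomial. The sketch assumes that m/z is modelled directly as a quadratic function of TOF, which the text above does not specify; it does not use OpenMS, the names QuadraticFit and fitQuadratic are hypothetical, and the spline fit of the remaining error is omitted.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// y = c[0] + c[1] * u + c[2] * u^2 with u = (t - t0) / scale.
// Centring and scaling t keeps the normal equations well conditioned.
struct QuadraticFit
{
  double t0;
  double scale;
  std::array<double, 3> c;

  double operator()(double t) const
  {
    const double u = (t - t0) / scale;
    return c[0] + u * (c[1] + u * c[2]);
  }
};

// Least-squares fit of a quadratic through the points (t[i], y[i]).
QuadraticFit fitQuadratic(const std::vector<double>& t, const std::vector<double>& y)
{
  assert(t.size() == y.size() && t.size() >= 3);
  QuadraticFit f;
  double tmin = t[0], tmax = t[0], sum = 0.0;
  for (double v : t)
  {
    if (v < tmin) tmin = v;
    if (v > tmax) tmax = v;
    sum += v;
  }
  f.t0 = sum / static_cast<double>(t.size());
  f.scale = (tmax > tmin) ? (tmax - tmin) : 1.0;

  // Augmented normal equations: a[r][k] = sum u^(r+k), a[r][3] = sum y * u^r.
  double a[3][4] = {};
  for (std::size_t i = 0; i < t.size(); ++i)
  {
    const double u = (t[i] - f.t0) / f.scale;
    const double p[5] = {1.0, u, u * u, u * u * u, u * u * u * u};
    for (int r = 0; r < 3; ++r)
    {
      for (int k = 0; k < 3; ++k) a[r][k] += p[r + k];
      a[r][3] += y[i] * p[r];
    }
  }

  // Gaussian elimination with partial pivoting.
  for (int col = 0; col < 3; ++col)
  {
    int piv = col;
    for (int r = col + 1; r < 3; ++r)
    {
      if (std::fabs(a[r][col]) > std::fabs(a[piv][col])) piv = r;
    }
    assert(a[piv][col] != 0.0); // needs at least three distinct t values
    for (int k = 0; k < 4; ++k) std::swap(a[col][k], a[piv][k]);
    for (int r = col + 1; r < 3; ++r)
    {
      const double m = a[r][col] / a[col][col];
      for (int k = col; k < 4; ++k) a[r][k] -= m * a[col][k];
    }
  }

  // Back substitution.
  for (int r = 2; r >= 0; --r)
  {
    double s = a[r][3];
    for (int k = r + 1; k < 3; ++k) s -= a[r][k] * f.c[static_cast<std::size_t>(k)];
    f.c[static_cast<std::size_t>(r)] = s / a[r][r];
  }
  return f;
}
```

Once fitted on the calibrant peaks, the returned object can be called like a function to convert a TOF value of the experimental data into an m/z value.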

The following example (Tutorial_TOFCalibration.cpp) shows how to use the TOFCalibration for raw data. First the spectra and reference masses are loaded.

int main(int argc, const char** argv)
{
  if (argc < 2) return 1;
  // the path to the data should be given on the command line
  String tutorial_data_path(argv[1]);

  TOFCalibration ec;

  // load the calibrant spectra and the raw spectra to be calibrated
  PeakMap exp_raw, calib_exp;
  MzMLFile mzml_file;
  mzml_file.load(tutorial_data_path + "/data/Tutorial_TOFCalibration_peak.mzML", calib_exp);
  mzml_file.load(tutorial_data_path + "/data/Tutorial_TOFCalibration_raw.mzML", exp_raw);

  // read the reference masses, one value per line
  vector<double> ref_masses;
  TextFile ref_file;
  ref_file.load(tutorial_data_path + "/data/Tutorial_TOFCalibration_masses.txt", true);
  for (TextFile::ConstIterator iter = ref_file.begin(); iter != ref_file.end(); ++iter)
  {
    ref_masses.push_back(String(iter->c_str()).toDouble());
  }

Then we set the calibration constants for the calibrant spectra.

  // calibration constants of the calibrant spectra
  std::vector<double> ml1;
  ml1.push_back(418327.924993827);
  std::vector<double> ml2;
  ml2.push_back(253.645187196031);
  std::vector<double> ml3;
  ml3.push_back(-0.0414243465397252);
  ec.setML1s(ml1);
  ec.setML2s(ml2);
  ec.setML3s(ml3);

Finally, we set the important peak picking parameters and run the external calibration:

  // peak picking parameter, then pick the calibrant peaks and calibrate
  Param param;
  param.setValue("PeakPicker:peak_width", 0.1);
  ec.setParameters(param);
  ec.pickAndCalibrate(calib_exp, exp_raw, ref_masses);

  return 0;
} //end of main


OpenMS / TOPP release 2.3.0 Documentation generated on Tue Jan 9 2018 18:22:05 using doxygen 1.8.13