WiPP: Workflow for improved Peak Picking for Gas Chromatography-Mass Spectrometry (GC-MS) data

Borgsmüller, N.; Gloaguen, Y.; Opialla, T.; Blanc, E.; Sicard, E.; Royer, A.-L.; Le Bizec, B.; Durand, S.; Migne, C.; Petera, M.; Pujos-Guillot, E.; Giacomoni, F.; Guitton, Y.; Beule, D.; Kirwan, J.

Abstract

The lack of reliable peak detection impedes automated analysis of large-scale GC-MS metabolomics datasets. The performance and outcome of individual peak-picking algorithms can differ widely depending on the algorithmic approach, the parameters, and the data acquisition method. Comparing algorithms is thus difficult. Here we present a workflow for improved peak picking (WiPP), a parameter-optimising, multi-algorithm peak detection workflow for GC-MS metabolomics. WiPP evaluates the quality of detected peaks using a machine learning-based classification scheme based on seven peak classes. The quality information returned by the classifier for each individual peak is merged with results from different peak detection algorithms to create one final high-quality peak set for immediate downstream analysis. Medium- and low-quality peaks are kept for further inspection. By applying WiPP to standard compound mixes and a complex biological dataset, we demonstrate that peak detection is improved through a novel way of assigning peak quality, automated parameter optimisation, and integration of results across different embedded peak-picking algorithms. Furthermore, our approach can provide an impartial performance comparison of different peak-picking algorithms. WiPP is freely available on GitHub (https://github.com/bihealth/WiPP) under the MIT licence.
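
The merging step described above can be illustrated with a minimal sketch. This is not WiPP's implementation: the `Peak` type, the `merge_peaks` function, the retention-time tolerance, and the three quality tiers standing in for the seven classifier classes are all assumptions made for illustration. It groups peaks reported by different algorithms at nearly the same retention time, keeps the best-rated peak of each group, and separates high-quality peaks from those kept for inspection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    algorithm: str  # hypothetical name of the peak picker that reported the peak
    rt: float       # retention time in seconds
    quality: str    # "high", "medium" or "low" (illustrative tiers, not WiPP's classes)


# Higher rank means better quality.
QUALITY_RANK = {"high": 2, "medium": 1, "low": 0}


def _assign(group, final, inspect):
    """Keep the best-rated peak of a group and route it by quality."""
    best = max(group, key=lambda p: QUALITY_RANK[p.quality])
    if best.quality == "high":
        final.append(best)
    else:
        inspect.append(best)


def merge_peaks(peaks, rt_tolerance=1.0):
    """Merge peaks from several algorithms (assumed tolerance-based grouping).

    Peaks within rt_tolerance seconds of the first peak in a group are
    treated as the same peak. Returns (final, inspect): the high-quality
    set and the medium/low-quality peaks kept for further inspection.
    """
    final, inspect = [], []
    group = []
    for peak in sorted(peaks, key=lambda p: p.rt):
        if group and peak.rt - group[0].rt > rt_tolerance:
            _assign(group, final, inspect)
            group = []
        group.append(peak)
    if group:
        _assign(group, final, inspect)
    return final, inspect


peaks = [
    Peak("A", 100.0, "medium"),
    Peak("B", 100.4, "high"),
    Peak("A", 250.0, "low"),
]
final, inspect = merge_peaks(peaks)
```

In this example the two peaks near 100 s are treated as one, and the high-quality report from algorithm B is kept in the final set, while the isolated low-quality peak at 250 s is set aside for inspection.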
