Integrating near-infrared hyperspectral imaging with machine learning and feature selection: Detecting adulteration of extra-virgin olive oil with lower-grade olive oils and hazelnut oil.

Derick Malavi, Katleen Raes, Sam Van Haute
Author Information
  1. Derick Malavi: Department of Food Technology, Safety and Health, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium.
  2. Katleen Raes: Department of Food Technology, Safety and Health, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium.
  3. Sam Van Haute: Department of Food Technology, Safety and Health, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium.

Abstract

Detecting adulteration in extra virgin olive oil (EVOO) is particularly challenging when the adulterant oils have a similar chemical composition. This study applies near-infrared hyperspectral imaging (NIR-HSI) and machine learning (ML) to detect EVOO adulteration with hazelnut, refined olive, and olive pomace oils at various concentrations (1%, 5%, 10%, 20%, 40%, and 100% m/m). Savitzky-Golay filtering, first and second derivatives, multiplicative scatter correction (MSC), standard normal variate (SNV), and their combinations were used to preprocess the spectral data, and Principal Component Analysis (PCA) was used to reduce dimensionality. Classification was performed using Partial Least Squares-Discriminant Analysis (PLS-DA) and ML algorithms, including k-Nearest Neighbors (k-NN), Naïve Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Networks (ANN). PLS-DA, k-NN, RF, SVM, NB, and ANN models achieved accuracy rates of 97.0-99.0%, 96.2-100%, 96.5-100%, 98.6-99.5%, 93.9-99.7%, and 99.2-100%, respectively, in discriminating between pure EVOO, adulterants, and adulterated oils. In binary classification, PLS-DA, RF, SVM, and ANN significantly outperformed NB (p < 0.05), with Matthews correlation coefficient (MCC) values exceeding 0.90. All the binary classifiers except NB, when coupled with SNV/MSC, Savitzky-Golay smoothing, and derivatives, consistently achieved perfect scores (1.0) for accuracy, sensitivity, specificity, F1 score, precision, and MCC in distinguishing pure EVOO from adulterated oils. No significant differences (p > 0.05) in model performance were found between models using full spectra and those based on key variable selection. However, PLS-DA and ANN significantly outperformed k-NN, RF, and SVM (p < 0.05), with MCC values ranging from 0.95 to 1.00, indicating superior classification performance.
These findings demonstrate that combining NIR-HSI with machine learning, along with key variable selection, potentially offers an effective, non-destructive solution for detecting adulteration in EVOO and combating fraud in the olive oil industry.
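
As a rough illustration of two quantities named in the abstract, the sketch below implements standard normal variate (SNV) scaling of a single spectrum and the Matthews correlation coefficient (MCC) for a binary pure-versus-adulterated decision. It is not the authors' code: the abstract does not state which software was used, the function names `snv` and `mcc` are hypothetical, and the toy spectrum and labels are invented for the example rather than taken from the study.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own (sample) standard deviation."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    if sd == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(x - mu) / sd for x in spectrum]


def mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient for a binary problem,
    computed from the confusion-matrix counts."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when a row or column of the
    # confusion matrix is empty and the ratio is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (assumed, not from the study): 1 = adulterated, 0 = pure EVOO.
toy_spectrum = [1.0, 2.0, 3.0]
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0]

print(snv(toy_spectrum))              # [-1.0, 0.0, 1.0]
print(round(mcc(y_true, y_pred), 4))  # 0.7071
```

In the toy labels one adulterated sample is missed, so accuracy is 5/6 while MCC is about 0.71. MCC draws on all four cells of the confusion matrix, which is why it is often reported alongside accuracy when comparing classifiers.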
