Classification of the geographical origin of medicinal herbs using multivariate spectral analysis

Authors

  • Polina S. Kolodochka, B. I. Stepanov Institute of Physics, National Academy of Sciences of Belarus, 68 Niezaliezhnasci Avenue, Minsk 220072, Belarus
  • Mikhail A. Khodasevich, B. I. Stepanov Institute of Physics, National Academy of Sciences of Belarus, 68 Niezaliezhnasci Avenue, Minsk 220072, Belarus

Keywords:

spectral analysis, principal component analysis, classification and regression tree, spectral variable selection, medicinal herbs

Abstract

The geographical origin and manufacturer of medicinal herbs were classified by multivariate analysis of the optical density spectra of 70 % alcohol tinctures in the wavelength range 230–2600 nm, using chamomile from Russia and Belarus as an example. The models were built using principal component analysis, the classification and regression tree method, and spectral variable selection. Principal component analysis significantly reduces the dimension of the feature space, and the classification and regression trees are constructed in this reduced space. The maximum number of principal components considered was limited to 10, which was sufficient to describe more than 0.999 of the total variance of the measured spectra. Classification and regression trees with tenfold cross-validation classify the country of origin of the samples in a four-dimensional space, and the manufacturer in a three-dimensional space, of the principal components of the broadband optical density spectra with an accuracy of more than 0.93. Ranking the spectral variables in decreasing order of the absolute value of the mean deviation of optical density from the mean value improved the accuracy of the classification models. A reliable classification of the geographical origin of chamomile is achieved in the space of principal components of 20 variables out of the 2623 available in the broadband spectra. The accuracy of manufacturer classification was improved to 0.94 by selecting 14 spectral variables.
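
The variable-ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state exactly how the deviation is computed, so the code assumes a two-class problem and scores each spectral variable by the absolute difference between the mean optical density of one class and the mean over all samples. The function names rank_variables and select_top, the class labels, and the toy numbers are hypothetical and are not taken from the article.

```python
from statistics import fmean


def rank_variables(spectra, labels, target):
    """Rank variable indices by decreasing |class mean - overall mean|.

    Assumption (not from the article): the score of a variable is the
    absolute deviation of the mean optical density of class `target`
    from the mean over all samples.
    spectra: list of equal-length lists, one spectrum per sample.
    labels:  list of class labels, one per sample.
    """
    n_vars = len(spectra[0])
    in_class = [row for row, lab in zip(spectra, labels) if lab == target]
    scores = []
    for j in range(n_vars):
        overall = fmean(row[j] for row in spectra)
        class_mean = fmean(row[j] for row in in_class)
        scores.append(abs(class_mean - overall))
    # Largest score first; the sort is stable, so ties keep index order.
    return sorted(range(n_vars), key=lambda j: scores[j], reverse=True)


def select_top(spectra, order, k):
    """Keep only the k highest-ranked variables of every spectrum."""
    keep = order[:k]
    return [[row[j] for j in keep] for row in spectra]


# Toy data: 4 samples, 3 spectral variables (illustrative values only).
spectra = [
    [0.10, 0.50, 0.90],
    [0.12, 0.52, 0.70],
    [0.11, 0.80, 0.30],
    [0.13, 0.82, 0.10],
]
labels = ["BY", "BY", "RU", "RU"]

order = rank_variables(spectra, labels, "BY")
reduced = select_top(spectra, order, 2)
print(order)
print(reduced[0])
```

In a workflow like the one summarized above, the reduced spectra would then be passed to principal component analysis and a cross-validated classification tree, and the number of retained variables would be chosen by comparing the resulting accuracies.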

Author Biographies

  • Polina S. Kolodochka, B. I. Stepanov Institute of Physics, National Academy of Sciences of Belarus, 68 Niezaliezhnasci Avenue, Minsk 220072, Belarus

    junior researcher at the centre «Diagnostic systems»

  • Mikhail A. Khodasevich, B. I. Stepanov Institute of Physics, National Academy of Sciences of Belarus, 68 Niezaliezhnasci Avenue, Minsk 220072, Belarus

    doctor of science (physics and mathematics), docent; chief researcher at the centre «Diagnostic systems»



Published

2024-09-23

How to Cite

Kolodochka, P. S.; Khodasevich, M. A. Classification of the Geographical Origin of Medicinal Herbs Using Multivariate Spectral Analysis. Журнал Белорусского государственного университета. Физика 2024, No. 3, 10–16.