Identification of combinations of genomic mutations using genome-wide association studies by example of Mycobacterium tuberculosis
Keywords:
genome-wide association studies, GWAS, single nucleotide polymorphism interaction, drug-resistant tuberculosis
Abstract
Genome-wide association studies (GWAS) play a key role in identifying relationships between genomes and phenotypes. Many studies in this field investigate genetic variations and their interactions in genomes. Despite significant progress, detecting such interactions remains an open problem that requires effective methods and algorithms. This paper proposes four new algorithms that find combinations of single nucleotide polymorphisms (SNPs) associated with phenotypes by analysing SNP interactions in two modes, additive and multiplicative. In the first stage, the algorithms exhaustively search SNP pairs to estimate their association with the phenotype; in the second stage, greedy procedures find combinations of up to five SNPs with the best association values. The approach is tested on a dataset of 3178 Mycobacterium tuberculosis genomes to identify SNP combinations and predict the resistance of Mycobacterium tuberculosis strains to 20 drugs. The results are compared with those of the prediction software systems Mykrobe and TB-Profiler. For the 5 first-line drugs and 1 second-line drug (Ofloxacin), Mykrobe and TB-Profiler slightly exceed the prediction accuracy of the proposed algorithms, but for the other 14 second-line drugs the proposed algorithms are more accurate.
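The two-stage scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the two interaction modes or the association measure, so the sketch assumes that the additive mode predicts the phenotype when at least one SNP of the combination is present (logical OR), that the multiplicative mode requires all of them (logical AND), and that association is scored by balanced accuracy. The function names `combine`, `association`, `best_pair` and `greedy_extend` are hypothetical.

```python
from itertools import combinations


def combine(genotypes, snps, mode):
    """Predict a binary phenotype per sample from a set of SNP columns.

    Assumption: "additive" = any SNP present (OR),
    "multiplicative" = all SNPs present (AND).
    """
    agg = any if mode == "additive" else all
    return [int(agg(row[s] for s in snps)) for row in genotypes]


def association(pred, phenotype):
    """Assumed score: balanced accuracy (mean of sensitivity and specificity).

    Requires at least one positive and one negative sample.
    """
    tp = sum(p == 1 and y == 1 for p, y in zip(pred, phenotype))
    tn = sum(p == 0 and y == 0 for p, y in zip(pred, phenotype))
    pos = sum(phenotype)
    neg = len(phenotype) - pos
    return 0.5 * (tp / pos + tn / neg)


def best_pair(genotypes, phenotype, mode):
    """Stage 1: exhaustive search over all SNP pairs."""
    n_snps = len(genotypes[0])
    return max(
        combinations(range(n_snps), 2),
        key=lambda pair: association(combine(genotypes, pair, mode), phenotype),
    )


def greedy_extend(genotypes, phenotype, mode, max_size=5):
    """Stage 2: greedily add the SNP that most improves the score."""
    n_snps = len(genotypes[0])
    combo = list(best_pair(genotypes, phenotype, mode))
    score = association(combine(genotypes, combo, mode), phenotype)
    while len(combo) < max_size:
        candidates = [s for s in range(n_snps) if s not in combo]
        if not candidates:
            break
        scored = [
            (association(combine(genotypes, combo + [s], mode), phenotype), s)
            for s in candidates
        ]
        new_score, best_snp = max(scored, key=lambda item: item[0])
        if new_score <= score:  # stop when no candidate improves the score
            break
        combo.append(best_snp)
        score = new_score
    return combo, score


# Toy data (fabricated for illustration only): 6 samples x 4 SNPs,
# 1 = variant present; phenotype 1 = resistant.
toy_genotypes = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
toy_phenotype = [1, 1, 1, 0, 0, 0]

for mode in ("additive", "multiplicative"):
    print(mode, greedy_extend(toy_genotypes, toy_phenotype, mode))
```

The exhaustive first stage evaluates every SNP pair, so its cost grows quadratically with the number of SNPs, while each greedy step in the second stage is only linear. That difference is one plausible reason for limiting exhaustive search to pairs and extending combinations greedily.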
License
Authors who publish in this journal agree to the following:
- The authors retain copyright on the work and grant the journal the right of first publication under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
- The authors retain the right to enter into separate contractual agreements for the non-exclusive distribution of the published version of the work (e.g. posting it in an institutional repository or publishing it in a book), with reference to its original publication in this journal.
- The authors have the right to post their work on the Internet (e.g. in an institutional repository or on a personal website) prior to and during the journal's review process, as this may lead to productive discussion and more references to the work.