Volume 44 Issue 5
Sep. 2021
QI Qiaona, LIU Yan, CHEN Jihui, LIU Xinzhu, YANG Rui, ZHANG Jinyuan, CUI Mengxuan, XIE Yimeng, WANG Zeyuan, YU Ze, GAO Fei, ZHANG Jian. Research progress on machine learning XGBoost algorithm in medicine[J]. Journal of Molecular Imaging, 2021, 44(5): 856-862. doi: 10.12122/j.issn.1674-4500.2021.05.25

Research progress on machine learning XGBoost algorithm in medicine

doi: 10.12122/j.issn.1674-4500.2021.05.25
  • Received Date: 2021-08-05
  • Publish Date: 2021-09-20
  • The XGBoost algorithm was first proposed in 2014. Built on the boosting framework, it has demonstrated strong performance and ease of use in many data science competitions. Classification and regression models based on XGBoost are now widely used for data analysis in health care, finance, education, manufacturing, and other fields. In medicine, XGBoost has been applied to disease diagnosis; prediction of risk, outcomes, and prognosis; rational and safe drug use; and drug research and development. It offers promising ways to improve the efficiency and quality of decision making and to reduce the false-positive rate. XGBoost also learns a default split direction for missing values automatically, and it can model nonlinear effects in large data sets with high efficiency and accuracy.
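
The handling of missing values mentioned in the abstract can be illustrated with a minimal sketch. It is not the library's implementation: it assumes a single candidate threshold, a squared-error loss with initial predictions of 0, and an L2 regularization term `lam`, and it omits the complexity penalty. The function names and toy data are hypothetical. For one split, rows with a missing feature are sent left and then right, and the direction with the larger gain is kept as the default.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of a split from summed gradients (g) and Hessians (h) on each side."""
    def score(g, h):
        return g * g / (h + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent)


def best_default_direction(values, grads, hess, threshold, lam=1.0):
    """Return ('left' or 'right', gain) for rows whose feature value is None."""
    gl = hl = gr = hr = gm = hm = 0.0
    for x, g, h in zip(values, grads, hess):
        if x is None:            # missing: decide the branch afterwards
            gm += g
            hm += h
        elif x < threshold:      # observed and below the threshold
            gl += g
            hl += h
        else:                    # observed and at or above the threshold
            gr += g
            hr += h
    gain_left = split_gain(gl + gm, hl + hm, gr, hr, lam)
    gain_right = split_gain(gl, hl, gr + gm, hr + hm, lam)
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right


# Toy data (hypothetical): the two rows with a missing value have the same
# label as the rows above the threshold.
values = [1.0, 2.0, None, 8.0, 9.0, None]
labels = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
grads = [0.0 - y for y in labels]   # squared error, prediction 0: g = pred - y
hess = [1.0] * len(labels)          # squared error: h = 1

direction, gain = best_default_direction(values, grads, hess, threshold=5.0)
print(direction, round(gain, 3))
```

In this toy case the rows with missing values resemble the upper branch, so sending them right yields the larger gain. A full tree learner would repeat this comparison for every candidate threshold and feature.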

