Ensemble learning based on super learner algorithm and its application in prediction modeling for longitudinal censored data
Objective: Ensemble learning has recently emerged in machine learning as an approach to improving predictive accuracy. This paper introduces the application of an ensemble learning method based on the super learner algorithm to prediction modeling of longitudinal censored data, together with its implementation in R.
Methods: The paper describes the principle of modeling longitudinal censored data with the super learner algorithm and how to implement it in R. Tumor survival data from the TCGA database were then analyzed to illustrate the method's performance in practice.
Results: With the super learner algorithm, both the estimation of model parameters and the specification of the ensemble learning parameters are flexible. In the real-data analysis, the super learner algorithm made full use of the available data to build the prediction model. The model achieved a prediction accuracy of 0.8737 (95% CI: 0.7897-0.9330) and a C-index of 0.883, indicating good predictive performance.
Conclusion: Ensemble learning with the super learner algorithm offers a new option for prediction analysis of longitudinal censored data.
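The C-index reported in the Results is a concordance measure: among pairs of subjects whose ordering in time can be determined despite censoring, it is the fraction in which the subject with the higher predicted risk experiences the event first. The following is a minimal Python sketch of Harrell's C-index for right-censored data. It is not the paper's code (the paper's analysis was carried out in R); the function name `concordance_index`, the convention that a higher score means higher risk, the choice to skip pairs with tied times, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- predicted risk; higher is assumed to mean earlier event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is also not comparable
        # when the earlier subject was censored, since the true order of
        # the two events is then unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the paper.
times = [5, 8, 12, 3, 9]
events = [1, 0, 1, 1, 0]
risk_scores = [0.8, 0.3, 0.2, 0.9, 0.4]
print(concordance_index(times, events, risk_scores))
```

A value of 0.5 corresponds to uninformative predictions and 1.0 to perfect ranking, which is the scale on which the reported 0.883 should be read.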