
Risk factors for cardiometabolic multimorbidity based on LASSO regression and random forest algorithms
Abstract
Objective To identify risk factors for cardiometabolic multimorbidity (CMM) using LASSO regression and the random forest algorithm, and to provide a basis for clinical decision-making.
Methods Data came from 14 358 participants aged ≥45 years who were followed up from 2011 to 2020 in the China Health and Retirement Longitudinal Study (CHARLS). Variables were screened using LASSO regression and random forest feature importance. Participants were then randomly divided into a training set and a test set at a ratio of 8:2. The synthetic minority over-sampling technique (SMOTE) was applied to balance the training set, a disease prediction model was built with the random forest algorithm, and the model was tuned using grid search with 5-fold cross-validation. Sensitivity analysis was conducted to assess the model's robustness.
Results The prediction model achieved an accuracy of 99.46%, a recall of 69.03%, an F1 score of 0.82, and a mean area under the curve of 0.93. Sensitivity analysis indicated good robustness. Gender, age, waist circumference, occupation, education level, fasting blood glucose, unhealthy lifestyle behaviors, mobility, baseline self-reported diseases, wind speed, and use of unclean energy were identified as predictors of incident CMM (all P<0.05).
Conclusions A prediction model for CMM was constructed, and multiple risk factors were found to be associated with the occurrence of CMM, providing a scientific basis for clinicians to implement early intervention in groups at high risk of CMM.
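The Methods describe an 8:2 random split followed by SMOTE applied to the training set only. The following is a minimal standard-library sketch of those two steps, not the authors' implementation. It assumes binary labels with 1 as the minority (CMM) class; the function names, the toy data, and the choice of k=5 neighbours are hypothetical. The LASSO screening, random forest fitting, and grid search steps are omitted, and in practice a library implementation of SMOTE (for example in imbalanced-learn) would normally be used.

```python
import math
import random


def train_test_split_80_20(rows, labels, seed=0):
    """Randomly split paired rows/labels into 80% training and 20% test."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def smote(X, y, minority=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Each synthetic point lies on the segment between a minority sample
    and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(X) - len(minority_rows)) - len(minority_rows)
    if n_needed <= 0 or len(minority_rows) < 2:
        return list(X), list(y)  # already balanced, or too few to interpolate
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of `base`, excluding base itself
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: math.dist(base, r),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return list(X) + synthetic, list(y) + [minority] * n_needed


# Toy imbalanced data (hypothetical): 180 majority rows, 20 minority rows.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(180)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]
y = [0] * 180 + [1] * 20

X_train, y_train, X_test, y_test = train_test_split_80_20(X, y, seed=1)
X_bal, y_bal = smote(X_train, y_train)  # the test set is left untouched
```

Oversampling after the split, and only on the training portion, keeps synthetic points out of the test set, so the reported test metrics reflect the original class distribution.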
Authors ZHANG Shuying (张书迎); XU Shan (许珊); TAN Yanfang (谭艳芳); LING Kexin (凌可欣); LI Yuan (李元); LIU Xiangtong (刘相佟) (School of Public Health, Capital Medical University, Beijing 100069, China; Beijing Municipal Key Laboratory of Clinical Epidemiology, Beijing 100069, China)
Source Chinese Journal of Disease Control & Prevention (《中华疾病控制杂志》), Peking University Core Journal, 2025, Issue 1, pp. 82-88 (7 pages)
Funding National Natural Science Foundation of China (82003559); Ministry of Education "Chunhui Program" Cooperative Research Project (HZKY20220056).
Keywords Cardiometabolic multimorbidity; LASSO regression; Random forest algorithm; Synthetic minority over-sampling technique