Analysis and Prediction of Risk Factors of Type 2 Diabetes Mellitus Based on Machine Learning

Affiliation:

(1. Shaanxi Provincial Hospital of Traditional Chinese Medicine, Xi’an 710004, China; 2. School of Management, Northwestern Polytechnical University, Xi’an 710129, China)

About the author:

Cheng Luping (成路平), attending physician, has published 6 papers. Corresponding author: Lu Bo (路波).

CLC number:

R-058

Fund Project:

Shaanxi Province High-Level Key Discipline of Traditional Chinese Medicine (TCM Endocrinology) (Project No. SX2YY2DXK-2024006); Shaanxi Province Health Research and Innovation Capacity Improvement Program (Project No. 2025YF-41).

Abstract:

Purpose/Significance To construct a multidimensional data mining prediction framework that improves the accuracy of type 2 diabetes mellitus (T2DM) risk prediction and the efficiency of clinical decision-making. Method/Process Based on the Pima dataset, univariate, bivariate, and multivariate analyses are conducted to screen core risk factors. Five machine learning models, namely logistic regression, random forest, support vector machine, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM), are used for modeling, with hyperparameters tuned by grid search and cross-validation. Result/Conclusion The core risk factors identified, including blood glucose level, body mass index, and age, are consistent with conclusions from traditional evidence-based medicine. Random forest achieves a prediction accuracy of 0.8701 and the best overall performance. Data mining and feature selection reduce the cost of data collection, shorten the cycle of risk factor identification, and reveal nonlinear interactions among variables, providing an efficient tool for general screening of high-risk groups in the community.
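The tuning step named in the abstract, grid search combined with cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, a small hand-written logistic regression, and synthetic two-feature data in place of the Pima dataset. The data generator, the grid values, and every function name below are illustrative assumptions, and the other four models are not reproduced.

```python
import math
import random
from itertools import product


def make_synthetic(n=200, seed=0):
    """Synthetic two-feature data (an assumption; NOT the Pima dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        center = 1.5 if label == 1 else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, lr, l2, epochs=100):
    """Batch gradient descent for L2-regularised logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the label."""
    w, b = model
    hits = sum(
        int((sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0) == (yi == 1))
        for xi, yi in zip(X, y)
    )
    return hits / len(y)


def kfold_indices(n, k, seed=0):
    """Shuffle indices once and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_score(X, y, params, k=5):
    """Mean held-out accuracy over k folds for one parameter setting."""
    scores = []
    for fold in kfold_indices(len(X), k):
        held = set(fold)
        X_train = [x for i, x in enumerate(X) if i not in held]
        y_train = [t for i, t in enumerate(y) if i not in held]
        model = fit_logreg(X_train, y_train, **params)
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k


def grid_search(X, y, grid, k=5):
    """Try every combination in the grid; keep the best mean CV accuracy."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_score(X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


X, y = make_synthetic()
GRID = {"lr": [0.05, 0.5], "l2": [0.0, 0.1]}  # illustrative values only
best_params, best_score = grid_search(X, y, GRID)
print(best_params, round(best_score, 3))
```

Each candidate setting is scored only on held-out folds, so the selected hyperparameters are not chosen on the same samples used to fit the model. In practice a library implementation of grid search and of the five models would likely replace these hand-written helpers.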

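Cite this article

Cheng Luping, Wu Siyang, Lu Bo. Analysis and Prediction of Risk Factors of Type 2 Diabetes Mellitus Based on Machine Learning[J]. Journal of Medical Informatics (医学信息学杂志), 2025, 46(9): 17-24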

History
  • Last revised: 2025-08-06
  • Published online: 2025-10-16
