Study on Entity Recognition of Liver Cancer Electronic Medical Records Based on RoBERTa-CRF |
Revised: 2023-03-18 |
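Cite this article: Deng Jiale (邓嘉乐), Hu Zhensheng (胡振生), Lian Wanmin (连万民), et al. Study on entity recognition of liver cancer electronic medical records based on RoBERTa-CRF[J]. Journal of Medical Informatics, 2023, 44(6): 42-47 |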
|
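Funding: National Key Research and Development Program of China (grant no. 2021YFC2009402); National Key Research and Development Program of China (grant no. 2022YFC3601600); Natural Science Foundation of Guangdong Province (grant no. 2021A1515011897). |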
|
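Abstract (translated from the Chinese): Purpose/Significance Liver cancer electronic medical records contain a large amount of specialized medical knowledge, most of it in unstructured form and therefore difficult to extract automatically. Research on entity recognition in liver cancer electronic medical records supports the construction of clinical decision support systems and medical knowledge graphs for liver cancer. Method/Process A named entity recognition model combining the RoBERTa algorithm with the CRF algorithm is built, and real, self-annotated liver cancer electronic medical record data are used to train and test it. Result/Conclusion The RoBERTa-CRF model outperforms the other baseline models and achieves good entity recognition performance. |
Keywords (translated from the Chinese): liver cancer electronic medical records; entity recognition; knowledge extraction; RoBERTa-CRF model; natural language processing |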
|
Study on Entity Recognition of Liver Cancer Electronic Medical Records Based on RoBERTa-CRF |
|
|
Abstract: Purpose/Significance Electronic medical records (EMRs) of liver cancer contain a large amount of medical knowledge, most of it in unstructured form and therefore difficult to extract automatically. Research on entity recognition in liver cancer EMRs is important for building clinical decision support systems and medical knowledge graphs in the area of liver cancer. Method/Process A named entity recognition (NER) model combining the RoBERTa algorithm with the conditional random field (CRF) algorithm is built, and real, self-labeled liver cancer EMR data are used to train and test it. Result/Conclusion The RoBERTa-CRF model outperforms the other baseline models and achieves good entity recognition performance. |
Keywords: electronic medical records of liver cancer; entity recognition; knowledge extraction; RoBERTa-CRF model; natural language processing (NLP) |
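
To illustrate what the CRF component contributes on top of per-token encoder scores, the following is a minimal sketch of Viterbi decoding, the standard inference step for a linear-chain CRF. It is not the authors' implementation. The tag set (`O`, `B-DIS`, `I-DIS`), the emission scores, and the transition scores are hypothetical values chosen for illustration; in a RoBERTa-CRF model the emissions would come from the RoBERTa encoder and the transitions would be learned during training.

```python
from typing import List, Sequence

# Hypothetical BIO tag set for illustration only; the paper's label scheme is not given here.
TAGS = ["O", "B-DIS", "I-DIS"]


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (assumed to come from the encoder)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []  # best previous tag for each (token, tag)

    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backpointers.append(ptrs)

    # Follow the backpointers from the best final tag to recover the path.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backpointers):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path


# Illustrative scores: a large penalty forbids the invalid transition O -> I-DIS.
NEG = -1e4
transitions = [
    [0.0, 0.0, NEG],  # from O
    [0.0, 0.0, 0.0],  # from B-DIS
    [0.0, 0.0, 0.0],  # from I-DIS
]
emissions = [
    [2.0, 0.5, 0.1],  # token 0: prefers O
    [0.2, 1.0, 1.5],  # token 1: prefers I-DIS when taken in isolation
    [0.1, 0.3, 2.0],  # token 2: prefers I-DIS
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)  # ['O', 'B-DIS', 'I-DIS']
```

Choosing the best tag for each token independently would give `O, I-DIS, I-DIS` here, which is not a valid BIO sequence. Decoding jointly with transition scores yields `O, B-DIS, I-DIS` instead, which is the usual motivation for placing a CRF after the encoder.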