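Study on Chinese Electronic Medical Record Entity Mapping Method by Fusing Similarity Algorithms and Pre-trained Models |
Revision date: 2023-02-20 |
Cite this article: 冯凤翔, 任慧玲, 李晓瑛, et al. 融合相似度算法与预训练模型的中文电子病历实体映射方法研究 [J]. 医学信息学杂志 (Journal of Medical Informatics), 2023, 44(5): 45-50 |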
|
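Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, sub-project "Construction of a Chinese Medical Terminology System" (grant No. 2020AAA0104901). |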
|
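Abstract (translated from the Chinese): A self-annotated standard dataset of Chinese electronic medical records is used. Similarity algorithms and pre-trained models are fused and applied, respectively, to the candidate entity generation stage and the entity disambiguation stage of entity mapping, and the performance of different similarity algorithms and pre-trained models is compared. A method based on similarity between aliases is proposed to improve the mapping of drug entities, and the Jaccard similarity algorithm is combined with the BERT pre-trained model to carry out entity mapping efficiently over massive volumes of Chinese electronic medical records. |
Keywords (translated from the Chinese): entity mapping; entity standardization; similarity algorithm; electronic medical record; BERT model |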
|
Study on Chinese Electronic Medical Record Entity Mapping Method by Fusing Similarity Algorithms and Pre-trained Models |
|
|
Abstract: Using a self-annotated standard dataset of Chinese electronic medical records (EMRs), this study fuses similarity algorithms with pre-trained models, applying them to the candidate entity generation and entity disambiguation stages of entity mapping respectively, and compares the performance of different similarity algorithms and pre-trained models. It proposes a method that uses similarity between aliases to improve the mapping of drug entities, and combines the Jaccard similarity algorithm with the BERT pre-trained model to map entities in large volumes of Chinese EMRs efficiently. |
Keywords: entity mapping; entity standardization; similarity algorithm; electronic medical record (EMR); BERT model |
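To make the candidate entity generation stage concrete, the following is a minimal sketch of Jaccard-based candidate retrieval. It is not the authors' implementation: the abstract does not say whether similarity is computed over characters or words, so character-level sets are an assumption here, and the function names, the `top_k` parameter, and the small term list are hypothetical. The BERT-based disambiguation stage, which would re-rank these candidates, is not shown.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Character-level Jaccard similarity: |A ∩ B| / |A ∪ B|.

    Assumption: each string is treated as a set of characters.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Both strings are empty; define the similarity as 0.
        return 0.0
    return len(set_a & set_b) / len(union)


def generate_candidates(
    mention: str, vocabulary: List[str], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k standard terms ranked by Jaccard similarity."""
    scored = [(term, jaccard(mention, term)) for term in vocabulary]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical standard terms, for illustration only.
STANDARD_TERMS = [
    "2型糖尿病",
    "1型糖尿病",
    "高血压",
    "冠状动脉粥样硬化性心脏病",
]

# A mention as it might appear in a record, with the word order varied.
candidates = generate_candidates("糖尿病2型", STANDARD_TERMS, top_k=2)
for term, score in candidates:
    print(f"{term}\t{score:.3f}")
```

Because set-based Jaccard ignores character order, a reordered mention such as "糖尿病2型" still scores highest against "2型糖尿病", while the near-miss "1型糖尿病" is kept as a second candidate. Separating such near-identical candidates is the kind of case a disambiguation model would need to resolve.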