TcmYiAnBERT: A Traditional Chinese Medicine Case Pre-training Model Based on Unsupervised Learning

doi:10.3969/j.issn.1673-6036.2023.07.010

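Affiliation: School of Informatics, Hunan University of Chinese Medicine, Changsha 410013, China
Author biography: Hu Wei (胡为), teaching assistant, has published 3 papers; corresponding author: Liu Wei (刘伟), PhD, associate professor.
CLC number: R-058
Funding: Natural Science Foundation of Hunan Province (No. 2022JJ30438); Hunan University of Chinese Medicine University-level Natural Science Foundation (No. 2022XJZKC016); Scientific Research Project of the Hunan Provincial Department of Education (No. 20C1435).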


Abstract:

Purpose/Significance: To fully mine the textual information in traditional Chinese medicine (TCM) medical case records, raise the level of TCM informatization, and improve the accuracy of downstream tasks on TCM medical cases such as symptom term extraction and relation extraction. Method/Process: A large volume of TCM medical case data was collected through optical character recognition (OCR) and web crawling, then preprocessed to build a pre-training dataset for the TCM medical case domain. Using the BERT pre-training method and multiple rounds of training, the authors obtained TcmYiAnBERT, which they describe as the first domain-specific pre-trained model for TCM, and released it as open source. Result/Conclusion: On a TCM named entity recognition (NER) task, using the domain-specific pre-trained model TcmYiAnBERT raised the F1 score by 2.8 percentage points compared with pre-trained models that do not use it.
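
The abstract does not spell out the pre-training objective. As a minimal sketch, assuming the standard masked-language-model objective of the original BERT recipe (about 15% of tokens are selected for prediction; of those, 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged), the masking step might look like the following. The function name `mask_tokens`, the sample sentence, and the toy vocabulary are illustrative and are not taken from the paper.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Apply BERT-style masking and return (masked_tokens, labels).

    labels[i] holds the original token at positions selected for
    prediction and None everywhere else.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK  # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: leave the original token in place
    return masked, labels


# Hypothetical example sentence, split into characters (a common
# tokenization for Chinese BERT vocabularies; assumed here, not confirmed
# by the abstract).
sentence = list("患者舌红苔黄脉数")
vocab = sorted(set(sentence))  # toy vocabulary for illustration only
masked, labels = mask_tokens(sentence, vocab, seed=0)
print(masked)
print(labels)
```

During pre-training, the model would be asked to recover the tokens stored in `labels` from the corrupted sequence in `masked`; no manual annotation is needed, which is why the approach counts as unsupervised.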


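Cite this article

Hu Wei, Liu Wei, Sheng Wei, et al. TcmYiAnBERT: A Traditional Chinese Medicine Case Pre-training Model Based on Unsupervised Learning [J]. Journal of Medical Informatics (医学信息学杂志), 2023, 44(7): 63-67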


History

Last revised: 2022-12-24
Published online: 2023-08-29
