

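1. School of Mathematics and Computational Science, Hunan University of Science and Technology, Xiangtan 411201, Hunan, China
2. Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
3. Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China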
Received: 08 March 2024
Published Online: 03 June 2024
Published: 20 December 2024
YANG Hang, PENG Yehui, YANG Wei, et al. Entity Recognition in Famous Medical Records Based on BRL Neural Network Model [J]. Chinese Journal of Experimental Traditional Medical Formulae, 2024, 30(24): 167-173. DOI: 10.13422/j.cnki.syfjx.20241165.
Objective
To improve the recognition accuracy of named entities in medical record texts and enable effective mining and use of medical record knowledge, a Bert-Radical-Lexicon (BRL) neural network model tailored to the characteristics of medical record texts was constructed to recognize medical record entities.
Method
We selected 408 medical records related to hypertension from the Complete Library of Famous Medical Records of Chinese Dynasties (《中华历代名医医案全库》) and, through manual annotation, constructed a dataset of 1 672 medical record corpus entries. We then randomly divided the dataset into three subsets: a training set (1 004 entries), a testing set (334 entries), and a validation set (334 entries). On this basis, we built a BRL model that fuses multiple text features of medical records, its variants BRL-B, BRL-L, and BRL-R, and a baseline model, Base. Each model was trained on the training set. To reduce the risk of overfitting, we continuously monitored each model's performance on the validation set during training and saved the best-performing model. Finally, we evaluated the models on the testing set.
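The split and the checkpoint selection described above can be illustrated with a short sketch. It is not the authors' code: the function names `split_corpus` and `best_epoch`, the fixed seed, the stand-in integer IDs, and the example validation scores are all assumptions made for illustration. Only the subset sizes (1 004 / 334 / 334) come from the abstract.

```python
import random


def split_corpus(entries, n_train=1004, n_test=334, n_val=334, seed=0):
    """Shuffle the entries and cut them into train/test/validation subsets.

    The sizes default to those reported in the abstract; the seed is an
    arbitrary choice made here so the split is reproducible.
    """
    if n_train + n_test + n_val != len(entries):
        raise ValueError("subset sizes must sum to the corpus size")
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


def best_epoch(val_scores):
    """Return the index of the epoch with the highest validation score.

    On ties the earliest epoch wins, which mirrors keeping the first
    checkpoint that reached the best validation performance.
    """
    return max(range(len(val_scores)), key=lambda i: val_scores[i])


# Stand-in IDs for the 1 672 annotated entries (hypothetical data).
corpus = list(range(1672))
train, test, val = split_corpus(corpus)

# Hypothetical per-epoch validation F1 values; the checkpoint from the
# selected epoch is the one that would be evaluated on the testing set.
selected = best_epoch([0.81, 0.88, 0.86])
```

Selecting the checkpoint on the validation set and reporting only on the testing set keeps the reported scores independent of the model-selection step.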
Result
Compared with the other models, the BRL model performed best on the medical record named entity recognition task. Across eight entity types (disease, symptom, tongue manifestation, pulse condition, syndrome, method of treatment, prescription, and traditional Chinese medicine (TCM)), its overall precision was 90.09%, its recall was 90.61%, and its F1 (the harmonic mean of precision and recall) was 90.35%. Compared with the Base model, the BRL model improved the overall F1 of entity recognition by 5.22%; the F1 for pulse condition entities increased by 6.92%, the largest gain among the entity types.
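As a quick consistency check on the reported figures, F1 can be recomputed from the stated precision and recall. The helper name `f1_score` is an assumption made for this sketch; the formula is the standard harmonic mean.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall reported for the BRL model, in percent.
overall_f1 = f1_score(90.09, 90.61)
print(f"{overall_f1:.2f}")  # prints 90.35, matching the reported F1
```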
Conclusion
By incorporating multiple medical record text features in the embedding layer, the BRL neural network model achieves stronger named entity recognition and can thus extract more accurate and reliable TCM clinical information.
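The abstract does not say how the features are combined in the embedding layer. One common way to fuse per-character features is to concatenate them, which is sketched below as an assumption rather than as the authors' method. The function names, the vector sizes, and the toy values are hypothetical; the three feature sources (BERT, radical, lexicon) are inferred only from the model name.

```python
def fuse_features(bert_vec, radical_vec, lexicon_vec):
    """Concatenate three feature vectors for one character into one vector."""
    return list(bert_vec) + list(radical_vec) + list(lexicon_vec)


def fuse_sequence(bert_seq, radical_seq, lexicon_seq):
    """Fuse features character by character for a whole sentence."""
    if not (len(bert_seq) == len(radical_seq) == len(lexicon_seq)):
        raise ValueError("all feature sequences must cover the same characters")
    return [
        fuse_features(b, r, l)
        for b, r, l in zip(bert_seq, radical_seq, lexicon_seq)
    ]


# Toy two-character input with made-up vectors: 4 contextual dimensions,
# 2 radical dimensions, and 3 lexicon dimensions per character.
bert_seq = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
radical_seq = [[1.0, 0.0], [0.0, 1.0]]
lexicon_seq = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

fused = fuse_sequence(bert_seq, radical_seq, lexicon_seq)
```

Under this assumption, each character is represented by a 9-dimensional vector that a downstream sequence tagger would consume in place of a single embedding.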