Abstract
Objective: To develop and evaluate TCMLLM-PR, a fine-tuned large language model (LLM) for traditional Chinese medicine (TCM) prescription recommendation.
Methods: First, we constructed an instruction-tuning dataset of 68,654 samples (approximately 10 million tokens) by integrating data from eight sources, including four TCM textbooks, the Pharmacopoeia of the People's Republic of China 2020 (CHP), Chinese Medicine Clinical Cases (CMCC), and hospital clinical records covering lung disease, liver disease, stroke, diabetes, and spleen-stomach disease. We then trained TCMLLM-PR by fine-tuning ChatGLM-6B with P-Tuning v2. The evaluation covered three aspects: (i) comparison with traditional prescription recommendation models (PTM, TCMPR, and PresRecST); (ii) comparison with TCM-specific LLMs (ShenNong, Huatuo, and HuatuoGPT) and the general-domain ChatGPT; and (iii) assessment of the model's transfer capability across datasets for different diseases. We used precision, recall, and F1 score, which are commonly used in prescription recommendation tasks, as evaluation metrics.
Results: TCMLLM-PR significantly outperformed the baseline models on the TCM textbook and CHP datasets, with F1@10 improvements of 31.80% and 59.48%, respectively. In cross-dataset validation, the model performed best when transferring from the TCM textbooks to the liver disease dataset, achieving an F1@10 of 0.1551. Analysis of real-world cases showed that the prescriptions recommended by TCMLLM-PR most closely matched those written by the treating doctors.
Conclusion: This study integrated LLMs into TCM prescription recommendation by building a tailored instruction-tuning dataset and developing TCMLLM-PR. The best model parameters of TCMLLM-PR will be publicly released to support the development of clinical decision-making in TCM practice (https://github.com/2020MEAI/TCMLLM).
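For readers unfamiliar with the top-k metrics reported above, the following is a minimal sketch of how precision, recall, and F1 at a cutoff k (for example, F1@10) are commonly computed for a single prescription. It assumes the usual set-overlap definition between the top-k recommended herbs and the herbs in the reference prescription; the abstract does not spell out the exact formulation used in the study, so the function name `prf_at_k`, the choice of k as the precision denominator, and the herb names are illustrative assumptions rather than the authors' implementation.

```python
from typing import Sequence, Set, Tuple


def prf_at_k(
    ranked_herbs: Sequence[str],
    true_herbs: Set[str],
    k: int = 10,
) -> Tuple[float, float, float]:
    """Return (precision@k, recall@k, F1@k) for one prescription.

    Assumption: metrics are computed from the overlap between the
    top-k recommended herbs and the reference prescription.
    """
    # Drop repeated herbs while keeping the ranking order, then cut at k.
    top_k = list(dict.fromkeys(ranked_herbs))[:k]
    hits = sum(1 for herb in top_k if herb in true_herbs)

    # Assumption: precision is divided by k itself, not by the number
    # of herbs actually returned.
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(true_herbs) if true_herbs else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Hypothetical example: five recommended herbs against a four-herb reference.
recommended = ["gancao", "fuling", "baizhu", "chenpi", "banxia"]
reference = {"gancao", "baizhu", "renshen", "fuling"}
p, r, f1 = prf_at_k(recommended, reference, k=5)
print(f"P@5={p:.4f} R@5={r:.4f} F1@5={f1:.4f}")
```

In this hypothetical example, three of the five recommended herbs appear in the reference prescription, so precision@5 is 3/5 and recall@5 is 3/4. Dataset-level scores would typically be obtained by averaging these per-prescription values, which is also an assumption here.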
Funding
National Key Research and Development Program (2023YFC3502604)
National Natural Science Foundation of China (U23B2062 and 82374302)