Abstract
[Purpose/significance] Allusion is an important and widely used rhetorical device in literary writing and is of great value to the study of ancient Chinese literature. Automatic allusion identification is nevertheless immature, and the task still relies mainly on manual work, so intelligent identification techniques need further study. [Method/process] This article proposes an allusion citation recognition method in which a large language model corrects results through decision-layer fusion. The method combines traditional sequence labeling with a general-purpose large language model and uses prompt templates to fuse their outputs at the decision layer, with the aim of improving recognition accuracy. The article also constructs a set of evaluation metrics designed specifically for allusion identification. [Result/conclusion] In generalization tests, the AR_BBC_LP allusion identification model reached P_allu, R_allu, and F1_allu of 89.75%, 89.38%, and 89.56%, respectively, clearly outperforming the existing baseline models. The results indicate that the model improves on traditional sequence labeling models, extends large language models to a new application area, and offers a new perspective and methodological support for allusion identification and its use in research on ancient Chinese literature.
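The three reported scores can be checked against one another. The following minimal sketch assumes that F1_allu is the usual harmonic mean of P_allu and R_allu; the abstract does not define the metrics, so this definition and the helper name `f1_score` are assumptions for illustration, not the authors' evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed here)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
p_allu, r_allu = 89.75, 89.38
f1_allu = f1_score(p_allu, r_allu)
print(f"{f1_allu:.2f}")  # prints 89.56
```

Under that assumption, the reported F1_allu of 89.56% is consistent with the reported precision and recall.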
Authors
布文茹
王昊
李晓敏
周抒
邓三鸿
BU Wenru; WANG Hao; LI Xiaomin; ZHOU Shu; DENG Sanhong (School of Information Management, Nanjing University, Nanjing 210023; Key Laboratory of Data Engineering and Knowledge Services in Provincial Universities (Nanjing University), Nanjing 210023)
Source
《科技情报研究》
CSSCI
2024, No. 4, pp. 37-52 (16 pages)
Scientific Information Research
Funding
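Major Program of the National Social Science Fund of China, "Research on China's Digital Border-Strengthening Strategy and Its Implementation Path in the New Era" (No. 21&ZD163)
National Natural Science Foundation of China, "Semantic Parsing and Humanities Computing of China's Intangible Cultural Heritage Texts Driven by Linked Data" (No. 72074108)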
Keywords
allusion identification
decision-layer fusion
sequence labeling
large language model
prompt learning