Abstract
[Purpose/Significance] The mining, utilization, and evaluation of knowledge entities in scientific literature are important for knowledge discovery, knowledge network construction, and the exploration of latent relationships between pieces of knowledge. With the development and application of machine learning, deep learning, and large language models, knowledge entity extraction has changed dramatically compared with the earliest techniques based on manual annotation. In recent years, scholars have also begun to explore the evaluation of knowledge entities in scientific literature and have made considerable progress. [Method/Process] Based on a literature survey, this paper reviews and compares the advantages and disadvantages of manual annotation-based methods, rule-based methods, traditional machine learning, deep learning, and large language models for knowledge entity extraction, and lists relevant datasets, software and tools, and professional conferences. It then summarizes the latest progress in knowledge entity evaluation from five aspects: mention frequency; altmetrics and their influencing factors; entity co-occurrence networks and entity diffusion/citation networks; peer review based on knowledge entities; and paper novelty and clinical translation progress based on knowledge entities. [Results/Conclusions] In view of existing problems, the paper suggests that, for a specific knowledge entity extraction task, the choice of method should weigh multiple factors before one or more models are selected to complete the task. For knowledge entity evaluation, research should attend to the diversification, reliability, validity, systematization, and standardization of indicators; empirically analyze the factors that influence evaluation indicators and the correlations and causal relationships among indicators; build a paper evaluation indicator system based on knowledge entities; and support future science and technology evaluation and its applications from a fine-grained, intelligent perspective.
Authors
Liu Chunli (刘春丽)
Chen Shuang (陈爽)
Library, China Medical University, Shenyang 110122, China; School of Health Management, China Medical University, Shenyang 110122, China
Source
Journal of Modern Information (《现代情报》)
CSSCI
2023, Issue 12, pp. 143-163 (21 pages)
Keywords
knowledge entity
entity extraction
entity evaluation
scientific literature
entitymetrics
review