Semantic Retrieval Technology of Academic Resources Based on Word Embedding Extension (cited by: 12)
Abstract: [Purpose/Significance] Guided by a statistical approach, this paper explores a semantic retrieval technique based on word-embedding expansion to improve the semantic retrieval of academic resources. [Method/Process] Using natural language processing and text mining techniques, the paper preprocesses the metadata of collected academic resources (mainly academic papers) and combines the word2vec word-embedding tool with the Elasticsearch full-text search engine to build a semantic retrieval system, which is then used to explore semantic retrieval of academic resources. [Result/Conclusion] The proposed method effectively improves the retrieval of academic information, achieves semantic retrieval of academic resources to a certain extent, and offers a reference for further research on semantic retrieval.
Authors: Wang Renwu (王仁武); Chen Chuanbao (陈川宝); Meng Xianru (孟现茹) (Department of Information Management, Faculty of Economics and Management, East China Normal University, Shanghai 200241)
Source: Library and Information Service (《图书情报工作》), CSSCI, PKU Core Journal, 2018, No. 19, pp. 111-119 (9 pages)
Funding: One of the research outputs of the National Social Science Fund project "Research on a Data-Driven Library Resource Discovery Platform" (Project No. 16BTQ026).
Keywords: word2vec; Elasticsearch; semantic retrieval; academic resources
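
The abstract describes combining word2vec with Elasticsearch but this record gives no implementation details. The following is only a minimal sketch of one common way such a combination can work: expand a query term with its nearest neighbours in embedding space, then send the original and expanded terms to Elasticsearch as weighted `should` clauses. It is not the authors' system. The toy vectors, the `title` field name, the similarity threshold, and the helper names `expand_term` and `build_query` are all hypothetical.

```python
import math

# Made-up 3-dimensional vectors standing in for trained word2vec embeddings.
TOY_VECTORS = {
    "retrieval": [0.9, 0.1, 0.0],
    "search": [0.8, 0.2, 0.1],
    "indexing": [0.6, 0.5, 0.1],
    "embedding": [0.1, 0.9, 0.2],
    "library": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_term(term, vectors, top_n=2, min_sim=0.5):
    """Return up to top_n (word, similarity) pairs closest to term."""
    if term not in vectors:
        return []  # out-of-vocabulary terms are not expanded
    scored = [
        (other, cosine(vectors[term], vec))
        for other, vec in vectors.items()
        if other != term
    ]
    scored = [(word, sim) for word, sim in scored if sim >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def build_query(term, vectors, field="title", top_n=2):
    """Build an Elasticsearch bool query body (a plain dict) for the term.

    The original term gets boost 1.0; each expanded term is boosted by its
    similarity, so closer neighbours contribute more to the score.
    """
    should = [{"match": {field: {"query": term, "boost": 1.0}}}]
    for word, sim in expand_term(term, vectors, top_n=top_n):
        should.append({"match": {field: {"query": word, "boost": round(sim, 3)}}})
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


if __name__ == "__main__":
    import json
    print(json.dumps(build_query("retrieval", TOY_VECTORS), indent=2))
```

The resulting dictionary could be passed as the request body of an Elasticsearch search call. Weighting expanded terms by similarity is one design choice among several; a real system would also need word segmentation for Chinese text and embeddings trained on the collected metadata.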