
Research on Automatic Summarization Based on Concept Counting for English Texts (cited by: 9)
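Abstract: This paper proposes an automatic summarization method based on concept counting and semantic hierarchy analysis, and uses it to implement an English automatic summarization system. The system uses WordNet to analyze the words of an English article, selects the article's topic concepts by concept counting, and builds a vector space model from them. It then partitions the text into meaning blocks according to how the topic concepts are distributed over the concept hierarchy tree and extracts the summary block by block, which goes some way toward solving the unbalanced summary structure of multi-topic articles. The paper mainly describes the construction of the concept hierarchy tree, the steps for extracting topic concepts, the computation of sentence importance, and the algorithm for partitioning meaning blocks. Tests show that the method achieves higher recall and precision than the traditional method based on word-frequency statistics.

Abstract (published English version, corrected): This paper puts forward a new summarization method based on concept counting and semantic hierarchy analysis, and an effective English text summarization system is developed on this basis. The system uses the extracted topic concepts to construct a vector space model. Combined with discourse analysis and readability improvement, the abstract of a text is generated. The paper proposes parameters for evaluating topic concepts and mainly describes the detailed algorithms for building the concept hierarchy tree, extracting topic concepts, and applying topic concepts in generating abstracts. The experimental results show that, compared with word counting, the new method improves both the recall and the precision of the system, and that it helps to solve the abstract distribution problem of multi-topic texts.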
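Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2002, No. 24, pp. 7-9, 16 (4 pages)
Funding: National Natural Science Foundation of China (Grant No. 69972025)
Keywords: concept counting; English automatic summarization; topic concept; vector space model; sentence significance; computer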
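
The abstract's central idea, counting concepts rather than surface words, can be illustrated with a minimal sketch. This is not the paper's algorithm: the hypernym table `TOY_HYPERNYMS`, the stop-word list, and the scoring rule below are all hypothetical stand-ins (a real system would look words up in WordNet and use the paper's own sentence-importance formula). The sketch only shows why concept counts can surface a topic that word counts miss.

```python
from collections import Counter

# Hypothetical hypernym table standing in for WordNet: word -> parent concept.
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "horse": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}

# Hypothetical stop-word list; assumed here only to keep the toy example clean.
STOP_WORDS = {"the", "a", "was"}


def tokenize(sentence):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (tok.strip(".,;:!?") for tok in sentence.lower().split())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def concepts_of(word):
    """Return the word itself plus its parent concept, if the toy table has one."""
    parent = TOY_HYPERNYMS.get(word)
    return [word, parent] if parent else [word]


def topic_concepts(sentences, k):
    """Count concepts over the whole text and keep the k most frequent."""
    counts = Counter()
    for sentence in sentences:
        for word in tokenize(sentence):
            counts.update(concepts_of(word))
    return [concept for concept, _ in counts.most_common(k)]


def sentence_score(sentence, topics):
    """Build a count vector over the topic concepts and sum its components."""
    vector = Counter()
    for word in tokenize(sentence):
        for concept in concepts_of(word):
            if concept in topics:
                vector[concept] += 1
    return sum(vector.values())


sentences = [
    "The dog chased the cat.",
    "A horse stood nearby.",
    "The weather was mild.",
]
topics = topic_concepts(sentences, k=1)
ranked = sorted(sentences, key=lambda s: sentence_score(s, topics), reverse=True)
```

In this toy text no content word occurs twice, so plain word frequency cannot separate the sentences, while the shared parent concept `animal` is counted three times and ranks the first sentence highest.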

