
Automatic summarization of Web documents based on topic segmentation (cited by: 8)
Abstract: An automatic summarization method guided by the structure of the Web page is proposed. While the page source is parsed, the document's structural information is used to build a DOM (Document Object Model) tree, and the document is partitioned into topic blocks on that basis by comparing semantic similarity. The value of HTML tags for topic-word extraction and for computing sentence importance is also fully exploited. Finally, taking the topic block as the unit, sentence weights are adjusted according to the similarity between sentences and the summary is generated dynamically. Experimental results show that the method effectively resolves the unbalanced distribution of summary content across the document and reduces redundancy in the summary.
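The abstract outlines a pipeline (parse the page structure, split it into topic blocks, use tags to find topic words, then pick sentences per block while down-weighting similar ones) but gives no formulas. The following is only a minimal sketch of that outline under several assumptions that are not from the paper: blocks are cut at heading tags rather than by semantic similarity over a DOM tree, sentence weight is an average term frequency with a hypothetical `tag_boost` for words seen in headings or emphasis tags, similarity is bag-of-words cosine, and tokenization assumes whitespace-delimited text. All names (`BlockParser`, `summarize`, `tag_boost`, `per_block`) are illustrative.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
HEADLIKE = HEADINGS | {"title"}      # text here yields topic words only
EMPHASIS = {"b", "strong", "em"}     # text here yields topic words and body text


def tokenize(text):
    return re.findall(r"\w+", text.lower())


class BlockParser(HTMLParser):
    """Assumed segmentation: every heading tag opens a new topic block."""

    def __init__(self):
        super().__init__()
        self.blocks = [[]]           # one list of text fragments per block
        self.topic_words = Counter()
        self.headlike_depth = 0
        self.emphasis_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS and self.blocks[-1]:
            self.blocks.append([])
        if tag in HEADLIKE:
            self.headlike_depth += 1
        elif tag in EMPHASIS:
            self.emphasis_depth += 1

    def handle_endtag(self, tag):
        if tag in HEADLIKE and self.headlike_depth:
            self.headlike_depth -= 1
        elif tag in EMPHASIS and self.emphasis_depth:
            self.emphasis_depth -= 1

    def handle_data(self, data):
        if self.headlike_depth or self.emphasis_depth:
            self.topic_words.update(tokenize(data))
        if not self.headlike_depth and data.strip():
            self.blocks[-1].append(data.strip())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(html, per_block=1, tag_boost=2.0):
    """Pick up to `per_block` sentences from each block (hypothetical scoring)."""
    parser = BlockParser()
    parser.feed(html)
    parser.close()
    summary = []
    for fragments in parser.blocks:
        text = " ".join(fragments)
        sentences = [s.strip() for s in
                     re.findall(r"[^.!?。!?]+[.!?。!?]?", text) if s.strip()]
        vectors = [Counter(tokenize(s)) for s in sentences]
        freq = Counter()
        for v in vectors:
            freq.update(v)
        weights = []
        for v in vectors:
            total = sum(v.values())
            score = sum(c * freq[w] * (tag_boost if w in parser.topic_words else 1.0)
                        for w, c in v.items())
            weights.append(score / total if total else 0.0)
        chosen = []
        while len(chosen) < min(per_block, len(sentences)):
            best = max((i for i in range(len(sentences)) if i not in chosen),
                       key=lambda i: weights[i])
            chosen.append(best)
            # Down-weight sentences similar to the one just selected.
            for i in range(len(sentences)):
                weights[i] *= 1.0 - cosine(vectors[i], vectors[best])
        summary.extend(sentences[i] for i in sorted(chosen))
    return summary
```

Calling `summarize(page_source, per_block=2)` returns a list of sentences in document order. Because selection runs per block, every block contributes to the summary, which is one simple way to approximate the balance property the abstract claims; the similarity penalty is a stand-in for the paper's redundancy reduction, not a reproduction of it.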
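Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2006, No. 3, pp. 641-644 (4 pages)
Funding: Natural Science Foundation of Jiangsu Province Higher Education Institutions (MB20022312)
Keywords: Web information retrieval; Document Object Model (DOM); topic segmentation; sentence significance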

