
Chinese sentence segmentation based on the SVM method
Abstract Long sentences degrade the performance of syntactic parsing. To address this, the paper proposes an SVM-based method for dividing a sentence into segments. The sentence is first divided into segments according to its grammatical structure, and the category of each segment is identified. The sentence is then split into several parts according to those categories, and each part serves as a basic unit for parsing. Finally, the parsed parts are linked through dependency relations to form the complete dependency tree. Experiments show that segment identification reduces parsing complexity and improves the accuracy of Chinese dependency parsing.
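Source Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心); 2009, No. 5, pp. 52-55 (4 pages)
Funding National Natural Science Foundation of China (grants 60575042 and 60675034)
Keywords dependency parsing; sentence segment; dependency relation; support vector machine (SVM)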
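
The three stages named in the abstract (label segments, group them into parsing units, merge the parsed units) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_label` is a rule-based stand-in for the SVM classifier, `toy_parse` is a placeholder for a real dependency parser, and the `"clause"`/`"phrase"` labels, the `"v"` verb tag, and the rule that the last unit's root heads the sentence are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Token = Tuple[str, str]   # (word, coarse POS tag); the tag set is hypothetical
Arc = Tuple[int, int]     # (head index, dependent index); head -1 marks the root

PUNCT = {",", ";", "，", "；"}


def split_segments(tokens: List[Token]) -> List[List[Token]]:
    """Stage 1a: cut the sentence after each punctuation token."""
    segments, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok[0] in PUNCT:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def toy_label(segment: List[Token]) -> str:
    """Stage 1b: stand-in for the SVM; 'clause' if the segment contains a verb."""
    return "clause" if any(pos == "v" for _, pos in segment) else "phrase"


def group_parts(segments, labels) -> List[List[Token]]:
    """Stage 2: close a parsing unit at every 'clause' segment."""
    parts, current = [], []
    for seg, label in zip(segments, labels):
        current = current + seg
        if label == "clause":
            parts.append(current)
            current = []
    if current:  # trailing phrase segments join the last unit
        if parts:
            parts[-1] = parts[-1] + current
        else:
            parts.append(current)
    return parts


def toy_parse(part: List[Token]) -> List[Arc]:
    """Placeholder parser: the first verb (else the last token) heads the part."""
    root = next((i for i, (_, pos) in enumerate(part) if pos == "v"), len(part) - 1)
    return [(-1 if i == root else root, i) for i in range(len(part))]


def parse_sentence(tokens, label=toy_label, parse=toy_parse) -> List[Arc]:
    """Stage 3: parse each part, shift indices, and link the part roots."""
    segments = split_segments(tokens)
    parts = group_parts(segments, [label(s) for s in segments])
    arcs, roots, offset = [], [], 0
    for part in parts:
        for head, dep in parse(part):
            if head == -1:
                roots.append(dep + offset)
            else:
                arcs.append((head + offset, dep + offset))
        offset += len(part)
    if not roots:
        return []
    # Assumption: the last part's root heads the sentence; earlier roots attach to it.
    top = roots[-1]
    arcs.extend((top, r) for r in roots[:-1])
    arcs.append((-1, top))
    return sorted(arcs, key=lambda arc: arc[1])
```

Because each unit is parsed separately, the parser only ever sees short token spans; a trained classifier and a real parser could be passed in through the `label` and `parse` arguments in place of the toy functions.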
