Abstract
From July 13 to 15, 2005, the Institute of Automation and the Institute of Computing Technology of the Chinese Academy of Sciences, together with the Department of Computer Science of Xiamen University, jointly held the first Statistical Machine Translation Workshop in China. This paper describes the systems tested by the participating institutions, reports their experimental results, and analyzes those results. The test results show that although research on statistical machine translation started late in China, it has progressed rapidly: the participating systems achieved fairly good translation quality within a short period. Compared with the rule-based systems that took part in previous "863" evaluations, their performance is still lower, but the gap is already small. Judging from the state of the art and the trends in international statistical machine translation research, we believe that statistical machine translation still has considerable room for improvement as data resources grow and computing power increases. In the next few years, incorporating syntactic and semantic information into the mainstream phrase-based approach will become the trend in machine translation.
Source
《中文信息学报》
CSCD
Peking University Core Journals
2006, No. 5, pp. 1-9 (9 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (Grant No. 60272041)
Keywords
artificial intelligence
machine translation
statistical machine translation
phrase-based translation model
machine translation evaluation