Research Report on the 2005 Statistical Machine Translation Workshop (cited by: 10)

Current Statistical Machine Translation Research in China


Abstract: From July 13 to 15, 2005, the Institute of Automation and the Institute of Computing Technology of the Chinese Academy of Sciences, together with the Department of Computer Science of Xiamen University, jointly held the first Statistical Machine Translation Workshop in China. This paper describes the systems tested by the participating institutions, reports their experimental results, and analyzes them. The results show that although statistical machine translation research started late in China, it has progressed rapidly: the participating systems reached fairly good translation quality within a short period. Compared with the rule-based systems that took part in previous "863" evaluations, their performance is still lower, but the gap is already small. Judging from the state of the art and the trends in international statistical machine translation research, statistical machine translation still has considerable room for improvement as data resources grow and computing hardware becomes more powerful. Over the next few years, incorporating syntactic and semantic information into the mainstream phrase-based statistical translation approach is expected to become the trend in machine translation.

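Authors: Xu Bo (徐波), Shi Xiaodong (史晓东), Liu Qun (刘群), Zong Chengqing (宗成庆), Pang Wei (庞薇), Chen Zhenbiao (陈振标), Yang Zhendong (杨振东), Wei Wei (魏玮), Du Jinhua (杜金华), Chen Yidong (陈毅东), Liu Yang (刘洋), Xiong Deyi (熊德意), Hou Hongxu (侯宏旭), He Zhongjun (何中军)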

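Affiliations: Institute of Automation, Chinese Academy of Sciences; Xiamen University; Institute of Computing Technology, Chinese Academy of Sciences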

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 5, pp. 1-9 (9 pages)

Funding: Supported by the National Natural Science Foundation of China (60272041)

Keywords: artificial intelligence; machine translation; statistical machine translation; phrase-based translation model; machine translation evaluation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


