A review of studies on the construction of automated writing evaluation algorithms (文本自动评分算法建构研究综述)
Abstract: In recent years, automated writing evaluation, as a form of artificial intelligence, has attracted growing attention in language teaching and assessment research. This paper reviews research on automated scoring of English writing ability in applied linguistics from three perspectives: feature extraction, model construction, and model validation. First, feature extraction relies on natural language processing techniques to quantify the key constructs of writing ability, laying the data foundation for building automated scoring models. In turn, these models can identify the linguistic features that predict text quality, providing a reference for writing instruction. Second, model construction methods have gradually shifted from traditional statistical algorithms to machine learning algorithms, and then to fine-tuning of large language models. However, large language models currently lack transparency, which makes it difficult to pinpoint the features that are important for predicting text quality, and empirical evidence directly comparing the predictive accuracy of large language models with that of other algorithms in text evaluation remains insufficient. In addition, the dependent variable in model construction, namely the criterion used to judge text quality, has broadened to include internal assessments, standardized tests, and language proficiency frameworks. Third, regarding model validation, researchers have increasingly recognized the limitations of treating human-machine agreement as the sole criterion, and have proposed and extended validity frameworks for automated scoring.
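Authors: MA Hong (马鸿); LIU Keyi (刘可怡)
Affiliation: Zhejiang University
Source: Language Testing and Assessment (《语言测试与评价》), 2024, Issue 2, pp. 1-12 (12 pages)
Funding: This article is an interim result of the National Social Science Fund of China general project "A machine-learning-based study of the development of Chinese English learners' writing ability benchmarked against the Scale (《量表》)" (Grant No. 23BYY153).
Keywords: automated writing evaluation (AWE); large language model; feature extraction; model construction; model validation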
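
The feature extraction and human-machine agreement steps named in the abstract can be illustrated with a minimal sketch. It is not taken from the reviewed studies: the three surface features, the function names `extract_features` and `quadratic_weighted_kappa`, and the choice of quadratic weighted kappa as the agreement statistic are all assumptions made for illustration. Real systems typically draw on much richer NLP-derived indices.

```python
from collections import Counter


def extract_features(text):
    """Return a few toy surface features of a text (illustrative only)."""
    # Split on whitespace, strip surrounding punctuation, lowercase.
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    n = len(tokens)
    return {
        "n_tokens": float(n),
        "type_token_ratio": len(set(tokens)) / n if n else 0.0,
        "mean_word_length": sum(len(t) for t in tokens) / n if n else 0.0,
    }


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two integer score lists on a shared scale.

    1.0 means perfect agreement; 0.0 means chance-level agreement.
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    k = max_score - min_score + 1  # number of score categories
    if k < 2:
        return 1.0  # a one-point scale leaves no room for disagreement
    n = len(human)
    observed = Counter(zip(human, machine))  # joint counts of score pairs
    human_hist = Counter(human)              # marginal counts, human rater
    machine_hist = Counter(machine)          # marginal counts, machine rater
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement penalty
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * human_hist[i] * machine_hist[j] / n ** 2
    if weighted_expected == 0.0:
        return 1.0  # both raters gave one identical constant score
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    print(extract_features("The cat sat. The cat ran!"))
    print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 3], 1, 4))
```

As the abstract notes, an agreement statistic of this kind is only one strand of validity evidence, so a high value should not be read as a complete validation of a scoring model.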