A review of studies on the construction of automated writing evaluation algorithms (文本自动评分算法建构研究综述)

Abstract: In recent years, automated writing evaluation (AWE), as a form of artificial intelligence, has attracted growing attention in language teaching and assessment research. This paper reviews research in applied linguistics on the automated scoring of English writing ability from three angles: feature extraction, model construction, and model validation. First, feature extraction relies on natural language processing techniques to quantify the key constructs of writing ability, laying the data foundation for building automated scoring models. In turn, these models can identify the linguistic features that predict text quality, which offers a reference for writing instruction. Second, model construction methods have moved from traditional statistical algorithms to machine learning algorithms and, more recently, to fine-tuning of large language models. However, large language models currently lack transparency, which makes it difficult to pinpoint the features that predict text quality, and empirical evidence directly comparing their predictive accuracy in text evaluation with that of other algorithms remains insufficient. In addition, the dependent variable in model construction, namely the criterion used to judge text quality, has broadened to include internal assessments, standardized tests, and language proficiency frameworks. Third, in model validation, researchers have come to recognize the limitations of treating human-machine agreement as the sole criterion, and have proposed and extended validity frameworks for automated scoring.
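The human-machine agreement criterion mentioned in the abstract can be made concrete with a small sketch. The abstract does not name specific agreement statistics; exact agreement and quadratic weighted kappa are assumed here only because they are commonly reported for automated scoring. The function names and the integer score scale are hypothetical choices for illustration.

```python
from collections import Counter


def exact_agreement(human, machine):
    """Proportion of essays on which human and machine scores are identical."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Chance-corrected agreement that penalizes large score gaps more heavily.

    Scores are assumed to be integers on the scale [min_score, max_score].
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n_cat = max_score - min_score + 1
    if n_cat < 2:
        raise ValueError("the score scale needs at least two categories")

    n = len(human)
    observed = Counter(zip(human, machine))  # joint counts of (human, machine)
    hist_h = Counter(human)                  # marginal counts, human rater
    hist_m = Counter(machine)                # marginal counts, machine rater

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * hist_h[i] * hist_m[j] / n ** 2

    if weighted_expected == 0:
        # Degenerate case: both raters gave the same single score to every essay.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Made-up scores on a 1-4 scale, for illustration only.
    human_scores = [1, 2, 3, 4, 3, 2]
    machine_scores = [1, 2, 3, 3, 3, 1]
    print(exact_agreement(human_scores, machine_scores))
    print(quadratic_weighted_kappa(human_scores, machine_scores, 1, 4))
```

A high value on either statistic shows only that the model reproduces human scores. It does not show which features drive the scores or whether they reflect the intended construct, which is the limitation the abstract says motivates broader validity frameworks.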

Authors: MA Hong; LIU Keyi

Affiliation: Zhejiang University

Source: Language Testing and Assessment (《语言测试与评价》), 2024, No. 2, pp. 1-12 (12 pages)

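Funding: This article is an interim result of the National Social Science Fund of China General Project "A machine-learning-based study of the development of Chinese English learners' writing ability aligned with the Scale (《量表》)" (Grant No. 23BYY153).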

Keywords: automated writing evaluation (AWE); large language model; feature extraction; model construction; model validation

Classification code: G63 [Culture and Science: Education]

