A Study on the Performance of ChatGPT in the Simulated Examination for Clinical Practitioner Qualification in China
Abstract

Objective: To evaluate the performance of the chat generative pre-trained transformer (ChatGPT) on a simulated Chinese clinical practitioner licensing examination, and to explore its strengths and limitations as a reference for medical education and knowledge assessment.

Methods: The study was conducted from July 1 to September 1, 2023. ChatGPT's answering performance was evaluated on a set of simulated multiple-choice questions for the Chinese clinical practitioner licensing examination, covering multiple question types and specialties. All questions were drawn from a test-preparation question bank commonly used by medical students and were designed to match the style, content, and difficulty of the Chinese medical licensing examination. The 300 multiple-choice questions were grouped by question type and specialty, and further subdivided into higher-order and lower-order thinking questions. Performance was assessed by answer accuracy.

Results: Across all questions, ChatGPT's answer accuracy was 70.3%. Its accuracy on lower-order thinking questions (78.3%) was higher than on higher-order thinking questions (66.0%), and the difference was statistically significant (P < 0.05). Accuracy on clinical-medicine and non-clinical-medicine questions was 71.0% and 68.7%, respectively, and the difference was not statistically significant (P > 0.05). Across the four question types, accuracy was 69.1%, 64.3%, 73.9%, and 70.8%, respectively, and the differences were not statistically significant (P > 0.05). Even when incorrect, ChatGPT consistently used confident language (100%).

Conclusion: ChatGPT can achieve the goal of passing a simulated Chinese clinical practitioner licensing examination, which indicates its great potential in medical education and medical practice. However, its limitations must also be recognized, such as its confident phrasing even when its answers are inaccurate.
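The lower-order vs. higher-order comparison reported in the Results can be sketched as a standard 2×2 chi-square test of correct/incorrect counts per group. The abstract does not state the per-group item counts, so the figures below (106 lower-order and 194 higher-order questions) are illustrative reconstructions chosen to be consistent with the reported percentages (78.3%, 66.0%, and 211/300 ≈ 70.3% overall), not values taken from the paper.

```python
# Hedged sketch of the group comparison in the abstract: a Pearson
# chi-square test on a 2x2 contingency table (group x correct/incorrect).
# The counts are assumed reconstructions consistent with the reported
# percentages; the paper does not state them explicitly.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (df = 1, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: lower-order, higher-order; columns: correct, incorrect.
correct_low, wrong_low = 83, 23      # 83/106  = 78.3% (assumed counts)
correct_high, wrong_high = 128, 66   # 128/194 = 66.0% (assumed counts)

chi2 = chi_square_2x2(correct_low, wrong_low, correct_high, wrong_high)

# The critical value for df = 1 at alpha = 0.05 is 3.841, so
# chi2 > 3.841 corresponds to the abstract's P < 0.05.
print(f"chi2 = {chi2:.3f}, significant at 0.05 = {chi2 > 3.841}")
```

With these reconstructed counts the statistic comes out near 5, above the 3.841 cutoff, matching the reported P < 0.05 for the lower- vs. higher-order contrast; the clinical vs. non-clinical contrast (71.0% vs. 68.7%) would fall well below the cutoff, matching its reported P > 0.05.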
Authors: ZHANG Li, ZHANG Xue, ZHOU Haiyan, WEN Xin, JIANG Jiuming, LI Jianwei, LI Tantan, LI Meng (Department of Diagnostic Radiology, Department of Education, and Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China)
Source: China Continuing Medical Education (《中国继续医学教育》), 2024, No. 15, pp. 157-162 (6 pages).
Keywords: artificial intelligence; natural language processing; ChatGPT; Chinese practicing physician licensing examination; continuing education; medicine