Distilling ChatGPT for Explainable Automated Student Answer Assessment
This paper presents AERA, a framework that uses ChatGPT to generate rationales for automated scoring of student short-answer responses, then distills this capability into a smaller model that can simultaneously score answers and provide explanations. The approach is evaluated on a benchmark dataset of student responses to science questions, showing improved scoring accuracy and rationale quality comparable to ChatGPT's.
Providing explainable and faithful feedback is crucial for automated student answer assessment. In this paper, we introduce a novel framework that explores using ChatGPT, a cutting-edge large language model, for the concurrent tasks of student answer scoring and rationale generation. We identify appropriate instructions by prompting ChatGPT with different templates to collect rationales, and rationales that are inconsistent with the marking standards are refined to align with them. The refined ChatGPT outputs are then used to fine-tune a smaller language model that simultaneously assesses student answers and generates rationales.
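To make the rationale-collection step concrete, the sketch below shows one way such a prompting call could be structured; the `call_chatgpt` helper, the prompt template, and the response-parsing format are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of collecting a score and rationale from ChatGPT for one
# student answer, assuming a hypothetical call_chatgpt() helper that wraps
# the ChatGPT API and returns the model's text reply.

PROMPT_TEMPLATE = """You are grading a student's short answer.
Question: {question}
Marking rubric: {rubric}
Student answer: {answer}

Assign an integer score according to the rubric and explain your reasoning.
Respond exactly as:
Score: <score>
Rationale: <rationale>"""


def collect_rationale(question: str, rubric: str, answer: str, call_chatgpt) -> tuple[int, str]:
    """Prompt the model for a score and a rationale for a single answer."""
    reply = call_chatgpt(
        PROMPT_TEMPLATE.format(question=question, rubric=rubric, answer=answer)
    )
    score_line, rationale_part = reply.split("\n", 1)
    score = int(score_line.removeprefix("Score:").strip())
    rationale = rationale_part.removeprefix("Rationale:").strip()
    # Rationales whose score disagrees with the gold mark would be flagged
    # here and refined (e.g. re-prompted) to align with the marking standard
    # before being used to fine-tune the smaller model.
    return score, rationale
```

The refined (score, rationale) pairs produced this way would then form the distillation training data for the smaller assessment model.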