A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science

Relevance: 9/10 | Cited by 73 | 2024 paper

This paper develops a chain-of-thought prompting approach using GPT-4 to automatically score and generate explanations for middle school Earth Science formative assessment responses, employing human-in-the-loop few-shot and active learning methods. The system evaluates open-ended short-answer responses and provides meaningful feedback to support student learning.
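The few-shot chain-of-thought setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `ScoredExample` schema, the prompt wording, the `Score: N` output convention, and both helper functions are assumptions made for illustration. The sketch only assembles the prompt text and parses a score from a reply; the call to the model itself is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class ScoredExample:
    """A human-scored response with its rationale (hypothetical schema)."""
    response: str
    reasoning: str
    score: int


def build_cot_prompt(question, rubric, examples, student_response):
    """Assemble a few-shot chain-of-thought scoring prompt as plain text."""
    parts = [
        "You are scoring a middle school Earth Science short-answer response.",
        f"Question: {question}",
        f"Rubric: {rubric}",
        "For each response, reason step by step against the rubric, then give a score.",
    ]
    # Each worked example shows the reasoning before the score, so the model
    # is nudged to explain its judgment before committing to a number.
    for ex in examples:
        parts.append(
            f"Student response: {ex.response}\n"
            f"Reasoning: {ex.reasoning}\n"
            f"Score: {ex.score}"
        )
    # The prompt ends on an open "Reasoning:" cue for the new response.
    parts.append(f"Student response: {student_response}\nReasoning:")
    return "\n\n".join(parts)


def parse_score(model_output):
    """Return the last 'Score: N' found in the model output, or None if absent."""
    matches = re.findall(r"Score:\s*(\d+)", model_output)
    return int(matches[-1]) if matches else None
```

In a human-in-the-loop workflow, one plausible use (again an assumption) is to route replies where `parse_score` returns `None`, or where the score disagrees with a human rater, back to a reviewer, and to add the corrected cases to `examples` for the next round.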

This paper explores the use of large language models (LLMs) to score and explain short-answer assessments in K-12 science. While existing methods can score more structured math and computer science assessments, they often do not provide explanations for the scores. Our study focuses on employing GPT-4 for automated assessment in middle school Earth Science, combining few-shot and active learning with chain-of-thought reasoning. Using a human-in-the-loop approach, we successfully score and provid


Framework Categories

4.1 Scoring and grading
4.2 Feedback with reasoning
2.2 Pedagogy of generated outputs

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.
