Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

Relevance: 8/10 · 7 citations · 2024 paper

This paper compares three models (fine-tuned Mistral/GOAT, SBERT-Canberra, and GPT-4) for automatically scoring and providing qualitative feedback on middle-school students' open-ended math responses, evaluating both scoring accuracy and feedback quality using teacher judgments.

The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM), and prior research has explored a range of methodologies for improving it. Recent developments in Large Language Models (LLMs) have extended their utility to automated feedback systems. This study explores the potential of LLMs to facilitate automated feedback in math education, examining their effectiveness in evaluating student responses.
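The "SBERT-Canberra" baseline name suggests a similarity-based scorer: embed the student response with Sentence-BERT, then assign the teacher score of the nearest reference response under Canberra distance. The sketch below illustrates that idea under stated assumptions; the toy vectors, reference bank, and function names are illustrative stand-ins (a real system would use actual SBERT embeddings), not the paper's implementation.

```python
# Nearest-reference scoring sketch in the spirit of SBERT-Canberra.
# The vectors below are toy stand-ins for SBERT sentence embeddings.

def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|)."""
    return sum(
        abs(a - b) / (abs(a) + abs(b))
        for a, b in zip(u, v)
        if abs(a) + abs(b) > 0  # skip terms where both components are zero
    )

def score_response(student_vec, scored_references):
    """Return the teacher score of the reference nearest to student_vec."""
    return min(scored_references, key=lambda ref: canberra(student_vec, ref[0]))[1]

# Hypothetical reference bank: (embedding, teacher-assigned score, 0-4 rubric)
references = [
    ([0.9, 0.1, 0.0], 4),  # fully correct response
    ([0.1, 0.8, 0.1], 2),  # partially correct
    ([0.0, 0.1, 0.9], 0),  # incorrect
]

print(score_response([0.85, 0.1, 0.05], references))  # → 4 (nearest to the first reference)
```

Canberra distance weights differences in small-magnitude components heavily, so it behaves quite differently from cosine similarity on embedding vectors; which metric works better is an empirical question the paper's comparison speaks to.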

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

large language model, evaluation, education, computer-science