Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

Benchmark (Not Published) | Relevance: 8/10 | Cited by 7 | 2024 paper

This paper compares three approaches (fine-tuned Mistral/GOAT, SBERT-Canberra, and zero-shot GPT-4) for automatically scoring and providing feedback on middle-school students' open-ended math responses, evaluating both scoring accuracy and feedback quality using teacher judgments against rubrics.
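The SBERT-Canberra approach named above presumably scores a student response by comparing its sentence embedding against embeddings of reference answers using Canberra distance; the paper's exact model, references, and rubric are not given here, so the following is only a minimal sketch with toy vectors standing in for SBERT embeddings:

```python
import numpy as np
from scipy.spatial.distance import canberra

# Toy stand-ins for SBERT sentence embeddings (real ones are ~384-768 dims).
student = np.array([0.2, 0.7, 0.1, 0.5])

# Hypothetical reference answers, keyed by their rubric score.
references = {
    4: np.array([0.2, 0.8, 0.1, 0.4]),
    2: np.array([0.9, 0.1, 0.6, 0.2]),
}

# Assign the rubric score of the nearest reference under Canberra distance.
predicted = min(references, key=lambda s: canberra(student, references[s]))
print(predicted)  # nearest reference here is the score-4 answer
```

In practice the embeddings would come from a `sentence-transformers` model and the references from teacher-graded exemplars; only the nearest-neighbor-by-Canberra-distance pattern is what the method's name suggests.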

The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM), and prior research has explored a range of methodologies for making feedback more effective. Recent developments in Large Language Models (LLMs) have extended their utility to automated feedback systems. This study explores the potential of LLMs to facilitate automated feedback in math education. We examine the effectiveness of LLMs in evaluating student responses…

Study Type

Benchmark (Not Published)

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

large language model, evaluation, education, computer-science