Improving the Validity of Automatically Generated Feedback via Reinforcement Learning

Relevance: 9/10 · 20 citations · 2024 paper

This paper develops and evaluates a reinforcement learning framework for automatically generating pedagogically valid feedback for incorrect student answers in math education, using GPT-4 to score feedback quality against a rubric covering both correctness and alignment with educational goals. The work demonstrates that fine-tuning Llama 2 with direct preference optimization (DPO) significantly improves feedback quality along both dimensions.

Automatically generating feedback via large language models (LLMs) in intelligent tutoring systems and online learning platforms has the potential to improve the learning outcomes of many students. However, both feedback generation and evaluation are challenging: feedback content has to be valid, especially in subjects like math, which requires models to understand the problem, the solution, and where the student's error lies. Feedback also has to be pedagogically valid, reflecting effective tutoring strategies.
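The DPO fine-tuning mentioned in the summary optimizes a preference objective over pairs of higher- and lower-rated feedback messages. A minimal sketch of the per-pair loss is below; the function name and scalar interface are illustrative, not the paper's code, and in practice the log-probabilities come from the fine-tuned Llama 2 policy and a frozen reference model summed over feedback tokens:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    logp_*     : log-prob of the preferred / dispreferred feedback
                 under the policy being trained.
    ref_logp_* : same quantities under the frozen reference model.
    beta       : strength of the implicit KL penalty toward the reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen feedback than the reference model does, relative to the
    # rejected feedback.
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the scaled margin: driven toward 0 as the
    # policy separates chosen from rejected feedback.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy matches the reference, the margin is 0 and the loss
# is ln(2); widening the margin lowers the loss.
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
improved = dpo_loss(-2.0, -8.0, -5.0, -5.0)
```

Minimizing this loss pushes the policy to assign relatively higher probability to feedback rated better by the GPT-4 rubric, while the reference-model terms keep it from drifting too far from the base model.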

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.
Teacher Support Tools: Tools that assist teachers — lesson planning, content generation, grading, analytics.

Tags

intelligent tutoring system · evaluation · computer-science