Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues

Research / Other · Relevance: 9/10 · 24 citations · 2025 paper

This paper trains an open-source LLM (Llama 3.1 8B) to generate tutor utterances that maximize student learning outcomes in math tutoring dialogues by optimizing for both student response correctness and pedagogical quality using direct preference optimization. The approach uses a student model to predict correctness and GPT-4o to evaluate pedagogical principles, demonstrating that the trained tutor increases correct student responses while maintaining high pedagogical quality.
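The preference-data construction described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's implementation: `score_correctness` and `score_pedagogy` are toy stand-ins for the paper's student correctness model and GPT-4o pedagogy judge, and the weighting scheme is an assumption.

```python
# Hypothetical sketch: rank candidate tutor utterances by a combined
# reward and keep the best/worst as a (chosen, rejected) DPO pair.
# The two scoring functions below are toy stand-ins, not real models.

def score_correctness(utterance: str) -> float:
    # Stand-in for the student model's predicted probability that the
    # student answers correctly after this tutor turn.
    return 0.9 if "hint" in utterance else 0.4

def score_pedagogy(utterance: str) -> float:
    # Stand-in for an LLM judge's 0-1 rating against pedagogical
    # principles (e.g., guiding questions over giving away answers).
    return 0.8 if "?" in utterance else 0.3

def build_preference_pair(candidates, w_correct=0.5, w_pedagogy=0.5):
    """Score each candidate utterance and return the (chosen, rejected)
    pair that would form one DPO training example."""
    scored = sorted(
        candidates,
        key=lambda u: w_correct * score_correctness(u)
                      + w_pedagogy * score_pedagogy(u),
        reverse=True,
    )
    return scored[0], scored[-1]

candidates = [
    "The answer is 12.",
    "Here's a hint: what do you get if you factor out 3?",
]
chosen, rejected = build_preference_pair(candidates)
```

In the paper's setting, the resulting (chosen, rejected) pairs would then be fed to a standard DPO trainer to fine-tune the tutor model.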

Generative artificial intelligence (AI) has the potential to scale up personalized tutoring through large language models (LLMs). Recent AI tutors are adapted for the tutoring task by training or prompting LLMs to follow effective pedagogical principles, though they are not trained to maximize student learning throughout the course of a dialogue. Therefore, they may engage with students in a suboptimal way. We address this limitation by introducing an approach to train LLMs to generate tutor utterances […]

Study Type

Research / Other

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring, dialogue evaluation, computer-science