Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues

Relevance: 9/10 (24 citations, 2025 paper)

This paper trains an open-source LLM (Llama 3.1 8B) to generate tutor utterances that maximize student learning outcomes in math tutoring dialogues by optimizing for both student response correctness and pedagogical quality using direct preference optimization. The approach uses a student model to predict correctness and GPT-4o to evaluate adherence to pedagogical principles, directly measuring the impact on student learning through dialogue interactions.

Generative artificial intelligence (AI) has the potential to scale up personalized tutoring through large language models (LLMs). Recent AI tutors are adapted to the tutoring task by training or prompting LLMs to follow effective pedagogical principles, but they are not trained to maximize student learning over the course of a dialogue, so they may engage with students in suboptimal ways. We address this limitation by introducing an approach to train LLMs to generate tutor utterances that maximize student learning outcomes.
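The training signal described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: the weighting function `score_utterance` and all numeric values are hypothetical, and in practice the log-probabilities would come from the policy (Llama 3.1 8B) and a frozen reference model rather than being hard-coded. The DPO loss itself is the standard formulation: candidate tutor utterances are ranked by the combined reward (predicted student correctness plus a pedagogy rating), and the preferred one becomes the "chosen" response in each preference pair.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    logp_* are sequence log-probabilities under the policy being trained;
    ref_logp_* are the same sequences scored by a frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): small when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def score_utterance(correctness_prob, pedagogy_score, w=0.5):
    """Combine the two reward signals (hypothetical equal weighting):
    a student model's predicted probability of a correct response, and
    an LLM judge's pedagogy rating, both assumed normalized to [0, 1]."""
    return w * correctness_prob + (1.0 - w) * pedagogy_score

# Rank two candidate tutor utterances for the same dialogue turn,
# then compute the DPO loss for the resulting preference pair.
score_a = score_utterance(correctness_prob=0.9, pedagogy_score=0.8)
score_b = score_utterance(correctness_prob=0.4, pedagogy_score=0.6)
chosen, rejected = ("A", "B") if score_a > score_b else ("B", "A")

loss = dpo_loss(logp_chosen=-12.0, logp_rejected=-11.5,
                ref_logp_chosen=-12.5, ref_logp_rejected=-11.0)
```

Training on many such pairs pushes the policy toward utterances that the combined reward prefers, relative to the reference model, without needing an explicit reward model at optimization time.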

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring, dialogue evaluation, computer-science