Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues

Research / Other · Relevance: 9/10 · 24 citations · 2025 paper

This paper trains an open-source LLM (Llama 3.1 8B) to generate tutor utterances that maximize student learning outcomes in math tutoring dialogues by optimizing for both student response correctness and pedagogical quality using direct preference optimization. The approach uses a student model to predict correctness and GPT-4o to evaluate pedagogical principles, demonstrating that the trained tutor increases correct student responses while maintaining high pedagogical quality.
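The preference-data construction described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's implementation: `score_correctness` and `score_pedagogy` are toy stand-ins for the paper's student correctness model and GPT-4o pedagogy judge, and the weighting scheme is an assumption.

```python
# Hypothetical sketch: rank candidate tutor utterances by a combined
# reward and keep the best/worst as a (chosen, rejected) DPO pair.
# The two scoring functions below are toy stand-ins, not real models.

def score_correctness(utterance: str) -> float:
    # Stand-in for the student model's predicted probability that the
    # student answers correctly after this tutor turn.
    return 0.9 if "hint" in utterance else 0.4

def score_pedagogy(utterance: str) -> float:
    # Stand-in for an LLM judge's 0-1 rating against pedagogical
    # principles (e.g., guiding questions over giving away answers).
    return 0.8 if "?" in utterance else 0.3

def build_preference_pair(candidates, w_correct=0.5, w_pedagogy=0.5):
    """Score each candidate utterance and return the (chosen, rejected)
    pair that would form one DPO training example."""
    scored = sorted(
        candidates,
        key=lambda u: w_correct * score_correctness(u)
                      + w_pedagogy * score_pedagogy(u),
        reverse=True,
    )
    return scored[0], scored[-1]

candidates = [
    "The answer is 12.",
    "Here's a hint: what do you get if you factor out 3?",
]
chosen, rejected = build_preference_pair(candidates)
```

In the paper's setting, the resulting (chosen, rejected) pairs would then be fed to a standard DPO trainer to fine-tune the tutor model.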

Generative artificial intelligence (AI) has the potential to scale up personalized tutoring through large language models (LLMs). Recent AI tutors are adapted for the tutoring task by training or prompting LLMs to follow effective pedagogical principles, though they are not trained to maximize student learning throughout the course of a dialogue. Therefore, they may engage with students in a suboptimal way. We address this limitation by introducing an approach to train LLMs to generate tutor utterances […]

Study Type

Research / Other

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring, dialogue evaluation, computer-science