Letting Tutor Personas "Speak Up" for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization

Relevance: 9/10 (2026 paper)

This paper develops a method using preference optimization to learn steering vectors that capture diverse tutoring personas from human tutor-student math dialogues, enabling LLMs to exhibit different instructional styles (scaffolding levels, affective support, directiveness) rather than a single tutoring approach. The method is evaluated on real K-12 math tutoring dialogues, showing improved alignment with ground-truth tutor behaviors while preserving pedagogical quality.

With the emergence of large language models (LLMs) as a powerful class of generative artificial intelligence (AI), their use in tutoring has become increasingly prominent. Prior work on LLM-based tutoring typically learns a single tutor policy and does not capture the diversity of tutoring styles. In real-world tutor-student interactions, pedagogical intent is realized through adaptive instructional strategies, with tutors varying the level of scaffolding, instructional directiveness, feedback, and affective support.
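The core idea can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: it assumes a steering vector is added to a hidden activation and trained with a Bradley-Terry/DPO-style preference loss, where "chosen" and "rejected" readout directions stand in for preferred vs. dispreferred tutor responses. All names (`alpha`, `hidden_dim`, the readout vectors) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 8  # toy dimension; real LLM hidden states are much larger

def apply_steering(h, v, alpha=1.0):
    """Steer a hidden activation by adding a scaled steering vector."""
    return h + alpha * v

def preference_step(v, h, w_chosen, w_rejected, lr=0.1, beta=1.0):
    """One preference-optimization step on the steering vector v.

    Minimizes -log sigmoid(beta * margin), where the margin is the score
    gap between the 'chosen' and 'rejected' readouts of the steered state.
    """
    h_s = apply_steering(h, v)
    margin = beta * (h_s @ w_chosen - h_s @ w_rejected)
    sigma = 1.0 / (1.0 + np.exp(-margin))
    # Gradient of -log sigmoid(margin) with respect to v
    grad = -(1.0 - sigma) * beta * (w_chosen - w_rejected)
    return v - lr * grad

# Toy data: one hidden state and two readout directions standing in for
# preferred vs. dispreferred tutoring styles.
h = rng.normal(size=hidden_dim)
w_chosen = rng.normal(size=hidden_dim)
w_rejected = rng.normal(size=hidden_dim)

v = np.zeros(hidden_dim)  # steering vector, learned from preferences
for _ in range(50):
    v = preference_step(v, h, w_chosen, w_rejected)
```

After training, the steered state scores higher under the preferred readout than the dispreferred one, which is the sense in which a learned steering vector biases generation toward one tutoring persona; the paper learns such vectors from real tutor-student dialogues rather than toy readouts.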

Tool Types

AI Tutors 1-to-1 conversational tutoring systems.

Tags

tutoring dialogue evaluation, computer-science