Letting Tutor Personas "Speak Up" for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization
This paper develops a method using preference optimization to learn steering vectors that capture diverse tutoring personas from human tutor-student math dialogues, enabling LLMs to exhibit different instructional styles (scaffolding levels, affective support, directiveness) rather than a single tutoring approach. The method is evaluated on real K-12 math tutoring dialogues, showing improved alignment with ground-truth tutor behaviors while preserving pedagogical quality.
With the emergence of large language models (LLMs) as a powerful class of generative artificial intelligence (AI), their use in tutoring has become increasingly prominent. Prior work on LLM-based tutoring typically learns a single tutor policy and does not capture the diversity of tutoring styles. In real-world tutor-student interactions, pedagogical intent is realized through adaptive instructional strategies, with tutors varying the level of scaffolding, instructional directiveness, feedback, and affective support.
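The abstract describes the core mechanism only at a high level. As a rough illustration, below is a minimal PyTorch sketch of one plausible instantiation: a learnable steering vector added to the hidden states of a single transformer layer, trained with a DPO-style preference loss in which the ground-truth tutor turn is the preferred response. Everything here is an assumption for illustration, not the paper's actual implementation: the class and function names, the GPT-2-style layer path `model.transformer.h`, and the `beta` temperature are all hypothetical.

```python
import torch
import torch.nn.functional as F

class SteeredLM(torch.nn.Module):
    """Wraps a frozen causal LM and adds a learnable steering vector to the
    hidden states of one transformer layer via a forward hook. (Hypothetical
    sketch; layer path is GPT-2 style and architecture-dependent.)"""

    def __init__(self, model, layer_idx: int, hidden_size: int):
        super().__init__()
        self.model = model
        for p in self.model.parameters():   # base LM stays frozen
            p.requires_grad_(False)
        self.v = torch.nn.Parameter(torch.zeros(hidden_size))  # steering vector
        self.model.transformer.h[layer_idx].register_forward_hook(self._steer)

    def _steer(self, module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + self.v            # shift every position's activation
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    def forward(self, input_ids):
        return self.model(input_ids)


def sequence_logprob(model, input_ids):
    """Total log-probability of a token sequence (padding ignored for brevity)."""
    logits = model(input_ids).logits[:, :-1]
    logp = F.log_softmax(logits, dim=-1)
    tokens = input_ids[:, 1:].unsqueeze(-1)
    return logp.gather(-1, tokens).squeeze(-1).sum(-1)


def preference_loss(steered, reference, ids_chosen, ids_rejected, beta=0.1):
    """DPO-style loss: prefer the ground-truth tutor turn over an alternative
    response, measured relative to the unsteered reference model."""
    pi_w = sequence_logprob(steered, ids_chosen)
    pi_l = sequence_logprob(steered, ids_rejected)
    with torch.no_grad():                   # reference model is not updated
        ref_w = sequence_logprob(reference, ids_chosen)
        ref_l = sequence_logprob(reference, ids_rejected)
    margin = beta * ((pi_w - ref_w) - (pi_l - ref_l))
    return -F.logsigmoid(margin).mean()
```

Under these assumptions, only the steering vector is optimized (e.g., `torch.optim.Adam([steered.v])`), so each tutoring persona reduces to one small learned vector that can be swapped in at inference time without retraining the base model.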