TeachLM: Post-Training LLMs for Education Using Authentic Learning Data
TeachLM is an LLM fine-tuned for one-on-one tutoring, trained on 100,000 hours of authentic student-tutor interactions from Polygence. The paper introduces a novel multi-turn evaluation protocol based on synthetic dialogues and demonstrates improvements in pedagogical interaction, including doubled student talk time, better questioning strategies, and an increased number of dialogue turns.
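To make the reported conversational metrics concrete, the sketch below shows one plausible way to compute student talk share and dialogue turn counts from a transcript. This is not the paper's code; the `Utterance` structure, field names, and the word-count and speaker-change definitions of "talk time" and "turns" are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation): simple dialogue metrics
# of the kind summarized above -- student talk share and number of turns.
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # assumed labels: "student" or "tutor"
    text: str


def dialogue_metrics(transcript: list[Utterance]) -> dict[str, float]:
    """Return student talk share (by word count) and total dialogue turns."""
    student_words = sum(len(u.text.split()) for u in transcript if u.speaker == "student")
    total_words = sum(len(u.text.split()) for u in transcript) or 1

    # Count a new "turn" every time the speaker changes (assumed definition).
    turns = 0 if not transcript else 1 + sum(
        1 for prev, cur in zip(transcript, transcript[1:]) if prev.speaker != cur.speaker
    )
    return {
        "student_talk_share": student_words / total_words,
        "dialogue_turns": float(turns),
    }


if __name__ == "__main__":
    demo = [
        Utterance("tutor", "What do you already know about eigenvalues?"),
        Utterance("student", "They scale eigenvectors under a linear map, I think."),
        Utterance("tutor", "Good. Can you give an example matrix and one of its eigenvalues?"),
        Utterance("student", "The identity matrix has eigenvalue one for every vector."),
    ]
    print(dialogue_metrics(demo))
```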
The promise of generative AI to revolutionize education is constrained by the pedagogical limits of large language models (LLMs). A major issue is the lack of access to high-quality training data that reflect the learning of actual students. Prompt engineering has emerged as a stopgap, but the ability of prompts to encode complex pedagogical strategies in rule-based natural language is inherently limited. To address this gap, we introduce TeachLM - an LLM optimized for teaching through parameter-