Hierarchical Pedagogical Oversight: A Multi-Agent Adversarial Framework for Reliable AI Tutoring
This paper introduces Hierarchical Pedagogical Oversight (HPO), a multi-agent adversarial framework that evaluates AI-tutor responses in K-12 mathematics dialogues, using specialist agents and structured debate to assess mistake identification and guidance quality. The system is validated on MRBench, a dataset of 1,214 middle-school mathematics tutoring dialogues, where it achieves superior performance in detecting sycophancy and inappropriate scaffolding.
Large Language Models (LLMs) are increasingly deployed as automated tutors to address educator shortages; however, they often fail at pedagogical reasoning, frequently validating incorrect student solutions (sycophancy) or providing overly direct answers that hinder learning. We introduce Hierarchical Pedagogical Oversight (HPO), a framework that adapts structured adversarial synthesis to educational assessment. Unlike cooperative multi-agent systems that often drift toward superficial consensus, HPO pits specialist agents against one another in structured debate, so that pedagogical failures such as sycophancy and inappropriate scaffolding are surfaced rather than smoothed over.
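To make the evaluation pipeline concrete, the following is a minimal sketch of the specialist-agent and debate-aggregation pattern described above. All names (`mistake_identifier`, `scaffolding_critic`, `adversarial_judge`) and the rule-based heuristics standing in for LLM agents are hypothetical illustrations, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    agent: str       # which specialist produced this verdict
    flagged: bool    # whether the specialist objects to the tutor turn
    rationale: str   # short justification surfaced during debate

def mistake_identifier(student_answer: str, correct_answer: str,
                       tutor_reply: str) -> Verdict:
    # Sycophancy check (toy heuristic): tutor praises an incorrect solution.
    wrong = student_answer.strip() != correct_answer.strip()
    praised = any(w in tutor_reply.lower()
                  for w in ("correct", "great job", "well done"))
    bad = wrong and praised
    return Verdict("mistake_identifier", bad,
                   "praised an incorrect answer" if bad else "ok")

def scaffolding_critic(tutor_reply: str, correct_answer: str) -> Verdict:
    # Guidance check (toy heuristic): tutor leaks the final answer verbatim
    # instead of scaffolding toward it.
    leaked = correct_answer in tutor_reply
    return Verdict("scaffolding_critic", leaked,
                   "revealed the answer directly" if leaked else "ok")

def adversarial_judge(verdicts: list[Verdict]) -> dict:
    # Aggregation step: any surviving specialist objection blocks acceptance,
    # preventing the drift toward consensus seen in cooperative setups.
    objections = [v.rationale for v in verdicts if v.flagged]
    return {"acceptable": not objections, "objections": objections}

# Usage: evaluate one sycophantic tutoring turn.
verdicts = [
    mistake_identifier("x = 5", "x = 7", "Great job, that's correct!"),
    scaffolding_critic("Great job, that's correct!", "x = 7"),
]
result = adversarial_judge(verdicts)
```

In this example the mistake-identification specialist objects (the student's answer is wrong but the tutor praised it), so the judge rejects the turn; real HPO agents would of course be LLM-backed rather than keyword heuristics.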