Hierarchical Pedagogical Oversight: A Multi-Agent Adversarial Framework for Reliable AI Tutoring

Benchmark (Published & Automated) · Relevance: 9/10 · 2025 paper

This paper introduces Hierarchical Pedagogical Oversight (HPO), a multi-agent adversarial framework that uses structured debate between specialist agents to evaluate AI tutor responses for pedagogical quality, specifically detecting sycophancy and overly direct answers. The framework is evaluated on the MRBench dataset of 1,214 middle-school mathematics dialogues, where it outperforms GPT-4o at classifying mistake identification and guidance quality.

Large Language Models (LLMs) are increasingly deployed as automated tutors to address educator shortages; however, they often fail at pedagogical reasoning, frequently validating incorrect student solutions (sycophancy) or providing overly direct answers that hinder learning. We introduce Hierarchical Pedagogical Oversight (HPO), a framework that adapts structured adversarial synthesis to educational assessment. Unlike cooperative multi-agent systems that often drift toward superficial consensus, HPO pits specialist agents against one another in structured debate, so that disagreements about pedagogical quality are surfaced and adjudicated rather than averaged away.
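The adversarial-debate idea can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the agent roles (`advocate`, `critic`, `judge`), the `TutorTurn` fields, and the rule-based heuristics standing in for LLM agents are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TutorTurn:
    """Hypothetical features of one tutor response (not the paper's schema)."""
    student_answer_correct: bool
    tutor_affirms: bool          # did the tutor validate the student's answer?
    tutor_reveals_answer: bool   # did the tutor state the solution directly?


def advocate(turn: TutorTurn) -> list[str]:
    """Specialist agent arguing the response is pedagogically sound."""
    merits = []
    if not turn.tutor_reveals_answer:
        merits.append("withholds the direct answer")
    if turn.tutor_affirms and turn.student_answer_correct:
        merits.append("validates a genuinely correct solution")
    return merits


def critic(turn: TutorTurn) -> list[str]:
    """Adversarial agent hunting for sycophancy and over-directness."""
    flaws = []
    if turn.tutor_affirms and not turn.student_answer_correct:
        flaws.append("sycophancy: validates an incorrect solution")
    if turn.tutor_reveals_answer:
        flaws.append("over-direct: reveals the answer outright")
    return flaws


def judge(turn: TutorTurn) -> str:
    """Synthesizes both sides into a verdict; any surfaced flaw dominates."""
    flaws = critic(turn)
    if flaws:
        return "reject: " + "; ".join(flaws)
    merits = advocate(turn)
    return "accept: " + "; ".join(merits) if merits else "accept"


# A sycophantic turn: the student was wrong but the tutor agreed anyway.
verdict = judge(TutorTurn(student_answer_correct=False,
                          tutor_affirms=True,
                          tutor_reveals_answer=False))
print(verdict)  # reject: sycophancy: validates an incorrect solution
```

In the real framework the two sides would be LLM agents exchanging arguments over dialogue transcripts; the point of the sketch is only the structure: independent adversarial critiques, then a judge that resolves them instead of seeking consensus.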

Study Type

Benchmark (Published & Automated)

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring, dialogue evaluation, computer-science