EduGuardBench: A Holistic Benchmark for Evaluating the Pedagogical Fidelity and Adversarial Safety of LLMs as Simulated Teachers

Relevance: 10/10 | 2025 paper

EduGuardBench is a dual-component benchmark designed to evaluate LLMs acting as simulated teachers, measuring both pedagogical fidelity (role-playing accuracy, teaching competence) and adversarial safety (resistance to jailbreaking, handling of academic misconduct requests). The benchmark identifies harmful teaching behaviors (incompetence, indolence, offensiveness) and uses persona-based adversarial prompts to test ethical boundaries specific to educational contexts.

Large Language Models for Simulating Professions (SP-LLMs), particularly as teachers, are pivotal for personalized education. However, ensuring their professional competence and ethical safety is a critical challenge, as existing benchmarks fail to measure role-playing fidelity or address the unique teaching harms inherent in educational scenarios. To address this, we propose EduGuardBench, a dual-component benchmark. It assesses professional fidelity using a Role-playing Fidelity Score (RFS) wh

Framework Categories

2.1 Pedagogical knowledge
2.2 Pedagogy of generated outputs
2.3 Pedagogical interactions
5 Ethics and bias

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.
