EduGuardBench: A Holistic Benchmark for Evaluating the Pedagogical Fidelity and Adversarial Safety of LLMs as Simulated Teachers

Benchmark (Published & Automated) | Relevance: 9/10 | 2025 paper

EduGuardBench is a dual-component benchmark that evaluates LLMs acting as simulated teachers. The first component measures professional fidelity with a Role-playing Fidelity Score (RFS), which is used to detect pedagogical harms such as incompetence, indolence, and offensiveness. The second component measures adversarial safety, using persona-based jailbreak prompts to test vulnerability to harmful requests, including academic misconduct. The benchmark tests 14 leading models and is publicly available with automated evaluation code.

Large Language Models for Simulating Professions (SP-LLMs), particularly as teachers, are pivotal for personalized education. However, ensuring their professional competence and ethical safety is a critical challenge, as existing benchmarks fail to measure role-playing fidelity or address the unique teaching harms inherent in educational scenarios. To address this, we propose EduGuardBench, a dual-component benchmark. It assesses professional fidelity using a Role-playing Fidelity Score (RFS) wh

Study Type

Benchmark (Published & Automated)

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

teacher knowledge evaluation AI computer-science