BD at BEA 2025 Shared Task: MPNet Ensembles for Pedagogical Mistake Identification and Localization in AI Tutor Responses

Benchmark (Published & Automated); Relevance: 9/10; 1 cited; 2025 paper

This paper presents an MPNet ensemble system for the BEA 2025 Shared Task that automatically classifies AI tutor responses in educational dialogues across two tracks: whether tutors correctly identify student mistakes (Track 1) and whether they locate those mistakes (Track 2). The system uses fine-tuned Transformer models with grouped cross-validation and a hard-voting ensemble to achieve macro-F1 scores of 0.7110 on Track 1 and 0.5543 on Track 2.
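The three techniques named in the summary (grouped cross-validation, hard voting, and macro-F1) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names, the round-robin fold assignment, the tie-breaking rule, and the label strings are all assumptions made for illustration, and grouping by dialogue ID is likewise an assumed choice of group.

```python
from collections import Counter


def grouped_folds(groups, n_folds):
    """Split item indices into folds so that all items sharing a group
    (assumed here: a dialogue ID) land in the same fold."""
    unique = sorted(set(groups))
    # Round-robin assignment of groups to folds (an illustrative choice).
    fold_of_group = {g: i % n_folds for i, g in enumerate(unique)}
    folds = [[] for _ in range(n_folds)]
    for idx, g in enumerate(groups):
        folds[fold_of_group[g]].append(idx)
    return folds


def hard_vote(predictions_per_model):
    """Majority vote over per-model label lists.
    Ties go to the earliest model's label among the tied ones (assumed rule)."""
    voted = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        best = max(counts.values())
        voted.append(next(l for l in labels if counts[l] == best))
    return voted


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder label strings for a three-class task (hypothetical).
    model_a = ["Yes", "No", "No"]
    model_b = ["Yes", "Yes", "No"]
    model_c = ["No", "Yes", "To some extent"]
    ensemble = hard_vote([model_a, model_b, model_c])
    print(ensemble)
    print(macro_f1(["Yes", "Yes", "No"], ensemble))
```

Keeping every response from one dialogue in the same fold prevents a model from being validated on a dialogue it has already seen during training, and macro-F1 weights each class equally regardless of how often it occurs.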

We present Team BD's submission to the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors, under Track 1 (Mistake Identification) and Track 2 (Mistake Location). Both tracks involve three-class classification of tutor responses in educational dialogues: determining whether a tutor correctly recognizes a student's mistake (Track 1) and whether the tutor pinpoints the mistake's location (Track 2). Our system is built on MPNet, a Transformer-based language model that combines B

Study Type

Benchmark (Published & Automated)

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring dialogue evaluation computer-science