Pedagogy-driven Evaluation of Generative AI-powered Intelligent Tutoring Systems

Research / Other | Relevance: 9/10 | 1 cited | 2025 paper

This paper critically reviews evaluation practices for GenAI-powered Intelligent Tutoring Systems (ITSs), highlighting the lack of reliable, pedagogy-driven evaluation frameworks and benchmarks. It analyzes existing challenges through case studies and proposes three research directions for developing fair, unified, and scalable ITS evaluation methodologies grounded in learning science principles.

The interdisciplinary research domain of Artificial Intelligence in Education (AIED) has a long history of developing Intelligent Tutoring Systems (ITSs) by integrating insights from technological advancements, educational theories, and cognitive psychology. The remarkable success of generative AI (GenAI) models has accelerated the development of large language model (LLM)-powered ITSs, which have the potential to imitate human-like, pedagogically rich, and cognitively demanding tutoring. However, t

Study Type

Research / Other

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

tutoring dialogue evaluation, computer-science