Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach
This paper presents LearnLM-Tutor, a Gemini model fine-tuned for education, and introduces a comprehensive evaluation framework of seven diverse benchmarks (quantitative and qualitative, automatic and human) grounded in learning-science principles to assess the pedagogical capabilities of AI tutoring systems. The work includes a real-world deployment in Arizona State University's Study Hall and shows that educators and learners consistently prefer LearnLM-Tutor over a prompt-tuned Gemini across multiple pedagogical dimensions.
A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily due to the difficulty of verbalising pedagogical intuitions into gen AI prompts and the lack of