Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning

Benchmark (Not Published). Relevance: 8/10. 2025 paper.

EduAlign is a framework that uses reinforcement learning to align large language models with three pedagogical principles: Helpfulness, Personalization, and Creativity (HPC). The authors develop HPC-RM, a multi-dimensional reward model trained on 8k annotated educational interactions, and use it to fine-tune an LLM via Group Relative Policy Optimization, demonstrating improved pedagogical alignment in AI tutoring responses.
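The group-relative step in that pipeline can be illustrated with a minimal sketch. It assumes the reward model returns one score per HPC dimension and that the three scores are averaged with equal weights. The weighting, the score scale, and the function names `combine_hpc` and `group_relative_advantages` are illustrative assumptions, not details taken from the paper. The normalization shown (reward minus group mean, divided by group standard deviation) is the commonly described form of the Group Relative Policy Optimization advantage.

```python
from statistics import mean, pstdev


def combine_hpc(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Collapse (helpfulness, personalization, creativity) into one scalar.

    Equal weights are an assumption for illustration only.
    """
    return sum(w * s for w, s in zip(weights, scores))


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other responses in its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical HPC scores for four responses sampled for one student prompt.
group_scores = [
    (4.0, 3.0, 2.0),
    (5.0, 4.0, 3.0),
    (2.0, 2.0, 2.0),
    (3.0, 3.0, 3.0),
]

rewards = [combine_hpc(s) for s in group_scores]
advantages = group_relative_advantages(rewards)

for scores, adv in zip(group_scores, advantages):
    print(scores, round(adv, 3))
```

Responses that score above the group average receive a positive advantage and are reinforced, while those below it are discouraged. No separate value network is needed for the baseline.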

The integration of large language models (LLMs) into education presents unprecedented opportunities for scalable personalized learning. However, standard LLMs often function as generic information providers, lacking alignment with fundamental pedagogical principles such as helpfulness, student-centered personalization, and creativity cultivation. To bridge this gap, we propose EduAlign, a novel framework designed to guide LLMs toward becoming more effective and responsible educational assistants.

Study Type

Benchmark (Not Published)

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

benchmark dataset education learning computer-science