Enabling Multi-Agent Systems as Learning Designers: Applying Learning Sciences to AI Instructional Design
This paper evaluates three multi-agent LLM systems for generating K-12 math and science learning activities guided by the Knowledge-Learning-Instruction (KLI) framework, comparing their pedagogical quality through teacher evaluations and LLM-as-judge assessments using Quality Matters standards. The collaborative multi-agent system (MAS-CMD) produced activities that teachers found significantly more creative, contextually relevant, and classroom-ready despite only small differences in rubric scores.
K-12 educators are increasingly using Large Language Models (LLMs) to create instructional materials. These systems excel at producing fluent, coherent content, but often lack support for high-quality teaching. The reason is twofold: first, commercial LLMs such as ChatGPT and Gemini, which are among the most widely accessible to teachers, do not come preloaded with the depth of pedagogical theory needed to design truly effective activities; second, although sophisticated prompt engineering can b