Problems With Large Language Models for Learner Modelling: Why LLMs Alone Fall Short for Responsible Tutoring in K-12 Education
This paper empirically evaluates the limitations of LLM-based tutoring systems for K-12 education by comparing a deep knowledge tracing (DKT) model against a widely used LLM (with and without fine-tuning) on learner knowledge assessment accuracy, reliability, and temporal coherence, using large-scale student interaction data. The study demonstrates that specialised learner modelling approaches substantially outperform LLMs both at predicting student performance and at maintaining consistent knowledge-state estimates over time.
The rapid rise of large language model (LLM)-based tutors in K-12 education has fostered a misconception that generative models can replace traditional learner modelling for adaptive instruction. This is especially problematic in K-12 settings, which the EU AI Act classifies as a high-risk domain requiring responsible design. Motivated by these concerns, this study synthesises evidence on the limitations of LLM-based tutors and empirically investigates one critical issue: the accuracy, reliability,