Alignment Drift in CEFR-prompted LLMs for Interactive Spanish Tutoring

Research / Other Relevance: 7/10 5 cited 2025 paper

This paper evaluates whether CEFR-based system prompting can reliably constrain LLMs to generate Spanish text appropriate to different student proficiency levels (A1, B1, C1) in simulated tutor-student dialogues, finding that prompting effectiveness degrades over sustained interactions (alignment drift). The study uses automated dialogue simulation with open-source LLMs ranging from 7B to 12B parameters to assess proficiency-aligned adaptive tutoring without human participants.

This paper investigates the potentials of Large Language Models (LLMs) as adaptive tutors in the context of second-language learning. In particular, we evaluate whether system prompting can reliably constrain LLMs to generate only text appropriate to the student's competence level. We simulate full teacher-student dialogues in Spanish using instruction-tuned, open-source LLMs ranging in size from 7B to 12B parameters. Dialogues are generated by having an LLM alternate between tutor and student r

Study Type

Research / Other

Tool Types

AI Tutors 1-to-1 conversational tutoring systems.
Personalised Adaptive Learning Systems that adapt content and difficulty to individual learners.

Tags

tutoring dialogue evaluationcomputer-science