Language Bottleneck Models for Qualitative Knowledge State Modeling
This paper introduces Language Bottleneck Models (LBMs), which use LLMs to generate interpretable natural-language summaries of student knowledge states from students' quiz and test responses, and then use those summaries to predict future performance. The approach aims to provide more nuanced diagnostic insights (including misconceptions) than traditional Cognitive Diagnosis and Knowledge Tracing models while maintaining competitive predictive accuracy.
Accurately assessing student knowledge is central to education. Cognitive Diagnosis (CD) models estimate student proficiency at a fixed point in time, while Knowledge Tracing (KT) methods model evolving knowledge states to predict future performance. However, existing approaches either provide quantitative concept-mastery estimates with limited expressivity (CD, probabilistic KT) or prioritize predictive accuracy at the cost of interpretability (deep-learning KT). We propose Language Bottleneck Models (LBMs), which compress a student's observed responses into an interpretable natural-language summary of their knowledge state and condition predictions of future performance solely on that summary.
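The two-stage pipeline can be pictured with a minimal sketch, assuming a generic `llm(prompt) -> str` callable; the `Interaction` record, function names, and prompt wording below are illustrative assumptions rather than the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical minimal record of one observed question-answer interaction.
@dataclass
class Interaction:
    question: str
    answer: str
    correct: bool

def summarize_knowledge_state(history: List[Interaction],
                              llm: Callable[[str], str]) -> str:
    """Encoder step: compress the observed responses into a short
    natural-language knowledge-state summary (the 'bottleneck')."""
    transcript = "\n".join(
        f"Q: {it.question}\nA: {it.answer} "
        f"({'correct' if it.correct else 'incorrect'})"
        for it in history
    )
    prompt = (
        "Summarize this student's knowledge state, including likely "
        "misconceptions, in a few sentences.\n\n" + transcript
    )
    return llm(prompt)

def predict_correctness(summary: str, new_question: str,
                        llm: Callable[[str], str]) -> bool:
    """Decoder step: predict performance on an unseen question using
    only the text summary, not the raw response history."""
    prompt = (
        f"Student knowledge state: {summary}\n\n"
        "Will this student answer the following question correctly? "
        f"Answer 'yes' or 'no'.\nQuestion: {new_question}"
    )
    return llm(prompt).strip().lower().startswith("yes")
```

The key design point is that the textual summary is the only channel between the two stages, so predictive accuracy on held-out responses measures how much diagnostic information the summary actually carries.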