LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education
This paper evaluates bias in large language models when they act as personalized tutors, analyzing how 9 LLMs generate and select educational content for different demographic groups (race, ethnicity, sex, gender, disability status, income, and national origin) using over 17,000 educational explanations. The study introduces two bias metrics, MAB and MDB, to measure whether models provide difficulty-appropriate content fairly across student demographics.
With the increasing adoption of large language models (LLMs) in education, concerns about inherent biases in these models have gained prominence. We evaluate LLMs for bias in the personalized educational setting, specifically focusing on the models' roles as "teachers." We reveal significant biases in how models generate and select educational content tailored to different demographic groups, including race, ethnicity, sex, gender, disability status, income, and national origin. We introduce and apply two bias metrics, MAB and MDB, to measure whether models provide difficulty-appropriate content fairly across these groups.
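To make the metrics concrete, below is a minimal Python sketch of how group-level bias scores of this kind might be computed. The exact formulas for MAB and MDB are not stated in this excerpt, so the definitions here (MAB as the mean absolute deviation of per-group scores from a neutral baseline, MDB as the largest gap between any two groups) and all names and values are illustrative assumptions, not the paper's specification.

```python
import numpy as np


def mean_absolute_bias(group_scores: dict[str, float], baseline: float) -> float:
    """Mean absolute deviation of each demographic group's score from a
    baseline (e.g., the score of content generated with no demographic cue).
    Illustrative definition; the paper's exact MAB formula may differ."""
    return float(np.mean([abs(s - baseline) for s in group_scores.values()]))


def maximum_difference_bias(group_scores: dict[str, float]) -> float:
    """Largest gap between any two demographic groups' scores.
    Illustrative definition; the paper's exact MDB formula may differ."""
    scores = list(group_scores.values())
    return max(scores) - min(scores)


# Hypothetical per-group scores, e.g., average readability or difficulty
# of generated explanations for students described as belonging to each group.
scores = {"group_a": 8.1, "group_b": 7.4, "group_c": 8.9}
print(mean_absolute_bias(scores, baseline=8.0))  # 0.533... (mean of 0.1, 0.6, 0.9)
print(maximum_difference_bias(scores))           # 1.5 (8.9 - 7.4)
```

Under these assumed definitions, a perfectly fair model would score 0 on both metrics: every group would receive content at the same difficulty level as the neutral baseline, and no two groups would differ from each other.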