LLM Safety for Children

Benchmark (Not Published) · Relevance: 8/10 · 4 citations · 2025 paper

This paper develops a comprehensive taxonomy of content harms specific to children interacting with LLMs and creates Child User Models based on child psychology literature to evaluate the safety of six state-of-the-art LLMs through red-teaming. The evaluation reveals significant safety gaps in LLMs for child-specific harm categories that are not captured by standard adult-focused safety evaluations.

This paper analyzes the safety of Large Language Models (LLMs) in interactions with children under 18 years of age. Despite the transformative applications of LLMs in many aspects of children's lives, such as education and therapy, there remains a significant gap in understanding and mitigating content harms specific to this demographic. The study acknowledges the diversity of children, which standard safety evaluations often overlook, and proposes a comprehensive approach to evaluating LLM safety for this population.

Study Type

Benchmark (Not Published)

Framework Categories

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

safety evaluation · language model · children · computer-science