Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions

Relevance: 9/10 · 2 citations · 2025 paper

Safe-Child-LLM introduces a comprehensive benchmark of 200 adversarial prompts designed to systematically evaluate LLM safety across two developmental stages (children aged 7-12 and adolescents aged 13-17), testing leading models such as ChatGPT, Claude, and Gemini for age-appropriate responses and ethical refusal behaviors. The benchmark addresses critical safety gaps in child-facing AI interactions by measuring how well LLMs handle developmental-stage-specific vulnerabilities and harmful content targeting minors.

As Large Language Models (LLMs) increasingly power applications used by children and adolescents, ensuring safe and age-appropriate interactions has become an urgent ethical imperative. Despite progress in AI safety, current evaluations predominantly focus on adults, neglecting the unique vulnerabilities of minors engaging with generative AI. We introduce Safe-Child-LLM, a comprehensive benchmark and dataset for systematically assessing LLM safety across two developmental stages: children (7-12) and adolescents (13-17).
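To make the evaluation protocol concrete, the following is a minimal Python sketch of how such a benchmark loop could be structured: adversarial prompts are bucketed by developmental stage, each model response is checked for an ethical refusal, and a per-stage refusal rate is reported. The prompt strings, the `model` callable, and the `is_refusal` rubric below are placeholders and assumptions for illustration, not the paper's actual harness, dataset, or scoring method.

```python
# Hedged sketch of a developmental-stage safety evaluation loop.
# Assumed components (not from the paper): `model`, `is_refusal`, prompt strings.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Prompt:
    text: str        # adversarial prompt targeting a minor-specific vulnerability
    age_group: str   # "child_7_12" or "adolescent_13_17"


def evaluate_safety(
    prompts: List[Prompt],
    model: Callable[[str], str],        # hypothetical wrapper around ChatGPT, Claude, Gemini, etc.
    is_refusal: Callable[[str], bool],  # hypothetical rubric/classifier for ethical refusal
) -> Dict[str, float]:
    """Return the ethical-refusal rate per developmental stage."""
    refusals: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for p in prompts:
        response = model(p.text)
        totals[p.age_group] = totals.get(p.age_group, 0) + 1
        if is_refusal(response):
            refusals[p.age_group] = refusals.get(p.age_group, 0) + 1
    return {group: refusals.get(group, 0) / n for group, n in totals.items()}


if __name__ == "__main__":
    # Placeholder prompts and stand-in components for a runnable example.
    prompts = [
        Prompt("<adversarial prompt aimed at a child-specific vulnerability>", "child_7_12"),
        Prompt("<adversarial prompt aimed at an adolescent-specific vulnerability>", "adolescent_13_17"),
    ]
    dummy_model = lambda text: "I can't help with that, but I can suggest talking to a trusted adult."
    dummy_refusal = lambda resp: "can't help" in resp.lower()
    print(evaluate_safety(prompts, dummy_model, dummy_refusal))
```

In practice, the refusal judgment would come from the benchmark's own rubric or a human/automated grader rather than the keyword check shown above; the sketch only illustrates how results can be aggregated per developmental stage.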

Framework Categories

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

safety evaluation, language model, children, computer-science