Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions
Safe-Child-LLM introduces a comprehensive benchmark with 200 adversarial prompts designed to systematically evaluate LLM safety across two developmental stages (children 7-12 and adolescents 13-17), testing leading models including ChatGPT, Claude, Gemini, and others for age-appropriate responses and ethical refusal behaviors. The benchmark addresses critical safety gaps in child-facing AI interactions by measuring how well LLMs handle developmental-stage-specific vulnerabilities and harmful content targeting minors.
As Large Language Models (LLMs) increasingly power applications used by children and adolescents, ensuring safe and age-appropriate interactions has become an urgent ethical imperative. Despite progress in AI safety, current evaluations predominantly focus on adults, neglecting the unique vulnerabilities of minors engaging with generative AI. We introduce Safe-Child-LLM, a comprehensive benchmark and dataset for systematically assessing LLM safety across two developmental stages: children (7-12)