MinorBench: A hand-built benchmark for content-based risks for children

Benchmark (Published & Automated) Relevance: 10/10 4 cited 2025 paper

MinorBench is a hand-built, open-source benchmark that evaluates LLMs' ability to refuse unsafe or age-inappropriate content requests from children, using a taxonomy of content-based risks specific to minors derived from real middle-school chatbot deployment. The benchmark tests six prominent LLMs under different system prompts to assess child-safety compliance.

Large Language Models (LLMs) are rapidly entering children's lives - through parent-driven adoption, schools, and peer networks - yet current AI ethics and safety research do not adequately address content-related risks specific to minors. In this paper, we highlight these gaps with a real-world case study of an LLM-based chatbot deployed in a middle school setting, revealing how students used and sometimes misused the system. Building on these findings, we propose a new taxonomy of content-base

Study Type

Benchmark (Published & Automated)

Tool Types

AI Tutors 1-to-1 conversational tutoring systems.

Tags

safety evaluation language model childrencomputer-science