KidsArtBench: Multi-Dimensional Children's Art Evaluation with Attribute-Aware MLLMs
KidsArtBench introduces a benchmark of 1,046 children's artworks (ages 5-15) annotated by expert educators across 9 rubric-aligned dimensions, designed to evaluate MLLMs' ability to provide multi-dimensional assessment and formative feedback on student artwork. The paper proposes an attribute-aware multi-LoRA approach with regression-aware fine-tuning to align AI predictions with ordinal evaluation scales used in art education.
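One way to read "regression-aware" scoring on an ordinal scale is sketched below: the model's distribution over discrete score levels is collapsed into a probability-weighted expected score, and the loss penalizes its squared distance from the educator's rating. This is only an illustration under stated assumptions, not the paper's implementation. The 5-point scale, the function names `expected_score` and `squared_error_loss`, and the example logits are all hypothetical.

```python
import math

def expected_score(logits, levels):
    """Probability-weighted mean of the ordinal levels under a softmax over logits."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sum(p * lv for p, lv in zip(probs, levels))

def squared_error_loss(logits, levels, target):
    """Squared distance between the expected score and the annotated score."""
    return (expected_score(logits, levels) - target) ** 2

# Hypothetical 5-point rubric scale; the scale used in the paper is not stated here.
LEVELS = [1, 2, 3, 4, 5]

# Hypothetical logits for the five score levels on one rubric dimension.
logits = [0.1, 0.4, 2.0, 1.2, -0.5]
print(expected_score(logits, LEVELS))
print(squared_error_loss(logits, LEVELS, target=4))
```

Unlike plain cross-entropy over score tokens, a loss of this shape treats a prediction of 4 as closer to a target of 5 than a prediction of 1, which is the property an ordinal scale calls for.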
Multimodal Large Language Models (MLLMs) have shown remarkable progress across many vision-language tasks; however, their capacity to evaluate artistic expression remains limited. Aesthetic concepts are inherently abstract and open-ended, and multimodal artwork annotations are scarce. We introduce KidsArtBench, a new benchmark of 1,046 children's artworks (ages 5-15) annotated by 12 expert educators across 9 rubric-aligned dimensions, together with expert comments for feedback. Unlike prior aestheti