BloomVQA: Assessing Hierarchical Multi-modal Comprehension
BloomVQA is a novel VQA benchmark dataset that evaluates multi-modal comprehension in large vision-language models using picture stories whose questions are mapped to the levels of Bloom's Taxonomy (from Remember through Create). The benchmark comprises 1200 core samples based on stories from early childhood education, with hierarchical graph representations of story content that enable automated evaluation of model performance across different cognitive skill levels.
We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks. Unlike current benchmarks, which often focus on fact-based memorization and simple reasoning tasks without theoretical grounding, we collect multiple-choice samples based on picture stories that reflect different levels of comprehension, as laid out in Bloom's Taxonomy, a classic framework for learning assessment widely adopted in education research. Our data map