Automated Educational Question Generation at Different Bloom's Skill Levels Using Large Language Models: Strategies and Evaluation
This paper evaluates the ability of five large language models (LLMs) to generate educational questions at different cognitive levels of Bloom's taxonomy using various prompting strategies, assessing question quality, linguistic correctness, and pedagogical relevance through both expert human evaluation and LLM-based automated evaluation.
Developing questions that are pedagogically sound and relevant, and that promote learning, is a challenging and time-consuming task for educators. Modern large language models (LLMs) generate high-quality content across multiple domains, potentially helping educators develop such questions. Automated educational question generation (AEQG) is important for scaling online education to cater to a diverse student population. Past attempts at AEQG have shown limited ability to generate questions at higher cognitive levels.
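To make the generate-then-judge setup concrete, here is a minimal sketch of one way such a pipeline could be wired up. It assumes the OpenAI Python client; the model name ("gpt-4o"), prompt wording, rubric, and course material are illustrative placeholders, not the paper's exact prompts, models, or evaluation instrument.

```python
# Minimal sketch: generate a question per Bloom's level, then score it with
# an LLM-as-judge. Prompts, model name, and rubric are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BLOOMS_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]

def generate_question(course_text: str, level: str, model: str = "gpt-4o") -> str:
    """Ask the model for one question targeting a single Bloom's level."""
    prompt = (
        "You are an instructor writing assessment questions.\n"
        f"Course material:\n{course_text}\n\n"
        f"Write ONE question at the '{level}' level of Bloom's taxonomy. "
        "Return only the question text."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return (response.choices[0].message.content or "").strip()

def judge_question(question: str, level: str, model: str = "gpt-4o") -> str:
    """LLM-as-judge: score a generated question against a simple rubric."""
    rubric = (
        f"Question: {question}\n"
        f"Intended Bloom's level: {level}\n"
        "On a 1-5 scale each, rate linguistic correctness, pedagogical "
        "relevance, and fit to the intended level. Answer as JSON."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": rubric}],
        temperature=0.0,  # keep scoring as deterministic as possible
    )
    return (response.choices[0].message.content or "").strip()

if __name__ == "__main__":
    material = ("Dijkstra's algorithm finds shortest paths in graphs "
                "with non-negative edge weights.")
    for level in BLOOMS_LEVELS:
        question = generate_question(material, level)
        print(f"[{level}] {question}")
        print(judge_question(question, level))
```

In practice, the automated scores from a judge model would be compared against expert human ratings on the same dimensions before being trusted at scale.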