How Useful are Educational Questions Generated by Large Language Models?

Research / Other | Relevance: 7/10 | 43 citations | 2023 paper

This paper evaluates the quality and usefulness of educational questions generated by large language models (specifically InstructGPT/GPT-3) using controllable text generation with Bloom's taxonomy and difficulty levels, through human evaluation by teachers across two domains (computer science and biology). Teachers rated the generated questions on quality and usefulness for classroom use.

Controllable text generation (CTG) by large language models has a huge potential to transform education for teachers and students alike. Specifically, high quality and diverse question generation can dramatically reduce the load on teachers and improve the quality of their educational content. Recent work in this domain has made progress with generation, but fails to show that real teachers judge the generated questions as sufficiently useful for the classroom setting; or if instead the question

Study Type

Research / Other

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

large language model evaluation education computer-science