LLM-Based Automated Grading with Human-in-the-Loop

Relevance: 7/10 · 13 citations · 2025 paper

This paper presents GradeHITL, an LLM-based automated grading framework with human-in-the-loop that enables AI to ask clarifying questions about rubrics to human experts, dynamically refining grading standards for short-answer assessment. The system is evaluated on mathematics teaching knowledge questions, using reinforcement learning to filter high-quality clarification questions.
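The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `llm_grade`, `llm_clarifying_questions`, and `rl_filter` are hypothetical stand-ins for the LLM grader, the question generator, and the RL-trained question filter, and `expert` stands in for the human in the loop.

```python
def llm_grade(rubric, answer):
    """Stand-in LLM grader: one point per rubric keyword found in the answer."""
    text = answer.lower()
    return sum(1 for kw in rubric if kw in text)

def llm_clarifying_questions(rubric, answer):
    """Stand-in for LLM-generated questions about unmet rubric criteria."""
    text = answer.lower()
    return [(kw, f"Does the answer's phrasing satisfy the criterion '{kw}'?")
            for kw in rubric if kw not in text]

def rl_filter(questions, top_k=1):
    """Stand-in for the RL-trained filter that keeps only high-quality
    clarification questions; here a placeholder length heuristic."""
    return sorted(questions, key=lambda q: len(q[1]))[:top_k]

def grade_hitl(rubric, answer, expert, max_rounds=3):
    """Ask the expert filtered clarifying questions, refine the effective
    rubric with their replies, then grade against the refined standard."""
    credit = set()  # criteria the expert rules as satisfied
    for _ in range(max_rounds):
        unmet = [kw for kw in rubric if kw not in credit]
        questions = llm_clarifying_questions(unmet, answer)
        if not questions:
            break
        gained = False
        for kw, question in rl_filter(questions):
            if expert(question):  # expert says the criterion is met
                credit.add(kw)
                gained = True
        if not gained:
            break  # expert upheld the rubric as written; stop asking
    return llm_grade(rubric, answer) + len(credit)
```

For example, with rubric `["slope", "intercept"]` and the answer "the slope is 2", an expert who accepts the clarification raises the score from 1 to 2, while an expert who declines leaves it at 1.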

The rise of artificial intelligence (AI) technologies, particularly large language models (LLMs), has brought significant advancements to the education field. Among various applications, automatic short answer grading (ASAG), which focuses on evaluating open-ended textual responses, has seen remarkable progress with LLMs. These models not only improve grading performance over traditional ASAG approaches but also move beyond simple comparison with predefined reference answers, enabling more sophisticated, rubric-based evaluation.

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

large-language-model · evaluation · education · computer-science