CHECK-MAT: Checking Hand-Written Mathematical Answers for the Russian Unified State Exam

Benchmark (Published & Automated) | Relevance: 8/10 | Cited: 2 | 2025 paper

This paper introduces the EGE-Math Solutions Assessment Benchmark, which evaluates Vision-Language Models (VLMs) on grading handwritten mathematical solutions from Russia's high-stakes graduation exam (EGE). Models must assess student work against fixed rubrics, identify errors, and assign grades as human expert graders would. The benchmark includes 122 scanned solutions with official expert grades and tests seven state-of-the-art VLMs across three inference modes.
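As a rough illustration of what comparing model-assigned grades with expert grades could look like, here is a minimal sketch. The record layout (`GradedSolution`), the two metrics (exact-match accuracy and mean absolute error), and the toy records are all assumptions made for this example. They are not taken from the paper and may differ from the metrics the benchmark actually reports.

```python
from dataclasses import dataclass


@dataclass
class GradedSolution:
    """Hypothetical record pairing one scanned solution with two grades."""
    solution_id: str   # invented identifier, not a benchmark field
    expert_grade: int  # grade assigned by the human expert
    model_grade: int   # grade predicted by the VLM


def exact_match_accuracy(items):
    """Fraction of solutions where the model grade equals the expert grade."""
    if not items:
        raise ValueError("no graded solutions to score")
    return sum(s.model_grade == s.expert_grade for s in items) / len(items)


def mean_absolute_error(items):
    """Average absolute gap between model and expert grades, in grade points."""
    if not items:
        raise ValueError("no graded solutions to score")
    return sum(abs(s.model_grade - s.expert_grade) for s in items) / len(items)


# Toy records invented for illustration only (not data from the benchmark).
toy = [
    GradedSolution("s1", expert_grade=2, model_grade=2),
    GradedSolution("s2", expert_grade=1, model_grade=0),
    GradedSolution("s3", expert_grade=3, model_grade=1),
    GradedSolution("s4", expert_grade=0, model_grade=0),
]

print(exact_match_accuracy(toy))
print(mean_absolute_error(toy))
```

Exact match treats every disagreement equally, while the absolute error distinguishes a one-point miss from a larger one. Which of these (if either) suits a rubric-based exam depends on how the grades are used.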

This paper introduces a novel benchmark, EGE-Math Solutions Assessment Benchmark, for evaluating Vision-Language Models (VLMs) on their ability to assess hand-written mathematical solutions. Unlike existing benchmarks that focus on problem solving, our approach centres on understanding student solutions, identifying mistakes, and assigning grades according to fixed criteria. We compile 122 scanned solutions from the Russian Unified State Exam (EGE) together with official expert grades, and evalu

Study Type

Benchmark (Published & Automated)

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

AI grading, rubric evaluation, computer-science