FEANEL: A Benchmark for Fine-Grained Error Analysis in K-12 English Writing

Relevance: 10/10 · 2025 paper

FEANEL is a benchmark for evaluating LLMs' ability to provide fine-grained error analysis and pedagogical feedback on K-12 English writing, comprising 1,000 student essays with expert-annotated errors categorized by type, severity, and explanations. The benchmark specifically assesses whether AI systems can identify writing errors and provide educationally meaningful, interpretable feedback to support student learning.
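As a rough sketch of what an annotation record of this shape could look like (the field names and values here are illustrative assumptions, not FEANEL's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class ErrorAnnotation:
    # Hypothetical fields inferred from the description above;
    # the real FEANEL annotation format may differ.
    error_type: str        # e.g. "subject-verb agreement"
    severity: str          # e.g. "minor" or "major"
    explanation: str       # pedagogical explanation aimed at the learner
    span: tuple           # (start, end) character offsets in the essay text

@dataclass
class AnnotatedEssay:
    essay_id: str
    text: str
    errors: list = field(default_factory=list)

# Example usage with a made-up essay fragment:
essay = AnnotatedEssay(
    essay_id="ex-001",
    text="She go to school every day.",
    errors=[ErrorAnnotation(
        error_type="subject-verb agreement",
        severity="minor",
        explanation='Use "goes" with a third-person singular subject.',
        span=(4, 6),
    )],
)
print(len(essay.errors))  # → 1
```

A record like this captures the three annotation dimensions the paper describes (type, severity, explanation), which is what lets evaluation go beyond binary error detection.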

Large Language Models (LLMs) have transformed artificial intelligence, offering profound opportunities for educational applications. However, their ability to provide fine-grained educational feedback for K-12 English writing remains underexplored. In this paper, we challenge the error analysis and pedagogical skills of LLMs by introducing the problem of fine-grained error analysis for English learners and presenting the Fine-grained Error ANalysis for English Learners (FEANEL) Benchmark.

Tool Types

Teacher Support Tools: tools that assist teachers with lesson planning, content generation, grading, and analytics.

Tags

LLM evaluation · K-12 education · computer-science