A Report on the llms evaluating the high school questions

Relevance: 7/10 (2025 paper)

This paper evaluates the performance of at least eight large language models (LLMs) on Chinese college entrance examination mathematics questions from 2019-2023, assessing accuracy, response time, logical reasoning, and creativity across different question types and difficulty levels.

This report aims to evaluate the performance of large language models (LLMs) in solving high school science questions and to explore their potential applications in the educational field. With the rapid development of LLMs in the field of natural language processing, their application in education has attracted widespread attention. This study selected mathematics exam questions from the college entrance examinations (2019-2023) as evaluation data and utilized at least eight LLM APIs to provide

Tags

educational assessment, natural language processing, computer-science