Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks
This paper introduces MathQuest, a mathematics dataset derived from Indian 11th and 12th grade NCERT textbooks, and evaluates three large language models (LLaMA-2, WizardMath, MAmmoTH) on mathematical problem-solving tasks through fine-tuning experiments. The work benchmarks LLM performance on secondary-level mathematics content across varying complexity levels.
Rapid progress in natural language processing (NLP) and the expansion of large language models (LLMs) have opened up numerous opportunities in education and instructional methods. These advances offer the potential for tailored learning experiences and immediate feedback, delivered through accessible and cost-effective services. One notable application area is solving mathematical problems. Mathemati