MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Relevance: 7/10, 145 citations, 2024 paper

MathScale proposes a method to generate large-scale mathematical reasoning training data using GPT-3.5: it extracts concepts from seed questions, builds a concept graph, and samples concept combinations to synthesize new questions, then uses the resulting data to fine-tune open-source LLMs. The paper also introduces MWPBENCH, a comprehensive benchmark covering K-12 through college-level math word problems across ten datasets, including GSM8K and MATH.

Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving. However, their proficiency in solving mathematical problems remains inadequate. We propose MathScale, a simple and scalable method to create high-quality mathematical reasoning data using frontier LLMs (e.g., GPT-3.5). Inspired by the cognitive mechanism in human mathematical learning, it first extracts topics and knowledge points from seed math questions and then builds a concept graph, which is subsequently used to generate new math questions.
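The abstract outlines a three-step pipeline: concept extraction, concept-graph construction, and graph-guided question generation. Below is a minimal Python sketch of steps two and three. The seed extractions, co-occurrence edge weights, random-walk length, and prompt wording are all illustrative assumptions, not the paper's actual implementation; the extraction step itself (done with GPT-3.5 in the paper) is stubbed out as pre-computed data.

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical seed data: topics and knowledge points per seed question.
# In MathScale this extraction is performed by a frontier LLM (e.g., GPT-3.5).
seed_concepts = [
    {"topics": ["algebra"], "knowledge_points": ["linear equations", "substitution"]},
    {"topics": ["geometry"], "knowledge_points": ["triangle area", "Pythagorean theorem"]},
    {"topics": ["algebra", "number theory"], "knowledge_points": ["linear equations", "modular arithmetic"]},
]

def build_concept_graph(extractions):
    """Connect concepts that co-occur in the same seed question;
    edge weights count co-occurrences (an assumed weighting scheme)."""
    graph = defaultdict(lambda: defaultdict(int))
    for ex in extractions:
        nodes = ex["topics"] + ex["knowledge_points"]
        for a, b in combinations(nodes, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def sample_concepts(graph, walk_length=3):
    """Random walk over the concept graph, biased by edge weight,
    to pick a coherent set of concepts for one new question."""
    node = random.choice(list(graph))
    visited = [node]
    for _ in range(walk_length - 1):
        neighbors = graph[node]
        if not neighbors:
            break
        nodes, weights = zip(*neighbors.items())
        node = random.choices(nodes, weights=weights, k=1)[0]
        if node not in visited:
            visited.append(node)
    return visited

def make_generation_prompt(concepts):
    """Prompt text asking an LLM to compose a new question plus a
    step-by-step solution from the sampled concepts (wording assumed)."""
    return (
        "Write a new math word problem and its step-by-step solution "
        f"that exercises the following concepts: {', '.join(concepts)}."
    )

graph = build_concept_graph(seed_concepts)
print(make_generation_prompt(sample_concepts(graph)))
```

Because the graph is sampled rather than enumerated, the same small seed set can yield many distinct concept combinations, which is what gives the method its scalability.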

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

reasoning, evaluation, LLM, computer-science, highly-cited