Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology
This paper evaluates the effectiveness of LLMs (GPT-4o, Llama-3-8B) as teachers generating pedagogical hints for simulated student LLMs solving high-school mathematics problems designed using cognitive science principles. The study measures hint quality, error correction rates, and compares different prompting strategies across temperature settings.
The automatic generation of hints by Large Language Models (LLMs) within Intelligent Tutoring Systems (ITSs) has shown potential to enhance student learning. However, generating pedagogically sound hints that address student misconceptions and adhere to specific educational objectives remains challenging. This work explores using LLMs (GPT-4o and Llama-3-8B-Instruct) as teachers to generate effective hints for students simulated through LLMs (GPT-3.5-turbo, Llama-3-8B-Instruct, or Mistral-7B-Instruct).