Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions
This paper develops and evaluates scaffolding techniques for multi-modal intelligent tutoring systems that guide children in describing images for language learning, using GPT-4V with pedagogical instructions grounded in four learning theories. The work constructs a seven-dimension rubric to evaluate the scaffolding process both manually and automatically.
Intelligent tutoring systems (ITSs), which imitate human tutors and aim to provide immediate, customized instruction or feedback to learners, have proven effective in education. With the emergence of generative artificial intelligence, large language models (LLMs) further enable these systems to carry out complex and coherent conversational interactions. These systems would be of great help in language education as it involves developing skills in communication, which, however, drew relatively le