Toward Automated Qualitative Analysis: Leveraging Large Language Models for Tutoring Dialogue Evaluation
This paper develops an automated system that uses GPT-3.5 to evaluate five key tutoring strategies (giving praise, reacting to errors, assessing student knowledge, managing inequity, and responding to negative self-talk) in one-on-one tutoring dialogues, classifying whether each strategy is employed effectively or ineffectively. The system analyzes tutoring transcripts and provides color-coded feedback on the pedagogical quality of tutor-student interactions.
Our study introduces an automated system that leverages large language models (LLMs) to assess the effectiveness of five key tutoring strategies: (1) giving effective praise, (2) reacting to errors, (3) determining what students know, (4) helping students manage inequity, and (5) responding to negative self-talk. Using a public dataset derived from the Teacher-Student Chatroom Corpus, our system classifies each tutoring strategy as employed in either a desired or an undesired manner. Our study utilizes GPT-3.5 with few-shot prompting to perform these classifications.
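To make the pipeline concrete, the sketch below shows one way such a few-shot desired/undesired classification could be issued to GPT-3.5. It assumes the OpenAI Python SDK (v1-style client); the prompt wording, helper names, and few-shot examples are illustrative assumptions, not the paper's actual prompts.

```python
# Minimal sketch of a few-shot strategy classifier, assuming the OpenAI
# Python SDK (v1 client). Prompt text, labels, and examples are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

STRATEGIES = [
    "giving effective praise",
    "reacting to errors",
    "determining what students know",
    "helping students manage inequity",
    "responding to negative self-talk",
]

# Hypothetical few-shot demonstrations: (dialogue excerpt, strategy, label).
FEW_SHOT_EXAMPLES = [
    ("Tutor: Great job sticking with that problem even though it was hard!",
     "giving effective praise", "desired"),
    ("Tutor: No, that's wrong. Try again.",
     "reacting to errors", "undesired"),
]

def classify_strategy(excerpt: str, strategy: str) -> str:
    """Ask GPT-3.5 whether the tutor employs `strategy` in a desired
    or undesired manner in the given dialogue excerpt."""
    messages = [{
        "role": "system",
        "content": ("You evaluate one-on-one tutoring dialogues. For the "
                    "given strategy, answer with exactly one word: "
                    "'desired' or 'undesired'."),
    }]
    # Prepend few-shot demonstrations as prior user/assistant turns.
    for ex_excerpt, ex_strategy, ex_label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user",
                         "content": f"Strategy: {ex_strategy}\n{ex_excerpt}"})
        messages.append({"role": "assistant", "content": ex_label})
    messages.append({"role": "user",
                     "content": f"Strategy: {strategy}\n{excerpt}"})

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,  # deterministic labels for evaluation
    )
    return response.choices[0].message.content.strip().lower()

label = classify_strategy(
    "Tutor: You're so smart, this is easy for you.",
    "giving effective praise",
)
print(label)  # e.g. "undesired" (praises ability rather than effort)
```

The returned labels could then drive the transcript rendering described above, for instance by mapping "desired" and "undesired" to distinct highlight colors for each annotated tutor turn.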