ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue

Benchmark (Published & Automated) | Relevance: 9/10 | 2026 paper

ConvoLearn is a dataset of 1,250 semi-synthetic tutor-student dialogues in middle school Earth Science, grounded in constructivist knowledge-building theory and operationalizing six pedagogical dimensions (cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics). The authors demonstrate that fine-tuning LLMs on this dataset shifts their behavior toward knowledge-building strategies: in teacher evaluations, their fine-tuned Mistral-7B model outperforms both base models and Claude Sonnet 4.5.

In educational applications, LLMs exhibit several fundamental pedagogical limitations, such as their tendency to reveal solutions rather than support dialogic learning. We introduce ConvoLearn (https://huggingface.co/datasets/masharma/convolearn), a dataset grounded in knowledge-building theory that operationalizes six core pedagogical dimensions: cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics. We construct a semi-synthetic
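
Since the abstract points to a Hugging Face dataset page, the following is a minimal sketch of how one might fetch the data for inspection. It assumes the Hugging Face `datasets` package is installed and that the repository loads with its default configuration; the helper names `dataset_id_from_url` and `load_convolearn` are hypothetical, and no split or column names are assumed because the listing does not state them.

```python
from urllib.parse import urlparse


def dataset_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repo id from a Hugging Face dataset URL."""
    parts = [p for p in urlparse(url.strip()).path.split("/") if p]
    # Expected path shape: /datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url!r}")
    return f"{parts[1]}/{parts[2]}"


def load_convolearn(url: str = "https://huggingface.co/datasets/masharma/convolearn"):
    """Download the dataset (hypothetical helper; needs network access)."""
    # Imported lazily so the URL parser above works without the package.
    from datasets import load_dataset

    # Assumption: the default configuration loads without extra arguments.
    # Inspect the returned object before relying on any split or column.
    return load_dataset(dataset_id_from_url(url))


if __name__ == "__main__":
    ds = load_convolearn()
    print(ds)  # shows whatever splits and columns the repository provides
```

Printing the returned object is the safest first step, since it reveals the actual schema rather than relying on guessed field names.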

Study Type

Benchmark (Published & Automated)

Tool Types

AI Tutors: 1-to-1 conversational tutoring systems.

Tags

teacher knowledge evaluation AI computer-science