ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue
ConvoLearn introduces a dataset of 1,250 constructivist tutor-student dialogues in middle school Earth Science, grounded in knowledge-building theory across six pedagogical dimensions (cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics). The paper demonstrates that fine-tuning LLMs on this dataset shifts model behavior toward constructivist teaching strategies, with the fine-tuned Mistral-7B significantly outperforming base models and Claude Sonnet in teacher evaluations.
In educational applications, LLMs exhibit several fundamental pedagogical limitations, such as their tendency to reveal solutions rather than support dialogic learning. We introduce ConvoLearn (https://huggingface.co/datasets/masharma/convolearn), a dataset grounded in knowledge-building theory that operationalizes six core pedagogical dimensions: cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics. We construct a semi-synthetic