PERC 2024 Abstract Detail Page

Abstract Title: Comparing three natural language processing methods for grading students' written responses to conceptual physics questions
Abstract Type: Contributed Poster Presentation
Abstract: A limiting factor in examining student reasoning on conceptual assessments has been that such assessments are typically administered in a multiple-choice format. Consequently, options for analyzing reasoning data from large numbers of college-level students have not existed. Current tools such as machine learning and large language models (LLMs) promise to assess students' written responses in a fair and consistent way. Our study compares three methods for classifying, as correct or incorrect, students' written explanations to multiple-choice questions on the Energy and Momentum Conceptual Survey (EMCS): supervised machine learning (ML), LLMs, and human graders. We then compare each method's classification of the written explanations with the ground truth, i.e., students' multiple-choice responses. We find that human and supervised ML classifications align more closely with the multiple-choice responses than LLM classifications do. The results of this study caution against the use of LLMs for analyzing students' written responses to conceptual questions in physics.
Footnote: This work is supported in part by U.S. National Science Foundation grant NSF-2300645. Opinions expressed are those of the authors and not of the Foundation.
Session Time: Poster Session 2
Poster Number: B88

Author/Organizer Information

Primary Contact: Sean Savage
Purdue University
West Lafayette, IN 47907
Co-Author(s) and Co-Presenter(s):
N. Sanjay Rebello, Purdue University