PERC 2024 Abstract Detail Page

Abstract Title: Applying machine learning models in multi-institutional studies can generate bias
Abstract Type: Contributed Poster Presentation
Abstract: There is increasing interest in deploying machine learning models at scale in multi-institutional physics education research studies. Here we investigate how well machine learning models perform when applied to institutions outside their training set, using natural language processing to code open-ended survey responses. We find that, in general, changing the institutional context affects the variability of machine learning estimates of code frequencies: previously documented sources of uncertainty increase in magnitude, new, unknown sources of uncertainty emerge, or both. We also find one example in which uncertainties do not change between the institution used in the training data and an institution outside it. These results suggest that attention to uncertainty is critical, especially when making measurements of student writing across multi-institutional data sets.
Session Time: Poster Session 2
Poster Number: B89

Author/Organizer Information

Primary Contact: Rebeckah Fussell
Cornell University
Ithaca, NY 14850
Co-Author(s) and Co-Presenter(s):
Meagan Sundstrom (she/her), Drexel University and Cornell University
Sabrina McDowell (she/her), Cornell University
N. G. Holmes (she/her), Cornell University
