PERC 2022 Abstract Detail Page

Abstract Title: Machine learning methods in PER: Intuition and methodological discussion
Abstract Type: Custom Format
Abstract: Over the last several years, various PER researchers and groups have begun to explore how machine learning methods can be used to conduct qualitative and quantitative research. However, these methods have a tendency to turn into "black boxes" due to their complexity and their novelty within PER. Because of this, there is a great need to both explain the intuition underlying them and develop methodological standards for how these methods are used, communicated, and evaluated within PER--especially when combined with more established quantitative and qualitative methodologies. This session will be a combination of talk symposia and discussion group. During the first half, four PER researchers will present high-level talks (more TED talk than standard research talk) on different applications of machine learning methods, explaining the intuition behind their methods in an accessible way and briefly presenting a snapshot of the types of results their methods can provide. Then, the format will shift to a discussion group where the speakers and audience can begin a dialogue on how these kinds of methods can be used and evaluated within PER.

This session will be suitable for people with all levels of prior exposure to machine learning methods, and we are particularly interested in the perspectives of non-machine learning users as we discuss methodological use cases and evaluation.
Session Time: Parallel Sessions Cluster III
Room: Vandenberg A

Author/Organizer Information

Primary Contact: Tor Ole Odden
University of Oslo, Norway
Oslo, Non U.S.
Co-Author(s) and Co-Presenter(s):
Rebeckah Fussell, Cornell
Colin Green, Drexel
Nicholas Young, University of Michigan

Parallel Session Information

Format Description: This session will be a combination of talk symposia and discussion group. During the first half, four PER researchers will present high-level talks (more TED talk than standard research talk) on different applications of machine learning methods, explaining the intuition behind their methods in an accessible way and briefly presenting a snapshot of the types of results their methods can provide. Then, the format will shift to a discussion group where the speakers and audience can begin a dialogue on how these kinds of methods can be used and evaluated within PER.
Moderator: Eric Brewe, Drexel
Anticipated Participants: This session will be suitable for people with all levels of prior exposure to machine learning methods, and we are particularly interested in the perspectives of non-machine learning users as we discuss methodological use cases and evaluation.

Symposium Specific Information

Discussant: Eric Brewe, Drexel
Presentation 1 Title: Using LDA to thematically analyze PER literature
Presentation 1 Authors: Tor Ole B. Odden
Presentation 1 Abstract: Latent Dirichlet Allocation (LDA) is a technique from the field of Natural Language Processing used to extract topics or themes from a set of texts. Over the last several years, my collaborators and I have been using this technique to analyze large amounts of literature from PER and Science Education in order to investigate which topics have seen sustained interest and how that interest has changed over time. In this talk I will describe the intuition behind LDA and present results from an analysis of all PERC proceedings published between 2001 and 2021. This analysis shows several distinct waves of research interest, most notably an overwhelming shift towards student identities and social communities in recent years.
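As a rough illustration of the intuition behind LDA, the sketch below fits a two-topic model to a tiny corpus using collapsed Gibbs sampling. It is not the analysis described in the abstract: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the four toy documents are all assumptions made up for this example.

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on already-tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start by assigning every token to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # A topic is likely if it is common in this document
                # and if it often generates this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed estimates of document-topic and topic-word distributions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return vocab, theta, phi

# Made-up toy corpus, for illustration only.
docs = [
    "identity community belonging identity community".split(),
    "community identity belonging community".split(),
    "force motion energy force motion".split(),
    "energy force motion energy".split(),
]
vocab, theta, phi = fit_lda(docs, n_topics=2)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print(f"topic {k}:", [vocab[v] for v in top])
```

Each row of `theta` is one document's mixture over topics, and each row of `phi` is one topic's distribution over words. Tracking how the `theta` rows shift across publication years is one way such a model could surface trends in a body of literature.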
Presentation 2 Title: Content Analysis at scale: using NLP and neural networks to analyze large quantities of student writing about their approach to experimental physics
Presentation 2 Authors: Rebeckah Fussell
Presentation 2 Abstract: Content Analysis is crucial in PER for understanding and measuring student thinking and behavior. This method is very time-consuming, however, not only in the development of coding schemes but also in the continued application of these coding schemes to growing data sets. Methods from Natural Language Processing (NLP) in conjunction with neural networks allow us to automate the process of applying certain coding schemes to incoming data. I will discuss how we have used these techniques to analyze student writing for the purpose of understanding the evolution of students' approaches to experimental physics over the course of a semester of lab instruction. Furthermore, I will explore how much data is necessary to make use of these tools and what features of coding schemes best lend themselves to automation with machine learning.
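To make the idea of automating a coding scheme concrete, here is a minimal sketch that trains a single sigmoid unit on bag-of-words features to apply one binary code. This is the simplest possible stand-in for a neural network and is not the architecture used in the work described above. The function names `train_coder` and `apply_code`, the example code ("mentions uncertainty"), and the four hand-written responses are all hypothetical.

```python
import math

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [t for t in tokens if t]

def train_coder(texts, labels, epochs=200, lr=0.5):
    """Fit one sigmoid unit on bag-of-words inputs with stochastic gradient descent."""
    vocab = sorted({t for text in texts for t in tokenize(text)})
    index = {w: i for i, w in enumerate(vocab)}
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            ids = [index[t] for t in tokenize(text)]
            z = bias + sum(weights[i] for i in ids)
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of the code
            error = p - label               # gradient of the log loss w.r.t. z
            bias -= lr * error
            for i in ids:
                weights[i] -= lr * error
    return index, weights, bias

def apply_code(model, text):
    """Return 1 if the trained unit assigns the code to this text, else 0."""
    index, weights, bias = model
    z = bias + sum(weights[index[t]] for t in tokenize(text) if t in index)
    return 1 if z > 0 else 0

# Made-up human-coded examples: 1 = "mentions uncertainty", 0 = does not.
texts = [
    "We measured the uncertainty",
    "Uncertainty in the measurement was estimated",
    "We followed the lab instructions",
    "The instructions told us what to do",
]
labels = [1, 1, 0, 0]
model = train_coder(texts, labels)
print([apply_code(model, t) for t in texts])
```

In practice the human-coded set would be far larger, and part of it would be held out to check agreement between the automated and human coding before the model is applied to new responses.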
Presentation 3 Title: Applying Natural Language Processing to COVID Transition Surveys
Presentation 3 Authors: Colin Green
Presentation 3 Abstract: In addition to extracting themes and content, Natural Language Processing techniques can also be used to analyze sentiment within text. Using sentiment analysis and thematic analysis, our present project aims to understand physics faculty responses to the transition to online teaching during the COVID-19 pandemic. We surveyed physics faculty following the Spring 2020 and Fall 2020 terms, and used Latent Dirichlet Allocation to extract topics from the responses. This analysis revealed that while the mean change in sentiment was approximately zero, there was a distinct shift in themes from Spring to Fall. The topics found in the initial survey largely revolved around technological and cheating difficulties experienced by the instructors. The topics in the follow-up survey were noticeably different, showing themes related to reflection and to successful and sustainable practices.
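The abstract does not say which sentiment method was used. As one possible illustration, the sketch below scores each response with a small word lexicon and compares the mean score between two survey rounds. The word lists `POSITIVE` and `NEGATIVE`, the function names, and the sample responses are invented for this example and are not taken from the survey data.

```python
# Hypothetical mini-lexicon; real lexicons contain thousands of scored words.
POSITIVE = {"good", "successful", "helpful", "enjoyed", "sustainable"}
NEGATIVE = {"difficult", "cheating", "frustrating", "problems", "stressful"}

def sentiment_score(text):
    """Return (positive - negative word count) / total tokens, in [-1, 1]."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

def mean_change(first_round, second_round):
    """Mean sentiment of the second round minus mean sentiment of the first."""
    first = [sentiment_score(t) for t in first_round]
    second = [sentiment_score(t) for t in second_round]
    return sum(second) / len(second) - sum(first) / len(first)

# Made-up responses, for illustration only.
spring = ["The technology was frustrating.", "Cheating was difficult to prevent."]
fall = ["Recorded lectures were helpful.", "Grading online was still stressful."]
print(mean_change(spring, fall))
```

A mean change near zero does not rule out a shift in what respondents write about, which is why a sentiment score like this is usually paired with a topic model.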
Presentation 4 Title: Using random forests to study physics graduate school admissions
Presentation 4 Authors: Nicholas Young
Presentation 4 Abstract: Random Forest is a machine learning technique designed for both creating predictive models and determining which variables in a dataset are most useful for making those predictions. In this talk, I will describe how the random forest algorithm works and when it might be preferable to more traditional quantitative methods. I will also present an example of how the algorithm works in practice by determining what parts of a physics graduate school application are predictive of admission.
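The two roles of a random forest mentioned above, prediction and variable ranking, can be sketched in a few dozen lines. The code below is a simplified illustration and not the implementation used in the talk: it grows shallow Gini-based trees on bootstrap samples, considers a random subset of features at each split, and ranks features by their total impurity decrease. All function names and the synthetic data (one informative feature, two noise features) are assumptions made for this example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (impurity decrease, feature, threshold) of the best split, or None."""
    best, parent, n = None, gini(y), len(y)
    for f in features:
        for t in sorted({row[f] for row in X})[:-1]:
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(X, y, rng, n_try, depth, importance):
    """Grow one tree; internal nodes are tuples, leaves are class labels."""
    split = None
    if depth > 0 and len(set(y)) > 1:
        # Only a random subset of features is considered at each node.
        split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None or split[0] <= 0:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    gain, f, t = split
    importance[f] += gain * len(y)  # credit this feature for the split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (
        f, t,
        build_tree([X[i] for i in left], [y[i] for i in left],
                   rng, n_try, depth - 1, importance),
        build_tree([X[i] for i in right], [y[i] for i in right],
                   rng, n_try, depth - 1, importance),
    )

def predict_tree(node, row):
    """Follow the splits down to a leaf (class labels must not be tuples)."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, n_try=2, max_depth=3, seed=0):
    """Train trees on bootstrap samples; return trees and normalized importances."""
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # sample rows with replacement
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                rng, n_try, max_depth, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]

def predict_forest(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

# Synthetic data: only the first feature determines the label.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(60)]
y = [int(row[0] > 0.5) for row in X]
trees, importances = fit_forest(X, y)
names = ["informative", "noise_1", "noise_2"]
for name, value in zip(names, importances):
    print(f"{name}: {value:.2f}")
```

Because each tree sees a different bootstrap sample and a different subset of features at each split, no single tree dominates the vote. The importance scores summarize which variables the forest relied on most across all trees.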