Expectations of how student views on experimental physics develop during an undergraduate degree



I. INTRODUCTION
Goals for undergraduate teaching laboratories often include epistemological development, such that students learn how to think like an experimental physicist [1,2]. Indeed, in the United Kingdom, such experimental skills form an important part of the degree accreditation process [3]. We will refer to the combination of student expectations and epistemologies about experimental physics as their 'views'. While there is a growing body of work that has looked at the development of student views over one course, the slow nature of epistemological change motivates taking a longer-term view [4], which has yet to be explored outside of the US context [5,6]. There is also growing evidence of a difference between students' views of experimental physics in Europe and the US [7]. In this paper, we introduce an on-going longitudinal study which is aimed at addressing the broad research question: how do student views on experimental physics change during their undergraduate degree in the UK?
The method we have chosen to answer this question is to use the Colorado Learning Attitudes about Science Survey for Experimental Physics (E-CLASS) [8], which has been designed to measure student views on experimental physics. This survey has been administered for the first time at our institution (see Section II) in the academic year 2021-2022, covering all three years of lab courses. Therefore, we cannot answer our principal research question yet. Instead, following recommendations on research best practices [9,10], we focus on making explicit our current expectations for the outcome of the survey. This is so we can be conscious of those expectations when constructing hypotheses to test and analysing the final data. Hence, our immediate research question is: what are our expectations about how the E-CLASS results will change across the three years of undergraduate lab courses?
To make our expectations explicit we describe in Section II the context of the lab courses in which we are measuring student views; in Section III we ground the expectations in findings from literature on epistemological development; and in Section IV we position these expectations with respect to prior results from the E-CLASS. This leads to the identification, in Section IV A, of specific E-CLASS items that relate to an overarching goal of the undergraduate lab courses. Before analysing any data, we discuss the current limitations of the study in Section V. In Section VI, we then use the data collected from the initial year of the study to update the expectations we identified in the previous sections. Specifically, in Section VI A we analyse the data for selection effects and in Section VI B we identify where further contextual information about the labs is needed. We summarise the results of this work in Section VII.

II. CONTEXT
The longitudinal research study is taking place in the Department of Physics at Imperial College London, located in the UK. The university is a large, public, research-intensive university focusing on Science, Technology, Engineering, and Medicine (STEM) subjects. Approximately 50% of students are international. In the UK system, a university degree typically specialises in one subject area (e.g., Physics) from the start of the degree. Students take the vast majority of courses within their department and do so in a linear sequence. In this university, the department offers three-year bachelor's and four-year combined master's undergraduate degrees, as well as awarding Ph.D. degrees. Over 250 students graduate with a physics undergraduate degree each year.
As part of the physics degree, students have compulsory lab courses in each of the first two years. In the third year, students not taking the Theoretical Physics variant of the degree also have a one-term lab class. In addition to these classes, there is an optional Advanced Electronics course in the first year, as well as a variety of project-based courses that are taken in the first and third years. All three years of lab courses are separate from the lecture courses, though they may rely on content from the lectures. In all labs, students work in pairs to complete each activity.
One overarching goal of the undergraduate lab course is for students to develop the skills needed to be independent experimental physicists. The structure of the lab courses has been designed to facilitate this independence through reducing the level of scaffolding provided to the students as they progress through their degree and increasing the open-ended nature of the experimental activities that are provided. Therefore, we expect E-CLASS items that measure this independence would show an increase in expert-like agreement over the period of the undergraduate course. We now describe, for each lab course year, the lab activities and how the labs are assessed.
The first-year laboratory (Year 1) course introduces students to the undergraduate lab environment in the first three weeks through scaffolded activities in which they learn to use common measurement devices (oscilloscopes, digital multimeters, etc.) by completing traditional experiments measuring physical quantities. In these first weeks, students also complete a computing course introducing Python as a tool for analysing experimental data. The lab activities then expand so that students gain experience of techniques foundational to different areas of experimental physics (e.g., optics and electronics). Students spend one three-hour session per week in the lab, with a second two-hour, online analysis session the following day. Each weekly pair of sessions covers one experiment, with a total of eleven experiments completed in the year. Summative assessment is through submission of a digital lab book for each experiment and two lab reports.
The second-year laboratory (Year 2) course is split into four parts over two terms. In the first part, students develop their programming skills in a computational lab, ultimately writing a simulation of a physical system. In the remaining three parts, students cycle through three experiments covering interferometry, radioactivity, and waves & wave propagation. These experiments last for approximately four weeks (with two mornings per week allocated in the lab) and begin with students gaining familiarity with the equipment so that in later weeks they can plan and perform their own investigations. The experimental components of the lab course are assessed through student lab notebooks and general professional skills in the lab context, as well as through either a short technical report, a publication-style report, or an oral presentation (one for each experiment).

TABLE I. Gender percentages taken from responses to the post survey for each curriculum year. Denominators (N) for these percentages can be found in Table IV. The 'Woman and/or gender non-binary' category contains those respondents who chose the option "Woman" or "Other gender"; those choosing the latter were then invited to write in their gender. Students who selected "Prefer not to say" are grouped with those who did not provide a response.

The third-year laboratory (Year 3) course can be taken by students in either the first or second term of the academic year. Students choose three experiments from a set of thirteen and spend approximately three weeks on each one [11]. Students have access to the laboratory during normal working hours, with demonstrators (instructors) timetabled to be present for three mornings per week (8 hours in total per week). Students are assessed for the first experiment through an oral presentation, completed in pairs online; for the second experiment through a two-page conference paper; and for the third experiment through a six-page report.
The lab scripts provide information on experiments students can complete with the apparatus but very little guidance on the experimental and analysis methods needed, giving students a free choice in the approaches they take.
We present demographic data collected from student responses to provide the context of the study. In the post survey, we asked students for their gender (Table I) and currently enrolled degree (Table II). The percentage of women and/or gender non-binary students responding was slightly higher than the percentage of women and/or gender non-binary students enrolled in the physics degree (about 25%). The percentage of respondents enrolled in the Theoretical Physics degree (theory students) was lower than the corresponding percentage in the total physics student population (about 40%). In Year 3, theory students do not complete the lab course and were not invited to complete the survey.

III. EPISTEMOLOGICAL DEVELOPMENT
As mentioned in the Introduction, previous work has shown that epistemological changes in undergraduate degrees tend to be small. Specifically, King and Kitchener [4] found that, on a seven-point scale, undergraduate students showed an average increase of less than half a point from the beginning to the end of their degree course. Therefore, we expect changes of similar magnitude in the present study; however, we note that both the measurement instrument and the educational context (the US in Ref. [4]) are different.
In Hofer's review of epistemological development [12], they highlight three models for how personal epistemology relates to learning. The first is that epistemology is developmental, such that one's epistemology transitions through a recognised series of stages as one grows older [13]. The second is that epistemology is a set of (potentially fixed) beliefs which influence the modes and nature in which one engages with learning [14]. The third is more nuanced in that a person does not hold a single epistemology but that epistemological resources are activated in context [15]. This context dependence of the third model resonates with our own experiences of students in labs and, therefore, we identify most with that model. We consequently expect to interpret the results of the longitudinal study as indicating which epistemic resources were, or were not, activated both by the lab course and by each individual E-CLASS item statement. We therefore do not expect the same items to show the same changes in each curriculum year, and we similarly do not expect the post survey of one year to agree with the pre survey of the following year for the same student. This differs from the first model in that we recognise that a student may be able to activate only a finite number of epistemological resources at any time and that some may become more dominant in one course compared to another.

IV. THE E-CLASS
The E-CLASS has 30 items consisting of statements with which students are asked, in the context of a lab course, to agree or disagree on a five-point Likert scale from Strongly disagree to Strongly agree. Student scores are generated by comparing the student response to each item, on a reduced three-point scale, with the expert response [8]. A score of +1 is given if the student agrees with the expert response, -1 if they disagree, and 0 if the student selects the "neither agree nor disagree" option. A total score for the E-CLASS is then computed by summing the score for each item, giving a value in the range -30 to 30. The E-CLASS is administered as a pre/post survey, so that changes arising during the period of the course can be isolated.
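The scoring scheme above can be sketched in a few lines of Python. This is an illustrative reimplementation of the published scoring rules, not the official E-CLASS analysis code; the function and variable names are our own.

```python
# Illustrative sketch of the E-CLASS scoring scheme.
# Likert responses are coded 1 (Strongly disagree) to 5 (Strongly agree);
# the expert response for each item is either "agree" or "disagree".

def collapse(likert: int) -> str:
    """Reduce the five-point Likert scale to a three-point scale."""
    if likert <= 2:
        return "disagree"
    if likert == 3:
        return "neutral"
    return "agree"

def item_score(likert: int, expert: str) -> int:
    """+1 if the student matches the expert, -1 if they oppose, 0 if neutral."""
    response = collapse(likert)
    if response == "neutral":
        return 0
    return 1 if response == expert else -1

def total_score(responses: list[int], experts: list[str]) -> int:
    """Sum of item scores; ranges from -30 to +30 for the 30-item survey."""
    return sum(item_score(r, e) for r, e in zip(responses, experts))
```

For example, a student answering Strongly agree, Strongly disagree, and Neutral on three items whose expert responses are agree, agree, and disagree would score +1, -1, and 0, for a total of 0.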
A full review of previous results from the E-CLASS can be found in Ref. [16]. Here we focus on the results relevant to our principal research question. E-CLASS scores tend to decrease during traditional (cookbook) lab courses and show small positive gains in open-ended lab courses [17]. Therefore, given the increasing open-ended nature of the lab courses from Year 1 to Year 3 (Section II), we expect to also see similar increases.
A previous longitudinal study, which was completed at the University of Colorado Boulder in the United States, saw an increase in mean E-CLASS scores for three classes in each successive year of the physics major [5]. However, when students were tracked over all three courses, those students showed no significant changes in their mean E-CLASS score both within and between courses. Interestingly, this supports the second epistemological development model described by Hofer [12], which is in contrast to our own expectations discussed in Sections II and III. This, therefore, moderates our previously stated expectations, yet further motivates the full investigation to understand whether the previous result [5] is reproducible in a different context.
A separate study, covering 131 courses, found that major differences in E-CLASS scores between first-year/introductory and beyond-first-year courses in the US arose from differences in student population, namely the presence of non-physics majors in the introductory courses [18]. Therefore, we must consider how such effects may influence our present study. One potential selection effect is from the theory students not taking labs in their third year. We expect these theory students to have lower mean E-CLASS scores than non-theory students, due to the fact that the scoring system has been calibrated against the views of experimental physicists. Specifically, we expect items that relate to student affect, such as item 20 ("I enjoy building things and working with my hands"), to show higher disagreement for theory students simply because of their pre-declared disposition toward theory.

A. Relevant E-CLASS items
As described in Section II, one goal for the three years of the lab course is for students to develop the skills needed to be independent experimental physicists. We now identify specific E-CLASS items, in a process recommended by Wilcox and Lewandowski [19] when analysing E-CLASS data, which capture the key feature of this goal, which is to be "independent". Therefore, we select items that reveal:
• the self-efficacy of the student;
• how the student relates to sources of authority.
We have chosen these two aspects because the self-efficacy of the student is important in determining their reaction when faced with new tasks and their subsequent susceptibility to influence by others [20], while the relation to sources of authority concerns how a student views sources of knowledge and their own imperative in creating new knowledge [21]. We do not choose items that relate to the development of specific lab skills (such as troubleshooting or calculating uncertainties) or communication skills. The 11 items we have identified are presented in Table III. The two items relating to enjoyment (7 and 20) have been included as 'self-efficacy' items because, if a student reports that they enjoy an activity, we assume that they will have more self-efficacy in attempting that activity [22].
Our expectations are that for each of these items students would give more expert-like responses after three years of instruction and, as they are aligned with the goals of the course, that they are more likely to show changes compared to other items [23]. This part of the analysis can only be used for internal comparisons, while when comparing to external reference marks we will use the full E-CLASS score to ensure validity is maintained.

V. LIMITATIONS
Before we look at the data from the first year of administration, we must highlight the extensive limitations of the following analysis. Firstly, these data are only pseudo-longitudinal, with different students in each year group. Secondly, the response rates for each survey administration varied between 11% and 30% (Table IV), which indicates a bias from undersampling the student population and is related to our discussion of selection effects. We did not provide any incentives (such as course credit) for students to complete the survey.
The data below have not been matched from the pre to the post survey. We were required by our Ethics Board to administer the survey such that student responses were anonymous. Therefore, we implemented a local, online version of the survey using the Blackboard course management system. We used a set of linking questions to connect the pre/post administrations. The combination of low response rates and unreliable linking questions meant that the number of matched responses from all year groups was 38 (of 94 post survey responses). Similar issues have been found in a German E-CLASS study [7], with suggestions for increasing the response rate that we plan to implement in subsequent years. Consequently, we choose to present all the data we currently have to better understand the range of possible responses the lab courses elicit from students, while remaining cognisant of the fact that pre and post student samples may differ.
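Anonymous pre/post matching via linking questions can be sketched as follows. The specific linking questions (initials, day of birth, pet name) and field names here are hypothetical, invented purely to illustrate the approach; the questions we actually used may differ. The sketch also shows why such matching is fragile: any variation in a student's answers between administrations breaks the link.

```python
# Hypothetical sketch: pairing anonymous pre and post survey responses
# by a pseudonymous key built from linking-question answers.
# Question names ("initials", "birth_day", "first_pet") are invented.

def link_key(response: dict) -> str:
    """Build a pseudonymous key; normalise case and whitespace so that
    minor inconsistencies (e.g. "AB" vs "ab") still match."""
    return "|".join(str(response.get(q, "")).strip().lower()
                    for q in ("initials", "birth_day", "first_pet"))

def match(pre: list[dict], post: list[dict]) -> list[tuple[dict, dict]]:
    """Pair pre and post responses that share an identical linking key."""
    pre_by_key = {link_key(r): r for r in pre}
    return [(pre_by_key[link_key(r)], r) for r in post
            if link_key(r) in pre_by_key]
```

A student who answers a linking question differently in the two administrations (a misremembered date, a different nickname for the same pet) produces a different key and cannot be matched, which is one plausible mechanism behind the low matched count reported above.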

A. Selection effects
TABLE III. E-CLASS items that meet the selection criteria given in Section IV A. The column 'Self-efficacy' labels items selected because student responses reveal the self-efficacy of the student, while the column 'Authority' labels items selected because student responses reveal how the student relates to sources of authority. The expert response is given in parentheses after each item statement. A full list of E-CLASS items can be found in [8]. *The word instructor has been replaced with demonstrator for the British English context.

In Section IV, we identified the importance of asking whether changes in E-CLASS scores were due to selection effects rather than student views actually changing.
There are two methods that can be used to isolate this effect, the first being to track individual students over the three years of the undergraduate course. The second is to identify and analyse separately the groups of students which may lead to these selection effects. While the first approach is not possible at this time, we can consider the second approach.
In the preliminary data (Table IV), the mean E-CLASS score increases from the first year to the third year, as expected, though these changes are not statistically significant (Mann-Whitney U test comparing Year 1 and Year 3: pre p = 0.23, Cliff's d = 0.15 ± 0.24; post p = 0.17, d = 0.23 ± 0.32, with 95% confidence intervals [24,25]). We note the overall lower scores in the second year and conjecture that this is a result of pandemic disruption. We have some indication that selection effects will be important to consider, as first-year non-theory students have a marginally (not significantly) higher mean score on the post survey than the whole-class mean. This selection effect is also exacerbated by the biased sampling of the student population, which favours non-theory students in Years 1 and 2 (Table II).
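The statistics quoted above can be illustrated with a minimal sketch. The score lists below are invented for illustration only (they are not our data); in practice the p-value would come from a library routine such as scipy.stats.mannwhitneyu, so here we compute only the U statistic and Cliff's delta directly from their pairwise definitions.

```python
# Sketch of the Mann-Whitney U statistic and Cliff's delta effect size
# for two independent groups of E-CLASS total scores.

def mann_whitney_u(a, b):
    """U statistic: number of pairs where a > b, counting ties as half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

def cliffs_delta(a, b):
    """Effect size in [-1, 1]: P(a > b) - P(a < b) over all pairs."""
    n = len(a) * len(b)
    greater = sum(1 for x in a for y in b if x > y)
    less = sum(1 for x in a for y in b if x < y)
    return (greater - less) / n

year1 = [4, 7, 2, 10, 5]   # hypothetical Year 1 pre-survey totals
year3 = [6, 9, 3, 12, 8]   # hypothetical Year 3 pre-survey totals

delta = cliffs_delta(year3, year1)  # positive: Year 3 scores tend higher
```

A delta of 0 means the two groups overlap completely, while ±1 means complete separation; the values of 0.15 and 0.23 reported above therefore indicate small effects.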

B. Identifying where further context is needed
We now highlight how having made our expectations explicit at this stage provides the basis for future critical analyses, in that it allows us to challenge our own assumptions. We have identified one item, item 16 (Figure 1), that appears to show a different behaviour to our expectations from Section IV A. We now use this to guide further inquiry: specifically, the low percentage of students in the Year 3 post survey who disagreed with the statement that "the primary purpose of doing a physics experiment is to confirm previously known results", compared to both the pre survey and all responses from other years. This suggests a detailed analysis of the lab scripts to identify differences between the years in the extent to which the activities are presented as confirmatory exercises. At this stage, due to the limitations outlined in Section V, we deliberately do not draw any stronger conclusions from these data.

VII. CONCLUSIONS
By adopting the principle of transparency in our expectations, we hope that this will provide valuable context to readers of later work (including ourselves). Our expectations are summarised as:
1. the deliberate increase in open-ended labs will lead to increased mean E-CLASS scores;
2. the increase will be greatest in those items identified to be associated with being an independent experimental physicist (Table III);
3. the effect size of changes in E-CLASS score will be small;
4. individual item scores may not change monotonically with time;
5. theory students will have a lower mean E-CLASS score than non-theory students.
We also expect to develop new expectations as the study progresses. We plan to use these expectations to shape predictive hypotheses which we can then test, the results then guiding qualitative work to ask why those expectations did or did not hold.