How Accurate are Physics Students in Evaluating Changes in their Understanding?

An assessment question involving Newton’s 2nd law was administered in a physics course for preservice elementary teachers before and again after instruction. The posttest included a prompt asking students to describe the specific ways their thinking changed. Student reasoning was coded for physics content accuracy; many students exhibited changes from primitive, experientially-based reasoning to more formal reasoning. Students' self-reported reflections were then compared to the differences in the pre- and posttest codes. We find that many students do not identify substantive changes in their reasoning, while other students reflect at only a surface level. We also find that some students overestimate their initial level of understanding.


I. INTRODUCTION
Research shows that experts engage in metacognition, the active monitoring of one's own understanding, and best practices recommend that metacognitive skills be taught in the context of specific subject matter [1]. One such skill is reflection, through which a learner articulates how they came to know what they know [2]. To practice this retrospective form of metacognition, learners must recognize specific differences between their current and initial ideas. But how accurate are students in identifying changes in their thinking?
Under hindsight bias, a construct in cognitive science, people overestimate the accuracy of their predictions once outcomes are known [3]. It follows that a learner may overestimate the level of their initial understanding once a concept has been better understood through instruction. This could mask the depth and nature of learning that has occurred, making it more difficult for the student to recognize the value of instruction, and perhaps impeding the student's development as an independent learner. These considerations have implications for the preparation of K-12 teachers, who are increasingly expected to teach science as a process of inquiry. In part because teachers often teach as they were taught, best practices in professional development advocate inquiry-based science instruction for preservice teachers [4]. Personal reflection is commonly emphasized in such approaches. For example, in the Physics and Everyday Thinking (PET) curriculum [5], students document their initial ideas about specific content, and then, after guided instruction, compare their resulting understanding with their initial ideas to reflect on how their understanding changed.
As part of an ongoing project, we are investigating student metacognition in physics. Previous work examined the types of statements students make when asked to write reflectively about how their understanding of a specific physics concept has changed [6]. Here we explore a more focused question: To what extent do preservice elementary teachers accurately describe how their understanding of Newton's 2nd law has changed during instruction?

II. RESEARCH METHODS
A challenge in studying metacognition is to bring what is generally a hidden, internal dialogue into the open. Most studies have used clinical interviews, rather than observations of students' natural behavior in classroom settings. Our approach strikes a balance between these, asking students to reflect on their physics learning as a regular part of the course. While similar to the approach of May and Etkina [7], who asked students to describe what they learned and how they learned it in an open-ended, general way, we focus reflection on a specific content question that students complete before and after relevant course instruction. We analyze student reasoning on the pre- and posttest to assess physics learning, and then compare our assessment to the student's self-reported description of their learning as a measure of the accuracy of student reflection.
Instructional context. All preservice elementary teachers take Science Education (SCED) 201, a 60-hour course in which students develop a conceptual understanding of mechanics, while learning about how people learn. SCED 201 uses PET, a constructivist-based curriculum involving small-group, guided experiments. Each chapter culminates with a comparison of class consensus ideas and accepted physics concepts. Sections enroll 24 students and are taught by one science faculty member and one teaching assistant.
Research task. Chapter 2 of PET contains eight activities, completed over 4-5 weeks, and focuses on force and motion in one dimension. In Activity 6, students are expected to understand that the rate of change of an object's speed is proportional to the ratio of the strength of the force exerted on the object to the mass of the object. The curriculum does not use the term acceleration.
We focused our investigation on Act. 6 because instructors repeatedly observed that it elicited a rich set of productive and problematic student ideas. We adapted an initial ideas elicitation question from the activity as our research task (Figure 1). Two identical low friction carts, one with mass added, are placed on horizontal tracks, with fan units of equal strength attached. A ruler is used to launch the carts with the same initial speed. The fan units push in the direction opposite to the motion.
After the launch, which cart is first to stop and reverse direction: the one with less mass, the one with more mass, or would they act the same? Explain.
To answer correctly, students can recognize that after the launch the carts slow down due to the equal, constant strength forces exerted by the fans. The larger mass cart has a smaller force-to-mass ratio, so according to Newton's 2nd law it will have a smaller rate of change of speed. Since the carts start at the same speed, the cart with less mass will be the first to come to a stop.
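The target reasoning can be checked numerically. The following is a minimal sketch, assuming a constant fan force opposing the motion and negligible friction. The function name and all numerical values (masses, fan force, launch speed) are illustrative assumptions, not quantities reported in the study.

```python
def time_to_stop(mass_kg, fan_force_n, launch_speed_m_s):
    """Time for a cart to come to rest under a constant opposing force.

    Assumes no friction and a fan force that stays constant after launch.
    """
    if mass_kg <= 0 or fan_force_n <= 0:
        raise ValueError("mass and force must be positive")
    # Newton's 2nd law: rate of change of speed = force / mass
    rate_of_change_of_speed = fan_force_n / mass_kg
    # Speed drops linearly from the launch speed to zero
    return launch_speed_m_s / rate_of_change_of_speed


# Hypothetical values: same fan force and launch speed, different masses
light_cart_time = time_to_stop(mass_kg=0.5, fan_force_n=0.2, launch_speed_m_s=0.4)
heavy_cart_time = time_to_stop(mass_kg=1.0, fan_force_n=0.2, launch_speed_m_s=0.4)

first_to_stop = "less mass" if light_cart_time < heavy_cart_time else "more mass"
```

With equal force and equal launch speed, the stopping time scales with mass, so under these assumptions the cart with less mass stops and reverses first.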
Data set. Students completed the Fan Carts question before and after Act. 6. The posttest included a reflection prompt, which read, "Compare your original ideas to your current understanding. Describe the specific way(s) in which your thinking changed." While completing this reflection, students had access to both their original pre- and posttest responses. The students completed the Fan Carts question in writing, under exam conditions. Responses were collected in eight sections, taught by four different instructors over four academic quarters. Students who did not complete both the pre- and posttest were excluded, resulting in a set of matched responses from 159 students.
Data analysis. Pre- and posttest responses were read and discussed by all three researchers in order to identify common themes in student reasoning. After developing a tentative set of categories, researchers independently sorted a subset of responses. Differences in sorting were discussed and resolved, leading to changes in the coding scheme. Two researchers then independently coded all responses. After comparing codes and discussing discrepancies, final refinements to the coding scheme were made. Responses receiving the same code were compared as a final consistency check. A similar process was used to develop a coding scheme for student reflections.
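The step of comparing two coders' independent codes can be sketched as follows. This is an illustration only: the study does not report an agreement statistic, and the function name, the simple percent-agreement measure, and the sample codes are all assumptions introduced here.

```python
def compare_codes(codes_a, codes_b):
    """Return (fraction of matching codes, indices where the coders disagree)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same set of responses")
    if not codes_a:
        raise ValueError("no responses to compare")
    # Indices of responses that need discussion between the coders
    discrepancies = [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]
    agreement = 1 - len(discrepancies) / len(codes_a)
    return agreement, discrepancies


# Hypothetical code groups (1 = target, 2 = emerging, 3 = primitive)
coder_a = [3, 2, 1, 3]
coder_b = [3, 2, 2, 3]
agreement, to_discuss = compare_codes(coder_a, coder_b)
```

The list of disagreeing indices identifies the responses that would be discussed and resolved before the coding scheme is finalized.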

III. RESULTS
Student reasoning. The above process led to three main code groups of reasoning for responses to the Fan Carts question: 1) the target physics reasoning, 2) emerging, or partially correct reasoning, and 3) primitive, experientially-based reasoning. Subcategories in each group correspond to specific difficulties and levels of alignment with the target reasoning. Code group 1 responses include the correct answer as well as a clearly articulated rate-of-change-of-speed (ROCOS) concept linked appropriately via cause-and-effect to the force by the fan and the mass of the cart.
Code group 2, representing partially correct reasoning, is similar to the first code group, but is characterized by a weaker ROCOS idea. These responses often describe a "change in motion" or "change in speed" of the carts, without an explicit statement about the rate of change. Responses in code group 3 do not clearly articulate or even suggest a ROCOS idea, but instead tend to link force to speed or motion (rather than changes in motion). These responses seem based on a student's life experience rather than formal knowledge. Code group 3 had two prominent sub-categories: the first involving a primitive idea that heavy objects are "harder to stop," or "will stay in motion," and the second involving an idea that heavy objects are "sluggish," or "like to be at rest."
On the pretest, most responses were coded as primitive, while only a tiny fraction used formal knowledge. On the posttest, three-quarters gave a correct or partially correct response and less than one-quarter used primitive reasoning. Figure 2 summarizes student reasoning. These results suggest a substantial shift within the student population toward more Newtonian thinking. We compared pre- and posttest codes to identify specific changes in the reasoning of individual students. We compared the learning gains we observed to students' self-reported gains to investigate the accuracy and depth of reflection.
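The comparison of pre- and posttest codes for individual students amounts to a cross-tabulation of matched code pairs. The sketch below shows one way this could be done. The function names and the sample pairs are hypothetical and do not reproduce the study's data. It assumes the three code groups are numbered as above, so that a lower number is closer to the target reasoning.

```python
from collections import Counter


def tabulate_transitions(matched_codes):
    """Count how many students have each (pretest group, posttest group) pair."""
    return Counter(matched_codes)


def count_gains(transitions):
    """Count students whose code group moved toward the target reasoning.

    Assumes group 1 = target, 2 = emerging, 3 = primitive.
    """
    return sum(n for (pre, post), n in transitions.items() if post < pre)


# Hypothetical matched (pretest, posttest) code groups for six students
hypothetical_pairs = [(3, 1), (3, 2), (3, 3), (2, 1), (3, 1), (1, 1)]
transitions = tabulate_transitions(hypothetical_pairs)
students_with_gains = count_gains(transitions)
```

Each student's observed transition can then be set beside what that student reports in the reflection, which is the comparison the analysis relies on.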
Student reflection. The method of analyzing student reflections led to two code groups: reflections that are 1) inconsistent, or 2) consistent with student reasoning responses. The inconsistent group included incorrect statements that no changes in reasoning are present, misrepresentations of reasoning, and discussions of physics ideas that demonstrated a lower level of understanding of Newton's 2nd law than did the pretest and posttest responses. Students who did not represent their ideas consistently were considered to have inaccurate reflections.
Code group 2 includes five categories that define how students reflected and the depth in which they reflected.The categories are described in Table 1.
One-third of the student reflections were coded as inconsistent, and the remaining two-thirds were coded as consistent. Figure 3 indicates the depth of reflection students engaged in when reflecting consistently by showing the distribution of the code group 2 responses.

2a: The student does not address any specific changes and focuses only on pretest response.
2b: The student restates their pre- and posttest responses with no analysis of the changes in their thinking, and does not generalize the concept(s) they developed.
2c: The student restates their pre- and posttest responses, but highlights their new understanding or identifies specific flaws in pretest reasoning.
2d: The student correctly identifies pre- and posttest reasoning as equivalent, or recognizes specific gaps in their understanding.
2e: The student contrasts their pre- and posttest responses to either negate or validate their previous reasoning, identifies gaps in their understanding, and explicitly identifies the fundamental concept(s) learned.

IV. DISCUSSION
Consistent reflections. Pre- and posttest responses to the Fan Carts question indicate that around 70% of the students experienced substantial gains in their understanding of Newton's 2nd law. This suggests that when asked to reflect, most students should acknowledge changes in thinking and describe a shift toward more formal physics reasoning. Indeed, two-thirds of student reflections were coded as consistent. Of those, Code 2c was assigned more than any other single reflection code (see Figure 3). Students in this group identified the general, underlying physics concept they learned, a productive metacognitive step that could help a student apply the learned idea to a new situation. However, 2c reflections did not contrast the student's new understanding and initial ideas. These students may be missing important opportunities to reconcile current and initial thinking and to reinforce what they have learned, perhaps compromising the durability of their learning. The 2e code demonstrated deeper reflection, including an analysis of initial ideas, but fewer than 5% of responses received this code. Students with consistent reflections receiving the 2a or 2b code essentially restated their pre- and posttest responses. These reflections seemed to "cut and paste" the pre- and posttest responses with no commentary or analysis. Students in these groups did not generalize their understanding beyond the two carts context, suggesting that they have not internalized Newton's 2nd law as a robust model of how the world works.
Inconsistent reflections. Fully one-third of student reflections were coded as inconsistent. Many of these students, representing about 13% of the total sample, did not identify an evident change in their reasoning. For example, one student explained on their pretest, "the lighter mass cart would have had a weaker force push to get it started, and so I think it would take less time for the fan's force to overcome it." On their posttest, they stated, "the [cart] with less mass would be the first one to stop and start moving back towards me as it has a greater/higher rate of change of speed than the heavier cart." On the reflection, the student explained, "I believe my original ideas were consistent with my current understanding. I thought that the lighter weight would turn back first." The student does not seem to recognize that their final explanation incorporated a strong rate of change of speed concept, which was absent in the pretest. Nearly all students who failed to identify a fundamental change in reasoning had the same answer on the pre- and posttest. This suggests a tendency to focus on the answer, rather than the underlying reasoning, when reflecting.
Even among the consistent reflections, student language often indicated a focus on the answer. Many students used words such as "ideas," "explanation," and "understanding" when they seemed to be referring exclusively to their answer. Failure to differentiate answer from explanation could impede meaningful reflection on changes in understanding of force and motion. Other students giving inconsistent reflections, representing an additional 15% of the total sample, seemed to overestimate their level of initial understanding. These inconsistencies may stem from a form of hindsight bias.

V. CONCLUSION
We have developed a method for assessing reflective metacognition in the context of the learning of specific physics content. We find that one-third of the students describe their own learning in a way that explicitly contradicts researcher analysis of their physics explanations. We additionally identify distinct patterns of metacognition: some students restate their initial and final responses without really reflecting at all; some generalize by identifying a learned concept; and still others go on to connect their new ideas to their previous thinking.
We acknowledge limitations in our approach. Student reflections, as a group, vary considerably. We thus consider our coding scheme to have limited precision, in essence describing "dense regions" on a continuum. Furthermore, students' internal mental dialogues may include reflection that is more substantial than what is expressed in writing. Therefore, future work may include interviews to probe student metacognition in more depth.
The ability to track one's own thinking is important for the durability and transfer of learned content, and for guiding the learning of others. It seems unlikely that teachers unable to recognize changes in their own understanding will be effective in fostering student metacognition. We suspect that improvement in reflective metacognition requires exposure and practice over multiple courses, and advocate further study of the development of metacognition over time as well as the links between metacognition, content learning, and teacher practice.

FIG 2. Results from the Fan Carts question.

TABLE 1. Code group 2, describing student reflections that were consistent with the reasoning responses.