Effects Of Training Examples On Student Understanding Of Force And Motion

We examined the effects of simple training tasks on student responses to questions about the relationship between the directions of net force, velocity, and acceleration. Six training conditions were constructed, including a 2x2 design (abstract vs. concrete contexts) x (force-velocity training vs. acceleration-velocity training), a force-acceleration training condition, and a control (no training) condition. We found that the force-velocity and acceleration-velocity training significantly improved scores on both of these question types, but acceleration-velocity showed larger gains on the untrained question type, which is inconsistent with some interpretations of hierarchies of student understanding of force and motion found in previous works. This result implies that some students are learning the multiple relations between the variables that are typically learned in the course of standard instruction, while other students may be "gaming" the simple training tasks and not learning those relations between variables.


INTRODUCTION
There is a relatively long history of studies of student understanding of force and motion [1][2][3][4][5][6][7], often revealing now well-known incorrect student beliefs, such as the common belief that an object experiencing a non-zero net force must have a non-zero velocity parallel to that net force and the belief that a non-zero acceleration implies a non-zero velocity. Recently, Rosenblatt and Heckler [8] found evidence for a hierarchy of understanding of the relations between force, velocity and acceleration and an associated empirical progression of student understanding of these relations. If, for example, a student is correctly able to identify the possible directions of an object's velocity given the net force acting on it, that student is also very likely to be able to identify the possible directions of velocity given acceleration but not vice versa.
Although it is tempting to use these hierarchies as the basis for instructional strategies for improving student understanding, as hierarchies may suggest corresponding learning progressions, observing patterns in student responses is not sufficient to show that a particular pedagogical approach is superior. Similarly, these hierarchies do not necessarily indicate that students tend to learn a particular topic more easily or that understanding certain topics is a necessary prerequisite for understanding other topics. These patterns could, for example, be artifacts of existing course structure: because the relationship between acceleration and velocity is usually presented before the relationship between force and velocity, the above hierarchy is perhaps unsurprising, but this curriculum structure may not optimize student understanding. To draw conclusions regarding curriculum design, we must investigate the effects of instruction directly instead of relying on pre- and post-test data as in [8].
TABLE 1. Example of an F → v question: "The net force acting on a dog in a park points towards a small group of tulips at an instant in time. In what direction is the dog's velocity at that instant?" Note that a → v questions are obtained by replacing the words "net force" with "acceleration," providing uniformity in question structure.
To study the efficacy of different instructional techniques, we examined the effects of specific training examples to determine which training examples resulted in the largest learning gains and whether the results of specific training were consistent with these previously observed hierarchies and progressions of learning. Each training example and test question provides the direction of one of the three quantities of force and motion (net force, velocity, and acceleration) in one dimension (e.g., net force F) and asks the student to identify possible directions for another quantity (e.g., velocity v). We label this type of question as F → v; an example is given in Table 1. Our work focused on two basic types of questions, which were found to have the largest prevalence of incorrect responses and compose the pieces of the specific

PARTICIPANTS AND DESIGN
In addition to studying the effect of training different question types, we also investigated whether training with abstract or concrete examples had any effect on performance, because we have found in previous studies that performance can depend on the relative abstractness of the training [9] or the target questions [10], with more concrete scenarios proving to be more difficult for students.Concrete questions consisted of relatively familiar situations, such as in Table 1, and abstract questions had the same format, but were written in a generic style referring to "objects" and generic directions (e.g., positive direction) with no reference to familiar scenarios, such as parks, dogs, etc. as in Table 1.
We investigated the effects of training with students enrolled in a calculus-based electricity and magnetism course at the Ohio State University, a large public research university. These students had already completed a calculus-based mechanics course that used a similar style and format and is a prerequisite for the electricity and magnetism course. With each student, we administered a brief training routine followed immediately by an assessment, both of which were presented electronically on computers in a quiet room.
A total of 274 participants were randomly assigned to one of six training conditions. Table 2 describes the training conditions and number of students assigned to each condition. Each training routine presented students with four multiple choice questions of the type listed in Table 2 (with the F → a training condition receiving a mix of abstract and concrete questions) interspersed with four "filler" questions about energy and momentum (the control condition only received the filler questions). The training questions were similar in form to Table 1 and those used in Rosenblatt and Heckler [8]. The filler questions were designed to provide variety and avoid having the same correct answer choice for all the training examples. Immediately after answering a training question, feedback was automatically provided by indicating whether the student's response was correct or incorrect and displaying the correct answer.
Following the completion of the training examples, students were presented with a series of assessment questions with no feedback. There were twelve total questions: two of each of abstract F → v, abstract a → v, concrete F → v, and concrete a → v, and four additional filler questions.
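As an illustration of how such an assessment could be scored by question type, the following is a minimal sketch and not the instrument or scoring procedure used in this study. The item ordering, the `ASSESSMENT` list, and the `score_by_type` function are hypothetical names introduced here for illustration only.

```python
from collections import defaultdict

# Hypothetical item layout (an assumption): two items for each combination of
# question type and context, plus four fillers, for twelve items in total.
ASSESSMENT = [
    ("F->v", "abstract"), ("F->v", "abstract"),
    ("a->v", "abstract"), ("a->v", "abstract"),
    ("F->v", "concrete"), ("F->v", "concrete"),
    ("a->v", "concrete"), ("a->v", "concrete"),
    ("filler", None), ("filler", None), ("filler", None), ("filler", None),
]


def score_by_type(correct_flags):
    """Return the fraction correct per question type, ignoring filler items."""
    if len(correct_flags) != len(ASSESSMENT):
        raise ValueError("expected one correctness flag per assessment item")
    totals = defaultdict(int)
    right = defaultdict(int)
    for (qtype, _context), ok in zip(ASSESSMENT, correct_flags):
        if qtype == "filler":
            continue  # fillers are not part of the force-and-motion score
        totals[qtype] += 1
        right[qtype] += int(bool(ok))
    return {qtype: right[qtype] / totals[qtype] for qtype in totals}
```

Combining abstract and concrete items into one score per question type mirrors the pooling described in the analysis; a per-context breakdown would key the dictionaries on the full (type, context) pair instead.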
Unfortunately, it is difficult to study student responses to training pre-instruction, as university students will likely have seen some topics related to force and motion during their primary and secondary education. Furthermore, we have extremely limited access to pre-instruction students, making a detailed study logistically difficult. While these complications mean that we cannot truly assess the effect that traditional course structure may have on the development of student misconceptions, measuring student responses to specific training examples still demonstrates the effect of different presentations. In essence, we cannot measure the effect of training on a blank slate, but we can measure the effect of training given a particular initial state, which provides insight into the way that students process and internalize force and motion concepts.

RESULTS AND DISCUSSION
The major results of our study are shown in Figures 1 and 2. Note that a two-way (abstract-concrete x question type) ANOVA revealed a significant main effect of question type (F(1, 180) = 8.262, p = 0.005) in the training, no significant effect of concrete vs. abstract training, and no significant interaction. We also examined student performance on test questions in the control condition. A repeated measures two-way ANOVA found that students performed marginally better on a → v problems (mean score 42%) than on F → v problems (mean score 34%) (F(1, 44) = 2.973, p = 0.092), and they performed significantly better on abstract questions (mean score 44%) than on concrete questions (mean score 32%) (F(1, 44) = 17.111, p < 0.0005). These results are consistent with previous results in [8] and [10]. Nonetheless, the absolute difference in performance between concrete and abstract contexts is small compared to the effects of training. Therefore, for the remainder of the analysis, we combined the abstract and concrete training conditions as well as the scores for abstract and concrete test questions.
Figure 1 shows average scores for students who received each type of training. A one-way ANOVA indicates a significant difference between the three groups (F(2, 226) = 39.662, p < 0.0005). Furthermore, consistent with the two-way ANOVA mentioned above, a Tukey post-hoc test shows a significant difference in scores between the two types of training, with students receiving the a → v training earning a higher score (82.6% ± 2.5%) than students receiving the F → v training (72.7% ± 2.9%) (p = 0.026, d = 0.423).
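For readers less familiar with these statistics, the sketch below shows how a one-way ANOVA F statistic and a Cohen's d effect size are computed from per-student scores. It is a generic illustration using only the Python standard library, not the analysis code used for this study; the function names are hypothetical, and the pooled-standard-deviation form of d is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three groups the between-group degrees of freedom is 2, as in the reported F(2, 226). The p-value and the Tukey post-hoc comparison additionally require the F and studentized range distributions, which are usually taken from a statistics package rather than computed by hand.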
To find the source of this difference in scores, we can consider the scores for individual question types, as shown in Figure 2. Because we are interested in effects that may exist within individual students, we used a repeated measures analysis to find a significant main effect from training type (F(1, 182) = 8.246, p = 0.005) and a significant interaction between training type and question type (F(1, 182) = 21.834, p < 0.0005) but no significant effect of question type.
In short, training on a → v leads to a larger gain in
This study presents a purely empirical perspective on student learning, but it is still useful to put the results in the context of previously observed hierarchies to construct a broader picture of student understanding of force and motion. Of particular relevance for our training are the student answering patterns for F → v and a → v questions. Rosenblatt and Heckler found that most students who correctly answer F → v questions also correctly answer a → v questions, and most students who incorrectly answer a → v questions also incorrectly answer F → v questions [8]. These two observations indicate the conditional relationship that correctly answering F → v questions implies correctly answering a → v questions.
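The two answering patterns described above can be expressed as conditional proportions over paired student responses. The following is a minimal sketch of that bookkeeping, assuming one (F → v correct, a → v correct) pair per student; it is not the procedure of Rosenblatt and Heckler [8], and the `conditional_rates` function is a hypothetical name introduced for illustration.

```python
def conditional_rates(pairs):
    """Summarize paired responses, one (fv_correct, av_correct) pair per student.

    Returns a tuple:
      P(a->v correct | F->v correct),
      P(F->v incorrect | a->v incorrect).
    A rate is None when no student satisfies its condition.
    """
    pairs = list(pairs)
    # a->v outcomes for the students who answered F->v correctly.
    av_among_fv_right = [av for fv, av in pairs if fv]
    # F->v outcomes for the students who answered a->v incorrectly.
    fv_among_av_wrong = [fv for fv, av in pairs if not av]

    p_av_given_fv = (
        sum(av_among_fv_right) / len(av_among_fv_right)
        if av_among_fv_right else None
    )
    p_not_fv_given_not_av = (
        sum(not fv for fv in fv_among_av_wrong) / len(fv_among_av_wrong)
        if fv_among_av_wrong else None
    )
    return p_av_given_fv, p_not_fv_given_not_av
```

Both rates being close to 1 corresponds to the conditional "F → v correct implies a → v correct"; the converse would be tested with the roles of the two question types exchanged.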
If we wish to interpret a uni-directional logical conditional in terms of causality, then we have a number of significantly different options. For example, the observed presence of a conditional could be an indication that the presence of the antecedent will cause the presence of the consequent (e.g., if it rains, then the street will be wet; rain causes the street to be wet). Alternatively, a conditional might indicate that the consequent is a necessary (but generally not sufficient) cause of the presence of the antecedent (e.g., if it rains, then the atmospheric pressure is low; low pressure is one of the necessary causes of rain). Note that these two interpretations of causality are mutually exclusive in the sense that the antecedent is the cause in one interpretation and the consequent is the cause in the other.
The training results found here indicate that the most reasonable way to interpret the causality of the hierarchy found by Rosenblatt and Heckler is that understanding a → v is one of the necessary causes of student understanding of F → v but not vice versa. While both types of training lead to significant gains in the trained question type, a → v training yields higher gains in the untrained question type, which suggests that learning a → v is necessary but not sufficient for learning F → v. In fact, a → v training preserves the response pattern found in the control and in Rosenblatt and Heckler [8] (i.e., a → v scores are higher than F → v scores), which suggests that a → v training preserves the usual progression for learning force and motion that students follow.
We also see evidence that this relationship is not biconditional because the responses from students receiving F → v training break this pattern, as many students correctly answered F → v but incorrectly answered a → v questions. Therefore, it is clear that learning F → v does not automatically lead to learning a → v. It seems that F → v training attempts to circumvent the usual progression of learning by improving student scores on F → v questions without first teaching a → v.
It is interesting to note that the F → v training did lead to higher a → v scores than control. This result may be a consequence of some students following a different learning progression, a result of which is that some students will need to learn F → v before learning a → v.
However, the much larger gain in F → v scores suggests that students are simply learning to "game" the test by identifying the correct answer to F → v questions without a concurrent understanding of closely related essential concepts. Because the correct answer for both F → v and a → v questions is the same, we expect some portion of students to choose that answer preferentially when guessing. Furthermore, student scores following F → v training break the previously observed hierarchy (a student who correctly answers F → v questions is not also likely to answer a → v questions correctly), suggesting that this training opposes the commonly observed learning progression. Whatever the precise cause, the large difference in gains in untrained question types indicates that the a → v training taps into student learning progressions more effectively than F → v training.
Interesting issues for future research might be to determine the extent to which the gains in scores are retained over a period of days, weeks, or months, or to try to examine student learning progressions directly.

CONCLUSION
We have found that two kinds of relatively brief training examples significantly improve student performance on simple (but typically low scoring) questions regarding some of the relationships between net force, velocity, and acceleration. Specifically, we found that training on either F → v or a → v questions improves the score on both types of questions, but a → v training improves scores the most.
This difference in performance after training is consistent with previous observations that students come to understand a → v before they understand F → v. Why students are empirically found to learn a → v before F → v, and why this appears to be a more effective progression, remains an open question, but knowing this hierarchy is useful for planning curriculum and devising instructional sequences and strategies. That is, teaching a → v first appears to facilitate learning F → v. In contrast, when students learn F → v first, it appears that some fraction of students are learning to "game" the test without also learning the relevant concepts, thus impeding them from correctly learning a → v.
While these results are not necessarily generalizable to overall course design because we could not remove all influence of curriculum order, a somewhat weaker but still valuable claim can be made. In the context of a traditional curriculum, a → v examples provide larger overall gains to student scores than F → v examples, so placing greater emphasis on a → v scenarios in discussions of force and motion or practice problems may help students develop deeper understanding than alternatives.
Finally, to get a perspective on how these results relate to a larger question of instructional design, let us consider the following question about an empirically observed hierarchy of conceptual understanding and progression of learning: if two related concepts x and y are to be learned, and it is observed empirically that in the natural setting of a course x is learned before y is learned, does this imply that x should be taught before y is taught in order to maximize learning? Keep in mind that this is to be contrasted with some potentially different expert-constructed hierarchy of conceptual understanding. Such constructed hierarchies are commonly found in the learning-progression literature [11], and are at least implicit in any curriculum. While the hierarchy studied in this paper (a → v learned before F → v) is typically also the progression used in most curricula, our results indicate the instructional sequence should follow the empirically observed hierarchy.
Answer choices for the example question in Table 1:
a. Towards the tulips
b. Away from the tulips
c. It is zero
d. Both a and b are possible
e. Both a and c are possible
f. a, b, and c are all possible

FIGURE 1. Average score for students in each training condition. Error bars indicate the standard error for each group.

FIGURE 2. Average scores for students in each training condition.F → v question scores are shown on the left in blue, and a → v question scores are shown on the right in red.Error bars indicate the standard error for each group.

TABLE 2. List of experimental conditions and the number of students in each condition.