Effects of Argumentation Scaffolds on Student Performance on Conceptual Physics Problems

Studies have shown that embedding scientific argumentation in problem solving can enhance problem solving skills. However, research has also indicated that students have difficulties constructing arguments without appropriate scaffolds. We investigated the use of argumentation scaffolds on students’ argumentation quality, conceptual quality, and solution strategies on conceptual problems in an introductory physics class. In this mixed method study we compared students’ performance in two guided conditions – constructing an argument and evaluating two arguments – as well as one control condition. Our results indicate that the use of guiding prompts improves the argumentation and conceptual quality of students’ solutions. Further, students in the guided conditions tended to use a wider variety of problem solving strategies than in the control condition. We discuss the implications of these results on the use of argumentation prompts on conceptual problems in introductory physics.


INTRODUCTION
Studies have shown that scientific argumentation activities can enhance students' critical thinking and problem solving skills [1,2]. Research has indicated that students have difficulties constructing arguments without appropriate scaffolds [3][4][5][6]. Further, research has shown that students have limited problem solving abilities [7]. They apply formula-centered approaches, rarely reflecting on the appropriateness of equations or considering alternative solutions [8][9][10].
We examined the effects of alternative forms of argumentation on students' solutions to physics problems across multiple topics. Specifically, we compared students' strategies when they were prompted to construct or evaluate arguments. Our research questions were:
1) What level of argumentation and conceptual quality do our participants demonstrate in physics problems with no argumentation scaffolding?
2) How does the level of argumentation and conceptual quality change based upon prompts designed to scaffold the construction and evaluation of arguments?
3) What strategies do students use to solve problems incorporating construction or evaluation of arguments and how do the strategies vary from those used in traditional problem formats with no argumentation scaffolding?

THEORETICAL FRAMEWORK
To characterize students' argumentation quality we adapted Toulmin's argumentation pattern (TAP) [11]. TAP describes the features of an argument as (i) claims: conclusions or assertions, (ii) data: facts providing foundations for claims, (iii) warrants: reasons, rules, or principles connecting the data with the claim, (iv) backing: assumptions justifying the warrant, (v) qualifiers: conditions under which the claim is true, and (vi) rebuttals: conditions under which the claim is untrue.
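As an illustration only, the six TAP components can be held in a small record so that a coder's judgments about one response are explicit. The class name `CodedArgument`, its field names, and the example response are hypothetical. They are not the rubric or data used in this study, and the sketch assigns no scores.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodedArgument:
    """One student response coded into the six TAP components (hypothetical container)."""
    claim: str = ""
    data: List[str] = field(default_factory=list)
    warrants: List[str] = field(default_factory=list)
    backing: List[str] = field(default_factory=list)
    qualifiers: List[str] = field(default_factory=list)
    rebuttals: List[str] = field(default_factory=list)

    def components_present(self) -> List[str]:
        """List the TAP components that a coder identified in the response."""
        parts = {
            "claim": self.claim,
            "data": self.data,
            "warrants": self.warrants,
            "backing": self.backing,
            "qualifiers": self.qualifiers,
            "rebuttals": self.rebuttals,
        }
        # Empty strings and empty lists count as "not present".
        return [name for name, value in parts.items() if value]


# Invented example response, not taken from the study's data.
example = CodedArgument(
    claim="Both sleds gain the same kinetic energy.",
    data=["The same force acts over the same distance."],
    warrants=["Work equals force times displacement and becomes kinetic energy."],
)
present = example.components_present()
```

A rubric such as the one referenced in Table 1 would then map the components present, and their correctness, onto a quality level. That mapping is deliberately left out here.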
Students' problem solving strategies in physics have been extensively studied. Tuminaro and Redish [12] identified six epistemic games employed by students while solving physics problems: (i) mapping meaning to mathematics: connecting math to physical reality, (ii) mapping mathematics to meaning: creating a mathematical description of physical reality, (iii) physical mechanism: qualitatively describing the situation, (iv) pictorial analysis: creating a pictorial representation of the situation, (v) recursive plug-and-chug: inserting numbers into equations, and (vi) transliteration to mathematics or case reuse. Tuminaro and Redish [12] emphasize that new games may emerge while solving physics problems. We adapted their framework to characterize students' strategies while solving conceptual physics problems. Other, more recently discovered e-games such as the answer making e-game [13] were not included in the analysis here.

METHODOLOGY
We used a two-phase concurrent mixed-methods approach [14] for this study in an introductory calculus-based physics course at a Midwestern U.S. public university. Five homework problems targeting key misconceptions were adapted from the literature. We administered each problem online to 246 participants divided equally among three conditions: construct, evaluate, and control. Figure 1 shows an example of the problem types and the corresponding scaffolding prompts for the construct and evaluate conditions. To scaffold students' argumentation, the construct prompts were based on work by Jonassen [5] and the evaluate prompts on work by Mason and Scirica [15]. The control group received the same problem as the construct condition, except with the prompts replaced by "What is your answer? Explain your reasoning." All students completed the Force and Motion Conceptual Evaluation (FMCE) [16]. Based on their FMCE responses and gender, students were assigned to one of the three conditions to ensure representative samples [17].
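One way to balance conditions on FMCE score and gender is a blocked assignment, sketched below. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `assign_conditions`, the roster format, and the choice to rank students within gender and randomize within blocks of three are all hypothetical.

```python
import random
from collections import Counter, defaultdict

CONDITIONS = ("construct", "evaluate", "control")


def assign_conditions(students, seed=0):
    """Assign students to conditions, balancing on gender and FMCE score.

    students: list of (student_id, gender, fmce_score) tuples (hypothetical format).
    Returns a dict mapping student_id to a condition name.
    """
    rng = random.Random(seed)
    by_gender = defaultdict(list)
    for student_id, gender, score in students:
        by_gender[gender].append((score, student_id))

    assignment = {}
    for gender in sorted(by_gender):
        ranked = sorted(by_gender[gender])  # lowest to highest FMCE score
        # Take neighbours in the ranking three at a time and give each block
        # one student per condition, in a random order.
        for start in range(0, len(ranked), len(CONDITIONS)):
            order = list(CONDITIONS)
            rng.shuffle(order)
            block = ranked[start:start + len(CONDITIONS)]
            for (_, student_id), condition in zip(block, order):
                assignment[student_id] = condition
    return assignment


# Invented roster: six students of each gender with made-up FMCE scores.
roster = [(f"F{i}", "F", 10 + 3 * i) for i in range(6)]
roster += [(f"M{i}", "M", 12 + 3 * i) for i in range(6)]
assignment = assign_conditions(roster)
counts = Counter(assignment.values())
```

With group sizes that are multiples of three, every condition receives the same number of students of each gender, and students with similar FMCE scores are spread across conditions.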

ANALYSIS
The quantitative aspect of the mixed methods design allowed us to address the first two research questions, while the qualitative aspect addressed the third research question. For the quantitative analysis we adapted the Argumentation Quality Rubric [18] based upon TAP for argumentation quality and designed another rubric for conceptual quality of problem solutions (Table 1). The argumentation prompts and analysis rubric used in this study are identical to those used by us in a previous study [19].
Participants' responses were scored by two independent raters. After independent scoring, the raters discussed all scores and reached 100% agreement. To compare argumentation and conceptual quality in the three conditions, a multivariate analysis of variance (MANOVA) was performed with conceptual and argumentation scores as the dependent variables (DVs) and condition (construct, evaluate, or control) as the independent variable (IV). We then conducted univariate ANOVAs to determine whether condition had a significant effect on each DV.
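The univariate follow-up step can be illustrated with a one-way ANOVA F statistic computed by hand. The sketch below covers only that step, not the MANOVA itself. The function name `one_way_anova_f` and the scores are invented for illustration, and obtaining a p-value would additionally require the F distribution, which is not computed here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-condition score lists."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of the condition means.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of scores around their own mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up argumentation scores for one problem, not data from the study.
control = [1, 2, 3]
construct = [2, 3, 4]
evaluate = [5, 6, 7]
f_stat, df_b, df_w = one_way_anova_f([control, construct, evaluate])
# Here ss_between = 26 and ss_within = 6, so F = (26 / 2) / (6 / 6) = 13.
```

A large F indicates that the condition means differ by more than the within-condition scatter would suggest. Post hoc comparisons then identify which pairs of conditions differ.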

EVALUATE
You and your friends are trekking across a nearly frictionless frozen lake to a camp on the other side. Two of you are each pulling a sled loaded with camping equipment and supplies. You both notice, however, that one of you is pulling a much heavier load on their sled than the other one. Your friend, knowing that you are taking physics, asks "Suppose we were both to continuously pull our respective sleds with exactly the same force from the same starting point on this shore of the lake all the way to the opposite shore of the lake, which one of us, the one pulling the heavier or the lighter sled, would do more work? Which will have the greater energy?" Two other friends who are trekking with you, and who are also taking physics, jump in to answer.
Bill: "You will clearly need to do the greater amount of work on the heavier sled, since it is heavier and because of that, the heavier sled will also have greater energy because whatever work you do is converted into kinetic energy."
Bob: "Of course, the heavier sled will need more work, but the heavier sled will have a smaller kinetic energy, because kinetic energy depends upon the mass and the square of the speed and although the heavier sled has a greater mass, it has a smaller speed, so it will have a smaller kinetic energy."
Which statement (of the ones provided) best describes the physical phenomenon? Or do you have another argument? Explain, elaborate, and justify your preferred solution. Remember to consider:
What evidence and reasons support your selection?
Explain your reasoning for not choosing the alternative solution(s).
What are the weaknesses in the alternative argument?
How might a classmate supporting the other solution disagree with your preferred solution?
What would your reply be to your classmate to explain that your position is right?

CONSTRUCT
You and your friend are trekking across a nearly frictionless frozen lake to a camp on the other side. Two of you are each pulling a sled loaded with camping equipment and supplies. You both notice, however, that you are pulling a much heavier load on your sled than your friend. Your friend, knowing that you are taking physics, asks "Suppose we were both to continuously pull our respective sleds with exactly the same force from the same starting point on this shore of the lake all the way to the opposite shore of the lake, which one of us, the one pulling the heavier or the lighter sled, would do more work? Which will have the greater energy?"
Construct an argument to justify your answer. Explain your position clearly and completely by providing all reasons that support your conclusion. Remember to consider:
What evidence and reasons support your solution?
One of your classmates may disagree with your conclusion. What might they think is the alternative conclusion?
What reasons would your classmate provide to support their conclusion?
What would your reply to your classmate be to explain that your position is right?

For the qualitative part we used a multiple case study approach [19] with semi-structured individual think-aloud interviews. Via stratified sampling, we selected four participants from each condition based upon FMCE performance to participate in four individual interviews over the course of the semester. Each condition served as a case. After participants had completed each online problem, we interviewed them and asked them to solve a similar argumentation problem. Interviews were audio-recorded and transcribed. We organized all multi-source data into a case record [19]. Analysis of participants' responses showed that the emergent problem solving strategies aligned with Tuminaro and Redish [12]. Once themes emerged for a case, we investigated the data for refuting evidence [20]. We performed a cross-case analysis to investigate similarities and differences between conditions [18].

FINDINGS
The quantitative results showing the mean conceptual and argumentation quality scores across all five problems in each condition appear in Fig. 2.
MANOVA revealed a statistically significant difference among the conditions [Wilks' Λ = 0.640, F(20.0, 390.0) = 4.875, p < .001, η² = 0.20] for all five problems. This constitutes a small-sized effect. Univariate ANOVAs revealed significant differences between conditions in argumentation scores for all five problems and conceptual scores for two problems. Follow-up Tukey's HSD analysis with an overall alpha level of .05 revealed statistically greater argumentation scores for construct and evaluate conditions compared to the control condition for four problems, but no significant difference between construct and evaluate conditions. Results for argumentation quality revealed that construct and evaluate condition prompts yielded a higher argumentation quality than the control. Post hoc results for conceptual quality scores seem to suggest that the effect of problem format may depend on problem context or topic.
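The reported multivariate statistics are mutually consistent, as the short check below shows. It rests on assumptions that the text does not state: that η² is the multivariate partial η² defined as 1 − Λ^(1/s), that s = 2 (three conditions give two hypothesis degrees of freedom), and that F comes from Rao's transformation of Wilks' Λ, which is exact when s = 2. The variable names are illustrative.

```python
import math

# Values reported in the text.
wilks_lambda = 0.640
df1, df2 = 20.0, 390.0

# Assumption: s = min(number of DVs, number of groups - 1) = 2 for three conditions.
s = 2

root = wilks_lambda ** (1.0 / s)            # Lambda^(1/s), about 0.8
partial_eta_sq = 1.0 - root                 # about 0.20
f_stat = (1.0 - root) / root * (df2 / df1)  # about 4.875

print(round(partial_eta_sq, 3), round(f_stat, 3))
```

Under these assumptions, both the effect size and the F value in the text follow directly from Λ and the two degrees of freedom.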
Our qualitative results indicate alignment with the epistemic games outlined by Tuminaro and Redish [12] as well as six additional games that are described below.
Recursive concept testing: testing different concepts and seeing which is applicable.
Extreme case thought experiment: predicting what happens when a variable takes an extreme value.
Covariational reasoning: examining how a change in one quantity propagates to another quantity.
Qualitative concept application: identifying a concept and arguing how it applies to the problem.
Evaluate problem scenarios: reading each pseudo debate argument to determine which is incorrect.
Construct and compare solutions: constructing a solution and comparing it with the arguments provided.
The cross-case analysis showed no single prevalent strategy in either the construct or the evaluate conditions. However, control condition participants tended to utilize less sophisticated personal intuition or formula-centered strategies than students in either the construct or evaluate conditions. Conversely, students in the construct condition utilized covariational reasoning and case reuse in addition to hypothetical plug-and-chug and qualitative concept application. The evaluate condition employed qualitative concept application or covariational reasoning along with evaluate problem cases and construct and compare strategies. These differences could potentially be attributed to the ways in which participants utilized the hypothetical pseudo debate arguments provided in the evaluate condition.

CONCLUSIONS
The ability and willingness to construct sound scientific arguments is an important skill for scientific literacy. However, no significant work has been completed that examines the use of scientific argumentation in physics problem solving. To address our research questions:
1) On average, the participants in the control condition, who did not receive any prompts, were unable to create arguments with more than a single ground for justification. Their conceptual scores indicate that for the most part these students are able to answer the problem correctly, but are only able to provide partly correct reasoning.
2) We found that when argumentation prompts are provided, there is a statistically significant increase in argumentation quality in both the construct and evaluate conditions. Participants on average are able to provide a justification with multiple grounds.
3) We found that the strategies that students use tend to be more diverse than the ones reported in the literature [12]. No particular strategies were more prevalent in any of the conditions. However, in general, students who received the prompts seemed to use improved problem solving strategies compared to those in the control condition.
Overall, this study demonstrates that the quality of argumentation for tasks that require both construction and evaluation of arguments can be enhanced through the use of appropriate prompts, such as those used in this study. The reasons for differences in conceptual quality across problems merit further study. No differences were observed with regard to gender or FMCE scores. Further, the use of prompts also leads students to use different problem solving strategies that they might otherwise not have considered.
Newton [21] indicated that the quality of an argument could depend upon the context and nature of the task. As evidenced by the mean argumentation quality scores, problems designed to prompt students to construct or evaluate arguments yielded higher argumentation quality with elaborately justified claims. This study also demonstrates that typical statements such as "explain your reasoning" may not produce higher argumentation quality unless students are appropriately guided to provide justifications.

FIGURE 2. Mean argumentation and conceptual quality scores across five problems in the three conditions. The error bars are the standard error.

TABLE 1. Conceptual and Argumentation Quality Rubric