We find that symbolic physics questions are significantly more difficult than their analogous numeric versions. Very few of the errors arise from mistakes in manipulating the symbolic equations; instead, most stem from confusion about the meaning of the symbols. We also find that performance on symbolic questions correlates more strongly with overall performance than does performance on numeric questions. We devised a coding scheme that distinguishes questions based only on the mathematical structure of their solutions. The coding scheme can be used to identify both difficult and discriminating physics questions. The questions identified by this coding scheme require an algebraic representation and discourage problem-solving strategies that do not require an understanding of symbolic equations. Our results suggest that an inability to interpret physics equations may be a major contributor to student failure in introductory physics.

1. E. Torigoe and G. Gladding, “Same to us, different to them: Numeric computation versus symbolic representation,” in 2006 Physics Education Research Conference, edited by L. McCullough et al. (AIP, New York, 2007), pp. 153–156.
2. E. Torigoe, “What kind of math matters? A study of the relationship between mathematical ability and success in physics,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, 2008.
3. C. Kieran, “Cognitive processes involved in learning school algebra,” in Mathematics and Cognition: A Research Synthesis by the International Group for the Psychology of Mathematics Education, edited by P. Nesher and K. Kilpatrick (Cambridge U. P., Cambridge, 1990), pp. 96–112.
4. C. Kieran, “The learning and teaching of school algebra,” in Handbook of Research on Mathematics Learning and Teaching, edited by D. Grouws (Macmillan, New York, 1992), pp. 390–419.
5. E. Filloy and T. Rojano, “Solving equations: The transition from arithmetic to algebra,” For the Learning of Mathematics 9(2), 19–25 (1989).
6. The examples shown are from Ref. 4.
7. J. H. Larkin, J. McDermott, D. P. Simon, and H. A. Simon, “Models of competence in solving physics problems,” Cogn. Sci. 4(4), 317–345 (1980).
8. J. Clement, “Algebra word problem solutions: Thought processes underlying a common misconception,” J. Res. Math. Educ. 13(1), 16–30 (1982).
9. E. Cohen and S. E. Kanim, “Factors influencing the algebra ‘reversal error’,” Am. J. Phys. 73(11), 1072–1078 (2005).
10. E. Soloway, J. Lochhead, and J. Clement, “Does computer programming enhance problem solving ability? Some positive evidence on algebra word problems,” in Computer Literacy, edited by R. J. Seidel, R. E. Anderson, and B. Hunter (Academic, Burlington, 1982), pp. 171–201.
11. M. Scott, T. Stelzer, and G. Gladding, “Evaluating multiple-choice exams in large introductory physics courses,” Phys. Rev. ST Phys. Educ. Res. 2(2), 020102 (2006).
12. See supplementary material at http://dx.doi.org/10.1119/1.3487941 for all ten numeric and symbolic pairs of questions used in this study.
13. Question 4 was created by modifying an existing symbolic question. When numbers were introduced to create the numeric version, one of the symbolic options corresponded to an imaginary quantity. To ensure the similarity of all of the options, only the magnitude of this quantity was displayed in the numeric version. Two of the other five options for this question do not agree between the versions, but each of these options was chosen by 2% or less of the students.
14. The p-value is the probability of observing a difference at least this large under the assumption that the null hypothesis is true (see Ref. 15).
15. G. V. Glass and K. D. Hopkins, Statistical Methods in Education and Psychology, 2nd ed. (Prentice-Hall, Englewood Cliffs, NJ, 1984), pp. 229–235.
16. Some questions were common between the two versions of the final exam.
17. The discrimination of multiple-choice questions is most commonly measured using the point biserial coefficient of correlation because the result of a multiple-choice question is most commonly dichotomous. The multiple-choice questions in this study were instead analyzed using the Pearson correlation coefficient r because students were given partial credit for multiple selections. As a result, a student could receive a score of 0, 0.33, 0.5, or 1 on each question.
18. The error of the mean difference shown in Table III is less than what one would calculate if the errors for the top and bottom groups were combined in quadrature. To calculate the error shown, we took advantage of the fact that the difference in score between the top and bottom groups could be determined for each question. The error in the mean difference for the equation priority questions was determined by calculating the variance of the distribution of differences for the 40 equation priority questions. This process of pairing data is analogous to how one would calculate gains on the FCI by pairing each precourse and postcourse score by student, rather than finding the difference between the average precourse score and the average postcourse score for the class as a whole.
19. D. L. Schwartz, T. Martin, and J. Pfaffman, “How mathematics propels the development of physical knowledge,” Cognit. Dev. 6(1), 65–88 (2005).
20. V. M. Sloutsky, J. A. Kaminski, and A. F. Heckler, “The advantage of simple symbols for learning and transfer,” Psychon. Bull. Rev. 12(3), 508–513 (2005).
21. J. Tuminaro and E. F. Redish, “Elements of a cognitive model of physics problem solving: Epistemic games,” Phys. Rev. ST Phys. Educ. Res. 3(2), 020101 (2007).
22. E. Mazur, Peer Instruction: A User’s Manual (Prentice-Hall, Upper Saddle River, NJ, 1997), pp. 5–7.
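
The two statistical procedures described in Notes 17 and 18 can be illustrated with a short Python sketch. It is not the authors' analysis code, and all of the data in it are hypothetical values invented for illustration. The first function computes the Pearson correlation coefficient r between partial-credit question scores and total exam scores (for strictly 0/1 scores, r reduces to the point biserial coefficient). The remaining functions compare the paired error of a mean difference with the error obtained by combining the two groups' errors in quadrature.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def paired_error(top, bottom):
    """Error of the mean difference computed from per-question differences."""
    return standard_error([t - b for t, b in zip(top, bottom)])


def quadrature_error(top, bottom):
    """Error of the mean difference if the two groups are treated as independent."""
    return math.hypot(standard_error(top), standard_error(bottom))


# Hypothetical data, invented for illustration only.
# Partial-credit scores on one question and the same students' exam totals.
question_scores = [0, 0.33, 0.5, 1, 1, 0, 0.5, 1]
exam_totals = [41, 55, 60, 82, 90, 38, 66, 75]
r = pearson_r(question_scores, exam_totals)

# Fraction correct on each of five questions for top and bottom student groups.
top_group = [0.90, 0.80, 0.95, 0.70, 0.85]
bottom_group = [0.50, 0.35, 0.60, 0.25, 0.45]
err_paired = paired_error(top_group, bottom_group)
err_quadrature = quadrature_error(top_group, bottom_group)

print(f"r = {r:.3f}")
print(f"paired error = {err_paired:.4f}, quadrature error = {err_quadrature:.4f}")
```

In this made-up data set the top and bottom groups rise and fall together from question to question, so the per-question differences vary little and the paired error comes out smaller than the quadrature estimate, which is the effect Note 18 describes.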

