For decades, Student Evaluations of Instruction (or Teaching) have been used to evaluate the quality of teaching at universities and colleges nationwide. Often, student evaluations are the sole measurement of teaching quality in higher education, and as a result they have been the subject of extensive study. While many of these investigations make claims about correlations between student evaluations of instruction and student learning, the validity and reliability of both the methodologies and the measurement tools in these studies are not clear. The study reported here uses research-based conceptual inventories, such as the Force Concept Inventory (FCI), to make the more rigorous claim that Student Evaluations of Instruction do not correlate with conceptual learning gains on the FCI. In addition, grading leniency by an instructor (i.e., giving easy A grades) does not correlate with higher student evaluations of instruction.
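To illustrate the kind of correlational analysis this claim rests on, here is a minimal sketch assuming hypothetical section-level data: mean SEI ratings paired with class-average FCI pretest and posttest percentages. It computes Hake's normalized gain, g = (post − pre)/(100 − pre), as defined in Ref. 11, and the Pearson correlation between SEI ratings and g. The variable names and numbers are illustrative, not data from the study.

```python
# Illustrative sketch only -- not the authors' analysis code.
# Assumes one row per course section: a mean SEI rating (e.g., on a
# 1-5 scale) plus class-average FCI pre- and post-test percentages.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical section-level data.
sei = np.array([3.9, 4.4, 4.1, 3.2, 4.8, 3.6])         # mean SEI rating
pre = np.array([28.0, 35.0, 31.0, 40.0, 25.0, 33.0])   # FCI pretest (%)
post = np.array([52.0, 61.0, 55.0, 58.0, 47.0, 60.0])  # FCI posttest (%)

# Hake's normalized gain (Ref. 11): the fraction of the possible
# improvement actually achieved, g = (post - pre) / (100 - pre).
g = (post - pre) / (100.0 - pre)

# Pearson correlation between SEI ratings and normalized gain; the
# paper's claim is that r is statistically indistinguishable from zero.
r, p = pearsonr(sei, g)
print(f"r = {r:.2f}, p = {p:.3f}")
```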

1. C. Henderson, “Assessment of teaching effectiveness: Lack of alignment between instructors, institutions, and research recommendations,” Phys. Rev. ST Phys. Educ. Res. 10, 1–20 (2014).
2. H. Laube, K. Massoni, J. Sprague, and A. L. Ferber, “The impact of gender on the evaluation of teaching: What we know and what we can do,” NWSA J. 19, 87–104 (2007), see http://www.jstor.org/stable/40071230.
3. J. Sprague and K. Massoni, “Student evaluations and gendered expectations: What we can't count can hurt us,” Sex Roles 53, 779–793 (2005).
4. N. Denson, T. Loveday, and H. Dalton, “Student evaluation of courses: What predicts satisfaction?,” Higher Educ. Res. Develop. 29, 339–356 (2010).
5. H. K. Wachtel, “Student evaluation of college teaching effectiveness: A brief review,” Assess. Eval. Higher Educ. 23, 191–212 (1998).
6. P. A. Cohen, “Student ratings of instruction and student achievement: A meta-analysis of multisection validity studies,” Rev. Educ. Res. 51, 281–309 (1981).
7. K. A. Feldman, “The association between student ratings of specific instructional dimensions and student achievement: Refining and extending the synthesis of data from multisection validity studies,” Res. High Educ. 30, 583–645 (1989).
8. D. E. Clayson, “Student evaluations of teaching: Are they related to what students learn?: A meta-analysis and review of the literature,” J. Marketing Educ. 31, 16–30 (2009).
9. B. Uttl, C. A. White, and D. W. Gonzalez, “Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not related,” Stud. Educ. Eval. 54, 22–42 (2017).
10. S. Stehle, B. Spinath, and M. Kadmon, “Measuring teaching effectiveness: Correspondence between students' evaluations of teaching and different measures of student learning,” Res. High Educ. 53, 888–904 (2012).
11. R. R. Hake, “Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses,” Am. J. Phys. 66, 64–74 (1998).
12. L. Ding, “Evaluating an electricity and magnetism assessment tool: Brief electricity and magnetism assessment,” Phys. Rev. ST Phys. Educ. Res. 2, 1–7 (2006).
13. D. P. Maloney, T. L. O'Kuma, C. J. Hieggelke, and A. Van Heuvelen, “Surveying students' conceptual knowledge of electricity and magnetism,” Am. J. Phys. 69, S12–S23 (2001).
14. R. K. Thornton and D. R. Sokoloff, “Assessing student learning of Newton's laws: The force and motion conceptual evaluation and the evaluation of active learning laboratory and lecture curricula,” Am. J. Phys. 66, 338–352 (1998).
15. J. Von Korff, B. Archibeque, K. A. Gomez, T. Heckendorf, S. B. McKagan, E. C. Sayre, E. W. Schenk, C. Shepherd, and L. Sorell, “Secondary analysis of teaching methods in introductory physics: A 50k-student study,” Am. J. Phys. 84, 969–974 (2016).
16. D. E. Meltzer and R. K. Thornton, “Resource letter ALIP-1: Active-learning instruction in physics,” Am. J. Phys. 80, 478–496 (2012).
17. C. Henderson, “Promoting instructional change in new faculty: An evaluation of the physics and astronomy new faculty workshop,” Am. J. Phys. 76, 179–187 (2008).
18. A. Olmstead, “Assessing the interactivity and prescriptiveness of faculty professional development workshops: The real-time professional development observation tool,” Phys. Rev. Phys. Educ. Res. 12, 1–30 (2016).
19. I. A. Halloun and D. Hestenes, “The initial knowledge state of college physics students,” Am. J. Phys. 53, 1043–1055 (1985).
20. S. Glantz and B. Slinker, Primer of Applied Regression & Analysis of Variance, 2nd ed. (McGraw-Hill Education/Medical, 2000).
21. A. G. Greenwald and G. M. Gillmore, “Grading leniency is a removable contaminant of student ratings,” Am. Psychol. 52, 1209–1217 (1997).
22. J. I. Smith and K. Tanner, “The problem of revealing how students think: Concept inventories and beyond,” CBE Life Sci. Educ. 9, 1–5 (2010).
23. C. Henderson, A. Beach, and N. Finkelstein, “Facilitating change in undergraduate STEM instructional practices: An analytic review of the literature,” J. Res. Sci. Teach. 48, 952–984 (2011).
24. C. Wieman, “A better way to evaluate undergraduate teaching,” Change 47, 6–15 (2015).