Much of physics class time is spent helping students learn the explanations generally accepted by the scientific community. Concept inventories have become a key tool in this effort. Assessments like the Force Concept Inventory¹ and Brief Electricity and Magnetism Assessment² are valuable resources for instructors to gauge students’ understanding of core concepts in physics. These assessments typically present students with a scenario, a question about that scenario, and a series of possible answers. Following instruction, students should be able to select the best option, one that demonstrates their grasp of core concepts and principles.

But what if the learning didn’t stop there? Even the best explanations contain gaps or limitations. It is important for students to appreciate that physics progresses not through enduring satisfaction with our current understandings, but through the search for gaps or inconsistencies requiring further explanation. Many important discoveries in physics emerge from a willingness to question even our most closely held theories about the natural world. Using an illustrative example, we present the possibility of repurposing concept questions to teach physics students to go beyond correct answers and search for gaps and limitations of those answers.

Our illustrative example begins with what students refer to as the “penguin problem,” named after its main characters, emperor penguins. The problem was designed by the Diagnostic Question Clusters project³ and appeared in a paper by C. Wilson and colleagues.⁴ The team designed the penguin problem, along with a handful of others, to assess students’ tendency to trace matter and energy in biological processes. Figure 1 shows the original wording of the problem.

Fig. 1.

The original “penguin problem.” This material originally appeared in CBE—Life Sciences Education and is made available for non-commercial use by the general public under an Attribution-Noncommercial-Share Alike 3.0 Unported Creative Commons License.⁴ Use here does not imply endorsement by the original authors or journal.

The penguin problem tests students’ tracing of matter and energy in cellular respiration, a key topic in introductory biology. It is also a fruitful question to use with physics and chemistry students because it elicits interesting ideas about matter, energy, and thermodynamics. Wilson and colleagues report that before targeted intervention, most students pick answer B, that the mass is used up. This answer suggests the mass turns into energy, an idea possibly derived from students’ everyday experience that something must be consumed for an action to occur, such as a car needing fuel.⁵ Some students justify answer B by appealing to E = mc², thinking the penguin matter is transformed directly into energy. Understandably, instructors and researchers have expressed a great deal of concern about student performance on concept inventory questions like the penguin problem.

After trying out the question as an instructor (first author) and student (second author), we were concerned, too, but not because most students initially picked the wrong answer. Answer A may be the best choice of the options provided, but it is not really a complete explanation for the phenomenon of weight loss in penguins during fasting. Our concern was how readily students learned to trace “H’s,” “C’s,” and “O’s” to arrive at answer A⁶ and seemed satisfied with the result. In The Log from the Sea of Cortez, Steinbeck wrote of the danger of scientists treating ideas as complete:

There is one great difficulty with a good hypothesis. When it is completed and rounded, the corners smooth and the content cohesive and coherent, it is likely to become a thing in itself, a work of art. It is then like a finished sonnet or a painting completed. One hates to disturb it. Even if subsequent information should shoot a hole in it, one hates to tear it down because it once was beautiful and whole.⁷

If the lesson ends with students arriving at answer A, we risk inadvertently signaling that this is a complete answer, a “finished sonnet,” so to speak. Though this is not the typical way these tools are used, we think concept inventory questions can help students learn to disturb answers, find their limitations, and see what does not make sense.

Here, we describe how the penguin problem was repurposed for these goals in a pedagogy course for undergraduate physics, chemistry, and biology peer educators. The first day of the penguin problem work begins with an announcement that emperor penguins are “the big penguins,” standing about 4 ft tall. We discuss pictures of the penguins in their natural habitat in Antarctica. Sometimes, students will have more information to share about these penguins from documentaries they have watched or other personal experiences. We spend a few more minutes delighting in penguin facts and photos, before turning to the problem at hand, a modified version shown in Fig. 2.

Fig. 2.

The modified penguin problem.

The modified penguin problem has no multiple-choice options. Students must generate their own possibilities for what happens to the mass, yielding many answers that are not included in the original set of options, such as “feathers.” When considering individual penguins, students sometimes treat the mass loss as negligible, reminiscent of how young children will view a small grain of rice as having no mass.⁸ The extreme case of the colony amplifies the mass loss⁹; hundreds of thousands of pounds of penguin must be accounted for. Both U.S. customary and SI units appear in the problem statement; that is intentional, to help students who do not yet have a sense of a kilogram to think in terms of pounds, or vice versa. Some of the more technical terms are replaced, so students who are most comfortable with the technical language will not dominate the discussion. (D. Hestenes, a creator of the Force Concept Inventory, also wrote about the importance of using everyday language to “get closer to what students really think.”¹⁰) Finally, the construction of the chart encourages students to work together and commit to an initial idea about where the mass goes.

These changes are not meant to be a general improvement to the original penguin problem, nor do we claim students’ initial performance on the modified task is substantially different from the original. Rather, the two versions of the penguin problem are an illustration of the idea that “the same physics scenario posed as a problem in different ways, can emphasize different learning goals for students and can be used in diverse situations to meet various instructional goals.”¹¹ The goal of the revised penguin problem is for students to practice an important but commonly overlooked aspect of doing physics: finding limitations and gaps in an answer.

After students share their initial pie charts (see Fig. 3), conversations may go in different directions. Eventually, at an instructor’s or student’s suggestion, we trace matter to arrive at answer A. Some instructors may stop here, satisfied when students arrive at the idea that the mass is lost as carbon dioxide and water. For us, the work is just beginning. Now we need to consider why we abandoned very sensible ideas such as feathers and feces in favor of the much less intuitive answer, the air. A question is posed to students along the lines of, “Well, are we all just going to go along with the idea that the 600,000 pounds of penguin went into the air? What doesn’t make sense about that? What questions do we still have?” Students must now consider what the answer leaves out.

Fig. 3.

Examples of pie charts observed over the years to account for the lost penguin mass, reconstructed by the authors.

A student might say it does not make sense that all of that weight could go into the air, because air seems so light. Some ask what percentage of the mass is lost as H₂O vs. CO₂, and how to calculate the ratio theoretically or measure it empirically. Students will also ask about application to other contexts, such as weight loss in humans. Others will bring up the incubation of the egg. Answer A reveals little about how the egg stays warm, yet the egg is the most compelling part of the penguin problem for many students. When students consider what “the answer” leaves out, they begin to see the problem in a different way. They look for phenomena that need explaining and details that are missing. So far, students have raised questions about how the egg stays warm, molting, penguin digestion, care and feeding of baby penguins, fat composition of penguins, and thermodynamics of penguin huddles.
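One way a class might begin the theoretical calculation students ask about is sketched below. It is only a rough illustration under assumptions that are not part of the penguin problem: the lost mass is treated as a single representative fat molecule, C55H104O6, that is completely oxidized (C55H104O6 + 78 O2 -> 55 CO2 + 52 H2O), and the function and variable names are hypothetical. Protein and any other tissue are ignored.

```python
# Illustrative assumption (not from the penguin problem): model the lost mass
# as one "average" fat molecule, C55H104O6, that is fully oxidized:
#   C55H104O6 + 78 O2 -> 55 CO2 + 52 H2O
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol

def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

FAT = {"C": 55, "H": 104, "O": 6}
O2 = {"O": 2}
CO2 = {"C": 1, "O": 2}
H2O = {"H": 2, "O": 1}

def respiration_mass_balance(fat_mass):
    """Masses (same unit as fat_mass) of O2 consumed and CO2 and H2O produced."""
    moles_fat = fat_mass / molar_mass(FAT)
    return {
        "O2_in": 78 * moles_fat * molar_mass(O2),
        "CO2_out": 55 * moles_fat * molar_mass(CO2),
        "H2O_out": 52 * moles_fat * molar_mass(H2O),
    }

# Work per unit mass of fat so the results read as ratios.
balance = respiration_mass_balance(1.0)
products = balance["CO2_out"] + balance["H2O_out"]
co2_share = balance["CO2_out"] / products  # fraction of product mass that is CO2

for name, mass in balance.items():
    print(f"{name}: {mass:.2f} per unit mass of fat")
print(f"CO2 share of product mass: {co2_share:.0%}")
```

Under these assumptions, each kilogram of fat yields roughly 2.8 kg of CO₂ and 1.1 kg of H₂O, with about 2.9 kg of inhaled oxygen making up the difference. That the products outweigh the fat is itself a gap worth raising with students: answer A says nothing about the oxygen that enters the penguin.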

Students are also encouraged to bring up ideas they initially included in their pie charts, such as the mass going into the feathers or feces or E = mc². Too often, as with clicker questions, students share initial ideas, become convinced their first answer is wrong and another answer is better, and change their response. This process risks reinforcing a “one true path” view of science: that some answers are right and should not be questioned, and others are wrong and should not be entertained.¹² Instead, we encourage students to think of their wrong answers as entry points into critically examining the “correct answer.” Consider the answer “feathers.” Biologists have conducted studies of the physiology of penguins during molting. Unlike some other birds, penguins replace their entire plumage at once. During this “catastrophic molt,” penguins lose the waterproofing and insulation provided by their feathers. Unable to safely swim for feeding purposes, they must fast instead.¹³ Answer A makes sense for the specific fasting period during egg incubation, but it would not be as correct for describing what happens to emperor penguins’ weight during their molting fast.

The willingness to evaluate explanations and conclusions is an important part of physics and of science more generally, though invitations for students to engage in this work are often limited.¹⁴ (Mathematics educator S. Humble expressed a similar sentiment in the essay “Beware answers with questions,” writing, “Students often forget to ask questions once they have found an answer. This paper suggests that students would always benefit by questioning answers.”¹⁵)

We propose concept inventories as a ready-made tool teachers can adapt to help students learn to “go beyond” answers. The penguin problem was a good concept inventory question for our context, an undergraduate class enrolling peer educators from the physics and biology departments. We also changed the problem substantially to suit our needs. However, instructors could apply this approach in their own classroom by selecting a question from an inventory aligning better with their curriculum, and one that may not need substantial modification. PhysPort¹⁶ maintains an extensive repository of concept inventories with publicly released example problems, or instructors can use their own favorite concept questions from their slides or textbooks.

To begin repurposing concept inventories as we have done with the penguin problem, instructors can start by asking students questions like “Now that you know the answer, what doesn’t make sense about it?” and “What does the answer leave out that you were thinking about?” Students may need an example or two to get started. For a mechanics course, examples might address a part of the physical scenario that was neglected in the original problem (e.g., what if the string does have mass?), change aspects of the physical situation (e.g., does the material the table is made of matter?), or look for scenarios in which the wrong answer would apply (e.g., can we think of any situation that would generate the graph shown in answer b?). It is unlikely that an instructor will have time to guide the students through every question generated, but the goal is for the students to learn to ask these questions. And, as with any teaching strategy focused on cultivating curiosity and reasoning, instructors must be attentive to their students and the classroom environment. Students are unlikely to engage in the type of intellectual work we describe if they have no real investment in the question, or in the class, for that matter, or if they do not feel safe sharing their ideas. In our setting, students are not graded at any point in the process, but they do share snapshots of their thinking through written explanations, drawings, or whole-class discussion.

In terms of assessment, it is important to distinguish between helping students achieve the correct answer and helping them learn to question answers. One way to check what students learn from the penguin question is to ask them to examine other contexts involving cellular respiration to see if students can trace matter and energy to arrive at the scientifically accepted answer. This is the typical way that concept inventories are used. But that only tells us whether students have learned a particular strategy (albeit an important one) to analyze a problem. Our goal is to teach students to question answers. One way we have assessed that learning is to assign students the task of selecting a concept question, answering it, and then generating their own “going beyond” questions. Instructors can also look for signs that students are starting to notice gaps or limitations in explanations they are learning in class. For example, the second author, a former student in the class and a peer educator, began to disturb one of the narratives taught in his biology laboratory:

I think it is important to critique the question of “why are plants green?” Being a student of science, we always learn why plants are green … However, there are plants that have leaves with no green, but rather orange, red, yellow, and the many shades in between. It’s important to think about this metacognitively and ask, “Why do we learn how plants are green, and why don’t we learn how plants are red or yellow? How do we arrive at this particular question?” In class, we learn about accessory pigments that are supposedly responsible for this different coloration, but we never learn how or why they are useful. We accept the fact that they exist; similar to how we accept the reason why most plants are green. A goal in lab is to think more like a scientist and critique what we know–so if plants want to maximize their energy generation, why don’t they accept all of the light energy they can get, instead of reflecting away a certain part of the light spectrum?

The authors of the original penguin problem remind us that “reforming undergraduate science education cannot proceed by merely changing the content, or modifying the instruction, but rather must involve both reconceptualizing what it means to understand the content, and reframing the instruction accordingly.”⁴ The penguin problem can be used as a tool to teach a narrative of cellular respiration. Or, as Wilson and colleagues propose, it can be used as a tool to teach a way of accounting for and analyzing biological phenomena. Or, as this paper proposes, it can be used to teach students to find gaps in explanations, to find new questions to explore, and to disturb the sonnets and finished paintings.

The authors thank Dr. Chris Wilson, Dr. Philip Trathan, and anonymous reviewers at TPT and PERC for taking the time to provide thoughtful feedback on drafts of this manuscript.

1. D. Hestenes, M. Wells, and G. Swackhamer, “Force concept inventory,” Phys. Teach. 30, 141–158 (1992).
2. L. Ding, R. Chabay, B. Sherwood, and R. Beichner, “Evaluating an electricity and magnetism assessment tool: Brief electricity and magnetism assessment,” Phys. Rev. Spec. Top. Phys. Educ. Res. 2, 010105 (2006).
3. NSF Award No. 0243126.
4. C. D. Wilson et al., “Assessing students’ ability to trace matter in dynamic systems in cell biology,” CBE Life Sci. Educ. 5, 323–331 (2006), p. 326.
5. R. E. Scherr, H. G. Close, S. B. McKagan, and S. Vokos, “Representing energy. I. Representing a substance ontology for energy,” Phys. Rev. Spec. Top. Phys. Educ. Res. 8, 020114 (2012).
6. Interested readers may consult Wilson et al.’s original study for a detailed description of a teaching strategy to help students reason through these types of problems.
7. J. Steinbeck and E. F. Ricketts, The Log from the Sea of Cortez (Penguin, New York, 1995), p. 148.
8. C. L. Smith, M. Wiser, C. W. Anderson, and J. Krajcik, “Focus article: Implications of research on children’s learning for standards and assessment: A proposed learning progression for matter and the atomic-molecular theory,” Meas.: Interdiscip. Res. Perspect. 4, 1–97 (2006).
9. A. Zietsman and J. Clement, “The role of extreme case reasoning in instruction for conceptual change,” J. Learn. Sci. 6, 61–89 (1997).
10. D. Hestenes, “Who needs physics education research!?,” Am. J. Phys. 66, 465–467 (1998), p. 466.
11. D. Brookes and E. Etkina, “In search of alignment: Matching learning goals and class assessments,” AIP Conf. Proc. 1413, 11–14 (2012).
12. M. DiPenta, “Leaving the ‘one true path’: Teaching physics without single correct answers,” Phys. Teach. 69, 666–668 (2021).
13. P. N. Trathan et al., “The emperor penguin-Vulnerable to projected rates of warming and sea ice loss,” Biol. Conserv. 241, 108216 (2020).
14. N. G. Holmes, C. E. Wieman, and D. A. Bonn, “Teaching critical thinking,” Proc. Natl. Acad. Sci. U.S.A. 112, 11199–11204 (2015).
15. S. Humble, “Beware answers with questions,” Teach. Math. Appl. 24, 37–41 (2005).
16. S. McKagan, PhysPort (American Association of Physics Teachers, College Park, 2011), https://www.physport.org.

Tiffany-Rose Sikorski is an associate professor of curriculum and instruction at The George Washington University, and a former high school physics teacher.

Brandon K. Lee is a graduate of the Department of Exercise and Nutrition Sciences in The George Washington University Milken Institute School of Public Health.