A modest proposal…

I’m worried about the term “formative assessment.” The term refers to an important idea – using assessment data DURING learning to make a change. But so many people use the term so differently that I fear the important, core idea has gotten lost.

In my district, a decision was made quite a while ago to include the term formative assessment in online gradebooks. I understand why that decision was made, and well-meaning folks did it for good reasons. But one of the side effects is that, to students, “formative assessment” now means “an assignment that is worth fewer points.”

Dylan Wiliam is one of my favorite education researchers, and one of the people who “originated” the term formative assessment. A while back, someone on Twitter asked him “What’s the biggest mistake you made as a researcher?” and his reaction was fascinating:

I like this idea: maybe including the word “assessment” in the term “formative assessment” wasn’t a great idea in the first place. Here’s my modest proposal: let’s replace the term formative assessment with two terms: responsive teaching and student practice.

Responsive teaching could refer to teachers using assessment data (exit tickets, short quiz results, etc.) to make a change in their teaching. Student practice could refer to any time students use feedback to revise their work, try again, etc.

I don’t expect many folks to really stop using the term “formative assessment,” but I think the terms Responsive Teaching and Student Practice might be more descriptive and clear in most situations. Below is a slide I’ve used during discussions about all this – please feel free to steal!

(Note: thanks to Alex Bahe for letting me swipe a couple graphics)



An important confession.

I can’t believe I’m saying this, but one of the most important things I’ve read about education recently comes from the CEO of an educational technology company.

I usually approach educational pronouncements from ed. tech. companies with skepticism. That may not be fair – there are a lot of great teachers and smart folks who work in ed. tech. – but often what I see coming from those sources seems to be breathless excitement about something that will “save” education, “revolutionize” learning, or (worse) bring the “outdated industrial age education model into the 21st century.”

The underlying principles of teaching and learning don’t change much over time, and they don’t need “revolutionizing.” I love learning and talking about educational technology, but it’s just a tool like the other tools teachers and students use. There are many COOL tools (my favorite lately = PearDeck – I’m very excited to start exploring how to use that) but I think we should doubt anyone who claims that this or that new technology is “the answer” to supposed problems in education.

And that’s why I’m surprised at how much I love this “Confession and Question about Personalized Learning” from Larry Berger, CEO of Amplify (reproduced in Frederick Hess’s blog). In the confession, Larry Berger describes a “conversion experience,” and I think it’s remarkable and important.

Here’s my favorite quote: “Until a few years ago, I was a great believer in what might be called the “engineering” model of personalized learning, which is still what most people mean by personalized learning…I spent a decade believing in this model—the map, the measure, and the library, all powered by big data algorithms. Here’s the problem: The map doesn’t exist, the measurement is impossible, and we have, collectively, built only 5% of the library… So we need to move beyond this engineering model. Once we do, we find that many more compelling and more realistic frontiers of personalized learning opening up. Which brings me to the question that I hope might kick off your conversation: “What did your best teachers and coaches do for you—without the benefit of maps, algorithms, or data—to personalize your learning?”

That’s a BIG admission, and I admire Mr. Berger for overcoming what must have been considerable confirmation bias, etc. as he worked toward this realization. It would have been in his best interest to just keep going on the path of “engineering” personalized learning, but I think he’s absolutely right: the people who think that learning can be carefully engineered and “personalized” in the sense of “automation” will continue to struggle. John Dewey said that learning is inherently “relational,” that social interactions are integral to real educative experiences, and I think he’s right. I’m glad Mr. Berger found his way back to that idea, and I’m excited to see what he does with this realization.

Thinking and Acting and Mindset

I love Carol Dweck’s research on Mindset. Her perspective as a cognitive psychologist is very useful as we think about why students might “give up” on learning in specific contexts. Her team has solid evidence that a growth mindset is associated with good outcomes for students, and a “fixed mindset” isn’t a good sign for future learning.

[Side note: one of my favorite assessment authors, Rick Stiggins, wrote about an idea very close to what Dweck calls Mindset in the article “Assessment Through the Student’s Eyes” in 2007, the same year Dweck published Mindset. It’s one of my favorite classroom assessment articles, it’s inspiring, and it’s short. You should probably read that article rather than this blog post.]

But I wonder whether students who already HAVE a fixed mindset benefit from TALKING about mindset. The students I know who have fixed mindsets (and there are plenty of terms for this: losing streaks, learned helplessness, etc.) ended up thinking this way because of powerful past experiences: usually repeated assessment events that showed them failure. They learned through these experiences that they won’t succeed no matter how hard they try. They learned that their efforts don’t make a difference.

Here’s my thought: since students “experienced” their way into fixed mindsets, they probably need to experience their way “out” of fixed mindsets. If a student is convinced that “no matter what I do, it won’t help,” it won’t help to talk with them about mindset. I don’t think any kind of “pep talk,” or conceptual conversation about Dweck’s theory, or any “cognitive intervention” will help change their mind. They may need to SEE success, to EXPERIENCE and BEHAVE their way into a new way of thinking. A teacher (a teacher they trust and have a good relationship with) has to convince them to try one more time, then get feedback from that teacher, then USE the feedback as they try again, and SEE that they “got better.”

Some teachers and I started talking about assessment as a “loop,” and teachers or students USING feedback as “closing the loop.”

The idea of “closing the loop” is at the heart of what some researchers and educators call Formative Assessment. Unfortunately, the perfectly fine term “formative assessment” has been used SO often to mean SO many different things that I worry it’s turning into “edubabble.” And that’s sad.

I believe Dweck is right: the way we think about our abilities matters. But when we’re convinced we can’t, we may have to experience and behave our way to a growth mindset instead of just talking about it.

I Only Have One Question…

[Note: this blog post originally appeared on the Noba Blog]

In a 2006 article, Wylie and Ciofalo describe a technique called “single diagnostic items” that may be a great tool for teachers to use to gauge the impact of classroom demonstrations. Single diagnostic items focus on one important concept and “diagnose” student misconceptions about that concept. Imagine using a single item to determine what your students are learning! Wylie and Ciofalo define these items as “single, multiple choice questions connected to a specific content standard or objective. They have one or more answer choices that are incorrect but related to common student misconceptions regarding that standard or objective” (p. 4). Each incorrect response corresponds to a specific misconception about the concept, so a student’s answer identifies exactly which misconception that student holds.

I wanted to see how single diagnostic items worked in a real classroom so I asked an instructor of an introductory psychology class at a local small liberal arts college for permission to work with one of her classes. She and I decided to focus on the topic of working memory. The text for the course did not cover this topic thoroughly and the instructor had not yet discussed this topic with the class.

My experience with single diagnostic items in the classroom

After introducing myself and explaining the goals of the research project, I asked the class to respond in writing to the prompt: “In a few sentences, please briefly describe working memory.” Then I conducted a working memory demonstration: Students closed their eyes and mentally counted the number of windows in their house. After they finished, they closed their eyes again to “count the number of words in the sentence I just said.” After they finished this task, students indicated whether they had to use their fingers to count when I asked them about the number of windows in their house (none of the students raised their hands). Then I asked how many used their fingers to count the number of words in the sentence (almost all the students raised their hands). Then I projected a single diagnostic item on the screen:

Why do most people use their fingers when they count the words in the sentence, but not when they count the windows?

  • A. Windows are visual, and visual things are easy to process.
  • B. Most people are visual learners.
  • C. The windows are in long term memory, but the words are in short term memory.
  • D. Familiarity – I’m more familiar with my windows than I am the words in that sentence, so that task is harder.
  • E. I can picture the windows but I can’t picture the words, and that has something to do with it.
  • F. Working memory must process words and pictures differently.

Students then indicated their response to this item (using their cell phones and the website Poll Everywhere: http://www.polleverywhere.com/). We briefly discussed the diversity of their responses, shown here:

In our discussion the students pointed out that at least one student in the class chose each of the possible responses. We then looked at the frequency of the different responses: most students chose answer C (“The windows are in long term memory, but the words are in short term memory”) or answer E (“I can picture the windows but I can’t picture the words, and that has something to do with it”). We concluded that the data indicate the class doesn’t yet have a common explanation for why the word counting task required almost everyone to count on their fingers while the windows counting task did not.
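For teachers who export their poll results, the frequency check we did by eye can be sketched in a few lines of Python. The response letters below are invented for illustration; a real class’s data would come from the polling tool’s export.

```python
from collections import Counter

# Hypothetical poll responses, one letter per student; a real class's
# data would come from the Poll Everywhere export.
responses = ["C", "E", "C", "A", "E", "C", "B", "E", "D", "C", "F", "E"]

tally = Counter(responses)

# Check that every option was chosen by at least one student...
all_chosen = all(tally[opt] >= 1 for opt in "ABCDEF")

# ...and find the two most common answers.
top_two = [opt for opt, _ in tally.most_common(2)]
```

With real data, `tally.most_common()` gives the frequency table we discussed in class, and an option with a count of zero would suggest that a hypothesized misconception didn’t surface in this group.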

Then I explained the origin of the task: Baddeley and Hitch (1974) established that working memory is an active system made up of separate elements that deal with different kinds of information differently. To complete the “counting the windows” task, working memory first determines that the windows need to be pictured and then counted (the “central executive” function). Then working memory activates the element that handles words and numbers in order to count the windows (the “phonological loop”), and the element that can picture each window visually (the “visuo-spatial sketchpad”). When faced with the “count the number of words in the sentence I just said” task, the central executive encounters a problem: the phonological loop has to repeat the words in the sentence, but the visuo-spatial sketchpad can’t count, so most people have to use their fingers to complete the task.

After explaining the working memory research and terminology to the class, the students again wrote answers to the writing prompt “In a few sentences, please briefly describe working memory.” They also again used their cell phones to vote on the correct answer to the diagnostic item:

The class discussed these data and agreed that the memory demonstration and explanation changed their conceptions and understandings about the nature of working memory. Almost everyone in the class agreed in the end that answer F (“Working memory must process words and pictures differently”) was the most correct answer. We discussed the two previously most common answers (C and E), and the class was able to describe in what ways those responses were correct and incorrect.

Later I analyzed the students’ written responses to look for other evidence of changes in understanding of the working memory concept. I created a short rubric to use to score students’ pre and post writing responses:

Each student response was scored by me and by a colleague who did not know which responses were “pre” and which were “post.” These scoring data also indicate changes in students’ understanding of the working memory concept.
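As a sketch of this kind of pre/post analysis (all numbers below are invented, not the study’s data), here is one way to summarize rubric scores from two scorers, including a simple percent-agreement check between them:

```python
# Hypothetical 0-3 rubric scores for six students from two scorers;
# lists are aligned by student. Real data would replace these lists.
rater1_pre  = [0, 1, 0, 1, 2, 0]
rater2_pre  = [0, 1, 1, 1, 2, 0]
rater1_post = [2, 3, 2, 2, 3, 1]
rater2_post = [2, 3, 2, 3, 3, 1]

def mean(xs):
    return sum(xs) / len(xs)

def percent_agreement(a, b):
    """Fraction of responses the two scorers rated identically."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

pre_mean = mean(rater1_pre + rater2_pre)     # average pre score
post_mean = mean(rater1_post + rater2_post)  # average post score
gain = post_mean - pre_mean                  # change in understanding
agreement_pre = percent_agreement(rater1_pre, rater2_pre)
```

A positive `gain` is the kind of evidence of changed understanding described above; the agreement figure is a quick sanity check that the two scorers read the rubric the same way.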

Single diagnostic items like this one could be used to assess the effectiveness of the classroom demonstration about working memory. These “effectiveness data” could be used to decide which demonstrations are most effective and which need to be modified. The same data could serve multiple formative purposes: teachers can regroup students into discussion groups based on their responses and ask groups to process the rationale behind their answers. Heterogeneous discussion groups might be useful, with each student discussing their different answer and the group working toward a consensus conclusion. Teachers could take the two most common answers and use other classroom demonstrations/activities to address those misconceptions directly. All these formative uses of the assessment data share a common characteristic: data from this one item are used to focus specifically on student misunderstandings about this important concept. This focus on the misconceptions these students demonstrate addresses student thinking actively and directly. The assessment data inform instruction by the teacher and metacognition by the students.
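One of the formative uses above – regrouping students into heterogeneous discussion groups based on their responses – can be sketched as a simple round-robin deal. The student IDs and answer choices below are hypothetical:

```python
from itertools import cycle

# Hypothetical student -> chosen-answer mapping; real data would come
# from the polling tool's response export.
answers = {
    "S1": "C", "S2": "E", "S3": "C", "S4": "A",
    "S5": "E", "S6": "F", "S7": "C", "S8": "B",
}

def heterogeneous_groups(answers, n_groups):
    """Deal students across groups round-robin after sorting by answer,
    so students who chose the same answer tend to land in different
    groups."""
    groups = [[] for _ in range(n_groups)]
    ordered = sorted(answers, key=lambda student: answers[student])
    for group, student in zip(cycle(groups), ordered):
        group.append(student)
    return groups

groups = heterogeneous_groups(answers, 2)
```

Sorting by answer before dealing is the whole trick: consecutive students agree with each other, so the round-robin sends them to different groups, and each group ends up mixing as many answers as possible.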

How to Develop Single Diagnostic Items

Developing single diagnostic items does require teachers to invest time in the item development process, but it can save time in the classroom by efficiently providing valuable information about student misconceptions. One item-development process is described below:

  1. Gather teachers who teach the same or similar content. Writing single diagnostic items requires “deep” content knowledge, and it is best done with a group of experienced teachers.
  2. Choose a “big idea” to focus on. Single diagnostic items take a while to write, so the group should spend its time on an idea or concept that is a “big deal.” Some authors call these “hinge” or “threshold” concepts: ideas that students need to understand well in order to make progress in the discipline.
  3. Ask the group to list misconceptions about the “big idea” (another way to phrase this task is to ask “How do students go wrong about this idea?”). List all the misconceptions the group develops, then look at the list and collapse any similar ideas into appropriate categories.
  4. Write a stem for the single diagnostic item that will require students to use the “big idea.”
  5. Write options for the single diagnostic item: one option per misconception, plus at least one correct answer. Note: multiple correct answers can be included, but the group should end up with one (and only one) option for each misconception. Ideally, if a student chooses an incorrect option, teachers should be confident the student did so because they are laboring under that specific misconception.
  6. Test the item with real students. Participating teachers should use the item in class and ask students who choose an incorrect response WHY they chose that response, to test the relationships between incorrect options and misconceptions.
  7. Revise based on feedback.
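The product of steps 4 and 5 can be represented as a small data structure in which each incorrect option points at exactly one misconception. This is just one possible sketch, not the authors’ notation; the option texts come from the working memory item earlier in this post (abridged), and the misconception labels are invented shorthand.

```python
# A single diagnostic item as a data structure: each incorrect option
# maps to exactly one misconception, so a student's response
# immediately identifies what to address.
item = {
    "stem": "Why do most people use their fingers when they count the "
            "words in the sentence, but not when they count the windows?",
    "options": {
        "A": "Windows are visual, and visual things are easy to process.",
        "C": "The windows are in long term memory, but the words are in "
             "short term memory.",
        "F": "Working memory must process words and pictures differently.",
    },
    "correct": {"F"},
    "misconception": {
        "A": "conflates modality with processing ease",
        "C": "confuses memory stores with working-memory components",
    },
}

def diagnose(item, response):
    """Return None for a correct answer; otherwise return the
    misconception tied to the chosen incorrect option."""
    if response in item["correct"]:
        return None
    return item["misconception"][response]
```

Step 6’s student interviews are what validate the `misconception` mapping: if students who choose “C” give reasons unrelated to memory stores, the option (or the label) needs revision.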


Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47-89). New York, NY: Academic Press.

Wylie, C., & Ciofalo, J. (2006). Using diagnostic classroom assessment: One question at a time. Teachers College Record, Jan. 10, 2006, 1-6.


Let’s try to Intentionally Override…

This quote from Yana Weinstein (@doctorwhy) is important, I think. But I’ve been having trouble talking about it.


There’s a lot in this quote. Piles and piles of cognitive (and developmental) psychology indicate that humans use concepts and categories to understand the world. We just do. Our brains are meaning-making and pattern-finding machines. And it’s not a bad thing: generally, making categories is an efficient, adaptive way to operate in the world.

BUT when we deal with other people, our categorization habits can get in the way. Badly. Categories quickly lead to stereotypes, which can easily become prejudice, and if we have even a bit of power over someone else, prejudice becomes discrimination. And this all might happen without our awareness.

My favorite part of the quote is the last part: “The moral solution is to intentionally override the tendency to categorize individuals in the same way that we characterize other items that we encounter.” That sounds very matter-of-fact and clear, but underneath that statement is something profound and inspiring, I think. When we deal with our fellow human beings, the other folks in our human family, we need to try to consciously “override” what our brain wants us to do first: categorize someone. We should try NOT to judge their actions based on the categories and expectations built up from our past experiences. We need to stop those immediate thoughts, remember that we’re thinking about another human being, and do our best to resist the influence of our internal categories (and stereotypes).

I expect there are many, many studies that show how unlikely this is. I’m certain that it’s difficult, and it may even ultimately be impossible. But I don’t want to think about that yet. Maybe it’s worthwhile thinking and talking about how we might at least try. What can help us interrupt the quick flow toward judgment? How can we try to override?

You keep using the phrase “Honesty Gap.” I don’t think it means what you think it means…

Stumbled across this site recently. Argh.


The site is well designed, professional, and the statistics and analysis look convincing. And they get their NAEP score analysis almost exactly wrong. The purpose of the site is to argue for this claim:

“Frequently, states’ testing and reporting processes have yielded significantly different results than the data collected and reported by the National Assessment of Educational Progress (NAEP). The discrepancy between NAEP, the Nation’s Report Card, and a state’s claim is what can be described as an “honesty gap.”

They go on to provide state by state evidence for this supposed Honesty Gap:


The implication is that states are lying about student proficiency because their proficiency rates are so much lower than the NAEP proficiency rates. BUT this analysis, and this website, leave out a crucial detail (and I suspect they know about this detail and purposely fail to include it): the way the NAEP developers use the term “proficient” is VERY different from the way that term is used on the state achievement tests they are comparing NAEP scores to. Here’s a summary (from this Washington Post article) of this important difference:

“Oddly, NAEP’s definition of proficiency has little or nothing to do with proficiency as most people understand the term. NAEP experts think of NAEP’s standard as “aspirational.” In 2001, two experts associated with NAEP’s National Assessment Governing Board (Mary Lynne Bourque, staff to the governing board, and Susan Loomis, a member of the governing board) made it clear that:

‘[T]he proficient achievement level does not refer to “at grade” performance. Nor is performance at the Proficient level synonymous with ‘proficiency’ in the subject. That is, students who may be considered proficient in a subject, given the common usage of the term, might not satisfy the requirements for performance at the NAEP achievement level.'”

So the “Honesty Gap” argument falls apart before it begins: you can’t compare “proficiency rates” between the NAEP test and state tests, because the term “proficient” is defined very differently. I’ve seen this same argument on other sites (often promoting charter schools) and it’s just wrong wrong wrong. The only Honesty Gap the site argues for is their own dishonest use of NAEP data. Knock it off, please.

Your Summer Reading List: 5 Psychology Books To Add To Your Bookshelf

(originally posted at http://psychlearningcurve.org/summer-reading-list/ )


The summer is a great time to catch up on psychology reading! Here are five books teachers can use to update, add to, and “enliven” the research in their textbooks. And as a bonus: they are filled with entertaining stories and details to keep us all reading this summer!

Make it Stick: The Science of Successful Learning (Brown, Roediger, & McDaniel, 2014): Organized as a “course” on cognitive psychology applications for learning (e.g., distributed practice, retrieval practice, and interleaving). If we all read Make it Stick and How We Learn, I think we’d all be better teachers and students.

How We Learn: The Surprising Truth about When, Where, and Why it Happens (Carey, 2014): A summary of cognitive science research that SHOULD impact the ways we teach and study! Many non-intuitive findings, explained clearly and with great stories and practical examples. This is the “missing manual” for students and teachers, with explanations about how our memory system works, and implications for teaching and learning.

Incognito: The Secret Lives of the Brain (Eagleman, 2011): I think Eagleman is one of the most effective communicators of biopsychology research out there. He combines effective story-telling about early brain research with summaries of his and other current findings, and extends these discussions by explaining the implications of the research (his writing about how brain research should/could influence the legal system is challenging and provocative). Great examples and background for the Biopsychology chapter.

Thinking, Fast and Slow (Kahneman, 2011): I admit it: I’m not done with this book yet. I’m working my way through this very ambitious book slowly. Each chapter deserves quite a bit of time: Kahneman pulls together decades of research about cognitive biases, framing, prospect theory, and his overall metaphor of “System 1” and “System 2” thinking.

Crazy Like Us: The Globalization of the American Psyche (Watters, 2011): Excellent background for the disorders chapter. Provides background on cross-cultural research regarding psychological diagnoses, including multiple examples of what happens when American attitudes and thinking about psychological disorders gets “exported” to other cultures.

If you are looking for more suggestions about psychology books, TOPSS members Laura Brandt and Nancy Fenton have a great Books for Psychology Class blog where they share books that would be useful in an introductory psychology class. The Psychology Teacher Network newsletter also has regular book reviews.

Do you have other psychology books you recommend for summer reading? Please feel free to list suggested books in the comments below.


Evaluating research claims about teaching and learning: Using the APA’s Top 20 to think critically

(originally posted on: http://psychlearningcurve.org/evaluating-research-claims/)

Posted By: Rob McEntarffer, PhD February 15, 2016

What teachers and administrators need is a clear and concise way to evaluate claims made about teaching and learning before teachers are asked to implement “research findings” in their classrooms.

Picture a group of teachers at a professional development session. The speaker, a hired consultant who flew in for the presentation that morning, shows the teachers a graphic of what he calls the “Learning Pyramid.”

(source: Washington Post article, “Why the ‘learning pyramid’ is wrong”)

The speaker uses this graphic as evidence that teachers should change their instructional techniques, decreasing the amount of time they spend lecturing (since lecture is associated with a 10% student retention rate) and moving toward more interactive teaching strategies, like “teach others.”

Some teachers in the professional development session (which is, ironically, mostly a lecture) nod enthusiastically, but some teachers are troubled. Does this research really support the conclusion that all lectures are “bad” and all discussions are “good”? Based on this research being presented by this speaker, what are teachers being asked to accept and do?

In most professional development sessions, these teachers are left with such lingering concerns and doubts. The professional development might end at that point, with some teachers making changes while others ignore the advice. Teachers might be asked by administrators to explain how they implemented the “lessons learned,” but the underlying claims wouldn’t be questioned – just how they are put into action (or not).

Fortunately, the Center for Psychology in Schools and Education, within the American Psychological Association’s Education Directorate, produced a useful summary of the most important principles related to teaching and learning (those best supported by multiple research studies): the Top 20 Principles from Psychology for K-12 Teaching and Learning. Educators can use this resource as a starting place when evaluating claims about teaching and learning. If a claim seems to contradict one or more of the principles – if it doesn’t “fit” with the 20 principles described in this document – that can serve as a red flag for teachers and administrators, and the claim should be looked at carefully before it is accepted as valid and useful for teachers to implement.

Let’s take the “learning pyramid” as an example. The claim underlying the pyramid is that the method of delivery is the primary factor determining whether students retain the intended knowledge and skills. The first step in examining that underlying claim could be to check the Top 20 document. Principle 2 is the most immediately relevant body of research:

“What students already know affects their learning.”

The research summarized for Principle 2 indicates that one of the most important factors affecting student learning is students’ prior knowledge and their conceptions/misconceptions (not a specific delivery method, like lecture or audiovisual presentation). The field of educational psychology strongly supports determining students’ current thinking about a topic and using that information to help them grow in their understanding and skills. If the claims of the learning pyramid were true, the method of delivery would have to “trump” the influence of students’ current thinking about a topic, and that’s not what the research in this section points toward. If the teachers in this professional development session had access to the Top 20 document, they might have been able to question the consultant’s claims about the pyramid, which could have led to a more useful discussion (instead of a puzzling and disturbing experience). Teachers could discuss how they typically learn about students’ current thinking and conceptions/misconceptions regarding key concepts in their classes, and what they do with that information. The discussion might eventually include how they choose presentation methods based on what they know about students’ current thinking about a topic, and which presentation methods might be more appropriate or effective given students’ current conceptions.

I suspect that most schools’ goals include some language about helping students “think critically” or “analyze information” independently, in order to prepare them to be active citizens and consumers of information as adults. As educators, we need access to resources that empower us to think critically about claims made about teaching and learning. We encounter a large, constantly changing universe of advice about teaching and learning, and it is difficult to keep up with education research while doing our full-time jobs in schools. The APA’s Top 20 document can serve a vital role as an initial “filter” or “check” on claims made about teaching and learning.

We would love to hear about teaching and learning claims you’ve encountered in your educational contexts. How do you evaluate these claims when you encounter them? Do you see a role for the Top 20 document in “testing” claims about teaching and learning?


Taste the soup, evaluate the soup, but please, don’t burn it.

(originally posted on Medium)


This Edutopia tweet got me thinking about how formative/summative assessment processes are defined and perceived.

I’ve heard several metaphors for formative/summative assessment processes, and they are all pretty good: “Formative = check-up / Summative = autopsy,” and the one from Edutopia, “Formative = tasting the soup / Summative = eating the soup.” These metaphors can be useful, but they’re also a bit tricky because neither of them goes far enough: the process doesn’t “become” formative until someone USES the information to make a change (e.g., you use information from the check-up with your doctor to change your eating habits, or you add salt to the soup after you taste it). Similarly, the process doesn’t “become” summative until someone forms an evaluative conclusion based on the information (e.g., you figure out that a heart attack killed the patient, or you decide that the soup is good but too salty for your taste).

But the soup metaphor can get extended in a different direction, too: the way most teachers have to assign grades is to provide an overall, global “mark” (letter, number, etc.) to students over a period of time (quarter, semester, year). That’s a summative process, but a darn tricky one. Eating and evaluating the soup is a summative process, but most of us would probably say “the soup is good — the veggies might be a little big and chunky? And I could use less salt.” That’s a potentially useful, detailed evaluation. But teachers can’t do that. They usually have to give ONE overall mark to student work, which is like boiling down all the complexity and richness of the soup to a thin uniform paste on the bottom of the pan, then evaluating that. Yech.
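A toy example makes the “thin uniform paste” problem concrete: two students with very different skill profiles can collapse to the same single mark. The rubric categories and 0-4 scores below are invented for illustration.

```python
# Two students with very different skill profiles; category names and
# 0-4 scores are invented for illustration.
rubric_scores = {
    "student_a": {"ideas": 4, "evidence": 4, "mechanics": 1},
    "student_b": {"ideas": 3, "evidence": 3, "mechanics": 3},
}

def overall_mark(scores):
    """Boil a detailed profile down to one number: the mean score."""
    return sum(scores.values()) / len(scores)

mark_a = overall_mark(rubric_scores["student_a"])
mark_b = overall_mark(rubric_scores["student_b"])
# Both students end up with the same single mark, even though they need
# very different feedback.
```

The single mark hides exactly the information a teacher or student would act on: student A needs help with mechanics, while student B needs a nudge across the board.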

TEDx Lincoln Talk: The game of school isn’t the same as learning

Background: I got to give a TEDx Lincoln talk in Oct. 2015. It was a powerful experience, and I’m grateful for the opportunity.

Slides for the talk (slides created by the always awesome Chris Pultz!) 

Video of the talk (and yes, my darn tie is crooked the ENTIRE time) 

When I took my current job as a Lincoln Public Schools administrator, one of the first projects I wanted to work on was how to give good advice to teachers about classroom assessment. I decided to interview one of my former students, Alex, who was teaching high school students labelled “at risk.” I walked into his classroom after school with my set of carefully designed questions, opened my laptop, and was about to ask the first question when he politely interrupted me. He said, “Mr. Mac, I hope this doesn’t offend you, but I became a teacher because I realized that what I really learned in high school was how to get an A, and not much more than that. I want something better for my high school students.” He was talking about the game of school. Alex’s comment stopped me in my tracks, and it started me down the road of thinking about the differences between playing the game of school and real learning and real teaching. That’s an important difference. It’s time to rethink the game of school.


You all know the game of school, although you may not call it that: we play the game when we focus more on how to get by in a class, how to get an A, and what to do for points than on any of the LEARNING in the class. The words we often use reveal some of the elements of this game: “How many points is this worth?” “What’s my score on the test?” “What did she/he give you on that paper?” “Can I get any extra credit points?” Class rank, GPA, and a bunch of other school traditions reinforce the message that the goal of school is to accumulate points/grades/credits and win in the end. The game of school isn’t the same as learning, and it might get in the way. Every day in classrooms teachers help students who are convinced that they CAN’T do something, that they aren’t “math people” or “artists” or “writers,” and they get them to use some feedback and SEE that they CAN and DO get better at something important. Teachers help students turn losing streaks into winning streaks. Learning is an experience that changes you, changes the way you think about yourself and the world. This is a tough thing to measure well, but that doesn’t make it any less true. Comparing real learning to the game of school is like comparing love to the number of likes you got on Facebook.

A similar game of school gets played outside the four walls of the classroom too: the No Child Left Behind education reform act encouraged schools and districts to play very strange school games, all based on accumulating points and getting the high score. The premise of this game is that students, schools, and districts should be judged by single scores: averages on statewide achievement tests. If your school or district got below a certain score in a certain year, you were labelled “needs improvement.” No other information gets included in this label – just test scores.

Think about how strange this game is: that something as complex as teaching and learning should be judged by a single “score,” that these goals could be meaningfully represented by one overall grade or rank. Why would we ever think that’s a good idea? Measurement experts and other researchers don’t think it’s a good idea: they stress the importance of using a variety of data to make educational decisions. Sports fans don’t think it’s a good idea: think about the complex set of statistics and judgments people use when debating who the best player or team of a certain era is. I bet YOU don’t think it’s a good idea: if someone asked you to describe your favorite teacher, would you stop, look thoughtful, and simply reply “96.3% – an A”? We know that teaching, learning, schools, and districts are complex places doing complex work – why would we think that a single score can describe this complexity?


Playing the No Child Left Behind game of school has consequences. These are headlines from a single day, retrieved from Google News. These headlines are meant to scare us, to convince us that the sky is falling and that our schools in general are failing kids. But it’s all based on a game, and the game is set up to ensure that some schools will always be “needs improvement.” That makes great headlines. This “sky is falling” narrative can be paralyzing – we can become so fearful that we might avoid important decisions, and our vision gets “narrowed” because of the fear of evaluation. Teachers and schools might feel compelled to narrow their vision of education to only what is tested, and test preparation can edge out important learning experiences.

Learning isn’t a simple input/output system easily measured and “incentivized.” It’s not something that just happens as a nice side effect when people are seeking points and keeping score. It’s not like a video game. Teachers don’t get to “zap” learning “into” students. Teaching and learning are creative acts, requiring vulnerability, insight, emotions, and struggle. Learning is, literally, creating meaning. We connect what we learn to other experiences and insights, weaving new ideas in with previous notions. Teachers don’t “give” learning to students. Teachers use the medium of curriculum and set up contexts that help students create meaning.

We don’t have to accept this strange game of learning set up by NCLB and policies like it. We can rethink it. We don’t have to pretend that the game of school is the only thing that matters. Instead of just summarizing tests with one overall score or rank for a school or district, why can’t we tell a richer story? Why not tell several important stories about schools, including test scores but also graduation rates, college and career placement, student engagement, parent perceptions, effective use of funds, and innovative teaching? And perhaps schools and communities could be part of telling their OWN story, including their own strengths and weaknesses?

We can push back on the “sky is falling” mentality of NCLB, and acknowledge that real schools and real students need help solving real problems. We can remember, and remind other people, that tests can tell us important information, but they don’t paint the whole picture, and we shouldn’t pretend they do. I’m proud of my co-workers in the Assessment and Evaluation Department – whenever we can in “official testing letters” we tell students and parents that “it is important to remember that this score is not the whole you. A score isn’t the complete picture of who you are and who you want to become.”


My favorite quote about teaching and learning comes from John Steinbeck’s “On Teaching” (1955): “I have come to believe that a great teacher is a great artist … Teaching might even be the greatest of the arts since the medium is the human mind and spirit.” If we think about teachers and students as artists, and about how limiting and tough it is to measure and evaluate artistry, we get a sense of how humble and cautious we should be about “grading,” “ranking,” and “judging” teaching and learning.

I want to finish my story about the conversation with Alex. After he said that what he learned in high school was how to get an A, I took a deep breath, and I closed my laptop. I didn’t ask any of my carefully prepared questions. I started listening carefully, and I hope I never stop. It turns out that helping teachers and students means listening rather than starting with the presumption that you have the right advice to give them.

Learning gets created in classrooms, through educative relationships between teachers and students. I think that simple sentence describes a process that is human, important, beautiful, and one of the most vital parts of our lives. It’s not a game – it’s teaching and learning, and it’s too important to play with.