Some writers I respect (like Diane Ravitch, Joe Bower, and many others) want large-scale assessment out of schools, period. None of it. Get it the heck out of this school. Thanks, so long, see you later, and let’s get down to the business of teaching and learning. I sympathize with that sentiment, but I don’t share it completely.
(Terminology note: I’m using the term “large-scale assessment” to refer to a standardized assessment administered to many students. Usually, these are state-developed or publisher-developed tests, like the SAT, ITBS, etc. Many folks call these “standardized tests,” but I think that term is a bit misleading, because any test a teacher gives in the same way across classes is “standardized.” So I’ll call them large-scale assessments.)
I sympathize because of the damage large-scale assessment has done and continues to do in schools. The demands of No Child Left Behind and Race to the Top pressure schools and districts to use large-scale tests, which in turn pressure districts and schools to focus more on what the large-scale tests measure, which pressures teachers and others to spend more time on what the tests measure, which reduces the amount of time students get to spend learning what the tests don’t measure, which narrows students’ ideas about what “important” learning is and changes how they relate to (and like) school, and in the end what the large-scale tests measure defines “learning.” The measurement ends up defining learning rather than describing it. (Others have described this process much more thoroughly than I have, such as Diane Ravitch in her excellent book, The Death and Life of the Great American School System.)
But I don’t completely share the “throw the bums out!” sentiment toward large-scale assessment, because I think districts and schools can USE these assessments rather than getting used by them. Large-scale assessments can be good for very specific purposes, and, like Liam Neeson, they have a very “specific set of skills.” If a district or school wants to compare assessment data (achievement, ability, whatever) to a national sample, then it has to use a large-scale assessment. All the statistical trappings that go along with a large-scale assessment (like percentile ranks, stanines, etc.) result from the way these assessments are built and maintained. Districts can use these data for useful purposes (they often did, in pre-NCLB days), and I think they can again.
But having acknowledged all these potential uses, the actual uses of large-scale assessment data are all kinds of out of whack right now. Let’s not even talk about the most egregious uses (like the really wacko teacher evaluation practices that rely on large-scale assessment results, and the “merit pay” value-added schemes encouraged by the Race to the Top program). All large-scale assessments claim to be “reliable and valid,” but validity is a score-use issue, not an inevitable “trait” of a test that gets bestowed and never challenged. Large-scale tests are built for specific purposes, and using their data in ways they were never designed for (like teacher evaluation) is a serious threat to validity.
Maybe a ruthless “cost-benefit” analysis would be useful. How much value would a district or school have to get from a large-scale assessment in order for the assessment to justify its existence? How much “gain” would have to result to justify the expense, hassle, stress, and unintended consequences of a large-scale assessment? I think most (all?) state accountability tests would fail this (non-large-scale, non-standardized) test.