In the past week, an unbelievably stupid set of questions on a New York standardized test has made headlines. As a result, the state education commissioner has announced that the questions won't be counted toward students' official scores, but if you care about education, the concerns raised by these test questions can't end with "they don't count in New York." For one thing, these questions have been in use for years, in multiple states. While they won't count in New York, they have counted for many other students—and the teachers whose performance is judged by those students' test scores.
The questions at issue (PDF) were attached to a reading passage parodying the tortoise and the hare. In this one, a pineapple challenges a hare to a race, leaving other animals confused about who they should root for and whether the pineapple has a victory plan—a moose suggests that "The pineapple has some trick up its sleeve." When the race begins, the pineapple just sits there and is ultimately eaten by the animals, leading to the "MORAL: Pineapples don't have sleeves." The students then had to answer ambiguous questions such as why the animals ate the pineapple and which animal was the wisest.
"Pineapples don't have sleeves" is eminently quotable; the silliness of the passage and questions doubtless helped publicize it and get it looked at with a critical eye, but we can't let that same silliness obscure at least three major issues this episode highlights: Testing is big business bringing some corporations enormous profits, the tests that are so much a focus of education policy today are fallible, and the tests themselves are just the leading edge of how testing companies are making their way into the schools and defining the education kids get.
Testing is big business
The education commissioner of Texas, a Republican, recently said that:
“The assessment and accountability regime has become not only a cottage industry but a military-industrial complex. And the reason that you’re seeing this move toward the “common core” is there’s a big business sentiment out there that if you’re going to spend $600-$700 billion a year in public education, why shouldn’t be one big Boeing, or Lockheed-Grumman contract where one company can get it all and provide all these services to schools across the country.”
Texas has been at the forefront of the testing craze; in fact, testing was one of the things George W. Bush brought with him from Texas and pushed to a national level, through No Child Left Behind. In 2000, Pearson Education, the company that produces tests for Texas, "signed a $233 million contract to provide tests for Texas schools, and in 2005 they got another $279 million." In 2011, as Texas was slashing its education budget to the bone, Gov. Rick Perry's administration gave Pearson a $470 million contract "to come up with a new test that will hold Texas schoolchildren to a higher standard at the same time that budget cuts are forcing them into increasingly crowded classrooms."
But Texas isn't alone. Pearson is the company responsible for "pineapples don't have sleeves," and the size of those Texas contracts combined with the fact that the pineapples passage has appeared on tests in New York, Alabama, Arkansas, Delaware, and Illinois at a minimum should give you some idea just how lucrative the testing business is for Pearson and other testing firms. In fact, combined state spending on standardized tests went from $423 million in 2001 to $1.1 billion in 2008.
When educational policy is just coincidentally falling in line with something that very directly creates large corporate profits, it's time to stop and consider whether maybe the policy is being driven more by profit than by actual results.
The tests are fallible
Standardized tests tend to be treated as if their results are Truth, as if they shine a light into the learning and the teaching going on in classrooms and report back an unvarnished, non-ideological assessment of how students and teachers are performing not just on standardized tests but in their entire intellectual lives.
In fact, we know that, at least as applied to teacher performance, the tests have huge measurement error and are often sloppily applied. We know that the rise of high-stakes testing leads to cheating scandals. We know that adults struggle with the tests. We know that the appearance of rising scores in a district is often a product of changing student bodies, tweaks to whose scores are counted, or flat out making the tests easier (never mind whatever cheating goes on). We know that pineapples don't have sleeves.
Then there are the eyewitness accounts. Todd Farley, who worked in the testing industry for 15 years, writes that:
...the companies who employed me [were] willing to take huge shortcuts in developing tests because meeting a contract’s deadline was clearly more important than the quality of any assessment.
Last year I was amazed to see the management of a publishing company giving its test developers only four weeks to produce K-12 assessments for the Detroit Public Schools (a school system now bankrupt but then willing to pay millions to a testing company); later, however, that short time-frame looked like a leisurely vacation compared to breakneck pace the company next worked its employees at, when the staff was required to pound out more than 200 Common Core Standard tests over the next two months.
The questions about scoring tests are equally serious: Farley identifies a number of occasions on which thousands of test-takers have been given incorrect results, pointing out that:
...most of those errors were discovered only after a test-taker complained about a score, not when any company voluntarily disclosed the problem, raising questions about the legitimacy of every other test administered over the last 10 years.
Those are your multiple-choice tests, where there is at least theoretically a single correct answer. But multiple choice only measures a very limited range of knowledge and skills, and open-ended tests that assess higher-order skills have to be graded by someone. By whom, though? People paid $12 an hour to "read" 20 to 30 essays an hour? Tales of what that looks like are legion and make clear what a poor option it is. Recently there have been big claims about robo-graders being as effective as human graders. There's reason to question that conclusion:
The e-Rater’s biggest problem, he says, is that it can’t identify truth. He tells students not to waste time worrying about whether their facts are accurate, since pretty much any fact will do as long as it is incorporated into a well-structured sentence. “E-Rater doesn’t care if you say the War of 1812 started in 1945,” he said.
Mr. Perelman found that e-Rater prefers long essays. A 716-word essay he wrote that was padded with more than a dozen nonsensical sentences received a top score of 6; a well-argued, well-written essay of 567 words was scored a 5.
To summarize, pineapples aside, tests include a lot of error, of the measurement error kind and the scoring error kind. They're written under time and financial pressure. They tend to produce cheating. If they're not multiple choice, there are huge issues with how they're graded. Yet they keep being treated as if they're infallible documents that have dropped from the sky instead of flawed ones created for profit.
The test's influence on schooling doesn't begin and end on test day
If everyone's future is, in some measure, riding on a test, schools will teach to the test. That means students don't learn math, reading, history, science. They learn how to do well on the specific test their school district has contracted with a testing company to provide. Examples of this abound. Jeff Nichols and Anne Stone, New York City parents who are opting their son out of standardized testing, write that:
Because so much is riding on these tests, the curriculum at our 3rd-grader's school has been distorted dramatically. There is no music, science, or gym teacher; art has been suspended since December so that there can be extended hours for test prep. Our son's homework for months has consisted of practice tests; the main function of school seems to be to teach him to read passages of little or no literary merit and then decide which of four possible answers to equally insipid questions is the "right" one. In math, our son brings home dreary worksheets day after day, asking the same kinds of questions 100 different ways.
Pearson and other testing companies don't just make and sell tests, by the way. They also make and sell "teaching materials," and districts that are using tests by a company often also buy its "teaching materials." What better way to be sure your students are prepared for the test they'll be taking? Yet schools on military bases, where standardized testing is deemphasized and doesn't control the curriculum, outperform traditional public schools and have a narrower racial achievement gap. Similarly,
A study published in the journal Science Education in December 2008 looked at two sets of high school science students. One set “sprinted”; the other set had teachers who slowed down, went deeper, and did not cover as much material. The results? The first group of students actually scored higher on the state tests at the end of the year. This is not surprising, as their teachers covered more of the test material. I am sure it made their parents, teachers, and administrators happy. What is more interesting, however, is that the students who learned through the slower, in-depth approach actually earned higher grades once they made it to college. This, too, is not surprising. These students were taught to think critically.
Cases like these are why opting out is becoming an increasingly popular choice.
The opt-out movement
While policy leaders continue pushing testing and signing multimillion-dollar contracts with Pearson and its ilk, people on the ground are revolting. In Texas, more than 360 school boards have by now passed a resolution:
...that says an “over reliance” on standardized high stakes testing is “strangling our public schools and undermining any chance that educators have to transform a traditional system of schooling into a broad range of learning experiences that better prepares our students to live successfully and be competitive on a global stage.”
A National Testing Resolution based on the Texas resolution was written by:
Advancement Project; Asian American Legal Defense and Education Fund; FairTest; Forum for Education and Democracy; MecklenburgACTS; Deborah Meier; NAACP Legal Defense and Educational Fund, Inc.; National Education Association; New York Performance Standards Consortium; Tracy Novick; Parents Across America; Parents United for Responsible Education - Chicago; Diane Ravitch; Race to Nowhere; Time Out From Testing; and United Church of Christ Justice and Witness Ministries.
It has been signed by dozens more groups and thousands of individuals. Given how thoroughly policymakers remain bought into high-stakes but unproven standardized tests, a widespread movement opposing and opting out of reckless testing has become a necessity.