About 10 days ago, Laura Clawson published a story on the profit-making standardized testing industry, spurred by the No Child Left Behind law, which has taken over our classrooms and imposed a regime of time-consuming testing, and even more time-consuming test preparation, on our public school children.
http://www.dailykos.com/...
Her example of the foolishness of this was a reading comprehension item on one of these companies' tests for 8th graders, an item that was absurd on its face. The reading passage was a modified version of a story by Daniel Pinkwater (a very funny guy who loves absurdity) -- this version was called "The Hare and the Pineapple."
Here's a link to the story and the test questions:
http://usny.nysed.gov/...
Comments in the diary focused on testing issues in general, but there was some serious (though not uncivil) disagreement about the quality of the item just cited. Since I was involved in a fair amount of the back and forth on this, I felt we needed some real expertise on whether this item was an example of decent or poor test writing.
My opinion was that this test item was poor. For example, one of the questions asked which animal spoke the wisest words, and the correct answer was the owl, who made the most concrete statement. Yet the owl's statement was so simplistic that it suggested the owl had no ability to understand abstractions (yes, I know, I'm referring to a talking owl here!).
Some commenters defended this item.
One commenter said:
In this story, the owl is proven right by the events of the story. That means in a question about who has the wisest words in this particular story, yes, the overly concrete answer is the wisest one.
My final answer in this debate was:
I'm going to send it to my sister and ask her for comments. I'm a survey researcher in background and know a good opinion question (or bad one) when I see it, but I'm no expert on questions written to test knowledge or ability.
She is. I'll ask her.
My sister wrote test questions at Educational Testing Service (in Princeton, NJ) for many years and is an expert. So I sent the example to her and I'm publishing her answer in full:
After reviewing the passage and questions with my professional ETS test developer hat on, I conclude that you have put your finger on the basic problem--the story is ironic and satirical, whereas the questions try to test the story's literal meaning. At ETS, we knew to avoid irony and satire in reading comprehension passages. It just doesn't work in a testing situation. Apparently, test developers at Pearson, the company that produced this passage, aren't aware of the pitfalls of irony and satire. Testing this passage's literal meaning results in unreliable, almost nonsensical questions. If Pearson had the kind of rigorous reviewing process that ETS has, this passage would have been thrown out during the reviews.
As for your question about the over-testing of kids--I think the No Child Left Behind law creates overuse and misuse of tests. I'm especially troubled by using tests to punish "failing" schools and to evaluate teachers. The tests aren't designed to do these things, and using them that way is destructive. The money aspect that you identify is particularly troubling--big corporations are making hundreds of millions of dollars that should instead be used to improve teaching. And I think you're right--these big companies, like Pearson, probably have questionable standards and procedures that aren't up to ETS standards. "The Hare and the Pineapple" is an example of the result.
She also said this about the question development process at ETS:
A lot of questions don't survive the reviews at ETS, and virtually all of the ones that do survive have been extensively revised and improved during the review process. When I worked at ETS in the 1980's and 1990's, every question for tests like the SAT and GRE was reviewed by 7 independent ETS staff members--5 different question-writers and 2 editors. The questions were then pretested (tried out, without counting towards students' scores). Only the questions that performed well statistically in the pretests (high-scoring students got them right and low-scoring students got them wrong) were used in final versions of the tests. The pretests tended to identify questions that had problems that even the 7 reviewers hadn't noticed. I doubt that "The Hare and the Pineapple" would have gotten into a final version if it had been pretested at Pearson.
I think her final statement points to one of the main problems here: over-testing has opened the door for unqualified and greedy companies to come in, create "standardized" tests, and make big profits at our expense. Not only are we over-testing our children, but we don't even know what all that testing really means.
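For anyone curious what "performed well statistically" might look like in practice, here is a rough sketch of one classic check, the upper-lower discrimination index: compare how often the top-scoring students got a question right with how often the bottom-scoring students did. This is only an illustration. The function name, the 27% default split, and the response data are all my own assumptions, not the actual procedure used at ETS or Pearson.

```python
def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower discrimination index for one question (hypothetical helper).

    responses: one row per student, each a list of 0/1 marks (1 = correct).
    item: column index of the question being checked.
    fraction: share of students placed in each of the top and bottom groups.
    """
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    # Proportion answering this question correctly in each group.
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Made-up pretest results: 6 students, 4 questions.
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]

for item in range(4):
    print(item, discrimination_index(responses, item, fraction=0.5))
```

A value near 1 means high scorers got the question right and low scorers got it wrong, which is what you want. A value near zero or below zero, like the last question in this made-up data, is the kind of signal that would flag a question for removal.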