The insanity of our high stakes testing policy

by teacherken

Thursday, Mar. 01, 2007 at 4:24:47pm PST

The more any quantitative social indicator is used for social decisionmaking, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

The foregoing is known as Campbell's Law, and it first appeared in print in Campbell, D. T. (1975). Assessing the impact of planned social change. In G. Lyons (Ed.), Social research and public policies: The Dartmouth/OECD Conference. (Chapter 1, pp 3-45). Hanover, NH: Dartmouth College, The Public Affairs Center. (p. 35). Serious social scientists have thus been aware for three decades of the dangers of things like high stakes tests, which makes our excessive use of them (in NCLB and elsewhere) ridiculous, counterproductive, if not downright insane. And it is why a book officially out next week is so important. The book, by David Berliner and Sharon Nichols, is entitled Collateral Damage: How High-Stakes Testing Corrupts America's Schools.

David Berliner is the Regents Professor and Dean of the College of Education at Arizona State, and one of the most eminent education researchers in America. He has served as President of the American Educational Research Association and of the American Psychological Association's Division of Educational Psychology. He was coauthor, along with Bruce Biddle, of The Manufactured Crisis: Myths, Fraud, and the Attack on America's Public Schools, which totally took apart A Nation at Risk, the Reagan-era screed about the failing nature of our public schools. His coauthor, Sharon Nichols, is an Assistant Professor at the University of Texas at San Antonio. Their new book is derived from work previously released in PDF form and entitled The Inevitable Corruption of Indicators and Educators Through High-Stakes Testing (which is an almost 200 page PDF available for free from The Educational Policy Studies Laboratory at Arizona State). Let me quote the ten categories of corruption wrought by high stakes tests, each with one of the hundreds of examples identified in this earlier work:

Administrator and Teacher Cheating: In Texas, an administrator gave students who performed poorly on past standardized tests incorrect ID numbers to ensure their scores would not count toward the district average.

Student Cheating: Nearly half of 2,000 students in an online Gallup poll admitted they have cheated at least once on an exam or test. Some students said they were surprised that the percentage was not higher.

Exclusion of Low-Performance Students From Testing: In Tampa, a student who had a low GPA and failed portions of the state’s standardized exam received a letter from the school encouraging him to drop out even though he was eligible to stay, take more courses to bring up his GPA, and retake the standardized exam.

Misrepresentation of Student Dropouts: In New York, thousands of students were counseled to leave high school and to try their hand at high school equivalency programs. Students who enrolled in equivalency programs did not count as dropouts and did not have to pass the Regents’ exams necessary for a high-school diploma.

Teaching to the Test: Teachers are forced to cut creative elements of their curriculum like art, creative writing, and hands-on activities to prepare students for the standardized tests. In some cases, when standardized tests focus on math and reading skills, teachers abandon traditional subjects like social studies and science to drill students on test-taking skills.

Narrowing the Curriculum: In Florida, a fourth-grade teacher showed her students how to navigate through a 45-minute essay portion of the state’s standardized exam. The lesson was helpful for the test, but detrimental to emerging writers because it diluted their creativity and forced them to write in a rigid format.

Conflicting Accountability Ratings: In North Carolina, 32 schools rated excellent by the state failed to make federally mandated progress.

Questions about the Meaning of Proficiency: After raising achievement benchmarks, Maine considered lowering them over concerns that higher standards will hurt the state when it comes to No Child Left Behind.

Declining Teacher Morale: A South Carolina sixth-grade teacher felt the pressure of standardized tests because she said her career was in the hands of 12-year-old students.

Score Reporting Errors: Harcourt Educational Measurement was hit with a $1.1 million fine for incorrectly grading 440,000 tests in California, accounting for more than 10 percent of the tests taken in the state that year.

Many of you may have read some of these horror stories. Many of us in education have lived through some of them. Some might dismiss this as the inevitable bumps and bruises in the process of rapid expansion of our national program of testing. Except that we have long-standing evidence of how prevalent the problem is in any program of high stakes tests.

Let me quote, with permission from Dr. Berliner, the text of a posting he made recently on an educational list in which the forthcoming book was being discussed, and note especially where I have placed part of the message in bold:

We hope that you read the book.
We never say that tests are evil. We agree that indicators are needed for society to function. We are not against accountability. But there are alternatives to high-stakes testing as educational indicators and we think those are the ones that need to be used to judge the performance of our kids and schools.

We have the 1,298 year old high-stakes civil service exams of China as a guide. The conclusion of recent scholarship on that long history is that corruption of those who administered the tests, and cheating by those who took it, was a natural accompaniment to its administration. High-stakes corrupts. Always. Inevitably. That seems to be the lesson of history in many fields. If the indicator is crimes solved by police, solution rates go up if police are judged on this indicator. This happens by having petty criminals confess to crimes they didn't commit, for reduced sentences. If the number of children seen by social workers is how they end up being judged, then those numbers go up and the death of some children is assured as social workers spend too little time in home visitations. We read about these cases every few months. If the indicator that is valued is publications in institutions of higher education, then publications go up--some are nonsense, many are restatements of other work packaged to appear as if the work were different, and some publications make use of fake data. The stakes are high, so everything is corrupted.

We argue through case after case of the ubiquity of Campbell's law in education and other fields of endeavor. In our book we focus in on the (literally) thousands of examples of corruption and distortion due to high-stakes testing in education. We amply document the test prep programs that undermine interpretations of validity, we note the easing of time restrictions by teachers who hope their kids will get another item or two correct, we report on the selective dropping of low ability students by administrators seeking bonuses, we document the narrowing of the curriculum and the undermining of teacher morale through the high-stakes testing.

We conclude this sweep through the problems of high-stakes testing by pointing out that the present use of high-stakes testing in education makes every member of AERA, NCME, and APA, violate their own standards for professional behavior. We also offer alternatives to high-stakes testing. Ours is a narrowly focussed book. We are not arguing about standards, the silly 100% proficient criteria, the undefined "highly qualified" teacher part of the law, the rip-off that seems to appear in supplementary services, the lack of science behind reading first, and so forth. We make one point and one point only: High-stakes testing in education has been, is, and will continue to be a failure. I hope you will take a look at what we think is a convincing argument.

In my electronic discussions with Dr. Berliner in seeking permission to quote the above, he remarked that

We wanted this book to be read by congressional staffers and I am presenting it at Politics and Prose in DC in mid March. I wish we could get CNN interested in covering that. We are also trying to get a meeting with staffers arranged as well. But that is harder to do.

He is concerned with reaching congressional staffers because NCLB is up for reauthorization, and staffers are often the key to influencing members of Congress, who often have little time to devote to any one issue. Influence the key staffer and you are influencing the Member or Senator.

Let me offer a few remarks of my own.

Test prep: I taught classes for two years for Princeton Review and tutored for two additional years, one with them and one with another firm. I have seen the impact such preparation can have. That merely exacerbates the class inequities already reflected in SAT scores, because such prep costs more than most lower SES families can afford. Further, some of these firms have now moved into offering prep for AP exams, and for high stakes high school graduation tests such as Standards of Learning in Virginia and High School Assessments in Maryland.

We are clearly seeing the narrowing of the curriculum to what is being tested. This narrows curriculum within the subjects tested and limits the exposure of students to subjects not being tested, such as music, art, and, increasingly in light of NCLB, social studies and science. Both of these are detrimental to real learning, and also to invoking and maintaining the level of student interest necessary for real learning to occur.

Resources are being removed from quality instructional materials and shifted instead to tests, test preparation, remediation for students who do not do well on such tests, and so on. Instead of these resources being directed to personnel and material within the schools, which operate on a non-profit, public service basis, much of this funding now goes to for-profit educational service organizations and publishers, with the inevitable overhead costs, profits for shareholders and bonuses for staff. This has the effect of starving the schools of resources that could make a difference.

NCLB does not require that the tests have high stakes for students. We have seen, however, that in situations where there are no direct stakes for children, they often blow off the tests, if for no other reason than that they are tired of all the testing. As a result, some jurisdictions have decided to use tests that do have high stakes for students as indicators for NCLB purposes. But such a use of a test is improper, because if a test is designed to allow you to draw valid inferences about a student's performance it does not allow you to draw valid inferences about the school effects. The professional organizations in testing and educational measurement and psychology all make this very clear, and yet educational policy makers and legislatures seem to ignore the professional advice.

Let me offer a comparison to what I have written in the last sentence above, about the policy makers not listening to the professionals. Might I point out that in our invasion and subsequent occupation of Iraq, the policy makers in this administration chose to dispense with more than a decade's worth of detailed planning not only for invasion but for the aftermath of the invasion, and as a result we have had the horrible consequences of thousands upon thousands of unnecessary deaths, continued disorder, a deteriorating economy as well as lack of security. While physical lives may not be lost as a result of the idiocy of policy makers plunging ahead on high stakes tests despite what professionals have tried to advise, our fixation on testing is causing widespread damage, to students, to the health of public education, to the quality of our teaching profession.

I want to repeat what I bolded from Dr. Berliner's posting, just to engrave it in your memory: We have the 1,298 year old high-stakes civil service exams of China as a guide. The conclusion of recent scholarship on that long history is that corruption of those who administered the tests, and cheating by those who took it, was a natural accompaniment to its administration. High-stakes corrupts. Always. Inevitably.

If you care about education, I want you to educate yourself about this issue. You could begin with the PDF of the earlier work, and you can get the book when it comes out next week. You should communicate your concern about the issue to anyone of influence you can, be they school board members, governors, Congressmen, superintendents.

The forthcoming book has been highly praised by people I admire. Let me share with you some of the blurbs:

"Nichols and Berliner provide a hard-hitting and thoughtful critique of today’s overreliance on high-stakes testing. This is a must-read for anyone concerned about the unintended consequences of education reform." --Paul D. Houston, Executive Director, American Association of School Administrators

"The cumulative impact of the accounts Nichols and Berliner lay out before us is staggering. They punch it home: The moral impact of NCLB may be as dangerous as its educational effects." --Deborah Meier, Senior Scholar, New York University, and author of In Schools We Trust

"Collateral Damage delivers a healthy dose of hard truth. It should be required reading for policymakers and concerned citizens." --Jeannie Oakes, Presidential Professor and Director, UCLA’s Institute for Democracy, Education, and Access

"Nichols and Berliner provide a carefully reasoned analysis laced with frightening accounts drawn from public schools. This readable volume eviscerates the premise that our schools can be evaluated with a single indicator. If you care about public schooling, this book is essential." --W. James Popham, Professor Emeritus, UCLA

The term Collateral Damage is one with tragic military connotations. If we seek to bomb a restaurant to kill, say, Saddam Hussein, and in the process destroy a mosque and kill several dozen innocent civilians, that destruction and carnage is considered collateral damage. We have done a great deal of collateral damage in Iraq and Afghanistan. It is the contention of Nichols and Berliner that our application of high stakes testing is doing devastating collateral damage upon our schools and the education of our children. To me that is clear evidence of the insanity of our policy of high stakes testing.