Instruction That Measures Up: Successful Teaching in the Age of Accountability
by W. James Popham
(ASCD, 2009)

Reviewed by Kenneth J. Bernstein, NBCT
High School Government & Social Studies (MD)



Teacher Leaders Network

This is cross-posted from the Teacher Leaders Network, for which I wrote it. It has also been picked up by Education News and will be featured in their daily email tomorrow.

This is a book by one of America’s acknowledged experts on assessment: now emeritus from UCLA, Popham has been a leading figure in research (having served as President of the American Educational Research Association), in publication (of his many books and articles, and as editor of a major journal on evaluation), and as a person whose opinions on matters educational are always worth considering. For those interested, you may read a professional bio here.

Popham has in recent years been critical of how our educational policies have approached the matter of tests, assessment and evaluation. This book therefore may catch some off-base, because Popham now moves beyond criticism to try to help those in the classroom deal with the reality of test-based accountability, something to which the current national administration has made clear its commitment.

The purpose of Instruction That Measures Up can be clearly understood from one paragraph in the preface, which appears on p. 2:

I believe the best way for teachers to deal with test-based accountability pressures — the way that benefits students  — is to accept those pressures as a given, then plan and carry out instruction knowing that it will take place on an accountability-spotlighted stage. What teachers must do is focus on providing instruction that measures up: to the expectations of administrators, parents, and taxpayers; to their own professional standards; and, most essentially, to the needs of their students.

The book is divided into 7 chapters:

   1. Teaching Through an Assessment Lens
   2. A Quick Dip in the Assessment Pool
   3. Curriculum Determination
   4. Instructional Design
   5. Monitoring Instruction and Learning
   6. Evaluating Instruction
   7. Playing the New Game

There is also a two-page list of resources, an index, and some background on the author, for a total of 174 pages.

For those who are not all that knowledgeable about matters of assessment — which not only includes many of those in the classroom, but far too great a percentage of those involved with making educational policy — the second chapter by itself justifies the book. Popham divides the "Assessment Pool" into four broad categories: Testing as Score-Based Inference Making; The Core Concepts of Assessment; The Categories of Educational Tests; and The Summative and Formative Functions of Educational Assessments.

He provides clear explanations of the meaning of terms. Where necessary, he offers a great deal of detail with easy-to-comprehend explanations. A reader who pays attention will begin to grasp the importance of how psychometricians understand key terms.

The four core concepts Popham addresses are the key ideas of Reliability, Validity, Assessment Bias, and Instructional Sensitivity. Any assessment that fails to take into account these core concepts, whether it is designed by a classroom teacher for instructional purposes or by outside organizations and imposed from above for purposes of accountability, will by definition be at risk of being unable to provide sufficiently accurate information to allow one to draw appropriate inferences from the data.

Popham offers a number of blunt statements about problems with our current schemes of assessment, and a number of key warnings, of which one on p. 29 caught my attention: "It serves no one to ascribe unwarranted precision to educational tests." Unfortunately, our national obsession with numbers and our desire to rank and compare means that it is precisely ascribing too much precision to the data we obtain from tests that has been distorting much of our educational policy for the past several decades.

Each of the chapters has some important material at the end. We have a "Chapter Check-Back" in which key concepts are repeated in summary form, as well as a list of several "Suggestions for Further Reading" on the topic of the chapter. The chapter on Monitoring Instruction and Learning has four key concepts, among which is this:

Feedback to students is most effective when it is task focused, directive, timely and simple; it is least effective when it comes in the form of grades. (p. 125)

Among the suggested readings are works by notable names such as Rick Stiggins, Robert Marzano, Grant Wiggins and Jay McTighe, and Popham himself, as well as some valuable works by lesser-known lights. Each suggestion is accompanied by a brief explanation by Popham as to why the work is included: for example, about the address offered by D. A. Frisbie as outgoing president of the National Council on Measurement in Education, Popham tells us

Frisbie lays out a set of basics in educational assessment – concepts that he feels have been distorted in recent years. The article gives teachers a list of important misconceptions to avoid. (p. 51, italics in original)

Many of the chapters also contain political cartoons. Through these Popham pokes fun at a lot of rhetoric commonly encountered in current discussion on educational policy. The final cartoon on p. 160 portrays a pre-test pep rally, with a sign in the background reading:

Tomorrow’s Accountability Test
-- Cost teachers their jobs.
-- Close our school.
-- Destroy your future!

The speaker, apparently the principal, is urging the assembled to "Try harder, and harder, and harder!" while one teacher in the audience comments, "I can see why Confucius said, 'High stakes are for string-beans.'"

Popham is for PROPER use of assessment. He quotes three sentences from Dylan Wiliam of Britain. The middle sentence reads:

It is only through assessment that we can find out whether what has been taught has been learned.

That, Popham says, is "one I’d like to see in neon lights above the entrance to every school in the world." (p. 101)

This is a book Popham intends to be of practical use to teachers. One may not agree with all his formulations -- this reader had some questions about the approach Popham offers as the structure of an effective lesson. Nevertheless there is a great deal of insight and practical advice. If nothing else, readers should come away with a deepened understanding of the terminology, and of the appropriate uses and inappropriate misuses of assessment of various kinds.

Before some final remarks, allow me to share a number of very brief selections which will give you a real sense of Popham and his approach:

...rarely can anyone look at a planned instructional activity and say for certain that it’s going to be effective. (p. 18)

Still, few educators, though seemingly awash in an ocean of test-based accountability, currently recognize how few accountability tests are even mildly sensitive to the quality of a teacher’s instruction. (p. 38)

It is far better for students to master a modest number of truly potent, large-grain curricular aims than it is for them to superficially touch on a galaxy of smaller-grain curricular aims. (p. 61)

Teachers must always concern themselves with what’s best for their students. (p. 70)


Remember, instructionally insensitive accountability tests are essentially insensitive to instruction, meaning what a teacher emphasizes in class is probably not going to make a substantial difference in students’ scores on an instructionally insensitive accountability test. (p. 71)

Self-reflection is a teacher’s ally. (p. 114)

...it is fundamentally wrongheaded to try to use a test to help students monitor their own learning while, at the same time, using the results of that test to grade or rank those students. (p. 119)

...formative assessment’s focus is on getting students to learn, not outperform other students. (p. 120)

...although many of those earlier researchers set out on a quest for a silver bullet that would permit the definitive appraisal of a teacher’s competence, such a bullet was never found. It still hasn’t been.

The insuperable obstacle to the creation of a sure-fire, cookie-cutter approach to teacher evaluation is teaching’s profound particularism. (p. 145)

And one final quotation, that may help summarize Popham’s thinking:

So, as someone who’s been dipping in and out of the teacher evaluation research literature for more than 50 years, I’ve come to a conclusion about the only truly defensible way to evaluate a teacher’s skill. Because of the inherent particularism enveloping a teacher’s endeavors, I believe the evaluation of teaching must fundamentally rest on the professional judgment of well-trained colleagues. (p. 146)

Ultimately this is a book about teaching. It is presented through the lens of a deep understanding of what assessment and evaluation can contribute to the improvement of teaching practice, as well as some serious cautions, offered throughout the work, about the dangers of pushing the instruments we have beyond the limits of the valid information they can provide us. Or rather, the reliable information from which we are able to draw valid -- even if often limited -- inferences.

I come away from the book agreeing with Popham that teachers should insist on getting a better grounding in assessment and evaluation -- which is about far more than testing but which should thoroughly cover matters of testing -- as part of their professional development. It’s best if it is part of teacher preparation, but it is not too late as part of continuing education for those already in the classroom.

This is a valuable book. It is for and about teachers, but can be profitably read by anyone interested in improving teaching by the proper understanding and application of assessment. That should mean everyone, for we are all affected by educational policy, even if only through decisions made on how to spend the taxes we all pay.

I highly recommend this book.

Kenneth J. Bernstein is a National Board-certified teacher of social studies at Eleanor Roosevelt High School in Greenbelt, Md., and a member of the Teacher Leaders Network. He is nationally known as a blogger on education and other issues under his online name of teacherken. Bernstein is also a 2010 recipient of The Washington Post’s Agnes Meyer Outstanding Teacher Award.

That's the review.  Hope y'all find it useful.

Peace.

Originally posted to teacherken on Thu Sep 16, 2010 at 08:20 PM PDT.

  •  hadn't done a diary today (23+ / 0-)

    thought this might be of some interest.

    Could just have put up a link but thought I'd save people some time.

    Peace.

    "what the best and wisest parent wants for his child is what we should want for all the children of the community" - John Dewey

    by teacherken on Thu Sep 16, 2010 at 08:20:01 PM PDT

  •  Let me know if I have interested you (7+ / 0-)

    or anything else you might like to share.

    Peace.

    "what the best and wisest parent wants for his child is what we should want for all the children of the community" - John Dewey

    by teacherken on Thu Sep 16, 2010 at 08:33:04 PM PDT

  •  I've always asked for a simple requirement (16+ / 0-)

    When they pass standardized testing and impose such rigorous, inflexible systems on students, there should be a provision in the law which requires that every elected official in the state must also take the tests (e.g., the ones required to graduate high school).

    The tests should be administered under the exact same conditions that students take them, and every politician's score should be made public as soon as they're available.  This would have two useful effects:

    (1) it would make legislators think twice about mandating such tests; and

    (2) it would give the voting public a very revealing look at the (true?) intelligence and knowledge levels of their politicians.

    I really wish that some group(s) would push for a requirement such as this through a petition initiative.  It would be very hard for most politicians to vote against it, as it would suggest that they were afraid of being tested.  And the result, I have to believe, would lead to many of the truly stupid ones (disproportionately Republicans) being exposed to deserved ridicule.

    Yet it is not our part to master all the tides of the world, but to do what is in us for the succour of those years wherein we are set... -- Gandalf

    by dnta on Thu Sep 16, 2010 at 08:39:33 PM PDT

  •  glad a few are finding this (6+ / 0-)
    Recommended by:
    mataliandy, TexMex, Heiuan, JanL, luckydog, LWelsch

    past my bedtime.  I get up at 5 or the cats get me up.

    Catch up with any additional traffic then.

    "what the best and wisest parent wants for his child is what we should want for all the children of the community" - John Dewey

    by teacherken on Thu Sep 16, 2010 at 08:56:34 PM PDT

  •  I used to teach for Princeton Review (9+ / 0-)

    It was about 50% knowledge and 50% how to beat the test.

    The lessons were good and transferable; I aced the LSAT and the Bar Exam.  That said, it is easy to teach the test and hard to teach for real.  

    Republican Platform = Fear, Anger, and Hate. Oddly their God preached against those very traits.

    by TexDemAtty on Thu Sep 16, 2010 at 09:56:25 PM PDT

  •  nice to see (6+ / 0-)

    an addressing which talks to the importance of qualitative metrics.

    Thanks for the great book review.

    "a lie that can no longer be challenged becomes a form of madness" -Debord

    by grollen on Thu Sep 16, 2010 at 10:11:09 PM PDT

  •  I always had the radical idea.... (5+ / 0-)
    Recommended by:
    Reino, Heiuan, debedb, LWelsch, TexDemAtty

    ... that the exams for a given course should be written by the teacher giving the next course, the course for which this course is a prerequisite.

    Because that's the person who actually gets directly impacted if the students fail to learn what they were supposed to.  "Do these people know what they're supposed to know coming in?" is a question that teacher will care deeply about.

    Becomes hard to apply for "capstone" courses though.

    -5.63, -8.10. Learn about Duverger's Law.

    by neroden on Thu Sep 16, 2010 at 11:54:55 PM PDT

  •  Thanks for the wonderful peek into your important (5+ / 0-)
    Recommended by:
    Reino, Heiuan, JanL, LWelsch, luckylizard

    and often unjustly maligned profession.

  •  I've always been (2+ / 0-)
    Recommended by:
    Heiuan, JanL

    averse to anything having to do with numbers, including the results of standardized tests and statistics.  This book sounds like something that I could easily understand.  It's now on my wish list, the one that I'll start filling when I get another job.  :-)  Thanks, Ken!

    -7.62, -7.28 "Hold fast to dreams, for if dreams die, life is a broken winged bird that cannot fly." -Langston Hughes

    by luckylizard on Fri Sep 17, 2010 at 01:58:14 AM PDT

  •  This sounds interesting (1+ / 0-)
    Recommended by:
    teacherken

    Usually, from what I've read, all kinds of bias are lumped in with validity.  And, if the test is intended to be one of teachers' abilities, so would sensitivity.  But I can see why they would merit separate sections of a book like this.

    Usually, reliability is defined as "whatever it is that this test measures, does it do so in a way that is reproducible and consistent?"  Two parts of reliability are test-retest reliability (if you give a test twice, with not too long a break between, will people get similar scores?) and internal consistency (do all the items on the test relate to each other?)

    Validity, on the other hand, is "does this test measure what it purports to measure?"

    Thus, a test of (say) reading ability  which is biased against (say) tall people, would be less valid; and a test intended to measure teaching but which did not do so would also be less valid.

    I would add to the list of problems that you list from the book an additional one: It is very hard to measure student ability.  I am aware of NO tests, or assessments of any sort, that have very high test-retest reliability.  Nor are the correlations between ANY form of assessment and the things they purport to measure all that high.  But as hard as it is to measure student ability, it is even harder to measure teacher ability.

    This is not just for standardized tests, or bubble tests, or whatever, but for ANY sort of test.  There's been research, for instance, where teachers were given essays to grade, and then, a month later, given the same essays - and reliability wasn't all that good.  And that's the SAME teacher.  There's also been lots of stuff on the biases of teachers - both conscious and unconscious.  I had one professor in grad school (ironically, she taught ethics) who gave every woman in class an A and every man either a B or B+.  But there are more subtle biases of all sorts, as well.

    It's hard to measure ability.  

    Does that mean we should cease trying?

    We all differ in ways that matter. But we're all the same in the ways that matter most.

    by plf515 on Fri Sep 17, 2010 at 03:15:36 AM PDT

    •  The tests aren't trying to measure ability (2+ / 0-)
      Recommended by:
      Heiuan, JanL

      I think that the current NCLB/RTTT tests are trying to measure teacher impact rather than teacher ability, and I also think that teacher impact theoretically should be easier to measure. Basically, if a group of students has gained the ability to answer questions that the students could not answer a year ago, then there has been a teacher impact.

      That being said, the current tests we have do a very poor job of measuring teacher impact for a long list of reasons that you probably could draw up better than I could. Looking at teacher impact from year to year, they are not particularly reliable, and my sense is that the validity issues are even bigger.

      As to your last question, if trying means spending millions of dollars and taking away 2-3 school days in every school in the country to get unhelpful results, then we should stop trying. If trying means research into finding testing systems that are helpful and figuring out how such systems could be applied on a larger scale, then we should keep trying.

      "H.R.W.A.T.P.T.R.T.C.I.T.G -- He really was a terrible president that ran the country into the ground."

      by Reino on Fri Sep 17, 2010 at 03:38:30 AM PDT

      [ Parent ]

      •  As I've said before (4+ / 0-)
        Recommended by:
        Reino, TexMex, Heiuan, JanL

        if you want to look at teacher impact on student performance, then testing should be frequent, low-stress, varied, and short.

        Testing once a year is ludicrous.  
        Making tests so high-stress is ludicrous.
        Testing only reading and math is ludicrous.
        And asking students to spend so many consecutive hours doing tests is ludicrous.

        What would allow good use of test scores is something like the spelling tests we took as kids, but varied.  

        Once a week, half an hour each time, varied over all the subjects in the curriculum.

        We all differ in ways that matter. But we're all the same in the ways that matter most.

        by plf515 on Fri Sep 17, 2010 at 03:42:32 AM PDT

        [ Parent ]

        •  Agreed, but... (1+ / 0-)
          Recommended by:
          plf515

          A system like that would make the tests more useful in that they would give students and teachers feedback that they might be able to apply to their current practice.

          However, there is still the problem of what you would put on such tests. The spelling tests we took (and that my daughters currently take) were simple in part because they measured a simple task--how well you could spell the words on this week's list. It becomes more difficult when the ability being tested is along the lines of locating and using information or evaluating political candidates. I realize that such things can be tested, but you are talking about short, multi-subject tests, which seems to me to make such tasks very difficult to test.

          That problem grows when you are testing across teachers, classes, and schools. Would you put the same material on the test of an 8th grader taking Algebra as an 8th grader taking Pre-Algebra? What about the weekly tests of two 8th graders taking Algebra in different courses which use a different sequence? What about an 8th grader in a gifted program who had to do well on a very difficult test to enter the program as opposed to one not in such a program? What about one in a class where the teacher emphasizes good note-taking as opposed to one in a class where the teacher hands out class notes every day?

          "H.R.W.A.T.P.T.R.T.C.I.T.G -- He really was a terrible president that ran the country into the ground."

          by Reino on Fri Sep 17, 2010 at 06:46:46 AM PDT

          [ Parent ]

          •  I didn't say it would be easy! (0+ / 0-)

            One note - I didn't mean short multi-subject tests, I meant short single-subject tests, with the subject changing week-to-week. So, week 1 might be English, week 2 math, week 3 social studies, week 4 science then back to English (depending on curriculum).

            The tests would have to be subject specific, of course.  A test of pre-algebra (whatever THAT is, there's really no such thing as pre-algebra, if you ask me) would be different from one on algebra.

            If the goal is to measure teacher impact, the fact that different students of the same age are taking subjects of different difficulty would not be bad.

            As for different sequences/methods - well, many universities attempt to solve this in their courses that are offered by multiple professors.  I think it should be possible to agree on some core ideas in each course.

            There's also been a lot of work done on test-equating, which could alleviate some of the problem.

            We all differ in ways that matter. But we're all the same in the ways that matter most.

            by plf515 on Fri Sep 17, 2010 at 08:01:20 AM PDT

            [ Parent ]

  •  excellent diary.. (0+ / 0-)

    thanks for your hard work!

    http://www.c-spanvideo.org/program/206488-1 at 1:31:20

    by TexMex on Fri Sep 17, 2010 at 04:46:59 AM PDT

  •  Thank you for this... (0+ / 0-)

    Sorry, I hadn't had time to read this until today.

    Courage is what you are in the dark. Emilio Lazzardo in Buckaroo Bonzai

    by Temmoku on Sun Sep 19, 2010 at 06:27:34 AM PDT
