
View Diary: Ten Reasons Why Value-Added Assessments are Harmful to a Child’s Education (107 comments)


  •  When you look at the results, you find troubling (3+ / 0-)

    outcomes.

    You find teachers ranked in the highest group one year and in the lowest group the next year.

    You find, for example, that you can predict a 3rd grader's test result if you know the 5th grade teacher she was assigned to.

    You don't see that the value-added statistics give you better results than other measures.

    We haven't accounted for the cost of this data analysis, which is substantial, or compared its benefits for student outcomes with other uses of those people or those funds (IT support in schools, additional teacher aides, data analysis that is more focused on getting particular resources for underperforming kids, etc.).

    This is all so that no one has to walk into the school and see for themselves which teachers connect with the kids and are raising their achievement, confidence, and success. Good principals know who is teaching well, and if you don't have good principals you can trust, the teachers cannot perform to their full potential anyway.

    Fry, don't be a hero! It's not covered by our health plan!

    by elfling on Tue Mar 12, 2013 at 10:57:09 AM PDT


    •  Sample sizes, etc. (0+ / 0-)
      How do you get a large enough sample size? Our elementary classrooms have ~20 kids. That makes each child 5% of the score. It's not typical for the classroom to have the same 20 kids all year. We're a relatively stable school; the teacher might have 16 of those kids for the full school year. So if you were to only use those 16, now each child is 6.25% of the score. It sounds terrible if you hear that 19% of those kids are far below basic - and certainly that's a problem. But it's 3 kids. 3 kids with their own individual story that may or may not have much to do with the school. 3 kids who may or may not have come to school every day.
      1. Each classroom may have only 20 kids. But the English teacher is probably teaching 5 classes a day of 20 kids each. So 100 kids a year. Measure those results over 5 years, and that's 500 kids. That's a large enough sample size.

      2. Say the teacher is teaching only 20 kids all year, thus teaching all subjects. You might or might not be able to draw useful conclusions from the data, like "Mr. Smith is great in English but needs a brushup over the summer in math instruction, and mentoring from Ms. Johnson, who's great in math." Maybe the data show you that most teachers have particular strengths and weaknesses, so you want to have each teach only the subject where he or she is strongest, instead of having every teacher teach every subject. When my son was in 4th grade, two teachers paired up--one taught all the language arts for both classes, the other taught all the science for both classes, and they each taught their own math and social studies. Etc.
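The sample-size numbers in this exchange can be checked with a rough sketch. It assumes each student is an independent draw and uses the textbook standard error of a proportion; the helper names are made up for illustration, and the 3-in-16 rate is the one from the quoted comment.

```python
import math

def share_per_child(n_students):
    """Percent of the classroom score that one child represents."""
    return 100.0 / n_students

def proportion_std_error(p, n_students):
    """Standard error of an observed proportion p, assuming independent students."""
    return math.sqrt(p * (1 - p) / n_students)

far_below_basic = 3 / 16  # the "19%" from the quoted comment

for n in (16, 20, 100, 500):
    se = proportion_std_error(far_below_basic, n)
    print(f"n={n:3d}  one child = {share_per_child(n):5.2f}%  "
          f"std. error of the rate = {100 * se:4.1f} points")
```

Under those assumptions the uncertainty shrinks with the square root of the number of students, which is why pooling classes or years helps, and why a single class of 16 is noisy.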

      You find teachers ranked in the highest group one year and in the lowest group the next year.
      That should be a red flag about something. Maybe the test is poorly designed. Maybe some teachers are improving and passing by other teachers who are failing to improve. If it simply reflects that a teacher was moved from a class full of affluent white kids to a class full of poor, transient, non-native-English-speakers, then the evaluation criteria are wrong. The criteria should measure improvement of the individual students, not absolute achievement.

      If it's just a fluke, then it will average out over time.

      You find, for example, that you can predict a 3rd grader's test result if you know the 5th grade teacher she was assigned to.
      (I assume you meant that the other way around.) Again--that's only a valid concern if the evaluation criteria are stupid. The criteria should measure how each teacher's students do relative to expectations:

      * If Mr. Phillips's kids start off in August at 78th percentile and finish up in May at 78th percentile, that would not indicate that Mr. Phillips is a 78th-percentile teacher; it would indicate Mr. Phillips is an average teacher.

      * If his kids improve from 78th percentile to 90th percentile, that would not make him a 90th-percentile teacher; it might well make him a 99th-percentile teacher.

      * If his kids start at 78th percentile and end up at 60th percentile, that does not make him a 60th-percentile teacher; it may well make him a 5th-percentile teacher.
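As a minimal sketch of the three cases above: rate the teacher on the change in his students' percentile, then rank that change against other teachers' changes. The `peer_gains` list is invented purely for illustration, and `percentile_rank` is a hypothetical helper, not any district's actual method.

```python
def percentile_rank(value, population):
    """Percent of the population strictly below value, counting ties as half."""
    below = sum(1 for x in population if x < value)
    ties = sum(1 for x in population if x == value)
    return 100.0 * (below + 0.5 * ties) / len(population)

# Made-up percentile-point changes for 13 other teachers' classes.
peer_gains = [-15, -10, -6, -4, -2, -1, 0, 0, 1, 2, 3, 5, 8]

# Mr. Phillips's three scenarios: (August percentile, May percentile).
scenarios = {"78 -> 78": (78, 78), "78 -> 90": (78, 90), "78 -> 60": (78, 60)}

for label, (start, end) in scenarios.items():
    gain = end - start
    rank = percentile_rank(gain, peer_gains)
    print(f"{label}: gain {gain:+d}, teacher rank {rank:.0f} among peers")
```

The point is only that the teacher's rank comes from the gain, not from where the students finished.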

      Please do not try to discredit the idea of gathering data by pointing out that data can be used in a stupid manner.

      "The true strength of our nation comes not from the might of our arms or the scale of our wealth, but from the enduring power of our ideals." - Barack Obama

      by HeyMikey on Tue Mar 12, 2013 at 01:37:40 PM PDT


      •  No, I meant what I wrote (1+ / 0-)
        Recommended by:
        Mostel26
        You find, for example, that you can predict a 3rd grader's test result if you know the 5th grade teacher she was assigned to.
        (I assume you meant that the other way around.) Again--that's only a valid concern if the evaluation criteria are stupid. The criteria should measure how each teachers' students do relative to expectations:
        If you run the model using the 5th grade teacher's scores as an input to predict the 3rd grade scores, you find a strong correlation. The exercise is done as a test of the model: if it finds it can "predict" scores with information that cannot possibly have influenced the score, then it means you have not adequately controlled your variables.

        http://www.epi.org/...

        For a variety of reasons, analyses of VAM results have led researchers to doubt whether the methodology can accurately identify more and less effective teachers. VAM estimates have proven to be unstable across statistical models, years, and classes that teachers teach. One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year. Thus, a teacher who appears to be very ineffective in one year might have a dramatically different result the following year. The same dramatic fluctuations were found for teachers ranked at the bottom in the first year of analysis. This runs counter to most people’s notions that the true quality of a teacher is likely to change very little over time and raises questions about whether what is measured is largely a “teacher effect” or the effect of a wide variety of other factors.

        A study designed to test this question used VAM methods to assign effects to teachers after controlling for other factors, but applied the model backwards to see if credible results were obtained. Surprisingly, it found that students’ fifth grade teachers were good predictors of their fourth grade test scores. Inasmuch as a student’s later fifth grade teacher cannot possibly have influenced that student’s fourth grade performance, this curious result can only mean that VAM results are based on factors other than teachers’ actual effectiveness.

        (I note that I misremembered that it was 5th grade to 4th grade rather than 3rd. Sorry for the error.)
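For a sense of scale, the quoted "4% to 16% of the variation" can be converted to a year-to-year correlation. This assumes the figure is the R-squared of a simple one-predictor linear regression of next year's rating on this year's, which the quoted passage does not spell out; the function name is hypothetical.

```python
import math

def correlation_from_variance_explained(r_squared):
    """Magnitude of Pearson r implied by R^2 in a one-predictor linear regression."""
    return math.sqrt(r_squared)

for r_squared in (0.04, 0.16):
    r = correlation_from_variance_explained(r_squared)
    print(f"{r_squared:.0%} of variation explained -> |r| = {r:.2f}")
```

Under that assumption, the year-to-year correlation of a teacher's rating would fall somewhere between about 0.2 and 0.4.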

        Fry, don't be a hero! It's not covered by our health plan!

        by elfling on Tue Mar 12, 2013 at 02:11:58 PM PDT


        •  What that shows... (0+ / 0-)

          ...is that internal politics rule at these schools.

          The 5th-grade teacher with the most political juice gets the best kids assigned to her class.

          And/or, the parents who are most vocal and involved (and therefore have the best-testing kids) push to have their kids assigned to certain teachers.

          Did you know that being elected President "predicts" that you went to an Ivy League school?

          •  I submit (2+ / 0-)
            Recommended by:
            Mostel26, ManhattanMan

            that it could mean a lot of things, some educationally appropriate and some not. It can also just be that standard result we see with standardized tests: that the strongest correlating variable remains household income, and that the resolution of the data available doesn't allow it to be pulled out.

            But it does suggest that typically teacher-student matchups are not random.

            I am not certain to what extent this was controlled for individual schools; I suspect not. Few elementary schools would be large or diverse enough to have enough 5th grade teachers to see this effect with much confidence.

            Fry, don't be a hero! It's not covered by our health plan!

            by elfling on Tue Mar 12, 2013 at 05:38:50 PM PDT


            •  You only need two... (0+ / 0-)

              ...5th-grade teachers. In fact the effect is more pronounced with just two, because each teacher you add costs your regression model a degree of freedom.

              1) You have a young one with no political pull and no social power.

              2) You have an old one who is close friends with the Principal.

              The Old One asks to have Genius, Hardworker, Imaginative, and Polite assigned to her class. She sticks the Young Teacher with Stupid, Unmotivated, Violent, and Lazy.

              Sure enough, a regression shows that being assigned to Old One's class is a powerful predictor of good 4th-grade performance.
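Here is a toy simulation of that two-teacher story, as a sketch only. Earlier-grade scores are generated before anyone is assigned, so neither teacher can have influenced them. The score distribution, the student count (far larger than a real class, to keep the noise down), and the function name are all assumptions made for illustration.

```python
import random
import statistics

def prior_score_gap(n_students=2000, hand_picked=True, seed=0):
    """Mean earlier-grade score in Old One's class minus Young Teacher's class."""
    rng = random.Random(seed)
    # Earlier-grade scores exist before any 5th-grade assignment happens.
    prior = [rng.gauss(50, 10) for _ in range(n_students)]
    order = list(range(n_students))
    if hand_picked:
        # Old One gets the students with the highest earlier scores.
        order.sort(key=lambda i: prior[i], reverse=True)
    else:
        # Students are split between the two teachers at random.
        rng.shuffle(order)
    half = n_students // 2
    old_one = [prior[i] for i in order[:half]]
    young_teacher = [prior[i] for i in order[half:]]
    return statistics.mean(old_one) - statistics.mean(young_teacher)

print("hand-picked classes:", round(prior_score_gap(hand_picked=True), 1))
print("random classes:     ", round(prior_score_gap(hand_picked=False), 1))
```

With hand-picked classes, the later teacher "predicts" the earlier score even though she had nothing to do with it; with random assignment the gap should be close to zero.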
