Below the orange mandala is an LTE I wrote in response to an article about overtesting in the CPS (Chicago Public Schools).
Some of the discussion is specific to Chicago, but I think much of it will be relevant to situations in other areas.
The excess testing is only a symptom of CPS idiocy. Education in Chicago has its problems, and the central-office response is to "Do Something." It isn't necessarily something intelligent, reasonable, or related to the problem. And, since the problems remain next year, the response next year is to do something else.
The first thing to understand is that a comprehensive test of learning achievement tells you something. The second such test adds comparatively little to that information, and the third test adds almost nothing. In my entire educational progress from grade school to grad school, almost every teacher -- and absolutely every teacher I respected -- gave tests to judge how much I had learned. (They were seldom machine-graded multiple-choice tests.) If the success of one end of the educational experience can be judged by tests, I doubt claims that the process is too transcendent for the success of the other end to be judged by tests.
So, were the CPS:
1) Interested in results rather than bloviating, and
2) Moderately competent,
how would they proceed?
They would find one test of academic achievement which they would adopt for a significant period. This should probably have various sections, like the verbal and quantitative sections of the SATs, and various levels, so that it could reasonably measure where students are at the beginning of each grade.
They would give that test at the beginning of each year and at the end of the senior year of high school. What we really want to know is how any change in instruction affects the entire length of the school career, but we can't wait that long. Giving a test of knowledge in June rather than in September, however, expends valuable time -- and also money -- to investigate short-term learning. The result we want from 4th grade is that the kids have certain skills and knowledge at the beginning of 5th grade. No sane and sober adult would choose instead to test their skills and knowledge at the end of 4th grade. The students will, unfortunately, be unavailable for testing after graduation from high school.
The hypothetical intelligent administration interested in actually maximizing the children's learning would administer this test to all CPS students at the beginning of the first year. For each student it would also record every knowable socio-economic factor, the number of absences, and any shifts of that student from one class to another or from one school to another.
These would provide a large set of independent variables:
The particular student's score on the beginning test, the class mean on the test, and the class variance on the test,
The same for each socio-economic factor, and the same for measures of class size, class attendance, and any shifts in class membership.
When the individual student takes the test the next year, the data is all in the CPS mainframe computer. They first take the coefficients of linear correlation of the second test scores with all the independent variables. Tedious, but not mathematically challenging, calculations will yield the first predicted score for each student on the test of the second year. (Note that this prediction is based on the actual results, not on some hypothesis that some bureaucrat pulls out of his ass. Particular students will have a deviation from the prediction, but the average deviation of students fitting any particular screen of independent variables will be zero.)
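A minimal sketch of that calculation, using made-up data (the variable names and the simulated numbers are illustrative assumptions, not CPS data): fit an ordinary least-squares line to the independent variables, and the deviations from prediction average out to zero, just as claimed above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical independent variables for each student: prior-year
# score, class mean on that test, class size, and absences.
prior = rng.normal(100, 15, n)
class_mean = prior + rng.normal(0, 5, n)
class_size = rng.integers(18, 35, n).astype(float)
absences = rng.poisson(6, n).astype(float)

# Simulated second-year scores (an assumed relationship plus noise).
second = 0.8 * prior + 0.1 * class_mean - 0.2 * absences + rng.normal(0, 8, n)

# Design matrix with an intercept column; ordinary least squares
# yields the linear prediction described in the text.
X = np.column_stack([np.ones(n), prior, class_mean, class_size, absences])
coef, *_ = np.linalg.lstsq(X, second, rcond=None)
predicted = X @ coef
deviation = second - predicted

# With an intercept in the model, the deviations average to zero
# (up to floating-point error).
print(abs(deviation.mean()) < 1e-8)
```

The same least-squares machinery extends to every variable on the list above; nothing here is beyond a mainframe of any vintage.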
Then the deviations would be compared to each independent variable to see if the relationship is truly linear. This would take humans, and would be somewhat tedious, but should result in a short list of independent variables which improve the fit by being put in curvilinear form. Do that.
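The check is straightforward to demonstrate. In this hypothetical (the quadratic effect of absences is an invented example, not a claim about real data), a variable whose true effect is curved leaves a pattern in the linear fit's residuals, and adding its squared term improves the fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Illustrative assumption: absences hurt scores at an accelerating
# rate, so the true relationship is curved, not linear.
absences = rng.uniform(0, 20, n)
score = 100 - 0.5 * absences - 0.15 * absences**2 + rng.normal(0, 3, n)

def sse(X, y):
    """Sum of squared residuals from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

ones = np.ones(n)
linear_fit = sse(np.column_stack([ones, absences]), score)
curved_fit = sse(np.column_stack([ones, absences, absences**2]), score)

# Putting the variable in curvilinear (quadratic) form cuts the
# unexplained variation substantially.
print(curved_fit < linear_fit)
```

Run for each independent variable in turn, this yields exactly the short list described: the handful of variables whose curvilinear form earns its keep.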
At this point, we have the best predictors from all accessible data -- except the teacher -- of the child's progress. Then we sort out the deviations of the children from predictions by teacher and by principal. We do quite similar studies for the next year.
Then we see how reliable results of the form "Miss Jones's students advance more rapidly than Miss Smith's students" are. Until we do this, the whole program of evaluating teachers by student progress is wild-ass guessing.
It should go without saying, although with CPS nothing is too obvious to state, that test monitors can affect the test results. Judging teachers on the basis of tests they administer themselves is first-order idiocy. Actually, the tests should be administered by roving panels of monitors on different days in different schools. Then one of the independent variables can be the administering panel. Even when trained and intending to be neutral, different panels will get slightly different results.
Once this background is established, when the next bright idea for a change in curriculum or process comes along, it can be tested for results on a sample of schools instead of being imposed on all of them before we know whether it will work.
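Such a pilot test is simple once the background prediction exists. A sketch, on invented numbers (the school counts, the assumed 1.5-point gain, and the noise level are all illustrative): compare the pilot schools' mean deviation-from-prediction against the rest, with an ordinary two-sample comparison of means.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical pilot: 20 sample schools get the new curriculum,
# 80 keep the old one.  Compare each school's mean deviation from
# the background prediction, not raw scores.
pilot = rng.normal(1.5, 3, 20)    # assumed small real gain
control = rng.normal(0.0, 3, 80)  # no change assumed

diff = pilot.mean() - control.mean()
# Standard error for a two-sample difference of means (Welch form).
se = np.sqrt(pilot.var(ddof=1) / len(pilot)
             + control.var(ddof=1) / len(control))
t = diff / se
print(f"estimated gain = {diff:.2f} points, t = {t:.2f}")
```

A bright idea that can't beat the control schools by more than noise goes back on the shelf instead of citywide.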