What Good Tests Look Like, and Why We Don’t Have Them

Back when I was a teacher-educator, I would teach pre-service teachers what a good test looked like. This was so they could recognize one when it appeared, and so that when their students received standardized test scores, they could explain them to parents. I used the acronym “RSVP” (taken from this pretty good educational psychology textbook) to emphasize the qualities of good testing. “RSVP” has implications for education reform, not just one-off tests.

RSVP stands for Reliable, Standard, Valid, and Practical.

  • Reliable means the test gives the same score each time. A reliable test should give the same score even as testing conditions change. It is of course hard to demonstrate the reliability of ‘high-stakes’ testing that takes place once a year. It would be far better for testing to occur on a quarterly, monthly, weekly, daily, or (preferably) nearly-continuous basis. If a student scores high on a test over a subject area one time, and low the next, and the same subject area is being tested at the same level, the test would have low reliability.
  • Standard means that every student gets the ‘same’ test scored the ‘same’ way. Note this doesn’t mean something foolish like, ‘Everyone gets the same test at 8 AM on the same day of the year.’ It does mean that subjective “portfolios” (the only type of ‘testing’ that teachers as a political bloc support) are wrong-headed. “Portfolios” aren’t a form of RSVP testing at all, but serve as a political attempt by teachers to prevent any measurement of their quality as teachers in inspiring learning.
  • Valid means the test actually measures what it is supposed to measure. So a reliable and standard testing on the Revolutionary War, composed entirely of geometry problems, probably isn’t valid. Likewise, a test of working memory capacity which would show blacks and hispanics average at the same score probably isn’t valid, owing to the strong psychometric evidence of durable gaps in general intelligence between these populations. Validity thus incorporates a very large range of issues, some almost purely political, some almost purely technical.
  • Practical means that the testing can actually be accomplished without harming learning. The test should be easy enough to administer and score that the teacher can actually obtain scores that make sense. Likewise, information from the test should be used as feedback so teaching in that particular class can improve. A once-a-year disruptive test that takes three days and whose scores are not returned that semester would only be ‘practical’ if those three days are academically worthless.

Most tests now used in education are not RSVP. This is because the six forces involved in the education reform debate either don’t care about RSVP testing or are hostile to it. Neither Districts nor States have created tests which are RSVP. On a day-to-day level, neither Parents nor Large-Scale Consumers of Educated Workers understand how to read test scores. Neither Teachers nor Publishers regularly create RSVP tests.

But between teachers and publishers, only teachers are opposed to them in principle. RSVP tests would allow districts to fire bad teachers and pay more to good ones, which would be a disaster for the work rules the lobotomized teacher labor force has grown accustomed to. Teacher labor agitators like Diane Ravitch are like the longshoremen on the East Coast: opposed to change and so content to watch their industry be destroyed beyond all recognition. Publishers, on the other hand, simply don’t have the skill to give RSVP tests. Because their interest is contracts and profit, they are indifferent as to whether RSVP tests will be given or not.

The federal-academic complex works as a bank for multiple interests. It allows Large-Scale Consumers of Educated Workers to convert their financial resources into power that encourages States, Districts, and Publishers to embrace measurable results, in the form of testing. It allows States to translate their power into money, which can encourage Publishers to make tests. It also creates tests which actually are RSVP, and can be embraced by Publishers as their own products.

Good tests should be RSVP – Reliable, Standard, Valid, and Practical. While we still have a way to go, for the time being teachers are against RSVP tests in principle, while Publishers are simply ignorant as to how to make and sell them. It’s easier to impart knowledge than to change hearts and minds. So until teacher groups wise up, the easier road to education reform is through empowering Publishers and working with them to create RSVP tests.