Demystifying the GMAT: Who Owns Validity?
By Lawrence M. Rudner
To remain relevant and useful, testing programs must periodically update their tests to match shifts in student populations and school curricula. One might think that because the publishers are offering a product, the responsibility for updates rests entirely on them. Publishers do conduct studies to ensure new content is appropriate. But the truth is, motivated test takers are needed to properly evaluate individual questions and the validity of an updated instrument. Motivated users of the exam’s results, mostly admissions personnel, are also key players in the validity drama. Recent discussion among schools, blogs, and test prep organizations suggesting that “serious” use of a new section in admissions is easily postponed does a disservice to test takers, to test publishers, and to schools themselves: the longer it takes to get good data, the longer it will take to confirm the test’s validity and to solidify its use as a reliable admissions tool.
There are two basic models for updating a test:
The first, massively redesigning a test to give it new content and new scales, ensures motivated test takers. But it is disruptive: both test takers and schools that use the results must adjust to a whole new test and a new score scale. A further drawback is that the validity and reliability of the old test may not carry over to the new one.
The other approach, making an incremental change while keeping parts of the test the same, is minimally disruptive to users and test takers. ACT did that by adding an essay section while keeping the rest of the test the same. The Graduate Management Admission Council recently did that with the new GMAT Integrated Reasoning section.
For most of the test, users can still rely on scales that they know and that have documented validity and reliability. Test takers will be familiar with most of the material, so proven methods for doing one’s best on the exam continue to hold. Test takers can thus continue to demonstrate their skills in ways that schools understand and already know how to use in the admissions process. But even incremental changes involve risk with the new section. If test takers are given the wrong message, even unintentionally, motivation can become a major issue and a major deterrent to what we are all looking for: proven validity.
New tests, whether a complete overhaul or a new section, are typically developed after years of research. Textbooks are examined and surveys conducted to identify potential content. Various item structures and measurement models are explored to identify what measures well and what is scalable. Questions are piloted to gather data to ensure that the items are measuring as intended and are free of bias. New forms are developed and equated to ensure that results are always comparable. But all of this only helps ensure content validity, that is, that the test questions address the desired content. These steps do not address whether a test will predict well. They do not even ensure that individual questions will work out in the field. To establish that, the test must be “live,” and test takers must be challenged to do their very best.
Admissions personnel shouldn’t give too much weight to any test or section until it has been demonstrated to be relevant and to work well for them. Proven data are clearly better than unproven data. But this is not the same as telling test takers that a new test or section will be ignored, and to say so is simply irresponsible. It is hard to imagine that anyone will ignore readily available objective data, no matter how unproven. If two applicants look almost identical and one has a better score on a new section, who do you think will be preferred? The test taker who did not blow off the section will be less likely to blow off a course.
It is in everyone’s interest for test takers to make some effort. Admissions personnel will obtain a realistic view of how well the new test or section works in their program. Test takers have the opportunity to demonstrate their abilities. The test taker who makes no effort when even one competitor might is clearly a fool. Test publishers obtain quality data to improve their product. Thus, when a new test or section is introduced, it is in everyone’s interest to be realistic. While the new material is still being studied, test takers should be told to take the section seriously.
Lawrence M. Rudner, PhD, MBA, is vice president of research and development and chief psychometrician for the Graduate Management Admission Council.
© 2012 Graduate Management Admission Council. All rights reserved. This article may be reproduced in its entirety without edits with attribution to the Graduate Management Admission Council.