How Can We Best Factor Multiple Sets of GMAT® Scores?

In setting admissions policies, how should business schools factor multiple GMAT scores from a candidate who takes the test more than once? Lawrence Rudner, who leads the research group at GMAC, offers some perspective.

In what is proving to be an increasingly competitive admissions cycle for business schools, record numbers of people are taking the GMAT test—and more people are retaking the exam than ever before. GMAC® data show that approximately 18 percent of all GMAT test takers sit for the examination more than once. That phenomenon raises several admissions policy issues. Should a program consider just the latest scores? Should the highest set of scores be used? Should scores from different test administrations be averaged, or should one combine the highest scores across subtests? Answers to these questions depend more on philosophy than hard mathematics.

Who Retakes a Standardized Test?

Examinees who perform as they expected or better on the GMAT test have no motivation to sit again for the four-hour examination. Those who retake the test are not pleased with their scores and believe they can do better. Thus, the 18 percent who retake the examination are a self-selected group. GMAC research shows that those who retake the test are less likely to have finished their first examination and are more likely to have GMAT scores that are not aligned with their undergraduate grade-point averages.

On average, those who retake the GMAT attain only a modest increase in Total score, about 31 points. Those gains are consistent with the gains on other standardized tests, including the SAT and LSAT. By contrast, if a random sample of test takers were retested, the average gain would be expected to be zero, with a standard deviation of 28 points. Another way to look at the data is that slightly more than 50 percent of the repeat test takers have gains that are larger than chance fluctuations would explain. The extra gain is not surprising. There is a reason these test takers are sitting for the examination again. (It should also be noted, however, that almost 25 percent of repeat test takers have lower second scores.)
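To put the 31-point average gain in context, the sketch below treats chance-only score changes as normally distributed with a mean of zero and a standard deviation of 28 points. Those two figures come from the paragraph above; the normal shape and the helper name `chance_exceedance` are assumptions made for illustration and are not GMAC's published method.

```python
from statistics import NormalDist

# Figures quoted in the article.
CHANCE_MEAN = 0.0    # expected gain from chance alone
CHANCE_SD = 28.0     # standard deviation of chance-only gains
AVERAGE_GAIN = 31.0  # observed average gain among repeaters

# Assumption: chance-only score changes are roughly normal.
chance = NormalDist(mu=CHANCE_MEAN, sigma=CHANCE_SD)


def chance_exceedance(gain: float) -> float:
    """Probability that chance alone produces a gain at least this large."""
    return 1.0 - chance.cdf(gain)


# How many chance standard deviations the average repeater gain represents.
z = (AVERAGE_GAIN - CHANCE_MEAN) / CHANCE_SD
print(f"z = {z:.2f}")
print(f"P(chance gain >= {AVERAGE_GAIN:.0f}) = {chance_exceedance(AVERAGE_GAIN):.3f}")
```

Under these assumptions, a gain as large as the repeaters' average sits a little more than one standard deviation above what chance alone would produce, which is in keeping with the view that repeaters are not a random sample.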

Competing Philosophies

Because no test is perfectly reliable, scores for an individual taking a test multiple times will fluctuate in a range around what is considered to be the person’s true score. One can argue that the average of scores is a better estimate of the test taker’s true ability than any individual set of scores, but this assumes that the test taker’s behaviors are exactly the same across testing sessions. Changes in such factors as attitude, health, and sleep may affect the quality of a test taker’s efforts and test-taking consistency. Test takers make careless errors, misinterpret test instructions, forget test instructions, inadvertently skip questions, and misread test items.
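The argument that an average estimates true ability better than a single score can be illustrated with the standard error of a mean. The sketch below assumes that the 28-point standard deviation quoted earlier describes score changes caused purely by measurement error, and that errors at each sitting are independent with equal variance. Neither assumption comes from GMAC, and both break down exactly when preparation or effort differs between sittings.

```python
import math

DIFF_SD = 28.0  # SD of the score change between two sittings (from the article)

# Assumption: the change is the difference of two independent, equal-variance
# errors, so Var(diff) = 2 * SEM**2 and SEM = DIFF_SD / sqrt(2).
single_sem = DIFF_SD / math.sqrt(2)


def sem_of_average(n_sittings: int) -> float:
    """Standard error of the mean of n independent sittings."""
    if n_sittings < 1:
        raise ValueError("need at least one sitting")
    return single_sem / math.sqrt(n_sittings)


for n in (1, 2, 3):
    print(f"{n} sitting(s): standard error ~ {sem_of_average(n):.1f} points")
```

The error shrinks with the square root of the number of sittings, so the benefit of averaging is real but limited, and it holds only if every sitting reflects the same level of preparation.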

GMAC encourages all test takers to become familiar with the exam in advance and offers a variety of low-cost and free test-preparation materials. Nevertheless, many test takers sit for the GMAT exam with minimal practice the first time. Many are penalized because they do not properly pace themselves. Approximately 25 percent of the repeaters did not finish either the Quantitative or the Verbal section the first time, and their gains on retesting were higher.

How should one respond to careless mistakes or lack of preparation? On the one hand, schools want students who pay attention to detail and properly prepare for class. If applicants are not going to prepare for the GMAT exam, can you expect them to fully participate in their education? Thus, one could argue that just the lowest scores should be considered.

On the other hand, if test takers attain higher scores in the second testing, we can assume that they prepared better and that their new score better reflects their ability. Isn’t ability a major part of evaluating fit? If one believes that a test taker should not be penalized for poor performance when they have shown a higher capability, then an examination of the highest scores is the logical choice.

What About Averaging Scores?

Taking the average of the scores is justified if one believes the test taker was properly prepared and motivated for each sitting. When the scores are close, that is probably true. However, when the scores are farther apart, most likely the test taker prepared better and did a better job of pacing the second time around. I would argue that in such cases, emphasis should be on the scores that show the test taker’s capabilities, that is, the highest Quantitative and Verbal scores.
 
If one is going to examine the highest scores, which set of scores should be examined—the best scores from one sitting or a combination of best scores from different test administrations? An argument to use scores from one sitting is that it places everyone on a level playing field. The scores represent the best the test taker is able to achieve within the allocated test time. No one gains an advantage by concentrating their preparation on one subtest or another.

An argument for the use of combined scores is that it gives the test taker the benefit of every doubt. Each subtest measures an independent set of skills. The best score is a reflection of the test taker’s capability. The counterargument is that it opens the door for test takers to game the system. They can concentrate their efforts on one subtest for the first sitting and on the other for the second. But there is a point of diminishing returns for test preparation, and the tests are independent.
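The candidate policies discussed in this article can be compared side by side for a single applicant. The sketch below is illustrative only: the `Sitting` record, the function names, and the sample scores are all hypothetical. The combined policy returns section scores only, because a Total score is observed only within a single sitting.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sitting:
    """One test administration (hypothetical record layout)."""
    quant: int
    verbal: int
    total: int


def latest(sittings):
    """Scores from the most recent sitting (list is in chronological order)."""
    return sittings[-1]


def best_single_sitting(sittings):
    """The sitting with the highest observed Total score."""
    return max(sittings, key=lambda s: s.total)


def averaged(sittings):
    """Mean of each score across all sittings, as (quant, verbal, total)."""
    return (mean(s.quant for s in sittings),
            mean(s.verbal for s in sittings),
            mean(s.total for s in sittings))


def combined_best_sections(sittings):
    """Highest Quantitative and highest Verbal, possibly from different sittings."""
    return (max(s.quant for s in sittings), max(s.verbal for s in sittings))


# Made-up scores for one applicant with two sittings.
history = [Sitting(quant=40, verbal=35, total=620),
           Sitting(quant=44, verbal=33, total=640)]

print("latest:", latest(history))
print("best single sitting:", best_single_sitting(history))
print("averaged:", averaged(history))
print("combined best sections:", combined_best_sections(history))
```

In this made-up case the combined policy pairs the Quantitative score from the second sitting with the Verbal score from the first, a pairing that no single sitting produced. That is the trade-off between giving the benefit of the doubt and keeping a level playing field.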

My preference is that test takers should be given credit for their best performance and that therefore their highest Quantitative and Verbal scores should be considered. When Total scores have to be reported, a school would have to go with the highest observed Total score, as only that score would be auditable.

Lawrence Rudner
Vice President of Research and Development
Graduate Management Admission Council