Purchasing New Pilot Selection Tests

Diane L. Damos, Ph.D. and R. Bruce Gould, Ph.D.

Damos Research Associates, Inc.

5824 W. 74th Street

Los Angeles, CA 90045-1714

Identifying a pilot selection test that will improve prediction accuracy is often time-consuming, and mistakes are expensive. This article discusses the questions a potential purchaser should ask before buying a pilot selection test.

To meet the increasing demand for new high-quality pilots, many air carriers are adding new tests to existing pilot selection batteries or implementing selection batteries for the first time. Several commercially available tests purport to identify candidates who will become successful air carrier pilots, and test developers are eager to sell their products to air carriers. However, identifying tests that will increase the usefulness of a selection battery is usually a time-consuming and difficult process, and mistakes are often costly.

Despite test developers’ claims, an air carrier, strictly speaking, cannot determine how accurately a test will identify successful candidates without trying it on its own candidates. That is, if a test identifies successful candidates at Air Carrier X with high accuracy, it may have a much lower accuracy at Air Carrier Y even though the carriers appear similar in many respects (types of routes, aircraft, etc.). Few developers, however, will permit their tests to be used on a trial basis; the potential user usually must purchase the test without knowing how accurately it will identify successful candidates.

To increase the chances that a test will identify successful candidates with acceptable accuracy, the air carrier should ask the test developer two questions. The first question is “What skills or abilities relating to flying a commercial aircraft does this test measure?” The test developer should have performed, or have access to, a detailed task analysis of an air carrier pilot’s job, and the test should relate clearly to specific skills or abilities described in the task analysis or in an accompanying document known as the KSA (knowledge, skills, and abilities) list. For example, a task analysis of the pilot’s job will reveal that pilots must be able to land a commercial aircraft. The task analysis or the KSA list will also indicate that landing requires good eye-hand coordination. Thus, a test that purports to measure eye-hand coordination may be useful for selecting airline pilots. Tests that purport to measure “general pilot aptitude” or assess some new skill or ability required for successful air carrier flying should be viewed with suspicion unless accompanied by significant justification for the claims.

The second question the air carrier should ask the test developer is “How do you know that the test measures what you say it does?” Typically, the test developer will give one of three answers. The first is that the test is a variation of other tests that have gone through a process known as “validation.” The test developer must be able to describe those other tests and indicate how the test under consideration differs from them. The second answer is that the test developer conducted a study and can show that the test correlates highly with other tests known to assess the skill or ability of interest. For example, a developer of an eye-hand coordination test may have conducted a study using four established tests of eye-hand coordination and the test in question. If the test in question correlates highly with the established tests, then the new test may be assumed to measure eye-hand coordination. Although either of these first two answers may be sufficient, the best answer is one that is rarely given: the test scores correlate with performance on critical pilot tasks, such as landing an aircraft.
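To make the second answer concrete, the sketch below shows how such a correlational check might be run. It is purely illustrative: the scores are synthetic placeholders, not data from any actual validation study, and how high a correlation must be to count as “high” remains a judgment for the evaluator.

```python
import numpy as np

# Hypothetical scores for 200 candidates: four established eye-hand
# coordination tests and the new test under consideration. In a real
# study these scores would be measured, not simulated.
rng = np.random.default_rng(0)
n = 200
established = rng.normal(size=(n, 4))               # four established tests
new_test = established.mean(axis=1) + rng.normal(scale=0.5, size=n)

# Correlate the new test with each established test. Consistently high
# correlations are evidence that the new test measures the same skill
# (what test designers call convergent validity).
for i in range(established.shape[1]):
    r = np.corrcoef(new_test, established[:, i])[0, 1]
    print(f"Correlation with established test {i + 1}: r = {r:.2f}")
```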

If the test developer answers both questions acceptably, the air carrier should ask itself a third question: “How much will this test improve my selection process?” To answer this question, the relation between the test in question and the existing (or planned) selection system must be determined. For example, if an air carrier already has an intelligence test in its selection battery, adding a second intelligence test will have little, if any, effect on the predictive accuracy of the battery. Although this example is obvious, determining how much a new test will improve an existing battery or add to a planned battery is rarely straightforward and requires training in statistics and test design. Data showing how well the test identified successful candidates at other air carriers should not be used to estimate the benefits of adopting the test, because such data usually reflect the predictive accuracy of the test when used by itself. That is, as in the example above, such estimates cannot indicate how much the test will improve the selection accuracy of the existing or planned battery.
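For readers with the statistical training the third question demands, the sketch below illustrates one standard way to quantify a test’s contribution: compare the variance in a criterion (here, a training-performance score) explained by the existing battery alone with the variance explained once the new test is added. The data, effect sizes, and variable names are hypothetical, chosen only to show why a test that overlaps with the battery adds little.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y explained by a least-squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])        # add intercept term
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

# Hypothetical data: 300 hires with scores on two existing battery tests,
# a candidate new test, and a training-performance criterion.
rng = np.random.default_rng(1)
battery = rng.normal(size=(300, 2))                    # existing battery
new_test = 0.8 * battery[:, 0] + rng.normal(size=300)  # overlaps with battery
criterion = battery @ np.array([0.5, 0.3]) + 0.2 * new_test \
    + rng.normal(size=300)

r2_old = r_squared(battery, criterion)
r2_new = r_squared(np.column_stack([battery, new_test]), criterion)
print(f"Battery alone:    R^2 = {r2_old:.3f}")
print(f"Battery + test:   R^2 = {r2_new:.3f}")
print(f"Incremental gain: {r2_new - r2_old:.3f}")
```

When the incremental gain is near zero, the new test is largely redundant with the existing battery, no matter how well it predicts when used by itself.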

Not all air carriers will have the in-house expertise to answer the third question or to evaluate the answers to the first two. In such cases, a carrier may need to seek expert help. Proper evaluation of the test developer’s responses and of the new test’s role in the selection process will help ensure that the new test is cost effective. Careful evaluation will also decrease the time required to develop a pilot selection battery or to improve an existing one.