AMT Adaptive Matrices Test

Application

The AMT is a non-verbal test of general intelligence as revealed in the ability to reason inductively. It is suitable for respondents aged 13 and over.

Theoretical background

The items resemble classical matrices but, unlike them, are constructed according to explicit, psychologically based principles derived from a detailed analysis of the cognitive processes used in solving problems of this type. A total of 289 items were created and evaluated in three extensive studies with large samples in Katowice (Poland), Moscow and Vienna. The items were analyzed using the dichotomous Rasch model, and the corresponding item parameters were estimated (cf. Hornke, Küppers & Etzel, 2000). The resulting calibrated item pool allows the test to be presented adaptively, with the advantages of modern computerized testing: shorter administration time with improved measurement precision, and high respondent motivation because the items presented match the respondent’s ability.
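The dichotomous Rasch model mentioned above can be sketched in a few lines. This is an illustration of the standard model only, not the AMT software: the function names and the logit scale for ability (theta) and item difficulty (b) are assumptions made for the example.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model.

    theta: person ability, b: item difficulty (both on the same logit scale).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta: float, b: float) -> float:
    """Fisher information of one Rasch item at ability theta: p * (1 - p)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)

# An item whose difficulty equals the person's ability is solved with
# probability 0.5 and is the most informative item for that person.
print(rasch_probability(0.0, 0.0))   # 0.5
print(item_information(0.0, 0.0))    # 0.25
```

Because the information peaks where difficulty equals ability, items matched to the respondent contribute most to measurement precision, which is the basis of the shorter administration time claimed for adaptive presentation.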

Administration

Items are presented adaptively: after an initial phase, the respondent is presented only with items whose difficulty is appropriate to his or her ability. It is not possible to omit an item or to return to a preceding one. Each item offers eight answer alternatives, which reduces the probability of answering correctly by guessing.
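One common way to pick an item of appropriate difficulty is to choose the unused item whose difficulty lies closest to the current ability estimate, which under the Rasch model is also the most informative item. The sketch below assumes that rule for illustration; the source does not state which selection rule the AMT actually uses, and the function name and sample difficulties are hypothetical.

```python
def select_next_item(theta: float, difficulties: list[float], administered: set[int]) -> int:
    """Return the index of the unused item whose difficulty is closest to theta.

    Assumption for illustration: a 'closest difficulty' rule. Items already
    administered are excluded, mirroring the fact that no item is revisited.
    """
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        raise ValueError("item pool exhausted")
    return min(candidates, key=lambda i: abs(difficulties[i] - theta))

# Hypothetical mini pool of four item difficulties (logits).
pool = [-1.0, 0.0, 0.5, 2.0]
print(select_next_item(0.3, pool, set()))  # 2 (difficulty 0.5 is closest to 0.3)
print(select_next_item(0.3, pool, {2}))    # 1 (item 2 already used, next closest is 0.0)
```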

Test forms

There are four test forms: S1, S2, S3 and S11. They differ in the pre-set precision (standard error of measurement) of the person parameter estimate and in the difficulty of the first item. The standard error of measurement is set at 0.63 for S1, 0.44 for S2, 0.39 for S3 and 0.63 for S11, corresponding to reliabilities of 0.70, 0.83, 0.86 and 0.70 respectively.
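To give a feel for what these precision targets imply, the sketch below computes a lower bound on the number of items needed to reach a given standard error. It rests on assumptions not stated in the source: that the standard error is the usual asymptotic one, 1 / sqrt(test information), expressed in logits, and that every item is perfectly matched to the respondent so that it contributes the maximum Rasch information of 0.25. The function name is hypothetical.

```python
import math

MAX_ITEM_INFORMATION = 0.25  # Rasch item information p * (1 - p) peaks at p = 0.5

def minimum_items(target_sem: float) -> int:
    """Smallest number of perfectly targeted Rasch items that reaches target_sem."""
    required_information = 1.0 / target_sem ** 2
    return math.ceil(required_information / MAX_ITEM_INFORMATION)

for form, sem in {"S1": 0.63, "S2": 0.44, "S3": 0.39, "S11": 0.63}.items():
    print(form, minimum_items(sem))
# S1 11
# S2 21
# S3 27
# S11 11
```

These figures are theoretical lower bounds under the stated assumptions. Since items are never matched perfectly to the respondent, an actual administration would generally need more items than this.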

Scoring

The test yields an estimate of the respondent’s general intelligence. This person parameter is estimated under the Rasch model by the maximum likelihood method. A percentile rank relative to a norm sample is also given.
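Maximum likelihood estimation of the person parameter under the Rasch model can be illustrated with a short Newton-Raphson routine. This is a generic textbook sketch, not the scoring routine of the AMT: the function name, starting value and convergence settings are assumptions chosen for the example.

```python
import math

def estimate_ability(responses, difficulties, theta=0.0, max_iter=50, tol=1e-6):
    """Maximum likelihood ability estimate under the Rasch model.

    responses: 0/1 scores; difficulties: matching item difficulties (logits).
    Returns (theta, standard_error).
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        # The likelihood has no finite maximum for all-wrong or all-correct patterns.
        raise ValueError("ML estimate is not finite for this response pattern")
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information  # Newton-Raphson update
        theta += step
        if abs(step) < tol:
            break
    # Recompute the test information at the final estimate for the standard error.
    probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
    information = sum(p * (1.0 - p) for p in probs)
    return theta, 1.0 / math.sqrt(information)

# Two of three equally difficult items solved: the estimate is ln(2) above
# the common item difficulty.
theta, sem = estimate_ability([1, 1, 0], [0.0, 0.0, 0.0])
print(round(theta, 3))  # 0.693
```

The routine refuses all-correct and all-wrong patterns because their likelihood has no finite maximum; how a production system handles such cases is outside what the source describes.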

Reliability

Because the Rasch model holds for the item pool, reliability in the sense of internal consistency is given. For the four test forms the standard error of measurement (SEM) has been set at 0.63, 0.44, 0.39 and 0.63, corresponding to reliabilities of 0.70, 0.83, 0.86 and 0.70. This reliability applies to all respondents and at all scale levels. This is the central advantage over widely used psychometric tests based on classical test theory: all respondents are assessed with equal reliability.

Validity

According to Hornke, Küppers and Etzel (2000; Hornke, 2002), the construction rationale correlates at 0.72 with the item difficulty parameters. In addition, Sommer and Arendasy (2005; Sommer, Arendasy & Häusler, 2005) used confirmatory factor analysis to demonstrate that this test, together with tests of inductive and deductive thinking, loads on the factor of fluid intelligence (Gf). Fluid intelligence was found to be the intelligence factor with the highest g-loading. A number of studies carried out in the fields of traffic and aviation psychology also confirm the test’s criterion validity.