CAT4 UK Edition
The reliability of a test is a measure of the consistency of a student’s test scores over repeated testing, assuming conditions remain the same (that is, with no fatigue, learning effect or lack of motivation). A test with poor reliability might produce very different scores for the same student across two test administrations.
The reliability of the test was estimated using Cronbach’s alpha, which in practice takes values between 0 and 1. Values above 0.80 are considered to be very good. The reliability values for the various CAT4 batteries are given in the table below, and all show that the tests are very reliable. These values are based on the students who took part in the UK standardisation.
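As an illustration of how Cronbach’s alpha is calculated, the sketch below applies the standard formula to a small table of item scores. The function name `cronbach_alpha` and the sample data are hypothetical and are not taken from the CAT4 standardisation; the use of the sample variance is an assumption.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per student).

    Hypothetical helper for illustration; uses the sample variance (n - 1).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Variance of each item across students.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of each student's total score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up right/wrong (1/0) responses for five students on four items.
example_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
alpha = cronbach_alpha(example_scores)
```

With this made-up data the result is about 0.74, which would fall below the 0.80 threshold described above.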
For interpreting the score of an individual student, the standard error of measurement (SEM) is a more useful statistic than a reliability coefficient. It indicates how large, on average, the fluctuations in standard scores may be. The SEM for the Verbal Reasoning Battery is 5.0, which indicates that there is a 68% chance that the student’s true verbal SAS lies within +/- 5.0 points of the observed score. For example, for an average-performing student with a verbal SAS of 100, there is a 68% chance that his or her true verbal score is in a range from 95 to 105.
However, most tests report 90% confidence bands. For values around the average, the 90% confidence band is as follows:
For example, for an average-performing student with a verbal SAS of 100, there is a 90% chance that the true verbal score is in a range from 92 to 108.
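The 68% and 90% ranges quoted above can be reproduced from the SEM, assuming measurement error is normally distributed around the observed score. The sketch below is illustrative only: the function names are hypothetical, and the standard deviation of 15 and reliability of 0.89 passed to `sem_from_reliability` are assumed inputs rather than values stated in this section.

```python
from math import sqrt
from statistics import NormalDist


def sem_from_reliability(sd, reliability):
    """Classical test theory relation: SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def confidence_band(observed_sas, sem, level):
    """Return (lower, upper) bounds of a symmetric confidence band.

    Assumes normally distributed measurement error; `level` is a
    proportion such as 0.68 or 0.90.
    """
    # Two-sided critical value, e.g. about 1.645 for a 90% band.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sem
    return observed_sas - half_width, observed_sas + half_width


# Verbal Reasoning Battery SEM of 5.0, observed SAS of 100 (from the text).
low_68, high_68 = confidence_band(100, 5.0, 0.68)
low_90, high_90 = confidence_band(100, 5.0, 0.90)

# Assumed inputs (not from this section): SD of 15, reliability of 0.89.
illustrative_sem = sem_from_reliability(15, 0.89)
```

Rounded to whole SAS points, the two bands come out as 95 to 105 and 92 to 108, matching the examples in the text. The illustrative SEM is close to 5.0, but the reliability used to obtain it is an assumption.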
Test re-test reliability
A study of 3,883 students who took Level D and then took Level F two years later showed that the correlation between the mean CAT4 SAS at the two time points was high, at 0.88. The correlations for the overall mean CAT4 SAS and the four batteries are shown in the table below:
The results showed a high level of consistency: 62% of students had mean CAT4 scores at the two time points that were within +/- 5 SAS points of each other, and 90% had scores within +/- 10 SAS points.
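The two kinds of test re-test summary reported here, a correlation between time points and the share of students whose scores stayed within a given number of SAS points, can be sketched as follows. The function names and the six pairs of scores are hypothetical, and treating "within" as inclusive of the boundary is an assumption.

```python
from math import sqrt
from statistics import mean


def pearson_r(first, second):
    """Pearson correlation between paired scores from two administrations."""
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))
    return covariance / (spread_first * spread_second)


def share_within(first, second, tolerance):
    """Proportion of students whose two scores differ by at most `tolerance`."""
    close = sum(abs(a - b) <= tolerance for a, b in zip(first, second))
    return close / len(first)


# Made-up mean SAS values for six students at two time points.
level_d = [95, 100, 110, 88, 120, 102]
level_f = [98, 97, 118, 90, 109, 101]

r = pearson_r(level_d, level_f)
within_5 = share_within(level_d, level_f, 5)
within_10 = share_within(level_d, level_f, 10)
```

In this made-up sample, four of the six students are within 5 points and five of the six are within 10 points.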