Key Stage 2 National Test indicators: England

The KS2 indicators are derived from an analysis of the relationship between CAT4 scores from Level A to Level C and KS2 test results at age 11, based on a large, nationally representative sample of around 24,000 students who took the KS2 SATs in 2019. These indicators are updated regularly as new data become available.

Correlations of CAT4 and KS2 scaled scores

There is a strong relationship between CAT4 scores and Key Stage 2 outcomes. The strength of the relationship between two variables can be measured by a statistic called the correlation coefficient. A value of zero indicates no relationship between the two measures, whereas a value of one indicates a perfect positive relationship. The table below shows the correlation coefficients between CAT4 standard age scores (SAS) and students’ subsequent KS2 scaled score outcomes.

The correlations are all highly statistically significant. Mathematics outcomes tend to correlate most highly with the mean CAT4 SAS. For English Reading and for Grammar, Punctuation and Spelling, the CAT4 Verbal Reasoning score gives a correlation that is similar to, or slightly higher than, the one given by the mean CAT4 score.
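To make the correlation coefficient concrete, the following is a minimal sketch of how a Pearson correlation between two sets of scores can be computed. The function name pearson_r and the two short score lists are assumptions made purely for illustration; they are not taken from the CAT4 sample and do not reproduce the published coefficients.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for illustration only (NOT from the CAT4 sample).
cat4_sas = [85, 90, 95, 100, 105, 110, 115]
ks2_scaled = [97, 99, 101, 104, 105, 108, 111]

r = pearson_r(cat4_sas, ks2_scaled)
print(f"correlation = {r:.3f}")
```

A value close to one, as in this invented data set, means that students with higher CAT4 scores almost always have higher KS2 scaled scores; real data will show more scatter and a lower coefficient.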

The graph below illustrates the relationship between the mean CAT4 score and the KS2 Mathematics scaled scores. It shows the most likely scaled score and the score if the student is challenged. We can see that the scaled scores increase as the CAT4 scores increase.

For example, for a student with a mean CAT4 score of 90, the ‘most likely’ Mathematics scaled score is 99 and the ‘if challenged’ threshold is 103. Not all students with a mean CAT4 score of 90 will get a Mathematics scaled score of 99. The ‘most likely’ score is an average, so around half of the students with a mean CAT4 score of 90 will obtain a Mathematics scaled score below 99, around 25% will obtain a scaled score between 99 and 102, and around 25% will reach the ‘if challenged’ score of 103 or above.
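The split described above (half below, a quarter in between, a quarter at or above) matches the median and the upper quartile of the outcomes for students with the same CAT4 score. The sketch below reads the two thresholds in that way, which is an interpretation of the paragraph rather than a description of the model actually used to produce the indicators. The nine scaled scores are invented so that the thresholds come out at 99 and 103.

```python
import statistics

# Invented KS2 Mathematics scaled scores for nine students who all have a
# mean CAT4 score of 90 (illustrative only, NOT real data).
scaled_scores = [92, 94, 96, 98, 99, 101, 103, 105, 108]

# Quartile cut points; "inclusive" treats the list as the whole population.
lower_quartile, median, upper_quartile = statistics.quantiles(
    scaled_scores, n=4, method="inclusive"
)

most_likely = median            # about half of the students score below this
if_challenged = upper_quartile  # about a quarter of the students reach this

print(f"most likely = {most_likely:.0f}, if challenged = {if_challenged:.0f}")
```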

Likelihood of Key Stage 2 indicated standard

The graph below illustrates the proportion of students achieving a scaled score of 100 (the government’s expected standard) or the high score of 110 for Mathematics for each mean CAT4 score. We can see that the higher the mean CAT4 score, the greater the proportion of students who achieve the government’s benchmark or above. For example, 58% of students with a mean CAT4 score of 90 obtained the expected standard of 100 or above in Mathematics; in contrast, about 95% of students with a mean CAT4 score of 110 achieved this.

The chart below illustrates the relationship between the Verbal CAT4 score and the KS2 English Reading benchmarks.

The chart below illustrates the relationship between the Verbal CAT4 score and the KS2 English Grammar, Punctuation and Spelling (SPAG) benchmarks.

KS2 indicators for groups of students

The table below illustrates how the group/class indicators have been calculated for a fictitious group of five students and shows the probability of obtaining different KS2 Mathematics benchmarks.

The individual student indicators do not show any of these five students as likely to reach the high scaled score benchmark of 110 or more. However, some students have a relatively high chance of achieving this; for example, student 5 has a 37% chance of obtaining a high score of 110 or more. Overall, for this group of five students we expect 20% (i.e. one of the five students) to achieve the high score. As a further illustration, if your group has 10 students, all with a mean CAT4 score of 106, the most likely outcome for each of these students individually is a scaled score of 106. However, around 23% of these students (i.e. about two of the 10 students) are likely to achieve the high score.

The group level indicators are the average of the probabilities for all students in the group. Our research has shown that this method provides the most accurate set of group level indicators. However, group indicators are extremely sensitive to the number of students in the group and may be very unstable for groups of fewer than 30 students. Group indicators should only ever be taken as a rough guide to the possible future performance of a class.
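The averaging step can be sketched as follows. The function names group_indicator and expected_count are hypothetical, and the five probabilities are invented for illustration: only student 5's 37% and the group average of 20% are stated in the text, and the other four values were chosen so that the average comes out at 20%.

```python
def group_indicator(probabilities):
    """Average the individual probabilities to give the group-level indicator."""
    return sum(probabilities) / len(probabilities)


def expected_count(probabilities):
    """Expected number of students reaching the benchmark (sum of probabilities)."""
    return sum(probabilities)


# Chance of a scaled score of 110 or more for five fictitious students.
# Only the last value (student 5, 37%) comes from the text; the rest are assumed.
five_students = [0.05, 0.10, 0.18, 0.30, 0.37]

# Ten students who each have a 23% chance, as in the second illustration.
ten_students = [0.23] * 10

for name, probs in [("five students", five_students), ("ten students", ten_students)]:
    share = group_indicator(probs)
    count = expected_count(probs)
    print(f"{name}: {share:.0%} expected to reach 110+, about {round(count)} student(s)")
```

No single probability in either list exceeds 50%, so no individual student is shown as likely to reach the high score, yet the group averages still indicate that roughly one of the five and two of the ten are expected to do so.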