Validation

Validity is the extent to which a test measures what it claims to measure and the extent to which appropriate inferences can be drawn from the test score. A variety of methods are used to estimate the validity of a test. Construct validity concerns how well the test measures the intended construct; one way of assessing it is to compare the mean scores of groups for which score differences would be expected. For LASS 8–11, this analysis examines the differences between dyslexic and non-dyslexic students on each LASS 8–11 subtest (see Table 8).

This analysis indicates medium to large effects on those subtests where we would expect non-dyslexic students to outperform dyslexic students (i.e. Spelling, Sentence reading, Word chopping / Segments and Funny words / Non-words), and small or no effect sizes on those subtests where we would not expect differences between the two groups (i.e. Non-verbal reasoning, Verbal reasoning and Sea creatures). Note that, owing to a ceiling effect, Single word reading shows only a small effect size; this subtest should only be administered to students who perform poorly on Sentence reading.

 

Table 8. Construct validity

* Cohen’s d is a measure of the effect size of the difference between two means.
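For reference, Cohen’s d is conventionally computed as the difference between the two group means divided by the pooled standard deviation, with values around 0.2, 0.5 and 0.8 typically read as small, medium and large effects respectively. The formula below is the standard general form, using generic notation rather than values from Table 8:

\[
d = \frac{\bar{X}_1 - \bar{X}_2}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
\]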

The validity of new psychological and educational tests is usually established by comparing them with equivalent established tests; this is usually called ‘concurrent validity’. Some difficulties may arise in the case of computer-based tests, where the modes of response (typically using a mouse) differ from those used in conventional tests (typically oral or written responses). Inevitably, this tends to result in somewhat lower correlation coefficients than those obtained when comparing two similar conventional tests (for a discussion of these issues, see Singleton, 2001).

For LASS 8–11, concurrent validity was measured by comparing the LASS 8–11 subtest scores with scores from the Suffolk Reading Scale, using Pearson’s r correlation, for a subset of the standardisation sample (see Table 9). A correlation of .55 is considered an adequate level of concurrent validity, and .65 or above is considered good. The results show good evidence of convergent validity, with correlations of .732, .655 and .569 for those subtests that would be expected to correlate well with the Suffolk Reading Scale (Sentence reading, Spelling and Single word reading). The results also show clear evidence of divergent validity, as those subtests that measure constructs different from the Suffolk Reading Scale (i.e. Non-verbal reasoning, Mobile phone and Sea creatures) show correlations well below .55. Those subtests that do not measure exactly the same construct as the Suffolk Reading Scale, but overlap with it to some degree (i.e. Verbal reasoning, Funny words and Word chopping), show mid-range correlations, as would be expected.
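For reference, the Pearson product–moment correlation reported in Table 9 follows the standard formula below, written in generic notation (x and y denoting paired scores on the two measures for the same students):

\[
r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}
\]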

 

Table 9. Correlations with Suffolk Reading Scale