Reliability of Repeated Tests

Will the same tests give consistent results when used repeatedly with the same

subject? In general we may say that they do. Something depends, however, on the age and

intelligence of the subject and on the time interval between the examinations.


Goddard proves that feeble-minded individuals whose intelligence has

reached its full development continue to test at exactly the same mental

age by the Binet scale, year after year. In their case, familiarity with

the tests does not in the least improve the responses. At each retesting

the responses given at previous examinations are repeated with only the

most trivial variations. Of 352 feeble-minded children tested at

Vineland, three years in succession, 109 gave absolutely no variation,

232 showed a variation of not more than two fifths of a year, while 22

gained as much as one year in the three tests. The last group, presumably,

were younger children whose intelligence was still developing.

Goddard has also tested 464 public-school children for three successive

years. Approximately half of these showed normal progress or more in

mental age, while most of the remainder showed somewhat less than normal progress.


Bobertag's retesting of 83 normal children after an interval of

a year gave results entirely in harmony with those of Goddard.

The reapplication of the tests showed absolutely no influence of

familiarity, the correlation of the two tests being almost perfect

(.95). Those who tested "at age" in the first test had advanced, on

the average, exactly one year. Those who tested _plus_ in the first

test advanced in the twelve months about a year and a quarter, as we

should expect those to do whose mental development is accelerated.

Correspondingly, those who tested _minus_ at the first test advanced

only about three fourths of a year in mental age during the interval.


Our own results with a mixed group of normal, superior, dull and

feeble-minded children agree fully with the above findings. In this case

the two tests were separated by an interval of two to four years, and

the correlation between their results was practically perfect. The

average difference between the I Q obtained in the second test and that

obtained in the first was only 4 per cent, and the greatest difference

found was only 8 per cent.
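
The two figures reported above, the correlation between first and second tests and the average and greatest difference in I Q, can be computed roughly as in the sketch below. It is an illustration only: the I Q values are hypothetical and are not data from the study, the function names are our own, and the difference is taken as an absolute difference in I Q points, which is an assumption about how the percentages were reckoned.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two paired lists of at least two scores each")
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    ss_x = sum((x - mean_x) ** 2 for x in first)
    ss_y = sum((y - mean_y) ** 2 for y in second)
    return cross / sqrt(ss_x * ss_y)


def iq_differences(first, second):
    """Return (average, greatest) absolute difference between paired I Q's."""
    diffs = [abs(b - a) for a, b in zip(first, second)]
    return sum(diffs) / len(diffs), max(diffs)


# Hypothetical I Q's for eight children at a first and a second test.
first_iq = [62, 78, 85, 96, 100, 104, 112, 125]
second_iq = [65, 75, 90, 93, 104, 100, 118, 121]

r = pearson_r(first_iq, second_iq)
mean_diff, max_diff = iq_differences(first_iq, second_iq)
print(f"test-retest correlation: {r:.2f}")
print(f"average difference: {mean_diff:.1f} points; greatest: {max_diff} points")
```

A correlation near 1 together with a small average difference indicates that children keep nearly the same relative standing, and nearly the same I Q, from one test to the next.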

The repetition of the test at shorter intervals will perhaps affect the

result somewhat more, but the influence is much less than one might

expect. The writer has tested, at intervals of only a few days to a few

weeks, 14 backward children of 12 to 18 years, and 8 normal children of

5 to 13 years. The backward children showed an average improvement in

the second test of about two months in mental age, the normal children

an average improvement of little more than three months. No child varied

in the second test more than half a year from the mental age first

secured. On the whole, normal children profit more from the experience

of a previous test than do the backward and feeble-minded.

Berry tested 45 normal children and 50 defectives with the Binet 1908

and 1911 scales at brief intervals. The author does not state which

scale was applied first, but the mental ages secured by the two scales

were practically the same when allowance was made for the slightly

greater difficulty of the 1911 series of tests.

We may conclude, therefore, that while it would probably be desirable

to have one or more additional scales for alternative use in testing the

same children at very brief intervals, the same scale may be used for

repeated tests at intervals of a year or more with little danger of

serious inaccuracy. Moreover, results like those set forth above are

important evidence as to the validity of the test method.