Key Stage Two (Age 11)

Key Stage 2 (KS2) is the legal term for the four years of schooling in maintained schools in England and Wales normally known as Year 3, Year 4, Year 5 and Year 6, when pupils are aged between 7 and 11. The KS2 examinations are taken in Year 6, at the end of primary school. The examinations cover the three core subjects: English, Maths and Science.

The KS2 results are available in two formats: the fine point score, which details each pupil’s final mark in each exam (or aggregated at subject level), and the overall level attained by the pupil in each subject.

The variables are commonly used in two ways:
  1. to control for prior attainment of pupils (as a proxy for ability)
  2. aggregated to provide a measure of school quality
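
As a rough illustration of these two uses, a minimal Stata sketch follows. It assumes the pupil-level aggregate ppks2total constructed in the cleaning section further down this page; ks4_outcome, treatment and schoolid are hypothetical variable names chosen for the example, not NPD fields.

```stata
* Sketch only: ks4_outcome, treatment and schoolid are hypothetical names
* 1. control for prior attainment in a pupil-level regression
regress ks4_outcome treatment ppks2total
* 2. aggregate to school level as a crude school-quality measure
egen schks2mean = mean(ppks2total), by(schoolid)
label var schks2mean "School mean of pupil KS2 aggregate"
```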

Full information about the KS2 data can be found at NPD Userguides.

Data collection

Data feeds are provided to the AAT (Achievement and Attainment Tables) contractor, which enables the creation of the AAT datasets, which are then matched into the NPD. After matching is completed, the first, unamended KS2 extract is released in September. At the same time as the data is being matched into the NPD, the AAT data is sent to schools for checking. After schools have made any changes, the amended data can then be matched into the NPD. The amended extract will be available in December. After the AATs have been published, schools are given the opportunity to make errata changes. Again, this new data is matched into the NPD and the final KS2 extract will be available around March.

Validity of measure

The KS2 test marks will be missing if the pupil's KS2 level data is missing, if the pupil was absent, or if the level is recorded as “D” (disapplied) or “IN” (invalid). The test marks will also be missing if the KS2 level is “B” (pupil working below the level of the test); using the marks data can therefore introduce bias by omitting lower-attaining pupils.
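
One way to gauge how much bias this might introduce is to tabulate missing marks against the recorded level before any cleaning. The following is a minimal sketch, assuming the raw English variables ks2_engtotmrk and ks2_englev are both still stored as strings; marksmissing is a hypothetical helper variable.

```stata
* Sketch only: assumes ks2_engtotmrk is still a string variable
* real() returns missing for any value that is not a number
gen byte marksmissing = missing(real(ks2_engtotmrk))
* cross-tabulate against the recorded level, keeping missing levels as a row
tabulate ks2_englev marksmissing, missing
drop marksmissing
```

If most of the missing marks sit in the "B" row, dropping pupils without marks removes the bottom of the attainment distribution.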

Marks are not always consistent with test levels. For instance, where a paper is reviewed, the corrected level will appear in the NPD following the checking process, but the corresponding amended test mark may never be captured.
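
A simple check for such inconsistencies is to look at the range of marks within each level. This is a sketch only, assuming it is run on a single test year after the variables have been destringed but before they are renamed.

```stata
* Sketch only: run on one test year, after destringing and before the renames
* mark ranges for adjacent levels should not overlap within a single year
tabstat ks2_engtotmrk, by(ks2_englev) statistics(n min max)
```

Overlapping minimum and maximum marks between adjacent levels suggest records where the level was amended but the mark was not.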

Cleaning the variable

This first Stata routine destrings the key variables. Destringing just involves setting non-numerical values to missing and performing some basic clean-up of miscoded values.
In each of the three subjects this code uses:
  • a total mark score
  • marks achieved on individual papers
  • score on an extension paper for relevant years and subjects
  • a teacher assessed level
  • the pupil's overall level derived from the marks data
* capture lets the loop skip variables that are absent in a given year
foreach i in ks2_engtotmrk ks2_engextmrk ks2_enghndwrmrk ks2_engreadmrk ///
    ks2_engspellmrk ks2_engwritmrk ks2_engtalev ks2_englev ///
    ks2_mattestamrk ks2_mattestbmrk ks2_matarthmrk ks2_matarthmrkt ///
    ks2_mattotmrk ks2_matextmrk ks2_mattalev ks2_matlev ///
    ks2_scitestamrk ks2_scitestbmrk ks2_scitotmrk ks2_sciextmrk ks2_scitalev ks2_scilev {
    * set non-numeric codes to missing
    capture replace `i'="" if `i'=="U" | `i'==" " | `i'=="D" | `i'=="A" | `i'=="Z" | `i'=="T"
    capture replace `i'="" if `i'=="X" | `i'=="_X" | `i'=="_NV" | `i'=="M" | `i'=="N"
    * recode the below-level codes to 1
    capture replace `i'="1" if `i'=="W" | `i'=="L" | `i'=="B"
    * strip leading zeros from miscoded values
    capture replace `i'="3" if `i'=="03"
    capture replace `i'="4" if `i'=="04"
    capture replace `i'="5" if `i'=="05"
    capture replace `i'="6" if `i'=="06"
    capture replace `i'="7" if `i'=="07"
    capture replace `i'="8" if `i'=="08"
    capture replace `i'="9" if `i'=="09"
    * convert to numeric; force sets any remaining non-numeric value to missing
    capture destring `i', replace force
}
* build the science total where it is not supplied
capture gen ks2_scitotmrk=ks2_scitestamrk+ks2_scitestbmrk
* correct a known miscoded value
capture replace ks2_engtotmrk=72 if ks2_engtotmrk==772

This Stata code cleans up the KS2 variables that are generally kept in the final dataset.
capture label var ks2_engtotmrk "KS2 Total marks achieved in English test (sum of reading and writing tests)"
capture rename ks2_engtotmrk ppks2engtotmark
capture label var ks2_engextmrk "KS2 English Extension Mark"
capture rename ks2_engextmrk ppks2engextmark
capture label var ks2_enghndwrmrk "KS2 English Handwriting Test Mark"
capture rename ks2_enghndwrmrk ppks2enghandwriting
capture label var ks2_engreadmrk "KS2 Marks achieved in English reading test"
capture rename ks2_engreadmrk ppks2engreading
capture label var ks2_engspellmrk "KS2 English Spelling Test Mark"
capture rename ks2_engspellmrk ppks2engspelling
capture label var ks2_engwritmrk "KS2 Marks achieved in English writing test"
capture rename ks2_engwritmrk ppks2engwriting
capture label var ks2_engtalev "KS2 National Curriculum level awarded for English Teacher Assessment"
capture rename ks2_engtalev ppks2engteacher
capture label var ks2_mattestamrk "KS2 Marks achieved in Paper A of Maths test"
capture rename ks2_mattestamrk ppks2matpapera
capture label var ks2_mattestbmrk "KS2 Marks achieved in Paper B of Maths test"
capture rename ks2_mattestbmrk ppks2matpaperb
* both variables are numeric after destringing, so test for numeric missing
capture replace ks2_matarthmrk=ks2_matarthmrkt if ks2_matarthmrk==.
capture label var ks2_matarthmrk "KS2 Marks achieved in mental arithmetic paper of Maths test"
capture rename ks2_matarthmrk ppks2matpaperarith
capture label var ks2_matarthmrkt "KS2 Marks achieved in mental arithmetic paper of Maths test"
capture rename ks2_matarthmrkt ppks2matpaperarith
capture label var ks2_mattotmrk "KS2 Total marks achieved in Maths test (sum of Paper A, Paper B and mental arithmetic tests)"
capture rename ks2_mattotmrk ppks2mattotmark
capture label var ks2_matextmrk "KS2 Maths Extension Mark"
capture rename ks2_matextmrk ppks2matextmark
capture label var ks2_mattalev "NC level awarded for Maths Teacher Assessment"
capture rename ks2_mattalev ppks2matteacher
capture label var ks2_scitestamrk "KS2 Marks achieved in Paper A of Science test"
capture rename ks2_scitestamrk ppks2scipapera
capture label var ks2_scitestbmrk "KS2 Marks achieved in Paper B of Science test"
capture rename ks2_scitestbmrk ppks2scipaperb
capture label var ks2_scitotmrk "KS2 Total marks achieved in Science test (sum of Paper A and Paper B tests)"
capture rename ks2_scitotmrk ppks2scitotmark
capture label var ks2_sciextmrk "KS2 Science Extension Mark"
capture rename ks2_sciextmrk ppks2sciextmark
capture label var ks2_scitalev "KS2 NC level awarded for Science Teacher Assessment"
capture rename ks2_scitalev ppks2sciteacher

Imputation on the subject total marks score is performed in order to retain as many pupils as possible in the dataset. For all imputed pupils we will have some information on their KS2 performance in the relevant subject and will not impute across subjects or by using KS1 data. Therefore, this is a limited imputation exercise.
This is a single imputation using information from any of the following variables:
  • subject extension mark
  • the individual paper marks
  • the teacher assessment of level
The imputation is carried out provided the teacher assessment is present.
capture impute ppks2engtotmark ppks2engextmark ppks2enghandwriting ppks2engreading  ///
ppks2engspelling ppks2engwriting ppks2engteacher, gen(tempppks2engtotmark)
capture impute ppks2engtotmark ppks2engreading ppks2engwriting ppks2engteacher, gen(tempppks2engtotmark)
capture impute ppks2mattotmark ppks2matpapera ppks2matpaperb ppks2matpaperarith ///
ppks2matextmark ppks2matteacher, gen(tempppks2mattotmark)
capture impute ppks2mattotmark ppks2matpapera ppks2matpaperb ppks2matpaperarith ///
 ppks2matteacher, gen(tempppks2mattotmark)
capture impute ppks2scitotmark ppks2scipapera ppks2scipaperb  ppks2sciextmark ///
ppks2sciteacher, gen(tempppks2scitotmark)
capture impute ppks2scitotmark ppks2scipapera ppks2scipaperb ppks2sciteacher, gen(tempppks2scitotmark)
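foreach i in eng mat sci    {
    * fill in the imputed total only where the teacher assessment is present
    capture replace ppks2`i'totmark=tempppks2`i'totmark if ppks2`i'totmark==. & ppks2`i'teacher<.
    * keep marks as non-negative whole numbers
    capture replace ppks2`i'totmark=round(ppks2`i'totmark)
    capture replace ppks2`i'totmark=0 if ppks2`i'totmark<0
}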

There are two alternative z-scores that can be used: the standard z-score or the ranked z-score. Since we have no particular belief that the underlying marks distribution is meaningful, the ranked z-score is more commonly used by researchers. This is the calculation for both.
foreach i in eng mat sci    {
    * standard z-score: (mark - mean) / sd
    capture summ ppks2`i'totmark
    capture gen ppks2`i'standardz=(ppks2`i'totmark-r(mean))/r(sd)
    * ranked z-score: standardise the rank of the mark (rank() is an egen function)
    capture egen temp`i'=rank(ppks2`i'totmark), track
    capture summ temp`i'
    capture gen ppks2`i'rankedz=(temp`i'-r(mean))/r(sd)
    * also removes the temporary imputation variables
    capture drop temp*
}

This creates an average relative position across the three subjects (or across as many subjects as have marks available). It also creates indicators for whether the pupil is in the top or bottom quartile of KS2 scores, which can be used as a rough indicator of attainment.
capture egen ppks2total=rowmean(ppks2scistandardz ppks2matstandardz ppks2engstandardz)
capture label var ppks2total "Pupil KS2 maths, English and science aggregated"
capture summ ppks2total, detail
capture gen ppks2top=0 if ppks2total<.
capture replace ppks2top=1 if ppks2total<. & ppks2total>r(p75)
capture label var ppks2top "Pupil scored in top 25% overall in KS2 tests"
capture gen ppks2low=0 if ppks2total<.
capture replace ppks2low=1 if ppks2total<. & ppks2total<r(p25)
capture label var ppks2low "Pupil scored in bottom 25% overall in KS2 tests"
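
Because every command above is prefixed with capture, errors are suppressed, so it is worth confirming that the derived variables exist and look sensible. A minimal sketch of such a check, using only the variables created on this page:

```stata
* Sketch only: sanity checks on the derived variables
* each standardised score should have mean close to 0 and sd close to 1
summarize ppks2engstandardz ppks2matstandardz ppks2scistandardz
* roughly a quarter of pupils should fall in each flag (fewer if there are ties at the cut-off)
tabulate ppks2top ppks2low, missing
```

If summarize reports that a variable is not found, one of the earlier captured commands has failed silently.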

Description of values across cohorts

By age groups
Over time
Stability within pupil