Linked Data

Given a list containing sufficient pupil identifiers (e.g. name, date of birth, postcode), it is possible to obtain the NPD information relating to those pupils. This is particularly helpful where a study has collected a range of information that is not collected as part of the NPD, as this can be combined with the NPD data to produce a more powerful set of variables. For example, the NPD contains very little information about pupils’ family background, so where a study has collected this and it can be matched to an NPD extract, it can be included in analysis of educational outcomes. Similarly, taking attainment, attendance and pupil characteristics data from the NPD saves individual studies from collecting this information themselves, and means that these things are measured consistently across different data sets.
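The matching step described above can be illustrated with a minimal sketch. This is not the procedure used to produce NPD extracts. It assumes an exact match on normalised name, date of birth and postcode, and every field name and record in it (for example `ks2_points` and `parental_occupation`) is invented for illustration.

```python
from datetime import date


def match_key(name, dob, postcode):
    """Normalise identifiers so trivial formatting differences do not block a match."""
    clean_name = " ".join(name.lower().split())
    clean_postcode = "".join(postcode.upper().split())
    return (clean_name, dob, clean_postcode)


def link_records(study_rows, npd_rows):
    """Attach every NPD field to each study record that has exactly one NPD match."""
    npd_index = {}
    for row in npd_rows:
        key = match_key(row["name"], row["dob"], row["postcode"])
        npd_index.setdefault(key, []).append(row)

    linked, unmatched = [], []
    for row in study_rows:
        key = match_key(row["name"], row["dob"], row["postcode"])
        candidates = npd_index.get(key, [])
        if len(candidates) == 1:
            # Keep all NPD fields and add the study's own variables alongside them.
            linked.append({**candidates[0], **row})
        else:
            # No match, or an ambiguous one: set aside for manual review.
            unmatched.append(row)
    return linked, unmatched


# Invented records, for illustration only.
study = [
    {"name": "Alex  Smith", "dob": date(1991, 5, 2), "postcode": "zz1 1aa",
     "parental_occupation": "teacher"},
    {"name": "Sam Jones", "dob": date(1992, 10, 9), "postcode": "ZZ2 2BB",
     "parental_occupation": "nurse"},
]
npd = [
    {"name": "alex smith", "dob": date(1991, 5, 2), "postcode": "ZZ1 1AA",
     "ks2_points": 27, "fsm": False},
]

linked, unmatched = link_records(study, npd)
print(len(linked), "linked;", len(unmatched), "unmatched")
```

The linked record carries both the study variable and all of the NPD fields, while the unmatched record is kept separately so that the match rate can be reported. Real linkage would usually need to tolerate misspellings and changes of address, which this sketch does not attempt.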

A number of studies have used this functionality to produce powerful data sets which can be used to answer a wider range of research and policy questions than either the study or the NPD would have been able to answer in isolation. One issue with some of these matching processes is that they have not necessarily joined on all available data from the NPD. Where possible, it makes sense to take as much information as is available from the matching process, as this reduces the likelihood of having to repeat the process and maximises the potential of the combined data set. Some examples of data that have been matched are listed below:

The Longitudinal Study of Young People in England (LSYPE)

LSYPE is a longitudinal study of a sample of around 15,000 young people. It started in 2004, when the individuals involved were aged 14, and has collected information every year about the young people and their families. Six waves of data are now available to download, tracking these young people through to age 19. The data can be accessed via the iLSYPE website or via ESDS.

The Effective Pre-School, Primary and Secondary Education study (EPPSE)

EPPSE is a large-scale longitudinal study of children in England. This study has tracked around 3,000 children from age three onwards. The children are in four different academic cohorts, which has associated benefits and drawbacks. The data is not readily accessible, but findings for the different phases of the study are published on the EPPSE team’s website.

The Avon Longitudinal Study of Parents and Children (ALSPAC)

ALSPAC, which is also known as Children of the 90s, is a long-term health and development research project. The study recruited more than 14,500 pregnancies during 1991 and 1992 in the region surrounding Bristol, in the South-West of England, and has tracked the resulting families. ALSPAC has followed the health and development of the children in great detail ever since, with the study families providing a vast amount of genetic and environmental information over the years. The ALSPAC cohort is multi-generational: while the focus to date has been on the study parents and the index child, ALSPAC is now piloting recruitment of third-generation offspring.

The ALSPAC data resource is documented on the website along with information about accessing the data and study findings.

The Impact of Family Socio-economic Status on Outcomes in Childhood & Adolescence (IFFSOCA) project aims to understand the importance of family socio-economic status/position for adolescents in today's Britain using ALSPAC data. Its primary focus is the behaviours and outcomes of individuals in late childhood and adolescence, and it makes extensive use of data collected from the NPD. Outputs are listed on the project website.

The ALSPAC cohort is split across three school years. The oldest ALSPAC children entered reception in autumn 1995 and the youngest will take their KS4 assessments in summer 2009. Available data include PLASC data, SATs (collected from the local authorities rather than the NPD), KS2, KS3, KS4 and school-level census data (ASC). Expected academic progress of the ALSPAC birth cohort according to their dates of birth:
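As a rough guide, expected progress can be derived from a date of birth. The sketch below assumes the English academic year (1 September to 31 August) and a pupil who stays in the normal year group for their age throughout. The function name and the example dates of birth are hypothetical and are not taken from ALSPAC documentation.

```python
from datetime import date


def expected_progress(dob):
    """Return the expected reception entry and Key Stage assessment years.

    Assumes the English academic year (1 September to 31 August) and no
    deferred entry, repeated years or acceleration.
    """
    # The academic year of birth starts in the September on or before the birthday.
    birth_year_start = dob.year if dob.month >= 9 else dob.year - 1
    reception_autumn = birth_year_start + 5
    return {
        "reception_autumn": reception_autumn,
        "ks1_summer": reception_autumn + 3,   # end of Year 2
        "ks2_summer": reception_autumn + 7,   # end of Year 6
        "ks3_summer": reception_autumn + 10,  # end of Year 9
        "ks4_summer": reception_autumn + 12,  # end of Year 11
    }


# Illustrative dates of birth, one in each of three consecutive school years.
for dob in (date(1991, 6, 15), date(1992, 2, 1), date(1992, 10, 1)):
    print(dob.isoformat(), expected_progress(dob))
```

Under these assumptions a child born in June 1991 enters reception in autumn 1995, and a child born in October 1992 takes KS4 assessments in summer 2009, which agrees with the dates given above for the oldest and youngest ALSPAC children.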


The Millennium Cohort Study (MCS)

The MCS tracks 19,000 children born in the UK just after the turn of the millennium. Twelve thousand of those children were born in England during the academic year 2000/01, and so will reach the end of Key Stage 2 in the summer of 2012. Some of these will not attend schools in England, and some of the children born in Scotland, Wales or Northern Ireland will attend schools in England. For those who do attend schools in England, NPD data has been matched and is available. Data from the Foundation Stage Profile can be downloaded, along with the rest of the MCS data, from the ESDS. The Key Stage 1 data is expected to be available shortly, and future data should be matched as and when it becomes available.