This post investigates what PISA 2015 results reveal about:
- Progress towards the government’s 2020 national performance targets; and
- Trends in the comparative performance of England’s high attainers.
It complements a parallel post about the TIMSS 2015 results – Troubling TIMSS trends (December 2016).
The results of the 2015 Programme for International Student Assessment (PISA) were published in early December 2016. PISA is a triennial survey of the educational achievement of 15 year-olds overseen by the OECD.
The survey was first undertaken in 2000, making this the sixth cycle, but problems with UK response rates in 2000 and 2003 mean that this analysis relies on trends over the four cycles between 2006 and 2015.
Seventy-one jurisdictions are listed as participants in the 2015 survey, compared with 65 in 2012.
In England the 2015 survey was conducted between November and December 2015 with a sample of Year 11 students from both state-funded and independent schools (9% of participating schools were independent).
The participation status of China is confusing. Hong Kong and Macao continue to have their results reported separately. In 2012 Shanghai participated as a (very high scoring) stand-alone entity. In 2015 their results are no longer reported separately, but as part of a (somewhat lower scoring) bloc including new participants Beijing, Jiangsu and Guangdong provinces. Some publications call this bloc China; others prefer the abbreviation B-S-J-G.
The principal assessments are in maths, science and reading. Each cycle features one of these more prominently and it is the turn of science in 2015.
According to the OECD:
‘Every PISA survey tests reading, mathematical and scientific literacy in terms of general competencies, that is, how well students can apply the knowledge and skills they have learned at school to real-life challenges. PISA does not test how well a student has mastered a school’s specific curriculum.’
The 2015 National Report identifies a strong (but unquantified) correlation between PISA scores and GCSE grades, but adds:
‘Whereas GCSEs examine pupils’ knowledge of specific content and application of specific techniques as defined by national curricula, PISA measures pupils’ ‘functional skills’ – their ability to apply knowledge to solve problems in real world situations. This is also in contrast to other international studies, such as TIMSS, where the assessment framework is aligned to a set of content agreed by the International Association for the Evaluation of Educational Achievement (IEA) who oversee the study.’
PISA participants are ranked on each assessment according to their mean score. The average mean score across OECD countries is 500.
Performance on each assessment is also banded into proficiency levels. In maths there are six levels, from 1 (the lowest) to 6 (the highest). In reading and science level 1 is subdivided into 1a and 1b. Level 6 was introduced for reading only in 2009.
There are descriptors for each level in each assessment. For example, level 6 in science is described thus:
‘Pupils consistently provide explanations, evaluate and design scientific enquiries and interpret data in a variety of complex situations. They draw appropriate inferences from different data sources and provide explanations of multi-step causal relationships. They can consistently distinguish scientific and non-scientific questions, explain the purposes of enquiry, and control relevant variables in a given scientific enquiry. They can transform data representations, interpret complex data and demonstrate an ability to make appropriate judgments about the reliability and accuracy of any scientific claims. Level 6 students consistently demonstrate advanced scientific thinking and reasoning requiring the use of models and abstract ideas and use such reasoning in unfamiliar and complex situations.’
The discussion of high attainment in this post relies principally on the percentages of participants achieving proficiency levels 5 and 6 (actually level 6 and levels 5/6 combined). It also draws on the comparative performance of the top decile in each jurisdiction on each assessment.
The primary sources for this post are the OECD publication PISA 2015 Results (Volume 1) Excellence and Equity in Education, the associated data tables and the PISA 2015 National Report for England prepared by the UCL Institute of Education.
The evolution and articulation of the present government’s 2020 performance targets are described at some length in a previous post – TIMSS PISA PIRLS: Morgan’s targets scrutinised (May 2016).
They went through several iterations, characterised by some disagreement over the precise combination of subjects (some permutation of maths, science, computing, engineering, reading and writing) and the degree of improvement required (best in the world, top five in the world, best in Europe).
Invariably the target date was 2020 (marking the end of this government) and the measure was England’s performance on some combination of the PISA, TIMSS and PIRLS international comparisons studies.
My analyses assume that the target is to be best in Europe (typically the least demanding option given the dominance of Asian jurisdictions in these studies) in reading, maths and science (the subject areas covered by PISA, TIMSS and PIRLS).
My original post estimated the mean scores required in the 2015/16 PISA, TIMSS and PIRLS cycles to achieve a trajectory consistent with being best in Europe by 2020. It also calculated the increases required in the percentages of learners achieving higher and lower benchmarks in each study to match the best in Europe.
This post revisits progress towards the ‘best in Europe’ target taking into account England’s actual mean scores in PISA 2015, recalculating the further improvements necessary to achieve that target by 2020.
It also covers:
- The trend in England’s performance against PISA proficiency levels 5 and 6 – and how that performance compares with other jurisdictions, especially our highest-performing European competitors.
- How the performance of the top decile on each assessment in England compares with the performance of the top decile in other jurisdictions, especially the highest-performing.
- The socio-economic background of the population achieving PISA proficiency levels 5 and 6 in science in England, and the incidence of relatively high-achieving and relatively disadvantaged (so-called ‘resilient’) students in England and other jurisdictions.
- The proportions of students achieving highly on two or all three assessments – and how this compares with the highest-performing jurisdictions.
The conclusion summarises these findings and draws comparisons with those in this parallel post on TIMSS 2015 results, to establish whether or not they are consistent.
Changes in mean scores
Chart 1, below, shows the trend in England’s mean scores since PISA 2006. The dashed lines illustrate the further improvement that must be secured in the 2018 PISA cycle (the last before 2020) to achieve scores equivalent to the best by a European jurisdiction in PISA 2015.
Hence these are the improvements needed to achieve the government’s target assuming that no competitor country exceeds the current best European scores in PISA 2018.
Chart 1: Trend in PISA mean scores, 2006-2015 and improvement required to be best in Europe by 2020 (assuming ‘best in Europe’ scores are unchanged)
It shows that:
- In maths England’s score declined by two points this time round. It needs to add 28 points in the next cycle to match Switzerland.
- In science England’s score fell by four points in 2015. It must add 22 points in the next cycle to equal Estonia.
- In reading England’s score was unchanged. It needs 26 more points next time to match Finland.
The PISA study assumes that a 20-point gap is approximately equivalent to an additional eight months of schooling and a 30-point gap to an additional year. So England requires improvements on each assessment equivalent to an additional 9-12 months of study.
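As a rough check on those figures, the OECD's rule of thumb can be applied directly. This is a minimal Python sketch: the 30-points-per-year conversion is the approximation cited above, not an exact equivalence.

```python
# Convert PISA score-point gaps into approximate months of schooling,
# using the OECD rule of thumb cited above (roughly 30 points = one year).
POINTS_PER_YEAR = 30

def gap_in_months(points):
    """Approximate months of schooling represented by a score-point gap."""
    return points / POINTS_PER_YEAR * 12

# England's gaps to the best European score in each PISA 2015 assessment
gaps = {"science": 22, "reading": 26, "maths": 28}
for subject, pts in gaps.items():
    print(f"{subject}: about {gap_in_months(pts):.0f} months")
```

Running this reproduces the 9-12 month range quoted above (science about 9 months, reading about 10, maths about 11).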
Given the trends in England’s performance since 2006, improvement on this scale is not feasible.
It is conceivable that these ‘best in Europe’ benchmarks may be somewhat less demanding in 2018 than they were in 2015. Compared with PISA 2012, the mean scores returned by the best European jurisdiction have fallen across the board in 2015:
- In maths Switzerland was top at 531 points in 2012. It is still best in Europe but now on 521 points, a 10-point fall.
- In science Finland was leading on 545 points in 2012. It has now been leapfrogged by Estonia, which registers only 534 points, 11 points lower.
- In reading Finland was at 526 in 2012 and remains Europe’s best but now on 524 points, a two-point decline.
But it is unlikely that the best mean scores in Europe would continue to decline at the same rate and, even if they did, England would still face an uphill battle to equal them.
The government’s 2020 targets are pie in the sky.
The downward trend between 2012 and 2015 is also evident in world-leading scores, though largely as a consequence of high-scoring Shanghai’s incorporation into relatively lower-scoring B-S-J-G:
- In maths Shanghai scored 613 in 2012, with Singapore next best at 573. Singapore is world leader in 2015 at 564 points. Three Asian high performers – Hong Kong, South Korea and Taiwan – have fallen back significantly since 2012, but their mean scores are still amongst the world’s best.
- In science Shanghai scored 580 in 2012, with Hong Kong next at 555. Singapore is world leader in 2015 at 556. Hong Kong and South Korea have fallen back significantly since 2012 (but remain high scorers) while Singapore and Taiwan have improved their mean scores.
- In reading Shanghai scored 570 in 2012, with Hong Kong next at 545. Singapore is world leader in 2015 at 535, having lost a little ground. Slovenia and Russia have gained significant ground since 2012, while South Korea, Japan and Hong Kong have fallen back significantly.
By comparison, England’s lack of progress between 2012 and 2015 seems rather less disappointing, but this downward trend is far from universal.
Some European countries have leapfrogged England in 2015, notably Norway (maths) and Slovenia (science and reading), and several others are snapping at our heels.
Comparing England’s performance at higher and lower proficiency levels in 2012 and 2015
A comparison of England’s performance below proficiency level 2 and at level 5 and above in 2012 and 2015 shows that:
- In maths the percentage of low achievers has deteriorated very slightly, increasing from 21.7% in 2012 to 22.1% in 2015. By comparison, the percentage of high achievers has fallen more substantially, from 12.4% in 2012 to 11.3% in 2015.
- In science the percentage of low achievers has worsened by two full percentage points, from 14.9% in 2012 to 16.9% in 2015. Conversely the percentage of high achievers has remained unchanged.
- In reading the percentage of low achievers has grown by more than one percentage point, from 16.7% in 2012 to 17.9% in 2015. Meanwhile the percentage of high achievers has increased by 0.8 percentage points, from 9.1% to 9.9%.
On this evidence high achievers have been rather less badly affected than low achievers in science and (to a lesser extent) reading, whereas the reverse is true in maths.
The only improvement amongst high achievers has been in reading. In maths there has been a substantive fall in the incidence of high achievement whereas in science the position is unchanged.
Trends in England’s performance at higher proficiency levels 2006-2015
Charts 2 and 3 below show the trend since 2006 in England’s performance against the highest PISA proficiency levels – level 6 and levels 5/6 combined.
The percentages are drawn from the OECD’s regional tables since, for some reason, the National Report rounds them to whole percentage points. This is particularly misleading in maths, where 2.6% (level 6) and 8.7% (level 5) are both rounded up (to 3% and 9% respectively) and totalled as 12%, when the actual total is 11.3%.
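The pitfall is worth spelling out, since it recurs whenever rounded components are summed. A two-line illustration using England's maths figures as above:

```python
# Rounding each component before summing overstates the total:
# England's maths percentages at proficiency levels 6 and 5.
level6, level5 = 2.6, 8.7

print(round(level6) + round(level5))  # rounds to 3 + 9 = 12
print(round(level6 + level5, 1))      # the actual total: 11.3
```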
The profile of level 6 performance is noticeably different for each assessment.
In maths it has fluctuated up and down with each cycle. Initial decline was rectified by marked improvement in 2012, followed by another moderate decline in 2015, leaving the overall percentage almost identical to where it was in 2006.
In science a more pronounced decline (from a higher starting point) between 2006 and 2009 has been followed by three successive cycles of minimal change. Consequently the percentage at level 6 is significantly lower than it was in 2006. Whereas science led maths in 2006, the reverse is true in 2015.
In reading there has been steady though unspectacular improvement since 2009 (there was no level 6 in 2006). The gap between level 6 performance in reading and in science is closing steadily but the gap between reading and maths has increased slightly between 2009 and 2015.
Chart 2: Percentage achieving the L6 proficiency level, England, 2006-2015
The profiles of performance at level 5 and above are very similar. Maths has fluctuated up and down with each cycle, science declined initially but has been stable since 2009. Reading also recorded an initial decline but has improved steadily since 2009.
Science enjoyed a 2.8 percentage point lead over maths in 2006. Maths outperformed science briefly in the 2012 cycle, but science has now regained the ascendancy, albeit only 0.4 percentage points ahead.
Despite the fluctuations, maths is at almost exactly the same level that it was in 2006, but science has barely recovered any of the substantial decline recorded in 2009. Reading has improved but by less than one percentage point. Reading has closed much of the gap between it and science and some of the gap between it and maths.
Chart 3: Percentage achieving the L5 and L6 proficiency levels, England, 2006-2015
On this evidence there is little cause for concern about high achievement in reading, but maths and science are far more problematic.
Trends in England’s progress at higher proficiency levels compared with other European countries
A comparison of 2012 and 2015 outcomes at proficiency level 5 and above for England and the best in Europe gives a rather different story:
- In maths in 2012 England was 12.4 percentage points behind and needed to double its success rate to 24.8% to match the best in Europe. But England is now only 8.0 percentage points behind, because the fall in our success rate is much less pronounced than the fall in Europe’s best score.
- In science England was 5.4 percentage points behind the best in Europe in 2012, but this has more than halved to 2.6 percentage points, again because Europe’s best score has fallen while England’s percentage is unchanged.
- In reading England was 4.4 percentage points behind Europe’s best score in 2012, and this gap has fallen more modestly, to 3.8 percentage points, as England’s score has improved at a slightly faster rate.
Chart 4, below, shows the percentage point gaps between England and the best in Europe since 2006 for achievement of proficiency levels 5 and above.
It suggests that England is making steady progress in closing these gaps, but this is largely attributable to stability or modest improvement in England combined with more substantial falls in performance further up the league table.
Chart 4: Percentage point gaps between England and best in Europe, percentage achieving PISA proficiency levels 5 and 6 combined, 2006-2015
Chart 5 illustrates the change in percentage point gaps for proficiency level 6. Although the pattern is less consistent than for levels 5 and above, the gaps are smallest on all three assessments in 2015.
Chart 5: Percentage point gaps between England and best in Europe, percentage achieving PISA proficiency level 6, 2006-2015
At both level 5/6 and at level 6 England has by far the biggest gap to make up in maths, whereas we are much closer to the best in Europe in science and reading.
It would be feasible to match the best in Europe in science and reading by 2018, particularly at level 6.
In science England currently sits behind only Finland at level 6 and Finland and Estonia at level 5 and above. On the other hand there are four European jurisdictions with a better success rate at L6 in reading, seven better at L5 and above.
Maths is different. Gaps of that size will take longer to close and there are more countries to overtake. I counted nine European countries with a better level 6 success rate and 13 with a better success rate at Level 5 and above, including the likes of Malta and Portugal.
But, while some of this can be seen in a positive light, the gaps between England and Singapore remain huge:
- In maths Singapore has 13.1% at level 6 (five times England’s rate) and 34.8% at level 5 and above (three times England’s rate).
- In science Singapore has 5.6% at level 6 (almost three times England’s rate) and 24.2% at level 5 and above (double the rate in England).
- In reading Singapore has 3.6% at level 6 (double the rate in England) and 18.3% at level 5 and above (almost double England’s success rate).
Top decile performance in England compared with other countries
The National Report has little to say about comparative performance against PISA’s higher proficiency levels, other than to note that England is one of several countries (the others unidentified) that have a larger proportion of high achievers than expected given its mean scores in science and reading (but not in maths).
Instead it relies disproportionately on a different measure, comparing the scores achieved by the top decile (and bottom decile) of the distribution in each jurisdiction on each assessment.
The prevailing narrative is that England makes a typically strong showing with high attainers, is less convincing with low attainers and has atypically large gaps between the two.
But this is not entirely borne out by the facts:
- In maths the 90th percentile in England scores 613. Twenty-one jurisdictions have a higher score, 12 exceeding England by a statistically significant margin. These include Switzerland, Belgium, the Netherlands and Estonia. Eight further European jurisdictions are also ahead by a smaller margin. England’s score was also 613 in 2006 and the change since then is not statistically significant. Many other countries have experienced similar stability. The gap between the top and bottom deciles in England is 245 points, equivalent to eight years of schooling. The OECD average is not much less (232 points). Singapore, South Korea, Taiwan and B-S-J-G have a bigger gap than England, as do eight European jurisdictions, but the majority of jurisdictions have a smaller gap.
- In science the 90th percentile in England scores 642. Eight jurisdictions have a higher score, four of them with statistical significance, one of them Finland. Estonia is also ahead by a smaller margin. England scored 653 in 2006 but the fall since then is not statistically significant. There has been a statistically significant fall in some other jurisdictions, including Hong Kong and Finland. The gap between the top and bottom deciles in England is 264 points, equivalent to almost nine years of schooling. The OECD average is not too much lower, at 247. Singapore and B-S-J-G have a bigger gap than England, as do five other European jurisdictions, but the majority have a smaller gap.
- In reading the 90th percentile in England scores 625 points. Sixteen jurisdictions have a higher score, seven with statistical significance, including Finland, France and Norway. Five further European jurisdictions are ahead of England by a smaller margin. England scored 622 in 2006 and variance since then is not statistically significant. Very few countries have experienced a statistically significant decline. The gap between England’s top and bottom deciles is 254 points, equivalent to eight-and-a-half years of schooling. The OECD average is very similar, at 249 points. Jurisdictions with a bigger gap include Singapore and B-S-J-G as well as 14 European jurisdictions. England is mid-table, since there are also 15 European jurisdictions with a smaller gap.
The National Report does celebrate the fact that England outscores the other home countries on level 5/6 in all three assessments, as well as on top decile performance in science. This is attributable to ‘a sustained decline…over the last decade’ in the other home countries, rather than to any improvement in England.
Characteristics of English pupils achieving the higher proficiency levels
The National Report is rather disappointing in its coverage of this issue.
It shows that 4% of FSM students achieved proficiency levels 5 and 6 in science, compared with 12% of non-FSM students. Percentages are also provided for ‘ever FSM’ (4%) and ‘never FSM’ (13%), but there is no ‘ever 6 FSM’ category, so we do not know what proportion of these high achievers receive the pupil premium on grounds of deprivation.
The breakdown by ethnicity relates only to five broad ethnic categories so it is not possible (as it was with TIMSS) to see the outcomes of Chinese students.
Almost one quarter (23%) of those achieving the higher proficiency levels attended independent schools.
No equivalent analysis is provided for maths or reading.
There is a comparison across jurisdictions of the proportion of students who are relatively disadvantaged yet relatively high achieving in science, according to the PISA ‘resilient student’ measure.
This defines a student as resilient if they are in the bottom quartile of the socio-economic distribution in their jurisdiction (using the OECD’s own index) and in the top quartile of performers in science across all jurisdictions ‘after accounting for differences in socio-economic status across countries’.
Some 36% of disadvantaged students in England qualify as resilient. The OECD average is 29%. Vietnam tops the table with 76% of its disadvantaged students deemed resilient. In Europe only Estonia, Finland and Portugal outscore England. There is a correlation between high mean scores in science and the proportion of resilient students.
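To make the definition concrete, here is a simplified sketch of the classification in Python. It is not the OECD's actual procedure – that uses the ESCS index and adjusts science scores for cross-country socio-economic differences – and all the data below are invented for illustration.

```python
import random

# Simplified sketch of the 'resilient student' classification described
# above: bottom national quartile of a socio-economic index combined with
# the top quartile of science performance. (The OECD's real method uses
# its ESCS index and cross-country adjustments; the data here are fake.)
random.seed(1)
students = [{"ses": random.gauss(0, 1), "science": random.gauss(500, 100)}
            for _ in range(2000)]

def quantile(values, q):
    """Value at quantile q of the sorted list (simple nearest-rank cut)."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]

ses_cut = quantile([s["ses"] for s in students], 0.25)          # bottom quartile
science_cut = quantile([s["science"] for s in students], 0.75)  # top quartile

disadvantaged = [s for s in students if s["ses"] <= ses_cut]
resilient = [s for s in disadvantaged if s["science"] >= science_cut]

share = len(resilient) / len(disadvantaged)
print(f"Resilient share of disadvantaged students: {share:.0%}")
```

With the invented data the two dimensions are independent, so the share comes out near 25%; the point of the real measure is that in high-performing systems it is much higher.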
For more about resilient students see this earlier post – Beware the ‘short head’: PISA’s resilient students’ measure – which predates the PISA 2015 results.
Strong all-round performance
The National Report also reveals that 18.1% of English students achieve higher level proficiency (levels 5 or 6) in at least one of the maths, science and reading assessments.
It supplies the percentages below, showing higher level proficiency in different combinations, which I have reproduced as a Venn diagram.
Chart 6: Venn diagram, percentages of English students achieving proficiency level 5 or 6 in two or more assessments, PISA 2015
A surprisingly large proportion of English students – 4.8% – achieved level 5 or 6 in all three assessments. More students achieved it in maths and science only (2.6%) than in science and reading only (2.0%), while just 0.7% managed it in maths and reading only.
England falls slightly below the OECD average for reading only (2.5% OECD average) and further below for maths only (3.9% OECD average) and reading and maths combined (1.1% OECD average). Conversely, England is well ahead for science and reading combined (OECD average 1.0%).
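Taking the National Report's Venn figures at face value (and assuming the pairwise percentages count students at level 5/6 in exactly two assessments), the share reaching the standard in exactly one assessment can be recovered by subtraction:

```python
# England's high-proficiency Venn figures, PISA 2015 (percentages).
any_assessment = 18.1  # level 5/6 in at least one assessment
all_three = 4.8
exactly_two = 2.6 + 2.0 + 0.7  # maths+science, science+reading, maths+reading

exactly_one = any_assessment - all_three - exactly_two
print(f"Level 5/6 in exactly one assessment: {exactly_one:.1f}%")  # 8.0%
```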
By comparison, according to the OECD’s report, world-leading Singapore has over 39% of its students achieving this standard on one or more assessments, more than twice the rate in England. Some 13.7% of Singapore’s students manage it on all three assessments, almost three times the rate in England.
Singapore far outscores England on maths only (11.4% in Singapore) and science and maths combined (7.7%). It also leads on reading and maths combined (2.0%), but England outscores Singapore on science only, reading only, and science and reading combined.
In Europe Finland is the best performer on the first measure, with 21.4% achieving level 5 or 6 on at least one assessment. It has 6.0% of its students managing a clean sweep, but is edged out by both Estonia and the Netherlands, which each return 6.1%.
Summary of PISA outcomes
This analysis shows that:
- England has almost no chance of achieving the government’s target of being the best in Europe by 2020, as measured by mean scores in the PISA assessments of maths, science and reading. Assuming no improvement elsewhere it needs to add 22 points in science, 26 points in reading and 28 points in maths. England’s only hope is that the current leaders fall back significantly in PISA 2018, while it records double-digit improvements and no other jurisdiction overtakes it. The probability of all three happening is slim.
- When it comes to PISA’s higher proficiency levels, England’s performance has changed relatively little between 2012 and 2015: limited improvement in reading, limited decline in maths and no change in science. Higher achievers have fared worse than lower achievers in maths, but the reverse is true in science and reading.
- Even so, England has been making steady progress in closing gaps between it and Europe’s best performers. Across the board – in all three assessments at both level 6 and level 5 and above – these gaps are smaller than they have been in any of the last four PISA cycles. It would be feasible to reach best in Europe at level 6 in both science and reading by 2020.
- But much of England’s progress is attributable to a falling back elsewhere, and we cannot rely on this continuing. The trends in the percentages of English students achieving these higher levels are not encouraging, with the exception of reading. Continued improvement at level 6 in reading, commensurate with what was achieved in 2015, would close the gap to Europe’s current best performer in 2018. At level 5 and above the gap would be almost halved. In maths particularly, England might be closing gaps at the top, but it is still bested by nine other European jurisdictions at level 6 – and by 13 at level 5 and above including Malta and Portugal. In both maths and science England needs to discover the secret of steady, sustained improvement.
- Compared with the TIMSS national report, the PISA report tells us far too little about the characteristics of students achieving the higher proficiency levels. It provides too little analysis based on domestic measures of socio-economic disadvantage and none using more specific minority ethnic categories, so we cannot see whether domestic Chinese performance far exceeds that of other groups. We should take more interest than we do in the PISA resiliency measure and strong all-round performance across all three measures, especially in the characteristics of the successful students, for the optimal outcome must be a ‘balanced scorecard’ of high performance in all three assessments at all levels of proficiency, including the topmost.
Comparability with TIMSS outcomes
It must be understood that TIMSS and PISA outcomes are not properly comparable:
- The field of participants differs in size and composition, although some jurisdictions take part in all the assessments within both studies.
- TIMSS is conducted with learners in the final terms of Year 5 and Year 9, PISA with learners in the first term of Year 11, and national samples are differently derived.
- The derivation, number and pitch of performance thresholds are all different. These analyses reflect performance on the TIMSS advanced benchmark (625 points on the TIMSS scale) and PISA proficiency levels 6 and 5/6 combined (PISA scale points differ according to the assessment).
- TIMSS and PISA assess different competences, although the distinction is not quite as crude as it is sometimes portrayed. TIMSS foregrounds knowledge, understanding and application of specified content; PISA foregrounds the application of knowledge and understanding to real-life problems.
Consequently England’s TIMSS and PISA results are not likely to coincide, but they might be expected to support a broadly coherent description of our national performance.
Drawing the two strands together
A first and obvious point is that neither TIMSS nor PISA mean scores hold any great comfort for the government. While TIMSS mean scores have improved to some extent (by between four and 11 points), PISA mean scores have remained stable (in one assessment) or fallen slightly (in two).
The trajectory towards ‘best in Europe’ by 2020 is almost certainly unattainable. The last hope is a positive outcome from PIRLS 2016. The government’s targets will be mentioned in future only by its critics. The associated manifesto commitment, which doesn’t mention reading anyway, is a dead duck.
The picture of high attainment is rather more complex.
Looking first at performance in England alone, there is very little evidence of substantive improvement since the last cycle, other than in PISA reading (which might be a positive pointer for PIRLS).
The percentage achieving the Advanced benchmark has fallen slightly or remained unchanged in three of the four TIMSS assessments; the same is true at proficiency levels 5 and above in PISA maths and science.
The trend over the last four cycles is mixed. It is downwards in two of the four TIMSS assessments, significantly so in Y5 science. Ground is also being lost in PISA science; none is being gained in PISA maths, but the trend in PISA reading is more positive. Where there is improvement it is steady rather than spectacular.
There is a marked disparity in how England’s performance compares with our European competitors.
According to PISA, England is steadily closing excellence gaps between it and the best in Europe, though more as a consequence of faltering progress elsewhere. England is not too far away from best in Europe in science and reading, especially at level 6.
But in TIMSS 2015 England is losing ground on its European competitors in three of the four assessments, the exception being Y9 maths, where the gap closed by just one percentage point. The broad trend over the last four cycles of TIMSS is for these gaps to grow on each occasion, whereas the reverse is true in PISA.
It is important to note in passing that the gaps between England and world leader Singapore remain eye-wateringly large on every single assessment across both TIMSS and PISA, despite any slippage in top PISA scores since 2012.
The reasons why European competition is holding up in TIMSS but not so well in PISA are beyond my capacity to explain. But the critical take-away is that any bettering of England’s comparative position cannot be attributed to resounding domestic success. The best that can be said is that more severe backsliding has been avoided.
Given the scale of the present government’s ambition, as revealed by those ill-fated targets, and the sheer breadth of the reform agenda since 2010, this is a disappointing outcome.
For those pursuing the holy grail of causation – the notion that national policy interventions can lead directly to statistically significant improvements in these scores – the only crumb of comfort lies in the argument that the impact of post-2010 policy will not be felt until the next round of assessments in 2018/2019.
But don’t hold your breath! The counter-argument is that they were the wrong interventions, or were insufficiently targeted, or that there were too many at once, all pointing in different directions, not all of them helpful.
From a high attainment perspective especially, I fear the second argument will turn out to trump the first.