This post is about the new ‘achieving at a higher standard’ headline measure that will now feature in the primary performance tables.

Provisional statistics indicate that only 5% of the 2016 end of KS2 cohort achieved this standard. That is disappointing, even allowing for the substantial impact of curriculum reform and new assessment arrangements.

The commentary below:

  • Explains how this measure is defined and how it will be presented in performance tables
  • Analyses the data published to date about achievement of the higher standard and
  • Discusses some implications for schools and for wider educational policy.

How the higher standard is defined

We have known for some time that the 2016 primary performance tables would include as a headline measure:

‘the percentage of pupils who achieve at a higher standard in English reading, English writing and mathematics’.

To meet this higher standard learners must achieve a high scaled score in the end of KS2 tests of reading and of maths.

They must also be assessed as ‘working at a greater depth in the expected standard’ in writing, as defined by the interim teacher assessment framework.

When the Standards and Testing Agency (STA) confirmed in early July 2016 that the scale would stretch from 80 to 120, with 100 set as the expected standard, it omitted to mention what high scaled score would be required.

The statement of intent for the 2016 performance tables, published in early August 2016, confirmed inclusion of the measure, but did not define it.

Once the parameters of the scale were confirmed, it always seemed probable that the ‘higher standard’ score would be set at 110, exactly midway between the expected standard (100) and the maximum score (120).

That would take in the top quartile of the scale – but not necessarily the top quartile of the attainment distribution.

I did suggest, however, that there is a case for introducing two different standards, one pitched at 110, the other at 115 – with the implication that a similar distinction should be included for writing in the interim teacher assessment framework.

That would support a more differentiated analysis of high achievement, consistent with the white paper commitment to focus more intensively on maximising attainment at the top end. It would also tell us something about the distribution and characteristics of the very highest attaining pupils.

The missing detail was finally confirmed, rather quietly, at the end of August 2016 within the procedures for the data checking exercise.

The final answer in the Q and A section, on the final page of this document, says:

‘The scaled score that pupils need to achieve in the test subjects (reading, mathematics or grammar, punctuation and spelling) to attain the higher standard is 110.’

This was underlined in the updated edition of the Primary school accountability in 2016 document which appeared on 1 September. It describes the expected standard and the higher standard thus:

‘The percentage of pupils achieving the expected standard is a combined measure across the three subjects. To be counted towards the measure, a pupil must have a scaled score of 100 or more in reading and a scaled score of 100 or more in mathematics; and have been teacher assessed in writing as ‘working at the expected standard’ or ‘working at a greater depth in the expected standard’.

The percentage of pupils achieving at a higher standard is also a combined measure across the three subjects. To be counted towards the measure, a pupil must have a ‘high scaled score’ of 110 or more in reading and mathematics; and have been teacher assessed in writing as ‘working at a greater depth within the expected standard’.’
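
The two combined measures quoted above can be expressed as a short rule. The following is a minimal sketch only: the thresholds of 100 and 110 come from the accountability document, but the `Pupil` class, the writing labels and the function names are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass

EXPECTED_SCORE = 100  # scaled score for the expected standard (from the source)
HIGH_SCORE = 110      # 'high scaled score' threshold (from the source)

# Hypothetical labels for the writing teacher assessment outcomes
WRITING_EXPECTED = "expected"
WRITING_GREATER_DEPTH = "greater_depth"


@dataclass
class Pupil:
    reading: int   # scaled score in the reading test (80-120)
    maths: int     # scaled score in the maths test (80-120)
    writing: str   # teacher assessment outcome in writing


def meets_expected_standard(pupil: Pupil) -> bool:
    """100+ in reading and maths, and at least the expected standard in writing."""
    return (
        pupil.reading >= EXPECTED_SCORE
        and pupil.maths >= EXPECTED_SCORE
        and pupil.writing in {WRITING_EXPECTED, WRITING_GREATER_DEPTH}
    )


def meets_higher_standard(pupil: Pupil) -> bool:
    """110+ in reading and maths, and working at greater depth in writing."""
    return (
        pupil.reading >= HIGH_SCORE
        and pupil.maths >= HIGH_SCORE
        and pupil.writing == WRITING_GREATER_DEPTH
    )


def percentage(pupils, predicate) -> float:
    """Percentage of pupils for whom the predicate holds."""
    return 100 * sum(predicate(p) for p in pupils) / len(pupils)


# Invented cohort of four pupils, for illustration only
cohort = [
    Pupil(112, 115, WRITING_GREATER_DEPTH),  # meets both measures
    Pupil(118, 109, WRITING_GREATER_DEPTH),  # maths falls one point short of 110
    Pupil(111, 110, WRITING_EXPECTED),       # writing is not at greater depth
    Pupil(99, 120, WRITING_GREATER_DEPTH),   # reading is below 100
]
print(percentage(cohort, meets_expected_standard))  # 75.0
print(percentage(cohort, meets_higher_standard))    # 25.0
```

The invented cohort shows why the combined measure is so demanding: a pupil who misses the threshold in any one of the three assessments drops out of the headline figure entirely.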

The accountability document also explains that the scaled score necessary to achieve the higher standard was ‘determined solely with reference to the distribution of pupils’ test results to identify the pupils who achieved a high mark on the 2016 tests’ – ie there was no standard-setting exercise.

A similar line is taken in SFR39/2016 – National curriculum assessments: key stage 2, 2016 (provisional) which also appeared on 1 September. It suggests that:

‘A threshold of 110 was chosen to give approximately one-fifth of pupils achieving the high score in each subject. This threshold also has the presentational advantage that it is the mid-point between the expected standard and the maximum scaled score. The threshold for the high score will be confirmed for future years in updates to the technical guidance, but the intention is that it will remain in the same place (110) for a number of years so that changes over time can be measured.’

This appears to suggest that selection of the mid-point was a secondary consideration, rather than the obvious choice. The justification for pitching the threshold so that (very) approximately 20% achieve it in each assessment – as opposed to 5%, 10%, 25% or 33% – is not explained.

The commitment to maintaining this position is not exactly unwavering. The form of words leaves open the possibility that the score may be changed in future if too few learners manage to achieve it. This conflicts sharply with the government’s insistence that the scale and the expected standard will remain unchanged.

This might be indicative of disagreement about where exactly the higher standard should be pitched: the headline that only 5% achieved it may turn out to be politically damaging, particularly if there is negligible improvement in 2017.

How the higher standard will be presented in performance tables

The headline measure and a high standard in individual assessments

According to the statement of intent, the 2016 primary performance tables, scheduled for publication in mid-December, will include for each school:

  • The percentage of learners achieving the higher standard across reading, writing and maths (the headline measure).
  • The percentage of pupils achieving a high standard in each of the three separate tests – and the percentage ‘working at a greater depth’ (WAGD) in the teacher assessment of writing.
  • The percentage of disadvantaged pupils achieving the headline measure – and how this compares with the ‘attainment of other pupils nationally’ – plus the percentage of disadvantaged pupils achieving a high standard on each of the tests and WAGD in writing.

In relation to the comparator for the performance of disadvantaged learners, the statement says:

‘In 2016, the primary performance tables will not include measures of “in-school” performance gaps between disadvantaged pupils and other pupils at the school. The tables will still include measures that report the difference between disadvantaged pupils at the school and other pupils nationally as the most appropriate basis on which to judge schools’ performance. Focusing on in-school gaps risks setting limits on the ability of all pupils to achieve to their full potential, including those identified as disadvantaged. The approach being taken in the 2016 tables will reward schools that set and achieve the highest aspirations for all their pupils.’

The argument is, presumably, that schools might focus on closing gaps by raising the performance of those from disadvantaged backgrounds at the expense of their more advantaged peers, or rest on their laurels once they have closed internal gaps while failing to see the bigger picture.

Since the tables provide each ‘higher standard’ measure for disadvantaged and all learners respectively, it will be straightforward to establish the gap between the disadvantaged percentage and the entire cohort – and the supporting data will reveal the incidence of disadvantage in each school.

I am not convinced that the adjustment is necessary, or likely to have any significant impact on schools’ behaviour.

High prior attainment

In addition the primary performance tables will continue to report outcomes separately for those with low, middle and high prior attainment. For the 2016 end of KS2 cohort, this will depend on KS1 performance in 2012.

One assumes that this calculation is unchanged from 2015, but I cannot yet find confirmation of this.

The quality and methodology document which accompanied SFR47/2015 explains that, for the 2015 cohort:

‘The KS1 average point score includes reading, writing, mathematics and overall science only’

High attaining pupils are defined as those ‘above level 2 at KS1 (KS1 APS >= 18)’.

An accompanying table shows that an old level 2a is worth 17 points, an old level 3 21 points and an old level 4 27 points.

This is not to be confused with the different mechanism for determining prior attainment groupings for new-style progress scores.

The updated ‘primary school accountability in 2016’ document explains that these utilise the same KS1 point scores, but for reading, writing and maths only.

Moreover, the results are weighted 50:50 between English and maths.

This generates 21 different prior attainment groupings. The top five (17-21 inclusive) would be deemed ‘high prior attainment’ according to the performance tables.
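
The difference between the two prior attainment calculations can be sketched as follows. This is an illustration under stated assumptions rather than the Department's published method: it assumes the performance tables figure is a simple unweighted mean of the four KS1 point scores, and that the 50:50 weighting for progress scores averages reading and writing into an English score before averaging that with maths. The function names are hypothetical. The point values used (17 for an old level 2a, 21 for an old level 3) and the threshold of 18 are those quoted above.

```python
HIGH_PRIOR_ATTAINMENT_APS = 18  # 'above level 2 at KS1 (KS1 APS >= 18)'


def performance_tables_aps(reading, writing, maths, science):
    """KS1 average point score across four subjects (assumed unweighted mean)."""
    return (reading + writing + maths + science) / 4


def is_high_prior_attainer(aps):
    """High prior attainment as defined for the performance tables."""
    return aps >= HIGH_PRIOR_ATTAINMENT_APS


def progress_measure_aps(reading, writing, maths):
    """KS1 point score for progress measures, weighted 50:50 English and maths.

    Assumption: English is the mean of reading and writing.
    """
    english = (reading + writing) / 2
    return (english + maths) / 2


# Two invented pupils, each with one old level 3 (21 points) and the rest at 2a (17)
strong_in_maths = dict(reading=17, writing=17, maths=21)
strong_in_reading = dict(reading=21, writing=17, maths=17)

for pupil in (strong_in_maths, strong_in_reading):
    tables = performance_tables_aps(science=17, **pupil)
    progress = progress_measure_aps(**pupil)
    print(tables, is_high_prior_attainer(tables), progress)
```

On these assumptions both invented pupils have the same performance tables score (18.0, so both count as high prior attainers), but the pupil whose level 3 is in maths gets 19.0 on the progress calculation while the other gets 18.0, because maths alone carries half the weight.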

There is a fairly strong case for ensuring consistency between the calculation of these two measures, rather than operating two different models side by side.

The 2016 tables will show for high, middle and low prior attainers:

  • The percentage of learners achieving a high standard (so 110 or above) in each of reading, grammar, punctuation and spelling (GPS) and maths – and the percentage ‘working at a greater depth’ in writing.
  • The average scaled score per pupil in reading, maths and GPS.
  • The school’s overall progress score for reading, maths and GPS.

So it will be possible to compare the progress of high attainers with middle and low attainers. There is a case for performance on this measure to have some currency in decisions about whether schools are coasting or otherwise underperforming.

For some reason the tables will not show the percentage of high, middle and low prior attainers achieving the higher standard headline measure.

It would be very useful to know what proportion of high prior attainers in each school fell short of this standard.

Provisional data on 2016 performance at the higher standard

Some provisional data about achievement of the higher standard in the 2016 cycle was released on 1 September 2016 in SFR39/2016. The subsequent sections offer an analysis.

Headline measure

The bottom line is that, while 53% of learners reached the expected standard, a mere 5% achieved the higher standard.

The precise number is 31,391 learners – 5.39% of all eligible pupils. This includes those attending independent schools who undertook the assessments. (Percentages are given to one decimal place in Chart 1 below.)

Most attention has been given to achievement of the expected standard; there has been very little reaction to achievement of the higher standard, although that is arguably the more alarming outcome.

It is certainly concerning that only just over half of the cohort are operating at the expected level, even allowing for the fact that the curriculum is new and the expected level is significantly more demanding.

But to have only 5.4% of learners meeting the higher standard is worrying when, to all intents and purposes, that denotes the proportion performing consistently in the top quartile of the attainment scale.

Chart 1: Provisional KS2 outcomes 2016: Achievement of expected standard and higher standard headline measures by gender

Chart 1 also illustrates the sizeable gender differences. When percentages are calculated to one decimal place, boys dip under 50% at the expected level, almost eight percentage points behind girls.

Some 6.2% of girls achieved the higher standard, compared with about 4.6% of boys. Of the total of 31,391 learners, 17,701 (56.4%) were girls and 13,690 (43.6%) were boys.

So the gender disparity is much more pronounced at the higher standard.

The statistical tables in the SFR also provide regional and local authority outcomes on the headline measure.

The regional differences are substantial. All regions come in between 4% (North West, Yorkshire and Humberside, West Midlands) and 7% (London). Girls reach 8% in London and boys fall to 3% in Yorkshire and Humberside.

That means girls in London are more than twice as likely to achieve this higher standard as boys in Yorkshire and Humberside.

At local authority level, the headline measure varies between 11.7% (Richmond and Sutton) and 1.7% (Oldham). One is almost seven times more likely to find a pupil at the higher standard in the former than in the latter.

Chart 2 shows performance in the three best and three worst performing local authorities.

Gender differences are pronounced:

  • Girls reach 14.1% in Kensington and Sutton, but are at only 2.0% in Oldham and Portsmouth: one in seven compared with one in fifty.
  • Boys reach 10.1% in Richmond, but are at only 1.3% in Oldham: one in ten compared with almost one in eighty.
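
The ‘times more likely’ and ‘one in N’ comparisons above are simple ratios of the published percentages. A minimal sketch of the arithmetic, with hypothetical helper names, is:

```python
def times_more_likely(higher_pct, lower_pct):
    """Ratio of two percentages, e.g. best versus worst local authority."""
    return higher_pct / lower_pct


def one_in(pct):
    """Express a percentage as 'one in N' pupils."""
    return 100 / pct


# Figures quoted in the text (provisional 2016 data)
print(round(times_more_likely(11.7, 1.7), 1))  # Richmond and Sutton vs Oldham
print(round(times_more_likely(8, 3), 1))       # London girls vs Yorkshire and Humberside boys
print(round(one_in(14.1), 1), round(one_in(2.0), 1))  # girls: highest vs lowest LA
print(round(one_in(10.1), 1), round(one_in(1.3), 1))  # boys: highest vs lowest LA
```

Because the regional figures are published as whole percentages, the regional ratio is more sensitive to rounding than the local authority one.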

Chart 2: Provisional KS2 outcomes 2016: Performance on the higher standard headline measure by gender – top three and bottom three LAs

Individual assessments

Turning to achievement of the higher standard in the individual assessments, the national figures are that 17% achieved the higher standard in maths, as did 19% in reading. In the teacher assessment of writing 15% were assessed as ‘working at a greater depth’.

The SFR does not supply the proportions achieving the higher standard in any two of the three assessments that constitute the headline measure. It would have been useful to have the missing percentages in the Venn diagram below.
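
Although the overlaps are not published, the three single-assessment percentages do constrain where the headline figure can fall. The sketch below is illustrative arithmetic only: the function names are hypothetical, the inputs are the rounded national percentages quoted above, and the independence calculation is a deliberately unrealistic reference point, since performance across the three assessments is obviously correlated.

```python
def headline_bounds(*single_pcts):
    """Lowest and highest possible percentage achieving all the assessments.

    The upper bound is the smallest single percentage. The lower bound
    follows from inclusion-exclusion and cannot fall below zero.
    """
    upper = min(single_pcts)
    lower = max(0.0, sum(single_pcts) - 100.0 * (len(single_pcts) - 1))
    return lower, upper


def headline_if_independent(*single_pcts):
    """Percentage achieving all the assessments if results were unrelated."""
    joint = 1.0
    for pct in single_pcts:
        joint *= pct / 100.0
    return 100.0 * joint


reading, maths, writing = 19, 17, 15  # rounded national percentages, 2016
print(headline_bounds(reading, maths, writing))
print(round(headline_if_independent(reading, maths, writing), 2))
```

On these figures the headline measure could in principle lie anywhere between 0% and 15%, and would be about 0.5% if the three outcomes were unrelated. The reported 5% therefore sits well above the independence figure, but at only about a third of the ceiling set by writing.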

Chart 3: Provisional KS2 outcomes 2016: achievement at a higher standard in each element of the headline measure

The SFR does confirm that 23% of learners achieved the higher standard in the grammar, punctuation and spelling test (which does not contribute towards the headline measure).

In writing, approximately one in five of those assessed as achieving the expected standard were ‘working at a greater depth’. At the other extreme, almost one in three of those achieving the expected standard in GPS also achieved the higher standard. Approximately one in four achieved the higher standard in maths. In reading the proportion was roughly 1:3.5 (or two in every seven).

Gender gaps at the higher standard were:

  • In reading 6 points (boys 16%; girls 22%)
  • In writing 8 points (boys 11%; girls 19%)
  • In maths 3 points (boys 18%; girls 15%)
  • In GPS 9 points (boys 18%; girls 27%)

Chart 4 shows that gender differences were broadly the same at the higher standard and the expected standard, except in maths, where girls matched boys at the expected standard but fell behind at the higher standard.

In general the gender gaps are more pronounced at the higher standard than they are at the expected level, but there are significant gender disparities across all the assessments in English.

Chart 4: Provisional KS2 outcomes 2016: Achievement of expected standard and higher standard in separate assessments by gender

It is instructive to compare performance at the higher standard in 2016 with performance at KS2 L5 and above in 2015, even though the two measures are not directly comparable.

The pattern of performance across the different assessments is broadly similar, even though the 2016 percentages are much lower. I do not have access to national data for performance at KS2 L5b and above in 2015, but this might well be a closer match.

But it is abundantly clear that the ratio between the percentages achieving the aggregate measures in 2015 and 2016 is much bigger than the ratios for the separate assessments.

The pattern of performance on the separate assessments would lead one to expect that the percentage achieving the headline higher standard in 2016 should be roughly double what it is – much nearer 10% than 5%.

Chart 5: Comparing percentages achieving KS2 L5 in 2015 with percentages achieving the KS2 higher standard in 2016

It is also possible to see from the tables the numbers of learners achieving each scaled score in each of the tests.

Chart 6 shows the number of boys and girls who achieved the maximum scaled score of 120 and also a score of 115 or higher on each test.

The scale maximum was achieved:

  • In reading by 6,441 pupils (1.1%) – 2,160 boys and 4,281 girls
  • In GPS by 4,863 pupils (0.8%) – 1,809 boys and 3,054 girls
  • In maths by 2,017 pupils (0.3%) – 1,278 boys and 739 girls.

Many more achieved the highest scaled scores in reading than in maths, reversing the pattern seen in KS2 L6 performance under the old assessment regime. This suggests that the new single test in reading may be a rather more effective assessment instrument for the top of the attainment distribution than the old L6 reading test.

In the teacher assessment of writing, 85,889 learners were ‘working at greater depth’: 32,053 (37.3%) of them boys and 53,836 (62.7%) of them girls. That is broadly in line with the pattern at 115+ in the reading and GPS tests.

Chart 6: Provisional KS2 outcomes 2016 – Numbers achieving 120 and 115+ by gender in each test

The SFR’s statistical tables also give regional and local authority percentages achieving the higher standard on the individual assessments, all broken down by gender.

In terms of overall performance, Richmond reaches 36% in reading, 40% in GPS and 31% in maths. These are the highest percentages returned in each assessment.

However, Richmond manages only 18% on ‘working at greater depth’ in writing, comfortably outscored by Greenwich and South Tyneside, both on 26%.

Amongst local authorities returning the lowest performance levels are:

  • Knowsley (12% reading, 18% GPS, 11% maths, 8% WAGD in writing TA)
  • Bradford (12% reading, 17% GPS, 13% maths, 13% WAGD in writing TA)
  • Doncaster (11% reading, 17% GPS, 12% maths, 11% WAGD in writing TA)
  • Stoke (12% reading, 18% GPS, 11% maths, 10% WAGD in writing TA)
  • Peterborough (11% reading, 15% GPS, 12% maths, 10% WAGD in writing TA)

But the lowest percentage returned for WAGD in writing is 5% in Oldham and West Sussex. These results may have been affected by inconsistent moderation.

Returns for girls vary considerably:

  • Reading – 40% (Richmond) to 12% (Peterborough)
  • GPS – 46% (Kensington and Chelsea) to 18% (Peterborough)
  • Maths – 28% (Kensington and Chelsea, Richmond and Sutton) to 8% (Rutland and Bedford)
  • Writing (WAGD) – 32% (Greenwich and South Tyneside) to 7% (Birmingham, Oldham and West Sussex)

Returns for boys are similarly variable:

  • Reading – 32% (Richmond) to 9% (Knowsley, N E Lincs., Leicester, Stoke, Luton and Thurrock)
  • GPS – 36% (Kensington and Chelsea) to 10% (Middlesbrough)
  • Maths – 34% (Richmond) to 11% (Knowsley and Bedford)
  • Writing (WAGD) – 21% (Greenwich) to 3% (Oldham)

There is an obvious correlation between high scores and areas of comparative advantage – and vice versa – but the variance between the highest and lowest performing local authorities is far too pronounced.

Summary

The key messages from this analysis are:

  • The fact that only 1 in 20 of the cohort achieved the headline measure is a major cause for concern, even when allowance has been made for the impact of curriculum reform, new assessment arrangements and potential inconsistencies in the moderation of teacher assessment in writing. The ‘higher standard’ hurdle in the tests was not set particularly high, at the midpoint between the expected score of 100 and the maximum score of 120. It would be helpful to know which paired combinations of the three contributing assessments were hardest to secure. It is essential to ensure that the highest attaining learners concentrate on their weaker areas as much as their strongest, if not more.
  • This overall outcome masks big differences between the success rates of girls and boys. Some 56.4% of those achieving the headline higher standard are female, so only 43.6% are male. This is a far bigger disparity than at the expected standard (52.5% female; 47.5% male) and exists despite the fact that more boys than girls achieve the higher standard in maths. At the very top of the distribution, girls are very much in the ascendancy in reading and, to a slightly lesser extent, GPS. Boys are in the ascendancy in maths. It is noteworthy (and surprising) that scaled scores of 120 and 115+ were much more prevalent in reading and GPS than they were in maths.
  • There are also big differences between local authorities, on the headline measure and the individual assessments. Some of the least disadvantaged London boroughs have performed particularly well; some of the more disadvantaged northern boroughs have performed particularly badly (but few coastal boroughs appear right at the bottom of the distribution). This reinforces the correlation between advantage and high attainment. It also reinforces the argument that the government’s Achieving Excellence Areas should be helping to close such excellence gaps.

The revised statistics, to be published alongside the performance tables in December, should include full breakdowns of achievement of the higher standard by pupil characteristics, including socio-economic disadvantage.

They will place in sharp relief the unacceptably large size of national excellence gaps at the end of KS2.

This outcome gives the lie to the repeated assertions from Ofsted that, while provision for the most able in non-selective secondary schools is poor, the primary sector has nothing to worry about. It also reinforces the significance of white paper commitments to rectify previous ‘neglect’ of the needs of our most able learners. 

The headline higher standard might prove useful if there is ever a need to select (disadvantaged) high attainers at the age of 11, for the purposes of admission to a school, or into a programme to improve achievement in KS3-5 and subsequent progression to selective higher education.

Selection by attainment is perhaps slightly more palatable than selection by ability, especially at this tender age, even though the gaps between advantaged and disadvantaged are so pronounced.

Separate 11+ ability tests and associated private coaching would be redundant. Primary schools would be rather better placed to secure a level playing field, though the sharp-elbowed middle classes would still try to tutor their way to success.

TD

September 2016