Assessment for Accountability – Taking the Biscuit

Thinking about assessment and accountability again. I adapted this from a letter I wrote to the then Ed Select Committee.

The problem of accountability

If we take it to be the case that teachers and schools need to be ‘held to account’, then we need to ask ourselves some questions.

Held to account for what?

The answer to this is crucial. For a long time we were held to account for pupil ‘attainment’. Recently there has been the reasonable suggestion that many factors outside of our control impact on attainment, and that progress might be a better measure of how good or bad we are. The measurement of progress, nevertheless, remains a massive challenge, in spite of attempts to contextualise it or to use national trends for comparison:

  • baseline data is not reliable (it’s ludicrous to believe that you can use the behaviour of 4-year-olds to derive data that will hold teachers to account at the end of KS2 – and beyond!);
  • pupils do not make standard amounts of progress;
  • the domains at the start and end of the progress measurement are different (GCSE art teachers, beware!);
  • cohorts are different;
  • statistical significance is difficult to establish with small groups.
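The point about small groups can be made concrete with a quick simulation. This sketch (with invented numbers: cohorts of 25 pupils, scores on a 100 ± 15 scale) draws two cohorts from the same population – identical teaching by construction – and shows how big a gap chance alone produces between their mean scores:

```python
import random
import statistics

random.seed(42)

def cohort_mean(n=25, mu=100, sigma=15):
    """Mean scaled score for one cohort of n pupils drawn from the
    same underlying population (i.e. teaching quality is identical)."""
    return statistics.mean(random.gauss(mu, sigma) for _ in range(n))

# Gap between two identically-taught cohorts, repeated many times
diffs = [abs(cohort_mean() - cohort_mean()) for _ in range(2000)]

# With only 25 pupils per cohort, chance alone routinely opens a gap
# of several scaled-score points between equally well-taught classes.
print(round(statistics.mean(diffs), 1))   # typical chance gap
print(round(max(diffs), 1))               # worst-case chance gap
```

On these (made-up) figures the average chance gap is several points and the worst case far larger – exactly the kind of noise that gets read as a ‘good year’ or a ‘bad year’ for a school.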

Whilst ‘progress’ seems at first fairer and preferable to ‘attainment’, neither is sufficient for our purposes – not as currently measured, and not when matched against the aspirations of the National Curriculum.

Does the current system even serve the right purpose?

In the drive to measure ‘attainment’ and ‘progress’, I think we sometimes forget that we are using these things as proxies for the quality of the education being provided. We need to return to the drawing board for how we might ensure this happens. Currently we use an assessment system that cannot do this; the measurement is too narrow, subject to chance variables, and too much driven by fear of failure, leading to all the perverse incentives the assessment experts have been writing about for so many decades. A quality primary education is not ensured by testing the very few items that are currently measured at the end of KS2, any more than a quality factory is ensured by eating one of its biscuits.

I think schools and teachers do want to provide a good education to their pupils and that a climate of fear is unnecessary and counterproductive. Real accountability must involve a move away from looking only at outcomes and focus instead on quality input: we need well-educated teachers with excellent and maintained subject knowledge, quality text books produced by experts and thoroughly vetted by the profession, online materials and required reading, use of evidence and avoidance of fads. Quality training is essential, as is development and retention of teachers with expertise.

How can we ensure accountability where it counts?

Realistically, I don’t expect a quick move away from summative assessments for the purpose of accountability, in spite of all the arguments against. But I feel that we could address some of the issues that arise with the current system by generating and providing (to the DfE if need be) not less but more information:

  • Frequent, low-stakes tests help both teaching and learning – require/provide tests throughout the year, every year.
  • Fine grained, specific tests provide useful information – test what we want to know about – keep that data.
  • Assessing the same domain more than once and in different ways helps to reduce unreliability – do not rely on one single end of year test.
  • Testing earlier in the cycle gives useful feedback for teaching – do not wait until the end of the year or the end of the Key Stage.
  • Random selection from a broad range of criteria helps to reduce ‘teaching to the test’ – test knowledge in all curriculum areas without publishing a narrow list of criteria.
  • Use assessment experts and design assessments that test what we want pupils to know or do. Criteria need to be reasonable – not obscure and mystical as they have been recently.
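The third bullet can be illustrated with a rough simulation. Assuming, purely for illustration, a pupil whose underlying score is 70 and single-test noise of 10 marks, averaging several low-stakes tests shrinks the measurement error considerably compared with a single end-of-year test:

```python
import random
import statistics

random.seed(0)

TRUE_SCORE = 70   # pupil's underlying attainment (hypothetical)
NOISE_SD = 10     # unreliability of any single test occasion (hypothetical)

def observed(k):
    """Average of k noisy test results for the same pupil."""
    return statistics.mean(random.gauss(TRUE_SCORE, NOISE_SD) for _ in range(k))

def error_sd(k, trials=5000):
    """Spread of the measurement error after averaging k tests."""
    return statistics.pstdev(observed(k) - TRUE_SCORE for _ in range(trials))

print(round(error_sd(1), 1))   # one end-of-year test
print(round(error_sd(6), 1))   # six low-stakes tests across the year
```

The error shrinks roughly with the square root of the number of tests, which is the statistical case for frequent, low-stakes assessment over a single high-stakes snapshot.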

If these aspects were applied to an assessment system throughout the primary phase, I believe we could enhance learning, improve accountability in what really matters and provide vast amounts of data.

We really need to make better use of technology at all stages; this is the only way in which we can feasibly make assessment serve multiple purposes. There would need to be a move away from the high stakes pass/fail system which is not fit for purpose, towards a timely monitoring and feedback system that could alert all stakeholders to issues and provide useful tools for intervention. Data collected from continuous low-stakes assessments provides a far more valid picture of teaching and learning.

Ed Select Committee report – improvements to come?

The Education Select Committee has published its report into the impact of the changes to primary assessment. It’s been an interesting journey from the point at which I submitted written evidence on primary assessment; I wrote a blog back in October, where I doubted there would be much response, but in fact I was wrong. Not only did they seem to draw widely from practitioners, stakeholders and experts to give evidence, but the report actually suggests that they might have listened quite well and, more to the point, understood the gist of what we were all trying to say. For anyone who has followed assessment research, most of this is nothing new. Similar things have been said for decades. Nevertheless, it’s gratifying to have some airing of the issues at this level.

Summative and formative assessment

The introduction to the report clarifies that the issues being tackled relate to summative assessment and not to the ongoing process of formative assessment carried out by teachers. For me, this is a crucial point, since I have been trying, sometimes with difficulty, to explain to teachers that the two purposes should not be confused. This matters because the original report on assessment without levels suggested that schools had ‘carte blanche’ to create their own systems. Whilst it also emphasised that purposes needed to be clear, many school systems were either extensions of formative assessment that failed to grasp the implications and requirements of summative purposes, or clumsy attempts to create tracking systems based on data that really had not been derived from reliable assessment!

Implementation and design

The report is critical of the timescale and the numerous mistakes made in the administration of the assessments. It was particularly critical of the STA, which was seen as chaotic and insufficiently independent. Furthermore, it criticises Ofqual for lack of quality control, in spite of Ofqual’s own protestations that it had scrutinised the materials. The report recommends an independent panel to review the process in future.

This finding is pretty damning. This is not some tin-pot state setting up its first exams – how is incompetence becoming normal? In a climate of anti-expertise, I suppose it is to be expected, but it will be very interesting to see if the recommendations have any effect in this area.

The Reading Test

The report took on board the widespread criticism of the 2016 Reading Test. The STA defence was that it had been properly trialled and had performed as expected. Nevertheless, the good news (possibly) is that the Department has supposedly “considered how this year’s test experience could be improved for pupils”.

Well we shall see on Monday! I really hope they manage to produce something that most pupils will at least find vaguely interesting to read. The 2016 paper was certainly the least well-received of all the practice papers we did this year.

Writing and teacher assessment

Teacher assessment of writing emerged as something that divided opinion. On one side were quotes from heads who suggested that ‘teachers should be trusted’ to assess writing. My view is that they miss the point, and I was very happy to be quoted alongside Tim Oates as having deep reservations about teacher assessment. I’ve frequently argued against it for several reasons (even when moderation is involved), and I believe that those who propose it may be confusing the different purposes of assessment, or failing to see that it’s not about ‘trust’ but about fairness to all pupils and an unacceptable burden on teachers.

What is good to see, though, is how the Committee have responded to our suggested alternatives. Many of us referred to ‘Comparative Judgement’ as a possible way forward. The potential of comparative judgement as an assessment method is not new, but is gaining credibility and may offer some solutions – I’m glad to see it given space in the report. Something is certainly needed, as the way we currently assess writing is really not fit for purpose. At the very least, it seems we may return to a ‘best-fit’ model for the time being.

For more on Comparative Judgement, see:

Michael Tidd  The potential of Comparative Judgement in primary

Daisy Christodoulou Comparative judgment: 21st century assessment

No More Marking

David Didau  10 Misconceptions about Comparative Judgement
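For the curious, the mechanics are simple: judges make repeated ‘which is better?’ decisions between pairs of scripts, and a statistical model converts those decisions into a scale. A minimal sketch using the Bradley–Terry model – with invented script names and judgements, not any real dataset or tool – might look like this:

```python
import itertools
import math
from collections import defaultdict

# Each tuple means "the judge preferred the first script over the second".
# Script names and judgements are invented purely for illustration.
judgements = [
    ("anna", "ben"), ("anna", "cara"), ("anna", "dev"),
    ("cara", "ben"), ("dev", "ben"), ("cara", "dev"),
    ("anna", "ben"), ("dev", "cara"),
]

scripts = sorted({s for pair in judgements for s in pair})
strength = {s: 1.0 for s in scripts}

wins = defaultdict(float)
pairs = defaultdict(int)
for winner, loser in judgements:
    wins[winner] += 1
    pairs[frozenset((winner, loser))] += 1

# Light smoothing: pretend every pair also split one extra comparison,
# so a script that never wins still gets a finite strength.
for a, b in itertools.combinations(scripts, 2):
    wins[a] += 0.5
    wins[b] += 0.5
    pairs[frozenset((a, b))] += 1

# Iterative Bradley-Terry fit (Zermelo's algorithm)
for _ in range(200):
    new = {}
    for s in scripts:
        denom = sum(pairs[frozenset((s, t))] / (strength[s] + strength[t])
                    for t in scripts if t != s)
        new[s] = wins[s] / denom
    total = sum(new.values())
    strength = {s: v / total for s, v in new.items()}  # normalise

# Rank scripts by fitted strength (log scale, as in ability measures)
for s in sorted(scripts, key=strength.get, reverse=True):
    print(s, round(math.log(strength[s]), 2))
```

Real systems add refinements such as smarter pair selection and reliability statistics, but the core idea is just this: many quick relative judgements, pooled into a stable rank order, instead of one absolute judgement against a long criteria list.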

Support for schools

The report found that the changes were made without proper training or support. I think this is something of an understatement. Systems were changed radically without anything concrete to replace them. Schools were left to devise their own systems, and it’s difficult to see how anyone could not have foreseen that the results would be inconsistent and often inappropriate. As I said in the enquiry, there are thousands of primary schools finding thousands of different solutions. How can that be an effective national strategy, particularly as, by their own admission, schools lacked assessment expertise? Apparently some schools adopted commercial packages which were deemed ‘low quality’. This, too, is not a surprise.

I know that there are teachers and head-teachers who strongly support the notion of ‘doing their own thing’, but I disagree with this idea and have referred to it in the past as the ‘pot-luck’ approach. There will be ways of doing things that are better than others. What we need to do is make sure that we are implementing the most effective methods, not leaving it to the whim of individuals. Several times, Michael Tidd has repeated that we were offered an ‘item bank’ to help teachers with ongoing assessment. The report reiterates this, but I don’t suggest we hold our collective breath.

High-stakes impact and accountability

I’m sure the members of the Assessment Reform Group, and other researchers of the 20th century, would be gratified to know that this far down the line we still need to point out the counter-productive nature of high-stakes assessment for accountability! Nevertheless, it’s good to see it re-emphasised in no uncertain terms, and the report is very clear about the impact on well-being and on the curriculum.

I’m not sure that their recommendation that Ofsted broadens its focus (again), particularly including science as a core subject, is going to help. Ofsted has already reported on the parlous state of science in the curriculum, but the subject has continued to lose status since 2009, as a direct result of the assessment of the other subjects. What is assessed for accountability has status; what is not, does not. The ASE argues (and I totally understand why) that science was impoverished by the test at the end of the year. Nevertheless, science has been impoverished far more subsequently, in spite of sporadic ‘success stories’ from some schools. This is a matter of record.

Teacher assessment of science for any kind of reliable purpose is even more fraught with difficulties than the assessment of writing. The farce, last year, was schools trying to decide whether they really were going to give credence to the myth that their pupils had ‘mastered’ all 24 of the objectives, or whether they were going to ‘fail’ them. Added to this is the ongoing irony that primary science is still ‘sampled’ using an old-fashioned conventional test. Our inadequacy in assessing science is an area that is generally ignored or, to my great annoyance, completely unappreciated by bright-eyed believers who offer ‘simple’ solutions. I’ve suggested that complex subjects like science can only be adequately assessed using more sophisticated technology, but edtech has stalled in the UK, so I hold out little hope for developments in primary school!

When I think back to my comments to the enquiry, I wish I could have made myself clearer in some ways. I said that if we want assessment to enhance our pupils’ education, then what we currently have is not serving that purpose. At the time, we were told that if we wished to comment further on the problem of accountability, we could write to the Committee, which I did. The constant argument has always been ‘…but we need teachers to be accountable.’ I argued that they need to be accountable for the right things, and that a single yearly sample of small populations in test conditions did not ensure this. This was repeated by so many of those who wrote evidence for the Committee that it was obviously hard to ignore. The following extract from their recommendations is probably the key statement from the entire process. If something changes as a result of this, there might be a positive outcome after all.

Many of the negative effects of assessment are in fact caused by the use of results in the accountability system rather than the assessment system itself. Key Stage 2 results are used to hold schools to account at a system level, to parents, by Ofsted, and results are linked to teachers’ pay and performance. We recognise the importance of holding schools to account but this high-stakes system does not improve teaching and learning at primary school. (my bold)