Got the T-shirt (a moderate tale)

Given that teacher assessment is a nonsense which lacks reliability, and that moderation cannot really reduce this, nor ensure that gradings are comparable, our moderation experience was about as good as it could be! It was thus:

We two Y6 teachers each submitted all our assessments, plus the names of three children in each category (more ridiculous, inconsistent and confusable codes here). From these, one child per category was selected, plus another two from each category at random: nine children from each class. We were told who these nine were a day in advance. Had we wanted to titivate, we could have, but with our ‘system’ it really wasn’t necessary.

The ‘system’ was basically making use of the interim statements and assigning each of them a number. Marking since April has involved annotating each piece of work with these numbers, to indicate which criteria it evidenced. It was far less onerous than it sounds and was surprisingly effective in terms of formative assessment. I shall probably use something similar in future, even if not required to present evidence.
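For the curious, the bookkeeping behind such a system is trivial. A minimal sketch in Python might look like this – the statement texts and numbers here are invented placeholders for illustration, not the real interim statements:

```python
# Hypothetical sketch: each interim statement gets a number, each
# piece of work is annotated with the numbers it evidences, and we
# can then tally evidence per statement for a pupil.

STATEMENTS = {
    1: "uses paragraphs to organise ideas",      # invented placeholder
    2: "uses a range of cohesive devices",       # invented placeholder
    3: "spells most words from the word list",   # invented placeholder
}

def coverage(annotated_work):
    """annotated_work: one set of statement numbers per piece of work."""
    tally = {n: 0 for n in STATEMENTS}
    for piece in annotated_work:
        for n in piece:
            tally[n] += 1
    return tally

pupil = [{1, 3}, {1, 2}, {2, 3}, {1}]
print(coverage(pupil))  # how many pieces evidence each statement
```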

The moderator arrived this morning and gave us time to settle our classes whilst she generally perused our books. I had been sceptical: I had posted on Twitter that though a moderator would have authority, I doubted they’d have more expertise, and I was concerned about arguing points of grammar and assessment. I was wrong. We could hardly have asked for a better moderator. She knew her stuff. She was a Y6 teacher. We had a common understanding of the grammar and the statements. She had made it her business to sample moderation events as widely as possible and had therefore had the opportunity to see many examples of written work from a wide range of schools. She appreciated our system and the fact that all our written work since April had been done in one book.

Discussion and examination of the evidence led, by and large, to an agreed assessment. One child was raised from ‘working towards’; one, whom I had tentatively (and only recently) put forward as ‘greater depth’, was agreed to have not quite made it. The other 16 went through as previously assessed, along with all the others in the year group. Overall, my colleague and I were deemed to know what we were doing! We ought to, but a) the county moderation experience had unsettled us and fed my ever-ready cynicism about the whole business, and b) I know it’s easy to be lulled into a false belief that what we’ve agreed is actually the ‘truth’ about where these pupils are. All we can say is that the three of us roughly agreed. The limited nature of the current criteria makes this an easier task than the old levels did (we still referred to the old levels!), but the error in the system makes it unusable for accountability or for future tracking. I’m most interested to see what the results of the writing assessment are this year – particularly in moderated vs non-moderated schools. Whatever they are, they won’t be reliable but, unfortunately, they will still be used (for good or ill) by senior leaders and other agencies to make judgements about teaching.

Nevertheless, I’m quite relieved the experience was a positive one and gratified and somewhat surprised to have spent the day with someone with sense and expertise. How was it for you?

Final report of the Commission on Assessment without Levels – a few things.

I’ve read the report and picked out some things. This is not a detailed analysis, but more of a selection of pieces relevant to me and anyone else interested in primary education and assessment:

Our consultations and discussions highlighted the extent to which teachers are subject to conflicting pressures: trying to make appropriate use of assessment as part of the day-to-day task of classroom teaching, while at the same time collecting assessment data which will be used in very high stakes evaluation of individual and institutional performance. These conflicted purposes too often affect adversely the fundamental aims of the curriculum,

Many of us have been arguing that for years.

the system has been so conditioned by levels that there is considerable challenge in moving away from them. We have been concerned by evidence that some schools are trying to recreate levels based on the new national curriculum.

Some schools are hanging on to them like tin cans in the apocalypse.

levels also came to be used for in-school assessment between key stages in order to monitor whether pupils were on track to achieve expected levels at the end of key stages. This distorted the purpose of in-school assessment,

Whose fault was that?

There are three main forms of assessment: in-school formative assessment, which is used by teachers to evaluate pupils’ knowledge and understanding on a day-to-day basis and to tailor teaching accordingly; in-school summative assessment, which enables schools to evaluate how much a pupil has learned at the end of a teaching period; and nationally standardised summative assessment,

Try explaining that to those who believe teacher assessment through the year can be used for summative purposes at the end of the year.

many teachers found data entry and data management in their school burdensome.

I love it when it’s my own.

There is no intrinsic value in recording formative assessment;

More than that – it degrades the formative assessment itself.

the Commission recommends schools ask themselves what uses the assessments are intended to support, what the quality of the assessment information will be,

I don’t believe our trial system using FOCUS materials and assigning a score had much quality. It was too narrow and unreliable. We basically had to resort to levels to try to achieve some sort of reliability.

Schools should not seek to devise a system that they think inspectors will want to see;


Data should be provided to inspectors in the format that the school would ordinarily use to monitor the progress of its pupils

‘Ordinarily’ we used levels! This is why I think we need data based on internal summative assessments. I do not think we can just base it on a summative use of formative assessment information!

The Carter Review of Initial Teacher Training (ITT) identified assessment as the area of greatest weakness in current training programmes.

We should not expect staff (e.g. subject leaders) to devise assessment systems, without having had training in assessment.

The Commission recommends the establishment of a national item bank of assessment questions to be used both for formative assessment in the classroom, to help teachers evaluate understanding of a topic or concept, and for summative assessment, by enabling teachers to create bespoke tests for assessment at the end of a topic or teaching period.

But don’t hold your breath.

The Commission decided at the outset not to prescribe any particular model for in-school assessment. In the context of curriculum freedoms and increasing autonomy for schools, it would make no sense to prescribe any one model for assessment.

Which is where it is ultimately mistaken, since we are expected to be able to make comparisons across schools!

Schools should be free to develop an approach to assessment which aligns with their curriculum and works for their pupils and staff


Although levels were intended to define common standards of attainment, the level descriptors were open to interpretation. Different teachers could make different judgements

Well good grief! This is true of everything they’re expecting us to do in teacher assessment all the time.

Pupils compared themselves to others and often labelled themselves according to the level they were at. This encouraged pupils to adopt a mind-set of fixed ability, which was particularly damaging where pupils saw themselves at a lower level.

This is only going to be made worse, however, by the ‘meeting’ aspects of the new system.

Without levels, schools can use their own assessment systems to support more informative and productive conversations with pupils and parents. They can ensure their approaches to assessment enable pupils to take more responsibility for their achievements by encouraging pupils to reflect on their own progress, understand what their strengths are and identify what they need to do to improve.

Actually, that’s exactly what levels did do! However…

The Commission hopes that teachers will now build their confidence in using a range of formative assessment techniques as an integral part of their teaching, without the burden of unnecessary recording and tracking.

They hope?

Whilst summative tasks can be used for formative purposes, tasks that are designed to provide summative data will often not provide the best formative information. Formative assessment does not have to be carried out with the same test used for summative assessment, and can consist of many different and varied tasks and approaches. Similarly, formative assessments do not have to be measured using the same scale that is used for summative assessments.

OK – this is a key piece of information that is misunderstood by nearly everybody working within education.

However, the Commission strongly believes that a much greater focus on high quality formative assessment as an integral part of teaching and learning will have multiple benefits:

We need to make sure this is fully understood. We must avoid formalising what we think is ‘high quality formative assessment’ because that will become another burdensome and meaningless ritual. Don’t get me started on the Black Box!

The new national curriculum is founded on the principle that teachers should ensure pupils have a secure understanding of key ideas and concepts before moving onto the next phase of learning.

And they do mean 100% of the objectives.

The word mastery is increasingly appearing in assessment systems and in discussions about assessment. Unfortunately, it is used in a number of different ways and there is a risk of confusion if it is not clear which meaning is intended

By leading politicians, too. A common understanding of terms is rather important, don’t you think?

However, Ofsted does not expect to see any specific frequency, type or volume of marking and feedback;

OK, it’s been posted before, but it’s worth reiterating. Many senior leaders and headteachers are still fixated on marking.

On the other hand, standardised tests (such as those that produce a reading age) can offer very reliable and accurate information, whereas summative teacher assessment can be subject to bias.

Oh really? Then why haven’t we been given standardised tests and why is there still so much emphasis on TA?

Some types of assessment are capable of being used for more than one purpose. However, this may distort the results, such as where an assessment is used to monitor pupil performance, but is also used as evidence for staff performance management. School leaders should be careful to ensure that the primary purpose of assessment is not distorted by using it for multiple purposes.

I made this point years ago.

Unpicking just one tiny part of interim teacher assessment

We’ve been waiting, but not, I may say, with bated breath. There was no doubt in my mind that the descriptors would be less useful for measuring attainment than a freshly-caught eel. Let’s just look at Reading for KS2 and see how easy it would be to make judgements that would be fair across pupils, classes and schools.

The pupil can:
• read age-appropriate books with confidence and fluency (including whole novels)

  • which books are deemed age-appropriate?
  • define confidence
  • define fluency
  • novels?
  • compare it to the KS1 statement: read words accurately and fluently without overt sounding and blending, e.g. at over 90 words per minute

• read aloud with intonation that shows understanding

  • intonation does not imply understanding. My best orator from last year had no understanding of what he was reading so beautifully.

• work out the meaning of words from the context

  • how is this an end-of-KS2 requirement? This is what we do from the moment we start to read.

• explain and discuss their understanding of what they have read, drawing inferences and justifying these with evidence

  • again – how do we extract the end-of-KS2 requirement from this? It could apply to Year 1 or PhD level.
  • compare the KS1 requirement: make inferences on the basis of what is said and done

• predict what might happen from details stated and implied

  • again – an end-of-KS2 requirement?
  • compare to KS1 working at greater depth: predict what might happen on the basis of what has been read so far

• retrieve information from non-fiction

  • again – an end-of-KS2 requirement? To what extent? What level of non-fiction? What type of information? In what way? If a child cannot retrieve information from non-fiction, they are operating at a very much lower level than the end of the key stage.

• summarise main ideas, identifying key details and using quotations for illustration

  • to what extent? Again, this is also a degree-level requirement

• evaluate how authors use language, including figurative language, considering the impact on the reader

  • to what extent?

• make comparisons within and across books.

  • what comparisons? ‘This book has animals and this book has machines.’
  • KS1 greater depth: make links between the book they are reading and other books they have read

I feel as if I’ve been arguing for many years against the strength of mythological belief in the wonders of teacher assessment. Fortunately, it looks like, at long last, there is some recognition in this report that it cannot be used where reliability is an issue, e.g.

Some types of assessment are capable of being used for more than one purpose. However, this may distort the results, such as where an assessment is used to monitor pupil performance, but is also used as evidence for staff performance management. School leaders should be careful to ensure that the primary purpose of assessment is not distorted by using it for multiple purposes. (p 24)

and the attempt to create assessment statements from the national curriculum objectives is just one clear reason why that is true. Mike Tidd suggests that we are heading towards the demise of statutory teacher assessment used in this way. Good, because it’s been a nightmare we should be happy to wake from!

Awaiting some ministerial decisions

What a joke! This from them today:

Changes to 2016 tests and assessments

We are aware that schools are waiting for additional information about changes to the national curriculum tests and assessments to be introduced for the next academic year. We are still awaiting some ministerial decisions, in particular in relation to teacher assessment. We will let you know in September, as more information becomes available.

Only they’re not kidding. Mike Tidd comments on the same here, but I was unrealistically (and uncharacteristically) optimistic that something would come out before we had to have everything in place in September. Should we laugh or tear our hair out that they are ‘awaiting ministerial decisions’? What – the ministers haven’t been able to decide after 2 years? I won’t hold my breath for anything sensible then. Of course, ‘teacher assessment’ should be a matter for serious consideration, but I doubt that their decisions are delayed for the types of reservations I have on the matter. Whilst it seems to have become the global panacea for all assessments that are too complex to manage, I keep banging on about how inappropriate and unreliable it is. If we are to expect pupil attainment to be a criterion for teacher appraisal and progression, then how can we possibly expect teachers to carry out that assessment themselves? That would be wrong, even if we had extremely reliable tools with which to do it, but we don’t. We have nothing of the sort and we never will have, as long as we assess by descriptive objectives.

So what do I really want? Well, to be honest, although I believe in the essential role of testing within learning, I really want to stop assessing attainment in the way it has become embedded within English culture. It’s a red herring and has nothing to do with education. I never thought I’d say that – I always had highly ‘successful’ results under the old levels system – but I’m very much questioning the whole notion of pupil attainment as currently understood. It’s based on a narrow set of values which, in spite of all the rhetoric of ‘closing the gap’, are never going to be brilliantly addressed by all pupils. That’s an inescapable statistical fact. And why should they be? Attainment is not the same as education, in the same way that climbing the ladder is not the same as being equipped to make informed decisions.

But if we must, then give us all the same tools – the same yardstick. At the end of Year 6, all pupils will be assessed by written tests for maths, reading, spelling and grammar, and their results will then be effectively norm-referenced (after a fashion). Do that for all the year groups. I’d prefer it if we moved into the latter half of the 20th century in terms of the effective use of technology, but even an old Victorian-style paper is better than the vague nonsense we are currently working with.
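To illustrate what I mean by the same yardstick: norm referencing simply places each raw score relative to the whole cohort. Here’s a toy sketch in Python – illustrative only, and nothing like the real scaled-score calculation:

```python
# Toy norm referencing: convert raw test scores into percentile
# ranks within the cohort, so every pupil is measured on the same
# scale regardless of how hard the particular paper was.

def percentile_ranks(raw_scores):
    """Percentile rank (0-100) of each score within the cohort."""
    ordered = sorted(raw_scores)
    n = len(ordered)
    rank_of = {}
    for score in set(raw_scores):
        below = sum(1 for s in ordered if s < score)
        rank_of[score] = round(100 * below / n)
    return [rank_of[s] for s in raw_scores]

cohort = [12, 25, 25, 31, 40, 47, 47, 55]
print(percentile_ranks(cohort))
```

The point is that the meaning of a pupil’s result comes from the distribution, not from anyone’s interpretation of a vague descriptor.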

So, anyway, as it stands, are we justified, when Ofsted visits in autumn 2015, in having an assessment system in disarray, or are we supposed to have sorted it all out even though they haven’t?

Can we ditch ‘Building Learning Power’ now?

Colleagues in UK primary schools might recognise the reference: ‘Building Learning Power’, another bandwagon that rolled by a few years ago. As ever, many leaped aboard without stopping to check exactly what the evidence was. Yes, there did appear to be a definite correlation between the attitudinal aspects (‘dispositions’ and ‘capacities’) outlined in the promotional literature and pupil attainment, but sadly few of us seem to have learned the old adage that correlation does not necessarily imply causation. Moreover, we were faced with the claim that it ‘has a robust scientific rationale for suggesting what some of these characteristics might be, and for the guiding assumption that these characteristics are indeed capable of being systematically developed’. And who are we, as the nation’s educators, to question such an authoritative basis as a ‘robust scientific rationale’ (in spite of the apparent lack of references)?

So, instead of simply acknowledging these characteristics, we were expected somehow to teach them, present assemblies on them and unpick them to a fine degree. It didn’t sit comfortably with many of us – were we expecting pupils to use those dispositions and capacities whilst learning something else, or were we supposed to teach them separately and specifically? When planning lessons, we were told to list the BLP skills we were focussing on, but we were confused. It seemed like we would always be listing all the skills – inevitably, since they were the characteristics which correlated with attainment. But still, teachers do what they’re told, even if it ties them up in knots sometimes.

So it is with interest I came across this piece of research from the USA:

Little evidence that executive function interventions boost student achievement

As I’m reading, I’m wondering what exactly ‘executive function’ is and why I haven’t really heard of it in the context of teaching and learning in the UK. But, as I read on, I see that it is ‘the skills related to thoughtful planning, use of memory and attention, and ability to control impulses and resist distraction’, and it dawns on me that this is the language of BLP! So I read a little more closely and discover that, in a 25-year meta-analysis of the research, there is no conclusive evidence that interventions aimed at teaching these skills have had any impact on attainment. To quote:

“Studies that explore the link between executive function and achievement abound, but what is striking about the body of research is how few attempts have been made to conduct rigorous analyses that would support a causal relationship,” said Jacob [author]

The authors note that few studies have controlled for characteristics such as parental education, socioeconomic status, or IQ, although these characteristics have been found to be associated with the development of executive function. They found that even fewer studies have attempted randomized trials to rigorously assess the impact of interventions.

Not such a robust scientific rationale, then? Just to be clear: lack of evidence doesn’t mean there isn’t causation, but isn’t that exactly what we should be concerned with? This is only one of a multitude of initiatives that have been thrown our way in the past decade, many of which have since fallen into disuse or become mindlessly ritualised. We are now led to believe, however, given the catchphrase bandied about by government ministers and a good degree of funding through bodies such as the Education Endowment Foundation, that there is an increased drive for ‘evidence-based education’, which of course raises the question: what has been going on – what exactly has underpinned the cascade of initiatives – up to this point?
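The statistical trap is easy to demonstrate. In this toy simulation (entirely made-up numbers), a hidden factor – call it parental support – drives both an ‘executive function’ score and attainment; the two correlate strongly despite neither causing the other:

```python
# Two measures driven by the same hidden factor correlate strongly
# even though neither causes the other.
import random

random.seed(42)

def correlation(xs, ys):
    """Pearson correlation coefficient, computed by hand."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

hidden = [random.gauss(0, 1) for _ in range(1000)]       # e.g. parental support
exec_fn = [h + random.gauss(0, 0.5) for h in hidden]     # 'executive function'
attainment = [h + random.gauss(0, 0.5) for h in hidden]  # test results

print(round(correlation(exec_fn, attainment), 2))
```

A strong correlation (around 0.8) appears here purely by construction, with no causal link between the two measures at all – which is exactly why controlling for confounders matters.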

Shouldn’t we just say ‘no’?

I’m beginning to wonder why we are playing their game at all. Why are we not questioning the basis for the assumptions about what children should know and be able to do by whatever year, as prescribed in the new curriculum and in the soon-to-be-published, rapidly cobbled-together waste of time and paper that is the new set of ‘descriptors’? Have they based these on any actual research, other than what Michael Gove dimly remembered from his own school days?

We recently purchased some published assessments – partly, I’m sorry to say, on my suggestion that we needed something ‘external’ to help us measure progress now that levels no longer work. It wasn’t what I really wanted – I favour a completely different approach involving sophisticated technology, personal learning and an open curriculum, but that’s another long story and a potential PhD thesis! Applying these assessments, though, is beginning to look unethical, to say the least. I’ve always been a bit of a fan of ‘testing’ when it’s purposeful, aids memory and feeds back at the right level, but these tests are utterly demoralising for pupils and staff, and I’m pretty sure that’s not a positive force in education. I’m not even sure that I want to be teaching the pupils to jump through those hoops that they’re just missing; I strongly suspect they are not even the right hoops – that there are much more important things to be doing in primary school which are in no way accounted for by the (currently inscrutable) attaining/not attaining/exceeding criteria of the new system.

So what do we do when we’re in the position of being told we have to do something that is basically antagonistic to all our principles? Are we really, after all this time, going to revert to telling pupils that they’re failures? It seems so. Historically, apart from the occasional union bleat, teachers in England have generally tried their best to do what they’re told, as if, like the ‘good’ pupils they might have been when they were at school, they believe and trust in authority. Milgram would have a field day. Fingers on buttons, folks!

Another pointless consultation

The DfE are apparently ‘seeking views on draft performance descriptors for determining pupil attainment at the end of key stages 1 and 2’.

They have previously ‘sought views’ on the draft national curriculum and the assessment policy, which they acknowledged and then proceeded to largely ignore. I should imagine this will be no different. Needless to say, I still responded, as I did with the others, if only for the opportunity to point out how vague and meaningless their descriptors are.

My response in brief:

It is really important that you remove all vague terminology, such as ‘increasing’, or ‘wider’. In removing levels, you acknowledged the unreliability of the system and difficulty faced by teachers in agreeing levels. This document falls into the same trap. It would be far better to provide examples of what was expected at each key stage (and in each year), than these vague descriptions, some of which could apply to any level of study (Reception to post-doctoral). Many teachers have worked for years on helping colleagues to understand exactly what was required to show a pupil’s attainment, and in one fell swoop, the new curriculum has demolished all that work without replacing it with anything effective. Give us a standardised set of concrete examples and explanations (not exemplars of pupils’ work), along the lines of those provided by Kangaroo Maths (when we were grappling with what the levels represented in the old curriculum). Give us some e-assessment software that will allow us to quickly determine and collate this information.

I did also want to say, ‘Give us some mid-20th-century textbooks, since that’s obviously the source of your “new” curriculum.’ In actual fact, this isn’t just a bitter jibe. A textbook would at least guide us through the current morass. We could really do with some clarity and consistency. I suggest a state-of-the-art information source written by actual experts, rather than the range of opportunistic publications that will be cobbled together by commercial companies ill-prepared to jump on this latest bandwagon.