I worked this fact out for myself just the other day: with a union-mandated cap on lesson observation of three hours per year, school leaders across England and Wales can only observe 0.34% of what a teacher does in the classroom each year.
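For anyone who wants to check the sum, here is a minimal sketch of the arithmetic. This post gives only the cap (three hours) and the resulting share (0.34%), which together imply roughly 882 taught hours a year. The weekly load and number of teaching weeks in the code are hypothetical inputs chosen for illustration, not figures from the post.

```python
# Back-of-envelope check of the 0.34% figure.
# Only the cap (3 hours) and the share (0.34%) come from the post;
# the timetable used further down is a hypothetical assumption.

OBSERVATION_CAP_HOURS = 3.0   # from the post
QUOTED_SHARE = 0.0034         # 0.34%, from the post


def observed_share(cap_hours: float, taught_hours: float) -> float:
    """Fraction of taught time that can be observed under the cap."""
    if taught_hours <= 0:
        raise ValueError("taught_hours must be positive")
    return cap_hours / taught_hours


def implied_taught_hours(cap_hours: float, share: float) -> float:
    """Annual taught hours implied by a given cap and observed share."""
    if share <= 0:
        raise ValueError("share must be positive")
    return cap_hours / share


# Taught hours per year implied by the two numbers in the post.
implied = implied_taught_hours(OBSERVATION_CAP_HOURS, QUOTED_SHARE)

# Hypothetical timetable: 23 taught hours a week over 38 teaching weeks.
ASSUMED_HOURS = 23 * 38
share = observed_share(OBSERVATION_CAP_HOURS, ASSUMED_HOURS)

print(f"Implied taught hours per year: {implied:.0f}")
print(f"Share observed at {ASSUMED_HOURS} hours: {share:.2%}")
```

Swap in your own timetable for `ASSUMED_HOURS`. Any realistic full-time load keeps the observed share well under half a percent.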

Now I can hear the nitpicking begin, so don’t misunderstand me. I’m aware that there’s more than one way to skin a pedagogue. I’m not counting work scrutiny or learning walks (may they rest in peace). I’m counting the “in the room”, “on your feet”, in-lesson observations. Let’s also bear in mind here that, in many, if not all, schools, Performance Management observations (which typically make up two of the three) are notified and specifically prepped. Can these be representative of the “typical” lessons taught on a day-to-day basis? (This blog by @TeacherToolkit about “Typicality” is brilliant.) Should they be “typical”? These lessons more than likely count towards pay progression evidence, so shouldn’t teachers be given these opportunities to present themselves at their best?

It’s the season of SEF updating and judging the impact of actions held within improvement plans. It’s the season of considering, truly, what impact we’ve had. And a considerable part of that is weighing up results, achievement and progress data against teaching and learning data. And this is where my sticking point comes. If Ofsted quality-assured observers produce a suite of observations which suggest an improvement, then it’s an improvement. Isn’t it? If experienced observers watch well over, let’s say, forty lessons, lessons they’ve both seen before and then watched again and graded, often with an Ofsted inspector or HMI, those findings have to hold some credence. Don’t they?

It’s a terrible realisation. That moment when you realise that there may actually be an absolute truth. A severe fact that may override others that you value and even pride yourself on. Achievement, exam results and progress data are the only way that we can truly judge the quality of teaching in a school. Really it’s an exercise in triangulation.

“Triangulation is a powerful technique that facilitates validation of data through cross verification from two or more sources. In particular, it refers to the application and combination of several research methodologies in the study of the same phenomenon.
– It can be employed in both quantitative (validation) and qualitative (inquiry) studies.
– It is a method-appropriate strategy of founding the credibility of qualitative analyses.
– It becomes an alternative to traditional criteria like reliability and validity.
– It is the preferred line in the social sciences.

By combining multiple observers, theories, methods, and empirical materials, researchers can hope to overcome the weakness or intrinsic biases and the problems that come from single method, single-observer and single-theory studies.”

In essence, at the end of the first half term of year two of SLT, the learning curve continues to take a few sharp turns in directions that, through hubris maybe, I never thought it would go. Here’s where my sticking point was before maths changed my outlook. Take two (or more) teachers and review their teaching and learning data. I’m particularly interested in those (and there are a number of them in a number of schools) for whom good and outstanding (should we be grading still? Pah!) is the norm. Now take their achievement and progress data. I’m especially interested in examination results here – internal assessments, however rigorous, can be porous in a process like this. Now the pattern begins to deviate. In any spread of data, the bell curve will dictate where any of this data then “sits”. In essence, teachers with the same quality of observations could (and do) have diametrically opposed sets of results. In triangulating the data, does one become good or outstanding and the others “require improvement”? Does this one measurement outweigh observation so much that we put a label on a teacher until the next set of data is produced to review the judgement? No. Of course not – this would be demotivating and would negate the processes we use to do no more than “dip test” quality.

As always, some wider reading has helped me to reach some peace with this. The always brilliant Tom Sherrington (@Headguruteacher) and this blog got me thinking about the need for a tone to be set around quality and, of course, David Didau (@LearningSpy), with this blog and this one too, reminded me that any observation is only ever a judgement of what is seen in that moment in time. It can only ever be an indicator.

I’ll be honest, I wrestled with this for a good few weeks. I’ve observed lessons with Ofsted and HMI and pride myself on the accuracy of my judgements. I’m realistic enough to know that achievement will always be a “limiting judgement” in terms of its relationship with quality of teaching but, if I’m honest, I never really saw them as being as closely linked as I did when I saw the number 0.34%.

I may have just completed my stages of grief enough to now be considering how to see more, gauge more, forensically scrutinise more. With a three-hour cap, it’s going to be interesting. All suggestions welcome, as always.