
Founding a religion

December 30, 2018

Sleep or water?

Hugo Williams’ lovely poem Religion, from his 2009 collection West End Final, imagines setting one up. It’s a gentle homage to Philip Larkin’s even finer Water (1954). Isn’t Larkin’s “If I were called in to construct a religion” a great opening?

Religion by Hugo Williams

If it were up to me
I would make use of sleep.
Going to church
would involve a flight of stairs
to a familiar bedroom,
where a broken alarm clock told the time.
The spreading of sheets,
the turning down of blankets,
would be followed by the drawing of curtains
in broad daylight,
the ritual of undressing.

Members of my religion
would be encouraged to sleep in
on Monday mornings
and any other morning they felt like it,
with no questions asked.
Sleep notes would be provided.
Couples would be authorised
to pull the covers over their heads
and spend their days tucked up
in cosy confessionals,
where all their sins would be forgiven.

Water by Philip Larkin

If I were called in
To construct a religion
I should make use of water.

Going to church
Would entail a fording
To dry, different clothes;

My liturgy would employ
Images of sousing,
A furious devout drench,

And I should raise in the east
A glass of water
Where any-angled light
Would congregate endlessly.



Nordic birth role models

December 24, 2018

Finnish grips, movement worries & induction

Nordic pregnancy and birth care is second to none. Midwives are well trained, and respected, and they keep accurate statistics. As a result maternal and perinatal mortality fell well before it did in the rest of Europe. A Norwegian, Christian Kielland (click here) made the only important contribution to forceps design since Smellie introduced the pelvic curve. A Swede, Tage Malmstrom (click here), invented the first workable ventouse. Iceland leads in genetic studies. Finnish obstetricians realised the importance of antisepsis before Semmelweis (click here) and were the first to notice that so-called “benign” cholestasis might not be benign after all (click here). I’m a fan.

Anglo Saxon obstetricians and midwives in contrast were slow to embrace evidence-based medicine, for which they were famously awarded the wooden spoon by Archie Cochrane in 1979 (click here). But we’ve improved. In England today it would be almost unthinkable to introduce a new way to deliver the head in the hope that it reduced vaginal tears. Imagine if obstetricians got the idea that encouraging women to call in urgently after the slightest alteration in the way their baby moved, or that inducing birth a week early, reduced stillbirths. Enthusiasts would be shown the relevant Cochrane reviews (click here, here and here), told to find out if the issue mattered to patients (click here) and if so, to test their hypotheses in randomised controlled trials. Claims to have discovered better ways to give birth or manage pregnancy, on the basis of historical or non-randomised controls, are met with derision in the UK. And that’s as it should be. Pregnant women have suffered enough from good ideas introduced with the best intentions.

But wait. Some Finns and Norwegians (click here) recently claimed that pressing on the mother’s perineum during birth using the “Finnish grip” (click here) prevented anal sphincter tears. Swedes, who let the head crown in the usual way, beat themselves up for not doing it (click here). Another group of Norwegians told women to report urgently, day or night, the slightest change in their baby’s movements and claimed to prevent stillbirths (click here). The Danes argued that a doubling of induction had done the same (click here).

And a funny thing happened. Instead of reminding the “Finnish grippers” that the rather similar HOOP (Hands On Or Poised) trial (click here) had shown no effect on perineal tears, and the “movement change worriers” that a large fetal movement counting trial had shown no effect on stillbirths (click here), the Royal Colleges of Obstetricians, and of Midwives, pushed both interventions as part of their “Obstetric Anal Sphincter Injury” (OASI) (click here) and “Saving Babies’ Lives” (click here) care bundles. They didn’t need to push induction. UK induction rates soared without help. The ideas had come from Scandinavia, and normal critical faculties switched off.

Fortunately not everyone was so gullible. Jane Norman and her colleagues tested the “make everyone worry about any alteration in fetal movements” hypothesis and showed that it was not only ineffective but harmful (click here and here). Now we’ve just gotta figure out how to stop doing it!

Bill Grobman and his colleagues in the US tested the “induce everyone” hypothesis and found that it didn’t reduce bad baby outcomes as much as hoped, although surprisingly, it appeared to reduce Caesareans (click here). More trials needed.

No-one seems to be doing a decent trial of the “Finnish grip” yet, although the OASI care bundle is getting pushback from midwives on Twitter. Forgive the pun! Hopefully someone will do one soon.

Nordic obstetrics is great. But we should judge their ideas by the standards we require of anyone else.

Jim Thornton

Congratulations on your new baby

December 11, 2018

Now we do a rectal examination

The OASI (Obstetric Anal Sphincter Injury) Care Bundle, endorsed by the Royal Colleges of both Obstetricians & Gynaecologists and Midwives (click here and here) recommends a digital rectal examination after every normal birth “even if the perineum appears intact”. The idea of such an intrusive procedure, at such a sensitive time, is to diagnose unrecognised external anal sphincter tears. But it makes no sense.

The external anal sphincter is a ring of muscle, about as thick as a finger, which maintains continence. If there is a perineal tear or episiotomy, the sphincter is usually exposed, and a trained midwife can either see a tear directly, or palpate it.  Such injuries affect 2-5% of vaginal births, and examining for them in the presence of a perineal tear is uncontentious, because immediate repair is recommended.

In contrast, in the absence of a visible injury, anal sphincter damage is rare. One group (click here) found no cases among 291 women examined later by endoanal ultrasound. (Seven percent had evidence of a “non intact” internal sphincter, but that is a thin layer of impalpable muscle fibres adjacent to the anal mucosa; no-one thinks you can diagnose damage to that by rectal examination.)

Moreover, if the perineum is intact, palpation is an imprecise way to diagnose sphincter injury, because skin and the transverse perineal muscles lie in the way. Few midwives or doctors are trained to identify an anal sphincter tear when the perineum is intact.

And what is the midwife supposed to do if she diagnoses this rare injury in a woman with an intact perineum? Call a doctor to incise the skin and repair the damaged sphincter? I don’t think so! Thirty years ago, I’m told, an idiosyncratic consultant in Leeds, who believed in the existence of occult sphincter damage, occasionally did just that, but he was a poor role model, ending his career in disgrace following an unrelated conviction for sexually assaulting his patients.

Pelvic floor exercises are recommended anyway, and there’s no point in bringing women back for extra postnatal examinations, because no-one recommends late surgery in the absence of symptoms.

Routine rectal examination in the presence of an intact perineum fails all the criteria of a useful screening test. Most midwives wisely don’t do it. Those that do, should stop.

Jim Thornton


Philip Larkin’s Koan

September 22, 2018

By Paisley Rekdal

Larkin never wrote a villanelle, but he featured in this one. Nor, though he wrote so much else about death, did he ever say that in a perfect universe we’d all be dead. It is the sort of thing he might have said, but death also terrified him. The koan, I guess, is a riddle without a solution, like the sound of one hand clapping.

Philip Larkin’s Koan

In the perfect universe of math it’s said
the world’s eternal aberration.
In fact, we should be less than dead,

math itself disrupted for matter ever to be read
as real. A thought so hard to fathom that The Nation
in its article on math has said

we lack the right imagination: the human head
will not subtract itself from the equation,
zero out the eager ego to be less than dead.

Did the numbers hunger for mistake, for fun upend
themselves to recalculate our infinite extinction?
And was existence meant for all, since it could be said

without our numbers others might have thrived:
the black rhinoceros, shortnose sturgeon—?
Articles of horn and scale both less and more than dead,

figurative dreams that now haunt us in our beds.
Memory’s another flaw in our equation. Was it The Nation?
I forget. Regardless, I know that someone said
in a perfect universe, we’d all be dead.

By Paisley Rekdal

Twenty Yards Behind

September 11, 2018

By Hugo Williams

In 1975 Wilko Johnson wrote “Twenty Yards Behind” for his band Dr Feelgood (click here).

“I’m walking twenty yards behind her/cause I love the way she shakes behind”

In 2014 Hugo Williams included this villanelle in his collection “I Knew the Bride” (click here)

All those things men find so intense,
watching her walking from twenty yards behind,
women take as the most tender nonsense.

Their appreciation isn’t a pretence,
but they couldn’t care less what kind
of strange things we find so intense,

so long as we enjoy the performance
and what we place in their hand
isn’t just some tender nonsense.

If we knew their true response,
as they threw their limbs around,
to all the things we find so intense,

we might experience detumescence,
but at least we would understand
why they talked such tender nonsense.

With a greater degree of correspondence
we might not like what we found.
All those things men find so intense
women take as the most tender nonsense.

Hugo Williams





The Embedding Formative Assessment (EFA) trial

July 31, 2018

Quality randomised education trials are possible, but they deserve better reporting


Embedding formative assessment in schools “probably” raises GCSE scores, but only by a little bit

Good teachers continually check their pupils’ grasp of a topic so that their teaching can be tailored to progress. Not got it, try again. Got it, move on. This sort of formative assessment also helps pupils direct their own study appropriately. It may consist of anything from asking an individual student a question or marking a piece of homework, to class-wide mock exams or quizzes.

Most teachers know this is a good thing and try to do it regularly, but education expert Dylan Wiliam (click here) is a super enthusiast; he’s written books and study guides to help teachers do it better (click here) and travels the world extolling its importance.

Last month the Education Endowment Foundation (EEF) (click here), an outfit which among other things tries to put education on a firm evidence base, published the results of a randomised trial of Wiliam’s programme. They claim it works:

“Students whose teachers were trained in this approach made two months more progress than a similar group of pupils whose teachers did not receive the intervention. The findings have a very high level of security as it was a large and well-run trial, which means we can be confident in the results.”

But the EEF has not always covered itself in glory with its trial reports. A couple of years ago it drew howls of derision, from me and others, for falsely claiming that a negative trial of teaching philosophy to primary pupils improved their mathematical skills (click here). So let’s take a close look.

Appropriately for a teacher/classroom-based intervention, it was a cluster trial with schools as the unit of randomisation. The main report (click here) is turgid and repetitive, but I’ve read it, so you don’t have to. Here’s what happened.

Population – 140 UK secondary schools during the 2015/16 & 16/17 academic years.

Intervention – Each school implemented Dylan Wiliam’s Embedding Formative Assessment (EFA) programme. They got his EFA pack, a day’s training from the man himself, and ongoing support from the Schools, Students and Teachers (SSAT) network.

Control  – Each school got a one-off payment of £300, but otherwise carried on with ‘business as usual’, with no restrictions on how they took forward formative assessment.

Outcome – The pupils’ “Attainment 8 GCSE scores” calculated from their top eight GCSEs, each graded from 1-9, with maths and English counting double. Max score = 90. Pupils who took fewer than eight subjects scored zero for each unfilled slot.

Planned sample size – 120 schools, 60 per group, with 100 students per cluster (school), and assuming an intra-cluster correlation of 0.2, was judged to have 80% power at the 0.05 significance level to show an improvement of 0.2 standard deviations in the mean “Attainment 8 GCSE scores”. This 0.2 SD effect size was judged by the EEF advisory panels “to be an acceptable level of improvement from a policy perspective to roll out the intervention more widely”. In a medical trial this would be the minimum clinically important difference (MCID). I guess we could call it the minimum educationally worthwhile difference (MEWD).
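The outcome measure in the list above is simple enough to sketch in code. This is a rough illustration of the scoring as described here, not the official calculation: the function name and its arguments are made up, and the real Attainment 8 has extra rules about which subjects may fill which slots, which are ignored.

```python
def attainment8_sketch(maths, english, other_grades):
    """Simplified Attainment 8 score as described in the text.

    maths, english: GCSE grades (1-9, or 0 if not taken); each counts double.
    other_grades:   grades in any other subjects; only the best six count,
                    and any unfilled slot scores zero.
    """
    best_six = sorted(other_grades, reverse=True)[:6]
    best_six += [0] * (6 - len(best_six))  # pad empty slots with zeros
    return 2 * maths + 2 * english + sum(best_six)


# A pupil with straight 9s hits the maximum of 90.
print(attainment8_sketch(9, 9, [9, 9, 9, 9, 9, 9]))
# A pupil with only four subjects is penalised for the empty slots.
print(attainment8_sketch(5, 4, [6, 3]))
```

The zero-padding is the point worth noticing: a school that enters pupils for fewer subjects will score lower on this outcome whatever happens in the classroom.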

The trial recruited 140 schools, an additional 20 to allow for attrition. However, although twelve intervention schools eventually gave up on the programme for one reason or another, excluding them from the intervention arm would have biased the results, so the final analysis was of all 140 randomised schools, analysed by “intention to treat”.
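For a rough feel for what those numbers buy, here is a textbook cluster-trial calculation using only the figures quoted above (100 pupils per school, intra-cluster correlation 0.2, 80% power, two-sided 0.05 significance). It is a sketch under a normal approximation with no adjustment for baseline attainment. The function names are invented, and the evaluators’ own calculation may have used further assumptions not given here, so it need not reproduce their 0.2 SD figure exactly.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from randomising whole schools rather than pupils."""
    return 1 + (cluster_size - 1) * icc


def minimum_detectable_effect(schools_per_arm, cluster_size, icc,
                              alpha=0.05, power=0.80):
    """Smallest standardised difference detectable with the given power.

    Assumes two equal arms, equal cluster sizes, a two-sided test and
    no covariate adjustment (all simplifying assumptions).
    """
    z = NormalDist()
    effective_n = schools_per_arm * cluster_size / design_effect(cluster_size, icc)
    standard_error = sqrt(2 / effective_n)  # SE of a difference in SD units
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * standard_error


for schools in (60, 70):  # planned and recruited schools per arm
    mde = minimum_detectable_effect(schools, cluster_size=100, icc=0.2)
    print(f"{schools} schools per arm: detectable effect about {mde:.2f} SD")
```

With these inputs the unadjusted figure comes out a little above 0.2 SD for both 60 and 70 schools per arm. With an intra-cluster correlation as high as 0.2, the number of schools matters far more than the number of pupils per school.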

The trial wasn’t registered but there’s a protocol from 2016 (click here), and an undated statistical analysis plan (click here). I couldn’t find any major outcome switching, or other risks to the trial integrity. A lot of effort was made to achieve and measure fidelity to the programme, but it turned out that many intervention schools adapted it in unanticipated ways. A planned subgroup analysis of high fidelity schools was eventually judged impracticable, so the trial was a pragmatic test of the effect of implementing EFA in the real world. Schools were randomised within blocks, based on similar GCSE scores and proportions of pupils eligible for free school meals, so the trial groups ended up balanced on these factors.

There is a CONSORT flow diagram, and table 4 shows the trial groups were indeed well balanced at baseline.



The trial was negative. The mean intervention group score was only 0.1 standard deviation higher, a difference that could have occurred by chance (P=0.088) and that missed the conventional, and pre-specified, level of statistical significance (P<0.05). However, the 95% confidence interval around the effect size went from -0.01 to 0.21, so they had just failed to rule out their pre-specified minimum worthwhile effect of a gain of 0.2 standard deviations (table 5). For those who are unfamiliar with education trials expressing their results this way, the second column from the right, labelled Hedges g, is the difference in mean scores between the trial groups, measured as a proportion of a standard deviation, together with its 95% confidence interval. The right hand column is the P value, the probability that a difference as large or larger than the observed one would have occurred by chance if the treatment had no effect. In summary a negative trial but, as it turned out, also a slightly under-powered one.
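The link between the confidence interval and the P value can be checked on the back of an envelope. The sketch below assumes a normal approximation and works only from the rounded figures quoted above (estimate 0.1, interval -0.01 to 0.21), so it will not reproduce the report’s own P value exactly. The function names are invented for illustration.

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Back out the standard error from a symmetric confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def two_sided_p(estimate, standard_error):
    """Two-sided P value for 'no effect' under a normal approximation."""
    z = abs(estimate) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


lower, upper, estimate = -0.01, 0.21, 0.10  # rounded values from the text
se = se_from_ci(lower, upper)
p = two_sided_p(estimate, se)

print(f"standard error about {se:.3f} SD, approximate P = {p:.3f}")
print("zero effect ruled out:", not (lower <= 0 <= upper))
print("0.2 SD effect ruled out:", not (lower <= 0.2 <= upper))
```

Both checks come back False, which is the whole story in two lines: the data are compatible with no effect, and also with the effect the planners said would be worthwhile.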


Sub-analyses, among children eligible for free school meals, and those scoring in the lower tercile (Table 6) showed the same non-significant 0.1 SD higher scores as in the overall sample, and there was negligible effect in the upper tercile pupils, or on English and maths scores (table 7).  TEEP is the Teacher Effectiveness Enhancement Programme, another intervention applied in some schools. The analysis of non TEEP schools was exploratory.

So how come EEF claims it works?

It appears that the report authors (who do not include Dylan Wiliam) quietly decided, after the results were in, that a level of 10% significance was OK, and that an improvement of only 0.1 standard deviation would be worthwhile after all. Hey presto! The result is positive.

Cheeky eh?  Imagine big pharma getting an effect size of half what they had pre-specified as the minimum clinically worthwhile difference, and a P value of 0.088. Imagine if they then announced not only that the smaller effect size was worthwhile after all, but that P <0.1 was what they had been aiming for all along. Doctors would be sceptical.

Or perhaps not. Imagine if the drug manufacturer was working in a difficult field where there was hardly any evidence that anything worked, where all previous trials had been tiny, fatally flawed, or worse, and that the P=0.088 trial had been otherwise well conducted.  Doctors, and even regulators, might well decide that, at least for the moment, we should use that drug.

The same applies to Dylan Wiliam’s Embedding Formative Assessment. This is one of very few large trials in education. One of even fewer with a proper protocol, a predefined endpoint and analysis plan, and most important of all an endpoint, GCSE exam results, that matters to parents and pupils. Sure, the benefit was smaller than the authors had hoped for, and the results didn’t quite reach conventional levels of statistical significance, but a difference this large would still turn up by chance less than 10% of the time if the programme did nothing. The effect is also plausible.

If I was a head teacher looking to raise my school’s GCSE scores, I’d seriously consider buying Dylan Wiliam’s Embedding Formative Assessment programme. But I wish the EEF would report their trials more honestly.

Jim Thornton

City canoeing in Hamburg

July 19, 2018

Not the Elbe – the Alster

The 56km Alster river joins the Elbe in the centre of Hamburg, and the Canoe Club (click here) is so old that the German word kanu didn’t even exist when it was founded in 1905.

They own a lovely old building with all facilities, a camping lawn, and private river frontage right in the middle of the city.


For a modest fee members of other canoe clubs can camp and use the facilities. A great base for an urban paddle. We did two, each about 6km there and back.

Trip 1. Upstream to Fuhlsbüttel dam and power station

Distances measured from the canoe club. Right and left banks labelled as for a downstream paddler.

200m – Meenk bridge


300m – Railway bridge (1941).  Beach bar right upstream


500 m – Deelböge bridge. As you pass under it you feel you’re entering open country.


Branches left and right mark the original river course. The straightened new channel varies in width.

800m – Metzger bridge. Still very rural. An estate of lovely old (im)mobile chalets left bank


1.3 km – Damm bridge 1918. Braband Bistro upstream right bank.


1.7 km – Hindenburg bridge 1920. Skagerrak canal leaves left.


2.2 km – Senglemann bridge 1919 (widened 2001).


3 km – Hasenberg bridge 1913.  The river widens in front of a park on the left bank, allowing the tourist boats to turn.


Fuhlsbüttel hydroelectric power station (110 kW) (2000), the navigation limit for larger boats. Fish bypass and canoe rollers.

Hamburg airport is a few hundred yards west, and Ohlsdorf, the world’s 4th largest cemetery an equal distance east. Helmut Schmidt, West German chancellor, James Last, big band leader, and 1.5 million others are buried here.

Trip 2. The Isebek canal

Isebek, a tiny stream running into the Alster, was canalised way back, became polluted and nearly got paved over in the 1960s. Various sewage treatment plants and storm water basins keep it reasonably clean today, but the resulting flow is too low for the water to be adequately oxygenated for most fish. In 1988 oxygenation pipes were installed, and they seem to be working well. No pong in June 2018. More here.

From the canoe club turn right. Downstream on the Alster

50m Fahrhaus footbridge. Weighed down with love padlocks

100m Winterhuder bridge followed by riverside restaurant left


Again side channels follow the river’s original course.

300m – U3 railway bridge. Tourist boats fatten their customers.


500m – U1 railway bridge


800m – Goene bridge

1 km – The next bridge leads to Alster lake. Too windy for me. Turn right under Heilwig bridge and enter Isebek canal.


The canal is lively with boat clubs and bars.

1.3 km – Ise bridge


1.5 km – Railway bridge


1.6 km – Eppendorf bridge


2 km – Kloster bridge


2.25 km – Grindel bridge

About 50 metres after this bridge the canal narrows. Thick bushes hide the urban surroundings.

2.5 km – Manstein bridge

2.6 km – Geben footbridge


2.8 km – Bundestrasse bridge.  Excellent landing stage and access.

3 km – The canal ends in a stagnant dead end. No attractive landing.


But overall a fine urban paddle.

Jim Thornton






