
Standard, Population & Customised fetal size charts 6 – practical problems with race or ethnicity

September 11, 2019

Last Saturday (click here) I suggested that the relation between race or ethnicity and poor pregnancy outcomes made customisation of fetal size charts by either a poor way to improve detection of adverse outcomes. This post considers more practical issues.

Classification of human beings by race, an inherent physical or biological quality attributable to some groups rather than others, arose as a scientific idea in the 17th century, reached its full flowering in the 19th, degenerated into a justification for mass murder in the mid twentieth century and is now completely discredited. Although the word “race” has a lay meaning, it no longer has a scientific one. Even if we wanted to, we couldn’t customise charts by race.

Classification by ethnic group, shared ancestry, language, homeland and other cultural features, is possible but also problematic. Definitions of what constitutes an ethnic group vary over time and with who is doing the classifying, and historical claims of ethnic differences in other areas such as IQ, criminality or sporting prowess have all been based on poor science, often driven by racist ideas.

Ethnicity is also problematic because so many human beings are of mixed ethnicity. This paper (click here) nicely shows the problems. Mixed ethnicity not only makes it difficult for the clinician faced with a mixed ethnicity parent to know which chart to use. It also causes problems for scientists drawing up ethnically-based customised charts. How can they ensure that the people from whom their charts were derived were of “pure” ethnicity?

In practice most ethnically customised charts are based on “self-reported ethnicity”; typically a “tick box” completed by a clerk when the patient registers. No-one knows whether the clerk based their decision on self-report, skin colour, facial features or something else.

To summarise, ethnicity is poorly defined, and many of our patients are of mixed ethnicity. So even if ethnicity wasn’t also related to poor outcomes, which it is, it would be an unsuitable factor on which either to create customised charts or to decide which charts to use.

Next, weight (click here).

Jim Thornton


Standard, Population & Customised fetal size charts 5 – race ethnicity

September 6, 2019

At first sight customisation by maternal (or paternal) race or ethnic group appears sensible. Han Chinese women, for example, tend to be smaller than, say, Swedish women, and have smaller babies. Surely we should plot their babies’ growth on different charts.

But how can we be sure that the ethnic “differences” we observe are not on, or correlated with, the pathway to the pathology we are trying to identify?

Consider the two ethnic groups above, Northern Europeans and the Han Chinese, who appear to differ in size and have had relatively little historical intermingling. Northern European women and their babies are taller and larger than Han Chinese. But perinatal mortality is also higher in the Han Chinese.

Both differences are probably at least partly caused by undernutrition among the ancestors of today’s Han Chinese, who would have lived through two of the worst famines in recent human history, Mao Tse Tung’s Great Leap Forward, and Cultural Revolution. Both will certainly be affected by present day differences in nutrition, and smoking and drinking habits. If so, the observation that more than n% of Chinese babies fall below the nth centile of Western European charts is a sign that such babies are genuinely failing to reach their full growth potential, rather than that the charts are wrong.

It doesn’t matter whether race or ethnicity causes the adverse pregnancy outcomes or is just correlated with them; the fact that there is a correlation should make us think twice about customising on either. Even a strong correlation doesn’t in itself prove customisation wrong, though. If race was less strongly correlated with adverse outcomes than size is, customisation on race might, at least in theory, still work.

Tomorrow (click here) we’ll see some practical reasons why successful customisation by race or ethnicity is in reality a hopeless quest.

Jim Thornton

Standard, Population & Customised fetal size charts 4 – principles

September 6, 2019

Previous posts (click here and here) set the scene; how to get accurate scan measures, and the difference between reference and standard charts. Now the main event.


Customisation means creating a special chart based on one or more “custom” features of the individual being measured. Vets do it all the time – they couldn’t use the same chart for a Shire horse and a Shetland pony! Customisation makes sense for equines because even healthy foals differ in size.

But what about human smokers and non-smokers? Their fetuses  differ in size too, but customised charts for smokers would be bonkers. What’s the difference?

The principle is as follows. Charts are used to predict pathology, death, brain damage etc.

For factors unrelated to the pathology sought, customisation is appropriate. 

A Shetland pony is not a malnourished Shire horse, so it is good practice to have different charts to detect both malnourished Shire horses and malnourished Shetland ponies.

But for factors lying on the pathway which leads to pathology (e.g. smoking is on the pathway to stillbirth) customisation is inappropriate. 

Note the choice of words, “lying on the pathway” rather than “causing”.  For this purpose, correlation is equivalent to causation; customisation on a measure which is simply correlated with pathology is also inappropriate. Consider social class. Babies of mothers in lower social classes tend to be smaller, but it is unlikely that social class affects size directly. More likely the effect is mediated via differences in diet, smoking or some other behaviour. But it would still be inappropriate to produce separate customised charts for different social classes.

Note two. Strictly the principle relates to the relative strength of the size/factor relationship to the outcome we wish to predict.  If the relationship between smoking and stillbirth is stronger than that between size and stillbirth, customisation on smoking would have the net effect of reducing the identification of babies who were destined to die.  If smoking was less strongly related there might theoretically be a way to customise charts by smoking status which improved detection of stillbirth. The latter might apply to maternal height (see later post).
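The smoking example can be put into numbers. The following is only a sketch under made-up assumptions: fetal weight is treated as normally distributed, the means and standard deviation are hypothetical, and smoking is assumed simply to shift the whole distribution down by 200 g. It compares how many smokers' babies fall below the 10th centile of a standard chart with how many fall below the 10th centile of a chart customised for smokers.

```python
from statistics import NormalDist

# Hypothetical figures, chosen only to illustrate the principle.
healthy = NormalDist(mu=3500, sigma=450)  # weight (g), healthy non-smokers
smokers = NormalDist(mu=3300, sigma=450)  # assumed 200 g downward shift


def share_flagged(group: NormalDist, chart: NormalDist, centile: float = 0.10) -> float:
    """Share of `group` falling below the given centile of `chart`."""
    cutoff = chart.inv_cdf(centile)
    return group.cdf(cutoff)


flagged_on_standard = share_flagged(smokers, healthy)    # standard chart
flagged_on_customised = share_flagged(smokers, smokers)  # smokers-only chart

print(f"Standard chart flags {flagged_on_standard:.0%} of smokers' babies")
print(f"Customised chart flags {flagged_on_customised:.0%} of smokers' babies")
```

By construction the customised chart flags exactly ten percent of smokers' babies, however small they are. The standard chart flags more, which is the point: the extra babies are the ones whose small size lies on the pathway to pathology.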

In the next few posts we’ll look at some real factors for which doctors have suggested customisation, and see if they make sense. First race and ethnicity (click here).

Jim Thornton

Standard, Population & Customised fetal size charts 3 – Population/reference or healthy/standard?

September 5, 2019

Some of the confusion about fetal charts arises from the language used to describe the various non-customised sorts. It is confusing. “Population” and “reference” are different ways to describe one type of chart. “Healthy” and “standard” are different names for the other type.

Population/reference charts

Older fetal growth charts were based on populations of more or less randomly selected pregnant women, including those whose pregnancy had, or was destined to have, problems of one sort or another. The idea was that the charts would “do what it said on the tin”; ten percent of fetuses below the 10th centile, ten percent above the 90th and so on.

But populations vary, so doctors often noticed that the charts were “wrong”, and derived a new set for their local population. Many hundreds of different charts were developed. In the UK the most popular was the Chitty chart* (click here), developed from measurements made on 663 consecutive women booking at Kings College Hospital in South London in the early 1990s. Such charts are also called reference charts because they “refer” to a particular population. However, I will use the term “population chart”.

As an aside, local population charts, for example for Pacific islanders, South Indian women, or for that matter “women booking in Kings College Hospital in South London in the early 1990s”, are a type of customised chart. So long as they are used for the population on which they were developed, they are “customised” for that particular population. But true customisation in the modern sense, which we will discuss tomorrow, is more sophisticated.

Healthy/standard charts

These are derived from healthy well-nourished populations excluding, as far as possible, problem pregnancies. The idea is that they provide a “standard” measure of growth in the absence of disease or nutritional constraints. I will use the term “standard” in what follows.

One of the controversies in creating standard charts is to decide what constitutes a healthy population. We can all agree that smokers, and women with raised blood pressure or anaemia should be excluded, but what about short women? Perhaps they are short because of malnutrition early in life. What about underweight, or overweight women? The details matter but the principle is simple. We want to measure fetal growth against other healthy pregnancies.

In theory any difference between standard charts and real world populations reflects the amount of disease and abnormal nutrition in the real world. The fact that in the UK for example more than ten percent of the population lies below the 10th centile reflects the UK’s many smokers, whose babies are abnormally small. The fact that more than ten percent lie over the 90th centile reflects our many overweight or diabetic women. The discordance is a “feature not a bug!”  Intergrowth-21 (click here) and WHO (click here) are standard charts.
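To see how a standard chart can leave more than ten percent of a real population in each tail, here is a small sketch. Everything in it is an assumption for illustration: the normal distribution, the 3500 g mean, the 450 g standard deviation, the 200 g shifts and the shares of smokers and overweight or diabetic women are all invented, not taken from any chart.

```python
from statistics import NormalDist

# Hypothetical standard chart built from healthy pregnancies (weight in g).
STANDARD = NormalDist(mu=3500, sigma=450)

# (share of population, shift in mean weight in g); all values are made up.
POPULATION = [
    (0.70, 0),     # healthy pregnancies
    (0.15, -200),  # smokers, babies smaller
    (0.15, +200),  # overweight or diabetic mothers, babies larger
]


def share_outside_standard(population, standard=STANDARD):
    """Return the population shares below the 10th and above the 90th centile."""
    low, high = standard.inv_cdf(0.10), standard.inv_cdf(0.90)
    below = above = 0.0
    for share, shift in population:
        group = NormalDist(standard.mean + shift, standard.stdev)
        below += share * group.cdf(low)
        above += share * (1 - group.cdf(high))
    return below, above


below, above = share_outside_standard(POPULATION)
print(f"Below 10th centile: {below:.1%}, above 90th centile: {above:.1%}")
```

With these invented shares both tails come out a little over ten percent. A population made up only of the healthy group would give exactly ten percent in each.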

Tomorrow customisation (click here).

Jim Thornton

* The eagle eyed will notice that Altman and Chitty were attempting to create a “standard” chart, and indeed they excluded some unhealthy pregnancies such as those with hypertension and diabetes.

Standard, Population & Customised fetal size charts 2 – technical scan stuff

September 4, 2019

Yesterday, in my first fetal size chart post (click here), I teased those doctors who customise children’s growth charts on the basis of US citizenship!  I will be more serious as I consider fetal growth charts, but first let’s get some technical issues out of the way. We can’t weigh, or measure babies directly in-utero. We have to scan them. And we have to do the scan right.

1. How to do the scans

As image quality improved, techniques changed. For example, measuring the bi-parietal diameter (width) of the head from the proximal surface of the near skull bone to the proximal surface of the far one used to be the standard technique. It always underestimated the actual measure, but was necessary because the distal surface of the far bone was indistinct. Now that image quality has improved we can easily see the outer surface of the distal skull bone, so we measure “outer to outer”. But some old charts are based on the outdated method.

As outlining technology improved, calculating circumferences from two diameters measured at right angles was superseded by direct outline measures. But some old charts used the outdated method.
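For readers wondering how a circumference can be calculated from two diameters at all, here is a rough sketch. It assumes the outline is an ellipse and uses the simple mean-diameter approximation; I am not claiming this is the exact formula behind any particular chart. Ramanujan's approximation for the perimeter of an ellipse is included for comparison, and the 90 mm and 110 mm diameters are invented.

```python
import math


def circumference_mean_diameter(d1: float, d2: float) -> float:
    """Simple approximation: pi times the mean of the two diameters."""
    return math.pi * (d1 + d2) / 2


def circumference_ramanujan(d1: float, d2: float) -> float:
    """Ramanujan's approximation to the perimeter of an ellipse."""
    a, b = d1 / 2, d2 / 2  # semi-axes
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))


# Hypothetical diameters in mm, measured at right angles.
simple = circumference_mean_diameter(90, 110)
better = circumference_ramanujan(90, 110)
print(f"Mean-diameter formula: {simple:.1f} mm, Ramanujan: {better:.1f} mm")
```

For a perfect circle the two agree. The more elongated the outline, the more the simple formula underestimates, which is one reason a directly traced outline is preferable.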

The need to create charts from correctly aligned scans showing the landmarks clearly causes problems for population charts (more on these tomorrow), which unavoidably include overweight women whose scan views are often sub-optimal, but whose exclusion would distort the result. This is not a problem for the creators of standard charts because overweight women are excluded by definition.

2. How to make the actual measurements

If the ultrasonographer placing the measuring calipers can see the value he’s coming up with, he may unconsciously adjust the position to get a normal value, or report to the nearest whole or half millimetre.  The way to avoid such bias and digit preference is to get someone who doesn’t know the woman to place the calipers, ovoids or other boundary markers on a stored image, and only reveal the measurement after placement is judged correct. It’s akin to using a random zero sphygmomanometer to measure blood pressure. Easy in principle, but expensive in practice. Few, if any, of the older charts even attempted it.

3. How many pregnancies to study

Smoothed growth charts can be produced with small samples. They look good, but by definition include few measurements at the upper and lower centiles, making these outer lines imprecise. Charts based on large samples typically used routinely collected scan data and suffer from the measurement biases described above.
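The imprecision of the outer lines can be sketched with the textbook large-sample formula for the standard error of a sample centile, sqrt(p(1-p)/n) divided by the density at that centile. The sketch assumes normally distributed measurements in standard-deviation units, and the sample sizes are arbitrary examples, not those of any real chart.

```python
from math import sqrt
from statistics import NormalDist


def centile_standard_error(p: float, n: int, sigma: float = 1.0) -> float:
    """Approximate (asymptotic) standard error of the sample p-th centile
    for n normally distributed measurements with standard deviation sigma."""
    dist = NormalDist(0, sigma)
    density_at_centile = dist.pdf(dist.inv_cdf(p))
    return sqrt(p * (1 - p) / n) / density_at_centile


for n in (100, 4000):  # arbitrary example sample sizes
    se_median = centile_standard_error(0.50, n)
    se_third = centile_standard_error(0.03, n)
    print(f"n={n}: SE of 50th centile {se_median:.3f} SD, of 3rd centile {se_third:.3f} SD")
```

Under these assumptions the 3rd centile is estimated about half as precisely as the median at any given sample size, and halving its standard error needs four times as many pregnancies.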

In practice only two modern charts have taken measures using the modern techniques, avoided bias, and had a sufficiently large sample size to estimate the outer centiles with any degree of precision. These are Intergrowth-21 (click here) and WHO (click here). We will return to them. Tomorrow some more technical stuff, the difference between reference or population charts and healthy or standard charts (click here).

Jim Thornton

Standard, Population & Customised fetal size charts 1 – newborns

September 3, 2019

Universal fetal growth standard charts, derived from healthy normal mothers having healthy normal pregnancies, have not been widely adopted, at least not in the UK. Instead local population charts, and charts “customised” for various parental factors, keep popping up. Most experts think both the latter are a bad idea; click here for an up-to-date technical description of the reasons. This series will explain why in lay language. The debate can get emotive, so before I tackle fetal charts, let’s look outside pregnancy. After birth no-one customises on anything but the baby’s sex. Or do they?

Newborn & Child Growth Charts

In 1997 the World Health Organisation (click here) developed standard charts for newborns and children. They based them on the healthy breast-fed babies of healthy well-nourished mothers. They were uncontroversial, and are now used almost everywhere, including the UK. Of course in some countries with high rates of malnutrition, more than half of babies fall below the 50th centile, but no-one argues the charts are wrong. They carry on using them and concentrate on improving nutrition. In other countries with high rates of bottle feeding or other types of over-nutrition, local babies appear bigger than expected by the charts.  But again no-one argues the charts are wrong; the abnormally big babies grow into obese adults. We concentrate on encouraging breast feeding and healthy diets. With one exception.

United States babies are also significantly larger than those on whom the WHO charts were based. The reason is also that many are bottle-fed or over-fed in other ways. We can see the consequences. But some US doctors argued for local US population charts prepared by the US Centers for Disease Control and Prevention (CDC). Such population or “reference” charts are a first step towards customisation. In this case a boy’s chart is “customised” for an “American boy”. Take a look at the two charts below. Left – WHO “standard” chart. Right – US population “reference” chart. Ignore the CDC logo on both.


The WHO 50th centile for 18-month-old boys is 10.9 kg, and the US 50th centile 11.8 kg. Nearly a ten percent, 1 kg, difference – in medians! Those doctors who use the US population charts are telling the parents of overweight boys that their son is “normal weight for America”, and missing an opportunity to improve his diet and prevent him growing into an obese adult.
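For anyone who wants the exact arithmetic behind “nearly”, here it is, using only the two 50th centile values quoted above. The variable names are mine.

```python
who_median_kg = 10.9  # WHO 50th centile, boys aged 18 months (from the post)
us_median_kg = 11.8   # US population chart 50th centile (from the post)

difference_kg = us_median_kg - who_median_kg
percent_of_who = 100 * difference_kg / who_median_kg

print(f"{difference_kg:.1f} kg, {percent_of_who:.1f}% of the WHO median")
# prints "0.9 kg, 8.3% of the WHO median"
```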

If you’re an obstetrician or midwife, and the above seems obvious and sensible, there’s no need to read on. Make sure your hospital scan department is using a universal growth standard chart (WHO or Intergrowth-21) and get on with more important things. But if you’ve been told that customised charts are a good idea, and are unconvinced by the above, the next few posts are for you.

Tomorrow some technical background (click here).

Jim Thornton



August 31, 2019

Jim’s prediction

I like to predict what trials will show before I see the results. For my reasons click here.

The PHOENIX trial in this week’s Lancet (click here) tested the effect of immediate or delayed delivery for women with late preterm (34-37 weeks) pre-eclampsia. The researchers randomised 450 women (471 infants) to planned delivery and 451 (475) to expectant management. The primary maternal outcome was a composite of death, morbidity or a systolic blood pressure of at least 160 mm Hg, and the primary fetal one a composite of death or neonatal unit admission. Nottingham was a participating centre but, apart from recruiting a few participants, I had no involvement.

I favoured delay. I thought planned delivery would reduce trivial adverse maternal events such as episodes of high blood pressure, but nothing that mattered, and that it would harm the baby. In May of this year, before I’d seen any results, I wrote on (click here):

1. “The primary maternal outcome will favour immediate delivery. This will be statistically significant at the P<0.05 level. However after exclusion of the component “recorded systolic blood pressure ≥160 mmHg” from the primary maternal composite outcome the difference will no longer be nominally significant. I appreciate that this could be judged a data driven analysis, which is why I am registering my prediction here.”

2. “The primary short term baby outcome will favour expectant management. This will be statistically significant at the P<0.05 level.”

I was partially correct. The primary maternal outcome was reduced by early delivery (289 (65%) v 338 (75%), relative risk 0·86, 95% CI 0·79–0·94; p=0·0005), and the primary fetal one increased (196 (42%) v 159 (34%), RR 1·26, 1·08–1·47; p=0·0034). We can be confident that both effects are real. The trial was registered, the outcomes pre-defined, the sample size large, everyone was followed-up and the differences are unlikely to have occurred by chance.
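As a rough check on the maternal result, a crude relative risk and confidence interval can be worked out from the counts alone. This is a sketch, not the trial's analysis: it uses the standard log-method interval with no adjustment, and it assumes the denominators are the 448 and 451 women shown in the morbidity table headings further down. The published estimates may come from an adjusted model, so small discrepancies would not be surprising.

```python
from math import exp, log, sqrt


def crude_relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted relative risk of group A versus group B,
    with an approximate 95% CI from the log method."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Primary maternal outcome: planned delivery v expectant management.
# Denominators (448, 451) are an assumption taken from the morbidity table headings.
rr, lower, upper = crude_relative_risk(289, 448, 338, 451)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The crude figures should land very close to the published 0·86 (0·79–0·94). The same function applied to other rows will not necessarily match the paper as closely, for the reason given above.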

However I was wrong to predict that the reduction in adverse maternal outcomes would disappear when raised BP was excluded. The top row of table 3 “maternal morbidity composite outcome” i.e. the composite without the raised BP component, was 68 (15%) v 90 (20%), RR 0.76; 0.59-0.98. The intervention really does reduce maternal morbidity. The authors argue that this strengthens the argument in favour of early delivery. But let’s look at what the morbidity consisted of.  Supplemental appendix table 3.  Not easy to access, so I’ve tidied it up below.

| Outcome | Planned delivery (n=448) | Expectant management (n=451) |
|---|---|---|
| Maternal death * | 0 (0%) | 1 (0%) |
| Eclampsia | 3 (1%) | 4 (1%) |
| Inotropic support | 0 (0%) | 1 (0%) |
| Infusion of 3rd parenteral antihypertensive drug | 2 (0%) | 0 (0%) |
| Myocardial ischaemia or infarction | 1 (0%) | 0 (0%) |
| SpO2 <90% | 2 (0%) | 3 (1%) |
| ≥50% FiO2 for >1 hr | 1 (0%) | 0 (0%) |
| Intubation (other than for caesarean section) | 2 (0%) | 0 (0%) |
| Pulmonary oedema | 1 (0%) | 2 (0%) |
| Transfusion of any blood product | 20 (5%) | 23 (5%) |
| Platelet count <50×10⁹ per L, with no transfusion | 2 (0%) | 4 (1%) |
| Hepatic dysfunction | 44 (10%) | 63 (14%) |
| Acute renal insufficiency (creatinine >150 µmol/L) | 3 (1%) | 4 (1%) |
| Total women with morbidity** | 68 (15%) | 90 (20%) |

*The maternal death occurred unexpectedly, 5 days after delivery in a woman with medical co-morbidities, and was judged to be unrelated to the trial. **The numbers don’t add up because some women had multiple morbidities. No participants had a Glasgow coma score <13, stroke or reversible ischaemic neurological deficit, transient ischaemic attack, cortical blindness or retinal detachment, posterior reversible encephalopathy, hepatic haematoma or rupture, or acute renal failure (creatinine >200 µmol/L).

Pretty much all the difference was in worsening tests revealing liver or renal damage or low platelets. These are the tests which, alongside BP measurement and the fetal heart rate pattern, we use to monitor pre-eclampsia, and to judge the timing of delivery with expectant management. In well organised hospitals we should be able to prevent really serious adverse events such as stroke, heart attack, permanent organ damage or death. There were few such events and no obvious excess in the expectant management group.

Fetal outcomes are in table 4. No babies died and the only differences were in admission rates to various levels of neonatal unit. Here’s the relevant section.

There are more details in the supplementary appendix table 5. Serious adverse baby events were very rare and did not differ between groups.

Here’s my final take. The “harms” to the mother from waiting a bit were limited to abnormal blood or blood pressure tests to which the doctors responded correctly, and from which no long-term harm ensued. The “harms” to the baby from planned delivery were a bit of additional monitoring and oxygen therapy from which all the babies also emerged healthy. The authors conclude that “this trade-off should be discussed with women”. I agree, but I still favour delay.

I made a third prediction on

3. The primary long term baby outcome (PARCA-R at 2 years) will not show any statistically significant difference (at the 5% level) between the groups. However I predict that the point estimates for those measures which had been predefined in the analysis plan e.g. mean or median scores, or rates of scores below various cut offs, will all favour expectant management.

Babies are resilient. They recover pretty well from all the stress that nature and doctors throw at them.  But they prefer to not deliver preterm without a good reason. I’ll have to wait a little longer to find out if I’m right, and the PHOENIX trial is unlikely to be big enough to prove it, but I still think there will be subtle long-term harms from planned early delivery.

Jim Thornton
