
Standard, Population & Customised fetal size charts 17 – addendum for twins

September 29, 2019

Twins, especially identical ones, tend to be smaller than singletons (click here). This has led to the suggestion that they should have special charts (click here, and here). The support group, the Twins and Multiple Births Association (TAMBA), apparently supports the idea (click here), but it’s a bad one.

The Perinatal Institute, normally an advocate of customisation,  explains why rather well (click here). To quote:

“The [twin] charts […] are based on reference values derived from the whole population, not only from uncomplicated pregnancies. Therefore, they do not represent a normal growth standard but one that may have been affected by an unspecified number of pathological factors. This concept is particularly important in twin pregnancies as they have a substantially increased number of complications.”

“The pattern of slowed growth from 30-32 weeks in many (but not all) twin pregnancies may be pathological due to late onset fetal growth restriction associated with placental insufficiency, which usually also becomes manifest from around 32 weeks. Adjusting the curves downwards may normalise pathology, reduce recognition of pregnancies at risk, and lead to false reassurance.”

Correct. Twin smallness is not healthy smallness. Twins also have higher rates of stillbirth, cerebral palsy and other problems. The reasons not to customise by ethnic group (click here), height (click here), weight (click here) and parity (click here), also apply to twin customisation. Special “small” charts for twins condemn them to being classified as “normal for twins”.

The Italian group (Ghi et al click here) recognise the problem: “adjusting for multiple pregnancy, thereby shifting the normal range of fetal growth downward, has the potential to mask truly growth restricted twins and increase perinatal morbidity from failure to recognize growth restriction”. But they hope that excluding twins born before 36 weeks, or below the 5th centile on a singleton chart, will make it go away. They compare their twin chart with a local customised Italian one (click here) but provide no data on the detection rate of pathology.

The STORK study authors (here) compared detection rates for stillbirth with those of a singleton chart. But their chosen comparison chart (Poon click here), which correctly excluded babies of smoking mothers and those with major medical disease, has its own problems. Leona Poon not only based her chart on routinely collected data with all the error and bias that entails, but also used the weights of babies who were actually born, which seriously distorts the preterm part of the chart (click here).

Many twin fetuses fall below the Intergrowth-21 or WHO singleton growth standards, some even below the 1st centile. These are reasons both to use charts with reasonably precise extreme centiles, and to consider delivery before term (click here). Some experts also recommend that estimated fetal weight (EFW) discordance, a sign of selective growth restriction, be included as a factor in delivery timing. If so, it is preferable to use the most accurate and unbiased Intergrowth-21 EFW formula (click here).

TAMBA should think again.

Jim Thornton


Standard, Population & Customised fetal size charts 16 – summary

September 21, 2019

That’s it folks. Let’s summarise.

Standard growth charts tell you how big a healthy fetus from a healthy pregnancy and healthy mother should be. Population charts plot the average for your local population, including unhealthy pregnancies (click here). Customised charts adjust the size on the basis of individual parental features (click here).

Customisation improves detection of pathology if we can be sure that the feature customised on is not only objective and reproducible, but also not associated with pathology (click here). Customisation on factors associated with pathology, such as smoking, condemns the small sick fetus damaged by smoking to being classified as “normal for smoking”. Fortunately no-one customises on maternal smoking.
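The “normal for smoking” problem can be made concrete with a toy calculation. The sketch below is illustrative only: the birth weight means and standard deviation are invented round numbers, not data from any study, and weights are assumed to be normally distributed.

```python
from statistics import NormalDist

# Hypothetical, illustrative term birth weight distributions (grams).
# These numbers are assumptions chosen for the example, not study data.
non_smokers = NormalDist(mu=3500, sigma=450)
smokers = NormalDist(mu=3300, sigma=450)

# Standard chart: one 10th-centile cut-off taken from the healthy group.
standard_cutoff = non_smokers.inv_cdf(0.10)

# Chart customised on smoking: smokers get their own 10th centile.
customised_cutoff = smokers.inv_cdf(0.10)

# Share of smokers' babies flagged as small by each approach.
flagged_standard = smokers.cdf(standard_cutoff)
flagged_customised = smokers.cdf(customised_cutoff)

print(f"Standard cut-off {standard_cutoff:.0f} g flags {flagged_standard:.1%} of smokers' babies")
print(f"Customised cut-off {customised_cutoff:.0f} g flags {flagged_customised:.1%} of smokers' babies")
```

Under these assumptions the customised chart flags 10% of smokers' babies by construction, while the standard chart flags roughly twice as many. The difference is the smoking-related smallness that customisation would relabel as normal.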

But some people do customise on maternal ethnicity, weight, height and parity. Ethnicity is not objective (click here) and is associated with pathology (click here). Maternal weight is objective, but at both extremes is associated with pathology (click here). Maternal height is objective but also associated with pathology, albeit only at the lower end (click here). Parity is objective but also associated with pathology (click here).

Fetal sex is objective and negatively associated with pathology, with larger male fetuses having higher mortality (click here). This should make it an ideal factor on which to customise, and is the reason why, after birth, paediatricians routinely use separate charts for boys and girls. But it’s not suitable for routine prenatal use at present.

The main customised charts available in the UK are produced by The Perinatal Institute (click here). They provide customised charts for fundal height and for estimated fetal weight. Their training programmes for measuring fundal height are excellent, but the fetal weight estimates on which those charts are based use Hadlock’s outdated formulae. Their customisation formulae for both fundal height and fetal weight are secret. Their charts customise on maternal ethnicity, weight, height and parity, and are therefore likely to condemn the fetuses of first pregnancies and of mothers whose ethnicity, weight and height are markers of past and present deprivation, to having pathology missed.

Fetal growth charts should be created on large populations, using careful techniques to avoid bias (click here). Only the two standard charts created by Intergrowth-21 (click here) and by WHO (click here) have done this. A well-publicised academic dispute (click here) between the Intergrowth-21 and WHO authors involved accusations of plagiarism, but had nothing to do with the science. Both charts are scientifically rigorous. However, the Intergrowth-21 sample was about three times the size of WHO’s (4,321 versus 1,387), and that group used slightly more advanced techniques for avoiding bias, so their chart is to be preferred.

Claims that the introduction of customised charts caused the recent fall in UK stillbirth rates do not bear close scrutiny (click here).

Suggestions that standard charts are flawed because they detect different rates of growth restriction or macrosomia in different populations make no sense; that is a feature of growth standards.

Empirical comparisons (click here) between customised and population charts are unhelpful because they compare two types of customised chart. All empirical studies suggesting that customised charts detect more pathology than standard charts have also reported higher false positive rates with customisation. No empirical study has ever shown greater detection rates of pathology for a fixed false positive rate. The best quality empirical study (the POP study from Cambridge), where results were also concealed from clinicians, showed no difference with customisation.

Population and customised charts, as currently available, cannot be recommended.

The Intergrowth-21 or WHO fetal growth standard charts should be used.

Jim Thornton

Standard, Population & Customised fetal size charts 15 – empirical studies

September 20, 2019

How do customised charts perform in practice?

Previous posts have described why, in theory, fetal growth standard charts such as Intergrowth-21 or WHO are preferable to local population or customised charts. Here we look at empirical comparisons. Unfortunately there are few good ones, many are misleading*, and reliable ones are difficult to do.

Comparisons between customised and population charts (e.g. here, or here) don’t help; a population chart is simply a type of customised one. Claims that percentages of fetuses falling above and below various centiles vary more with standard charts than with customised ones (Francis et al 2018 click here) miss the point. This is a feature not a bug of standard charts!

Studies comparing the rate of detection of fetuses who are small for gestational age (SGA) by charts A and B, but which either don’t define SGA (Gardosi and Francis 2005 click here) or define it by a third chart C (e.g. Odibo et al. 2018 click here) confuse the issue as well.

We need to compare different charts’ detection rates for the sort of pathology that matters, stillbirth or brain damage, or for surrogates for those things, such as low cord pH, Apgar scores or admission to neonatal intensive care. Such studies need to fix the false positive rate. Detecting more babies with adverse outcomes is not necessarily better if the false positive rate is also higher. Francis et al 2018 (click here) reported that customised charts would have detected 411 stillbirths compared with Intergrowth’s 229 in a cohort of 1.2 million term singleton pregnancies. However, to do so, the customised charts would have classified 10.5% of pregnancies as growth restricted, while Intergrowth-21 would have only classified 4.4%. At least two other studies, (Anderson et al. 2016 click here) and (Pritchard et al. 2018 click here), suggesting that customised charts detect more pathology than Intergrowth-21, also had double the false positive rates with customisation.
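The Francis et al figures quoted above can be put side by side with a few lines of arithmetic. This is only a rough sketch: it uses the percentage of pregnancies flagged as a stand-in for the false positive rate, which is a reasonable approximation only because stillbirth is rare, and the variable names are mine.

```python
# Figures quoted above from Francis et al 2018 (term singleton cohort).
stillbirths_detected = {"customised": 411, "Intergrowth-21": 229}
percent_flagged = {"customised": 10.5, "Intergrowth-21": 4.4}

# How many more stillbirths the customised chart caught, and how many more
# pregnancies it had to label growth restricted to do so.
detection_ratio = stillbirths_detected["customised"] / stillbirths_detected["Intergrowth-21"]
flagged_ratio = percent_flagged["customised"] / percent_flagged["Intergrowth-21"]

# Stillbirths detected per percentage point of the cohort flagged.
yield_per_point = {
    chart: stillbirths_detected[chart] / percent_flagged[chart]
    for chart in stillbirths_detected
}

print(f"Detection ratio: {detection_ratio:.2f}")
print(f"Flagged ratio:   {flagged_ratio:.2f}")
for chart, value in yield_per_point.items():
    print(f"{chart}: {value:.1f} stillbirths detected per 1% of pregnancies flagged")
```

On these numbers the customised chart labelled about 2.4 times as many pregnancies growth restricted in order to detect about 1.8 times as many stillbirths, so its yield per pregnancy flagged was lower, not higher.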

A recent large retrospective US study (Kabiri 2019 click here) of five different charts, including Intergrowth-21, WHO and GROW, compared detection rates of a range of different adverse outcomes for a fixed 10% false positive rate (figure 2 in the paper). Intergrowth had the highest sensitivity overall, and GROW the lowest, albeit the differences were small and the confidence intervals overlapping. Another retrospective study from Scotland (click here), which also reported detection rates for fixed false positive rates, showed that partial customisation on two objective factors (maternal height and parity) did not improve detection of pathology.
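For readers unfamiliar with the idea, here is a minimal sketch of what "sensitivity at a fixed false positive rate" means. It is not the method of either study. The function name and the centile values are hypothetical, invented purely to show the mechanics, and the sketch assumes no ties at the cut-off.

```python
def sensitivity_at_fixed_fpr(centiles_unaffected, centiles_affected, fpr):
    """Sensitivity when the cut-off is set so that a fraction `fpr` of
    unaffected pregnancies screen positive (low centile = screen positive)."""
    ranked = sorted(centiles_unaffected)
    n_false_positives = int(fpr * len(ranked))
    if n_false_positives == 0:
        return 0.0
    # Cut-off: the highest centile among the allowed false positives.
    cutoff = ranked[n_false_positives - 1]
    detected = sum(1 for c in centiles_affected if c <= cutoff)
    return detected / len(centiles_affected)


# Hypothetical centiles assigned by two charts (invented for illustration).
unaffected_a = [3, 7, 12, 18, 21, 25, 33, 38, 42, 47,
                51, 56, 60, 66, 71, 75, 82, 88, 93, 97]
affected_a = [1, 4, 6, 15, 30]

unaffected_b = [5, 9, 14, 19, 24, 28, 31, 36, 44, 49,
                53, 58, 62, 68, 73, 77, 84, 89, 94, 98]
affected_b = [2, 8, 11, 20, 35]

sens_a = sensitivity_at_fixed_fpr(unaffected_a, affected_a, fpr=0.10)
sens_b = sensitivity_at_fixed_fpr(unaffected_b, affected_b, fpr=0.10)
print(f"Chart A sensitivity at 10% FPR: {sens_a:.0%}")
print(f"Chart B sensitivity at 10% FPR: {sens_b:.0%}")
```

The point of the design is that each chart gets its own cut-off, chosen so that both mislabel the same share of healthy pregnancies. Only then is the difference in detection a fair comparison.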

With one exception, these studies are the best we have. However, such retrospective comparisons where one chart type was used in practice are tricky because clinicians will have acted on the result from the chart in use. If, for example, they induced labour early they may have prevented, or caused, an adverse outcome, the so-called treatment paradox. Caesarean for a suspected small baby may prevent the death which the chart was correctly predicting. Conversely induction for a false prediction of growth restriction may cause hypoxia in a previously healthy baby as a result of long labour.

This final problem is difficult to avoid. It requires scan findings to be concealed from clinicians. This has only been done once, in the POP study from Cambridge (click here), which showed that customising using the GROW software did not improve detection rates of any adverse outcomes.

And that’s pretty much it. Many other papers purport to show that customisation is good or bad, but they all either reported percentages above or below different centiles, or compared customised with population charts, or detection rates for pathology without fixing false positive rates, or all three, or came to no clear conclusion.

Next and finally, a summary of what we have learned (click here).

Jim Thornton

* Readers beware. Authors often use the terms population, standard and customised charts rather loosely.

Standard, Population & Customised fetal size charts 13 – The Intergrowth/WHO plagiarism controversy

September 18, 2019

Soon after the Intergrowth-21 charts were published, allegations of skullduggery appeared from WHO (click here). Rumours flew around, big money was involved, and the organisations employing each group of researchers, WHO and Oxford University, conducted enquiries. Eventually WHO referred Jose Villar and Stephen Kennedy, two of the leaders of the Intergrowth-21 group, to the UK General Medical Council (click here), who declined to investigate.

Supporters of WHO claimed that the two Intergrowth-21 authors had plagiarised a previous WHO protocol. Villar had been employed by WHO before he moved to Oxford, and Kennedy had been a member of a WHO expert group.

Defenders of Intergrowth-21 say that the idea of creating fetal growth standards was not only “obvious”, but had already been widely discussed in public long before either team set to work.

The full story has never been published, but the editor of the Lancet, who had seen both the WHO and Oxford reports, sided with the Intergrowth-21 group and concluded that at worst it was a matter of academic rivalry (click here). His comment that the WHO enquiry was “disappointingly insubstantial” clearly stung WHO, who responded by publishing it here, the “report from reviewers”. Judge for yourself. The WHO also took the opportunity to confirm that:

“WHO has never questioned the scientific validity of the research conducted and the papers produced by Oxford University (as published in the Lancet and other peer-reviewed journals)”.

Three supporters of the WHO allegations wrote a letter which added little (click here). The Oxford report, exonerating the Intergrowth-21 authors, remains unpublished. And there the matter rests.

The issue is long past. But it’s worth recalling, because advocates of alternatives occasionally allow the idea to get out that “There’s something fishy about the Intergrowth or WHO charts”. There is not. The allegations never included any scientific criticism of either. Rather the opposite. The idea of international fetal growth standards was so good, that academic rivals sought credit for it. This mattered once to the people involved, but not to the rest of us.

Next customised charts (click here).

Jim Thornton

Standard, Population & Customised fetal growth charts 14 – GROW

September 17, 2019

Customised charts

The Perinatal Institute (click here) is a leading UK advocate of customisation. It markets Gestation Related Optimal Weight (GROW) charts, customised on the mother’s height, weight, ethnicity and parity (1). Two charts are typically combined into one physical chart which can be printed out and filed in the woman’s record. The left hand scale shows fundal height in cm and the right hand scale estimated fetal weight (EFW) in grams, calculated from a combination of biparietal diameter, head circumference, abdominal circumference and femur length, using Hadlock’s formula. Staff are encouraged to plot fundal height using an X symbol and EFW using an O. Examples below.


The left hand chart is for a normal weight and height British European woman (2), and the right for an underweight Indian woman.  Both customised charts include an estimate called the Term Optimal Weight (TOW). For the British European baby this is 3,429g and for the Indian one 3,042g.  The difference matters.  Imagine that the EFW was 2,000g at 39 weeks. The British European mother would be told that her baby was growth restricted and likely advised to have labour induced, but the Indian woman would be told that the weight was “normal for her ethnicity, height and weight” and probably allowed to let the pregnancy continue.

Their customisation principles are detailed on the website (click here) but the exact formulae in use at any one time are a commercial secret. They are regularly updated on the basis of data sent back to the Institute by their customers.  This is a potential weakness since, although GROW customers have been trained in both fundal height and scan measurements, it is unlikely that the biases which inevitably affect revealed human measures in practice will have all been removed (click here).

The underlying justification for the choice of features on which GROW charts are customised is also weak. Self-reported ethnicity has both theoretical (click here) and practical (click here) problems. Maternal height (click here), weight (click here) or parity (click here) are problematic because underweight, shorter, and first time mothers all have both smaller babies and higher perinatal mortality. Customisation on those features thus risks normalising pathology. It would only be justified if we could be sure that the strength of relation between each feature and adverse outcomes was weaker than that between birth weight and adverse outcomes (click here). There are no such data. Even those WHO standard chart authors who advocate customisation in theory, have not even attempted to provide such data (click here). Customisation by fetal gender makes theoretical sense (click here) but GROW charts do not at present offer this.

Finally GROW only provides customised charts for fetal weight, not for head or abdominal size, or fetal length separately. This condemns obstetricians using GROW charts to either ignore these component measures or to interpret them on a different, non-customised chart (click here).

The Institute’s director, Professor Jason Gardosi, claims that GROW chart introduction was associated with a reduction in stillbirth and that the reduction was greater in English regions with higher uptake of GROW charts. Both claims are doubtful. The following are screen grabs taken from a lecture he regularly gives, available on the Institute’s website (click here). The data are those in his BMJ Open paper (click here). The graph scales appear to have been selected to emphasise a point.


Note how the vertical scale differs between the left and middle slide. The right hand slide (3) is fairer. Stillbirths were falling long before GROW charts were introduced and if anything the trend has levelled off.

The choice of time periods and regions to report may also be selective. See left and middle figures below.


The right hand slide shows the same data for the whole of the UK (3). The rate of fall slowed slightly in England and Wales where GROW software was most widely used, and was steepest in Scotland where GROW software was not in use.

The reasons for this general trend are well understood: falls in smoking, increased diagnosis and termination for lethal fetal abnormalities, and increased inductions near term, all three of which reduce stillbirths. Given the undoubted benefits of the rest of The Perinatal Institute’s training in encouraging staff to measure fundal height correctly and to act on the results, this hardly suggests that customisation is beneficial.

Next, other empirical comparisons (click here)

Jim Thornton


  1. The Perinatal Institute also provides training in various aspects of maternity care. This latter work is generally agreed to be excellent.
  2. British European is not currently one of the official UK ethnicity groups. Presumably the Perinatal Institute authors mean White British.
  3. I thank Professor Gordon Smith for the national stillbirth data.


Standard, Population & Customised fetal size charts 12 – WHO

September 17, 2019

World Health Organisation (WHO) fetal growth charts

Like Intergrowth-21 these are standard charts. They were published in 2017 (click here).

Participants came from ten urban centres: Rosario Argentina, Campinas Brazil, Kinshasa Congo, Copenhagen Denmark, Assiut Egypt, Paris France, Hamburg-Eppendorf Germany, New Delhi India, Bergen Norway and Khon Kaen Thailand. They were all living below 1,500m, aged 18-40, with BMI 18–30kg/m2, a singleton pregnancy, gestational age at entry between 8 and 13 weeks, no chronic health problems or long-term medication, no environmental or economic constraints likely to impede fetal growth, non-smokers, with no history of recurrent miscarriages, preterm delivery or birth of a baby <2,500g. The ultrasonographer training and scan techniques were carefully standardised and quality controlled.

There were three major differences with Intergrowth-21.  The WHO sample was smaller, only 1,387, compared with Intergrowth’s 4,321, which will have reduced the precision, especially of the outer centiles. WHO revealed the scan measures on the screen as the ultrasonographer placed the calipers, which could have biased the results (click here). Finally WHO used Hadlock’s formulae to estimate fetal weight, which may have affected those results, albeit in uncertain ways (click here).
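To give a feel for why sample size matters most at the outer centiles, here is a back-of-envelope sketch. It uses the textbook large-sample approximation for the standard error of a sample quantile, and treats each study as a simple independent sample from a normal distribution. That is an assumption of mine for illustration. It ignores the longitudinal design and the curve smoothing both groups actually used, so only the relative sizes are informative.

```python
from math import sqrt
from statistics import NormalDist


def centile_standard_error(p, n):
    """Approximate standard error of the p-th sample quantile, in units of
    the population SD, for n independent draws from a normal distribution."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(p)
    return sqrt(p * (1 - p) / n) / std_normal.pdf(z)


# Sample sizes as quoted in the text above.
sample_sizes = {"WHO": 1387, "Intergrowth-21": 4321}

for p in (0.50, 0.10, 0.03, 0.01):
    row = ", ".join(
        f"{name}: {centile_standard_error(p, n):.3f} SD"
        for name, n in sample_sizes.items()
    )
    print(f"centile {p:.0%} -> {row}")

# How much less precise the smaller sample is at the 3rd centile.
precision_ratio = centile_standard_error(0.03, 1387) / centile_standard_error(0.03, 4321)
print(f"Ratio of standard errors at the 3rd centile: {precision_ratio:.2f}")
```

Under this approximation the smaller sample's standard errors are about 1.8 times larger at every centile, and for either sample the 1st and 3rd centiles are estimated much less precisely than the median.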

The authors noted fetal sex differences in size, which others have also observed (click here). Otherwise, with the exception of estimated fetal weight, see below, the final charts were close to those of Intergrowth-21. Here are the two charts for head circumference. Note they are not exactly comparable since Intergrowth-21 left, gives 3rd, 10th, 50th, 90th and 97th centiles. WHO right, gives the 1st, 5th, 10th, 50th, 90th, 95th, and 99th.


And here they are for abdominal circumference


The WHO researchers also published separate estimated fetal weight charts by country, and noted, in alleged contrast to Intergrowth-21, that some country charts differed significantly from the pooled chart. The explanation may partly be that the two groups of researchers used different statistical techniques to calculate the smoothed centile curves*. But there is no dispute that some geographical size differences remain.

The most likely explanation is that neither group managed to completely exclude malnourished women, or those with other environmental constraints on growth, from the populations they studied. Both made a valiant attempt to produce healthy standard charts, but are unlikely to be the last word on the topic (click here).  Observing small differences, even in apparently healthy pregnancies, between some rich and poor countries does not prove the existence of an innate racial, ethnic or national difference on which we should customise.

Nevertheless seven of the WHO study’s twenty-two authors see things differently and have gone on to argue strenuously (click here) that their data support the use of different charts for different populations. They provide no detailed prescription for how this should be done.

They wisely don’t support customisation by ethnicity since “Ethnicity, and particularly self-reported ethnicity, is not a straight-forward characteristic of a person or population”.  Nor customisation by country since this is both impossible for the 185 countries for which WHO produced no chart, and makes even less sense than customisation by ethnicity!

The WHO data showed that older women had bigger babies (2–3% per 10 years), that multiparity increased fetal weight by 1–3%, and that maternal height increased it by 1–2% per 10 cm. All three effects were more marked on smaller fetuses. Maternal weight also increased fetal weight by 1–1.5% per 10 kg, but that effect was greatest among larger fetuses. Recognising that none of these effects was large and that they were exerted unequally across different weight centiles, the WHO authors accepted that any customization for individual use would be complicated, although “statistical development, growing computer power, and more data accrual should handle it”.

Perhaps so. But customisation requires more than just showing that size differs according to a particular feature. To make customisation useful we also need to show that the relationship between size and pathology is stronger than that between the feature and pathology. There is a strong relation between smoking and stillbirth (click here) but no-one wants to customise on that! The WHO authors did not even attempt to measure the strength of the relation between age, parity, height or weight and pathology, and compare each with that between size and pathology.

It is difficult to know exactly what in practice the seven WHO authors recommend, and tempting to conclude that they had some other dispute with Intergrowth-21 (see next post).

WHO and Intergrowth-21 are the best two fetal growth standard charts. Since Intergrowth-21 is based on a larger sample, and used better methods for avoiding bias, their charts are marginally to be preferred. It’s a bonus that they are also user-friendly and have been integrated into most of the leading scan software packages.

Next the WHO-Intergrowth plagiarism dispute (click here).

Jim Thornton

* For those who are interested, the Intergrowth-21 authors tested whether the values at each gestational age were normally distributed. They were, so they created their standards using a statistical technique, fractional polynomials, that required such a distribution. The WHO researchers in contrast used a technique, quantile regression, which required no assumptions about the data distribution.
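The difference between the two approaches can be illustrated, in much-simplified form, at a single gestational age. The sketch below is not either group's actual modelling, both of which fit smooth curves across the whole of pregnancy. The measurements are hypothetical values invented for the example.

```python
from statistics import NormalDist, mean, stdev, quantiles

# Hypothetical head circumference measurements (mm) at one gestational age.
# Invented values for illustration only.
measurements = [172, 175, 176, 178, 179, 180, 181, 181, 182, 183,
                184, 185, 186, 187, 188, 189, 190, 192, 194, 197]

# Parametric route: assume normality, then read centiles off mean and SD.
fitted = NormalDist(mean(measurements), stdev(measurements))
parametric_10th = fitted.inv_cdf(0.10)
parametric_90th = fitted.inv_cdf(0.90)

# Distribution-free route: take the empirical quantiles directly.
deciles = quantiles(measurements, n=10)
empirical_10th, empirical_90th = deciles[0], deciles[-1]

print(f"Assuming normality: 10th {parametric_10th:.1f} mm, 90th {parametric_90th:.1f} mm")
print(f"Empirical:          10th {empirical_10th:.1f} mm, 90th {empirical_90th:.1f} mm")
```

When the data really are close to normal the two routes give similar centiles. They can diverge when the distribution is skewed, which is one way that a choice of statistical method, rather than biology, could contribute to apparent differences between charts.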

Standard, Population & Customised fetal size charts 11 – Intergrowth 21

September 16, 2019

Standard charts

The Intergrowth-21 group, set up in 2008 and funded by Bill Gates, produced a series of growth standard charts for fetuses. Click here for their website, here for the main report and here for their estimated fetal weight standards, which were published separately. According to some historians Intergrowth-21 was originally a spin off from the World Health Organisation (WHO) fetal growth standard group. However the main reports from Intergrowth-21 preceded publication of those from WHO.

Intergrowth-21 collected scan measures from 4,600 healthy fetuses and healthy mothers in eight geographically distinct urban areas (Pelotas Brazil, Turin Italy, Muscat Oman, Oxford UK, Seattle USA, Shunyi County Beijing China, Central Nagpur India, and Parklands Nairobi Kenya), where environmental, nutritional and social constraints on fetal growth were judged to be minimal. It was called the Fetal Growth Longitudinal Study (FGLS). They chose cities located below 1600 metres, with low levels of pollution. The women had no clinically relevant medical problems, started antenatal care before 14 weeks, had a height ≥153 cm, a body-mass index (BMI) between 18 and 30 kg/m², a haemoglobin concentration ≥110 g/L, and were not on any special diet. This resulted in a group of educated, affluent, clinically-healthy women, with adequate nutritional status, who by definition were at low risk of fetal growth restriction and preterm birth. The Intergrowth-21 group used all the latest scan methods as well as modern techniques to avoid bias (click here).

The authors found little variation by ethnicity. Specifically there were no statistically significant differences between each geographical area and the pooled data from the other seven. The charts also aligned closely with newborn charts from similar healthy populations.

This lack of important size difference between healthy fetuses from different ethnic groups implies that the differences we see every day between such groups are largely a result of environmental and nutritional factors. Once these are removed the differences disappear.  It is strong evidence against customisation by ethnicity.

But it is also the reason why some enthusiasts for customisation push back so strongly against it. They dispute the statistical methods, or point to other standard charts, notably those from WHO which we will discuss tomorrow, showing small differences between geographic groups, as evidence for customisation. But the converse argument does not apply. Finding ethnic differences, even in standard charts, is not strong evidence for customisation. There are small differences, e.g. between Seattle and Shunyi County in Intergrowth-21, and between countries in WHO, but the most likely explanation is that neither group of researchers succeeded completely in removing all study participants who had environmental constraints acting on their pregnancy.

The Intergrowth-21 authors concluded that their charts are the single best growth standard chart for use worldwide, and I agree. We use them in Nottingham.

Next (click here) the WHO standard charts.

Jim Thornton
