Standard, Population & Customised fetal size charts 15 – empirical studies
How do customised charts perform in practice?
Previous posts have described why, in theory, fetal growth standard charts such as Intergrowth-21 or WHO are preferable to local population or customised charts. Here we look at empirical comparisons. Unfortunately there are few good ones, many are misleading*, and reliable ones are difficult to do.
Comparisons between customised and population charts (e.g. here, or here) don’t help; a population chart is simply a type of customised one. Claims that percentages of fetuses falling above and below various centiles vary more with standard charts than with customised ones (Francis et al 2018 click here) miss the point. This is a feature not a bug of standard charts!
Studies comparing the rate of detection of fetuses who are small for gestational age (SGA) by charts A and B, but which either don’t define SGA (Gardosi and Francis 2005 click here) or define it by a third chart C (e.g. Odibo et al. 2018 click here) confuse the issue as well.
We need to compare different charts' detection rates for the sort of pathology that matters, stillbirth or brain damage, or for surrogates for those things, such as low cord pH, low Apgar scores or admission to neonatal intensive care. Such studies need to fix the false positive rate. Detecting more babies with adverse outcomes is not necessarily better if the false positive rate is also higher. Francis et al 2018 (click here) reported that customised charts would have detected 411 stillbirths compared with Intergrowth's 229 in a cohort of 1.2 million term singleton pregnancies. However, to do so, the customised charts would have classified 10.5% of pregnancies as growth restricted, while Intergrowth-21 would have classified only 4.4%. At least two other studies suggesting that customised charts detect more pathology than Intergrowth-21 (Anderson et al. 2016 click here, and Pritchard et al. 2018 click here) also had double the false positive rates with customisation.
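To see why the raw counts mislead, here is a rough back-of-envelope calculation in Python using only the Francis et al figures quoted above. It assumes the cohort was exactly 1.2 million and that the quoted percentages are the proportions of all pregnancies flagged by each chart. Both are approximations, and the variable names are mine, not the paper's.

```python
# Back-of-envelope check using only the figures quoted in the text.
# Assumption: cohort size is taken as exactly 1,200,000 (the text says "1.2 million").
cohort = 1_200_000

charts = {
    # chart name: (stillbirths detected, fraction of pregnancies flagged)
    "customised": (411, 0.105),
    "intergrowth": (229, 0.044),
}

per_1000_flagged = {}
for name, (stillbirths, flagged_fraction) in charts.items():
    flagged = round(cohort * flagged_fraction)
    # Stillbirths detected per 1,000 pregnancies labelled growth restricted
    per_1000_flagged[name] = 1000 * stillbirths / flagged
    print(f"{name}: {flagged} flagged, "
          f"{per_1000_flagged[name]:.2f} stillbirths per 1,000 flagged")

# How many more pregnancies are flagged, and how many more stillbirths detected?
flag_ratio = charts["customised"][1] / charts["intergrowth"][1]
detection_ratio = charts["customised"][0] / charts["intergrowth"][0]
print(f"flagged x{flag_ratio:.2f}, detected x{detection_ratio:.2f}")
```

On these assumptions the customised charts flag about 2.4 times as many pregnancies to detect about 1.8 times as many stillbirths, so each flagged pregnancy is less likely to be a stillbirth than with Intergrowth-21. This is not a like-for-like comparison at a fixed false positive rate, only an illustration of why one is needed.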
A recent large retrospective US study (Kabiri 2019 click here) of five different charts, including Intergrowth-21, WHO and GROW, compared detection rates of a range of different adverse outcomes for a fixed 10% false positive rate (figure 2 in the paper). Intergrowth had the highest sensitivity overall, and GROW the lowest, although the differences were small and the confidence intervals overlapped. Another retrospective study from Scotland (click here), which also reported detection rates for fixed false positive rates, showed that partial customisation on two objective factors (maternal height and parity) did not improve detection of pathology.
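For readers who want to see what "detection rate for a fixed false positive rate" means in practice, here is a minimal Python sketch. It is not the method used in any of the papers above. The function name, the toy centiles and the outcomes are all invented for illustration, and it assumes a lower centile means a more abnormal result.

```python
import math

def sensitivity_at_fixed_fpr(centiles, adverse, target_fpr=0.10):
    """Return (cut-off, sensitivity, achieved FPR) for one chart.

    centiles: size centile given to each pregnancy by the chart
    adverse:  True if that pregnancy had the adverse outcome
    The cut-off is chosen so that at most target_fpr of the
    unaffected pregnancies are flagged (centile <= cut-off).
    """
    unaffected = sorted(c for c, a in zip(centiles, adverse) if not a)
    affected = [c for c, a in zip(centiles, adverse) if a]
    if not unaffected or not affected:
        raise ValueError("need both affected and unaffected pregnancies")

    # Number of unaffected pregnancies we are allowed to flag
    allowed = math.floor(target_fpr * len(unaffected) + 1e-9)
    if allowed == 0:
        raise ValueError("too few unaffected pregnancies for this target")

    cutoff = unaffected[allowed - 1]
    # Ties at the cut-off can push the achieved rate above the target,
    # so report the rate actually achieved as well.
    achieved_fpr = sum(c <= cutoff for c in unaffected) / len(unaffected)
    sensitivity = sum(c <= cutoff for c in affected) / len(affected)
    return cutoff, sensitivity, achieved_fpr

# Invented toy data: 4 affected pregnancies, then 10 unaffected ones.
adverse = [True] * 4 + [False] * 10
chart_a = [2, 5, 30, 8, 4, 15, 22, 35, 41, 50, 58, 66, 75, 90]
chart_b = [3, 12, 40, 9, 14, 20, 25, 33, 47, 52, 61, 70, 82, 95]

result_a = sensitivity_at_fixed_fpr(chart_a, adverse)
result_b = sensitivity_at_fixed_fpr(chart_b, adverse)
print("chart A:", result_a)
print("chart B:", result_b)
```

Each chart gets its own cut-off, chosen so that both flag the same share of unaffected pregnancies. Only then are the two sensitivities directly comparable. Comparing both charts at their nominal 10th centile would not guarantee this.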
With one exception, these studies are the best we have. However, such retrospective comparisons, where one chart type was used in practice, are tricky because clinicians will have acted on the result from the chart in use. If, for example, they induced labour early they may have prevented, or caused, an adverse outcome, the so-called treatment paradox. Caesarean for a suspected small baby may prevent the death which the chart was correctly predicting. Conversely, induction for a false prediction of growth restriction may cause hypoxia in a previously healthy baby as a result of long labour.
This final problem is difficult to avoid. It requires scan findings to be concealed from clinicians. This has only been done once, in the POP study from Cambridge (click here), which showed that customising using the GROW software did not improve detection rates of any adverse outcomes.
And that’s pretty much it. Many other papers purport to show that customisation is good or bad, but they all either reported percentages above or below different centiles, or compared customised with population charts, or reported detection rates for pathology without fixing false positive rates, or did all three, or came to no clear conclusion.
Next and finally, a summary of what we have learned (click here).
Jim Thornton
* Readers beware. Authors often use the terms population, standard and customised charts rather loosely.