## Confidence intervals


In surveys, the most common measure of sampling error is the 95% confidence interval. But what does "95% confidence interval" mean?

Now you know the statistical definition, but what does this really mean when doing a survey?

Below is a schematic drawing of a 95% confidence interval. The horizontal black line shows the possible range of prevalence for our health outcome, for example, stunting in children 6-59 months of age. We do a survey, and the point estimate from the survey sample is 45%; that is, 45% of the children in the survey sample are stunted. But because this estimate is based on a randomly selected sample, it is subject to sampling error.

The confidence interval means that if you did many surveys in the same population using the same sample size and the same methods, the confidence intervals from about 95% of these surveys would include the true population value. As a result, if we have only one survey, we can be about 95% sure that its confidence interval includes the true population value, because that is what happens in 95% of the hypothetical replications of the survey.
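The idea of "many hypothetical surveys" can be made concrete with a small simulation. The sketch below is illustrative only: it assumes simple random sampling, a true prevalence of 45% and a sample size of 400 chosen arbitrarily, and the common normal-approximation (Wald) interval. The function names `wald_ci` and `coverage` are hypothetical, and real surveys with cluster sampling would need a different variance calculation.

```python
import math
import random


def wald_ci(successes, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumes simple random sampling)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


def coverage(true_prevalence=0.45, n=400, n_surveys=2000, seed=1):
    """Share of simulated surveys whose interval contains the true prevalence."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_surveys):
        # Draw one survey: each sampled child is stunted with the true probability.
        stunted = sum(rng.random() < true_prevalence for _ in range(n))
        low, high = wald_ci(stunted, n)
        if low <= true_prevalence <= high:
            hits += 1
    return hits / n_surveys


if __name__ == "__main__":
    print(f"Share of intervals containing the true value: {coverage():.3f}")
```

Under these assumptions the printed share should land close to 0.95. Each simulated survey gives a different point estimate and a different interval, but about 95% of those intervals cover the true value.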

Although we know that the true prevalence of stunting in young children in the entire population is probably not exactly 45%, we are 95% sure that the true prevalence is somewhere within the bracket. This bracket is the confidence interval. Adding specific numbers to the schematic drawing may help relate it to what you see on the page of a report (for example, prevalence of stunting = 45%; 95% CI: 35%, 55%).
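As a rough illustration of where such numbers can come from, the worked equation below uses the normal-approximation formula for a proportion. It assumes simple random sampling and a hypothetical sample size of $n = 95$, which is not stated in the example above; it was chosen only because it reproduces an interval of about 35% to 55%.

$$
\hat{p} \pm 1.96\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
= 0.45 \pm 1.96\sqrt{\frac{0.45 \times 0.55}{95}}
\approx 0.45 \pm 0.10
$$

Here $\hat{p}$ is the point estimate from the survey sample and $n$ is the number of children measured. Surveys that use cluster sampling generally have wider intervals than this formula suggests.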

The left drawing below shows a survey with large sampling error, probably because it had a small sample size. The right drawing shows a survey with a much smaller sampling error, probably because the sample size was larger. Note that the point estimate is the same for both surveys, 45%.

The drawing below is another way of visualizing confidence intervals. It imagines that a single survey is a dart which produces a single estimate of some health outcome, for example, the prevalence of having a safe water supply. If the sampling error is large because the sample size of the survey was small, the dart might have a large circle of uncertainty. We may be 95% sure that the true population value is somewhere in the circle, but if the circle is large, this survey result may not be very useful. If the sampling error is small because the sample size was large, the circle of uncertainty may be much smaller, as shown on the right. Now if we are 95% sure that the true population value is within this small circle, the survey result may be very useful.
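The link between sample size and the width of the interval can be sketched numerically. The example below is a minimal illustration, assuming simple random sampling and the normal-approximation formula; the sample sizes of 100 and 1,000 and the function name `half_width` are hypothetical choices, not values taken from the drawings.

```python
import math


def half_width(p, n, z=1.96):
    """Half-width of a normal-approximation 95% CI (assumes simple random sampling)."""
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    point_estimate = 0.45  # same point estimate for both hypothetical surveys
    for n in (100, 1000):
        hw = half_width(point_estimate, n)
        low, high = point_estimate - hw, point_estimate + hw
        print(f"n = {n:>4}: 95% CI {low:.1%} to {high:.1%}")
```

With the same point estimate of 45%, the larger sample gives a much narrower interval. Under this formula the width shrinks with the square root of the sample size, so ten times as many respondents narrows the interval by a factor of about 3.2, not 10.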