# User guide

## Pooled prevalence estimates are biased!

For all of the frequentist methods for estimating prevalence from pooled samples, the estimated prevalence (p) is an upwardly biased estimator of the true prevalence. The bias is smaller when the true prevalence is lower, when the number of pools (m) is larger, when the pool size (k) is smaller and when the total sample size (n) is larger. Bias also increases as the probability that all pools will test positive increases.
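
To make the direction and size of the bias concrete, the sketch below computes it exactly for a fixed pool size by summing over every possible number of positive pools. It assumes a perfect test and the usual fixed pool size estimator, p = 1 - (1 - x/m)^(1/k), where x is the number of positive pools. That formula and the function names `mle_prevalence` and `exact_bias` are illustrative assumptions, not taken from this guide.

```python
from math import comb


def mle_prevalence(x, m, k):
    """Assumed estimator: x positive pools out of m pools of size k, perfect test."""
    return 1.0 - (1.0 - x / m) ** (1.0 / k)


def exact_bias(true_p, m, k):
    """Expected estimate minus true prevalence, summed over all outcomes x = 0..m."""
    # Probability that a pool of k individuals contains at least one positive.
    pool_positive = 1.0 - (1.0 - true_p) ** k
    expected = sum(
        comb(m, x)
        * pool_positive ** x
        * (1.0 - pool_positive) ** (m - x)
        * mle_prevalence(x, m, k)
        for x in range(m + 1)
    )
    return expected - true_p


if __name__ == "__main__":
    # Hypothetical strategies: (true prevalence, number of pools, pool size).
    for true_p, m, k in [(0.05, 10, 10), (0.05, 50, 10), (0.01, 10, 10)]:
        print(f"p={true_p} m={m} k={k} bias={exact_bias(true_p, m, k):.5f}")
```

Under these assumptions the bias is positive for every strategy shown, and it shrinks when the number of pools is increased or the true prevalence is lowered. The outcome in which all pools test positive yields an estimate of 1, which is why that outcome inflates the bias.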

In general, bias is negligible for m > 30. However, the bias may also be quite small even for quite small values of m and k, particularly if p is low.

The estimated prevalence (p) is also sensitive to (and may be biased by) violations of the assumptions of perfect test sensitivity and specificity. It is particularly sensitive to errors in the assumed sensitivity as p increases and if k is too large. Clustering or overdispersion of positive individuals in the sampled population can also result in substantial bias in prevalence estimates.
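
The effect of an imperfect test can be illustrated with a simple model. The sketch assumes that a truly positive pool tests positive with probability equal to the test sensitivity regardless of pool size (no dilution effect), that a truly negative pool tests positive with probability 1 - specificity, and that the estimate is then calculated as if the test were perfect. The function name `apparent_prevalence` and its parameters are hypothetical, and the result is the large-sample value of the estimate rather than the output of any method in this guide.

```python
def apparent_prevalence(true_p, k, sensitivity=1.0, specificity=1.0):
    """Large-sample value of an estimate that wrongly assumes a perfect test."""
    # Probability that a pool of k individuals truly contains a positive.
    truly_positive = 1.0 - (1.0 - true_p) ** k
    # Probability that the pool tests positive under the assumed error model.
    tests_positive = (
        sensitivity * truly_positive
        + (1.0 - specificity) * (1.0 - truly_positive)
    )
    # Back-calculate prevalence as though sensitivity and specificity were 1.
    return 1.0 - (1.0 - tests_positive) ** (1.0 / k)


if __name__ == "__main__":
    # Hypothetical scenario: true sensitivity is 0.9 but 1.0 is assumed.
    for true_p in (0.02, 0.2):
        est = apparent_prevalence(true_p, k=10, sensitivity=0.9)
        rel_error = (true_p - est) / true_p
        print(f"true p={true_p} apparent p={est:.4f} relative error={rel_error:.1%}")
```

In this model the relative underestimate grows as the true prevalence (or the pool size) increases, because a larger share of pools is truly positive and therefore exposed to false-negative results.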

The actual bias in any estimate depends on the true prevalence, pool size and the number of pools and can be estimated for any particular pooling strategy using simulation methods. Simulation utilities are provided for both fixed and variable pool-size strategies to assist in evaluating the potential bias in proposed pooling strategies.
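
As a rough illustration of the simulation approach for a fixed pool size, the sketch below repeatedly generates pooled test results at a known true prevalence, estimates prevalence from each simulated data set and averages the error. It is not the simulation utility referred to above. It assumes a perfect test and the fixed pool size estimator p = 1 - (1 - x/m)^(1/k), and the name `simulate_bias` and its parameters are hypothetical.

```python
import random


def simulate_bias(true_p, m, k, n_sims=20000, seed=1):
    """Return (estimated bias, proportion of simulations with all pools positive)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Probability that a pool of k individuals contains at least one positive.
    pool_positive = 1.0 - (1.0 - true_p) ** k
    total_estimate = 0.0
    all_positive = 0
    for _ in range(n_sims):
        # Number of positive pools in one simulated survey of m pools.
        x = sum(rng.random() < pool_positive for _ in range(m))
        if x == m:
            all_positive += 1  # the estimate is 1 when every pool is positive
        total_estimate += 1.0 - (1.0 - x / m) ** (1.0 / k)
    return total_estimate / n_sims - true_p, all_positive / n_sims


if __name__ == "__main__":
    # Hypothetical strategies: 10 pools of 10 versus 50 pools of 10.
    for m in (10, 50):
        bias, frac_all_pos = simulate_bias(true_p=0.05, m=m, k=10)
        print(f"m={m}: bias={bias:.5f}, all pools positive in {frac_all_pos:.2%} of runs")
```

The same loop could be adapted to variable pool sizes by simulating each pool size separately, although the estimator would then need to be replaced by one suited to that design.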

Bias can be minimised by ensuring an adequate total sample size, by testing a larger number of smaller pools rather than a smaller number of larger pools, or by testing several individual samples in addition to the pooled samples (using the variable pool size method).