# 3 - Bayesian vs frequentist methods

The analytical methods provided on this site all fall into one of two broad categories of statistical methods: frequentist or Bayesian.

Frequentist methods use conventional statistical techniques to calculate maximum-likelihood estimates of true prevalence and confidence limits, in a similar manner to the standard techniques used to analyse conventional survey data. They are conceptually simpler and usually computationally easier to implement (and take less computer time to run). However, they do not take account of any existing knowledge of the likely prevalence, although some methods do allow estimates to be adjusted for imperfect sensitivity and specificity of the tests used.
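As an illustration of the frequentist approach, with a perfect test and a fixed pool size k the maximum-likelihood estimate of true prevalence is 1 - (1 - x/n)^(1/k), where x of n pools test positive. The sketch below computes this estimate with approximate confidence limits; the Wald-type interval on the pool-level proportion is one illustrative choice of interval, not the only one:

```python
import math

def pooled_prevalence_mle(x, n, k, z=1.96):
    """MLE of individual-level (true) prevalence from pooled testing,
    assuming a perfect test and a fixed pool size k.
    x = number of positive pools, n = number of pools tested.
    Returns (estimate, lower, upper); the interval is obtained by
    transforming a Wald interval on the pool-level proportion
    (an illustrative choice, for sketch purposes only)."""
    q = x / n                                # observed proportion of positive pools
    p_hat = 1.0 - (1.0 - q) ** (1.0 / k)     # MLE of animal-level prevalence
    se = math.sqrt(q * (1.0 - q) / n)        # SE of the pool-level proportion
    lo_q = max(0.0, q - z * se)
    hi_q = min(1.0, q + z * se)
    lower = 1.0 - (1.0 - lo_q) ** (1.0 / k)
    upper = 1.0 - (1.0 - hi_q) ** (1.0 / k)
    return p_hat, lower, upper

# e.g. 7 positive pools out of 50 pools of 5 animals each
est, lo, hi = pooled_prevalence_mle(7, 50, 5)
```

Note how the estimate is simply the pool-level proportion back-transformed through the probability that a pool of k animals contains at least one infected animal.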

On the other hand, Bayesian methods use simulation (a Gibbs sampler) to derive posterior probability distributions for the parameters of interest - usually true prevalence, although distributions for test sensitivity, specificity and other parameters are also generated.

Bayesian methods also have the advantages that:

• pre-existing estimates of prevalence can be incorporated in the analysis to increase confidence in the results;
• imperfect sensitivity and specificity of tests and uncertainty about their true values are incorporated explicitly in the procedure; and
• the lower probability limit of the estimated prevalence can never be negative.

Briefly, a Bayesian approach allows the combination of any prior information available on test sensitivity and specificity and estimated prevalence of disease with the results of testing, to produce a posterior probability distribution of the estimated true prevalence (and other measures such as test sensitivity and specificity) that best fits the combination of prior distributions and observed testing results. Bayesian methods were initially developed for estimating prevalence from individual testing and were subsequently extended for use with pooled testing strategies.
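To make the mechanics concrete, here is a minimal sketch of the classic Gibbs sampler for true prevalence from a single imperfect test, applied to individual animals rather than pools for brevity. The Beta prior parameters, starting values and iteration counts are illustrative assumptions, not values prescribed by this site:

```python
import random

def binom(n, p):
    """Binomial draw as a Bernoulli sum (stdlib only; fine for small n)."""
    return sum(random.random() < p for _ in range(n))

def gibbs_true_prevalence(n_pos, n_total, iters=5000, burn=1000,
                          prior_p=(1, 1), prior_se=(25, 3), prior_sp=(30, 2)):
    """Minimal Gibbs sampler for true prevalence from one imperfect test
    applied to individuals. n_pos of n_total animals tested positive.
    Beta priors on prevalence, Se and Sp are illustrative.
    Returns post-burn-in draws of (prevalence, Se, Sp)."""
    p, se, sp = 0.5, 0.8, 0.9            # arbitrary starting values
    n_neg = n_total - n_pos
    draws = []
    for i in range(iters):
        # Impute latent counts of truly infected animals among the
        # test-positives (y1) and test-negatives (y2)
        pr1 = p * se / (p * se + (1 - p) * (1 - sp))
        pr2 = p * (1 - se) / (p * (1 - se) + (1 - p) * sp)
        y1 = binom(n_pos, pr1)
        y2 = binom(n_neg, pr2)
        # Conjugate Beta updates given the imputed counts
        p = random.betavariate(prior_p[0] + y1 + y2,
                               prior_p[1] + n_total - y1 - y2)
        se = random.betavariate(prior_se[0] + y1, prior_se[1] + y2)
        sp = random.betavariate(prior_sp[0] + n_neg - y2,
                                prior_sp[1] + n_pos - y1)
        if i >= burn:
            draws.append((p, se, sp))
    return draws

draws = gibbs_true_prevalence(30, 200)
```

Each iteration alternates between imputing the latent infection status counts and drawing prevalence, sensitivity and specificity from their conjugate Beta full conditionals; the retained draws approximate the joint posterior distribution.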

However, Bayesian estimates can be seriously affected by the use of inappropriate prior distributions (inaccurate estimates and/or overconfidence in the values) for prevalence, sensitivity or specificity, and therefore must be used with care. Wherever possible, prior estimates should be based on real data and should be appropriately weighted (given wide probability limits) so that any errors in them do not dominate the data and cause inaccurate results. See the Glossary for more details on Bayesian methods and the Beta distribution and its parameters.
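One common way to turn a prior belief into Beta distribution parameters is to specify a most likely value (mode) and a weight that acts like an effective prior sample size; larger weights give narrower, more confident priors. This is one parameterisation among several, shown purely as an illustration:

```python
def beta_from_mode(mode, weight):
    """Beta(a, b) parameters with the given mode, where `weight` (> 2)
    acts as an effective prior sample size: larger weight -> a narrower
    (more confident) prior. One common parameterisation, for illustration."""
    a = mode * (weight - 2) + 1
    b = (1 - mode) * (weight - 2) + 1
    return a, b

# A weakly informative prior for a test sensitivity thought to be about 0.9:
a, b = beta_from_mode(0.9, 10)   # Beta(8.2, 1.8); mode (a-1)/(a+b-2) = 0.9
```

Keeping the weight modest is one practical way to give a prior the "wide probability limits" recommended above, so that the data, not the prior, drive the posterior.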

Bayesian methods also rely on simulation rather than analytical calculation, and can therefore take some time to run, depending on the number of iterations used. It is also important that sufficient iterations are run to allow convergence of the Bayesian model and to support inference from the results.
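Convergence is often assessed by running several chains from different starting values and comparing within-chain to between-chain variability. The potential scale reduction factor (R-hat) below is one standard diagnostic, sketched here in plain form; values near 1 suggest the chains have mixed:

```python
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for several equal-length
    chains of posterior draws. Values close to 1 suggest convergence.
    Plain textbook formula, shown as an illustrative diagnostic."""
    m = len(chains)                      # number of chains
    n = len(chains[0])                   # draws per chain
    means = [statistics.fmean(c) for c in chains]
    grand = statistics.fmean(means)
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)       # between-chain
    w = statistics.fmean(statistics.variance(c) for c in chains)   # within-chain
    var_hat = (n - 1) / n * w + b / n    # pooled posterior variance estimate
    return (var_hat / w) ** 0.5
```

In practice one would discard an initial burn-in from each chain before computing the diagnostic, and only draw inferences once R-hat is close to 1 for all monitored parameters.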

The outputs from the Gibbs sampler are revised estimates of prevalence, test sensitivity, test specificity and any other parameters of interest, expressed as posterior probability distributions.
