# User guide

## Estimated true prevalence using two tests with a Gibbs sampler

This analysis uses a Bayesian approach and Gibbs sampler to estimate the true animal-level prevalence of infection based on testing of individual (not pooled) samples using two (2) tests with imperfect sensitivities and/or specificities. The analysis requires prior estimates of true prevalence and sensitivity and specificity for both tests as Beta probability distributions. Outputs are posterior probability distributions for prevalence, sensitivity and specificity. The analysis assumes that the two tests are independent, conditional on disease status. See Joseph et al. (1995) for more details.

Required inputs for this analysis are:

• the number of samples tested,
• the number of samples in each cell of the 2x2 table of comparative test results,
• alpha and beta parameters for prior Beta distributions for true prevalence and for the sensitivity and specificity of each test,
• the number of iterations to be simulated in the Gibbs sampler,
• the number of iterations to be discarded to allow convergence of the model,
• lower and upper probability (confidence) limits for summarising the output distributions and
• starting values for the number of truly infected individuals in each cell of the 2x2 table of comparative test results.

The number of samples tested must be a positive integer, and the number of samples in each cell of the 2x2 table must be an integer >=0 and <= the number of samples tested. Alpha and beta parameters for prevalence, sensitivities and specificities must be >0, and the upper and lower confidence limits must be >0 and <1. Starting values for the numbers of truly infected individuals in each cell of the 2x2 table must be integers >=0 and <= the number of results in that cell. The number of iterations and the number discarded must both be positive integers (>0), and the number discarded must be less than the number of iterations.
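The constraints above can be sketched as a single validation step. This is only an illustration: the function name, argument names and error messages are hypothetical, and the two checks marked as assumptions are not stated in this guide.

```python
def validate_inputs(n_tested, cells, priors, n_iter, n_discard, lower, upper, start):
    """Raise ValueError if any input breaks the constraints described above.

    cells and start are tuples ordered as (a, b, c, d); priors maps a
    parameter name to an (alpha, beta) pair. All names are hypothetical.
    """
    def is_int(value):
        return isinstance(value, int) and not isinstance(value, bool)

    if not (is_int(n_tested) and n_tested > 0):
        raise ValueError("number of samples tested must be a positive integer")
    for count in cells:
        if not (is_int(count) and 0 <= count <= n_tested):
            raise ValueError("each cell count must be an integer from 0 to n_tested")
    # Assumption (not stated in the guide): the four cells add up to n_tested.
    if sum(cells) != n_tested:
        raise ValueError("cell counts should sum to the number of samples tested")
    for name, (alpha, beta) in priors.items():
        if alpha <= 0 or beta <= 0:
            raise ValueError(f"alpha and beta for {name} must be > 0")
    for limit in (lower, upper):
        if not 0 < limit < 1:
            raise ValueError("probability limits must be > 0 and < 1")
    # Assumption (not stated in the guide): the lower limit is below the upper one.
    if lower >= upper:
        raise ValueError("lower limit must be below the upper limit")
    for y, count in zip(start, cells):
        if not (is_int(y) and 0 <= y <= count):
            raise ValueError("each starting value must be an integer from 0 to its cell count")
    if not (is_int(n_iter) and n_iter > 0 and is_int(n_discard) and n_discard > 0):
        raise ValueError("iterations and number discarded must be positive integers")
    if n_discard >= n_iter:
        raise ValueError("number discarded must be less than the number of iterations")
```

The function returns nothing when every check passes, so it can be called once before the sampler is started.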

For this analysis, the observed results of testing with two tests concurrently can be described in a 2x2 table as follows:

| | Test 2 +ve | Test 2 -ve |
|---|---|---|
| Test 1 +ve | a | b |
| Test 1 -ve | c | d |

where a, b, c and d are the observed numbers of sample results in each cell. A proportion of the samples in each cell will be from truly infected animals, depending on true prevalence and the test sensitivities and specificities. The Gibbs sampler is used to estimate the true number of infected animals represented in each of the cells (Y1, Y2, Y3 and Y4) and hence to generate posterior probability distributions for true prevalence and for the test sensitivities and specificities that best fit the data and the prior distributions provided.
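A minimal sketch of such a sampler is shown here, assuming the two tests are conditionally independent and following the general scheme of Joseph et al. (1995). It is not the code used by this tool. The function names, the `priors` dictionary layout and the mapping of Y1 to Y4 onto cells a to d (in that order) are assumptions made for illustration. Each iteration draws the five parameters from Beta distributions given the current Y values, then redraws each Y from a Binomial distribution given the parameters.

```python
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (adequate for modest n)."""
    return sum(rng.random() < p for _ in range(n))


def infected_share(p_infected, p_uninfected):
    """Probability that a sample in a cell comes from an infected animal."""
    total = p_infected + p_uninfected
    return p_infected / total if total > 0 else 0.0


def gibbs_two_tests(a, b, c, d, priors, start, n_iter, n_discard, seed=None):
    """Return retained draws for two conditionally independent tests.

    priors: dict with keys "prev", "se1", "sp1", "se2", "sp2", each an
    (alpha, beta) pair. start: initial (Y1, Y2, Y3, Y4) for cells a, b, c, d.
    """
    rng = random.Random(seed)
    n = a + b + c + d
    y1, y2, y3, y4 = start
    draws = []
    for i in range(n_iter):
        infected = y1 + y2 + y3 + y4
        # Parameters given the latent infected counts (Beta conjugate updates).
        prev = rng.betavariate(infected + priors["prev"][0],
                               n - infected + priors["prev"][1])
        se1 = rng.betavariate(y1 + y2 + priors["se1"][0],
                              y3 + y4 + priors["se1"][1])
        sp1 = rng.betavariate((c - y3) + (d - y4) + priors["sp1"][0],
                              (a - y1) + (b - y2) + priors["sp1"][1])
        se2 = rng.betavariate(y1 + y3 + priors["se2"][0],
                              y2 + y4 + priors["se2"][1])
        sp2 = rng.betavariate((b - y2) + (d - y4) + priors["sp2"][0],
                              (a - y1) + (c - y3) + priors["sp2"][1])
        # Latent infected counts given the parameters (Binomial draws per cell).
        y1 = binomial(a, infected_share(prev * se1 * se2,
                                        (1 - prev) * (1 - sp1) * (1 - sp2)), rng)
        y2 = binomial(b, infected_share(prev * se1 * (1 - se2),
                                        (1 - prev) * (1 - sp1) * sp2), rng)
        y3 = binomial(c, infected_share(prev * (1 - se1) * se2,
                                        (1 - prev) * sp1 * (1 - sp2)), rng)
        y4 = binomial(d, infected_share(prev * (1 - se1) * (1 - se2),
                                        (1 - prev) * sp1 * sp2), rng)
        if i >= n_discard:  # keep only iterations after the discarded ones
            draws.append({"prev": prev, "se1": se1, "sp1": sp1,
                          "se2": se2, "sp2": sp2, "y": (y1, y2, y3, y4)})
    return draws
```

Called as `gibbs_two_tests(40, 10, 8, 142, priors, start=(35, 5, 4, 2), n_iter=7000, n_discard=2000)` with made-up counts, the function returns one dictionary per retained iteration, and each column of values forms the posterior sample for that quantity.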

Prior estimates of the true prevalence and test sensitivity and specificity may be based on expert knowledge or on previous data. These estimates are specified as Beta probability distributions, with parameters alpha and beta. Beta probability distributions are commonly used to express uncertainty about a proportion based on a random sample of individuals. In this situation, if x individuals are positive for a characteristic out of n examined, then the alpha and beta parameters can be calculated as alpha = x + 1 and beta = n - x + 1. Alternatively, alpha and beta can be calculated using the Beta distribution utility, provided estimates of the mode and 5% or 95% confidence limits are available from expert opinion.
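As a small worked example of the count-based rule above, the sketch below converts x positives out of n examined into Beta parameters. The function names and the example counts (45 positive out of 50) are hypothetical. The mode is included only to show that a prior built this way peaks at the observed proportion x/n.

```python
def beta_from_counts(x, n):
    """Return (alpha, beta) for x positives out of n examined."""
    if not 0 <= x <= n:
        raise ValueError("x must be between 0 and n")
    return x + 1, n - x + 1


def beta_mode(alpha, beta):
    """Mode of a Beta(alpha, beta) distribution; defined here for alpha, beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("mode formula used here needs alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)


# Hypothetical prior data: 45 of 50 known-infected animals tested positive.
alpha, beta = beta_from_counts(45, 50)   # alpha = 46, beta = 6
prior_mode = beta_mode(alpha, beta)      # equals 45 / 50
```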

Outputs from the Gibbs sampler are posterior probability distributions for:

• animal-level prevalence,
• test sensitivity for both tests,
• test specificity for both tests,
• positive and negative predictive values for both tests,
• the numbers of truly infected individuals (Y1, Y2, Y3 and Y4) in each cell of the 2x2 table describing the comparative test results.
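The predictive values in this list are not sampled directly. One common way to obtain them, assumed here rather than taken from this guide, is to apply the standard formulas to the prevalence, sensitivity and specificity drawn at each iteration. The function name is hypothetical.

```python
def predictive_values(prev, se, sp):
    """Return (PPV, NPV) for one test from prevalence, sensitivity and specificity."""
    true_pos = prev * se
    false_pos = (1 - prev) * (1 - sp)
    true_neg = (1 - prev) * sp
    false_neg = prev * (1 - se)
    ppv = true_pos / (true_pos + false_pos)   # P(infected | test positive)
    npv = true_neg / (true_neg + false_neg)   # P(uninfected | test negative)
    return ppv, npv
```

Applying this function to every retained iteration gives a posterior sample of predictive values for each test.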

These distributions are described by their:

Because the Gibbs sampler estimates prevalence iteratively from the data and the prior distributions, the model may need a number of iterations to converge. A specified number of initial iterations must therefore be discarded (not used for estimation). This number must be sufficient to allow convergence and should be at least 2000 to 5000. It is also important to carry out enough iterations to support inference from the results. Suggested minimum values for the total number of iterations and the number to be discarded are provided, but can be varied if desired.
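Discarding the initial iterations and summarising what remains can be sketched as follows. The function names, the linear-interpolation percentile and the particular statistics reported are assumptions for illustration and may differ from what the tool reports.

```python
import statistics


def percentile(sorted_values, p):
    """Percentile by linear interpolation between the two closest ranks."""
    position = (len(sorted_values) - 1) * p
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    fraction = position - lower_index
    return sorted_values[lower_index] + fraction * (
        sorted_values[upper_index] - sorted_values[lower_index])


def summarise_chain(chain, n_discard, lower=0.025, upper=0.975):
    """Drop the first n_discard iterations, then summarise the remainder."""
    kept = sorted(chain[n_discard:])
    return {
        "n": len(kept),
        "mean": statistics.fmean(kept),
        "median": statistics.median(kept),
        "lower": percentile(kept, lower),
        "upper": percentile(kept, upper),
    }
```

Here `chain` is the full sequence of sampled values for one quantity, such as prevalence, in iteration order, and `lower` and `upper` play the role of the probability limits entered as inputs.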

This analysis may take several minutes to complete, depending on the number of iterations required.