# User guide

## Simulate sampling for fixed pool size

These utilities were developed as additional tools to help in the evaluation of the validity and precision of different pooling strategies for fixed pool sizes. Three different options are provided depending on assumptions about test sensitivity and specificity. The three options are:

- assuming both sensitivity and specificity are perfect (100%);
- assuming sensitivity and/or specificity are less than 100% but are known exactly; and
- assuming that sensitivity and/or specificity are uncertain.

These programs each use two separate pairs of values for sensitivity and specificity. The first pair is used to estimate true prevalence from the simulated testing results (for the perfect test option, the estimated sensitivity and specificity are both assumed to be 100% and cannot be entered on the input screen). The second pair ('True test sensitivity/specificity') is used to determine the actual results of testing during simulation. By specifying different values for the true and the estimated sensitivity (or specificity), it is possible to evaluate the importance of potential errors in the assumed values. For example, if prevalence is estimated assuming that the test is perfect (both sensitivity and specificity are 100%) but the true sensitivity is in fact, say, 80%, the true prevalence would be substantially underestimated, resulting in a biased estimate. This model allows the magnitude of this bias to be estimated.
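
The underestimation described above can be illustrated with a short calculation. The sketch below is not taken from the utilities themselves: it assumes the standard relationship between individual-level prevalence and the probability that a pool of fixed size tests positive, and the function names and example values (5% prevalence, pools of 5) are hypothetical.

```python
def pool_positive_probability(prevalence, pool_size, sensitivity=1.0, specificity=1.0):
    """Probability that a pool of `pool_size` individuals tests positive."""
    prob_pool_clean = (1.0 - prevalence) ** pool_size
    return sensitivity * (1.0 - prob_pool_clean) + (1.0 - specificity) * prob_pool_clean


def estimate_prevalence(prop_positive_pools, pool_size, sensitivity=1.0, specificity=1.0):
    """Invert the relationship above to recover individual-level prevalence."""
    prob_pool_clean = (sensitivity - prop_positive_pools) / (sensitivity + specificity - 1.0)
    # Assumption: keep the intermediate proportion inside [0, 1].
    prob_pool_clean = min(max(prob_pool_clean, 0.0), 1.0)
    return 1.0 - prob_pool_clean ** (1.0 / pool_size)


true_prevalence = 0.05  # hypothetical design prevalence
pool_size = 5           # hypothetical fixed pool size

# Expected proportion of positive pools when the true sensitivity is 80%.
expected_positive = pool_positive_probability(true_prevalence, pool_size, sensitivity=0.8)

# Estimate assuming a perfect test versus using the true sensitivity.
naive_estimate = estimate_prevalence(expected_positive, pool_size)
adjusted_estimate = estimate_prevalence(expected_positive, pool_size, sensitivity=0.8)

print(f"assuming a perfect test: {naive_estimate:.4f}")
print(f"using true sensitivity:  {adjusted_estimate:.4f}")
```

The estimate that assumes a perfect test falls below the design prevalence, while the estimate that uses the true sensitivity recovers it.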

All three methods simulate sampling and prevalence estimation for up to 6 different pooling strategies for assumed values of prevalence and test sensitivity and specificity and for a specified level of confidence, assuming that a fixed pool size is used. The program runs multiple iterations of sampling and estimation and calculates the mean prevalence, confidence interval width and estimated bias across all iterations. By simulating alternative pooling strategies this utility allows the various strategies to be evaluated and compared to determine the optimum strategy that will give the desired level of precision in the prevalence estimate and also minimise the level of bias in the estimate.

For each pooling strategy, the program simulates sampling, pooling and testing of individuals from an infinite population with the specified prevalence, using a test of the specified true sensitivity and specificity. Sampling and testing are repeated for the specified number of iterations for each strategy, and the prevalence, confidence interval width and variance are estimated for each iteration using the selected method and the assumed values for sensitivity and specificity. The mean prevalence, bias, confidence interval width and variance are calculated across all iterations for each strategy, where mean bias is the mean prevalence estimate less the true (design) prevalence for the population. Mean squared error (mean variance plus the square of the mean bias) is also calculated. The mean bias is then expressed as a proportion of the mean estimated prevalence and of the true (design) prevalence, and the squared mean bias as a proportion of the mean squared error.
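
The simulation loop described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the program's actual code: the function and parameter names are hypothetical, the estimator is the standard one for fixed pool sizes, and truncating out-of-range estimates to [0, 1] is an assumption made here for simplicity. Per-iteration variance and confidence limits are omitted.

```python
import random
import statistics


def simulate_strategy(pool_size, num_pools, true_prevalence,
                      true_se, true_sp, assumed_se, assumed_sp,
                      iterations, rng):
    """Simulate pooled sampling and return summary statistics of the estimates."""
    estimates = []
    for _ in range(iterations):
        positive_pools = 0
        for _ in range(num_pools):
            # Infinite population: each individual is infected independently.
            infected = any(rng.random() < true_prevalence for _ in range(pool_size))
            # The TRUE sensitivity/specificity decide the test result.
            prob_test_positive = true_se if infected else 1.0 - true_sp
            if rng.random() < prob_test_positive:
                positive_pools += 1
        prop_positive = positive_pools / num_pools
        # The ASSUMED sensitivity/specificity are used for estimation.
        q = (assumed_se - prop_positive) / (assumed_se + assumed_sp - 1.0)
        q = min(max(q, 0.0), 1.0)  # assumption: truncate to a valid proportion
        estimates.append(1.0 - q ** (1.0 / pool_size))
    mean_estimate = statistics.fmean(estimates)
    return {
        "mean_prevalence": mean_estimate,
        "min_prevalence": min(estimates),
        "max_prevalence": max(estimates),
        "mean_bias": mean_estimate - true_prevalence,
    }


# Hypothetical strategy: 50 pools of 5, true sensitivity 80%, test assumed perfect.
rng = random.Random(2024)  # seeded so that the run is reproducible
result = simulate_strategy(pool_size=5, num_pools=50, true_prevalence=0.05,
                           true_se=0.8, true_sp=1.0,
                           assumed_se=1.0, assumed_sp=1.0,
                           iterations=1000, rng=rng)
print(result)
```

Calling the function once per strategy with the same design values allows the mean bias of alternative pool sizes and numbers of pools to be compared side by side.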

Outputs for each method are summarised across all iterations for each strategy entered and presented in a summary table. The main outputs are:

- mean prevalence;
- minimum and maximum prevalence estimates;
- mean bias in the estimated mean prevalence;
- mean confidence interval width;
- mean standard error of the estimated prevalence;
- mean squared error of the estimated prevalence (mean variance plus the square of the mean bias);
- relative bias as a proportion of the mean estimated (apparent) prevalence (AP);
- relative bias as a proportion of the specified design (true) prevalence (TP);
- squared mean bias as a proportion of the mean squared error;
- the proportion of 'valid' estimates, where the confidence interval for the estimated prevalence contains the true (design) prevalence;
- detailed results for all iterations for each strategy (download as a text file by clicking on the appropriate icon in the summary results table); and
- histogram of the distribution of prevalence estimates (view or download by clicking on the appropriate icon in the summary results table).
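
The numerical outputs in this list could be derived from per-iteration results along the following lines. This is a sketch only: the function name, the dictionary keys and the input values are hypothetical, and taking the mean standard error as the mean of the per-iteration square roots of the variance is an assumption.

```python
import math
import statistics


def summarise(estimates, variances, intervals, true_prevalence):
    """Summarise per-iteration estimates, variances and (lower, upper) limits."""
    mean_prev = statistics.fmean(estimates)
    mean_bias = mean_prev - true_prevalence
    mse = statistics.fmean(variances) + mean_bias ** 2
    valid = sum(1 for lower, upper in intervals if lower <= true_prevalence <= upper)
    return {
        "mean_prevalence": mean_prev,
        "min_prevalence": min(estimates),
        "max_prevalence": max(estimates),
        "mean_bias": mean_bias,
        "mean_ci_width": statistics.fmean(upper - lower for lower, upper in intervals),
        # Assumption: average the per-iteration standard errors.
        "mean_standard_error": statistics.fmean(math.sqrt(v) for v in variances),
        "mean_squared_error": mse,
        "bias_over_ap": mean_bias / mean_prev if mean_prev else float("nan"),
        "bias_over_tp": mean_bias / true_prevalence,
        "bias_sq_over_mse": mean_bias ** 2 / mse if mse else float("nan"),
        "proportion_valid": valid / len(intervals),
    }


# Two made-up iterations, purely to show the arithmetic.
summary = summarise(
    estimates=[0.03, 0.05],
    variances=[0.0001, 0.0003],
    intervals=[(0.01, 0.045), (0.03, 0.07)],
    true_prevalence=0.05,
)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```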

For fixed pool sizes and perfect tests or tests of known sensitivity and specificity, exact binomial confidence limits are used. For fixed pool sizes and tests of uncertain sensitivity and specificity, asymptotic confidence limits are used. For fixed pool sizes and tests of known sensitivity and specificity, the simulated (exact) confidence intervals may be substantially wider than the corresponding asymptotic confidence intervals. Therefore, sample sizes calculated using asymptotic methods for known sensitivity and specificity may be inadequate if exact confidence limits are calculated, and may need to be increased to achieve the desired precision.
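
As an illustration of what exact binomial limits involve, the sketch below computes Clopper-Pearson limits for the proportion of positive pools and transforms them to the prevalence scale. It assumes that 'exact' refers to the Clopper-Pearson method and that the limits are obtained by transforming pool-level limits; neither is confirmed by this guide, and all function names and example values are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(x + 1))


def solve_decreasing(func, target, tol=1e-10):
    """Bisection on [0, 1] for func(p) == target, where func decreases in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_pool_limits(positive_pools, num_pools, confidence=0.95):
    """Clopper-Pearson limits for the proportion of positive pools."""
    alpha = 1.0 - confidence
    lower = 0.0 if positive_pools == 0 else solve_decreasing(
        lambda p: binom_cdf(positive_pools - 1, num_pools, p), 1.0 - alpha / 2.0)
    upper = 1.0 if positive_pools == num_pools else solve_decreasing(
        lambda p: binom_cdf(positive_pools, num_pools, p), alpha / 2.0)
    return lower, upper


def to_prevalence(pool_proportion, pool_size, sensitivity=1.0, specificity=1.0):
    """Convert a pool-level proportion to individual-level prevalence."""
    q = (sensitivity - pool_proportion) / (sensitivity + specificity - 1.0)
    q = min(max(q, 0.0), 1.0)  # assumption: truncate to a valid proportion
    return 1.0 - q ** (1.0 / pool_size)


def exact_prevalence_limits(positive_pools, num_pools, pool_size,
                            confidence=0.95, sensitivity=1.0, specificity=1.0):
    """Transform the pool-level exact limits to the prevalence scale."""
    lower, upper = exact_pool_limits(positive_pools, num_pools, confidence)
    return (to_prevalence(lower, pool_size, sensitivity, specificity),
            to_prevalence(upper, pool_size, sensitivity, specificity))


# Hypothetical result: 4 positive pools out of 20 pools of size 5, perfect test.
limits = exact_prevalence_limits(positive_pools=4, num_pools=20, pool_size=5)
print(limits)
```

The bisection search keeps the example free of third-party dependencies; a statistics library with a beta quantile function would give the same limits more directly.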

Enter pool sizes and the associated numbers of pools tested starting from the top row of the table. You must enter at least one row of valid values, and any rows entered must be complete. All values must be positive integers. Any row in the input table that includes an invalid value will be ignored, as will any subsequent rows.
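
The row-handling rule can be expressed as a small validation routine. The sketch below only illustrates the behaviour described; the function names and the representation of rows as pairs of text values are assumptions, not the program's actual input handling.

```python
def is_positive_integer(value):
    """True if `value` reads as a whole number greater than zero."""
    text = str(value).strip()
    return text.isascii() and text.isdigit() and int(text) > 0


def read_strategies(rows, max_strategies=6):
    """Return (pool_size, num_pools) pairs from the top of the table.

    Reading stops at the first incomplete or invalid row, so that row and
    every row below it are ignored.
    """
    strategies = []
    for row in rows[:max_strategies]:
        if len(row) != 2 or not all(is_positive_integer(v) for v in row):
            break
        strategies.append((int(str(row[0]).strip()), int(str(row[1]).strip())))
    if not strategies:
        raise ValueError("at least one complete row of positive integers is required")
    return strategies


# The third row is incomplete, so it and the fourth row are ignored.
table = [("5", "20"), ("10", "10"), ("", "15"), ("2", "50")]
print(read_strategies(table))
```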