Calculating a Confidence Interval for Task Completion

Using the Adjusted Wald Method

by Tom Tullis
Originally posted March 28, 2008; last modified March 29, 2008

Sauro and Lewis (2005) and Lewis and Sauro (2006) demonstrated that the Adjusted Wald Method of calculating a confidence interval works well for many of the situations we encounter in usability testing. The basic idea behind the Adjusted Wald Method (Agresti & Coull, 1998) is to adjust the observed proportion of task successes to account for the small sample sizes commonly used in usability tests. The formula for the Adjusted Wald confidence interval is:

p_adj ± z * sqrt(p_adj(1 - p_adj)/n_adj)

where:
n = total number of trials
p = proportion of trials that were successes
z = the z-value corresponding to the desired confidence level
p_adj = (n*p + z²/2)/(n + z²)
n_adj = n + z²
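
The definitions above can be sketched as a small Python function. This is only an illustration, not code from the original spreadsheet or from Sauro's calculator: the function name adjusted_wald_interval and its parameters are hypothetical, and it takes the number of successes directly (that is, n*p) rather than the proportion p.

```python
import math

def adjusted_wald_interval(successes, trials, z=1.96):
    """Return (lower, upper) Adjusted Wald limits for a completion rate.

    Hypothetical helper for illustration; z defaults to 1.96 (95% confidence).
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    z_sq = z * z
    n_adj = trials + z_sq                      # n_adj = n + z^2
    p_adj = (successes + z_sq / 2) / n_adj     # p_adj = (n*p + z^2/2)/(n + z^2)
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin

lower, upper = adjusted_wald_interval(4, 5)
print(f"{lower:.2f} to {upper:.2f}")  # 0.36 to 0.98
```

Note that this sketch returns the limits exactly as the formula gives them. For some inputs (5 successes out of 5 trials, for instance) the upper limit comes out slightly above 1, so a caller may want to clamp the result to the range 0 to 1. That clamping step is an assumption of this note, not part of the formula as stated above.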

For example, assume that 4 out of 5 users successfully completed a given task, and that you want to use a 95% confidence level. Given those assumptions:

n = 5
p = 0.8
z = 1.96
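
If you want a confidence level other than 95%, the z-value can be looked up in a table or computed. As a minimal sketch, assuming Python 3.8 or later (for the standard library's statistics.NormalDist) and a two-sided interval, the hypothetical helper z_for_confidence below returns the z-value for a confidence level given as a fraction:

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z-value for a confidence level given as a fraction, e.g. 0.95.

    Hypothetical helper for illustration only.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # A two-sided interval leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

print(round(z_for_confidence(0.95), 2))  # 1.96
```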

p_adj = (5*0.8 + (1.96^2)/2)/(5 + 1.96^2)
= (4 + 1.9208)/(5 + 3.8416)
= 5.9208/8.8416
= 0.6696

n_adj = 5 + 1.96^2
= 5 + 3.8416
= 8.8416

And finally, the calculation of the confidence interval:

p_adj ± z * sqrt(p_adj(1 - p_adj)/n_adj)
0.6696 ± 1.96 * sqrt(0.6696(1 - 0.6696)/8.8416)
0.6696 ± 1.96 * sqrt(0.2212/8.8416)
0.6696 ± 1.96 * 0.1582
0.6696 ± 0.3100

Or:
Lower Limit = 0.3596
Upper Limit = 0.9796

That means that if you observed 4 successes out of 5 trials, the 95% confidence interval for the task completion rate is approximately 36% to 98%.

Here is a simple spreadsheet for doing these calculations. And here is a link to Jeff Sauro's online calculator using the Adjusted Wald Method.

[Page reference in book: p. 69.]

References

Agresti, A., & Coull, B. (1998). Approximate is better than 'exact' for interval estimation of binomial proportions. The American Statistician, 52, 119-126.

Lewis, J., & Sauro, J. (2006). When 100% really isn't 100%: Improving the accuracy of small-sample estimates of completion rates. Journal of Usability Studies, 1(3), 136-150.

Sauro, J., & Lewis, J. (2005). Estimating completion rates from small samples using binomial confidence intervals: Comparisons and recommendations. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Orlando, FL. http://www.measuringusability.com/papers/sauro-lewisHFES.pdf

Comments? Contact Tom@MeasuringUX.com.
