Calculating a Confidence Interval for Task Completion Using the Adjusted Wald Method
by Tom Tullis
Sauro and Lewis (2005) and Lewis and Sauro (2006) demonstrated that the
Adjusted Wald Method of calculating a confidence interval works well for
many of the situations we encounter in usability testing. The basic idea
behind the Adjusted Wald Method (Agresti & Coull, 1998) is that you
need to adjust the observed proportion of task successes to take into
account the small sample sizes commonly used in usability tests. The
formula for calculating the Adjusted Wald confidence interval is as
follows:
padj ± z * sqrt(padj(1- padj)/nadj)
where:
n = total number of trials
p = proportion of trials that were successes
z = the z-value corresponding to the desired confidence level
padj = (n*p + z^2/2)/(n + z^2)
nadj = n + z^2
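The formula and definitions above can be sketched as a small Python function. This is only an illustrative sketch, not code from the original article: the function name `adjusted_wald_interval`, its parameter names, and the input check are assumptions made for this example. It takes the number of successes rather than the proportion p, and computes p internally.

```python
import math


def adjusted_wald_interval(successes, n, z=1.96):
    """Return (lower, upper) limits of the Adjusted Wald confidence interval.

    successes: number of trials that were successes (hypothetical parameter name)
    n:         total number of trials
    z:         z-value for the desired confidence level (1.96 for 95%)
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")

    p = successes / n                      # observed proportion of successes
    n_adj = n + z**2                       # nadj = n + z^2
    p_adj = (n * p + z**2 / 2) / n_adj     # padj = (n*p + z^2/2)/(n + z^2)

    # Half-width of the interval: z * sqrt(padj(1 - padj)/nadj)
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin
```

Calling `adjusted_wald_interval(4, 5)` evaluates the formula for 4 successes in 5 trials at the default 95% confidence level. Note that the function returns the limits exactly as the formula gives them; it does not restrict them to the range 0 to 1.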
For example, assume that 4 out of 5 users successfully completed a given task, and that you want to use a 95% confidence level. Given those assumptions:
n = 5
p = 0.8
z = 1.96
padj = (5*0.8 + (1.96^2)/2)/(5 + 1.96^2)
= (4 + 1.9208)/(5 + 3.8416)
= 5.9208/8.8416
= 0.6696
nadj = 5 + 1.96^2
= 5 + 3.8416
= 8.8416
And finally, the calculation of the confidence interval:
padj ± z * sqrt(padj(1- padj)/nadj)
0.6696 ± 1.96 * sqrt(0.6696(1-0.6696)/8.8416)
0.6696 ± 1.96 * sqrt(0.2212/8.8416)
0.6696 ± 1.96 * 0.1582
0.6696 ± 0.3100
Or:
Lower Limit = 0.3596
Upper Limit = 0.9796
In other words, if you observe 4 successes in 5 trials, the 95% confidence interval for the true task completion rate is approximately 36% to 98%.
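The worked example can be checked step by step with a short Python script. This is a sketch under stated assumptions rather than part of the original article: the variable names are illustrative, and the z-value is derived from the confidence level with the standard library's `statistics.NormalDist` (Python 3.8+) instead of being typed in as 1.96, so the intermediate values may differ from the hand calculation in the last decimal places.

```python
import math
from statistics import NormalDist

confidence = 0.95          # desired confidence level
n, successes = 5, 4        # 4 out of 5 users completed the task

# Two-sided z-value for the confidence level (about 1.96 for 95%).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

p = successes / n                          # observed proportion, 0.8
n_adj = n + z**2                           # adjusted sample size
p_adj = (n * p + z**2 / 2) / n_adj         # adjusted proportion

# Half-width of the interval and the two limits.
margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
lower, upper = p_adj - margin, p_adj + margin

print(f"z     = {z:.4f}")
print(f"nadj  = {n_adj:.4f}")
print(f"padj  = {p_adj:.4f}")
print(f"lower = {lower:.4f}")
print(f"upper = {upper:.4f}")
```

Rounded to whole percentages, the limits agree with the 36% to 98% interval computed by hand above.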
A simple spreadsheet is available for doing these calculations, as is Jeff Sauro's online calculator using the Adjusted Wald Method.
[Page reference in book: p. 69.]
Agresti, A., & Coull, B. (1998). Approximate is better than 'exact' for interval estimation of binomial proportions. The American Statistician, 52, 119-126.
Lewis, J., & Sauro, J. (2006). When 100% really isn't 100%: Improving the accuracy of small-sample estimates of completion rates. Journal of Usability Studies, 1(3), 136-150.
Sauro, J., & Lewis, J. (2005). Estimating completion rates from small samples using binomial confidence intervals: Comparisons and recommendations. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Orlando, FL. http://www.measuringusability.com/papers/sauro-lewisHFES.pdf
Comments? Contact Tom@MeasuringUX.com.