Samuel Litwin, PhD
Associate Professor 

Minimalist Power Analysis for Continuous Comparisons
A common problem facing investigators comparing two treatments is sample size. For example, a new treatment is postulated to increase the activity of a certain enzyme in mice. How many treated and control mice are needed to see whether it works? Knowing the mean and variance of enzyme levels, a design based on the t distribution will answer the question. But what if there are no data on the variability of this enzyme level? Such is often the case when investigators enter new areas. It is still possible to determine a sample size that is guaranteed to meet the requirements of power and type I error. (Power is the chance of saying the treatment works when it does; type I error is the chance of saying the treatment works when it does not.) This can be done on the premise that enzyme level measurements are continuously distributed, or nearly so. We assume only that the enzyme levels of the several mice can be sorted from largest to smallest, i.e., that the chance of ties is small.
With this assumption, the set of enzyme levels may be divided into two groups, those above and those below an arbitrary cut point. The resulting, doubly dichotomized, mouse data may be organized into a two-by-two table:

             Above cut point   Below cut point   Total
    Treated         a              n1 - a          n1
    Control      m1 - a         n2 - m1 + a        n2
    Total          m1                m2             N

Here a denotes the number of treated mice above the cut point; n1 and n2 are the group sizes, m1 and m2 are the counts above and below the cut point, and N = n1 + n2 is the total.
The chance that a particular enzyme level will be above the cut point, divided by the chance that it will not, is the 'odds' of being above the cut point. The 'odds ratio,' ψ, is the odds that the enzyme level in a treated mouse will be above the cut point, divided by the odds for an untreated mouse. If the odds ratio is one (ψ = 1), the treatment has no effect. If ψ > 1, then treatment works. Thus, ψ is a measure of treatment effect. The investigator will usually require that the experiment have a small type I error, e.g., 5%, and that it be able to detect a reasonable treatment effect. If ψ = 3, the investigator might want the chance of saying the treatment works to be at least 90%.
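As a small numerical illustration (the probabilities here are invented for the example, not taken from the text): if a treated mouse lies above the cut point with probability 0.75, its odds are 0.75/0.25 = 3; if a control mouse does so with probability 0.50, its odds are 0.50/0.50 = 1; the odds ratio is then ψ = 3.

```python
def odds(p):
    """Odds of an event that occurs with probability p."""
    return p / (1 - p)

# Hypothetical probabilities of an enzyme level falling above the cut point:
p_treated, p_control = 0.75, 0.50
psi = odds(p_treated) / odds(p_control)  # (0.75/0.25) / (0.50/0.50) = 3.0
```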
The power and type I error of Fisher's exact test, applied to the two-by-two table, may be computed from knowledge of n1, n2, ψ, and m1. The first two are simply the numbers of mice in the two groups. The value of ψ deemed important to distinguish from ψ = 1 is decided by the investigator. m1 is the number of enzyme levels above the cut point. The assumption that the data contain no ties allows us to fix m1 as we please without having the data; this, in turn, allows computation of power and type I error. We compute the power for each possible value of m1 from 1 to N - 1 and select the cut point m1 that gives the maximum possible power. In a variation of this method, we set the required power and type I error in advance, then find the smallest odds ratio that can be distinguished from 1.0 on this basis.
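The computation described above can be sketched in a few lines. With m1 fixed, the count a of treated subjects above the cut point follows a central hypergeometric distribution when ψ = 1 and Fisher's noncentral hypergeometric distribution otherwise, so size and power of the one-sided test follow from the upper tail. This is an illustrative sketch, not the author's own code, and the function names are invented:

```python
from math import comb

def power_fisher(n1, n2, m1, psi, alpha=0.05):
    """Critical value, size, and power of the one-sided Fisher exact test
    rejecting for large a, at cut point m1 and odds ratio psi."""
    lo, hi = max(0, m1 - n2), min(n1, m1)

    def dist(r):
        # Fisher noncentral hypergeometric pmf with odds ratio r
        # (reduces to the central hypergeometric when r = 1).
        w = [comb(n1, k) * comb(n2, m1 - k) * r ** k for k in range(lo, hi + 1)]
        s = sum(w)
        return [x / s for x in w]

    null, alt = dist(1.0), dist(psi)

    # Smallest critical value c with P(a >= c | psi = 1) <= alpha.
    c, tail = hi + 1, 0.0
    for k in range(hi, lo - 1, -1):
        if tail + null[k - lo] > alpha:
            break
        tail += null[k - lo]
        c = k

    reject = range(c, hi + 1)
    size = sum(null[k - lo] for k in reject)
    power = sum(alt[k - lo] for k in reject)
    return c, size, power

def best_cutpoint(n1, n2, psi, alpha=0.05):
    """Scan every cut point m1 = 1 .. N-1 and keep the most powerful one."""
    m1 = max(range(1, n1 + n2), key=lambda m: power_fisher(n1, n2, m, psi, alpha)[2])
    return (m1,) + power_fisher(n1, n2, m1, psi, alpha)
```

Applied to the examples in the text, power_fisher(58, 58, 58, 2.9991) should reproduce the rejection rule a >= 34 with roughly 4.7% type I error and 90% power, and best_cutpoint(87, 44, 2.9928) should locate the optimal cut point near m1 = 80.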
As an example, a median cut point m1 is used in the table below. The table allows an odds ratio of 2.9991 to be distinguished from an odds ratio of 1.0 with 90% power and 4.71% type I error. If at least 34 treated mice fall within the top 58 enzyme levels, i.e., a ≥ 34, we can declare that the treatment worked.
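The variation mentioned earlier, fixing power and type I error in advance and solving for the smallest detectable odds ratio, can be sketched as a bisection on ψ: the critical value depends only on the null distribution, and power rises monotonically with ψ. This is again an illustrative sketch with invented helper names; with 58 mice per group and a median cut point, it should recover a value near the 2.9991 quoted above.

```python
from math import comb

def exact_power(n1, n2, m1, psi, alpha=0.05):
    """Power of the one-sided Fisher exact test at cut point m1, odds ratio psi."""
    lo, hi = max(0, m1 - n2), min(n1, m1)

    def dist(r):
        # (Non)central hypergeometric pmf with odds ratio r.
        w = [comb(n1, k) * comb(n2, m1 - k) * r ** k for k in range(lo, hi + 1)]
        s = sum(w)
        return [x / s for x in w]

    null, alt = dist(1.0), dist(psi)
    c, tail = hi + 1, 0.0
    for k in range(hi, lo - 1, -1):   # critical value from the null distribution
        if tail + null[k - lo] > alpha:
            break
        tail += null[k - lo]
        c = k
    return sum(alt[k - lo] for k in range(c, hi + 1))

def smallest_detectable_psi(n1, n2, m1, target=0.90, alpha=0.05):
    """Bisect for the smallest odds ratio whose power reaches the target."""
    lo_psi, hi_psi = 1.0, 50.0        # bracket: power at psi = 1 is only alpha
    for _ in range(60):
        mid = (lo_psi + hi_psi) / 2
        if exact_power(n1, n2, m1, mid, alpha) < target:
            lo_psi = mid
        else:
            hi_psi = mid
    return hi_psi
```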
In case the two groups are of unequal size, as might occur with patients who either recur or do not after treatment, the cut point selected for sensitivity to an odds ratio above 1.0 will usually not be the median. For example, if 87 individuals respond and 44 do not, the optimal cut point is m1 = 80. In this case Fisher's exact test can distinguish an odds ratio of ψ = 2.9928 from ψ = 1 with 90% power and 4.92% type I error. The null hypothesis that ψ = 1 is rejected if a ≥ 58.
This illustrates establishing an optimal cut point after group sizes are determined, e.g., from archives, but before the continuous data set is obtained.