Confounding Factors

An experiment tests a null hypothesis by examining the effect of a factor that is suspected of having an influence. The influence is measured at two or more levels of the factor: at a minimum, the amount of the factor that should produce the effect (the experimental condition) and the absence of the factor (the control, or null, condition). For instance, if you are testing the influence of water on plant growth, you apply the amount of water that should elicit a growth response to one group of plants and do not apply water to another group of plants.

Other Relevant Factors
There are many factors besides water that affect plant growth: sunlight, nutrients, herbivores, plant species, etc. Factors that affect the response but are not being tested are confounding factors. If the influence of such factors is not considered in the design of the experiment or the analysis of results, they could confound your results: the results might show no effect of water, show that water decreases plant growth, or show that water sometimes stimulates growth and sometimes inhibits it. You could make a Type I error (showing an effect of the tested factor that is not real) or a Type II error (showing no effect when the factor really has an influence), or spend your time and money without any useful result.
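
A small, made-up data set can illustrate how an untested factor distorts a comparison. In the Python sketch below, all values are hypothetical: growth is constructed to depend on sunlight only, yet most watered plants happen to sit in sunny spots, so a naive comparison of averages suggests that water helps.

```python
from statistics import mean

# Hypothetical plants: (watered?, sunlight, growth in cm).
# Growth is constructed to depend on sunlight only: 10 cm in sun, 4 cm in shade.
plants = [
    (True, "sun", 10), (True, "sun", 10), (True, "sun", 10), (True, "shade", 4),
    (False, "sun", 10), (False, "shade", 4), (False, "shade", 4), (False, "shade", 4),
]

watered = [growth for is_watered, _, growth in plants if is_watered]
unwatered = [growth for is_watered, _, growth in plants if not is_watered]

# Naive comparison that ignores sunlight entirely.
naive_difference = mean(watered) - mean(unwatered)
print(naive_difference)
```

The watered group averages 3 cm more growth, but that difference comes entirely from the uneven distribution of sunlight between the two groups. In this constructed example water has no effect at all, so concluding otherwise would be a Type I error.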

Adjusting for Confounding Factors
[Figure: confounding factors. Source: http://junkcharts.typepad.com/junk_charts/table/page/2/]

Experimentally
The variation in results that is caused by confounding factors can be eliminated by making the confounding factors the same for all test conditions. For example, the same soil is used for all plants, all plants are exposed to the same schedule of light, herbivores are excluded, and the same type of plant is tested. Note that, since you have not varied those factors, your results strictly apply only to the set of conditions under which the plants were grown.

Statistically
Knowing that there are other relevant factors, you record them as you test for the effect of water. For example, you water some plants in the wild and leave an equal number unwatered, making sure that the water you provide is the only water each watered plant receives during the experiment. In addition, you measure the amount of light each plant receives, you test the soil in which each is growing, you measure herbivory on each, and you record the species of each plant. Naturally, the number of plants tested would have to be quite large to provide sufficient numbers for each factor alone and in combination.
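
One simple way to use the recorded factors is to compare watered and unwatered plants only within groups that share the same value of the other factor, then average those within-group differences. The sketch below assumes hypothetical data with a single recorded factor (light level) and an illustrative helper name, `stratified_difference`. A real analysis with several recorded factors would more likely use a regression or analysis-of-variance model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (watered?, light level, growth in cm).
records = [
    (True, "high", 12), (True, "high", 14), (True, "low", 7),
    (False, "high", 10), (False, "low", 4), (False, "low", 6),
]

def stratified_difference(rows):
    """Average the watered-minus-unwatered difference within each light level."""
    strata = defaultdict(lambda: {True: [], False: []})
    for is_watered, light, growth in rows:
        strata[light][is_watered].append(growth)
    differences = []
    for groups in strata.values():
        # A stratum is usable only if it holds both watered and unwatered plants.
        if groups[True] and groups[False]:
            differences.append(mean(groups[True]) - mean(groups[False]))
    return mean(differences)

# Comparison that ignores light, for contrast with the adjusted estimate.
naive = mean(g for w, _, g in records if w) - mean(g for w, _, g in records if not w)
adjusted = stratified_difference(records)
print(naive, adjusted)
```

With these made-up numbers the naive difference is about 4.3 cm, while the light-adjusted difference is 2.5 cm. Every light level must contain both watered and unwatered plants to contribute, which is one reason the number of plants has to grow as more factors are recorded.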

Experimentally and Statistically
It is possible to expand your experiment to apply more than one level of the other relevant factors and measure the response of the plants under variations of all factors. Statistics are used to separate the individual effects of each factor and any effects of their combinations so that the main effect of water can be calculated. An experiment that includes the simultaneous effects of more than one factor is called a factorial experiment. Obviously, the number of plants tested would be greater than the number needed when using only one level of the other relevant factors, but you would gain insight into plant response when other factors change.
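
For the smallest factorial experiment, two factors at two levels each, the main effect and the combination (interaction) effect can be computed directly from the four group averages. The sketch below uses hypothetical cell means and illustrative function names. It defines the interaction as the change in the water effect between light levels, which is one common convention (some texts halve this value).

```python
from statistics import mean

# Hypothetical mean growth (cm) for each combination of water and light.
cell_means = {
    ("no water", "low light"): 4.0,
    ("no water", "high light"): 8.0,
    ("water", "low light"): 6.0,
    ("water", "high light"): 14.0,
}

def water_effect_at(cells, light):
    """Effect of water at one fixed light level."""
    return cells[("water", light)] - cells[("no water", light)]

def main_effect_of_water(cells):
    """Effect of water averaged over both light levels."""
    return mean(water_effect_at(cells, light) for light in ("low light", "high light"))

def interaction(cells):
    """How much the effect of water changes from low light to high light."""
    return water_effect_at(cells, "high light") - water_effect_at(cells, "low light")

print(main_effect_of_water(cell_means))
print(interaction(cell_means))
```

Here water adds 2 cm in low light and 6 cm in high light, so the main effect is 4 cm and the interaction is 4 cm. A nonzero interaction is the kind of insight a single-level experiment cannot provide: the benefit of water depends on how much light the plants receive.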

Number of Subjects
Statistical comparisons often compare the difference between average group responses with the variability within the groups. Even when the averages are not the same, a large amount of variability may prevent you from concluding that the difference in the averages was great enough to show that the treatments had the effect you were trying to demonstrate.
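
A t statistic is one common example of such a comparison: the difference in averages divided by a standard error built from the within-group variability. The sketch below computes Welch's form of the statistic for two hypothetical pairs of groups that share the same 2 cm difference in averages but differ in spread. The data and the function name `welch_t` are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Difference in averages divided by its estimated standard error."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical growth measurements (cm); both pairs differ by 2 cm on average.
low_spread_treated = [11, 12, 13]
low_spread_control = [9, 10, 11]
high_spread_treated = [6, 12, 18]
high_spread_control = [4, 10, 16]

print(welch_t(low_spread_treated, low_spread_control))
print(welch_t(high_spread_treated, high_spread_control))
```

The low-spread pair gives a statistic of roughly 2.4, while the high-spread pair gives roughly 0.4, even though the averages differ by the same amount in both cases.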

When all test subjects are carefully picked to conform to the same characteristics, the variability in their response to the treatment should be very low, and few subjects would be needed. On the other hand, you generally have much less control over many factors when picking volunteers to participate in an experiment, and responses may show variability from the effects of many factors you were unable to control. Even if you imposed some uniformity by using a questionnaire or by excluding certain categories altogether, you would need a relatively large number of subjects to reduce the variability in the statistics enough for a difference between treatments to be statistically significant.
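
The link between variability and the number of subjects can be made concrete with a standard normal-approximation formula for comparing two group averages: n per group is about 2 (z_alpha + z_power)^2 (standard deviation / difference)^2. The sketch below is a rough planning aid under assumed values for the significance level (0.05, two-sided) and power (0.80). The function name is illustrative, and for very small groups a t-based calculation would call for somewhat more subjects.

```python
from math import ceil
from statistics import NormalDist

def subjects_per_group(difference, std_dev, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-group comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * std_dev / difference) ** 2)

# Same hypothetical 2-unit treatment difference, with low and high variability.
print(subjects_per_group(difference=2.0, std_dev=1.0))
print(subjects_per_group(difference=2.0, std_dev=4.0))
```

Under these assumptions, quadrupling the standard deviation raises the requirement from about 4 subjects per group to about 63, because the required number grows with the square of the variability.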

Some Examples
1. The class web page on Hormone Replacement Therapy describes a conclusion about the therapy and heart disease that turned out to put women at higher risk than opting for no treatment.
2. For years, studies of heart disease conducted only on men excluded the possible confounding factor of gender. As a consequence, doctors had to guess at the application of test results to women.
3. The polarizing controversy over whether reducing the US national debt was more important than increasing government spending to stimulate jobs in a sluggish economy came close to a government default in late 2011. When conducting a survey among Americans on the relative importance of those two issues, a critical factor affecting the analysis is the political party of each person surveyed.

References: What is a confounding factor?, What are confounding factors and how do they affect studies?, Following one's nose 2