In many studies of the relationship between two variables, the goal is to establish that changes in the explanatory variable cause changes in the response variable. Even when a strong association is present, however, the conclusion that this association is due to a causal link between the variables is often hard to justify. What ties between two variables (and others lurking in the background) can explain an observed association? What constitutes good evidence for causation? We begin our consideration of these questions with a set of observed associations. In each case, there is a clear association between variable x and variable y. Moreover, the association is positive whenever the direction makes sense.
Here are some examples of observed association between x and y:
Figure 2.34 shows in outline form how a variety of underlying links between variables can explain association. The dashed double-arrow line represents an observed association between the variables x and y. Some associations are explained by a direct cause-and-effect link between these variables. Figure 2.34(a) shows “x causes y” by a solid arrow running from x to y.
Figure 2.34 Possible explanations for an observed association. The dashed double-arrow lines show an association. The solid arrows show a cause-and-effect link. The variable x is explanatory, y is a response variable, and z is a lurking variable.
Items 1 and 2 in Example 2.50 are examples of direct causation.
Even when direct causation is present, very often it is not a complete explanation of an association between two variables. The best evidence for causation comes from experiments that actually change x while holding all other factors fixed. If y changes, we have good reason to think that x caused the change in y.
“Beware of the lurking variable” is good advice when thinking about an association between two variables. Figure 2.34(b) illustrates common response. The observed association between the variables x and y is explained by a lurking variable z. Both x and y change in response to changes in z. This common response creates an association even though there may be no direct causal link between x and y.
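A small simulation can make common response concrete. The sketch below is illustrative only: the variables, the sample size, and the choice of Normal noise are assumptions made for the demonstration, not data from any example in this section. Both x and y are built from a lurking variable z plus independent noise, and x never enters the formula for y.

```python
import random

def pearson_r(xs, ys):
    """Sample correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000                # assumed sample size for the illustration

# Lurking variable z; x and y each respond to z but not to each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# Association between x and y created entirely by the common response to z.
r_xy = pearson_r(x, y)

# Subtract the part of each variable that comes from z and correlate what is left.
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)

print("r between x and y:", round(r_xy, 2))
print("r after removing z:", round(r_rest, 2))
```

In this setup the first correlation is clearly positive, while the second is close to zero. In real data we rarely know z well enough to subtract it, which is why the lurking variable is so easy to miss.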
The third and fourth items in Example 2.50 illustrate how common response can create an association. What would be a good candidate for the variable z in these two examples?
For the first item in Example 2.50, we expect that inheritance explains part of the association between the body mass indexes (BMIs) of daughters and their mothers. Can we use r or
Figure 2.34(c) illustrates confounding. Both the explanatory variable x and the lurking variable z may influence the response variable y. Because x is confounded with z, we cannot distinguish the influence of x from the influence of z. We cannot say how strong the direct effect of x on y is. In fact, it can be hard to say if x influences y at all.
When many uncontrolled variables are related to a response variable, you should always ask whether or not confounding of several variables prevents you from drawing conclusions about causation.
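To see why confounding blocks conclusions about causation, consider the following sketch. It is a made-up simulation, and the two "worlds," their coefficients, and the sample size are all assumptions chosen for illustration. In world A, x directly influences y and there is no lurking variable. In world B, x has no effect on y at all, and y responds only to a lurking variable z that is associated with x. The noise levels are chosen so that the observed (x, y) data follow the same pattern in both worlds.

```python
import random

def ls_slope(xs, ys):
    """Slope of the least-squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx

def world_a(n, rng):
    """x directly influences y; no lurking variable is involved."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 2 ** 0.5)
        y = 1.0 * x + rng.gauss(0, 3 ** 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys

def world_b(n, rng):
    """x has no effect on y; the lurking variable z drives y and is tied to x."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = z + rng.gauss(0, 1)        # x is associated with z
        y = 2.0 * z + rng.gauss(0, 1)  # y depends on z only
        xs.append(x)
        ys.append(y)
    return xs, ys

n = 5000  # assumed sample size for the illustration
xs_a, ys_a = world_a(n, random.Random(10))
xs_b, ys_b = world_b(n, random.Random(20))

slope_a = ls_slope(xs_a, ys_a)
slope_b = ls_slope(xs_b, ys_b)

print("slope of y on x, world A:", round(slope_a, 2))
print("slope of y on x, world B:", round(slope_b, 2))
```

Both slopes come out close to 1, so someone who sees only x and y cannot tell which world produced the data. Real situations usually fall between these extremes, with x and z both contributing, and the observed association alone does not reveal the split.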
The last two associations in Example 2.50 (items 5 and 6) are explained in part by confounding. What would be a good candidate for the confounding variable z in these two examples?
Many observed associations are at least partly explained by lurking variables. Both common response and confounding involve the influence of a lurking variable (or variables) z on the response variable y. The distinction between these two types of relationships is less important than the common element: the influence of lurking variables. The most important lesson of these examples is one we have already emphasized: even a very strong association between two variables is not by itself good evidence that there is a cause-and-effect link between the variables.
How can a direct causal link between x and y be established? The best method—indeed, the only fully compelling method—of establishing causation is to conduct a carefully designed experiment in which the effects of possible lurking variables are controlled. Chapter 3 explains how to design convincing experiments.
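The following sketch suggests why an experiment is so much more convincing than observation. It is a simulation under assumed conditions (the effect sizes, the sample size, and the coin-flip assignment rule are all hypothetical). The response y depends only on a lurking variable z, so the treatment x truly has no effect. In the observational setting, subjects with high z tend to end up with x = 1. In the experimental setting, x is assigned by chance, so it cannot be associated with z except by the luck of the draw.

```python
import random

def mean(values):
    return sum(values) / len(values)

def difference_in_means(x, y):
    """Mean response for x = 1 minus mean response for x = 0."""
    treated = [yi for xi, yi in zip(x, y) if xi == 1]
    control = [yi for xi, yi in zip(x, y) if xi == 0]
    return mean(treated) - mean(control)

rng = random.Random(3)  # fixed seed so the run is repeatable
n = 4000                # assumed number of subjects

# Lurking variable z drives the response; x does not appear in y at all,
# so the same y values apply whichever way x is determined.
z = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * zi + rng.gauss(0, 1) for zi in z]

# Observational setting: subjects with high z tend to have x = 1.
x_observed = [1 if zi + rng.gauss(0, 1) > 0 else 0 for zi in z]

# Experimental setting: a coin flip decides x, independently of z.
x_assigned = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

obs_diff = difference_in_means(x_observed, y)
exp_diff = difference_in_means(x_assigned, y)

print("observed difference in mean y:", round(obs_diff, 2))
print("experimental difference in mean y:", round(exp_diff, 2))
```

The observational comparison shows a large difference even though x does nothing, while the chance-assigned comparison shows a difference near zero. Assigning x by chance is one way to control the effects of lurking variables, including ones that nobody thought to measure.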
Many of the sharpest disputes in which statistics plays a role involve questions of causation that cannot be settled by experiment. Does gun control reduce violent crime? Does living near power lines cause cancer? Has outsourcing work to overseas locations reduced overall employment in the United States? All these questions have become public issues. All concern associations among variables. And all have this in common: they try to pinpoint cause and effect in a setting involving complex relations among many interacting variables. Common response and confounding, along with the number of potential lurking variables, make observed associations misleading. Experiments are not possible for ethical or practical reasons. We can’t assign some people to live near power lines or compare the same nation with and without strong gun controls.
Electric currents generate magnetic fields. So living with electricity exposes people to magnetic fields. Living near power lines increases exposure to these fields. Really strong fields can disturb living cells in laboratory studies. Some people claim that the weaker fields we experience if we live near power lines cause leukemia in children.
It isn’t ethical to do experiments that expose children to magnetic fields. It’s hard to compare cancer rates among children who happen to live in more and less exposed locations because leukemia is rare, and locations vary in many ways other than magnetic fields. We must rely on studies that compare children who have leukemia with children who don’t.
A careful study of the effect of magnetic fields on children took five years and cost $5 million. The researchers compared 638 children who had leukemia and 620 who did not. They went into the homes and actually measured the magnetic fields in the children’s bedrooms, in other rooms, and at the front door. They recorded facts about nearby power lines for the family home and also for the mother’s residence when she was pregnant. Result: no evidence of more than a chance connection between magnetic fields and childhood leukemia.31
“No evidence” that magnetic fields are connected with childhood leukemia doesn’t prove that there is no risk. It says only that a careful study could not find any risk that stands out from the play of chance that distributes leukemia cases across the landscape. Critics continue to argue that the study failed to measure some lurking variables or that the children studied don’t fairly represent all children. Nonetheless, a carefully designed study comparing children with and without leukemia is a great advance over haphazard and sometimes emotional counting of cancer cases.
Despite the difficulties, it is sometimes possible to build a strong case for causation in the absence of experiments. The evidence that smoking causes lung cancer is about as strong as nonexperimental evidence can be.
Doctors had long observed that most lung cancer patients were smokers. Comparison of smokers and similar nonsmokers showed a very strong association between smoking and death from lung cancer. Could the association be due to common response? Might there be, for example, a genetic factor that predisposes people both to nicotine addiction and to lung cancer? Smoking and lung cancer would then be positively associated even if smoking had no direct effect on the lungs. Or perhaps confounding is to blame. It might be that smokers live unhealthy lives in other ways (diet, alcohol, lack of exercise) and that some other habit confounded with smoking is a cause of lung cancer. How were these objections overcome?
Let’s answer this question in general terms: What are the criteria for establishing causation when we cannot do an experiment?
The association is strong. The association between smoking and lung cancer is very strong.
The association is consistent. Many studies of different kinds of people in many countries link smoking to lung cancer. That reduces the chance that a lurking variable specific to one group or one study explains the association.
Higher doses are associated with stronger responses. People who smoke more cigarettes per day or who smoke over a longer period get lung cancer more often. People who stop smoking reduce their risk.
The alleged cause precedes the effect in time. Lung cancer develops after years of smoking.
The alleged cause is plausible. Experiments show that tars from cigarette smoke cause cancer when applied to the backs of mice.
Medical authorities do not hesitate to say that smoking causes lung cancer. The U.S. Surgeon General states that cigarette smoking is “the largest avoidable cause of death and disability in the United States.”32 The evidence for causation is strong—but it is not as strong as the evidence provided by well-designed experiments.
Some observed associations between two variables are due to a cause-and-effect relationship between these variables, but others are explained by lurking variables.
The effect of lurking variables can operate through common response if changes in both the explanatory and the response variables are caused by changes in lurking variables. Confounding of two variables (either explanatory or lurking variables or both) means that we cannot distinguish their effects on the response variable.
Establishing that an association is due to causation is best accomplished by conducting an experiment that changes the explanatory variable while controlling other influences on the response.
In the absence of experimental evidence, be cautious in accepting claims of causation. Good evidence of causation requires that (1) the association is strong, (2) the association appears consistently in many studies, (3) higher levels of the explanatory variable are associated with stronger responses, (4) the alleged cause precedes the effect in time, and (5) the alleged cause is plausible.
2.109 Examples of association. Give three examples of association: one due to causation, one due to common response, and one due to confounding. Use your examples to write a short paragraph explaining the differences among these three explanations for an observed association.
2.110 The five criteria for establishing causation. Consider the five criteria for establishing causation when an experiment is not possible. Explain how each of these, if not established, weakens the case that an association is due to causation.
2.111 Iron and anemia. A lack of adequate iron in the diet is associated with anemia, a condition in which the body does not have enough red blood cells. However, anemia is also associated with malaria and infections from worms called helminths. Discuss these observed associations using the framework of Figure 2.34.
2.112 Stress and lack of sleep in college students. Studies of college students have shown that stress and lack of sleep are associated. Do you think that lack of sleep causes stress or that stress causes lack of sleep? Write a short paragraph summarizing your opinions.
2.113 Online courses. Many colleges offer online versions of courses that are also taught in the classroom. It often happens that the students who enroll in the online version do better than the classroom students on the course exams. This does not show that online instruction is more effective than classroom teaching, because the people who sign up for online courses are often quite different from the classroom students. Suggest some student characteristics that you think could be confounded with online versus classroom. Use a diagram like Figure 2.34(c) to illustrate your ideas.
2.114 Marriage and income. Data show that men who are married, and also divorced or widowed men, earn quite a bit more than men who have never been married. This does not mean that a man can raise his income by getting married. Suggest several lurking variables that you think are confounded with marital status and that help explain the association between marital status and income. Use a diagram like Figure 2.34(c) to illustrate your ideas.
2.115 Exercise and self-confidence. A college fitness center offers an exercise program for staff members who choose to participate. The program assesses each participant’s fitness, using a treadmill test, and also administers a personality questionnaire. There is a moderately strong positive correlation between fitness score and score for self-confidence. Is this good evidence that improving fitness increases self-confidence? Explain why or why not.
2.116 Computer chip manufacturing and miscarriages. A study showed that women who work in the production of computer chips have abnormally high numbers of miscarriages. The union claimed that exposure to chemicals used in production caused the miscarriages. Another possible explanation is that these workers spend most of their work time standing up. Illustrate these relationships in a diagram like one of those in Figure 2.34.
2.117 Hospital size and length of stay. A study shows that there is a positive correlation between the size of a hospital (measured by its number of beds x) and the median number of days y that patients remain in the hospital. Does this mean that you can shorten a hospital stay by choosing a small hospital? Use a diagram like one of those in Figure 2.34 to explain the association.
2.118 Watching TV and low grades. Children who watch many hours of television get lower grades in school, on the average, than those who watch less TV. Explain clearly why this fact does not show that watching TV causes poor grades. In particular, suggest some other variables that may be confounded with heavy TV viewing and may contribute to poor grades.
2.119 Artificial sweeteners. People who use artificial sweeteners in place of sugar tend to be heavier than people who use sugar. Does this mean that artificial sweeteners cause weight gain? Give a more plausible explanation for this association.
2.120 Exercise and mortality. A sign in a fitness center says, “Mortality is halved for men over 65 who walk at least 2 miles a day.”
(a) Mortality is eventually 100% for everyone. What do you think “mortality is halved” means?
(b) Assuming that the claim is true, explain why this fact does not show that exercise causes lower mortality.
2.121 Effect of a math skills refresher initiative. Students enrolling in an elementary statistics course take a pretest that assesses their math skills. Those who receive low scores are given the opportunity to take three one-hour refresher sessions designed to review the basic math skills needed for the statistics course. Those who took the refresher sessions performed worse than those who did not on the final exam in the statistics course. Can you conclude that the refresher course has a negative impact on performance in the statistics course? Explain your answer.