9.1 Inference for Two-Way Tables in Chapter 9 Inference for Categorical Data

9.1 Inference for Two-Way Tables

When you complete this section, you will be able to:

Find the joint distribution, the marginal distributions, and the conditional distributions for a two-way table of counts.
Choose appropriate conditional distributions to describe relationships in a two-way table.
Compute the expected cell counts under the null hypothesis.
Perform a chi-square test. Give the test statistic and the degrees of freedom, and interpret the P-value.
For a 2×2 table, explain the relationship between the chi-square test and the z test for comparing two proportions.

When we studied inference for two proportions in Section 8.2, we summarized the raw data by giving the number of observations from each population (n) and how many of them were classified as “successes” (X).

Example 9.1 Lost wallets.

In Example 8.11 (page 469), we compared the proportions of returned lost wallets with no money and with money. The following table summarizes the data used in that comparison:

Wallet condition	n	X	p̂=X/n
Money	300	174	0.58
No money	300	111	0.37

These data suggest that wallets with money are more likely to be returned (58%) than are wallets with no money (37%). In Example 8.12 (page 470), we reported the difference between the proportions D=0.58−0.37=0.21, with a margin of error of 0.0781.
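The arithmetic behind these two numbers can be checked with a few lines of Python. This is only an illustrative sketch: it assumes the large-sample 95% margin of error for a difference in proportions (z* = 1.96), and the variable names are our own.

```python
import math

# Counts from the summary table in Example 9.1
n_money, x_money = 300, 174
n_none, x_none = 300, 111

p_money = x_money / n_money   # sample proportion returned, wallets with money
p_none = x_none / n_none      # sample proportion returned, wallets with no money
diff = p_money - p_none       # the difference D

# Assumption: 95% confidence, so z* = 1.96, with the large-sample standard error
z_star = 1.96
se = math.sqrt(p_money * (1 - p_money) / n_money + p_none * (1 - p_none) / n_none)
margin = z_star * se

print(f"D = {diff:.2f}, margin of error = {margin:.4f}")
```

Under these assumptions the printed values agree with the 0.21 and 0.0781 reported above.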

In this chapter, we consider a different summary of the data. Rather than record just the count of returned wallets (X), we record counts of all the outcomes in a two-way table.

Example 9.2 Two-way table for lost wallets.

Here is the two-way table classifying the 600 wallets:

Two-way table for lost wallets
	Wallet condition
Returned	Money	No money	Total
No	126	189	315
Yes	174	111	285
Total	300	300	600

We use the term r×c table to describe a two-way table of counts with r rows and c columns. The two categorical variables in this 2×2 table are Returned and Condition. Returned is the row variable, with values “No” and “Yes,” and Condition is the column variable, with values “Money” and “No money.” Because the objective in this example is to examine the effect of money in the wallet, we view Condition as an explanatory variable and Returned as the response variable. Just as in Chapter 2, where we used the x axis for the explanatory variable (page 79), here we use Condition as the column variable.

It is these r×c table summaries that are used for analysis in this chapter. If we wanted to compare more than two populations, we’d have c greater than 2. If the categorical response variable had more than two categories, we’d have r greater than 2.
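As a small illustration, the two-way table of Example 9.2 can be rebuilt from the (n, X) summary of Example 9.1. The dictionary layout and the names below are our own choices and are not part of any software output.

```python
# Summary from Example 9.1: (sample size n, number returned X) for each condition
summary = {"Money": (300, 174), "No money": (300, 111)}

# Rows are the response (Returned); columns are the explanatory variable (Condition)
table = {
    "No": {cond: n - x for cond, (n, x) in summary.items()},
    "Yes": {cond: x for cond, (n, x) in summary.items()},
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {cond: n for cond, (n, x) in summary.items()}
grand_total = sum(row_totals.values())

# Dimensions of the r x c table
r, c = len(table), len(summary)
```

Adding a third wallet condition to the summary would give c = 3 without any other change to the sketch.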

Here is another example of a 2×2 table in which the data are collected differently. Instead of comparing populations, we’re looking at the association between two categorical variables obtained from an SRS of one population.

Example 9.3 Vaccinations and political party preference.

Should parents be able to decide whether or not to vaccinate their children, or should all vaccinations be required for all children? A Pew Internet survey asked this question of U.S. adults aged 18 and over.¹ It also asked adults about their political preference. The following table breaks down the responses:

Observed numbers of adults
	Party
Required	Democratic	Republican	Total
No	230	258	488
Yes	729	479	1208
Total	959	737	1696

The two categorical variables are Required, with values “No” and “Yes,” and Party, with values “Democratic” and “Republican.” We view Party as an explanatory variable and Required as a categorical response variable.

In Chapter 2, we discussed two-way tables and using the joint, marginal, and conditional distributions to study the relationship between the two categorical variables. We now view these sample distributions as estimates of the corresponding population distributions. Let’s look at some software output that gives these distributions.

Example 9.4 Software output for vaccinations and political party.

Figure 9.1 shows the output from JMP, Minitab, and SPSS for the vaccination data of Example 9.3. For now, we will just concentrate on the different distributions. Later, we will explore other parts of the output.

The three packages use similar displays for the distributions. In the cells of the 2×2 table, we find the counts, the conditional distributions of the row variable for each value of the column variable, the conditional distributions of the column variable for each value of the row variable, and the joint distribution. All of these are expressed as percents rather than proportions.

Let’s look at the entries in the upper-left cell of the JMP output. We see that there are 230 Democrats (the count entry) who think vaccinations should not be required. The entry below the count (Total %) tells us that these 230 represent 13.56% of the study participants. The four Total % entries in the table give the joint distribution. The 230 Democrats who think that vaccinations should not be required represent 23.98% (Col %) of the Democrats in the study. This entry, together with the Col % entry for the Democratic column in the Yes row (76.02%), gives the conditional distribution of Required for Democrats. The conditional distribution of Party, given the opinion that vaccinations should not be required, is given by the Row % entries in the top two cells (47.13% and 52.87%). The marginal distributions are in the rightmost column and the bottom row. Minitab and SPSS give the same information but not necessarily in the same order.
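The percents in the software output can be reproduced directly from the four counts. The following is a minimal sketch using plain Python lists; the helper name pct and the variable names are hypothetical.

```python
# Observed counts from Example 9.3: rows = Required (No, Yes), columns = Party
counts = [[230, 258],   # No:  Democratic, Republican
          [729, 479]]   # Yes: Democratic, Republican

n = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

def pct(part, whole):
    """Percent rounded to two decimals, as in the JMP output."""
    return round(100 * part / whole, 2)

# Joint distribution (Total %)
joint = [[pct(x, n) for x in row] for row in counts]
# Conditional distributions of Required given Party (Col %)
col_pct = [[pct(x, col_totals[j]) for j, x in enumerate(row)] for row in counts]
# Conditional distributions of Party given Required (Row %)
row_pct = [[pct(x, row_totals[i]) for x in row] for i, row in enumerate(counts)]
# Marginal distributions
marginal_required = [pct(t, n) for t in row_totals]
marginal_party = [pct(t, n) for t in col_totals]
```

Each conditional distribution divides a count by its own column or row total, whereas the joint distribution divides every count by the table total n.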

Figure 9.1 JMP, Minitab, and SPSS outputs, Examples 9.3 and 9.4.

JMP contingency table (cell entries: count, total %, column %, row %, expected count)
Required	Democratic	Republican	Total
No	230, 13.56, 23.98, 47.13, 275.939	258, 15.21, 35.01, 52.87, 212.061	488, 28.77
Yes	729, 42.98, 76.02, 60.35, 683.061	479, 28.24, 64.99, 39.65, 524.939	1208, 71.23
Total	959, 56.54	737, 43.46

Minitab contingency table (cell entries: count, % of row, % of column, % of total, expected count)
Required	Democratic	Republican	All
No	230, 47.13, 23.98, 13.56, 275.9	258, 52.87, 35.01, 15.21, 212.1	488, 100.00, 28.77, 28.77
Yes	729, 60.35, 76.02, 42.98, 683.1	479, 39.65, 64.99, 28.24, 524.9	1208, 100.00, 71.23, 71.23
All	959, 56.54, 100.00, 56.54	737, 43.46, 100.0, 43.46	1696, 100.00, 100.00, 100.00

SPSS Required * Party crosstabulation (cell entries: count, expected count, % within Required, % within Party, % of total)
Required	Democratic	Republican	Total
No	230, 275.9, 47.1%, 24.0%, 13.6%	258, 212.1, 52.9%, 35.0%, 15.2%	488, 488.0, 100.0%, 28.8%, 28.8%
Yes	729, 683.1, 60.3%, 76.0%, 43.0%	479, 524.9, 39.7%, 65.0%, 28.2%	1208, 1208.0, 100.0%, 71.2%, 71.2%
Total	959, 959.0, 56.5%, 100.0%, 56.5%	737, 737.0, 43.5%, 100.0%, 43.5%	1696, 1696.0, 100.0%, 100.0%, 100.0%

In Chapter 2, we learned that the key to examining the relationship between two categorical variables is to look at the conditional distributions. Let’s do that for the vaccination data.

Example 9.5 Conditional distributions given political party.

To compare the frequency of vaccination opinions across political party preference, we examine the column percents. Here they are, rounded from the output in Figure 9.1 for clarity:

Column percents for political party
	Party
Required	Democratic	Republican
No	24%	35%
Yes	76%	65%
Total	100%	100%

The “Total” row reminds us that 100% of the Democrats and 100% of the Republicans have been classified as thinking either that vaccinations should be required or that they should not. (The sums sometimes differ slightly from 100% because of roundoff error.) The bar graphs in Figure 9.2 compare the percents. The difference of 11 percentage points between the percents of adults who think vaccinations should not be required is reasonably large (24% for Democrats versus 35% for Republicans).

Figure 9.2 Bar graph of the percents of adults who believe vaccinations should not be required (no) and who believe that vaccinations should be required (yes), by political party preference, Example 9.5.

A statistical test will tell us whether or not this difference can be plausibly attributed to chance. Specifically, if there is no association between party preference and opinions about requiring vaccinations, how likely is it that a sample would show a difference as large as or larger than that displayed in Figure 9.2? In the next part of this section, we discuss the significance test to examine this question.

Note that Figure 9.2 shows the percents favoring required vaccinations (yes) as well as percents opposed (no). In a description of the results, we would choose one of these for our main story. For tables with more than two columns, we would normally plot the percents for all columns. The next example presents another way to display the data from a two-way table.

Example 9.6 Mosaic plot for vaccination opinions and political party preference.

Figure 9.3 displays the joint distribution and the two marginal distributions in a single plot, called a mosaic plot. It also shows the conditional distributions by party. The sizes of the four rectangles are proportional to the four probabilities of the joint distribution. The bar at the right side gives the marginal distribution of the Required variable, while the widths of the vertical bars give the marginal distribution of the variable Party. Within each of the two vertical bars, the blue and yellow sections make up the conditional distribution by party.
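To make the geometry of a mosaic plot concrete, here is a minimal sketch that computes the four rectangles on a unit square rather than drawing them. The layout (one vertical bar per party, with “No” at the bottom) and all names are assumptions made for illustration; plotting software may arrange the pieces differently.

```python
# Observed counts from Example 9.3, organized by column (Party)
counts = {
    "Democratic": {"No": 230, "Yes": 729},
    "Republican": {"No": 258, "Yes": 479},
}
n = sum(sum(col.values()) for col in counts.values())

rectangles = []   # one entry per cell, in unit-square coordinates
left = 0.0
for party, col in counts.items():
    col_total = sum(col.values())
    width = col_total / n                 # marginal proportion for Party
    bottom = 0.0
    for answer, count in col.items():
        height = count / col_total        # conditional proportion given Party
        rectangles.append({
            "party": party, "required": answer,
            "left": left, "bottom": bottom,
            "width": width, "height": height,
            "area": width * height,       # equals the joint proportion count / n
        })
        bottom += height
    left += width
```

Because width times height equals count/n, the area of each rectangle is the corresponding joint proportion, which is the property described above.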

Figure 9.3 Mosaic plot for the vaccinations and political party data, Example 9.6.

Check-in

9.1 Find two conditional distributions for the lost wallet data. Figure 9.4 shows JMP output for the lost wallets data of Example 9.2 (page 487). Use this output to answer the following questions.
1. Find the conditional distribution of Returned for wallets with money.
2. Do the same for wallets with no money.
3. Graphically display the two conditional distributions.
4. Write a short summary interpreting the two conditional distributions.
9.2 Condition on wallet. Refer to the previous Check-in question. Use the output in Figure 9.4 to answer the following questions.
1. Find the conditional distribution of Condition for returned wallets.
2. Do the same for wallets that were not returned.
3. Graphically display the two conditional distributions.
4. Write a short summary interpreting the two conditional distributions.
9.3 Which conditional distributions should you use? Refer to your answers to the two previous Check-in questions. Which of these distributions do you prefer for interpreting these data? Give reasons for your answer.

Figure 9.4 JMP output for lost wallets, Check-in questions 9.1, 9.2, and 9.3.

JMP contingency table (cell entries: count, total %, column %, row %)
Returned	Money	No money	Total
Yes	174, 29.00, 58.00, 61.05	111, 18.50, 37.00, 38.95	285, 47.50
No	126, 21.00, 42.00, 40.00	189, 31.50, 63.00, 60.00	315, 52.50
Total	300, 50.00	300, 50.00

The hypothesis: No association

The null hypothesis H0 of interest in a two-way table is “There is no association between the row variable and the column variable.” In Example 9.3, this null hypothesis says that there is no association between political party preference and belief that vaccinations should be required. The alternative hypothesis Ha is that there is an association between these two variables. The alternative Ha does not specify any particular direction for the association. For two-way tables in general, the alternative includes many different possibilities. Because it includes all sorts of possible associations, we cannot describe Ha as either one-sided or two-sided.

In our example, the hypothesis H0 that there is no association between political party preference and opinions about requiring vaccinations is equivalent to the statement that the variables Required and Party are independent.

For other two-way tables, such as the lost wallet study of Example 9.2, we have samples from two populations, one for each condition. For each population there is a distribution of the categorical response variable Returned. For two-way tables like this, in which the columns correspond to independent samples from c distinct populations, there are c distributions for the row variable, one for each population. The null hypothesis then says that the c distributions of the row variable are identical. The alternative hypothesis is that the distributions are not all the same.

Expected cell counts

To test the null hypothesis in r×c tables, we compare the observed cell counts with expected cell counts calculated under the assumption that the null hypothesis is true. A numerical summary of the comparison will be our test statistic.

Example 9.7 Expected counts.

The observed and expected counts for the vaccination example appear in the JMP, Minitab, and SPSS computer outputs shown in Figure 9.1 (page 489). The expected counts are given as the last entry in each cell for JMP and Minitab and as the second entry in each cell for SPSS. For example, in the cell for Democrats who do not think that vaccinations should be required, the observed count is 230, and the expected count is 275.939 (JMP) or 275.9 (Minitab and SPSS).

How is this expected count obtained? Look at the percents in the right margin of the JMP table in Figure 9.1. We see that 28.77% of all adults thought that vaccinations should not be required. If the null hypothesis of no relation between Party and Required is true, we expect this overall percent to apply to both Democrats and Republicans. In particular, we expect 28.77% of the Democrats to be opposed to making vaccinations required. Because there are 959 Democrats, the expected count is 28.77% of 959, or 275.9 (275.939 if the unrounded proportion 488/1696 is used). The other expected counts are calculated in the same way.

The reasoning of Example 9.7 leads to a simple formula for calculating expected cell counts. To compute the expected count of Democrats opposed to requiring vaccinations, we multiplied the proportion of adults opposed to requiring vaccinations (488/1696) by the number of Democrats (959). From Figure 9.1, we see that the numbers 488 and 959 are the row and column totals for the cell of interest and that 1696 is n, the total number of observations for the table. The expected cell count is, therefore, the product of the row and column totals divided by the table total:

expected cell count = (row total × column total)/n
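A minimal sketch of this rule in Python follows. The function name expected_counts is hypothetical; the observed counts are those of Example 9.3.

```python
def expected_counts(observed):
    """Expected cell counts under H0: (row total * column total) / n for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows = Required (No, Yes); columns = Party (Democratic, Republican)
observed = [[230, 258],
            [729, 479]]
expected = expected_counts(observed)

for row in expected:
    print([round(e, 3) for e in row])
```

For the vaccination counts, the first entry is approximately 275.939, in agreement with the JMP output in Figure 9.1. The same function works for any r×c table of counts.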

The reasoning is similar for the wallet data in Example 9.2. Under the null hypothesis, the two populations defined by Condition have the same distribution. The common distribution is estimated by the marginal counts for Returned, 315 and 285. Expressed as percents, the distribution is 52.5% (315/600) for “No” and 47.5% (285/600) for “Yes.” The sample for the “Money” population has 300 wallets, so the expected count is 52.5% of 300, or 157.5, for the cell defined by “Money” and “No.” Expected counts for the other three cells are calculated in the same way.

In Figure 9.3 (page 491), we used a mosaic plot to display the vaccination and political party preference data. Looking at the two columns, we can see that the proportion in the lower region, corresponding to being opposed to required vaccinations, is smaller for the Democrats than for the Republicans. This illustrates graphically the difference in the conditional distributions for the two parties.

What would the mosaic plot look like if the null hypothesis were true? In this case, the two conditional distributions would be the same. Ideally, the observed counts would be equal to the expected counts. If we rerun the analysis with the expected counts in place of the observed counts, we obtain the mosaic plot in Figure 9.5. Notice that the proportions of each party responding no are now the same and equal to 28.77%, the marginal percent of adults who do not think vaccinations should be required.

The chi-square test

To test H0 that there is no association between the row and column classifications, we use a statistic that compares the entire set of observed counts with the set of expected counts. To compute this statistic,

First, take the difference between each observed count and its corresponding expected count. Verify that the sum of the r×c differences is zero.
Then, square these values so that they are all either 0 or positive.
Because a large difference means less if it comes from a cell that is expected to have a large count, divide each squared difference by the expected count. This is a type of standardization.
Finally, sum over all cells.

The result is called the chi-square statistic X²:

X² = Σ (observed count − expected count)² / expected count

where the sum is over all r×c cells in the table. The chi-square statistic was proposed by the English statistician Karl Pearson (1857–1936) in 1900. It is the oldest inference procedure still used in its original form.
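The four steps listed above can be sketched as a short Python function. The name chi_square_statistic is our own; the function also returns the sum of the observed-minus-expected differences so that the check in the first step can be carried out.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, sum of observed-minus-expected differences)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    diff_sum = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = obs - expected              # step 1: observed minus expected
            diff_sum += diff
            statistic += diff ** 2 / expected  # steps 2 to 4: square, divide, sum
    return statistic, diff_sum

# Vaccination counts from Example 9.3
x2, diff_sum = chi_square_statistic([[230, 258], [729, 479]])
print(round(x2, 3), round(diff_sum, 9))
```

For the vaccination data the statistic is about 24.709, and the differences sum to zero apart from floating-point roundoff.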

If the expected counts and the observed counts are very different, a large value of X² will result. Large values of X² provide evidence against the null hypothesis. To obtain a P-value for the test, we need the sampling distribution of X² under the assumption that H0 (no association between the row and column variables) is true. The distribution is called the chi-square distribution, which we denote by χ² (χ is the lowercase Greek letter chi).

Like the t distributions, the χ² distributions form a family described by a single parameter, the degrees of freedom. We use χ²(df) to indicate a particular member of this family. Figure 9.6 displays the density curves of the χ²(2) and χ²(4) distributions. As you can see in the figure, χ² distributions take only positive values and are skewed to the right. Table F in the back of the book gives upper critical values for the χ² distributions.

Figure 9.6 The χ²(df) density curves for (a) df=2 and (b) df=4.

Now that we have our test statistic and its sampling distribution under H0, we can describe the test.

Chi-square test for two-way tables

The null hypothesis H0 is that there is no association between the row and column variables in a two-way table. The alternative hypothesis is that these variables are related.

If H0 is true, the chi-square statistic X² has approximately a χ² distribution with (r−1)(c−1) degrees of freedom.

The P-value for the chi-square test is

P(χ²≥X²)

where χ² is a random variable having the χ²(df) distribution with df=(r−1)(c−1).

For tables larger than 2×2, we will use this approximation whenever the average of the expected counts is 5 or more and the smallest expected count is 1 or more. For 2×2 tables, we require all four expected cell counts to be 5 or more.

The chi-square test always uses the upper tail of the χ² distribution because any deviation from the null hypothesis makes the statistic larger. The approximation of the distribution of X² by χ² becomes more accurate as the cell counts increase. Moreover, it is more accurate for tables larger than 2×2.
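For readers who want to see how the upper-tail probability and the guidelines on expected counts might be computed without Table F, here is a minimal sketch using only the Python standard library. The function names are hypothetical. The tail probability uses the standard finite-series expressions for integer degrees of freedom, which is a choice made for this illustration and not necessarily the method used by JMP, Minitab, or SPSS.

```python
import math

def chi_square_upper_tail(x, df):
    """P(chi-square(df) >= x) for integer df >= 1, using finite series."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df = 2k+1: normal two-sided tail plus k correction terms
    tail = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for r in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * r + 1)
    return tail

def approximation_ok(expected):
    """Guidelines from the text for using the chi-square approximation."""
    cells = [e for row in expected for e in row]
    if len(expected) == 2 and len(expected[0]) == 2:
        return min(cells) >= 5                        # 2x2: all four counts 5 or more
    return sum(cells) / len(cells) >= 5 and min(cells) >= 1

def degrees_of_freedom(r, c):
    return (r - 1) * (c - 1)

print(round(chi_square_upper_tail(13.28, degrees_of_freedom(3, 3)), 4))
```

The value 13.28 is the Table F critical value for df = 4 and upper-tail probability 0.01, so the printed result should be close to 0.01.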

Example 9.8 Chi-square significance test.

The results of the chi-square significance test for the vaccination example appear in the computer outputs in Figure 9.7, labeled Pearson (JMP) or Pearson Chi-square (Minitab and SPSS). Because all the expected cell counts are large (5 or more), the χ² distribution provides an accurate P-value. We see that X²=24.71, df=1, and P<0.0001. Note that Minitab and SPSS report the P-value as 0.000 or .000. These are rounded numbers and potentially misleading. The P-value is small, but it is not zero. For this reason, we prefer to report P<0.0001.

As a check, we verify that the degrees of freedom are correct for a 2×2 table:

df=(r−1)(c−1)=(2−1)(2−1)=1

The chi-square test confirms that the data provide evidence against the null hypothesis that there is no relationship between political party preference and vaccination opinion. Under H0, the chance of obtaining a value of X² greater than or equal to the calculated value of 24.71 is small, less than 0.0001, or fewer than 1 time in 10,000.
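The size of this P-value can be checked with a short calculation. The sketch below relies on the fact that a χ²(1) random variable is the square of a standard Normal variable, so the upper tail at x equals the two-sided Normal tail at √x; the variable names are our own, and the statistic is copied from the Pearson line of the output.

```python
import math

x2 = 24.709                 # Pearson chi-square statistic from the output
r, c = 2, 2
df = (r - 1) * (c - 1)      # degrees of freedom for a 2x2 table

# For df = 1: P(chi-square >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(x2 / 2))

print(df, p_value < 0.0001)
```

The computed probability is positive but far below 0.0001, which is why reporting P<0.0001 is preferable to reporting 0.000.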

The JMP output (Tests) shows three tables. First table: N, 1696; DF, 1; negative log-likelihood, 12.129567; R-square (U), 0.0106. Second table: likelihood ratio test, chi-square 24.579, probability greater than chi-square less than 0.001*; Pearson test, chi-square 24.709, probability greater than chi-square less than 0.001*. Third table: Fisher’s exact test, left, probability less than 0.001*, with the alternative hypothesis that the probability of Party = Republican is greater for Required = no than yes; Fisher’s exact test, right, probability 1.0000, with the alternative hypothesis that the probability of Party = Republican is greater for Required = yes than no; Fisher’s exact test, 2-tail, probability less than 0.0001*, with the alternative hypothesis that the probability of Party = Republican is different across Required.

The Minitab output lists the following. Pearson: chi-square = 24.709, DF = 1, P-value = 0.000. Likelihood ratio: chi-square = 24.579, DF = 1, P-value = 0.000. Fisher’s exact test: P-value = 0.0000008.

The SPSS output shows a table of chi-square tests. Pearson chi-square: value 24.709 (note a), df 1, asymptotic significance (2-sided) 0.000, exact significance (2-sided) 0.000, exact significance (1-sided) 0.000. Continuity correction (note b): value 24.174, df 1, asymptotic significance 0.000, exact significance entries blank. Likelihood ratio: value 24.579, df 1, asymptotic significance 0.000, exact significance (2-sided) 0.000, exact significance (1-sided) 0.000. Fisher’s exact test: value, df, and asymptotic significance blank; exact significance (2-sided) 0.000; exact significance (1-sided) 0.000. N of valid cases: 1696. Notes: (a) 0 cells (0.0%) have expected count less than 5; the minimum expected count is 212.06. (b) Computed only for a 2×2 table.

The outputs in Figure 9.7 also report results for testing the hypothesis of no association using alternatives to the chi-square significance test. Fisher’s exact test is preferred by many, particularly when the counts are small and the chi-square approximation is not very accurate. Its results are provided in each of the software outputs.

The significance test result does not provide insight into the nature of the relationship between the variables. It is up to us to see that the data show Republicans are more likely to believe that vaccinations should not be required. You should always accompany a chi-square test by percents such as those in Example 9.5 and Figure 9.2 and by a description of the nature of the relationship.

Observational studies such as the one in Example 9.3 cannot tell us whether or not an explanatory variable is a cause of a pattern in a response variable. For the vaccine and party preference setting, a causal association does not seem plausible. Often, association can be explained by confounding with other variables. Similarly, the results for the wallet data in Example 9.2 tell us that wallets with money are treated differently, but they do not explain the underlying reason for this behavior.

Computations

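The calculations required to analyze a two-way table are straightforward but tedious. In practice, we recommend using software, but it is possible to do the work with a calculator or a spreadsheet, and some insight can be gained by examining the details. Here is an outline of the steps required.

Calculate descriptive statistics that convey the important information in the table, usually column or row percents.
Find the expected counts and use them to compute the X² statistic.
Find the approximate P-value from the χ² distribution with (r−1)(c−1) degrees of freedom, using software or Table F.
Draw a conclusion about the association between the row and column variables.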

The next few examples illustrate these steps.

Example 9.9 Health habits of college students.

Physical activity generally declines when students leave high school and enroll in college. This suggests that college is an ideal setting to promote physical activity. One study examined the level of physical activity and other health-related behaviors in a sample of 1184 college students.² Let’s look at the data for physical activity and consumption of fruits. The study categorized physical activity as low, moderate, or vigorous and fruit consumption as low, medium, or high. Here is the two-way table that summarizes the data:

	Physical activity
Fruit consumption	Low	Moderate	Vigorous	Total
Low	69	206	294	569
Medium	25	126	170	321
High	14	111	169	294
Total	108	443	633	1184

This table includes the marginal totals obtained by summing across rows and columns. For example, the first-row total is 69+206+294=569. The grand total, the number of students in the study, can be computed by summing the row totals (569+321+294=1184) or the column totals (108+443+633=1184). Caution: It is easy to make an error in these calculations, so if you are doing them by hand, it is a good idea to compute the grand total both ways as a check on your arithmetic.

Computing conditional distributions

First, we summarize the observed relation between physical activity and fruit consumption. We expect a positive association, but there is no clear distinction between an explanatory variable and a response variable in this setting. If we have such a distinction, then the clearest way to describe the relationship is to compare the conditional distributions of the response variable for each value of the explanatory variable. Otherwise, we can compute the conditional distribution each way and then decide which gives a better description of the data.

Example 9.10 Health habits of college students: Conditional distributions.

Let’s look at the data in the first column of the table in Example 9.9. There were 108 students with low physical activity. Of these, there were 69 with low fruit consumption. Therefore, the column proportion for this cell is

69/108 = 0.639

That is, 63.9% of the low physical activity students had low fruit consumption. Similarly, 25 of the low physical activity students had medium fruit consumption. This percent is 23.1%:

25/108 = 0.231

In all, we calculate nine percents. Here are the results:

Column percents for fruit consumption and physical activity
	Physical activity
Fruit consumption	Low	Moderate	Vigorous	Total
Low	63.9	46.5	46.4	48.1
Medium	23.1	28.4	26.9	27.1
High	13.0	25.1	26.7	24.8
Total	100.0	100.0	100.0	100.0

In addition to the conditional distributions of fruit consumption for each level of physical activity, the table also gives the marginal distribution of fruit consumption. These percents appear in the rightmost column, labeled “Total.”

The sum of the percents in each column should be 100, except for possible small roundoff errors. Caution: It is good practice to calculate each percent separately and then sum each column as a check. In this way, we can find arithmetic errors that would not be uncovered if, for example, we calculated the column percent for the “High” row by subtracting the sum of the percents for “Low” and “Medium” from 100.
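The nine column percents, the marginal distribution, and the column-sum check can be sketched as follows. This is an illustrative calculation with our own variable names, using the counts from the table in Example 9.9.

```python
# Rows = fruit consumption (Low, Medium, High); columns = activity (Low, Moderate, Vigorous)
counts = [[69, 206, 294],
          [25, 126, 170],
          [14, 111, 169]]

col_totals = [sum(col) for col in zip(*counts)]
row_totals = [sum(row) for row in counts]
n = sum(row_totals)

# Each percent is computed separately from its own count and column total
col_pct = [[round(100 * x / col_totals[j], 1) for j, x in enumerate(row)]
           for row in counts]

# Marginal distribution of fruit consumption (the "Total" column)
marginal_pct = [round(100 * t / n, 1) for t in row_totals]

# Check: each column of rounded percents should sum to about 100
col_sums = [round(sum(col), 1) for col in zip(*col_pct)]
print(col_pct, marginal_pct, col_sums)
```

Because every percent is computed from its own count, a column sum far from 100 would signal an arithmetic or data-entry error.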

Figure 9.8 compares the distributions of fruit consumption for each of the three physical activity levels. For each activity level, the highest percent is for students who consume low amounts of fruit. For low physical activity, there is a clear decrease in the percent when moving from low to medium to high fruit consumption. The patterns for moderate physical activity and vigorous physical activity are similar. Low fruit consumption is still dominant, but the percents for medium and high fruit consumption are about the same for the moderate and vigorous activity levels. The percent of low fruit consumption is highest for the low physical activity students compared with those who have moderate or vigorous physical activity. These plots suggest that there is an association between these two variables.

Figure 9.8 Comparison of the distribution of fruit consumption for different levels of physical activity, Example 9.10.

Check-in

9.4 Examine the row percents. Refer to the health habits data that we examined in Example 9.9 (page 496). For the row percents, make a table similar to the one in Example 9.10 (page 497).
9.5 Make some plots. Refer to the previous Check-in question. Make plots of the row percents similar to those in Figure 9.8.
9.6 Compare the conditional distributions. Compare the plots you made in the previous Check-in question with those given in Figure 9.8. Which set of plots do you think gives a better graphical summary of the relationship between these two categorical variables? Give reasons for your answer. Note that there is not a clear right or wrong answer for this exercise. You need to make a choice and to explain your reasons for making it.

We observe a clear relationship between physical activity and fruit consumption in this study. The chi-square test assesses whether this observed association is statistically significant—that is, too strong to occur often just by chance. The test confirms only that there is some relationship. The percents we have compared describe the nature of the relationship.

Caution: The chi-square test does not in itself tell us what population our conclusion describes. The subjects in this study were college students from four midwestern universities. The researchers could argue that these findings apply to college students in general. This type of inference is important, but it is based on expert judgment and is beyond the scope of the statistical inference that we have been studying.

Example 9.11 The chi-square significance test for health habits of college students.

The first step in performing the significance test is to calculate the expected cell counts. Let’s start with the cell for students with low fruit consumption and low physical activity. Using the formula on page 493, we need three quantities:

The corresponding row total, 569, which is the number of students who have low fruit consumption;
The corresponding column total, 108, which is the number of students who have low physical activity; and
The total number of students, 1184.

The expected cell count is, therefore,

$$\frac{(108)(569)}{1184} = 51.90$$

Note that although any observed count of the number of students must be a whole number, an expected count need not be.

Calculations for the other eight cells in the 3×3 table are performed in the same way. With these nine expected counts, we are now ready to use the formula for the X2 statistic on page 493. The first term in the sum comes from the cell for students with low fruit consumption and low physical activity. The observed count is 69, and the expected count is 51.90. Therefore, the contribution to the X2 statistic for this cell is

$$\frac{(69 - 51.90)^2}{51.90} = 5.63$$
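
The two calculations for this cell can be checked with a few lines of Python. This is a minimal sketch: the function names `expected_count` and `contribution` are illustrative and do not come from the text, and the inputs are the totals and the observed count quoted in this example.

```python
def expected_count(row_total, column_total, n):
    """Expected cell count under the null hypothesis of no association."""
    return row_total * column_total / n


def contribution(observed, expected):
    """One cell's term in the chi-square sum: (observed - expected)^2 / expected."""
    return (observed - expected) ** 2 / expected


# Cell for low fruit consumption and low physical activity (Example 9.11).
expected = expected_count(row_total=569, column_total=108, n=1184)
term = contribution(observed=69, expected=expected)

print(f"expected count = {expected:.2f}")
print(f"contribution   = {term:.2f}")
```

Applying the same two functions to the other eight cells and summing the contributions gives the X2 statistic.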

When we add the terms for each of the nine cells, the result is

$$X^2 = 14.15$$

Because there are r=3 levels of fruit consumption and c=3 levels of physical activity, the degrees of freedom for this statistic are

df=(r−1)(c−1)=(3−1)(3−1)=4

Under the null hypothesis that fruit consumption and physical activity are independent, the test statistic X2 has a χ2(4) distribution. To obtain the P-value, look at the df=4 row in Table F.

df=4
p	0.01	0.005
χ2	13.28	14.86

The calculated value X2=14.15 lies between the critical points for probabilities 0.01 and 0.005. The P-value is, therefore, between 0.01 and 0.005. (In Excel, =1-CHISQ.DIST(14.15,4,TRUE) gives the value as 0.0068.) There is strong evidence (X2=14.15,df=4,P<0.01) that there is a relationship between fruit consumption and physical activity.
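
The tail area can also be computed directly, without Table F or Excel. When the degrees of freedom are even, the chi-square upper-tail probability has a finite closed form, P(χ2 ≥ x) = e^(−x/2) Σ (x/2)^k / k!, with k running from 0 to df/2 − 1. The sketch below uses that identity with only the Python standard library. The helper name `chi2_upper_tail_even_df` is our own, and it deliberately handles only even degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square(df) >= x), valid only when df is a positive even integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


r, c = 3, 3                      # levels of fruit consumption and physical activity
df = (r - 1) * (c - 1)
p_value = chi2_upper_tail_even_df(14.15, df)

print(f"df = {df}, P-value = {p_value:.4f}")
```

The result should fall between the two Table F probabilities that bracket X2=14.15.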

We can check our work by adding the expected counts to obtain the row and column totals, as in the table of Example 9.10 (page 497). These totals should be the same as those in the table of observed counts except for small roundoff errors.

Check-in

9.7 Find the expected counts. Refer to Example 9.11. Compute the expected counts and display them in a 3×3 table. Check your work by adding the expected counts to obtain row and column totals. These should be the same as the row and column totals in the table of observed counts except for small roundoff errors.
9.8 Find the X2 statistic. Refer to the previous Check-in question. Use the formula on page 494 to compute the contributions to the chi-square statistic for each cell in the table. Verify that their sum is 14.15.
9.9 Find the P-value. For each of the following, give the degrees of freedom and an appropriate bound on the P-value for the X2 statistic.
1. X2=13.50 for a 4×4 table.
2. X2=13.50 for a 3×3 table.
3. X2=6.20 for a 2×3 table.
4. X2=6.20 for a 3×2 table.
9.10 Lost wallets: The chi-square test. Refer to Example 9.2 (page 487). Use the chi-square test to assess the relationship between money in a lost wallet and the chance that it is returned. State your conclusion.

The chi-square test and the z test

A comparison of the proportions of “successes” in two populations leads to a 2×2 table. We can compare two population proportions either by using the chi-square test or by using the two-sample z test from Section 8.2. In fact, these tests always give exactly the same result because the X2 statistic is equal to the square of the z statistic, and χ2(1) critical values are equal to the squares of the corresponding N(0, 1) critical values. The advantage of the z test is that we can test either one-sided or two-sided alternatives. The chi-square test always tests the two-sided alternative. Of course, the chi-square test can compare more than two populations, whereas the z test compares only two.
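
The equality of X2 and the square of z can be checked numerically. The following sketch uses the lost-wallet counts from Example 9.2 and only the Python standard library. It computes the two-sample z statistic with the pooled proportion, as in the significance test of Section 8.2, and the chi-square statistic from the 2×2 table. The variable names are ours, and the P-values rely on the standard normal tail area available through `math.erfc`.

```python
import math

# Lost-wallet counts from Example 9.2: (returned, total) for each condition.
x_money, n_money = 174, 300
x_none, n_none = 111, 300

# Two-sample z statistic with the pooled proportion.
p1, p2 = x_money / n_money, x_none / n_none
pooled = (x_money + x_none) / (n_money + n_none)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_money + 1 / n_none))
z = (p1 - p2) / se

# Chi-square statistic from the 2x2 table of observed counts.
observed = [[n_money - x_money, n_none - x_none],   # not returned
            [x_money, x_none]]                       # returned
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
x2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

# The two-sided z P-value and the chi-square(1) P-value are the same tail area.
p_z = math.erfc(abs(z) / math.sqrt(2))
p_x2 = math.erfc(math.sqrt(x2 / 2))

print(f"z = {z:.3f}, z^2 = {z * z:.3f}, X2 = {x2:.3f}")
print(f"P (two-sided z) = {p_z:.2e}, P (chi-square) = {p_x2:.2e}")
```

Apart from floating-point roundoff, the squared z statistic and X2 agree, and so do the two P-values.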

Example 9.12 Chi-square and z for political preference and vaccines.

In Example 9.8 we performed the significance test to examine the relationship between political preference and whether vaccinations should be required. We calculated the test statistic X2=24.71. Let’s apply the significance test for comparing two proportions (page 474) to test the null hypothesis that the proportions of Democrats and Republicans who believe that vaccinations should not be required are the same. For this comparison, the test statistic is z=4.97. Squaring this gives z2=24.70, which agrees with X2 apart from a very small rounding error.

Check-in

9.11 Comparison of conditional distributions. Consider the following 2×2 table.

Observed counts
	Explanatory variable (X)
Response variable (Y)	1	2	Total
Yes	72	96	168
No	138	114	252
Total	210	210	420

1. Compute the conditional distribution of the response variable for each of the two explanatory-variable categories.
2. Display the distributions graphically.
3. Write a short paragraph describing the two distributions and how they differ.

9.12 Expected cell counts and the chi-square test. Refer to the previous Check-in question. You consider using the chi-square test to compare these two conditional distributions.
1. Find the expected counts for all cells. Are they large enough to justify use of the chi-square test for these data?
2. Computer software gives you X2=5.71. What are the degrees of freedom for this statistic?
3. Using Table F, give an appropriate bound on the P-value.
9.13 Compare the chi-square test with the z test. Refer to the previous two Check-in questions and the significance test for comparing two proportions (page 474).
1. Set up the problem as a comparison between two proportions. Describe the population proportions, state the null and alternative hypotheses, and give the sample proportions.
2. Carry out the significance test to compare the two proportions. Report the z statistic, the P-value, and your conclusion.
3. Compare the P-value for this significance test with the one that you reported in Check-in question 9.12.
4. Verify that the square of the z statistic is the X2 statistic given in Check-in question 9.12.

Beyond the Basics

Meta-analysis

Policymakers wanting to make decisions based on research are sometimes faced with the problem of summarizing the results of many studies. These studies may show effects of different magnitudes, some highly statistically significant and some not. What overall conclusion can we draw? Meta-analysis is a collection of statistical techniques designed to combine information from different but similar studies. Each individual study must be examined with care to ensure that its design and data quality are adequate. The basic idea is to compute a measure of the effect of interest for each study. These measures are then combined, usually by taking some sort of weighted average, to produce a summary measure for all of the studies. Of course, a confidence interval for the summary is included in the results. Here is an example.

Example 9.13 Meta-analysis for eating too much salt.

Evidence from a variety of sources suggests that diets high in salt are associated with risks to human health. To investigate the relationship between salt intake and stroke, information from 14 studies was combined in a meta-analysis.³ Subjects were classified based on the amount of salt in their normal diet. They were followed for several years and then classified according to whether or not they had developed cardiovascular disease (CVD). A total of 104,933 subjects were studied, and 5161 of them developed CVD. Here are the data from one of the studies:⁴

	Low salt	High salt
CVD	88	112
No CVD	1081	1134
Total	1169	1246

Let’s look at the relative risk for this study. We first find the proportion of subjects who developed CVD in each group. For the subjects with a low salt intake, the proportion who developed CVD is

$$\frac{88}{1169} = 0.0753$$

or 75 per thousand; for the high-salt group, the proportion is

$$\frac{112}{1246} = 0.0899$$

or 90 per thousand. We can now compute the relative risk as the ratio of these two proportions. We choose to put the high-salt group in the numerator. The relative risk is

$$\frac{0.0899}{0.0753} = 1.19$$

Relative risk greater than 1 means that the high-salt group developed more CVD than the low-salt group. For this study, the association is not statistically significant. The 95% confidence interval for the relative risk is (0.91, 1.56).
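
The relative risk calculation for this study is easy to reproduce. The text does not say how the interval (0.91, 1.56) was obtained, so the interval in the sketch below is an assumption: it uses a common large-sample method that builds the interval on the log scale and then exponentiates. The variable names are ours.

```python
import math

# Counts from the single study shown above.
cvd_low, n_low = 88, 1169
cvd_high, n_high = 112, 1246

risk_low = cvd_low / n_low
risk_high = cvd_high / n_high
relative_risk = risk_high / risk_low    # high-salt group in the numerator

# Large-sample 95% interval computed on the log scale (an assumed method).
se_log_rr = math.sqrt(1 / cvd_high - 1 / n_high + 1 / cvd_low - 1 / n_low)
z_star = 1.96
lower = math.exp(math.log(relative_risk) - z_star * se_log_rr)
upper = math.exp(math.log(relative_risk) + z_star * se_log_rr)

print(f"risk (low salt)  = {risk_low:.4f}")
print(f"risk (high salt) = {risk_high:.4f}")
print(f"relative risk    = {relative_risk:.2f}")
print(f"95% CI           = ({lower:.2f}, {upper:.2f})")
```

Under this assumed method the interval contains 1, consistent with the statement that the association in this single study is not statistically significant.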

When the data from all 14 studies were combined, the relative risk was reported as 1.17, with a 95% confidence interval of (1.02, 1.32). Because this interval does not include the value 1, which corresponds to equal proportions in the two groups, we conclude that the CVD rates are not the same for the two diets (P<0.05). The high-salt diet is associated with a 17% higher rate of CVD than the low-salt diet. Note that the relative risk for the individual study in this example was not statistically significant, even though it was higher than the overall estimate (1.19 versus 1.17). This illustrates the value of meta-analysis, where the conclusion is based on combining results from several studies.

Check-in

9.14 A different view of the relative risk. In the previous example, we computed the relative risk for the high-salt group relative to the low-salt group. Now, compute the relative risk for the low-salt group relative to the high-salt group by inverting the relative risk reported in the meta-analysis in Example 9.13; that is, compute 1/1.17. Then restate the last paragraph of the example with this change. (Hint: For the lower confidence limit, use 1 divided by the upper limit for the original ratio and do a similar calculation for the upper limit.)

Section 9.1 SUMMARY

The null hypothesis for r×c tables of count data is that there is no relationship between the row variable and the column variable.
Expected cell counts under the null hypothesis are computed using the formula

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{n}$$
The null hypothesis is tested by the chi-square statistic, which compares the observed counts with the expected counts:

$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
Under the null hypothesis, X2 has approximately the chi-square distribution with (r−1)(c−1) degrees of freedom. The P-value for the test is

$$P(\chi^2 \ge X^2)$$
where χ2 is a random variable having the χ2 (df) distribution with df=(r−1)(c−1).
The chi-square approximation is adequate for practical use when the average expected cell count is 5 or greater and all individual expected counts are 1 or greater, except in the case of 2×2 tables. All four expected counts in a 2×2 table should be 5 or greater.
For two-way tables we first compute percents or proportions that describe the relationship of interest. Then, we compute expected counts, the X2 statistic, and the P-value.
Two different models for generating r×c tables lead to the chi-square test. In the first model, independent SRSs are drawn from each of c populations, and each observation is classified according to a categorical variable with r possible values. The null hypothesis is that the distributions of the row categorical variable are the same for all c populations. In the second model, a single SRS is drawn from a population, and observations are classified according to two categorical variables having r and c possible values. In this model, H0 states that the row and column variables are independent.
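
The steps in this summary can be collected into one short program. The sketch below uses only the Python standard library, and the function names `chi2_upper_tail`, `chi_square_test`, and `approximation_ok` are our own. The tail probability uses finite closed forms that hold for integer degrees of freedom, and statistical software would normally supply this value. The demonstration uses the single-study salt table from Example 9.13.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square(df) >= x) for a positive integer df, via finite closed forms."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        half = x / 2
        return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))
    # Odd df: normal tail area plus a finite correction series.
    term, series = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        series += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * series


def chi_square_test(observed):
    """Return (X2, df, P-value, expected counts) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, chi2_upper_tail(x2, df), expected


def approximation_ok(expected):
    """Apply the guideline for when the chi-square approximation is adequate."""
    cells = [e for row in expected for e in row]
    if len(expected) == 2 and len(expected[0]) == 2:
        return all(e >= 5 for e in cells)
    return sum(cells) / len(cells) >= 5 and all(e >= 1 for e in cells)


# Salt study from Example 9.13: rows are CVD / No CVD, columns are low / high salt.
salt = [[88, 112], [1081, 1134]]
x2, df, p_value, expected = chi_square_test(salt)
print(f"X2 = {x2:.2f}, df = {df}, P = {p_value:.3f}")
print("chi-square approximation adequate:", approximation_ok(expected))
```

The same `chi_square_test` function accepts any r×c table given as a list of rows, so it can be reused to check hand calculations in the exercises.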

Now that you have completed this section, you will be able to:

Find the joint distribution, the marginal distributions, and the conditional distributions for a two-way table of counts. Review Example 9.4 (page 488) and try Exercise 9.1.
Choose appropriate conditional distributions to describe relationships in a two-way table. Review Example 9.5 (page 490) and try Exercise 9.3.
Compute the expected cell counts under the null hypothesis. Review Example 9.7 (page 492) and try Exercise 9.5.
Perform a chi-square test. Give the test statistic and the degrees of freedom and interpret the P-value. Review Example 9.8 (page 495) and try Exercise 9.7.
For a 2×2 table, explain the relationship between the chi-square test and the z test for comparing two proportions. Review Example 9.12 (page 500) and try Exercise 9.9.
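
The first two skills in this list amount to dividing counts by the appropriate totals. As a minimal sketch, the Python code below finds the joint, marginal, and conditional distributions for the lost-wallet table of Example 9.2. The variable names are ours. Because Condition is the explanatory variable, the conditional distributions are computed within each column.

```python
# Lost-wallet counts from Example 9.2.
# Rows: Returned (No, Yes); columns: Condition (Money, No money).
rows = ["No", "Yes"]
cols = ["Money", "No money"]
counts = [[126, 189],
          [174, 111]]

n = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

# Joint distribution: each cell count divided by the overall total.
joint = [[cell / n for cell in row] for row in counts]

# Marginal distributions: row and column totals divided by the overall total.
marginal_returned = [t / n for t in row_totals]
marginal_condition = [t / n for t in col_totals]

# Conditional distribution of Returned within each Condition (column proportions).
conditional = [[counts[i][j] / col_totals[j] for j in range(len(cols))]
               for i in range(len(rows))]

for j, condition in enumerate(cols):
    summary = ", ".join(f"{rows[i]}: {conditional[i][j]:.0%}" for i in range(len(rows)))
    print(f"{condition}: {summary}")
```

The printed column percents for "Yes" reproduce the 58% and 37% return rates reported in Example 9.1.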

Section 9.1 EXERCISES

9.1 Eight is enough. A healthy body needs good food, and healthy teeth are needed to chew our food so that it can nourish our bodies. The U.S. Army has recognized this fact and requires recruits to pass a dental examination. If you wanted to be a soldier in the Spanish-American War, which took place in 1898, you needed to have at least eight teeth. Here is the statement of the requirement:

Unless an applicant has at least four sound double teeth, one above and one below on each side of the mouth, and so opposed as to serve the purpose of mastication, he should be rejected.

A study reported the rejection data for enlistment candidates classified by age. Here are the data:⁵

	Age
Rejected	<20	20–25	25–30	30–35	35–40	>40
Yes	68	647	1,114	1,783	2,887	3,801
No	58,884	77,992	55,597	43,994	47,569	39,985

1. Which variable is the explanatory variable? Which variable is the response variable? Give reasons for your answer.
2. Find the joint distribution. Write a brief summary explaining the major features of this distribution.
3. Find the two marginal distributions. Write a brief summary explaining the major features of these distributions.
4. Find conditional distributions and give a brief summary explaining the major features of these distributions.

9.2 Physical education requirements. In Exercise 8.41 (page 482), you analyzed data from a study that included 354 higher education institutions: 225 private and 129 public. Among the private institutions, 60 required a physical education course, while among the public institutions, 101 required a course. Your analysis in that exercise focused on the comparison of two proportions. Use these data to construct a two-way table for analysis and find the joint distribution, the marginal distributions, and the conditional distributions. Use these distributions to give a brief summary of the relationship between the type of institution and whether a physical education course is required.
9.3 Conditional distribution for eight is enough. Refer to Exercise 9.1. Which conditional distribution would you choose to explain the relationship between the two variables? Write a summary that includes your interpretation of the relationship based on this conditional distribution.
9.4 Conditional distribution for physical education requirements. Refer to Exercise 9.2. Which conditional distribution do you prefer to explain the results of your analysis? Give a reason for your answer.
9.5 Expected counts for eight is enough. Refer to Exercise 9.1. Find the expected counts.
9.6 Expected counts for physical education requirements. Refer to Exercise 9.2. Find the expected counts.
9.7 Significance test for eight is enough. Refer to Exercise 9.1. Find the chi-square statistic, the degrees of freedom, and the P-value. What do you conclude?
9.8 Significance test for physical education requirements. Refer to Exercise 9.2. Find the chi-square statistic, the degrees of freedom, and the P-value. What do you conclude?
9.9 Two views of the significance test for physical education requirements. Refer to Exercise 9.2. Show that the chi-square statistic that you found in Exercise 9.8 is the square of the z statistic that you found in Exercise 8.41 (page 482).
9.10 Survival and class on the Titanic. On April 15, 1912, on her maiden voyage, the Titanic collided with an iceberg and sank. The ship was luxurious but did not have enough lifeboats for the 2224 passengers and crew. As a result of the collision, 1502 people died.⁶ The ship had three classes of passengers. The level of luxury and the price of the ticket varied with the class, first class being the most luxurious. There were 323 passengers in first class, 277 in second class, and 709 in third class. The number of first-class passengers who survived was 200. For second- and third-class, the numbers were 119 and 181, respectively. Let’s look at these data with a two-way table.
1. Create a two-way table that you could use to explore the relationship between survival and class.
2. Which variable is the explanatory variable, and which is the response variable? Give reasons for your answers.
3. Find the two marginal distributions. Write a brief summary explaining the major features of these distributions.
4. Find the conditional distributions. Which conditional distribution would you choose to explain the relationship between these two variables? Explain your answer.
5. Find the expected counts, the X2 statistic, and the P-value.
6. Write a summary of your analyses that includes your interpretation of the results.
9.11 Sexual harassment in middle and high schools. A nationally representative survey of students in grades 7 to 12 asked about the experience of these students with respect to sexual harassment.⁷ One question asked how many times the student had witnessed sexual harassment in school. The two-way table for this exercise is given in Figure 9.9. Use the figure to find the joint distribution, the two marginal distributions, and the conditional distributions. Which conditional distribution do you prefer to explain the results of your analysis? Give a reason for your answer.

Figure 9.9 JMP output, Exercises 9.11 and 9.13.

JMP output: Contingency Analysis of Gender By Times. The Mosaic Plot menu is collapsed; the Contingency Table and Tests menus are expanded.

Contingency table (rows are Times, columns are Sex). Each cell lists, in order: count, total %, column %, row %.
Times	Boys	Girls	Total (count, total %)
More	732, 37.23, 76.01, 52.17	671, 34.13, 66.90, 47.83	1403, 71.36
Never	106, 5.39, 11.01, 43.09	140, 7.12, 13.96, 56.91	246, 12.51
Once	125, 6.36, 12.98, 39.43	192, 9.77, 19.14, 60.57	317, 16.12
Total (count, total %)	963, 48.98	1003, 51.02	1966

Tests
N	DF	Negative log-likelihood	R-square (U)
1696	2	10.410814	0.0076

Test	Chi-square	Prob > chi-square
Likelihood ratio	20.822	<0.0001*
Pearson	20.707	<0.0001*

9.12 What’s wrong? Each of the following statements contains an error. Describe each error and explain why the statement is wrong.
1. A chi-square statistic is used to test the null hypothesis that two categorical variables are dependent.
2. Marginal distributions can be used to explain the relationship between variables in a two-way table.
3. A chi-square statistic is always the square of a z-statistic for comparing two proportions.
9.13 Sexual harassment in middle and high schools. Refer to Exercise 9.11. Use the output in Figure 9.9 to find the chi-square statistic, the degrees of freedom, and the P-value. What do you conclude from this analysis?