Chapter 8 Inference for Proportions

8.1 Inference for a Single Proportion

When you complete this section, you will be able to:

Identify the sample size, the count, and the sample proportion for a single sample.
Calculate the standard error of a sample proportion and the margin of error.
Construct the large-sample confidence interval for a single proportion.
Use the large-sample significance test to test a null hypothesis about a population proportion.
Find the sample size needed for a desired margin of error.
Find the sample size needed for a significance test of a single proportion.

We want to estimate the proportion p of some characteristic in a large population. For example, we may want to know the proportion of likely voters who approve of the president’s conduct in office. We select a simple random sample (SRS) of size n from the population and record the count X of “successes” (such as Yes answers to a question about the president). A “success” response represents the characteristic of interest in this example.

In statistical terms, we are concerned with inference about the probability p of a success in the binomial setting. The sample proportion of successes p^=X/n estimates the unknown population proportion p. If the population is much larger than the sample (at least 20 times as large), the count X has approximately the binomial distribution B(n, p).¹

Example 8.1 Robotics and jobs.

A Pew survey asked a panel of experts whether or not they thought that networked, automated, artificial intelligence (AI), and robotic devices will have displaced more jobs than they have created (net jobs) by 2025.²

The sample size is the number of experts who responded to the Pew survey question, n=1896. The report on the survey tells us that 48% of the respondents said they “believe net jobs will decrease by 2025 due to networked, automated, artificial intelligence (AI), and robotic devices.” Thus, the sample proportion is p^=0.48. We can calculate the count X from the information given; it is the sample size times the proportion responding Yes, X=np^=1896(0.48)=910.

Check-in

8.1 Users of Instagram. The Pew Research Center surveyed U.S. social media users. Among the 236 respondents who were 18 to 21 years old, 158 said that they used Instagram.³
1. What is the sample size n for the 18- to 21-year-olds?
2. In this setting, describe the population proportion p in a short sentence.
3. What is the count X? Describe the count in a short sentence.
4. Find the sample proportion p^.
8.2 Users of Snapchat. Refer to the previous Check-in question. For the same 236 respondents, 62% said that they used Snapchat.
1. What is the sample size n for this setting?
2. What is the count X of those who said they use Snapchat?
3. What is the sample proportion p^?

If the sample size n is very small, we must base tests and confidence intervals for p on the binomial distributions. These are awkward to work with because of the discreteness of the binomial distributions.⁴ But we know that when the sample size n is large, both the count X and the sample proportion p^ are approximately Normal. We will consider only inference procedures based on the Normal approximation. These procedures are similar to those for inference about the mean of a Normal distribution.

Large-sample confidence interval for a single proportion

The unknown population proportion p is estimated by the sample proportion $\hat{p} = X/n$. If the sample size n is sufficiently large, the sampling distribution of $\hat{p}$ is approximately Normal, with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. This means that approximately 95% of the time, $\hat{p}$ will be within $2\sqrt{p(1-p)/n}$ of the unknown population proportion p.
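As a rough numerical check of the 95% statement, the following sketch computes the exact binomial probability that the sample proportion falls within two standard deviations of p. The values n = 200 and p = 0.3 are illustrative assumptions that do not come from the text, and the function name is hypothetical.

```python
from math import comb, sqrt

def prob_within_two_sd(n: int, p: float) -> float:
    """Exact P(|X/n - p| <= 2*sqrt(p(1-p)/n)) when X has the B(n, p) distribution."""
    sd = sqrt(p * (1 - p) / n)            # standard deviation of the sample proportion
    total = 0.0
    for x in range(n + 1):
        if abs(x / n - p) <= 2 * sd:      # keep counts whose proportion is within 2 SD of p
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total

# Illustrative (assumed) values, not taken from the text
print(round(prob_within_two_sd(200, 0.3), 3))
```

The result should be close to 0.95 but not exactly equal to it, because the binomial distribution is discrete and the Normal curve is only an approximation.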

Note that the standard deviation $\sigma_{\hat{p}}$ depends upon the unknown parameter p. To estimate this standard deviation using the data, we replace p in the formula by the sample proportion $\hat{p}$. As we did in Chapter 7, we use the term standard error for the standard deviation of a statistic that is estimated from data. Here is a summary of the procedure.

Large-sample confidence interval for a population proportion

Choose an SRS of size n from a large population with an unknown proportion p of successes. The sample proportion is

$$\hat{p} = \frac{X}{n}$$

where X is the number of successes. The standard error of p^ is

$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

and the margin of error of p^ at confidence level C is

$$m = z^{*}\,SE_{\hat{p}}$$

where the critical value z* is the value for the standard Normal density curve with area C between −z* and z*.

An approximate level C confidence interval for p is

$$\hat{p} \pm m$$

Use this interval for 90%(z*=1.645), 95%(z*=1.96), or 99%(z*=2.576) confidence when the number of successes and the number of failures are both at least 10.

Table D at the back of the book includes a line at the bottom with values of z* for selected values of C. Use Table A for other values of C. You can also use software. In Excel, for example, the formula =NORM.S.INV((1+C)/2) returns z* when C is entered as a decimal (for C=0.95, it gives 1.96).
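The critical value z* is the (1+C)/2 quantile of the standard Normal distribution, because area C in the middle leaves (1−C)/2 in each tail. Here is a minimal sketch using Python's standard library; the function name is our own choice.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Critical value z* with central area `confidence` under the standard Normal curve."""
    # Area C in the middle leaves (1 - C)/2 in each tail,
    # so z* is the (1 + C)/2 quantile.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}: z* = {z_star(c):.3f}")
```

The three values printed agree with the 1.645, 1.96, and 2.576 given in the summary box.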

Example 8.2 Inference for robotics and jobs.

The sample survey in Example 8.1 found that 910 of a sample of 1896 experts reported that they think net jobs will decrease by 2025 because of robots and related technology developments. The sample proportion is

$$\hat{p} = \frac{X}{n} = \frac{910}{1896} = 0.47996$$

which was rounded to 48% in their report. The standard error is

$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.47996(1-0.47996)}{1896}} = 0.011474$$

The z critical value for 95% confidence is z*=1.96, so the margin of error is

$$m = 1.96\,SE_{\hat{p}} = (1.96)(0.011474) = 0.022489$$

The confidence interval is

$$\hat{p} \pm m = 0.480 \pm 0.022$$

We are 95% confident that between 45.8% and 50.2% of experts would report that they think net jobs will decrease by 2025 because of robots and related technology developments.
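The steps of this example can be sketched in a few lines of Python. The function name and its return layout are our own choices, and the sketch assumes the condition of at least 10 successes and 10 failures has already been checked.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(x: int, n: int, confidence: float = 0.95):
    """Return (p_hat, standard error, margin of error, lower, upper)."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)               # standard error of p_hat
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # critical value z*
    m = z * se                                       # margin of error
    return p_hat, se, m, p_hat - m, p_hat + m

p_hat, se, m, lower, upper = large_sample_ci(910, 1896)
print(f"p_hat = {p_hat:.5f}, SE = {se:.6f}, m = {m:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Carrying unrounded values through gives a lower limit that rounds to 0.457. The 45.8% in the text comes from subtracting the already rounded margin 0.022 from 0.480.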

In performing these calculations, we have kept a large number of digits for our intermediate calculations to avoid rounding errors. However, when reporting the results, we prefer to use rounded values (for example, “48.0% with a margin of error of 2.2%”). You should always focus on what is important. Reporting extra digits that are not needed can divert attention from the main point of your summary. There is no additional information to be gained by reporting p^=47.996% with a margin of error of 2.2489%. Do you think it would be better to report 48% with a 2% margin of error?

caution Remember that the margin of error in any confidence interval includes only random sampling error. If people do not respond honestly to the questions asked, for example, your estimate is likely to miss by more than the margin of error. Likewise, if the response rate is low, your estimate and standard error may be biased.

Although the calculations for statistical inference for a single proportion are relatively straightforward and can be done with a calculator or in a spreadsheet, we prefer to use software.

Example 8.3 Robotics and jobs confidence interval using software.

Figure 8.1 shows a spreadsheet for the robotics and jobs example that could be used as input for statistical software. Note that there are 1896 experts who expressed opinions in this survey. The sheet specifies a value for each of these 1896 cases: there are 910 cases with the value Yes and the remaining 986 cases with the value No. An alternative spreadsheet would not summarize the responses but rather would list all 1896 cases and the response for each case.

An Excel spreadsheet lists the following data. Decrease, yes. Count, 910. Decrease, no. Count, 986. — Figure 8.1 The robotics and jobs data in an Excel spreadsheet for the confidence interval, Example 8.3.

Figure 8.2 gives output from JMP and Minitab for these data. There are differences in the displays, but it is easy to find the 95% confidence interval. For JMP, the confidence interval is on the line with “Level” equal to Yes under the headings “Lower CI” and “Upper CI.” Minitab gives the output in the form of an interval under the heading “95% CI.” Notice that the confidence intervals are similar but not identical. Minitab notes that the Normal approximation is used. This is the large-sample method that we described. JMP notes that an alternative method, using score functions, is used.

J M P and Minitab outputs for jobs and robotics data. — Figure 8.2 JMP and Minitab outputs for the robotics and jobs survey, Example 8.3.

The J M P window has an expanded menu, distributions. Beneath is a menu, decrease, also expanded. Under it is a bar graph that plots count on the vertical axis, ranging from 0 to 1000 in increments of 200, versus decrease on the horizontal axis, yes or no. The approximate data is as follows. No, 990. Yes, 910. Below the graph is an expanded menu, frequencies. Below it is the following table of data. Level, no. Count, 986. Probability, 0.52004. Level, yes. Count, 910. Probability, 0.47996. Total, 1896. Probability, 1.00000. N missing, 0. Two levels. Below the table is an expanded menu, confidence levels. Below it is the following table of data. Level, no. Count, 986. Probability, 0.52004. Lower C I, 0.497536. Upper C I, 0.542467. 1 minus alpha, 0.950. Level, yes. Count, 910. Probability, 0.47996. Lower C I, 0.457533. Upper C I, 0.502464. 1 minus alpha, 0.950. Total count, 1896. Note, computed using score confidence intervals. The Minitab output lists the following data for test and C I for one proportion. Sample, 1. X, 910. N, 1896. Sample p, 0.479958. 95 percent C I, (0.457470, 0.502446). Using the normal approximation.
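The JMP output notes that its limits are score confidence intervals. One widely used score interval for a proportion is the Wilson interval; the sketch below assumes that this is the method meant, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilson_score_ci(x: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a proportion (assumed to be the 'score' method)."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom        # estimate pulled slightly toward 0.5
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

lower, upper = wilson_score_ci(910, 1896)
print(f"({lower:.6f}, {upper:.6f})")
```

For these data, the result agrees with the JMP limits in Figure 8.2 (0.457533 and 0.502464) to the digits shown, which is consistent with the assumption. With a sample this large, the score interval and the large-sample interval differ only in the fourth decimal place.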

As usual, the output reports more digits than are useful. When you use software, be sure to think about how many digits are meaningful for your purposes. Do not clutter your report with information that is not meaningful.

We recommend the large-sample confidence interval for 90%, 95%, and 99% confidence whenever the number of successes and the number of failures are both at least 10. For smaller sample sizes, we recommend exact methods that use the binomial distribution. These, as well as other alternative procedures, such as the score function, are available as the default or as options in many statistical software packages. We do not cover them here. There is also an intermediate case between large samples and very small samples where a slight modification of the large-sample approach works quite well. This method is called the “plus four” procedure and is described next.

Check-in

8.3 Users of Instagram. Refer to Check-in question 8.1 (page 451).
1. Find SEp^, the standard error of p^.
2. Give the 95% confidence interval for p in the form of estimate plus or minus the margin of error.
3. Give the confidence interval as an interval of percents.
4. State your conclusion and interpret the meaning of the confidence interval in part 3.
8.4 Users of Snapchat. Refer to Check-in question 8.2 (page 451).
1. Find SEp^, the standard error of p^.
2. Give the 95% confidence interval for p in the form of estimate plus or minus the margin of error.
3. Give the confidence interval as an interval of percents.
4. State your conclusion and interpret the meaning of the confidence interval in part 3.

Beyond the Basics

Plus four confidence interval for a single proportion

Computer studies reveal that confidence intervals based on the large-sample approach can be quite inaccurate when the number of successes and the number of failures are not at least 10. In such cases, a simple adjustment to the confidence interval works very well. The adjustment is based on assuming that the sample contains four additional observations, two of which are successes and two of which are failures. This estimate was first suggested by Edwin Bidwell Wilson in 1927, and it is sometimes called the Agresti–Coull estimate.⁵ We call it the plus four estimate:

$$\tilde{p} = \frac{X+2}{n+4}$$

The confidence interval is based on the z statistic obtained by standardizing the plus four estimate $\tilde{p}$. Because $\tilde{p}$ is the sample proportion for our modified sample of size n+4, it isn’t surprising that the distribution of $\tilde{p}$ is close to the Normal distribution with mean p and standard deviation $\sqrt{p(1-p)/(n+4)}$. To get a confidence interval, we estimate p by $\tilde{p}$ in this standard deviation to get the standard error of $\tilde{p}$. Here is an example.

Example 8.4 Plus four for the percent of equol producers.

Research has shown that there are many health benefits associated with a diet that contains soy foods. Substances in soy called isoflavones are known to be responsible for these benefits. When soy foods are consumed, some subjects produce a chemical called equol, and it is thought that production of equol is a key factor in the health benefits of a soy diet. Unfortunately, not all people are equol producers; there appear to be two distinct subpopulations: equol producers and equol nonproducers.

A nutrition researcher planning some bone health experiments would like to include some equol producers and some nonproducers among her subjects. A preliminary sample of 12 female subjects was measured, and 4 were found to be equol producers. We would like to estimate the proportion of equol producers in the population from which this researcher will draw her subjects.

The plus four estimate of the proportion of equol producers is

$$\tilde{p} = \frac{4+2}{12+4} = \frac{6}{16} = 0.375$$

For a 95% confidence interval, we use Table D to find z*=1.96. We first compute the standard error

$$SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}} = \sqrt{\frac{(0.375)(1-0.375)}{16}} = 0.12103$$

and then the margin of error

$$m = z^{*}\,SE_{\tilde{p}} = (1.96)(0.12103) = 0.237$$

So the confidence interval is

$$\tilde{p} \pm m = 0.375 \pm 0.237 = (0.138,\ 0.612)$$

We estimate with 95% confidence that between 14% and 61% of women from this population are equol producers. Note that the interval is very wide because the sample size is very small. Compare this result with the large-sample confidence interval.

If the true proportion of equol producers is near 14%, the lower limit of this interval, there may not be a sufficient number of equol producers in the study if subjects are tested only after they are enrolled in the experiment. It may be necessary to have a screening phase to determine whether or not a potential subject is an equol producer. The study could then be designed to have the same number of equol producers and nonproducers.
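The plus four calculation in Example 8.4 can be sketched as follows. The function name is our own, and the default z* = 1.96 corresponds to 95% confidence.

```python
from math import sqrt

def plus_four_ci(x: int, n: int, z_star: float = 1.96):
    """Plus four interval: add 2 successes and 2 failures, then proceed as usual."""
    p_tilde = (x + 2) / (n + 4)                      # plus four estimate
    se = sqrt(p_tilde * (1 - p_tilde) / (n + 4))     # standard error uses n + 4
    m = z_star * se                                  # margin of error
    return p_tilde, se, m

p_tilde, se, m = plus_four_ci(4, 12)
print(f"p_tilde = {p_tilde:.3f}, SE = {se:.5f}, m = {m:.3f}")
print(f"95% CI: ({p_tilde - m:.3f}, {p_tilde + m:.3f})")
```

Calling the function with other counts shows how quickly the interval narrows as n grows.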

Significance test for a single proportion

Recall that the sample proportion $\hat{p} = X/n$ is approximately Normal, with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. For confidence intervals, we substitute $\hat{p}$ for p in the last expression to obtain the standard error. When performing a significance test, however, the null hypothesis specifies a value for p, and we assume that this is the true value when calculating the P-value. Therefore, when we test H0: p=p0, we substitute p0 into the expression for $\sigma_{\hat{p}}$ and then standardize $\hat{p}$. Here are the details.

Large-sample significance test for a population proportion

Draw an SRS of size n from a large population with an unknown proportion p of successes. To test the hypothesis H0: p=p0, compute the z statistic

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$$

In terms of a standard Normal random variable Z, the approximate P-value for a test of H0 against

Ha: p>p0 is P(Z≥z)

Ha: p<p0 is P(Z≤z)

Ha: p≠p0 is 2P(Z≥|z|)

We recommend the large-sample z significance test as long as the expected number of successes, np0, and the expected number of failures, n(1−p0), are both at least 10.

If the expected numbers of successes and failures are not both at least 10, or if the population is less than 20 times as large as the sample, other procedures should be used. One such approach is to use the binomial distribution, as we did with the sign test. Here is a large-sample matched-pairs example.

Example 8.5 Comparing two sunblock lotions.

Your company produces a sunblock lotion designed to protect the skin from both UVA and UVB exposure to the sun. You hire a company to compare your product with the product sold by your major competitor. The testing company exposes skin on the backs of a sample of 20 people to UVA and UVB rays and measures the protection provided by each product. For 13 of the subjects, your product provided better protection, while for the other 7 subjects, your competitor’s product provided better protection. Do you have evidence to support a commercial advertisement claiming that your product provides superior UVA and UVB protection? For the data we have n=20 subjects and X=13 successes. The parameter p is the proportion of people who would receive superior UVA and UVB protection from your product. To answer the claim question, we test

H0: p=0.5
Ha: p≠0.5

The expected numbers of successes (your product provides better protection) and failures (your competitor’s product provides better protection) are 20×0.5=10 and 20×0.5=10. Both are at least 10, so we can use the z test. The sample proportion is

$$\hat{p} = \frac{X}{n} = \frac{13}{20} = 0.65$$

The test statistic is

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.65 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{20}}} = 1.34$$

From Table A, we find P(Z<1.34)=0.9099, so the probability in the upper tail is 1−0.9099=0.0901. The P-value is the area in both tails, P=2×0.0901=0.1802.

We conclude that the sunblock testing data do not provide evidence to reject the hypothesis of no difference between your product and your competitor’s product (p^=0.65, z=1.34, P=0.18). The data do not support your proposed advertising claim.

Note that we have used the two-sided alternative for this example. In settings like this, we must start with the view that either product could be better if we want to prove a claim of superiority. Thinking or hoping that your product is superior cannot be used to justify a one-sided test.
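The large-sample z test can be sketched as a small function. The function name and the labels for the alternative are our own choices, and the sketch assumes the expected counts of successes and failures have already been checked.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x: int, n: int, p0: float, alternative: str = "two-sided"):
    """Large-sample z test of H0: p = p0. Returns (z, P-value)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard deviation uses p0, not p_hat
    cdf = NormalDist().cdf
    if alternative == "greater":                 # Ha: p > p0
        p_value = 1 - cdf(z)
    elif alternative == "less":                  # Ha: p < p0
        p_value = cdf(z)
    else:                                        # Ha: p != p0
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

z, p_value = one_proportion_z_test(13, 20, 0.5)
print(f"z = {z:.2f}, P = {p_value:.4f}")
```

Because the sketch does not round z to two decimal places before finding the tail area, its P-value (about 0.1797) differs slightly from the 0.1802 obtained from Table A.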

Although these calculations are not particularly difficult to do using a calculator, we prefer to use software. Here are some details.

Example 8.6 Sunblock significance tests using software.

JMP and Minitab outputs for the analysis in Example 8.5 appear in Figure 8.3. JMP uses a slightly different way of reporting the results. Two ways of performing the significance test are labeled in the column “Test.” The one that corresponds to the procedure that we have described is on the second line, labeled “Pearson.” The P-value under the heading “Prob>Chisq” is 0.1797, which is very close to the 0.1802 that we calculated using Table A. Minitab reports the value of the test statistic z, and the P-value is rounded to 0.180.

The J M P window has an expanded menu, distributions. Beneath is a menu, sunblock, also expanded. Under it is a bar graph that plots count on the vertical axis, ranging from 0 to 16 in increments of 2, versus sunblock on the horizontal axis, theirs or yours. The approximate data is as follows. Theirs, 7, representing 35 percent. Yours, 13, representing 65 percent. Below the graph is an expanded menu, frequencies. Below it is the following table of data. Level, theirs. Count, 7. Probability, 0.35000. Level, yours. Count, 13. Probability, 0.65000. Total, 20. Probability, 1.00000. N missing, 0. Two levels. Below the table is an expanded menu, confidence levels. Below it is the following table of data. Level, theirs. Count, 7. Probability, 0.35000. Lower C I, 0.181192. Upper C I, 0.567146. 1 minus alpha, 0.950. Level, yours. Count, 13. Probability, 0.65000. Lower C I, 0.432854. Upper C I, 0.818808. 1 minus alpha, 0.950. Total count, 20. Note, computed using score confidence intervals. Below is an expanded menu, test probabilities, with the following two tables of data. First. Level, theirs. Estimated probability, 0.35000. Hypothetical probability, 0.50000. Level, yours. Estimated probability, 0.65000. Hypothetical probability, 0.50000. Second. Test, likelihood ratio. Chi square, 1.8280. D F, 1. Probability greater than chi square, 0.1754. Test, Pearson. Chi square, 1.8000. D F, 1. Probability greater than chi square, 0.1797. The Minitab output lists three sets of data for test and C I for one proportion as follows. Method. p, event proportion. Normal approximation method is used for this analysis. Descriptive statistics, table. N, 20. Event, 13. Sample p, 0.650000. 95 percent C I for p, (0.440963, 0.859037). Test. Null hypothesis, H sub 0, p equals 0.5. Alternative hypothesis, H sub 1, p does not equal 0.5. Z value, 1.34. P value, 0.180.
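The Pearson chi-square statistic in the JMP output is the square of the z statistic from Example 8.5, and with one degree of freedom its upper-tail probability equals the two-sided Normal P-value. The following sketch illustrates this relationship for the sunblock data; the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 13, 20, 0.5

# z statistic from the large-sample test
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)

# Pearson chi-square: sum of (observed - expected)^2 / expected over both outcomes
expected_yes, expected_no = n * p0, n * (1 - p0)
chi_sq = (x - expected_yes) ** 2 / expected_yes + ((n - x) - expected_no) ** 2 / expected_no

# With 1 degree of freedom, P(chi-square >= c) = 2 * P(Z >= sqrt(c))
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_sq)))

print(f"z^2 = {z**2:.4f}, chi-square = {chi_sq:.4f}, P = {p_value:.4f}")
```

Both statistics equal 1.8, matching the Pearson line of the JMP output, so the two packages are reporting the same test in different forms.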

Check-in

8.5 Draw a picture. Draw a picture of a standard Normal curve and shade the tail areas to illustrate the calculation of the P-value for Example 8.5.
8.6 What does the confidence interval tell us? Inspect the outputs in Figure 8.3. Report the confidence interval for the percent of people who would get better sun protection from your product than from your competitor’s. Be sure to convert from proportions to percents and to round appropriately. Interpret the confidence interval and compare this way of analyzing data with the significance test.
8.7 The effect of X. In Example 8.5 (page 457), suppose that your product provided better UVA and UVB protection for 16 of the 20 subjects. Perform the significance test and summarize the results.
8.8 The effect of n. In Example 8.5 (page 457), consider what would have happened if you had paid for three times as many subjects to be tested. Assume that the results would be similar to those in Example 8.5, that is, 65% of the subjects had better UVA and UVB protection with your product. Perform the significance test and summarize the results.

In Example 8.5, we treated an outcome as a success whenever your product provided better sun protection. Would we get the same results if we defined success as an outcome where your competitor’s product was superior? You will find in answering the next Check-in question that the answer is yes.

Check-in

8.9 Redefining success. In Example 8.5 (page 457), we performed a significance test to compare your product with your competitor’s. Success was defined as the outcome where your product provided better protection. Now, take the viewpoint of your competitor where success is defined to be the outcome where your competitor’s product provides better protection. In other words, n remains the same, but X is now 7.
1. Perform the two-sided significance test and report the results. How do these compare with what we found in Example 8.5?
2. Find the 95% confidence interval for this setting and compare it with the interval calculated when success is defined as the outcome where your product provides better protection.

caution We do not often use significance tests for a single proportion because it is uncommon to have a situation where there is a precise p0 that we want to test. For physical experiments such as coin tossing or drawing cards from a well-shuffled deck, probability arguments lead to an ideal p0. Even then, it can be argued, for example, that no real coin has a probability of heads exactly equal to 0.5.

Data from past large samples can sometimes provide a p0 for the null hypothesis of a significance test. In some types of epidemiology research, for example, “historical controls” from past studies serve as the benchmark for evaluating new treatments. Medical researchers argue about the validity of these approaches, because the past never quite resembles the present. In general, we prefer comparative studies whenever possible.

Choosing a sample size for a confidence interval

In Chapter 6, we showed how to choose the sample size n to obtain a confidence interval with specified margin of error m for a mean. Because we are using a Normal approximation for inference about a population proportion, sample size selection proceeds in much the same way.

Recall that the margin of error for the large-sample confidence interval for a population proportion is

$$m = z^{*}\,SE_{\hat{p}} = z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Choosing a confidence level C fixes the critical value z*. But the margin of error also depends on the data through the value of p^ and the sample size n. Because we don’t know the value of p^ until we gather the data, we must guess a value to use in the calculations. We will call the guessed value p*. There are two common ways to get p*:

Use the sample estimate from a pilot study or from similar studies done earlier.
Use p*=0.5. Because the margin of error is largest when p^=0.5, this choice gives a sample size that is somewhat larger than we really need for the confidence level we choose. It is a safe choice no matter what the data later show.

Once we have chosen p* and the margin of error m that we want, we can find the n we need to achieve this margin of error. Here is the result.

Sample size for desired margin of error

The level C confidence interval for a proportion p will have a margin of error approximately equal to a specified value m when the sample size satisfies

$$n = \left(\frac{z^{*}}{m}\right)^{2} p^{*}(1-p^{*})$$

Here, z* is the critical value for confidence level C, and p* is a guessed value for the proportion of successes in the future sample.

The margin of error will be less than or equal to m if p* is chosen to be 0.5. Substituting p*=0.5 into the formula above gives

$$n = \frac{1}{4}\left(\frac{z^{*}}{m}\right)^{2}$$

The value of n obtained by this method is not particularly sensitive to the choice of p* when p* is fairly close to 0.5. However, if the value of p is likely to be smaller than about 0.3 or larger than about 0.7, use of p*=0.5 may result in a sample size that is much larger than needed.
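The sample size formula follows from setting the margin of error equal to m, with the guessed value p* in place of the sample proportion, and solving for n. Here is a short derivation that uses only the symbols defined above:

```latex
\begin{aligned}
m &= z^{*}\sqrt{\frac{p^{*}(1-p^{*})}{n}} \\
m^{2} &= (z^{*})^{2}\,\frac{p^{*}(1-p^{*})}{n} \\
n &= \left(\frac{z^{*}}{m}\right)^{2} p^{*}(1-p^{*})
\end{aligned}
```

Because the product p*(1−p*) reaches its maximum value of 1/4 at p*=0.5, that choice gives the largest n.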

Example 8.7 Planning a survey of students.

A large university is interested in assessing student satisfaction with the overall campus environment. The plan is to distribute a questionnaire to an SRS of students, but before proceeding, the university wants to determine how many students to sample. The questionnaire asks about a student’s degree of satisfaction with various student services, each measured on a five-point scale. The university is interested in the proportion p of students who are satisfied (that is, who choose either “satisfied” or “very satisfied,” the two highest levels on the five-point scale).

The university wants to estimate p with 95% confidence and a margin of error less than or equal to 3%, or 0.03. For planning purposes, it is willing to use p*=0.5. To find the sample size required,

$$n = \frac{1}{4}\left(\frac{z^{*}}{m}\right)^{2} = \frac{1}{4}\left(\frac{1.96}{0.03}\right)^{2} = 1067.1$$

Round up to get n=1068. (Always round up. Rounding down would give a margin of error slightly greater than 0.03.)

Similarly, for a 2.5% margin of error, we have (after rounding up)

$$n = \frac{1}{4}\left(\frac{1.96}{0.025}\right)^{2} = 1537$$

and for a 2% margin of error,

$$n = \frac{1}{4}\left(\frac{1.96}{0.02}\right)^{2} = 2401$$
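The three sample sizes in this example can be reproduced with a short function. This is a sketch under our own naming; it takes its inputs as decimal strings and uses exact rational arithmetic, a design choice made here so that floating-point rounding cannot push a whole-number answer such as 2401 up to the next integer.

```python
from fractions import Fraction
from math import ceil

def sample_size_for_margin(m: str, z_star: str = "1.96", p_star: str = "0.5") -> int:
    """Smallest n whose margin of error is at most m, using guessed proportion p_star."""
    # Fractions built from decimal strings are exact, so ceil() behaves predictably.
    m_f, z_f, p_f = Fraction(m), Fraction(z_star), Fraction(p_star)
    n = (z_f / m_f) ** 2 * p_f * (1 - p_f)
    return ceil(n)   # always round up

for margin in ("0.03", "0.025", "0.02"):
    print(margin, sample_size_for_margin(margin))
```

Passing a different p_star, such as "0.2", shows how much smaller the required sample becomes when p is expected to be far from 0.5.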

News reports frequently describe the results of surveys with sample sizes between 1000 and 1500 and a margin of error of about 3%. These surveys generally use sampling procedures more complicated than simple random sampling, so the calculation of confidence intervals is more involved than what we have studied in this section. The calculations in Example 8.7 show in principle how such surveys are planned.

Example 8.8 Assessing interest in Pilates classes.

The Division of Recreational Sports (Rec Sports) at a major university is responsible for offering comprehensive recreational programs, services, and facilities to the students. Rec Sports is continually examining its programs to determine how well it is meeting the needs of the students. Rec Sports is considering adding some new programs and would like to know how much interest there is in a new exercise program based on the Pilates method.⁶ It will take a survey of undergraduate students. In the past, Rec Sports emailed short surveys to all undergraduate students. The response rate obtained in this way was about 5%. This time it will send emails to a simple random sample of the students and will follow up with additional emails and eventually a phone call to get a higher response rate. Because of limited staff and the work involved with the follow-up, it would like to use a sample size of about 200 responses. It assumes that the new procedures will improve the response rate to 90%, so it will contact 225 students in the hope that these will provide at least 200 valid responses. One of the questions it will ask is, “Have you ever heard about the Pilates method of exercise?”

The primary purpose of the survey is to estimate various population proportions for undergraduate students. Will the proposed sample size of n=200 be adequate to provide Rec Sports with the needed information? To address this question, we calculate the margins of error of 95% confidence intervals for various values of p^.

Example 8.9 Margins of error.

In the Rec Sports survey, the margin of error of a 95% confidence interval for any value of p^ and n=200 is

$$m = z^{*}\,SE_{\hat{p}} = 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{200}} = 0.139\sqrt{\hat{p}(1-\hat{p})}$$

The results for various values of p^ are

p^	m	p^	m
0.05	0.030	0.60	0.068
0.10	0.042	0.70	0.064
0.20	0.056	0.80	0.056
0.30	0.064	0.90	0.042
0.40	0.068	0.95	0.030
0.50	0.070

Rec Sports judged these margins of error to be acceptable, and it contacted 225 students, hoping to achieve a sample size of 200 for its survey.
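The margins of error in this example can be recomputed with a few lines of Python. The function name is our own, and the sketch uses the unrounded multiplier 1.96/√200 rather than 0.139, so a few entries (for example, at 0.20 and 0.50) can differ from the table in the third decimal place.

```python
from math import sqrt

def margin_of_error(p_hat: float, n: int = 200, z_star: float = 1.96) -> float:
    """Margin of error of the large-sample confidence interval for a proportion."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

for p_hat in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95):
    print(f"{p_hat:.2f}  {margin_of_error(p_hat):.4f}")
```

Changing n in the call shows how the whole column of margins shrinks in proportion to 1/√n.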

The table in Example 8.9 illustrates two points. First, the margins of error for p^ and 1−p^ are the same. This is a direct consequence of the form of the confidence interval. Second, the margin of error varies between only 0.064 and 0.070 as p^ varies from 0.3 to 0.7, and the margin of error is greatest when p^=0.5, as we claimed earlier (page 460). It is true in general that the margin of error will vary relatively little for values of p^ between 0.3 and 0.7. Therefore, when planning a study, it is not necessary to have a very precise guess for p. If p*=0.5 is used and the observed p^ is between 0.3 and 0.7, the actual interval will be a little shorter than needed, but the difference will be small.

caution Again, it is important to emphasize that these calculations consider only the effects of sampling variability that are quantified in the margin of error. Other sources of error, such as nonresponse and possible misinterpretation of questions, are not included in the table of margins of error for Example 8.9. Rec Sports is trying to minimize these kinds of errors. It performed a pilot study using a small group of current users of its facilities to check the wording of the questions, and for the final survey it devised a careful plan to follow up with the students who did not respond to the initial email.

Check-in

8.10 Confidence level and sample size. Refer to Example 8.7 (page 460). Suppose that the university was interested in a 95% confidence interval with margin of error 0.02. Would the required sample size be smaller or larger than 1068 students? Verify your answer by performing the calculation.
8.11 Make a plot. Use the values for p^ and m given in Example 8.9 to draw a plot of the sample proportion versus the margin of error. Summarize the major features of your plot.

Choosing a sample size for a significance test

In Chapter 6, we introduced the idea of power for a significance test. In Chapter 7, we discussed the relationship between sample size and power and described the use of software to calculate power for both one- and two-sample t tests. Those ideas also apply to the significance test for a proportion that we studied in this section. Thus, we can concentrate on the input and output and let software do the messy calculations.

To find the required sample size, we need to specify

The significance level α (the probability of rejecting the null hypothesis when it is true); usually we choose 5% for α.
Power (probability of rejecting the null hypothesis when it is false); usually we choose 80% (0.80) for power.
The value of p0 in the null hypothesis H0: p=p0.
The alternative hypothesis, two-sided Ha: p≠p0, one-sided Ha: p>p0 or Ha: p<p0.
A value of p for the alternative hypothesis.

Example 8.10 Sample size for comparing two sunblock lotions.

In Example 8.5 (page 457), we performed the significance test for comparing two sunblock lotions in a setting where each subject used the two lotions, and the product that provided better protection was recorded. Although your product performed better 13 times in 20 trials, the value p^=13/20=0.65 was not sufficiently far from the null hypothesized value of p0=0.5 for us to reject H0 (P=0.18). Let’s suppose that the true proportion of the time that your lotion would perform better is p=0.65, and we plan to test the null hypothesis H0: p=0.5 versus the two-sided alternative Ha: p≠0.5 using the 0.05 significance level.

What sample size n should we choose if we want to have an 80% chance of rejecting H0? Outputs from JMP and Minitab are given in Figure 8.4. JMP indicates that n=89 should be used, while Minitab suggests n=85. The difference is due to the different methods used for these calculations.
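A rough version of this calculation can be done without a statistics package. The Python sketch below uses one common Normal-approximation formula, n = [(z_α/2 √(p0(1−p0)) + z_β √(p(1−p))) / (p−p0)]², where z_β is the standard Normal quantile for the desired power. This formula is an assumption for illustration, not the documented method of either package, and the function name and arguments are hypothetical. For a two-sided test it ignores the small chance of rejecting in the wrong direction.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_one_proportion(p0, p_alt, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n for a z test of H0: p = p0 when the true proportion is p_alt.

    Based on the Normal approximation to the distribution of the sample proportion.
    """
    std_normal = NormalDist()                 # standard Normal, N(0, 1)
    tail = alpha / 2 if two_sided else alpha  # area in each rejection tail
    z_alpha = std_normal.inv_cdf(1 - tail)    # critical value of the test
    z_power = std_normal.inv_cdf(power)       # quantile matching the desired power
    numerator = (z_alpha * sqrt(p0 * (1 - p0))
                 + z_power * sqrt(p_alt * (1 - p_alt)))
    return ceil((numerator / (p_alt - p0)) ** 2)  # round up to a whole subject


# Sunblock setting: H0: p = 0.5, alternative p = 0.65, alpha = 0.05, power = 0.80
n_needed = sample_size_one_proportion(p0=0.5, p_alt=0.65)
print(n_needed)  # 85
```

For the sunblock setting this approximation gives n=85, the same value that Minitab reports. Methods that use other approximations or exact binomial calculations can give somewhat different answers, as the JMP value shows.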

Figure 8.4 JMP and Minitab outputs for sample size needed to compare sunblock lotions, Example 8.10.

At the top of the JMP window is an expanded menu, sample size. Below is a power calculator. It lists several measures with dropdown menus, radio buttons, and textboxes for entering desired values. From top to bottom, it reads as follows. One proportion. Testing if one proportion is different from the hypothesized value. Alpha, 0.05 entered. Proportion, 0.65 entered. Dropdown menu, method, exact Agresti-Coull selected. Two options, two-sided or one-sided, with one-sided selected. Enter one value to see a plot of the other two. Null proportion, 0.5. Sample size, 89. Power, 0.8. Actual test size equals 0.0557783. The Minitab output shows a graph of a power curve for one proportion. The graph plots power on the vertical axis, ranging from 0.0 to 1.0 in increments of 0.2, versus comparison p on the horizontal axis, ranging from 0.2 to 0.8 in increments of 1. To the right, the output lists the following data for the graph. Sample size, 85. Assumptions. alpha, 0.05. Hypothesized p, 0.5. Alternative, does not equal. On the graph, a U-shaped plot falls from (0.28, 1.0) to (0.5, 0.05), then rises to (0.72, 1.0). A point is plotted on the curve at (0.65, 0.8). All values estimated.

Note that Minitab provides a graph as a function of the value of the proportion for the alternative hypothesis. Similar plots can be produced by JMP. In some situations like those in Chapter 7, you might want to specify the sample size n and have software compute the power. This option is available in JMP, Minitab, and other software.
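The reverse calculation, power for a given sample size, can be sketched in the same way. The Python code below assumes the Normal approximation for p^ under both the null and the alternative value of p; software may use a different or exact method, so the values should be read as approximations. The function name and the sample sizes tried are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist


def power_one_proportion(n, p0, p_alt, alpha=0.05):
    """Approximate power of the two-sided z test of H0: p = p0 when p = p_alt."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    sd_null = sqrt(p0 * (1 - p0) / n)        # SD of p-hat if H0 is true
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)   # SD of p-hat under the alternative
    # The test rejects H0 when p-hat falls outside these two cutoffs.
    lower = p0 - z_star * sd_null
    upper = p0 + z_star * sd_null
    p_hat = NormalDist(mu=p_alt, sigma=sd_alt)
    return p_hat.cdf(lower) + (1 - p_hat.cdf(upper))


# Sunblock setting: power rises with n and is close to 0.80 at n = 85
for n in (40, 85, 120):
    print(n, round(power_one_proportion(n, p0=0.5, p_alt=0.65), 3))
```

Evaluating the function over a range of alternative values of p, with n held fixed, traces out a power curve like the one in the Minitab output.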

Check-in

8.12 Compute the sample size for a different alternative. Refer to Example 8.10. Use software to find the sample size needed for a test of the null hypothesis that p=0.5 versus the two-sided alternative with α=0.05 and 80% power if the alternative is p=0.6.
8.13 Compute the power for a given sample size. Consider the setting in Example 8.10. You have a budget that will allow you to test 50 subjects. Use software to find the power of the test for this value of n.

Section 8.1 Summary

Inference about a population proportion p from an SRS of size n is based on the sample proportion p^=X/n. When n is large, p^ has approximately the Normal distribution with mean p and standard deviation √(p(1−p)/n).
For large samples, the level C margin of error of p^ is

m=z*SEp^

where the critical value z* is the value for the standard Normal density curve with area C between −z* and z*, and the standard error of p^ is

SEp^=√(p^(1−p^)/n)
The level C large-sample confidence interval is

p^±m

We recommend using this interval for 90%, 95%, and 99% confidence whenever the number of successes and the number of failures are both at least 10. When sample sizes are smaller, alternative procedures such as the plus four estimate of the population proportion are recommended.
Tests of H0:p=p0 are based on the z statistic

z=(p^−p0)/√(p0(1−p0)/n)

with P-values calculated from the N(0, 1) distribution. Use this procedure when the expected number of successes, np0, and the expected number of failures, n(1−p0), are both greater than 10.
The sample size required to obtain a confidence interval of approximate margin of error m for a proportion is found from

n=(z*/m)² p*(1−p*)

where p* is a guessed value for the proportion and z* is the standard Normal critical value for the desired level of confidence. To ensure that the margin of error of the interval is less than or equal to m no matter what p^ may be, use

n=(1/4)(z*/m)²
Software can be used to determine the sample sizes for significance tests. Inputs include the significance level, the desired power, the null hypothesized value of p, and the alternative value of p.
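The formulas in this summary translate directly into a few lines of code. The Python sketch below is one possible arrangement; the function names are illustrative, and none of the functions checks the guidelines on the number of successes and failures, so that check remains the user's responsibility. The counts come from Example 8.1 (910 of 1896) and the sunblock trials discussed in Example 8.10 (13 of 20).

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal, N(0, 1)


def confidence_interval(x, n, level=0.95):
    """Large-sample level C interval: p-hat plus or minus z* times SE."""
    p_hat = x / n
    z_star = STD_NORMAL.inv_cdf((1 + level) / 2)   # area `level` between -z* and z*
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def z_test(x, n, p0):
    """z statistic and two-sided P-value for H0: p = p0."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


def sample_size_for_margin(m, level=0.95, p_guess=0.5):
    """Smallest n giving margin of error at most m for the guessed value p*.

    The default p_guess = 0.5 is the conservative choice, n = (1/4)(z*/m)^2.
    """
    z_star = STD_NORMAL.inv_cdf((1 + level) / 2)
    return ceil((z_star / m) ** 2 * p_guess * (1 - p_guess))


print(confidence_interval(910, 1896))   # 95% interval for the robotics survey
print(z_test(13, 20, 0.5))              # sunblock test; P-value is about 0.18
print(sample_size_for_margin(0.03))     # conservative n for m = 0.03
```

Rounding the sample size up rather than to the nearest integer keeps the margin of error at or below the target.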

Now that you have completed this section, you will be able to:

Identify the sample size, the count, and the sample proportion for a single sample. Review Example 8.1 (page 451) and try Exercise 8.1.
Calculate the standard error of a sample proportion and the margin of error. Review Example 8.2 (page 453) and try Exercise 8.3.
Construct the large-sample confidence interval for a single proportion. Review Example 8.2 (page 453) and try Exercise 8.3.
Use the large-sample significance test to test a null hypothesis about a population proportion. Review Example 8.5 (page 457) and try Exercise 8.7.
Find the sample size needed for a desired margin of error. Review Example 8.7 (page 460) and try Exercise 8.9.
Find the sample size needed for a significance test of a single proportion. Review Example 8.10 (page 463) and try Exercise 8.31.

Section 8.1 EXERCISES

8.1 Do you use a smart watch or fitness tracker? A Pew Internet poll asked 4272 U.S. adults about their use of smart watches and fitness trackers. A summary of the results reported that 897 adults regularly wear a smart watch or fitness tracker.⁷
1. Identify the sample size and the count.
2. Calculate the sample proportion.
3. Explain the relationship between the population proportion and the sample proportion.
8.2 What do you know about science? A Pew Internet poll tested 4464 U.S. adults about their knowledge of science. One of the questions asked how far a car will travel in 45 minutes if it travels at a constant speed of 40 miles per hour. Possible answers presented were 25 miles, 30 miles, 35 miles, and 45 miles. The correct answer was given by 2544 adults.⁸
1. Identify the sample size and the count.
2. Calculate the sample proportion.
3. Explain the relationship between the population proportion and the sample proportion.
8.3 Analysis of the smart watch or fitness tracker data. Refer to Exercise 8.1.
1. Report the sample proportion, the standard error of the sample proportion, and the margin of error for 95% confidence.
2. Are the guidelines for when to use the large-sample confidence interval for a population proportion satisfied in this setting? Explain your answer.
3. Find the 95% large-sample confidence interval for the population proportion.
4. Write a short statement explaining the meaning of your confidence interval.
8.4 Analysis of the science knowledge data. Refer to Exercise 8.2.
1. Report the sample proportion, the standard error of the sample proportion, and the margin of error for 95% confidence.
2. Are the guidelines for when to use the large-sample confidence interval for a population proportion satisfied in this setting? Explain your answer.
3. Find the 95% large-sample confidence interval for the population proportion.
4. Write a short statement explaining the meaning of your confidence interval.
8.5 Would you recommend the service to a friend? An automobile dealership asks all its customers who used its service department in a given two-week period if they would recommend the service to a friend. A total of 200 customers used the service during the two-week period, and 180 said that they would recommend the service to a friend.
1. Identify the sample size and the count.
2. Calculate the sample proportion.
3. Explain the relationship between the population proportion and the sample proportion.
8.6 Analysis of the service recommendation data. Refer to the previous exercise.
1. Report the sample proportion, the standard error of the sample proportion, and the margin of error for 95% confidence.
2. Are the guidelines for when to use the large-sample confidence interval for a population proportion satisfied in this setting? Explain your answer.
3. Find the 95% large-sample confidence interval for the population proportion.
4. Write a short statement explaining the meaning of your confidence interval.
8.7 Whole grain versus regular grain? A study of young children was designed to increase their intake of whole-grain, rather than regular-grain, snacks. At the end of the study, the 82 children who participated in the study were presented with a choice between a regular-grain snack and a whole-grain alternative. The whole-grain alternative was chosen by 49 children. You want to examine the possibility that the children are equally likely to choose each type of snack.
1. Formulate the null and alternative hypotheses for this setting.
2. Are the guidelines for using the large-sample significance test satisfied for testing this null hypothesis? Explain your answer.
3. Perform the significance test and summarize your results in a short paragraph.
8.8 What’s wrong? For each of the following statements, explain what is wrong and why.
1. You can use a significance test to evaluate the hypothesis H0: p^=0.4 versus the two-sided alternative.
2. The large-sample significance test for a population proportion is based on a t statistic.
3. An approximate 95% confidence interval for an unknown proportion p is p^ plus or minus its standard error.
8.9 Find the sample size. You are planning a survey similar to the one about the use of smart watches and fitness trackers described in Exercise 8.1. You will report your results with a large-sample 95% confidence interval. How large a sample do you need to be sure that the margin of error will not be greater than 0.02? Show your work, including what you used to choose a value for p*.
8.10 Draw some pictures. Consider the binomial setting with n=200 and p=0.4.
1. The sample proportion p^ will have a distribution that is approximately Normal. Give the mean and the standard deviation of this Normal distribution.
2. Draw a sketch of this Normal distribution. Mark the location of the mean.
3. Find a value p* for which the probability is 95% that p^ will be between p−p* and p+p*. Mark these two values on your sketch.
8.11 Country food and Inuits. Country food includes seals, caribou, whales, ducks, fish, and berries and is an important part of the diet of the aboriginal people called Inuits who inhabit Inuit Nunangat, the northern region of what is now called Canada. A survey of Inuits in Inuit Nunangat reported that 3274 out of 5000 respondents said that at least half of the meat and fish that they eat is country food.⁹ Find the sample proportion and a 95% confidence interval for the population proportion of Inuits whose meat and fish consumption consists of at least half country food.
8.12 Soft drink consumption in New Zealand. A survey commissioned by the Southern Cross Healthcare Group reported that 16% of New Zealanders consume five or more servings of soft drinks per week. The data were obtained through an online survey of 2006 randomly selected New Zealanders over 15 years of age.¹⁰
1. What number of survey respondents reported that they consume five or more servings of soft drinks per week? You will need to round your answer. Why?
2. Find a 95% confidence interval for the proportion of New Zealanders who report that they consume five or more servings of soft drinks per week.
3. Convert the estimate and your confidence interval to percents.
4. Discuss reasons the estimate might be biased.
8.13 Violent video games. A survey of 1050 parents who have a child under the age of 18 living at home asked about their opinions regarding violent video games. A report describing the results of the survey stated that 89% of parents say that violence in today’s video games is a problem.¹¹
1. What number of survey respondents reported that they thought that violence in today’s video games is a problem? You will need to round your answer. Why?
2. Find a 95% confidence interval for the proportion of parents who think that violence in today’s video games is a problem.
3. Convert the estimate and your confidence interval to percents.
4. Discuss reasons the estimate might be biased.
8.14 Bullying. Refer to the previous exercise. The survey also reported that 93% of the parents surveyed said that bullying contributes to violence in the United States. Answer the questions in the previous exercise for this item on the survey.
8.15 p^ and the Normal distribution. Consider the binomial setting with n=45. You are testing the null hypothesis that p=0.7 versus the two-sided alternative with a 5% chance of rejecting the null hypothesis when it is true.
1. Find the values of the sample proportion p^ that will lead to rejection of the null hypothesis.
2. Repeat part 1, assuming a sample size of n=90.
3. Make a sketch illustrating what you have found in parts 1 and 2. What does your sketch show about the effect of the sample size in this setting?
8.16 Students doing community service. In a sample of 159,949 first-year college students, the National Survey of Student Engagement reported that 39% participated in community service or volunteer work.¹²
1. Find the margin of error for 99% confidence.
2. Here are some facts from the report that summarizes the survey. The students were from 617 four-year colleges and universities. The response rate was 36%. Institutions paid a participation fee of between $1800 and $7800, based on the size of their undergraduate enrollment. Discuss these facts as possible sources of error in this study. How do you think these errors would compare with the margin of error that you calculated in part 1?
8.17 Plans to study abroad. The survey described in the previous exercise also asked about items related to academics. In response to one of these questions, 42% of first-year students reported that they planned to study abroad.
1. Based on the information available, how many students planned to study abroad?
2. Give a 99% confidence interval for the population proportion of first-year college students who planned to study abroad.
8.18 Student credit cards. In a survey of 1430 undergraduate students, 1087 reported that they had one or more credit cards.¹³ Give a 95% confidence interval for the proportion of all college students who had at least one credit card.
8.19 How many credit cards? The summary of the survey described in the previous exercise reported that 43% of undergraduates had four or more credit cards. Give a 95% confidence interval for the proportion of all college students who had four or more credit cards.
8.20 How would the confidence interval change? Refer to the previous exercise.
1. Would a 90% confidence interval be wider or narrower than the one that you found in the previous exercise? Verify your answer by computing the interval.
2. Would a 97% confidence interval be wider or narrower than the one that you found in that exercise? Verify your results by computing the interval.
8.21 Do students report Internet sources? The National Survey of Student Engagement found that 87% of students report that their peers at least “sometimes” copy information from the Internet in their papers without reporting the source.¹⁴ Assume that the sample size is 430,000.
1. Find the margin of error for 99% confidence.
2. Here are some items from the report that summarizes the survey. More than 430,000 students from 730 four-year colleges and universities participated. The average response rate was 43% and ranged from 15% to 89%. Institutions pay a participation fee of between $3000 and $7500 based on the size of their undergraduate enrollment. Discuss these facts as possible sources of error in this study. How do you think these errors would compare with the margin of error that you calculated in part 1?
8.22 Can we use the z test? In each of the following cases, state whether or not the Normal approximation to the binomial should be used for a significance test on the population proportion p. Explain your answers.
1. n=20 and H0: p=0.3.
2. n=70 and H0: p=0.2.
3. n=100 and H0: p=0.08.
4. n=150 and H0: p=0.01.
8.23 Long sermons. The National Congregations Study collected data in a one-hour interview with a key informant—that is, a minister, priest, rabbi, or other staff person or leader.¹⁵ One question concerned the length of the typical sermon. For this question, 390 out of 1191 congregations reported that the typical sermon lasted more than 30 minutes.
1. Use the large-sample inference procedures to estimate the true proportion for this question with a 95% confidence interval.
2. The respondents to this question were not asked to use a stopwatch to record the lengths of a random sample of sermons at their congregations. They responded based on their impressions of the sermons. Do you think that ministers, priests, rabbis, or other staff persons or leaders might perceive sermon lengths differently from the people listening to the sermons? Discuss how your ideas would influence your interpretation of the results of this study.
8.24 Instant versus fresh-brewed coffee. A matched pairs experiment compares the taste of instant with fresh-brewed coffee. Each subject tastes two unmarked cups of coffee, one of each type, in random order, and states which they prefer. Of the 50 subjects who participate in the study, 32 preferred the fresh-brewed coffee.
1. Test the claim that a majority of people preferred the taste of fresh-brewed coffee. Report the large-sample z statistic and its P-value.
2. Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate area that corresponds to the P-value.
3. Is your result significant at the 5% level? What is your practical conclusion?
8.25 Tossing a coin 10,000 times! The South African mathematician John Kerrich, while a prisoner of war during World War II, tossed a coin 10,000 times and obtained 5067 heads.
1. Is this significant evidence at the 5% level that the probability that Kerrich’s coin comes up heads is not 0.5? Use a sketch of the standard Normal distribution to illustrate the P-value.
2. Use a 95% confidence interval to find the range of probabilities of heads that would not be rejected at the 5% level.
8.26 Is there interest in a new product? One of your employees has suggested that your company develop a new product. You decide to take a random sample of your customers and ask whether or not there is interest in the new product. The response is on a 1 to 5 scale with 1 indicating “definitely would not purchase”; 2, “probably would not purchase”; 3, “not sure”; 4, “probably would purchase”; and 5, “definitely would purchase.” For an initial analysis, you will record the responses 1, 2, and 3 as No and 4 and 5 as Yes. What sample size would you use if you wanted the 95% margin of error to be 0.20 or less?
8.27 More information is needed. Refer to the previous exercise. Suppose that after reviewing the results of the previous survey, you proceeded with preliminary development of the product. Now you are at the stage where you need to decide whether or not to make a major investment to produce and market it. You will use another random sample of your customers, but now you want the margin of error to be smaller. What sample size would you use if you wanted the 95% margin of error to be 0.01 or less?
8.28 Sample size needed for an evaluation. You are planning an evaluation of a semester-long alcohol awareness campaign at your college. Previous evaluations indicate that about 25% of the students surveyed will respond Yes to the question “Did the campaign alter your behavior toward alcohol consumption?” How large a sample of students should you take if you want the margin of error for 95% confidence to be about 0.06?
8.29 Find more sample sizes. The evaluation in the previous exercise will also have questions that have not been asked before, so you do not have previous information about the possible value of p. Repeat the preceding calculation for the following values of p*: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Summarize the results in a table and graphically. What sample size will you use?
8.30 Are the customers dissatisfied? An automobile manufacturer would like to know what proportion of its customers are dissatisfied with the service received from their local dealer. The customer relations department will survey a random sample of customers and compute a 95% confidence interval for the proportion who are dissatisfied. From past studies, it believes that this proportion will be about 0.30. Find the sample size needed if the margin of error of the confidence interval is to be no more than 0.04.
8.31 Sample size for coffee. Refer to Exercise 8.24, where we analyzed data from a matched pairs study that compared preferences for instant versus fresh-brewed coffee. Suppose that you want to design a similar study. The null hypothesis is that instant and fresh-brewed are equally likely to be preferred, and the alternative is two-sided. You will use α=0.05. What is the sample size needed to detect a preference of 65% for fresh-brewed with 0.80 probability?
8.32 Sample size for tossing a coin. Refer to Exercise 8.25, where we analyzed the 10,000 coin tosses made by John Kerrich. Suppose that you want to design a study that would test the hypothesis that a coin is fair versus the alternative that the probability of a head is 0.52. Using a two-sided test with α=0.05, what sample size would be needed to have 0.80 power to detect this alternative?
8.33 What’s wrong? For each of the following statements, explain what is wrong and why.
1. The margin of error for a confidence interval used for an opinion poll takes into account the fact that people who did not answer the poll questions would have given the same responses as those who did answer the questions.
2. If the P-value for a significance test is 0.05, we can conclude that the null hypothesis has a 5% chance of being true.
3. A student project used a confidence interval to describe the results in a final report. The confidence level was 115%.