Chapter 8 Exercises: Inference for Proportions

8.53 The future of gamification. Gamification is an interactive design that includes rewards such as points, payments, and gifts. A Pew survey of 1021 technology stakeholders and critics was conducted to predict the future of gamification. A report on the survey said that 42% of those surveyed thought that there would be no major increases. On the other hand, 53% said that they believed that there would be significant advances in the adoption and use of gamification.²⁵ Analyze these data using the methods that you learned in this chapter and write a short report summarizing your work.
8.54 Where do you get your news? A report produced by the Pew Research Center’s Project for Excellence in Journalism summarized the results of a survey on how people get their news. Of the 2342 people in the survey who own a desktop or laptop, 1639 reported that they get their news from the desktop or laptop.²⁶
1. Identify the sample size and the count.
2. Find the sample proportion and its standard error.
3. Find and interpret the 95% confidence interval for the population proportion.
4. Are the guidelines for use of the large-sample confidence interval satisfied? Explain your answer.
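The calculations in Exercise 8.54 can be sketched in a few lines of Python. This is a minimal illustration, not an official solution: the helper names `proportion_ci` and `large_sample_ok` are hypothetical, and the threshold of 10 successes and 10 failures is an assumed guideline that you should check against the one stated in the chapter.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(count, n, conf=0.95):
    """Large-sample confidence interval for a single proportion."""
    p_hat = count / n
    se = sqrt(p_hat * (1 - p_hat) / n)                 # standard error of p_hat
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    margin = z_star * se
    return p_hat, se, (p_hat - margin, p_hat + margin)


def large_sample_ok(count, n, min_count=10):
    """Assumed guideline: at least `min_count` successes and failures."""
    return count >= min_count and (n - count) >= min_count


count, n = 1639, 2342  # news from desktop or laptop, Exercise 8.54
p_hat, se, (low, high) = proportion_ci(count, n)
print(f"p_hat = {p_hat:.4f}, SE = {se:.4f}")
print(f"95% CI: ({low:.4f}, {high:.4f})")
print("guideline satisfied:", large_sample_ok(count, n))
```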

8.55 Is the calcium intake adequate? Young children need calcium in their diet to support the growth of their bones. The Institute of Medicine provides guidelines for how much calcium should be consumed by people of different ages.²⁷ One study examined whether or not a sample of children consumed an adequate amount of calcium based on these guidelines. Because there are different guidelines for children aged 5 to 10 years and those aged 11 to 13 years, the children were classified into these two age groups. Each child’s calcium intake was classified as meeting or not meeting the guideline. There were 2029 children in the study. Here are the data:²⁸

	Age (years)	Age (years)
Met requirement	5 to 10	11 to 13
No	194	557
Yes	861	417

Identify the populations, the counts, and the sample sizes for comparing the extent to which the two age groups of children met the calcium intake requirement.

8.56 Use a confidence interval for the comparison. Refer to the previous exercise. Use a 95% confidence interval for the comparison and explain what the confidence interval tells us. Be sure to include a justification for the use of the large-sample procedure for this comparison.
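One possible way to set up the interval for Exercise 8.56 is sketched below. The function name `two_proportion_ci` is hypothetical, and the code assumes the usual unpooled standard error for a difference between two independent sample proportions. The counts come from the "Yes" row and the column totals of the table in Exercise 8.55.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff, (diff - z_star * se, diff + z_star * se)


# Children who met the requirement, and group sizes (No + Yes).
x_young, n_young = 861, 194 + 861   # ages 5 to 10
x_older, n_older = 417, 557 + 417   # ages 11 to 13

diff, (low, high) = two_proportion_ci(x_young, n_young, x_older, n_older)
print(f"difference = {diff:.4f}, 95% CI = ({low:.4f}, {high:.4f})")
```

An interval that lies entirely above zero would indicate that the younger group met the requirement at a higher rate than the older group.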
8.57 Use a significance test for the comparison. Refer to Exercise 8.55. Use a significance test to make the comparison. Interpret the result of your test. Be sure to include a justification for the use of the large-sample procedure for this comparison.
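For the significance test in Exercise 8.57, a sketch under the usual assumptions (independent samples, pooled estimate of the common proportion under the null hypothesis, two-sided alternative) might look like the following. The name `two_proportion_z_test` is hypothetical. The P-value is computed with `math.erfc` so that very small tail areas do not round to zero.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # estimate of the common p under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))        # equals 2 * P(Z >= |z|)
    return z, p_value


# Counts from the table in Exercise 8.55.
z, p_value = two_proportion_z_test(861, 1055, 417, 974)
print(f"z = {z:.2f}, P-value = {p_value:.3g}")
```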
8.58 Confidence interval or significance test? Refer to Exercises 8.55, 8.56, and 8.57. Do you prefer to use the confidence interval or the significance test for this comparison? Give reasons for your answer.
8.59 Changing majors during college. In a simple random sample of 975 students from a large public university, it was found that 463 of the students changed majors during their college years.
1. Give a 95% confidence interval for the proportion of students at this university who change majors.
2. Express your results from part 1 in terms of the percent of students who change majors.
3. University officials concerned with counseling students are interested in the number of students who change majors rather than the proportion. The university has 37,500 undergraduate students. Convert the confidence interval you found in part 1 to a confidence interval for the number of students who change majors during their college years.
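The conversions in Exercise 8.59 amount to rescaling the endpoints of the interval. The short sketch below assumes the large-sample interval for a single proportion. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

changed, n = 463, 975        # sample results from Exercise 8.59
population = 37_500          # undergraduates at the university

p_hat = changed / n
margin = NormalDist().inv_cdf(0.975) * sqrt(p_hat * (1 - p_hat) / n)

ci_proportion = (p_hat - margin, p_hat + margin)
ci_percent = tuple(100 * p for p in ci_proportion)        # multiply by 100
ci_count = tuple(population * p for p in ci_proportion)   # multiply by N

print("proportion:", tuple(round(v, 4) for v in ci_proportion))
print("percent:   ", tuple(round(v, 2) for v in ci_percent))
print("count:     ", tuple(round(v) for v in ci_count))
```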
8.60 Facebook users. A Pew survey of 1802 Internet users found that 67% used Facebook.²⁹
1. How many of those surveyed used Facebook?
2. Give a 95% confidence interval for the proportion of Internet users who used Facebook.
3. Convert the confidence interval that you found in part 2 to a confidence interval for the percent of Internet users who used Facebook.
8.61 Twitter users. Refer to the previous exercise. The same survey reported that 16% of Internet users used Twitter. Answer the questions in the previous exercise for Twitter use.
8.62 Facebook versus Twitter. Refer to Exercises 8.60 and 8.61. Can you use the data provided in these two exercises to compare the proportion of Facebook users with the proportion of Twitter users? If your answer is Yes, do the comparison. If your answer is No, explain why you cannot make the comparison.

8.63 Video game genres. A survey of 1102 teens collected data about video game use by teens.³⁰ According to the survey, the following are the most popular game genres:

Genre	Examples	Percent who play
Racing	NASCAR, Mario Kart, Burnout	74
Puzzle	Bejeweled, Tetris, Solitaire	72
Sports	Madden, FIFA, Tony Hawk	68
Action	Grand Theft Auto, Devil May Cry, Ratchet and Clank	67
Adventure	Legend of Zelda, Tomb Raider	66
Rhythm	Guitar Hero, Dance Dance Revolution, Lumines	61

Give a 95% confidence interval for the proportion who play games in each of these six genres.

8.64 Too many errors. Refer to the previous exercise. The chance that each of the six intervals that you calculated includes the true proportion for that genre is approximately 95%. In other words, the chance that your interval misses the true value is approximately 5%.
1. Explain why the chance that at least one of your intervals does not contain the true value of the parameter is greater than 5%.
2. One way to deal with this problem is to adjust the confidence level for each interval so that the overall probability of at least one miss is at most 5%. One simple way to do this is to use a Bonferroni procedure (page 374). Here is the basic idea: You have an error budget of 5% and you choose to spend it equally on six intervals. Each interval has a budget of 0.05/6 ≈ 0.008. So, each confidence interval should have a 0.8% chance of missing the true value. In other words, the confidence level for each interval should be 1 − 0.008 = 0.992. Use Table A to find the value of z* for a large-sample confidence interval for a single proportion corresponding to 99.2% confidence.
3. Calculate the six confidence intervals using the Bonferroni procedure.
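The Bonferroni idea in Exercise 8.64 can be sketched in code as follows. This sketch uses software instead of Table A to find z*, follows the exercise in rounding the per-interval error budget to 0.008, and assumes the large-sample interval for a single proportion. The dictionary `genres` simply restates the percents from the table in Exercise 8.63.

```python
from math import sqrt
from statistics import NormalDist

n = 1102
genres = {"Racing": 0.74, "Puzzle": 0.72, "Sports": 0.68,
          "Action": 0.67, "Adventure": 0.66, "Rhythm": 0.61}

# Split the 5% error budget equally across the six intervals.
per_interval_error = round(0.05 / len(genres), 3)          # 0.008, as in the text
z_bonf = NormalDist().inv_cdf(1 - per_interval_error / 2)  # 99.2% confidence
z_95 = NormalDist().inv_cdf(0.975)                         # ordinary 95% interval

intervals = {}
for genre, p_hat in genres.items():
    se = sqrt(p_hat * (1 - p_hat) / n)
    intervals[genre] = {
        "95%": (p_hat - z_95 * se, p_hat + z_95 * se),
        "Bonferroni": (p_hat - z_bonf * se, p_hat + z_bonf * se),
    }
    lo, hi = intervals[genre]["Bonferroni"]
    print(f"{genre:<10} {lo:.3f} to {hi:.3f}")
```

Each Bonferroni interval is wider than the corresponding 95% interval, which is the price paid for controlling the chance of at least one miss across all six.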
8.65 Changes in credit card usage by undergraduates. In Exercise 8.18 (page 466), we looked at data from a survey of 1430 undergraduate students and their credit card use. In the sample, 43% said that they had four or more credit cards. A similar study of a different sample of undergraduates performed four years earlier by the same organization reported that 32% of the sample said that they had four or more credit cards.³¹ Assume that the sample sizes for the two studies are the same. Find a 95% confidence interval for the change in the percent of undergraduates who report having four or more credit cards.
8.66 Do the significance test for the change. Refer to the previous exercise. Perform the significance test for comparing the two proportions. Report your test statistic, the P-value, and summarize your conclusion.
8.67 We did not know the sample size. Refer to the previous two exercises. We did not report the sample size for the earlier study, but it is reasonable to assume that it is close to the sample size for the later study.
1. Suppose that the sample size for the earlier study was only 800. Redo the confidence interval and significance test calculations for this scenario.
2. Suppose that the sample size for the earlier study was 2500. Redo the confidence interval and significance test calculations for this scenario.
3. Compare your results for parts 1 and 2 of this exercise with the results that you found in the previous two exercises. Write a short paragraph about the effects of assuming a value for the sample size on your conclusions.
8.68 Student employment during the school year. A study of 1425 undergraduate students reported that 930 work 10 or more hours a week during the school year. Give a 95% confidence interval for the proportion of all undergraduate students who work 10 or more hours a week during the school year.
8.69 Examine the effect of the sample size. Refer to the previous exercise. Assume a variety of different scenarios where the sample size changes, but the proportion in the sample who work 10 or more hours a week during the school year remains the same. Write a short report summarizing your results and conclusions. Be sure to include numerical and graphical summaries of what you have found.
8.70 Sample size and the P-value. In this exercise, we examine the effect of the sample size on the significance test for comparing two proportions. In each case, suppose that p̂₁ = 0.70 and p̂₂ = 0.50, and take n to be the common value of n₁ and n₂. Use the z statistic to test H₀: p₁ = p₂ versus the alternative Hₐ: p₁ ≠ p₂. Compute the statistic and the associated P-value for the following values of n: 50, 70, 100, 600, and 1200. Summarize the results in a table. Explain what you observe about the effect of the sample size on statistical significance when the sample proportions p̂₁ and p̂₂ are unchanged.
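A table like the one requested in Exercise 8.70 could be produced with a short loop. The sketch below assumes the pooled two-sample z statistic with equal sample sizes, so the pooled proportion is the average of the two sample proportions. The function name `z_and_p` is hypothetical.

```python
from math import erfc, sqrt


def z_and_p(p1, p2, n):
    """Pooled z statistic and two-sided P-value for equal sample sizes n."""
    pooled = (p1 + p2) / 2                      # valid because n1 = n2 = n
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2))            # two-sided P-value


rows = [(n, *z_and_p(0.70, 0.50, n)) for n in (50, 70, 100, 600, 1200)]

print(f"{'n':>6} {'z':>8} {'P-value':>12}")
for n, z, p in rows:
    print(f"{n:>6} {z:>8.3f} {p:>12.3g}")
```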
8.71 Sample size and the margin of error. In Section 8.1, we studied the effect of the sample size on the margin of error of the confidence interval for a single proportion. In this exercise, we perform some calculations to observe this effect for the two-sample problem. Suppose that p̂₁ = 0.8 and p̂₂ = 0.4 and n represents the common value of n₁ and n₂. Compute the 95% margins of error for the difference between the two proportions for n = 50, 100, 500, and 1000. Present the results in a table and with a graph. Write a short summary of your findings.
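The margins of error in Exercise 8.71 follow from the unpooled standard error of the difference, m = z* √(p̂₁(1 − p̂₁)/n + p̂₂(1 − p̂₂)/n). A minimal sketch, assuming that formula and using the distinct sample sizes listed in the exercise (the name `margin_of_error` is hypothetical):

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p1, p2, n, conf=0.95):
    """Margin of error for p1_hat - p2_hat with a common sample size n."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z_star * sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)


margins = {n: margin_of_error(0.8, 0.4, n) for n in (50, 100, 500, 1000)}

print(f"{'n':>6} {'margin':>8}")
for n, m in margins.items():
    print(f"{n:>6} {m:>8.4f}")
```

Because n appears under a square root, multiplying the sample size by 4 cuts the margin of error in half.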
8.72 Calculating sample sizes for the two-sample problem. For a single proportion, the margin of error of a confidence interval is largest for any given sample size n and confidence level C when p̂ = 0.5. This led us to use p* = 0.5 for planning purposes. The same kind of result is true for the two-sample problem. The margin of error of the confidence interval for the difference between two proportions is largest when p̂₁ = p̂₂ = 0.5. You are planning a survey and will calculate a 95% confidence interval for the difference between two proportions when the data are collected. You would like the margin of error of the interval to be less than or equal to 0.04. You will use the same sample size n for both populations.
1. How large a value of n is needed?
2. Give a general formula for n in terms of the desired margin of error m and the critical value z*.
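One way to approach Exercise 8.72 is to set both planning proportions to p* = 0.5 in the margin-of-error formula, m = z* √(2 p*(1 − p*)/n), and solve for n. The sketch below assumes that formula and rounds up to the next whole number. The function name `required_n` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_n(margin, conf=0.95, p_star=0.5):
    """Common sample size so the two-sample margin of error is <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # m = z* sqrt(2 p*(1 - p*) / n)  =>  n = 2 p*(1 - p*) (z*/m)^2
    return ceil(2 * p_star * (1 - p_star) * (z_star / margin) ** 2)


n_needed = required_n(0.04)

# Check: the achieved margin at n_needed should not exceed the target.
z_star = NormalDist().inv_cdf(0.975)
achieved = z_star * sqrt(2 * 0.25 / n_needed)
print(n_needed, round(achieved, 5))
```

With p* = 0.5 the expression simplifies to n = (z*)² / (2m²), which is one form the general formula in part 2 could take.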
8.73 A corporate liability trial. A major court case on the health effects of drinking contaminated water took place in the town of Woburn, Massachusetts. A town well in Woburn was contaminated by industrial chemicals. During the period that residents drank water from this well, there were 16 birth defects among 414 births. In years when the contaminated well was shut off and water was supplied from other wells, there were 3 birth defects among 228 births. The plaintiffs suing the firm responsible for the contamination claimed that these data show that the rate of birth defects was higher when the contaminated well was in use.³² How statistically significant is the evidence? What assumptions does your analysis require? Do these assumptions seem reasonable in this case?
8.74 Statistics and the law. Castaneda v. Partida is an important court case in which statistical methods were used as part of a legal argument.³³ When reviewing this case, the Supreme Court used the phrase “two or three standard deviations” as a criterion for statistical significance. This Supreme Court review has served as the basis for many subsequent applications of statistical methods in legal settings. (The 2 or 3 standard deviations referred to by the Court are values of the z statistic and correspond to P-values of approximately 0.05 and 0.0026.) In Castaneda, the plaintiffs alleged that the method for selecting juries in a county in Texas was biased against Mexican Americans. For the period of time at issue, there were 181,535 persons eligible for jury duty, of whom 143,611 were Mexican Americans. Of the 870 people selected for jury duty, 339 were Mexican Americans.
1. What proportion of eligible jurors were Mexican Americans? Let this value be p₀.
2. Let p be the probability that a randomly selected juror is a Mexican American. The null hypothesis to be tested is H₀: p = p₀. Find the value of p̂ for this problem, compute the z statistic, and find the P-value. What do you conclude? (A finding of statistical significance in this circumstance does not constitute proof of discrimination. It can be used, however, to establish a prima facie case. The burden of proof then shifts to the defense.)
3. We can reformulate this exercise as a two-sample problem. Here we wish to compare the proportion of Mexican Americans among those selected as jurors with the proportion of Mexican Americans among those not selected as jurors. Let p₁ be the probability that a randomly selected juror is a Mexican American and let p₂ be the probability that a randomly selected nonjuror is a Mexican American. Find the z statistic and its P-value. How do your answers compare with your results in part 2?
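The two formulations in Exercise 8.74 can be compared side by side. The sketch below assumes the standard one-sample z statistic, with the standard error computed under the null hypothesis, and the pooled two-sample z statistic. Both use two-sided P-values computed with `math.erfc` so that extremely small values are not rounded to zero. All variable names are illustrative.

```python
from math import erfc, sqrt

eligible, eligible_ma = 181_535, 143_611   # eligible persons, Mexican Americans
jurors, jurors_ma = 870, 339               # selected jurors, Mexican Americans

# One-sample test of H0: p = p0.
p0 = eligible_ma / eligible
p_hat = jurors_ma / jurors
z_one = (p_hat - p0) / sqrt(p0 * (1 - p0) / jurors)
p_one = erfc(abs(z_one) / sqrt(2))

# Two-sample test: jurors versus nonjurors.
non, non_ma = eligible - jurors, eligible_ma - jurors_ma
p1, p2 = jurors_ma / jurors, non_ma / non
pooled = eligible_ma / eligible            # all successes over all persons
se = sqrt(pooled * (1 - pooled) * (1 / jurors + 1 / non))
z_two = (p1 - p2) / se
p_two = erfc(abs(z_two) / sqrt(2))

print(f"p0 = {p0:.4f}, p_hat = {p_hat:.4f}")
print(f"one-sample: z = {z_one:.2f}, P = {p_one:.3g}")
print(f"two-sample: z = {z_two:.2f}, P = {p_two:.3g}")
```

Because the nonjuror group is so much larger than the juror group, its sample proportion is nearly identical to p₀, and the two z statistics come out close to each other.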

PUTTING IT ALL TOGETHER

8.75 Wallets with money in Canada. Refer to Example 8.11, where we examined U.S. data for a study of returns of lost wallets with no money and with money. Data were also collected for people in other countries. For Canada, 200 wallets were used for each condition. The wallets with money contained 16.50 Canadian dollars. Of the 200 lost wallets with money, 126 were returned. Of the 200 lost wallets without money, 92 were returned. Analyze the Canadian data for wallets with and without money using the methods you learned in this chapter. Summarize your work in a short report.
8.76 Wallets in Poland. The wallet study also collected data in Poland, with 200 wallets in each of the two conditions. For the wallets with no money, 130 were returned; for wallets with money, 138 were returned. For this study, the wallets with money contained 25 Polish zloty. Analyze these data using the methods you learned in this chapter and summarize your work in a short report.