16.73 Sex and GPA.
In
Example 16.7
(page 16-15), you used the bootstrap to find a 95% confidence interval for
the 25% trimmed mean of GPA. Let’s change the statistic of
interest to the 5% trimmed mean. Using
Examples 16.5
through
16.7 as
a guide, find the corresponding 95% confidence interval. Compare
this interval with the one in
Example 16.7.
16.74 Change the trim. Refer to the previous
exercise. Change the statistic of interest to the 10% trimmed
mean. Answer the questions in the previous exercise and also
compare your new interval with the one you found there.
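Exercises 16.73 and 16.74 change only the statistic that is bootstrapped, so the resampling code can stay the same while the trimming fraction varies. The Python sketch below illustrates that idea. It assumes that an x% trimmed mean drops x% of the observations from each end of the sorted data (rounding the count down) and that a percentile interval from 2000 resamples is acceptable. The names `trimmed_mean` and `bootstrap_percentile_ci` and the short `demo` list are hypothetical illustrations; they are not the GPA data and not necessarily the procedure used in Example 16.7.

```python
import random


def trimmed_mean(values, trim):
    """Mean after dropping a fraction `trim` of the observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # observations dropped per end, rounded down
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def bootstrap_percentile_ci(values, statistic, n_resamples=2000, seed=1):
    """Approximate 95% percentile interval for `statistic` by resampling."""
    rng = random.Random(seed)
    boot = sorted(
        statistic(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    lower = boot[n_resamples * 25 // 1000]
    upper = boot[n_resamples * 975 // 1000 - 1]
    return lower, upper


# Made-up placeholder values, NOT the GPA data from the text.
demo = [1.9, 2.4, 2.6, 2.8, 3.0, 3.1, 3.3, 3.5, 3.7, 4.0]

for trim in (0.05, 0.10, 0.25):
    ci = bootstrap_percentile_ci(demo, lambda xs, t=trim: trimmed_mean(xs, t))
    print(f"{trim:.0%} trimmed mean: {trimmed_mean(demo, trim):.3f}, CI: {ci}")
```

With only 10 placeholder values, the 5% trimmed mean drops nothing and equals the ordinary mean; with the real GPA data the three statistics would generally differ.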
16.75 Compare the correlations.
In
Exercise 16.45
(page 16-37), we compared the mean GPA for males and females using the
bootstrap. In
Exercise 16.46, we used the bootstrap to examine the correlation between GPA
and high school math grades. Find the correlations for men and
women separately and determine whether there is evidence that they
differ.
(a) Find the correlation between GPA and high school math grades for the men. Use the bootstrap to find a 95% confidence interval for the population correlation.
(b) Repeat part (a) for the women.
(c) Use the bootstrap to test the null hypothesis that the
population correlations for men and women are the same.
(d) Summarize your findings.
16.76 Use the regression slope. Refer to the
previous exercise, where we used correlations to address the
question of whether or not the relationship between GPA and high
school math grades is the same for men and women. In
Exercise 16.50
(page 16-37), we used the bootstrap to examine the slope of the
least-squares regression line for predicting GPA using high school
math grades. Let’s compute the slope separately for males and
females and determine whether or not they differ. This is another
way to ask the question about whether or not the relationship
between GPA and high school math grades is the same for males and
females. Answer the questions from the previous exercise using the
slope. Compare the results that you find here with those you found
in the previous exercise.
16.77 Bootstrap confidence interval for the difference in proportions. Refer to Exercise 16.70 (page 16-49). We want a 95% confidence interval for the change from 2015 to 2020 in the proportions of U.S. residents who report that they have listened to at least one podcast. Bootstrap the sample data. Give all three bootstrap confidence intervals (t, percentile, and BCa). Compare the three intervals and summarize the results. Which intervals would you recommend? Give reasons for your answer.
16.78 Bootstrap confidence interval for the ratio.
Here is one conclusion from the data in
Table 16.3,
described in
Exercise 16.68: “The mean serum retinol level in uninfected children was 1.255
times the mean level in the infected children. A 95% confidence
interval for the ratio of means in the population of all children
in Papua New Guinea is . . . .”
Bootstrap the data and use the BCa interval to complete this conclusion.
Briefly describe the shape and bias of the bootstrap distribution. Does the bootstrap percentile interval agree closely with the BCa interval for these data?
16.79 Poetry: An occupational hazard. According to William Butler Yeats, “She is the Gaelic muse, for she gives inspiration to those she persecutes. The Gaelic poets die young, for she is restless, and will not let them remain long on earth.” One study designed to investigate this issue examined the age at death for writers from different cultures and sexes.13
In
Example 1.27
(page 34), we
examined the distributions of the age at death for female
novelists, poets, and nonfiction writers.
Figure 1.14
shows modified side-by-side boxplots for the three categories of
writers. The poets do appear to die young! Note that there is an
outlier among the nonfiction writers. This writer died at the age
of 40—young for a nonfiction writer but not for a novelist or a
poet! Let’s use the methods of this chapter to compare the ages at
death for poets and nonfiction writers.
(a) Use numerical and graphical summaries to describe the distribution of age at death for the poets. Do the same for the nonfiction writers.
(b) Use the methods of Chapter 7 (page 417) to compare the means of the two distributions. Summarize your findings.
(c) Use the bootstrap methods of this chapter to compare the means of the two distributions. Summarize your findings.
16.80 Medians for the poets. Refer to the
previous exercise. Use the bootstrap methods of this chapter to
compare the medians of the two distributions. Summarize your
findings and compare them with part (c) of the previous exercise.
16.81 Permutation test for the poets.
Refer to
Exercise 16.79. Answer part (c) of that exercise using the permutation test.
Summarize your findings and compare them with what you found in
Exercise 16.79.
16.82 Variance for poets. Refer to Exercises 16.79 and 16.81.
Instead of comparing means, compare variances using the ratio of sample variances as the statistic. Summarize your findings.
Explain how questions about the equality of standard deviations are related to questions about the equality of variances.
Use the results of this exercise and the previous three
exercises to address the question of whether or not the
distributions of the poets and nonfiction writers are the
same.
16.83 Bootstrap confidence interval for the median.
Most software can generate random numbers that have the uniform
distribution on 0 to 1. For example, Excel has the
RAND()
function (page 168) and R has the
runif()
function. Generate a sample of 50
observations from this distribution.
(a) Figure 4.9 (page 229) shows the density curve of this distribution. What is the population median?
(b) Bootstrap the sample median and describe the bootstrap distribution.
(c) What is the bootstrap standard error? Compute a 95% bootstrap t confidence interval.
(d) Find the 95% BCa confidence interval. Compare with the interval in (c). Is the bootstrap t interval reliable here?
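The first steps of Exercise 16.83 can be sketched in Python, whose `random.random()` plays the role of `RAND()` or `runif()`. This is a sketch under stated assumptions: the seed and the 2000 resamples are arbitrary choices, the bootstrap t interval is taken as statistic ± t* × (bootstrap SE) with n − 1 = 49 degrees of freedom, and a plain percentile interval is shown for comparison because the BCa corrections are normally left to statistical software.

```python
import random
import statistics

rng = random.Random(1683)  # fixed seed so the run is repeatable (arbitrary choice)

# A sample of 50 observations from the uniform distribution on 0 to 1.
sample = [rng.random() for _ in range(50)]
sample_median = statistics.median(sample)

# Resample with replacement, same size as the sample, and record each median.
B = 2000  # number of resamples (an assumption, not set by the exercise)
boot_medians = [
    statistics.median(rng.choices(sample, k=len(sample))) for _ in range(B)
]

# Bootstrap standard error: standard deviation of the bootstrap distribution.
boot_se = statistics.stdev(boot_medians)

# Bootstrap t interval, using the t(49) critical value from a t table.
T_STAR = 2.010
t_interval = (sample_median - T_STAR * boot_se, sample_median + T_STAR * boot_se)

# Percentile interval for comparison (this is NOT the BCa interval).
ordered = sorted(boot_medians)
pct_interval = (ordered[B * 25 // 1000], ordered[B * 975 // 1000 - 1])

print("sample median:", round(sample_median, 3))
print("bootstrap SE:", round(boot_se, 3))
print("bootstrap t interval:", t_interval)
print("percentile interval:", pct_interval)
```

Because a resampled median can take only values from the original sample (or midpoints of adjacent values), a histogram of `boot_medians` is worth inspecting before trusting the t interval.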
16.84 Are female personal trainers, on average,
younger?
A fitness center employs 20 personal trainers. Here are the ages,
in years, of the female and male personal trainers working at this
center:
Male | 25 | 26 | 23 | 32 | 35 | 29 | 30 | 28 | 31 | 32 | 29 |
Female | 21 | 23 | 22 | 23 | 20 | 29 | 24 | 19 | 22 |
Make a back-to-back stemplot. Do you think the difference in mean ages will be significant?
A two-sample t test gives
What do you conclude about using the t test? What do you conclude about the mean ages of the trainers?
16.85 Planning to attend a four-year college. A Pew survey asked U.S. teenagers whether they plan to attend a four-year college.14 For the boys, 51% of 461 survey participants said they planned to attend a four-year college. For the girls, 68% of 454 survey participants said this. Use the bootstrap to find a 95% confidence interval for the difference between the female proportion who said they planned to attend a four-year college and the male proportion.
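One way to set up the bootstrap for Exercise 16.85 is to rebuild each sample as a list of 1s (plans to attend) and 0s and resample each list separately. The Python sketch below does this. The counts 235 of 461 boys and 309 of 454 girls are assumptions obtained by rounding the reported percentages, since the exercise gives only percentages; the seed, the 2000 resamples, and the choice of a percentile interval are also illustrative.

```python
import random

rng = random.Random(1685)  # arbitrary seed for repeatability

# Assumed counts, rounded from the reported 51% of 461 and 68% of 454.
boys = [1] * 235 + [0] * (461 - 235)
girls = [1] * 309 + [0] * (454 - 309)


def proportion(xs):
    return sum(xs) / len(xs)


observed_diff = proportion(girls) - proportion(boys)

# Resample each group independently, with replacement, at its own sample size.
B = 2000
boot_diffs = []
for _ in range(B):
    g = rng.choices(girls, k=len(girls))
    b = rng.choices(boys, k=len(boys))
    boot_diffs.append(proportion(g) - proportion(b))

boot_diffs.sort()
ci = (boot_diffs[B * 25 // 1000], boot_diffs[B * 975 // 1000 - 1])

print("observed difference (female - male):", round(observed_diff, 4))
print("95% percentile interval:", ci)
```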
16.86 Use a ratio for females versus males. Refer to the previous exercise. In many settings, researchers prefer to communicate the comparison of two proportions with a ratio. For teenagers planning to attend a four-year college, they would report that females are 1.33 (68/51) times more likely to say they plan to attend a four-year college. Use the bootstrap to give a 95% confidence interval for this ratio.
16.87 Another way to communicate the result.
Refer to the previous two exercises. Here is another way to
communicate the result: female teenagers are 33% more likely to
say they plan to attend a four-year college than male teenagers.
Explain how the 33% is computed.
Use the bootstrap to give a 95% confidence interval for this estimate.
Based on this exercise and the previous two, which of the three ways is most effective for communicating the results? Give reasons for your answer.
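The ratio in Exercise 16.86 and the "percent more likely" figure in Exercise 16.87 are tied together by simple arithmetic: percent more = (ratio − 1) × 100, so 68/51 ≈ 1.33 corresponds to about 33%. The Python sketch below bootstraps the ratio and converts the interval endpoints the same way. As before, the counts 235 of 461 and 309 of 454 are assumptions rounded from the reported percentages (so the computed ratio differs slightly from 68/51), and the seed, number of resamples, and percentile interval are illustrative choices.

```python
import random

rng = random.Random(1687)  # arbitrary seed for repeatability

# Assumed counts, rounded from the reported 51% of 461 and 68% of 454.
boys = [1] * 235 + [0] * (461 - 235)
girls = [1] * 309 + [0] * (454 - 309)


def ratio(g, b):
    """Female proportion divided by male proportion."""
    return (sum(g) / len(g)) / (sum(b) / len(b))


observed_ratio = ratio(girls, boys)
observed_pct_more = (observed_ratio - 1) * 100

B = 2000
boot_ratios = sorted(
    ratio(rng.choices(girls, k=len(girls)), rng.choices(boys, k=len(boys)))
    for _ in range(B)
)
ratio_ci = (boot_ratios[B * 25 // 1000], boot_ratios[B * 975 // 1000 - 1])

# The same transformation carries the interval over to the percent scale.
pct_ci = tuple((r - 1) * 100 for r in ratio_ci)

print("ratio:", round(observed_ratio, 3), "CI:", ratio_ci)
print("percent more likely:", round(observed_pct_more, 1), "CI:", pct_ci)
```

Because the conversion is increasing in the ratio, the percentile interval for the percentage is just the transformed interval for the ratio; no second bootstrap is needed.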
16.88 Sadness and spending. Refer to
Exercise 7.47
(page 430).
A study of sadness and spending randomized subjects to watch
videos designed to produce sad or neutral moods. Each subject
was given $10, and after watching the video, he or she was asked
to trade $0.50 increments of their $10 for an insulated bottle
of water. Here are the data:
Group | Purchase price
---|---
Neutral | 0.00, 2.00, 0.00, 1.00, 0.50, 0.00, 0.50, 2.00, 1.00, 0.00, 0.00, 0.00, 0.00, 1.00
Sad | 3.00, 4.00, 0.50, 1.00, 2.50, 2.00, 1.50, 0.00, 1.00, 1.50, 1.50, 2.50, 4.00, 3.00, 3.50, 1.00, 3.50
(a) Use the two-sample t significance test (page 416) to compare the means of the two groups. Summarize your results.
(b) Use the pooled two-sample t significance test (page 423) to compare the means of the two groups. Summarize your results.
(c) Use a permutation test to compare the two groups. Summarize your results.
(d) Discuss the differences among the results you found for parts (a), (b), and (c). Which method do you prefer? Give reasons for your answer.
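The permutation test for the sadness data can be sketched in Python as follows. The sketch makes several assumptions that the exercise leaves open: the statistic is the difference in means (sad minus neutral), the test is two-sided, 10,000 random regroupings stand in for all possible ones, and the P-value uses the common (count + 1)/(resamples + 1) convention.

```python
import random

neutral = [0.00, 2.00, 0.00, 1.00, 0.50, 0.00, 0.50,
           2.00, 1.00, 0.00, 0.00, 0.00, 0.00, 1.00]
sad = [3.00, 4.00, 0.50, 1.00, 2.50, 2.00, 1.50, 0.00, 1.00,
       1.50, 1.50, 2.50, 4.00, 3.00, 3.50, 1.00, 3.50]


def mean(xs):
    return sum(xs) / len(xs)


observed = mean(sad) - mean(neutral)

rng = random.Random(1688)  # arbitrary seed for repeatability
pooled = sad + neutral
n_sad = len(sad)
N_PERM = 10000  # number of random regroupings (an assumption)
extreme = 0
for _ in range(N_PERM):
    # Under the null hypothesis the group labels are arbitrary, so shuffle
    # the pooled values and split them into groups of the original sizes.
    rng.shuffle(pooled)
    diff = mean(pooled[:n_sad]) - mean(pooled[n_sad:])
    if abs(diff) >= abs(observed):
        extreme += 1

# (count + 1) / (resamples + 1) keeps the estimated P-value above zero.
p_value = (extreme + 1) / (N_PERM + 1)

print("observed difference in means:", round(observed, 3))
print("two-sided permutation P-value:", p_value)
```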
16.89 Comparing the variances for sadness and spending.
Refer to the previous exercise. Some treatments in randomized
experiments such as this can cause variances to be different.
Are the variances of the neutral and sad subjects equal?
Compute the ratio
Compare the variances using a permutation test. Summarize your results.
Write a short paragraph comparing the F test with the permutation test for these data.
16.90 Insurance fraud? Jocko’s Garage has been
accused of insurance fraud. Data on estimates (in dollars) made
by Jocko and another garage were obtained for 10 damaged
vehicles. Here is what the investigators found:
Car | 1 | 2 | 3 | 4 | 5 |
Jocko’s | 1375 | 1550 | 1250 | 1300 | 900 |
Other | 1250 | 1300 | 1250 | 1200 | 950 |
Car | 6 | 7 | 8 | 9 | 10 |
Jocko’s | 1500 | 1750 | 3600 | 2250 | 2800 |
Other | 1575 | 1600 | 3300 | 2125 | 2600 |
Compute the mean estimate for Jocko and the mean estimate for the other garage. Report the difference in the means and the 95% standard t confidence interval. Be sure to choose the appropriate t procedure for your analysis and explain why you made this choice.
Use the bootstrap to find the confidence interval. Be sure to give details about how you used the bootstrap, which options you chose, and why.
Compare the t interval with the bootstrap interval.
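Because both garages gave estimates for the same 10 vehicles, one natural analysis works with the within-vehicle differences. The Python sketch below takes that route as an assumption, not as the required answer: it computes a matched pairs t interval using the t(9) critical value 2.262 from a t table, then a bootstrap percentile interval from 2000 resamples of the differences. The seed, the number of resamples, and the interval type are illustrative choices.

```python
import math
import random
import statistics

jocko = [1375, 1550, 1250, 1300, 900, 1500, 1750, 3600, 2250, 2800]
other = [1250, 1300, 1250, 1200, 950, 1575, 1600, 3300, 2125, 2600]

# Within-vehicle differences (Jocko's estimate minus the other garage's).
diffs = [j - o for j, o in zip(jocko, other)]
mean_diff = statistics.mean(diffs)

# Matched pairs t interval with n - 1 = 9 degrees of freedom.
se = statistics.stdev(diffs) / math.sqrt(len(diffs))
T_STAR = 2.262  # upper 0.025 critical value of t(9)
t_interval = (mean_diff - T_STAR * se, mean_diff + T_STAR * se)

# Bootstrap: resample the 10 differences with replacement.
rng = random.Random(1690)  # arbitrary seed for repeatability
B = 2000
boot_means = sorted(
    statistics.mean(rng.choices(diffs, k=len(diffs))) for _ in range(B)
)
percentile_interval = (boot_means[B * 25 // 1000], boot_means[B * 975 // 1000 - 1])

print("mean difference:", mean_diff)
print("matched pairs t interval:", t_interval)
print("bootstrap percentile interval:", percentile_interval)
```

Resampling whole differences keeps each vehicle's two estimates together, which is what a paired design requires.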
16.91 Other ways to look at Jocko’s estimates.
Refer to the previous exercise. Let’s consider some other ways
to analyze these data.
For each damaged vehicle, divide Jocko’s estimate by the estimate from the other garage. Perform your analysis on these data. Write a short report that includes numerical and graphical summaries, your estimate, the 95% t confidence interval, the 95% bootstrap confidence interval, and an explanation for all choices (such as whether you chose to examine the mean or the median, bootstrap options, etc.).
Compute the mean of Jocko’s estimates and the mean of the estimates made by the other garage. Divide Jocko’s mean by the mean for the other garage. Report this ratio and find a 95% confidence interval for this quantity. Be sure to justify choices that you made for the bootstrap.
Using what you have learned in this exercise and the previous one, how would you summarize the comparison of Jocko’s estimates with those made by the other garage? Assume that your audience knows very little about statistics but a lot about insurance.
16.92 Comparing two operators.
Exercise 7.29
(page 409)
gives these data on a delicate measurement of total body bone
mineral content made by two operators on the same eight
subjects:15
Operator | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 | Subject 8
---|---|---|---|---|---|---|---|---|
1 | 1.328 | 1.342 | 1.075 | 1.228 | 0.939 | 1.004 | 1.178 | 1.286 |
2 | 1.323 | 1.322 | 1.073 | 1.233 | 0.934 | 1.019 | 1.184 | 1.304 |
Do permutation tests give good evidence that measurements made by the two operators differ systematically? If so, in what way do they differ? Do two tests: one that compares centers and one that compares spreads.
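With matched pairs, a permutation test regroups within each subject: under the null hypothesis, the two measurements on a subject could equally well have come from either operator. With eight subjects there are only 2^8 = 256 such swap patterns, so they can all be enumerated. The Python sketch below does this for two statistics chosen as assumptions: the difference in means for centers and the difference in standard deviations for spreads, each tested two-sided.

```python
import itertools
import statistics

op1 = [1.328, 1.342, 1.075, 1.228, 0.939, 1.004, 1.178, 1.286]
op2 = [1.323, 1.322, 1.073, 1.233, 0.934, 1.019, 1.184, 1.304]


def center_stat(a, b):
    return statistics.mean(a) - statistics.mean(b)


def spread_stat(a, b):
    return statistics.stdev(a) - statistics.stdev(b)


obs_center = center_stat(op1, op2)
obs_spread = spread_stat(op1, op2)

# Every way of swapping (True) or keeping (False) the pair for each subject.
patterns = list(itertools.product([False, True], repeat=len(op1)))

TOL = 1e-12  # guards against floating-point noise when values tie
count_center = 0
count_spread = 0
for swaps in patterns:
    a = [y if s else x for x, y, s in zip(op1, op2, swaps)]
    b = [x if s else y for x, y, s in zip(op1, op2, swaps)]
    if abs(center_stat(a, b)) >= abs(obs_center) - TOL:
        count_center += 1
    if abs(spread_stat(a, b)) >= abs(obs_spread) - TOL:
        count_spread += 1

p_center = count_center / len(patterns)
p_spread = count_spread / len(patterns)

print("difference in means:", round(obs_center, 4), "P-value:", p_center)
print("difference in SDs:", round(obs_spread, 4), "P-value:", p_spread)
```

Since every pattern is enumerated, these P-values are exact for the chosen statistics; the smallest possible two-sided P-value with eight pairs is 2/256.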