Chapter 10 CHECK-IN QUESTIONS

  1. 10.1

    1. The slope is 0.008.

    2. For every milligram of calcium in the supplement, blood pressure increases by 0.008 mmHg.

    3. The intercept is -0.2.

    4. Without any calcium added to the supplement, blood pressure decreases by 0.2 mmHg.

  2. 10.3

    1. 23.3555.

    2. 0.9445.

    3. There were two data points below 4000 steps per day. 10,000 steps per day is inside the range, so predictions should be good. 16,000 steps is beyond the high end of the data range; this would be extrapolation.

  3. 10.5

    1. df=18, 95% CI=(0.03, 2.77).

    2. df=30, 95% CI=(0.06, 4.34).

    3. df=14, 95% CI=(-0.05, 4.45).

  4. 10.7 7.3. As the sample size increases, the margin of error decreases.

  5. 10.9 It is the square root of 13.36, which is the MSE.

  6. 10.11 The sum of the residuals is no longer zero.
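
The arithmetic behind Check-in 10.9 can be sketched in a few lines. This is only an illustration: the MSE value 13.36 comes from the answer above, and the variable names `mse` and `s` are chosen here for convenience.

```python
import math

# MSE as quoted in the answer to Check-in 10.9.
mse = 13.36

# The regression standard error is the square root of the MSE.
s = math.sqrt(mse)
print(round(s, 3))
```

The result is about 3.66, in the same units as the response variable.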

Chapter 10 EXERCISES

  1. 10.1

    1. The parameters of the regression model are β0, β1, and ϵ; those given are estimates of these.

    2. It should be H0:β1=0.

    3. The prediction interval will be wider than the mean response interval.

  2. 10.3

    1. 0.08. When the U.S. market is flat, the overseas returns will be 0.08.

    2. 0.20. For each unit increase in U.S. return, the mean overseas return will increase by 0.20.

    3. Overseas return=β0+β1×USReturn+ϵ. The ϵ allows overseas returns to vary when the U.S. returns remain the same.

  3. 10.5 Prediction intervals concern individuals instead of means. Departures from the Normal distribution assumption would be more severe here.

  4. 10.7 Because the list was narrowed before we took our SRS, our sample really only reflects the schools that met the “academic quality” criteria and not all 500+ colleges.

  5. 10.9

    1. $26,587.61.

    2. $23,736.64.

    3. The margin of error would be larger for James Madison.

  6. 10.11

    1. All relationships look approximately linear.

    2. Explanatory variable   s         P-value   Observations removed
       InCostAid              4653.41   0.0059    No
       Admit                  5010.25   0.0463    No
       Grad4Rate              4920.01   0.0275    No
       InCost                 4919.49   0.0274    No

    3. The best single variable looks like InCostAid.

  7. 10.13

    1. ŷ=8.25464+0.11259 EDUC. The points in the residual plot look scattered and random, so the assumptions are satisfied. The histogram shows that the residuals are Normally distributed.

    2. Yes, the log transformed data can effectively be used for inference.

  8. 10.15

    1. A linear trend looks reasonable; nothing unusual.

    2. ŷ=22393+11.75828 Year.

    3. The intercept only describes what happens when x is 0, which is far outside the range of our data.

    4. The residual plot looks mostly random.

    5. The residuals are approximately Normal, as shown in the Normal quantile plot.

    6. Yes.

  9. 10.17

    1. The points are much closer to a straight line.

    2. (0.37305, 0.43643).

  10. 10.19

    1. The null hypothesis should test the slope β1.

    2. Sums of squares add; mean squares do not.

    3. The r² value determines explanatory power.

    4. The total df is equal to n-1.

  11. 10.21

    1. Spending is increasing linearly over time.

    2. ŷ=-3535.54+1.77 Year.

    3. 0.13, 0.04, 0.31, 0.22; s=0.28548.

    4. yi=β0+β1xi+ϵi, ϵi~N(0, σ). b0=-3535.54; b1=1.77; s=0.28548.

    5. (33.18, 35.92).

  12. 10.23

    1. t=2.14, 0.01<P-value<0.02.

    2. States with more adult binge drinking are more likely to have underage drinking; 10.24% of the variation in underage drinking can be accounted for by the prevalence of adult binge drinking.

    3. Even though most states were used, it is assumed that sampling took place for each state; thus, we can still infer about the true unknown correlation.

  13. 10.25

    Source           DF   SS        MS       F
    Regression        1    6427.4   6427.4   39.82
    Residual error   29    4681.1    161.4
    Total            30   11108.5

  14. 10.27

    1. 0.15619.

    2. (0.4898, 1.1362).

    3. It tells us what the EAFE is when there is no return in U.S. markets.

    4. 3.377; (-10.177, 3.797).

  15. 10.29 The first plot shows nonconstant variance. The second plot also shows nonconstant variance. The third plot has no violations. The fourth plot has a nonlinear pattern.

  16. 10.31

    1. ŷ=12.81-0.05 Year.

    2. 0.0384.

    3. There is not evidence that temperature is associated with performance.

  17. 10.33 Answers will vary.

  18. 10.35

    1. H0: β1=0, Ha: β1≠0.

    2. t=16.28, P-value<0.0001.

    3. r²=89.53%.

    4. (0.85955, 1.10611).

  19. 10.37

    1. ŷ=3862.7734+1.05511x.

    2. The residual plot looks good; the assumptions are valid.

  20. 10.39

    1. Both distributions are right-skewed; the five-number summaries are 0%, 0.31%, 1.43%, 17.65%, 85.01% and 0, 2.25, 6.31, 12.69, 27.88.

    2. Only the residuals need to be Normal.

    3. The relationship is quite scattered.

    4. ŷ=6.24693+0.10634x.

    5. The residuals are right-skewed.

  21. 10.41 Answers will vary.

  22. 10.43

    1. 8.41%.

    2. H0: ρ=0, Ha: ρ≠0. t=9.12, P-value<0.0001.

    3. Students who did not answer might have different characteristics.

  23. 10.45

    1. IBI is slightly left-skewed; x̄=65.94, s=18.28; Forest is slightly right-skewed; x̄=39.39, s=32.20.

    2. A weak positive association.

    3. yi=β0+β1xi+ϵi, ϵi~N(0, σ).

    4. H0: β1=0, Ha: β1≠0.

    5. ŷ=59.91+0.1531 Forest, s=17.79. t=1.92, P-value=0.0608.

    6. The residual plot shows that there is more variation for small x.

    7. The residuals seem reasonably close to Normal.

  24. 10.47 The first change decreases P (that is, the relationship is more significant) because it accentuates the positive association. The second change weakens the association, so P increases (the relationship is less significant).

  25. 10.49 Using area: 57.52; (23.5598, 91.4892). Using forest: 69.55; (33.2085, 105.9006). Both prediction intervals have a lot of error.

  26. 10.51

    1. Very linear.

    2. ŷ=61.12+9.3187 Year; r²=98.8%.

    3. (8.3562, 10.2812).

  27. 10.53

    1. 121.

    2. ŷ=1066.48.

    3. Prediction interval.

  28. 10.55 For n=15, t=2.08; for n=25, t=2.77. The P-values are 0.0579 and 0.0109. Finding the same correlation with more data points provides stronger evidence that the observed correlation is not just due to chance.

  29. 10.57

    1. Strong, positive linear relationship with one outlier.

    2. ŷ=1.63+0.0214 SAT. t=10.78, P-value<0.0005.

    3. r=0.8167.

  30. 10.59

    1. a1=0.02617, a0=2.7522.

    2. ȳ=21.13 and sy=4.7137.

  31. 10.61 For n=123: between 0.24 and 0.28 have a P-value<0.01; between 0.20 and 0.23 have a P-value<0.05. For n=96: between 0.22 and 0.24 have a P-value<0.05; the others are not significant.

  32. 10.63

    1. For women: (14.72609, 33.32604). For men: (9.46079, 42.96351). These intervals overlap quite a bit.

    2. For women: 22.78. For men: 16.38. The women’s standard error is smaller in part because it is divided by a larger n.

    3. Choose men with a wider variety of lean body masses.

  33. 10.65

    1. 30.

    2. The relationship is linear, positive, and strong.

    3. House 27 is unusual and could be influential.

    4. ŷ=9.0176+1.15705x, s=37.34442.

    5. ŷ=9.43181+1.123x, s=25.39177.

    6. The outlier has some influence; the first model has a much larger standard error.

  34. 10.67

    1. The relationship is linear and positive.

    2. There may be an influential point. The residual plots do not look evenly spread.

    3. ŷ=21.39844+0.07659x, s=33.84516.

    4. (0.04249, 0.11069).
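
Several of the numerical answers above can be re-derived from quantities printed in this key. The sketch below is one way to check three of them: the regression standard error in Exercise 10.21, the ANOVA table in Exercise 10.25, and the t statistics in Exercise 10.55. The function names are hypothetical and chosen for illustration. The correlation r = 0.5 used for Exercise 10.55 is an assumption: it is not stated in this key, but it is the value that reproduces both printed t statistics. Only the magnitudes of the residuals are needed for 10.21, because each residual is squared.

```python
import math

def regression_standard_error(residuals):
    """s = sqrt(SSE / (n - 2)) for simple linear regression."""
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    return math.sqrt(sse / (n - 2))

def anova_from_sums_of_squares(ss_regression, ss_residual, n):
    """Fill in MS, F, and r^2 for a simple linear regression ANOVA table."""
    df_regression = 1
    df_residual = n - 2
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    ss_total = ss_regression + ss_residual  # sums of squares add
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "r_squared": ss_regression / ss_total,
        "df_total": n - 1,
    }

def t_from_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Exercise 10.21: residual magnitudes as printed in part 3.
s = regression_standard_error([0.13, 0.04, 0.31, 0.22])

# Exercise 10.25: SS values from the table; total df of 30 implies n = 31.
table = anova_from_sums_of_squares(6427.4, 4681.1, n=31)

# Exercise 10.55: r = 0.5 is an assumed value, not given in this key.
t_15 = t_from_correlation(0.5, 15)
t_25 = t_from_correlation(0.5, 25)

print(round(s, 5), round(table["ms_residual"], 1), round(table["f"], 2))
print(round(t_15, 2), round(t_25, 2))
```

The mean squares in the ANOVA check do not add to a "total mean square", which is the point made in Exercise 10.19, part 2: only the sums of squares and the degrees of freedom add.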