-
1.1
- The cases are student organizations.
-
The variables are: Whether the majority of members are
undergraduate or graduate students (undergraduate or graduate);
Primary advisor email address (possible values are email
addresses); Meeting day (Sunday–Saturday); and Number of members
(0 to the number of students enrolled at the university).
-
Number of members is quantitative, the rest are categorical.
- The name of the organization is the label.
-
Who: see part (a). What: see parts (b) and (c). Why: We could
compare the number of members across meeting days or between
undergraduate and graduate organizations.
-
1.3
- The cases are employed students who have graduated.
-
The variables are: Starting salary ($0 to $100,000); Employment
industry (a list of 20); State of employment (U.S. states or
other country).
- Salary is quantitative, the others are categorical.
- Yes, a label was used; it is the ID number, which runs from 1 to 1255.
-
Who: see part (a). What: see parts (b) and (c). Why: We could
compare starting salaries across states of employment or across
industries.
-
1.5
- The cases are employees.
-
No, the last name alone cannot be treated as a label because
there could be multiple people with the same last name. A label
must be unique to each case in the data set.
-
Employee identification number: label; last name: categorical;
first name: categorical; middle initial: categorical;
department: categorical; number of years: quantitative;
salary: quantitative; education: categorical;
age: quantitative.
-
1.7
Answers will vary. Possible variables include enrollment,
graduation rate, job placement rate, in-state tuition,
out-of-state tuition, and public/private institution.
-
1.9
Answers will vary.
-
1.11
-
A bar graph because we are comparing a quantitative variable
(minutes) with a categorical variable (day of the week).
-
A stemplot: assuming that the grades are percent grades and not
letter grades, the values are quantitative with stems from 0 to
10, and 120 numbers can feasibly be written in the plot. A
histogram would also work, but a stemplot shows the individual
grades.
-
We could use a pie chart if we turn the values into percents for
each color. A bar graph could also work here.
-
A histogram would be best for these data because the number of
students in a graduating class is quantitative. A stemplot is
also possible in principle, but there are presumably too many
high schools in the entire state of Iowa to write down every
value individually.
-
1.15
- The distribution is skewed to the right.
- There appears to be one large outlier at 4213.49.
-
The shape is roughly symmetric, the center is around 3130, the
range is from 2664.38 to 4213.49.
-
1.17
-
Energy is highest in January, decreases toward the spring,
increases again in July and August, is lower in September and
October, and increases again in December.
-
A graph makes the month-to-month variability over the 12 months
much easier to see than a table does.
-
1.19
The Pareto chart is easiest to read because it allows you to
quickly tell which colors had the highest votes and which had the
lowest votes for least favorite color.
-
1.21
It is slightly easier to read the Pareto chart because the many
different categories make the pie chart harder to read.
-
1.23
-
Four variables: GPA, IQ, and self-concept are quantitative;
gender is categorical.
-
With this many GPAs, the histogram is slightly easier to take in
at a glance.
-
Unimodal and skewed left, centered near 7.8, spread from 0.5 to
10.8.
-
The males have a much larger spread and a much more left-skewed
distribution.
-
1.25
Older coins are rarer: the older the year, the less likely coins
from that year are still in circulation, so you probably won’t
have many of them in your pockets.
-
1.27
Overall times to run the Boston Marathon decrease from 1972 to
about 1982 and then plateau. Times stop improving around 2006.
-
1.29
-
x̄ = 3208.44.
-
M = 3130.37.
-
Because the distribution is right-skewed with a potential
outlier, the median is a better measure of center.
-
1.31
-
s = 306.68.
-
Q1 = 3027.64, Q3 = 3286.95.
-
Min = 2664.38 (the smallest value),
Q1 = 3027.64 (25% of the observations fall below this value),
M = 3130.37 (the middle observation; 50% of the observations
fall below it and 50% above it),
Q3 = 3286.95 (75% of the observations fall below this value),
Max = 4213.49 (the largest value).
-
The five-number summary would be better for this distribution
because it is right-skewed with a potential outlier.
-
1.33
-
The distribution is right-skewed with a potential outlier.
- The distribution is right-skewed.
-
Preference will vary. The only advantage of the stemplot is that
it preserves the data; otherwise, the histogram is likely
better. The boxplot is also fine but hides some of the details
that the histogram shows.
-
1.35
- (Graphs not reproduced.)
-
The KPOT values are right-skewed, whereas the KSUP values are
fairly symmetric. The center for KSUP is higher than the center
for KPOT. Also, the KPOT values are more spread out than the
KSUP values.
-
It is easier to compare two groups when looking at side-by-side
boxplots.
-
1.37
-
x̄ = 122.9.
-
M = 102.5.
-
The data set is right-skewed with an outlier, so the median is a
better center.
-
1.39
-
IQR = 62.
-
Outliers are below −26 or above 222. London is confirmed as an
outlier.
-
The first three quarters are about equal in length, and the last
is extremely long.
-
The main part of the distribution is relatively symmetric; there
is one extreme high outlier. The minimum is about 25, the first
quartile is about 70, the median is about 100, and the third
quartile is about 130. There is a gap in the data from roughly
200 to about 425.
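The 1.5 × IQR check used in this answer can be sketched in a few lines of Python. The quartile values 67 and 129 are an assumption: they are not stated in the answer but are back-solved from IQR = 62 and the fences −26 and 222 (and agree with the "about 70" and "about 130" read from the boxplot). The function name is illustrative only.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Assumed quartiles, back-solved from IQR = 62 and the fences -26 and 222.
q1, q3 = 67, 129
lower, upper = iqr_fences(q1, q3)
print("IQR:", q3 - q1, "fences:", lower, upper)
```

Any observation below the lower fence or above the upper fence is flagged as a suspected outlier, which is how London is confirmed here.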
-
1.41
-
s = 8.80.
-
With n = 50, the positions of Q1 and Q3 in the sorted data will
be at 13 and 38. We find Q1 = 43.79 and Q3 = 57.02.
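Where the positions 13 and 38 come from can be made explicit with a small sketch. It assumes the "median of each half" rule for quartiles (the overall median is left out of both halves when n is odd); the function name is hypothetical.

```python
def quartile_positions(n):
    """1-based positions of Q1 and Q3 in the sorted data.

    Assumes the 'median of each half' rule: Q1 is the median of the
    observations below the overall median's location, Q3 the median of
    those above it. A position ending in .5 means 'average the two
    neighboring observations'.
    """
    half = n // 2                 # number of observations in each half
    q1_pos = (half + 1) / 2       # median position within the lower half
    q3_pos = (n - half) + q1_pos  # same offset within the upper half
    return q1_pos, q3_pos

print(quartile_positions(50))
```

For n = 50 each half holds 25 observations, so the median of each half is its 13th value: position 13 for Q1 and position 25 + 13 = 38 for Q3.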
-
1.43
-
Because weight is quantitative and has a decent number of
observations (n = 25), a histogram is a good choice. The mean and
standard deviation are a good starting point for numerical
summaries.
-
Now that we see the distribution is left-skewed, we know that
using the mean and standard deviation was not a good choice.
Median and quartiles would have been a better choice.
-
Answers will vary depending on where the clusters are
split.
-
1.45
-
With the outliers: x̄ = 5.2, M = 4.9. Without the outliers:
x̄ = 5, M = 4.9. The median didn’t change, but without the
outliers, the mean is closer to the median.
-
With the outliers: s = 1.40, Q1 = 4.4, Q3 = 5.6. Without the
outliers: s = 0.88 (answers will vary), Q1 = 4.4, Q3 = 5.5. The
quartiles are nearly identical with and without the outliers,
but the standard deviation decreased without the outliers.
-
Outliers can strongly affect the standard deviation and mean,
but they affect the quartiles and median much less.
-
1.47
Only a few people in the United States have a very large net
worth (for example, Bill Gates, Oprah Winfrey, and Warren
Buffett). These families strongly affect the mean net worth of
families in the United States and skew the distribution to the
right.
-
1.49
The mean is $102,181.82. Ten of the employees make less than the
mean. The median salary is $40,000.
-
1.51
The median doesn’t change, while the mean increases to
$124,909.09.
-
1.53
For n = 2, the median is also the average of the two values.
-
1.55
- The mean is 16, and the standard deviation is 4.97.
-
The mean for the 20 cases is 15.75, and the standard deviation
is 3.43.
-
The mean didn’t change much, but the standard deviation
decreased.
-
1.57
The mean is 5.082 pounds. The standard deviation is 2.86 pounds.
-
1.59
The 10% trimmed mean is 5.05, and the 20% trimmed mean is 5.0.
These trimmed means are closer to the median than the original
untrimmed mean was.
-
1.61
- Standardized values can be negative.
-
95% of the values will be within two standard deviations of the
mean.
-
They are switched: The mean should be 0, and the standard
deviation should be 1.
-
1.63
-
When the mean stays the same, the center of the curve stays at
the same place. When the standard deviation increases from 2 to
4, the curve gets flatter and wider.
-
1.65
The table is given below.
μ − 3σ | μ − 2σ | μ − 1σ | μ | μ + 1σ | μ + 2σ | μ + 3σ
168 | 199 | 230 | 261 | 292 | 323 | 354
-
68% of the reading scores will be between 230 and 292. 95% of the
reading scores will be between 199 and 323. 99.7% of the reading
scores will be between 168 and 354.
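The 68–95–99.7 intervals above follow directly from μ ± kσ for k = 1, 2, 3. A minimal sketch, assuming μ = 261 and σ = 31 as read off the table (μ = 261 and μ + 1σ = 292); the function name is illustrative.

```python
mu, sigma = 261, 31  # read off the table: mu = 261, mu + sigma = 292

def empirical_rule(mu, sigma):
    """Map each coverage level of the 68-95-99.7 rule to (low, high)."""
    levels = ((1, "68%"), (2, "95%"), (3, "99.7%"))
    return {label: (mu - k * sigma, mu + k * sigma) for k, label in levels}

intervals = empirical_rule(mu, sigma)
for label, (low, high) in intervals.items():
    print(f"{label} of scores fall between {low} and {high}")
```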
-
1.67
Value | Standardized Score
200 | −1.97
250 | −0.35
280 | 0.61
300 | 1.26
320 | 1.90
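Each standardized score is z = (x − μ)/σ. The sketch below assumes the same μ = 261 and σ = 31 that produce the 1.65 table; those parameters are not restated in this answer, so treat them as an assumption that happens to reproduce the listed scores.

```python
mu, sigma = 261, 31  # assumed: the values behind the 1.65 table

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

scores = {x: round(z_score(x, mu, sigma), 2) for x in (200, 250, 280, 300, 320)}
for x, z in scores.items():
    print(f"{x}: z = {z:+.2f}")
```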
-
1.69
The values of the eighth-grade geography scores associated with
the percentiles, rounded up, with the given mean and standard
deviation, are 221, 240, 261, 282, 301. These are very close to
the values in the table, so we are satisfied that the
eighth-grade geography scores are approximately Normal.
-
1.71
-
68% of the women speak between 7856 and 20,738 words per day.
95% of women speak between 1415 and 27,179 words per day. 99.7%
of women speak between −5026 and 33,620 words per day.
-
It is not entirely reasonable because people cannot speak fewer
than 0 words per day.
-
68% of the men speak between 4995 and 23,125 words per day. 95%
of men speak between −4070 and 32,190 words per day. 99.7% of
men speak between −13,135 and 41,255 words per day. This seems
less reasonable because people cannot speak fewer than 0 words
per day.
-
Yes, based on this information, we think that potentially women
speak more words per day than men.
-
1.73
- 75% of the observations lie below 0.75.
- 50% of observations lie below 0.50.
- 25% of observations lie between 0.50 and 0.75.
-
This density curve is a square, making the area under the curve
1.
-
1.75
The mean is 0.5, the median is 0.5, Q1 = 0.25, and Q3 = 0.75.
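For a square density curve, areas are just widths, which is why the answers to 1.73 and 1.75 can be read off directly. A minimal sketch, assuming the curve is the uniform density on the interval from 0 to 1 (height 1); the function names are illustrative.

```python
def uniform_cdf(x, a=0.0, b=1.0):
    """Proportion of observations at or below x for a uniform density on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_quantile(p, a=0.0, b=1.0):
    """Value with proportion p of the observations below it."""
    return a + p * (b - a)

below_075 = uniform_cdf(0.75)
between = uniform_cdf(0.75) - uniform_cdf(0.50)
q1, median, q3 = (uniform_quantile(p) for p in (0.25, 0.50, 0.75))
print(below_075, between, q1, median, q3)
```

Because the density is symmetric about 0.5, the mean equals the median.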
-
1.77
The first quartile is about 0.67 standard deviation below the mean
of a standard Normal distribution. The third quartile is about
0.67 standard deviation above the mean.
-
1.79
- 0.0322.
- 0.9678.
- 0.8159.
- 0.7837.
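The four areas above are linked: the second is the complement of the first, and the fourth is the difference between the third and the first. The sketch below reproduces them with Python's statistics.NormalDist. The cutoffs z = −1.85 and z = 0.90 are assumptions, obtained by inverting the reported areas rather than taken from the exercise statement.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Assumed cutoffs, inferred from the reported areas (not from the exercise).
z_low, z_high = -1.85, 0.90

area_below_low = std_normal.cdf(z_low)          # compare with 0.0322
area_above_low = 1 - area_below_low             # compare with 0.9678
area_below_high = std_normal.cdf(z_high)        # compare with 0.8159
area_between = area_below_high - area_below_low # compare with 0.7837

for area in (area_below_low, area_above_low, area_below_high, area_between):
    print(f"{area:.4f}")
```

Exact software values can differ from table-based answers in the fourth decimal place, because a table subtracts areas that have already been rounded.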
-
1.81
-
z = 0.47.
-
z = −0.67.
-
1.83
2.28% of adults are developmentally disabled, based on the
criteria.
-
1.85
z_Jessica = 1.02, z_Ashley = 1.20.
-
1.87
Jorge’s equivalent ACT score is 31.16.
-
1.89
Renee scored in the 94.50th percentile.
-
1.91
The top 15% of all SAT scores are above 1242.
-
1.93
The quintiles for the ACT scores are Q20 = 16.96, Q40 = 20.15,
Q60 = 22.85, Q80 = 26.04.
-
1.95
- 16.6% of women have low levels of HDL.
- 37.45% have protective levels of HDL.
- 45.95% of women are in the normal range.
-
1.97
- 0.62% of healthy adults have osteoporosis.
- 2.28% of the older population have osteoporosis.
-
1.99
-
Q1 = −0.6745 and Q3 = 0.6745.
-
Q1 = μ − 0.6745σ and Q3 = μ + 0.6745σ.
-
1.101
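The interquartile range is 0.6745 − (−0.6745) = 1.3490, and
1.5 × 1.3490 = 2.0235. Suspected outliers lie below
−0.6745 − 2.0235 = −2.698 or above 0.6745 + 2.0235 = 2.698, so
approximately 2 × 0.0035 = 0.0070, or 0.70%, of observations are
flagged as outliers.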
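A numerical check of the 1.5 × IQR rule for the standard Normal distribution, with the fences placed at Q1 − 1.5 × IQR and Q3 + 1.5 × IQR. This is a sketch using Python's statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Quartiles of the standard Normal distribution.
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)
iqr = q3 - q1

# Fences of the 1.5 x IQR rule are measured from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Share of a Normal population that falls outside the fences.
outlier_fraction = std_normal.cdf(lower_fence) + (1 - std_normal.cdf(upper_fence))

print(f"Q1 = {q1:.4f}, Q3 = {q3:.4f}, IQR = {iqr:.4f}")
print(f"fences = {lower_fence:.3f}, {upper_fence:.3f}")
print(f"outlier share = {outlier_fraction:.2%}")
```

For a Normal distribution with mean μ and standard deviation σ, the same quartiles and fences are obtained by multiplying by σ and adding μ, so the outlier share does not depend on μ or σ.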
-
1.103
Looking at the Normal quantile plot of DBH, we see a slight
S shape, which indicates that the diameter may not be Normally
distributed.
-
1.107
We can clearly see that all sources of renewable energy have
increased from 2008 to 2018, with wind and solar having the
largest increases.
-
1.111
-
For car makes (a categorical variable), use either a bar graph
or pie chart. For car age (a quantitative variable), use a
histogram, stemplot, or boxplot.
-
Study time is quantitative, so use a histogram, stemplot, or
boxplot. To show change over time, use a time plot (average
hours studied against time).
-
Use a bar graph or pie chart to show radio station preferences.
-
Use a Normal quantile plot to see whether the measurements
follow a Normal distribution.
-
1.115
-
σ = 7.5.
-
1.117
-
μ = 79 and σ = 30.4.
-
1.119
-
Most people will “round” their answers when asked to give an
estimate like this; in fact, the most striking answers are ones
such as 115, 170, or 230. The students who claimed 360 minutes
(six hours) and 300 minutes (five hours) may have been
exaggerating.
-
Women generally seem to study more (or claim to), as none of
them claims less than 60 minutes per night. The center (median)
for women is 170 minutes; for men the median is 120 minutes.
-
1.121
x̄ = 35.66, s = 41.56, Min = 0, Q1 = 1, M = 11.5, Q3 = 68,
Max = 181. On average, the band pauses for 35.66 seconds;
however, most often they don’t pause at all. The distribution is
strongly right-skewed and shows that sometimes the band pauses
for as much as 181 seconds, or about 3 minutes, before playing
the final note.
-
1.123
Antho2 is approximately Normally distributed. x̄ = 1.711,
s = 0.590.
-
1.125
The distribution is highly skewed to the right. The five-number
summary is 0.0154, 0.0784, 0.1423, 0.6975, 4.2995.