4.3 Random Variables

Sample spaces need not consist of numbers. When we toss a coin four times, we can record the outcome as a string of heads and tails, such as HTTH. In statistics, however, we are most often interested in numerical outcomes such as the count of heads in the four tosses. It is convenient to use a shorthand notation: Let X be the number of heads in four tosses. If our outcome is HTTH, then X=2. If the next outcome is TTTH, the value of X changes to X=1. The possible values of X are 0, 1, 2, 3, and 4, and each time we toss a coin four times, X takes one of these values. We call X a random variable because its values vary in repeated tosses. The value of X is known only once we know the outcome.
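One way to picture a random variable is as a rule that turns each outcome into a number. The short Python sketch below illustrates this for the coin-tossing example. The function name count_heads is our own choice for illustration and does not come from the text.

```python
def count_heads(outcome: str) -> int:
    """Return X, the number of heads in a string of tosses such as 'HTTH'."""
    return outcome.count("H")


# The two outcomes discussed in the text
for outcome in ["HTTH", "TTTH"]:
    print(outcome, "-> X =", count_heads(outcome))
```

The outcome itself (a string) is not numerical, but the value that count_heads assigns to it is.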

In this coin-tossing example, the random variable is the number of heads in the four tosses. We usually denote random variables by capital letters near the end of the alphabet, such as X or Y. Of course, the random variables of greatest interest to us are outcomes such as the mean x¯ of a random sample, for which we will keep the familiar notation.⁹ As we progress from general rules of probability toward statistical inference, we will concentrate heavily on random variables. In fact, the next chapter is devoted entirely to studying the random variables x¯ and p^ of a random sample.

To consider the random variable X as a probability model, we need to describe its sample space S and a probability for each outcome. The sample space S is the list of possible values of the random variable. We usually do not mention S separately. There are two main ways of assigning probabilities to the values of a random variable, and the one we use depends on what type of random variable it is.

Discrete random variables

We have learned several rules of probability but only one method of assigning probabilities: state the probabilities of the individual outcomes and assign probabilities to events by summing over the outcomes. The outcome probabilities must be between 0 and 1 and have sum 1. When the outcomes are numerical, they are values of a random variable. We will now attach a name to random variables having probability assigned in this way.

Discrete random variable

A discrete random variable X has possible values that can be given in an ordered list. The probability distribution of X lists the values and their probabilities:

Value of X	x₁	x₂	x₃	. . .
Probability	p₁	p₂	p₃	. . .

The probabilities pᵢ must satisfy two requirements:

Every probability pᵢ is a number between 0 and 1.
p₁+p₂+⋯=1.

Find the probability of any event by adding the probabilities pᵢ of the particular values xᵢ that make up the event.

In most discrete random variable situations that we will study, the number of possible values is a finite number k. For example, the number of heads in four tosses of a coin has k=5 possible values: 0, 1, 2, 3, and 4. Similarly, in Example 4.15 (page 215), the first digit of legitimate records associated with Benford’s law is a discrete random variable with k=9 outcomes.

There are, however, other settings in which the number of possible values is countably infinite. Think about tossing a fair coin until you get a head. The number of tosses required can be any positive integer. We would assign most of the probability to the small values, but every positive integer would have a nonzero probability, and these probabilities would sum to 1.
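The following Python sketch makes the coin-until-first-head setting concrete. It assumes a fair coin and independent tosses, so the first head arrives on toss k only if the first k−1 tosses are tails and toss k is a head. The function name prob_first_head_on_toss is hypothetical.

```python
from fractions import Fraction


def prob_first_head_on_toss(k: int) -> Fraction:
    """P(first head on toss k) for a fair coin: k-1 tails followed by a head."""
    return Fraction(1, 2) ** k


# Running total of the probabilities for the first ten possible values
partial_sum = Fraction(0)
for k in range(1, 11):
    partial_sum += prob_first_head_on_toss(k)
    print(k, prob_first_head_on_toss(k), float(partial_sum))
```

Most of the probability sits on the small values, and the running total moves ever closer to 1 as more values are included.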

Example 4.25 Grade distributions.

A liberal arts college posts the grade distributions for its courses. In a recent semester, students in one section of English 130 received 34% A’s, 42% B’s, 18% C’s, 2% D’s, and 4% F’s. Choose an English 130 student at random. To “choose at random” means to give every student the same chance to be chosen. The student’s grade on a four-point scale (with A=4 and F=0) is a random variable X.

The value of X changes when we repeatedly choose a student at random, but it is always one of 0, 1, 2, 3, or 4. Here is the distribution of X:

Value of X	0	1	2	3	4
Probability	0.04	0.02	0.18	0.42	0.34

The probability that the sampled student got a B or better is the sum of the probabilities of an A and a B. In the language of random variables,

P(X≥3)=P(X=3)+P(X=4)=0.42+0.34=0.76
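The calculation in Example 4.25 can be sketched in Python as follows. The dictionary holds the distribution given in the example. The helper names is_valid_distribution and event_probability are our own, introduced only for illustration.

```python
import math

# Distribution of X from Example 4.25: grade value -> probability
grade_dist = {0: 0.04, 1: 0.02, 2: 0.18, 3: 0.42, 4: 0.34}


def is_valid_distribution(dist):
    """Check both requirements: each probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and math.isclose(sum(dist.values()), 1.0)


def event_probability(dist, event):
    """Add the probabilities of the values for which event(value) is True."""
    return sum(p for value, p in dist.items() if event(value))


# Event "B or better" is X >= 3
p_b_or_better = event_probability(grade_dist, lambda x: x >= 3)
print(is_valid_distribution(grade_dist), round(p_b_or_better, 2))
```

Passing the event as a function keeps the helper general: any event that can be written as a condition on X is handled by the same summation.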

Check-in

4.12 Will the course satisfy the requirement? Refer to Example 4.25. Suppose that a grade of D or F in English 130 will not satisfy a requirement for a major in linguistics. What is the probability that a randomly selected student will not satisfy this requirement?

We can use histograms to show not only distributions of data but also probability distributions. Here is an example.

Example 4.26 Random digit distributions.

Let’s compare the distribution of random digits with the distribution of the probability model for Benford’s law (Example 4.15) using probability histograms. Figure 4.5 displays the result.

Two probability histograms. — Figure 4.5 Probability histograms for (a) equally likely random digits 1 to 9 and (b) Benford’s law. The height of each bar shows the probability assigned to a single outcome.

The height of each bar shows the probability of the outcome at its base. Because the heights are probabilities, they add to 1. As usual, all the bars in a histogram have the same width. So the areas also display the assignment of probability to outcomes. Think of these histograms as idealized pictures of the results of very many trials. The histograms make it easy to quickly compare the two probability distributions.

Example 4.27 Number of heads in four tosses of a coin.

What is the probability distribution of the discrete random variable X that counts the number of heads in four tosses of a coin? We can derive this distribution if we make two reasonable assumptions:

The coin is balanced, so it is fair, and each toss is equally likely to give H or T.
The coin has no memory, so tosses are independent.

The outcome of four tosses is a sequence of heads and tails. There are 16 possible outcomes in all. Figure 4.6 compiles these outcomes with their values of X to represent the probability distribution of X. For the probabilities we can use the multiplication rule for independent events. For example,

P(HTTH)=1/2×1/2×1/2×1/2=1/16

Possible outcomes for four coin tosses. — Figure 4.6 Possible outcomes in four tosses of a coin, Example 4.27. The outcomes are arranged by the values of the random variable X, the number of heads.

Similarly, each of the 16 possible outcomes has probability 1/16. That is, these outcomes are equally likely.

The number of heads X has possible values 0, 1, 2, 3, and 4. These values are not equally likely. As Figure 4.6 shows, there is only one way that X=0 can occur: namely, when the outcome is TTTT. So

P(X=0)=1/16=0.0625

The event { X=2 } can occur in six different ways, so that

P(X=2)=(count of ways X=2 can occur)/16=6/16=0.375

We can find the probability of each value of X in Figure 4.6 in the same way. Here is the result:

Value of X	0	1	2	3	4
Probability	0.0625	0.25	0.375	0.25	0.0625
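The counting argument of Example 4.27 can be checked by listing all 16 outcomes. The sketch below assumes, as the example does, a fair coin and independent tosses, so every sequence has probability 1/16. The variable names are our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of H and T of length four
outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]

# Number of equally likely outcomes that give each value of X
ways = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (count of ways X = x can occur) / 16
distribution = {x: Fraction(ways[x], len(outcomes)) for x in sorted(ways)}

for x, p in distribution.items():
    print(x, ways[x], float(p))
```

Exact fractions are used so that the probabilities sum to exactly 1 without rounding error.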

Figure 4.7 is a probability histogram for the distribution in Example 4.27. The probability distribution is exactly symmetric. The probabilities (bar heights) are idealizations of the proportions after very many tosses of four coins. The actual distribution of proportions observed would be nearly symmetric but is unlikely to be exactly symmetric.

Any event involving the number of heads observed in four tosses can be expressed in terms of X, and its probability can be found from the distribution of X. Here is an example.

Example 4.28 Probability of at least three heads.

The probability of tossing at least three heads is

P(X≥3)=0.25+0.0625=0.3125

The probability of at least one head is most simply found by use of the complement rule:

P(X≥1)=1−P(X=0)=1−0.0625=0.9375
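Because probabilities are idealizations of long-run proportions, a simulation should give proportions close to 0.3125 and 0.9375, though rarely exactly equal to them. The sketch below is one way to try this, assuming Python's standard pseudorandom generator is an adequate stand-in for a fair coin. The function name simulate_heads, the seed, and the number of trials are arbitrary choices.

```python
import random


def simulate_heads(num_trials, seed=0):
    """Return the number of heads in each of num_trials sets of four fair tosses."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(4)) for _ in range(num_trials)]


counts = simulate_heads(100_000)

# Observed proportions, to compare with P(X >= 3) and P(X >= 1)
prop_at_least_3 = sum(x >= 3 for x in counts) / len(counts)
prop_at_least_1 = sum(x >= 1 for x in counts) / len(counts)
print(prop_at_least_3, prop_at_least_1)
```

Changing the seed changes the observed proportions slightly, which is the variation from one set of trials to the next that a probability model smooths away.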

Recall that tossing a coin n times is similar to choosing an SRS of size n from a large population and asking a Yes or No question (page 211). We will extend the results of Examples 4.27 and 4.28 when we discuss sampling distributions in the next chapter.

Check-in

4.13 Three tosses of a fair coin. Find the probability distribution for the number of heads that appear in three tosses of a fair coin. Note: Start by finding the sample space S.

Continuous random variables


When we use the table of random digits to select a digit between 0 and 9, the result is a discrete random variable. The probability model assigns probability 1/10 to each of the 10 possible outcomes. Suppose that we want to choose a number at random between 0 and 1, allowing any number between 0 and 1 as the outcome. Software random number generators will do this. In fact, we used the Excel function RAND() for our randomization examples in Chapter 3.

You can visualize such a random number by thinking of a spinner (Figure 4.8) that turns freely on its axis and slowly comes to a stop. The pointer can come to rest anywhere on a circle that is marked from 0 to 1. The sample space is now an entire interval of numbers:

S={ all numbers x such that 0<x<1 }

A circle with a spinning arrow affixed at center is labeled 0, one-fourth, one-half, and three-fourths at regular intervals. — Figure 4.8 A spinner that generates a random number between 0 and 1.

How can we assign probabilities to events such as {0.3≤x≤0.7}? As in the case of selecting a random digit, we would like all possible outcomes to be equally likely. But we cannot assign probabilities to each individual value of x and then sum because there are too many possible values. Instead, we use a new way of assigning probabilities directly to events—as areas under a density curve. Any density curve has area exactly 1 underneath it, corresponding to total probability 1.

Example 4.29 Uniform random numbers.

The random number generator will spread its output uniformly across the entire interval from 0 to 1 as we allow it to generate a long sequence of numbers. The results of many trials are represented by the density curve of a uniform distribution.

This density curve is shown in Figure 4.9. It has height 1 over the interval from 0 to 1 and height 0 everywhere else. The area under the density curve is 1: the area of a square with base 1 and height 1. The probability of any event is the area under the density curve and above the event in question.

Two uniform density curves with areas of probability highlighted. — Figure 4.9 Assigning probabilities for generating a random number between 0 and 1, Example 4.29. The probability of any interval of numbers is the area above the interval and under the density curve.

Example 4.30 A uniform probability.

What is the probability that the random number generator produces a number X between 0.3 and 0.7? The answer is illustrated in Figure 4.9(a). Because the area under the density curve and above the interval from 0.3 to 0.7 is 0.4,

P(0.3≤X≤0.7)=0.4

The height of the density curve is 1, and the area of a rectangle is the product of height and length, so the probability of any interval of outcomes is just the length of the interval.

Similarly,

P(X≤0.5)=0.5
P(X>0.8)=0.2
P(X≤0.5 or X>0.8)=0.7

Notice that the last event consists of two nonoverlapping intervals, so the total area above the event is found by adding two areas, as illustrated by Figure 4.9(b). This assignment of probabilities obeys all of our rules for probability.
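For the uniform density of height 1, area equals length, so these probabilities reduce to measuring intervals. A minimal Python sketch, with uniform_prob as a hypothetical helper name:

```python
def uniform_prob(a: float, b: float) -> float:
    """P(a <= X <= b) for X uniform on 0 to 1: length of the overlap with [0, 1]."""
    low, high = max(a, 0.0), min(b, 1.0)
    return max(high - low, 0.0)


# One interval, as in Figure 4.9(a)
p_middle = uniform_prob(0.3, 0.7)

# Two nonoverlapping intervals, as in Figure 4.9(b): add the two areas
p_two_pieces = uniform_prob(0.0, 0.5) + uniform_prob(0.8, 1.0)

print(round(p_middle, 4), round(p_two_pieces, 4))
```

The results are rounded for display because floating-point subtraction can leave tiny errors in the last digits.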

Check-in

4.14 Find a uniform probability. For the uniform distribution described in Examples 4.29 and 4.30, find the probability that X is between 0.4 and 0.8.

Treating probabilities of events as areas under a density curve is our second important way of assigning probabilities. Figure 4.10 illustrates this idea in general form. We call X in Figures 4.9 and 4.10 a continuous random variable because its values are not isolated numbers but an entire interval of numbers.

A density curve highlighting the probability of event A. — Figure 4.10 The probability distribution of a continuous random variable assigns probabilities as areas under a density curve. The total area under any density curve is 1.

The probability model for a continuous random variable assigns probabilities to intervals of outcomes rather than to individual outcomes. In fact, all continuous probability distributions assign probability 0 to every individual outcome. Only intervals of values have positive probability. To see that this is true, consider a specific outcome such as X=0.8 in the context of Example 4.29. For the uniform distribution, the probability of any interval is the same as its length. The point 0.8 has no length, so its probability is 0.

Although this fact may seem odd, it makes intuitive, as well as mathematical, sense. A random number generator produces a number between 0.79 and 0.81 with probability 0.02. An outcome between 0.799 and 0.801 has probability 0.002. A result between 0.799999 and 0.800001 has probability 0.000002. You see that as this interval closes in on 0.8, the probability gets closer to 0.

To be consistent, the probability of an outcome exactly equal to 0.8 must be 0. Because there is no probability exactly at X=0.8, the two events { X>0.8 } and { X≥0.8 } have the same probability. We can ignore the distinction between > and ≥ when finding probabilities for continuous (but not discrete) random variables.
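The shrinking-interval argument can be traced numerically. This sketch uses the uniform distribution on 0 to 1 from Example 4.29. The helper uniform_prob and the list of half-widths are our own illustrative choices.

```python
def uniform_prob(a, b):
    """P(a <= X <= b) for X uniform on 0 to 1, computed as the overlap length."""
    return max(min(b, 1.0) - max(a, 0.0), 0.0)


# Intervals centered at 0.8 that close in on the single point 0.8
for half_width in [0.01, 0.001, 0.000001, 0.0]:
    p = uniform_prob(0.8 - half_width, 0.8 + half_width)
    print(f"half-width {half_width:g}: probability {p:.6f}")
```

With half-width 0 the interval is the single point 0.8, and the computed probability is 0.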

Normal distributions as probability distributions

The density curves that are most familiar to us are the Normal curves. Because any density curve describes an assignment of probabilities, Normal distributions are continuous probability distributions. Recall that N(μ, σ) is our shorthand for the Normal distribution having mean μ and standard deviation σ. In the language of random variables, if X has the N(μ, σ) distribution, then the standardized variable

Z=(X−μ)/σ

is a standard Normal random variable having the distribution N(0, 1). We used this relationship to compute probabilities regarding X in Chapter 1. Let’s do that again.

Example 4.31 A Normal distribution calculation.

Suppose X is a random variable with the N(10, 5) distribution. What is the probability that X is less than or equal to 15? First we standardize:

P(X≤15)=P((X−μ)/σ≤(15−10)/5)

In terms of Z, a standard Normal random variable, we have

P(X≤15)=P(Z≤(15−10)/5)=P(Z≤1)

Using software or Table A, we find P(Z≤1)=0.8413. So the probability that X is less than or equal to 15 is 0.8413.
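As one example of the software route, the sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later) in place of Table A. It computes the probability both directly and after standardizing, to show that the two agree. The variable names are our own.

```python
from statistics import NormalDist

# X has the N(10, 5) distribution: mean 10, standard deviation 5
X = NormalDist(mu=10, sigma=5)

# Standardize the value 15
z = (15 - X.mean) / X.stdev

p_direct = X.cdf(15)                 # P(X <= 15) computed on the original scale
p_standardized = NormalDist().cdf(z)  # P(Z <= z) for the standard Normal

print(z, round(p_direct, 4), round(p_standardized, 4))
```

Standardizing is needed for table lookup, whereas software can usually work on the original scale directly.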

Check-in

4.15 Find a Normal probability. Refer to Example 4.31. Without doing any additional calculations, what is the probability that X is less than 15? Explain how you determined your answer.

Here’s a Normal distribution calculation that relates to the discussion of sampling distributions in the next chapter. It is also a more realistic example in that it addresses the uncertainty in a sample survey.

Example 4.32 Texting while driving.

Texting while driving can be dangerous, but many people have a hard time putting down the phone. Suppose that 26% of teen drivers text while driving. If we take a sample of 500 teen drivers, what percent would we expect to say that they text while driving?¹⁰

The proportion p=0.26 is a number that describes the population of teen drivers. The proportion p^ of the sample who say that they text while driving is used to estimate p. The proportion p^ is a random variable because repeating the SRS would give a different sample of 500 teen drivers and a different value of p^.

We will see in the next chapter that in this setting, with teen drivers answering honestly, p^ has approximately the N(0.26, 0.0196) distribution. The mean 0.26 of this distribution is the same as the population proportion because p^ is an unbiased estimate of p. The standard deviation is controlled mainly by the size of the sample.

What is the probability that the survey result differs from the truth about the population by no more than 3 percentage points? We can use what we learned about Normal distribution calculations to answer this question. Because p=0.26, the survey misses by no more than 3 percentage points if the sample proportion is between 0.23 and 0.29.

Figure 4.11 shows this probability as an area under a Normal density curve. You can find it by using software or by standardizing and using Table A. From Table A,

P(0.23≤p^≤0.29)=P((0.23−0.26)/0.0196≤(p^−0.26)/0.0196≤(0.29−0.26)/0.0196)
=P(−1.53≤Z≤1.53)
=0.9370−0.0630=0.8740

About 87% of the time, the sample p^ will be within 3 percentage points of the proportion p.

A Normal distribution curve has a mean at p hat = 0.26. Values are marked to the left and right of it at p hat = 0.23 and p hat = 0.29. The area under the curve between the two values equals 0.8740. — Figure 4.11 Probability as area under a Normal density curve, Example 4.32.
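The same probability can be found with software instead of Table A. The sketch below again uses NormalDist from Python's standard statistics module and takes the N(0.26, 0.0196) approximation from Example 4.32 as given. Software does not round z to 1.53, so its answer may differ slightly from the table-based 0.8740 in the last decimal place.

```python
from statistics import NormalDist

# Approximate distribution of the sample proportion in Example 4.32
p_hat = NormalDist(mu=0.26, sigma=0.0196)

# P(0.23 <= p_hat <= 0.29) as a difference of two cumulative probabilities
prob_within_3_points = p_hat.cdf(0.29) - p_hat.cdf(0.23)

print(round(prob_within_3_points, 3))
```

Subtracting the two cumulative probabilities mirrors the table calculation, in which 0.0630 is subtracted from 0.9370.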

We began this chapter with a general discussion of the idea of probability and the properties of probability models. Two very useful specific types of probability models are distributions of discrete and continuous random variables. In our study of statistics, we will employ only these two types of probability models.

Section 4.3 SUMMARY

A random variable is a variable taking numerical values determined by the outcome of a random phenomenon. The probability distribution of a random variable X tells us what the possible values of X are and how probabilities are assigned to those values.
A random variable X and its distribution can be discrete or continuous.
A discrete random variable has possible values that can be given in an ordered list. The probability distribution assigns each of these values a probability between 0 and 1 such that the sum of all the probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the values that make up the event.
A continuous random variable takes all values in some interval of numbers. A density curve describes the probability distribution of a continuous random variable. The probability of any event is the area under the curve and above the values that make up the event.
Uniform distributions are continuous probability distributions that are very similar to equally likely discrete distributions.
Normal distributions are one type of continuous probability distribution.
You can picture a probability distribution by drawing a probability histogram in the discrete case or by graphing the density curve in the continuous case.

Now that you have completed this section, you will be able to:

Describe the probability distribution of a discrete random variable. Review Example 4.25 (page 225) and try Exercise 4.35.
Use a probability histogram to provide a graphical description of the probability distribution of a discrete random variable. Review Example 4.26 (page 226) and try Exercise 4.37.
Use the distribution of a discrete random variable to calculate probabilities of events. Review Example 4.28 (page 228) and try Exercise 4.39.
Describe the probability distribution of a continuous random variable. Review Example 4.29 (page 229) and try Exercise 4.53.
Find probabilities of events for the uniform and Normal distributions. Review Examples 4.30 and 4.31 (pages 229 and 231) and try Exercises 4.41 and 4.43.

Section 4.3 EXERCISES

4.34 A random variable? You toss two coins and record the outcome as HH, HT, TH, or TT. Is the outcome a random variable? Explain your answer.
4.35 How many courses? At a small liberal arts college, students can register for one to six courses. Let X be the number of courses taken in the fall by a randomly selected student from this college. In a typical fall semester, 6% take one course, 6% take two courses, 12% take three courses, 20% take four courses, 41% take five courses, and 15% take six courses. Describe the probability distribution of this random variable.
4.36 A new random variable. Refer to the previous exercise. Suppose that a student earns three credits for each course taken. Let Y equal the number of credits a student would earn if they complete all of their courses.
1. What is the distribution of Y?
2. Use a probability histogram to describe the distribution of Y.
4.37 Make a graphical display. Refer to Exercise 4.35. Use a probability histogram to provide a graphical description of the distribution of X.
4.38 Find some probabilities. Refer to Exercise 4.36.
1. Find the probability that a randomly selected student earns more than 18 credits.
2. Find the probability that a randomly selected student earns 6 or fewer credits.
3. Find the probability that a randomly selected student earns 15 credits or more.
4.39 Find more probabilities. Refer to Exercise 4.35.
1. Find the probability that a randomly selected student takes two or fewer courses.
2. Find the probability that a randomly selected student takes three or four courses.
3. Find the probability that a randomly selected student takes seven courses.
4.40 What’s wrong? In each of the following scenarios, there is something wrong. Describe what is wrong and give a reason for your answer.
1. The possible values for a discrete random variable can’t be negative.
2. A continuous random variable can take any value between 0 and 1.
3. Normal distributions are discrete random variables.
4.41 Use the uniform distribution. Suppose that a random variable X follows the uniform distribution described in Example 4.29 (page 229). For each of the following events, find the probability and illustrate your calculations with a sketch of the density curve similar to the ones in Figure 4.9 (page 229).
1. The probability that X is less than 0.2.
2. The probability that X is greater than or equal to 0.7.
3. The probability that X is less than 0.8 and greater than 0.4.
4. The probability that X is 0.7.
4.42 Use of Twitter. Suppose that the population proportion of Internet users who say that they use Twitter or another service to post updates about themselves or to see updates about others is 19%.¹¹ Think about selecting random samples from a population in which 19% are Twitter users.
1. Describe the sample space for selecting a single person.
2. If you select three people, describe the sample space.
3. Using the results of part 2, define the sample space for the random variable that expresses the number of Twitter users in the sample of size 3.
4. What information is contained in the sample space for part 2 that is not contained in the sample space for part 3? Do you think this information is important? Explain your answer.
4.43 Use the Normal distribution. Suppose X is a Normal random variable with mean 20 and standard deviation 4. Find the following probabilities.
1. The probability that X is greater than or equal to 22.
2. The probability that X is less than 22.
3. The probability that X is greater than 22 and less than 24.
4. The probability that X is greater than 40.
4.44 Probabilities for Twitter. Refer to Exercise 4.42. Find the probabilities for the number of Twitter users in a sample of size 2.

4.45 Households and families in government data. In government data, a household consists of all occupants of a dwelling unit, while a family consists of two or more persons who live together and are related by blood or marriage. So all families form households, but some households are not families. Here are the distributions of household size and of family size in the United States:

Number of persons	1	2	3	4	5	6	7
Household probability	0.27	0.33	0.16	0.14	0.06	0.03	0.01
Family probability	0	0.44	0.22	0.20	0.09	0.03	0.02

Make probability histograms for these two discrete distributions, using the same scales. What are the most important differences between the sizes of households and families?

4.46 Discrete or continuous. In each of the following situations, decide whether the random variable is discrete or continuous and give a reason for your answer.
1. Your web page has five different links, and a user can click on one of the links or can leave the page. You record the length of time that a user spends on the web page before clicking one of the links or leaving the page.
2. You record the number of hits per day on your web page.
3. You record the yearly income of a visitor to your web page.

4.47 Texas hold ’em. The game Texas hold ’em starts with each player receiving two cards. Here is the probability distribution for the number of aces in two-card hands:

Number of aces	0	1	2
Probability	0.8507	0.1448	0.0045

1. Verify that this assignment of probabilities satisfies the requirement that the sum of the probabilities for a discrete distribution must be 1.
2. Make a probability histogram for this distribution.
3. What is the probability that a hand contains at least one ace? Show two different ways to calculate this probability.

4.48 Tossing two dice. Some games of chance rely on tossing two dice. Each die has six faces, marked with one, two, . . . , six spots called pips. The dice used in casinos are carefully balanced so that each face is equally likely to come up. When two dice are tossed, each of the 36 possible pairs of faces is equally likely to come up. The outcome of interest to a gambler is the sum of the pips on the two up-faces. Call this random variable X.
1. Write down all 36 possible pairs of up-faces.
2. If all pairs have the same probability, what must be the probability of each pair?
3. Write the value of X next to each pair of up-faces and use this information with the result of part 2 to give the probability distribution of X. Draw a probability histogram to display the distribution.
4. One bet available in the game called craps wins if a 7 or an 11 comes up on the next roll of two dice. What is the probability of rolling a 7 or an 11 on the next roll?
5. Several bets in craps lose if a 7 is rolled. If any outcome other than 7 occurs, these bets either win or continue to the next roll. What is the probability that anything other than a 7 is rolled?
4.49 Nonstandard dice. Nonstandard dice can produce interesting distributions of outcomes. You have two balanced, six-sided dice. One is a standard die, with faces having one, two, three, four, five, and six spots. The other die has three faces with two spots and three faces with five spots. Find the probability distribution for the total number of spots Y on the up-faces when you roll these two dice.

4.50 Spell-checking software. Spell-checking software catches “nonword errors,” which are strings of letters that are not words, as when “the” is typed as “eth.” When undergraduates are asked to write a 250-word essay (without spell-checking), the number X of nonword errors has the following distribution:

Value of X	0	1	2	3	4
Probability	0.2	0.4	0.2	0.1	0.1

1. Sketch the probability distribution for this random variable.
2. Write the event “at least one nonword error” in terms of X. What is the probability of this event?
3. Describe the event X≤2 in words. What is its probability? What is the probability that X<2?

4.51 Find the probabilities. Let the random variable X be a random number with the uniform density curve in Figure 4.9 (page 229). Find the following probabilities:
1. P(X≥0.40).
2. P(X=0.40).
3. P(0.40<X<1.40).
4. P(0.22≤X≤0.25 or 0.42≤X≤0.45).
5. The probability that X is not in the interval 0.5 to 0.8.
4.52 Uniform numbers between 0 and 4. Many random number generators allow users to specify the range of the random numbers to be produced. Suppose that you specify that the range is to be all numbers between 0 and 4. Call the random number generated Y. Then the density curve of the random variable Y has constant height between 0 and 4 and height 0 elsewhere.
1. What is the height of the density curve between 0 and 4? Draw a graph of the density curve.
2. Use your graph from part 1 and the fact that probability is area under the curve to find P(Y≤1.8).
3. Find P(0.4<Y<2.7).
4. Find P(Y≥1.95).
4.53 The sum of two uniform random numbers. Generate two random numbers between 0 and 1 and take Y to be their sum. Then Y is a continuous random variable that can take any value between 0 and 2. The density curve of Y is the triangle shown in Figure 4.12.
1. Verify by geometry that the area under this curve is 1.
2. What is the probability that Y is less than 1? Sketch the density curve, shade the area that represents the probability, and find that area. Do this for part 3 also.
3. What is the probability that Y is greater than 1.5?
4. What is the probability that Y is greater than 0.5?
Figure 4.12 The density curve for the sum Y of two random numbers, Exercise 4.53.
4.54 How many close friends? How many close friends do you have? Suppose that the number of close friends adults claim to have varies from person to person with mean μ=8 and standard deviation σ=3. An opinion poll asks this question of an SRS of 500 adults. We will see in the next chapter that, in this situation, the sample mean response x¯ has approximately the Normal distribution with mean 8 and standard deviation 0.1342. What is P(7.9≤x¯≤8.1), the probability that x¯ estimates μ to within ±0.1?
4.55 Normal approximation for a sample proportion. A sample survey contacted an SRS of 700 registered voters in Oregon shortly after an election and asked respondents whether they had voted. Voter records show that 56% of registered voters had actually voted. We will see in the next chapter that, in this situation, the proportion p^ of the sample who voted has approximately the Normal distribution with mean μ=0.56 and standard deviation σ=0.019.
1. If the respondents answer truthfully, what is P(0.52≤p^≤0.60)? This is the probability that p^ estimates 0.56 to within plus or minus 0.04.
2. In fact, 72% of the respondents said they had voted (p^=0.72). If respondents answer truthfully, what is P(p^≥0.72)? This probability is so small that it is good evidence that some people who did not vote claimed that they did vote.