Toss a coin or choose an SRS. The result can’t be predicted in advance because it varies across coin tosses or random draws. There is, nonetheless, a regular pattern in the collection of the results, a pattern that emerges clearly only after many repetitions. This remarkable fact is the basis for the idea of probability.
When you toss a coin, there are only two possible outcomes: a head or a tail. Figure 4.1 shows two trials of tossing a coin 5000 times (Trial A and Trial B). For each number of tosses from 1 to 5000, we have plotted the proportion of those tosses that gave a head.
Figure 4.1 The proportion of tosses of a coin that give a head varies as we make more tosses. Eventually, however, the proportion approaches 0.5, the probability of a head. This figure shows the results of two trials of 5000 tosses each (Example 4.1).
Trial A (solid line) begins tail, head, tail, tail. You can see that the proportion of heads for Trial A starts at 0/1 on the first toss, rises to 1/2 when the second toss gives a head, and falls to 1/3 and 1/4 as we get two more tails. Trial B (dotted line), on the other hand, starts with five straight heads, so the proportion of heads is 1 until the sixth toss.
The proportion of tosses that produce heads is quite variable at first. Trial A starts low, and Trial B starts high. As we make more and more tosses, however, the proportion of heads for each trial gets close to 0.5 and stays there.
If we made yet a third trial at tossing the coin a great many times, the proportion of heads would again settle down to 0.5 in the long run. We say that 0.5 is the probability of a head. The probability 0.5 appears as a horizontal line on the graph.
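A few lines of code reproduce this pattern. The following minimal sketch (in Python, our own illustration rather than anything from the text) tosses a simulated fair coin 5000 times and reports the running proportion of heads at several checkpoints:

```python
import random

random.seed(1)  # fix the seed so the run is reproducible

n_tosses = 5000
heads = 0
for i in range(1, n_tosses + 1):
    if random.random() < 0.5:  # a fair coin: heads with probability 0.5
        heads += 1
    if i in (1, 10, 100, 1000, 5000):
        print(f"after {i:5d} tosses: proportion of heads = {heads / i:.4f}")
```

The early checkpoints can land far from 0.5, but the later ones settle close to it, just as in Trials A and B.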
The Probability applet on the text website animates Figure 4.1. It allows you to choose the probability of a head and simulate any number of tosses of a coin with that probability. Try it. You will see that the proportion of heads gradually settles down close to the chosen probability. Equally important, you will also see that the proportion in a small or moderate number of tosses can be far from this probability.
Probability describes only what happens in the long run. Most people expect chance outcomes to show more short-term regularity than they actually do.
In Chapter 6, we will learn about significance testing and Type I errors. Whenever we perform a significance test, there is a chance that we will make a Type I error, rejecting a null hypothesis that is in fact true. The significance-testing procedure is set up so that the probability of making this kind of error is small, usually 0.05. This is similar to tossing a coin for which the probability of a head is 0.05. If we perform a large number of significance tests when the null hypothesis is true, the proportion of tests in which we make a Type I error will be close to 0.05.
We will learn much more about significance testing in later chapters. For now, we are content with some very general ideas. When the Type I error probability of a significance-testing procedure is set at 0.05, this probability is a characteristic of the procedure; it does not determine the outcome of any single test. For example, if we roll a pair of dice once, we do not know whether or not the sum of the faces will be seven. Similarly, if we perform a significance test once, we do not know whether or not we will make a Type I error. However, because the procedure is designed to have a Type I error probability of 0.05, a Type I error is unlikely in any single test.
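Significance tests themselves wait until Chapter 6, but the long-run behavior just described can be simulated now. The sketch below assumes, purely for illustration, a two-sided z test with known standard deviation; when the null hypothesis is true, about 5% of such tests reject it:

```python
import random
import statistics

random.seed(2)
z_cutoff = 1.96  # two-sided critical value for a 0.05 Type I error probability
n, n_tests = 30, 10_000
rejections = 0
for _ in range(n_tests):
    # draw a sample from N(0, 1); the null hypothesis mu = 0 is actually true
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = statistics.mean(sample) * n ** 0.5  # sigma = 1 is known in this toy setup
    if abs(z) > z_cutoff:
        rejections += 1  # a Type I error: rejecting a true null hypothesis
print(f"proportion of Type I errors: {rejections / n_tests:.4f}")  # near 0.05
```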
In the coin toss setting, the probability of a head is a characteristic of the coin being tossed. The probability of a head for a fair coin is 0.5; that is, it is equally likely to come up heads or tails. If we toss a coin five times and it comes up heads for all five tosses, we might suspect that the coin is not fair. Is this outcome likely if, in fact, the coin is fair? This is what happened with a fair coin in Trial B of Example 4.1.
“Random” in statistics is not a synonym for “unpredictable” but a description of a kind of order that emerges in the long run. We often encounter the unpredictable side of randomness in our everyday experience, but we rarely see enough repetitions of the same random phenomenon to observe the long-term regularity that probability describes. You can see that regularity emerging in Figure 4.1 as the number of tosses grows. In the very long run, the proportion of tosses that give a head is 0.5. This is the intuitive idea of probability. Probability 0.5 means “occurs half the time in a very large number of trials.”
Not all coins are fair. In fact, most real coins have bumps and imperfections that make the probability of heads a little different from 0.5. The probability might be 0.499999 or 0.500002. For our study of probability, we will assume that we know the actual values of probabilities. Thus, we assume things like fair coins, even though we know that real coins are not exactly fair.
We do this to learn what kinds of outcomes we are likely to see when we make such assumptions. When we study statistical inference in later chapters, we look at the situation from the opposite point of view: given that we have observed certain outcomes, what can we say about the probabilities that generated these outcomes?
4.1 Use Table B. We can use the random digits in Table B in the back of the book to simulate tossing a fair coin. Start at line 131 and read the digits from left to right. If a digit is 0, 2, 4, 6, or 8, say that the toss resulted in a head; if it is 1, 3, 5, 7, or 9, the toss resulted in a tail. Use the first 10 random digits on line 131 to simulate 10 tosses of a fair coin. What is the proportion of heads in your simulated sample? Explain why you should not necessarily expect exactly five heads.
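If you would like to check your hand simulation with software, here is a minimal sketch of the same even/odd rule. Table B is not reproduced in this section, so the code draws its own random digits; substituting the first 10 digits of line 131 would reproduce the exercise exactly:

```python
import random

random.seed(131)  # arbitrary seed; these digits are stand-ins for Table B's

digits = [random.randint(0, 9) for _ in range(10)]    # 10 simulated random digits
tosses = ["H" if d % 2 == 0 else "T" for d in digits]  # even -> head, odd -> tail
print("digits:", digits)
print("tosses:", tosses)
print("proportion of heads:", tosses.count("H") / len(tosses))
```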
Probability describes what happens in very many trials, and we must actually observe many trials to pin down a probability. In the case of tossing a coin, some diligent people have in fact made thousands of tosses.
The French naturalist Count Buffon (1707–1788) tossed a coin 4040 times. Result: 2048 heads, a proportion of 2048/4040 = 0.5069.
Around 1900, the English statistician Karl Pearson heroically tossed a coin 24,000 times. Result: 12,012 heads, a proportion of 0.5005.
While imprisoned by the Germans during World War II, the South African statistician John Kerrich tossed a coin 10,000 times. Result: 5067 heads, a proportion of 0.5067.
That some things are random is an observed fact about the world. The outcome of a coin toss, the time between emissions of particles by a radioactive source, and the sexes of the next litter of lab rats are all random. So is the outcome of a random sample or a randomized experiment. Probability theory is the branch of mathematics that describes random behavior. Of course, we can never observe a probability exactly. We could always continue tossing the coin, for example. Mathematical probability is an idealization based on imagining what would happen in an indefinitely long series of trials.
The best way to understand randomness is to observe random behavior—not only the long-run regularity but also the unpredictable results of short runs. You can do this with physical devices such as coins and dice, but software simulations of random behavior allow faster exploration. As you explore randomness, remember:
If we think about tossing a fair coin many times, the probability of a head is one-half on every toss. The coin has no memory: whether a head or a tail appears on one toss does not change the probability of a head on the next toss. The trials are independent.
Consider the daily weather records for a year. These are not independent trials. We know it will be warm in summer months and cool in winter months, so the temperature on a particular day depends on the temperatures of previous days. Unlike the coin-tossing example, there is no single probability of a warm day that applies to every trial in the year-long record.
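Independence has a consequence that simulation can make visible: for a fair coin, the proportion of heads among tosses that follow a head should be essentially the same as the proportion among tosses that follow a tail. A minimal sketch:

```python
import random

random.seed(3)
tosses = [random.random() < 0.5 for _ in range(100_000)]  # True means a head

# group each toss by the outcome of the toss that preceded it
after_head = [b for a, b in zip(tosses, tosses[1:]) if a]
after_tail = [b for a, b in zip(tosses, tosses[1:]) if not a]
print(f"proportion of heads after a head: {sum(after_head) / len(after_head):.4f}")
print(f"proportion of heads after a tail: {sum(after_tail) / len(after_tail):.4f}")
```

Both proportions come out near 0.5: the coin really has no memory.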
Probability theory originated in the study of games of chance. Tossing dice, dealing shuffled cards, and spinning a roulette wheel are examples of deliberate randomization. In that respect, they are similar to random sampling. Although games of chance are ancient, they were not studied by mathematicians until the sixteenth and seventeenth centuries.
It is only a mild simplification to say that probability as a branch of mathematics arose when seventeenth-century French gamblers asked the mathematicians Blaise Pascal and Pierre de Fermat for help. Gambling is still with us, in casinos and state lotteries. We will make use of games of chance as simple examples that illustrate the principles of probability.
Careful measurements in astronomy and surveying led to further advances in probability in the eighteenth and nineteenth centuries because the results of repeated measurements are random and can be described using distributions much like those arising from random sampling. Similar distributions appear in data on human life span (mortality tables) and in data on lengths or weights in a population of skulls, leaves, or cockroaches.1
Now, we employ the mathematics of probability to describe the flow of traffic through a highway system, the Internet, or a computer processor; the genetic makeup of individuals or populations; the energy states of subatomic particles; the spread of pandemics or tweets; and the rate of return on risky investments. Although we are interested in probability because of its usefulness in statistics, the mathematics of chance is important in many fields of study.
A random phenomenon has outcomes that we cannot predict but that nonetheless have a regular distribution in very many repetitions.
The probability of an event is the proportion of times the event occurs in many repeated trials of a random phenomenon.
Trials are independent if the outcome of one trial does not influence the outcome of any other trial.
Simulation is often used to study probability because random processes can be repeated numerous times very quickly.
4.1 Are these phenomena random? Identify each of the following phenomena as random or not. Give reasons for your answers.
You roll a six-sided die that has one to six spots on each side.
The last digit of your phone number.
You draw a 10 from a well-shuffled deck of 52 cards.
Your date of birth.
4.2 Two possible outcomes. Explain why a random phenomenon with two possible outcomes does not necessarily have 0.5 as the probability for each of the outcomes.
4.3 Interpret the probabilities. Refer to Exercise 4.1. In each case, interpret the term probability for the phenomena that are random. For those that are not random, explain why the term probability does not apply.
4.4 Winning at craps. The game of craps starts with a “come-out” roll, in which the shooter rolls a pair of dice. If the total of the “spots” on the up-faces is 7 or 11, the shooter wins immediately. (There are ways that the shooter can win on later rolls if other numbers are rolled on the come-out roll.) Roll a pair of dice 25 times and estimate the probability that the shooter wins immediately on the come-out roll. For a pair of perfectly made dice, the probability is 2/9, approximately 0.2222. How close is your estimate?
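The exercise asks for 25 physical rolls; a computer can show what a much larger simulation gives. A minimal sketch, assuming fair six-sided dice:

```python
import random

random.seed(4)
n_rolls = 100_000
wins = 0
for _ in range(n_rolls):
    total = random.randint(1, 6) + random.randint(1, 6)  # sum of two dice
    if total in (7, 11):  # an immediate win on the come-out roll
        wins += 1
print(f"estimated probability of an immediate win: {wins / n_rolls:.4f}")
# exact value for perfectly made dice: 8/36 = 2/9, about 0.2222
```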
4.5 Are the trials independent? For each of the following situations, identify the trials as independent or not. Explain your answers.
Throw a ball at a target 100 times. For the first throw, you use your right hand, and for the second you use your left hand. You continue alternating right and left hands for the 100 throws.
You ask 100 people whether they prefer the color blue or the color green.
The number of tweets that you receive on the next 10 Mondays.
Your grades in the five courses that you are taking this semester.
4.6 Use the Probability applet. The idea of probability is that the proportion of heads in many tosses of a balanced coin eventually gets close to 0.5. But does the actual count of heads get close to one-half the number of tosses? Let’s find out. Set the “Probability of Heads” in the Probability applet to 0.5 and the number of tosses to 100. You can extend the number of tosses by tossing again to get 100 more. Don’t reset the applet during this exercise.
After 100 tosses, what is the proportion of heads? What is the count of heads? What is the difference between the count of heads and 50 (one-half the number of tosses)?
Keep going to 200 tosses. Again record the proportion and count of heads and the difference between the count and 100 (half the number of tosses).
Keep going. Stop at 300 tosses and again at 400 tosses and record the same facts. Although it may take a long time, the laws of probability say that the proportion of heads will always get close to 0.5 and also that the difference between the count of heads and half the number of tosses will always grow without limit.
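If you prefer code to the applet, the following sketch does the same bookkeeping: at each checkpoint it reports the proportion of heads, the count of heads, and the difference between the count and half the number of tosses. Four hundred tosses only hint at the pattern; much longer runs make the growth of the difference clearer:

```python
import random

random.seed(5)
heads = 0
for i in range(1, 401):
    if random.random() < 0.5:
        heads += 1
    if i % 100 == 0:  # report at 100, 200, 300, and 400 tosses
        print(f"{i} tosses: proportion {heads / i:.3f}, count {heads}, "
              f"count - {i // 2} = {heads - i // 2}")
```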
4.7 A question about dice. Here is a question that a French gambler asked the mathematicians Fermat and Pascal at the very beginning of probability theory: What is the probability of getting at least one 6 in rolling four dice? The Law of Large Numbers applet allows you to roll several dice and watch the outcomes. (Ignore the title of the applet for now.) Because simulation, just like a real random phenomenon, often takes very many trials to estimate a probability accurately, let’s simplify the question: Is this probability clearly greater than 0.5, clearly less than 0.5, or quite close to 0.5? Use the applet to roll four dice until you can confidently answer this question. You will have to set “Rolls” to 1 so that you have time to look at the four up-faces. Keep rolling the dice again and again. How many times did you roll four dice? What percent of your rolls produced at least one 6?
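For comparison with your applet results, a short simulation (our own sketch, separate from the applet) estimates the same probability by rolling many groups of four dice:

```python
import random

random.seed(6)
n_trials = 100_000
successes = 0
for _ in range(n_trials):
    if any(random.randint(1, 6) == 6 for _ in range(4)):  # at least one 6?
        successes += 1
print(f"estimated probability of at least one 6: {successes / n_trials:.4f}")
# exact answer: 1 - (5/6)**4 = 0.5177..., only slightly greater than 0.5
```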