2 Probability
Copyright Cengage Learning. All rights reserved.

2.2 Axioms, Interpretations, and Properties of Probability

Given an experiment and a sample space $\mathcal{S}$, the objective of probability is to assign to each event $A$ a number $P(A)$, called the probability of the event $A$, which will give a precise measure of the chance that $A$ will occur. To ensure that the probability assignments will be consistent with our intuitive notions of probability, all assignments should satisfy the following axioms (basic properties) of probability.

Axiom 1: For any event $A$, $P(A) \ge 0$.
Axiom 2: $P(\mathcal{S}) = 1$.
Axiom 3: If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then $P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$.

You might wonder why the third axiom contains no reference to a finite collection of disjoint events.

It is because the corresponding property for a finite collection can be derived from our three axioms. We want our axiom list to be as short as possible and not contain any property that can be derived from others on the list. Axiom 1 reflects the intuitive notion that the chance of $A$ occurring should be nonnegative. The sample space is by definition the event that must occur when the experiment is performed ($\mathcal{S}$ contains all possible outcomes), so Axiom 2 says that the maximum possible probability of 1 is assigned to $\mathcal{S}$.

The third axiom formalizes the idea that if we want the probability that at least one of a number of events will occur and no two of the events can occur simultaneously, then the chance of at least one occurring is the sum of the chances of the individual events.

Proposition: $P(\varnothing) = 0$, where $\varnothing$ is the null event (the event containing no outcomes whatsoever). This in turn implies that the property contained in Axiom 3 is valid for a finite collection of disjoint events.

Example 2.11

Consider tossing a thumbtack in the air. When it comes to rest on the ground, either its point will be up (the outcome $U$) or down (the outcome $D$). The sample space for this experiment is therefore $\mathcal{S} = \{U, D\}$. The axioms specify $P(\mathcal{S}) = 1$, so the probability assignment will be completed by determining $P(U)$ and $P(D)$. Since $U$ and $D$ are disjoint and their union is $\mathcal{S}$, the foregoing proposition implies that

$1 = P(\mathcal{S}) = P(U) + P(D)$

It follows that $P(D) = 1 - P(U)$. One possible assignment of probabilities is $P(U) = .5$, $P(D) = .5$, whereas another possible assignment is $P(U) = .75$, $P(D) = .25$. In fact, letting $p$ represent any fixed number between 0 and 1, $P(U) = p$, $P(D) = 1 - p$ is an assignment consistent with the axioms.

Example 2.12

Consider testing batteries coming off an assembly line one by one until one having a voltage within prescribed limits is found. The simple events are $E_1 = \{S\}$, $E_2 = \{FS\}$, $E_3 = \{FFS\}$, $E_4 = \{FFFS\}, \ldots$. Suppose the probability of any particular battery being satisfactory is .99. Then it can be shown that $P(E_1) = .99$, $P(E_2) = (.01)(.99)$, $P(E_3) = (.01)^2(.99), \ldots$ is an assignment of probabilities to the simple events that satisfies the axioms. In particular, because the $E_i$'s are disjoint and $\mathcal{S} = E_1 \cup E_2 \cup E_3 \cup \cdots$, it must be the case that

$1 = P(\mathcal{S}) = P(E_1) + P(E_2) + P(E_3) + \cdots = .99[1 + .01 + (.01)^2 + (.01)^3 + \cdots]$

Here we have used the formula for the sum of a geometric series:

$a + ar + ar^2 + ar^3 + \cdots = \dfrac{a}{1 - r}$

with $a = .99$ and $r = .01$, so that the sum is $.99/(1 - .01) = 1$. However, another legitimate (according to the axioms) probability assignment of the same "geometric" type is obtained by replacing .99 by any other number $p$ between 0 and 1 (and .01 by $1 - p$).

Interpreting Probability

Examples 2.11 and 2.12 show that the axioms do not completely determine an assignment of probabilities to events. The axioms serve only to rule out assignments inconsistent with our intuitive notions of probability. In the tack-tossing experiment of Example 2.11, two particular assignments were suggested. The appropriate or correct assignment depends on the nature of the thumbtack and also on one's interpretation of probability.

The interpretation that is most frequently used and most easily understood is based on the notion of relative frequencies. Consider an experiment that can be repeatedly performed in an identical and independent fashion, and let $A$ be an event consisting of a fixed set of outcomes of the experiment. Simple examples of such repeatable experiments include the tack-tossing and die-tossing experiments previously discussed.

If the experiment is performed $n$ times, on some of the replications the event $A$ will occur (the outcome will be in the set $A$), and on others, $A$ will not occur. Let $n(A)$ denote the number of replications on which $A$ does occur. Then the ratio $n(A)/n$ is called the relative frequency of occurrence of the event $A$ in the sequence of $n$ replications.

For example, let $A$ be the event that a package sent within the state of California for 2nd day delivery actually arrives within one day. The results from sending 10 such packages (the first 10 replications) are as follows:

[Table of results for the first 10 replications not reproduced.]

Figure 2.2(a) shows how the relative frequency $n(A)/n$ fluctuates rather substantially over the course of the first 50 replications.

Figure 2.2 Behavior of relative frequency: (a) Initial fluctuation

But as the number of replications continues to increase, Figure 2.2(b) illustrates how the relative frequency stabilizes.

Figure 2.2 Behavior of relative frequency: (b) Long-run stabilization
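The behavior described for Figure 2.2 can be imitated with a short simulation. The sketch below is only an illustration: it assumes that the event A occurs independently on each replication with a fixed probability of 0.6 (an assumed value, not data from the package experiment), and the function name `relative_frequencies` and the seed are hypothetical choices.

```python
import random


def relative_frequencies(p, n, seed=0):
    """Simulate n independent replications in which event A occurs with
    probability p, and return the running relative frequencies n(A)/n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    count = 0                  # n(A): replications on which A has occurred
    freqs = []
    for i in range(1, n + 1):
        if rng.random() < p:
            count += 1
        freqs.append(count / i)
    return freqs


# Assumed success probability of 0.6, chosen only for illustration.
freqs = relative_frequencies(p=0.6, n=10_000)
for n in (10, 50, 100, 1_000, 10_000):
    print(n, round(freqs[n - 1], 3))
```

For small n the printed ratios typically wander noticeably, while for large n they tend to settle near the assumed value, which is the pattern of initial fluctuation followed by long-run stabilization.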

More generally, empirical evidence, based on the results of many such repeatable experiments, indicates that any relative frequency of this sort will stabilize as the number of replications $n$ increases. That is, as $n$ gets arbitrarily large, $n(A)/n$ approaches a limiting value referred to as the limiting (or long-run) relative frequency of the event $A$. The objective interpretation of probability identifies this limiting relative frequency with $P(A)$.

Suppose that probabilities are assigned to events in accordance with their limiting relative frequencies. Then a statement such as "the probability of a package being delivered within one day of mailing is .6" means that of a large number of mailed packages, roughly 60% will arrive within one day. Similarly, if $B$ is the event that an appliance of a particular type will need service while under warranty, then $P(B) = .1$ is interpreted to mean that in the long run 10% of such appliances will need warranty service.

This doesn't mean that exactly 1 out of 10 will need service, or that exactly 10 out of 100 will need service, because 10 and 100 are not the long run. This relative frequency interpretation of probability is said to be objective because it rests on a property of the experiment rather than on any particular individual concerned with the experiment.

For example, two different observers of a sequence of coin tosses should both use the same probability assignments since the observers have nothing to do with limiting relative frequency.

In practice, this interpretation is not as objective as it might seem, since the limiting relative frequency of an event will not be known. Thus we will have to assign probabilities based on our beliefs about the limiting relative frequency of events under study. Fortunately, there are many experiments for which there will be a consensus with respect to probability assignments.

When we speak of a fair coin, we shall mean $P(H) = P(T) = .5$, and a fair die is one for which limiting relative frequencies of the six outcomes are all $1/6$, suggesting probability assignments $P(\{1\}) = \cdots = P(\{6\}) = 1/6$.

Because the objective interpretation of probability is based on the notion of limiting frequency, its applicability is limited to experimental situations that are repeatable. Yet the language of probability is often used in connection with situations that are inherently unrepeatable. Examples include: "The chances are good for a peace agreement"; "It is likely that our company will be awarded the contract"; and "Because their best quarterback is injured, I expect them to score no more than 10 points against us." In such situations we would like, as before, to assign numerical probabilities to various outcomes and events (e.g., the probability is .9 that we will get the contract).

We must therefore adopt an alternative interpretation of these probabilities. Because different observers may have different prior information and opinions concerning such experimental situations, probability assignments may now differ from individual to individual. Interpretations in such situations are thus referred to as subjective. The book by Robert Winkler listed in the chapter references gives a very readable survey of several subjective interpretations.

More Probability Properties

Proposition: For any event $A$, $P(A) + P(A') = 1$, from which $P(A) = 1 - P(A')$.

Example 2.13

Consider a system of five identical components connected in series, as illustrated in Figure 2.3.

Figure 2.3 A system of five components connected in a series

Denote a component that fails by $F$ and one that doesn't fail by $S$ (for success). Let $A$ be the event that the system fails. For $A$ to occur, at least one of the individual components must fail. Outcomes in $A$ include $SSFSS$ (1, 2, 4, and 5 all work, but 3 does not), $FFSSS$, and so on. There are in fact 31 different outcomes in $A$. However, $A'$, the event that the system works, consists of the single outcome $SSSSS$. We will see in Section 2.5 that if 90% of all such components do not fail and different components fail independently of one another, then $P(A') = P(SSSSS) = (.9)^5 = .59$. Thus $P(A) = 1 - .59 = .41$; so among a large number of such systems, roughly 41% will fail.
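The counting and the complement calculation in Example 2.13 can be checked by brute force. The sketch below is a hedged illustration: it relies on the example's stated assumptions (each component works with probability .9, independently of the others), and the helper name `outcome_probability` is hypothetical.

```python
from itertools import product

P_SUCCESS = 0.9  # from the example: 90% of such components do not fail

# All 2**5 outcomes for five components, e.g. "SSFSS".
outcomes = ["".join(o) for o in product("SF", repeat=5)]

# Event A: the series system fails, i.e. at least one component fails.
failing = [o for o in outcomes if "F" in o]


def outcome_probability(outcome, p=P_SUCCESS):
    """Probability of one outcome, assuming components fail independently."""
    prob = 1.0
    for component in outcome:
        prob *= p if component == "S" else 1 - p
    return prob


# Direct route: add the probabilities of all outcomes in A.
p_fail_direct = sum(outcome_probability(o) for o in failing)
# Complement route: 1 - P(A'), where A' is the single outcome SSSSS.
p_fail_complement = 1 - outcome_probability("SSSSS")

print(len(outcomes), len(failing))
print(round(p_fail_direct, 4), round(p_fail_complement, 4))
```

Both routes give the same value, but the complement route needs one outcome probability instead of 31.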

In general, the foregoing proposition is useful when the event of interest can be expressed as "at least . . . ," since then the complement "less than . . ." may be easier to work with (in some problems, "more than . . ." is easier to deal with than "at most . . ."). When you are having difficulty calculating $P(A)$ directly, think of determining $P(A')$.

Proposition: For any event $A$, $P(A) \le 1$.

This is because $1 = P(A) + P(A') \ge P(A)$ since $P(A') \ge 0$.

When events $A$ and $B$ are mutually exclusive, $P(A \cup B) = P(A) + P(B)$. For events that are not mutually exclusive, adding $P(A)$ and $P(B)$ results in double-counting outcomes in the intersection. The next result shows how to correct for this.

Proposition: For any two events $A$ and $B$, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Proof: Note first that $A \cup B$ can be decomposed into two disjoint events, $A$ and $B \cap A'$; the latter is the part of $B$ that lies outside $A$ (see Figure 2.4). Furthermore, $B$ itself is the union of the two disjoint events $A \cap B$ and $A' \cap B$, so $P(B) = P(A \cap B) + P(A' \cap B)$. Thus

$P(A \cup B) = P(A) + P(B \cap A') = P(A) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B)$

The addition rule for a triple union probability is similar to the foregoing rule.

Proposition: For any three events $A$, $B$, and $C$,

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$

This can be verified by examining a Venn diagram of $A \cup B \cup C$, which is shown in Figure 2.6.

Figure 2.6 $A \cup B \cup C$

When $P(A)$, $P(B)$, and $P(C)$ are added, the intersection probabilities $P(A \cap B)$, $P(A \cap C)$, and $P(B \cap C)$ are all counted twice. Each one must therefore be subtracted. But then $P(A \cap B \cap C)$ has been added in three times and subtracted out three times, so it must be added back.

In general, the probability of a union of $k$ events is obtained by summing individual event probabilities, subtracting double intersection probabilities, adding triple intersection probabilities, subtracting quadruple intersection probabilities, and so on.
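The addition rules for two and three events can be checked numerically on a small example. The sketch below is only an illustration under assumptions not taken from the text: a hypothetical sample space of 12 equally likely outcomes and three arbitrarily chosen events A, B, and C. Exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction

# Hypothetical sample space: 12 equally likely outcomes labeled 1..12.
sample_space = set(range(1, 13))


def prob(event):
    """P(event) when all outcomes in sample_space are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Three illustrative events that overlap pairwise and all together.
A = {n for n in sample_space if n % 2 == 0}   # even outcomes
B = {n for n in sample_space if n % 3 == 0}   # multiples of 3
C = {n for n in sample_space if n <= 6}       # outcomes 1 through 6

# Two-event addition rule.
two_lhs = prob(A | B)
two_rhs = prob(A) + prob(B) - prob(A & B)

# Three-event addition rule: add singles, subtract pairs, add the triple.
three_lhs = prob(A | B | C)
three_rhs = (prob(A) + prob(B) + prob(C)
             - prob(A & B) - prob(A & C) - prob(B & C)
             + prob(A & B & C))

print(two_lhs == two_rhs, three_lhs == three_rhs)
```

Leaving out the final triple-intersection term would make the two sides disagree for these events, since the outcome 6 lies in all three.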

Determining Probabilities Systematically

Consider a sample space that is either finite or "countably infinite" (the latter means that outcomes can be listed in an infinite sequence, so there is a first outcome, a second outcome, a third outcome, and so on; for example, the battery testing scenario of Example 2.12). Let $E_1, E_2, E_3, \ldots$ denote the corresponding simple events, each consisting of a single outcome.

A sensible strategy for probability computation is to first determine each simple event probability, with the requirement that $\sum P(E_i) = 1$. Then the probability of any compound event $A$ is computed by adding together the $P(E_i)$'s for all $E_i$'s in $A$:

$P(A) = \sum_{\text{all } E_i \text{ in } A} P(E_i)$

Example 2.15

During off-peak hours a commuter train has five cars. Suppose a commuter is twice as likely to select the middle car (#3) as to select either adjacent car (#2 or #4), and is twice as likely to select either adjacent car as to select either end car (#1 or #5). Let $p_i = P(\text{car } i \text{ is selected}) = P(E_i)$. Then we have $p_3 = 2p_2 = 2p_4$ and $p_2 = 2p_1 = 2p_5 = p_4$. This gives

$1 = \sum P(E_i) = p_1 + 2p_1 + 4p_1 + 2p_1 + p_1 = 10p_1$

implying $p_1 = p_5 = .1$, $p_2 = p_4 = .2$, $p_3 = .4$. The probability that one of the three middle cars is selected (a compound event) is then $p_2 + p_3 + p_4 = .8$.

Equally Likely Outcomes

In many experiments consisting of $N$ outcomes, it is reasonable to assign equal probabilities to all $N$ simple events. These include such obvious examples as tossing a fair coin or fair die once or twice (or any fixed number of times), or selecting one or several cards from a well-shuffled deck of 52. With $p = P(E_i)$ for every $i$,

$1 = \sum_{i=1}^{N} P(E_i) = \sum_{i=1}^{N} p = p \cdot N$, so $p = \dfrac{1}{N}$

That is, if there are $N$ equally likely outcomes, the probability for each is $1/N$.

Now consider an event $A$, with $N(A)$ denoting the number of outcomes contained in $A$. Then

$P(A) = \sum_{E_i \text{ in } A} P(E_i) = \sum_{E_i \text{ in } A} \dfrac{1}{N} = \dfrac{N(A)}{N}$

Thus when outcomes are equally likely, computing probabilities reduces to counting: determine both the number of outcomes $N(A)$ in $A$ and the number of outcomes $N$ in $\mathcal{S}$, and form their ratio.

Example 2.16

You have six unread mysteries on your bookshelf and six unread science fiction books. The first three of each type are hardcover, and the last three are paperback. Consider randomly selecting one of the six mysteries and then randomly selecting one of the six science fiction books to take on a post-finals vacation to Acapulco (after all, you need something to read on the beach). Number the mysteries 1, 2, . . . , 6, and do the same for the science fiction books.

Then each outcome is a pair of numbers such as (4, 1), and there are $N = 36$ possible outcomes (for a visual of this situation, refer to the table below and delete the first row and column).

With random selection as described, the 36 outcomes are equally likely. Nine of these outcomes are such that both selected books are paperbacks (those in the lower right-hand corner of the referenced table): (4, 4), (4, 5), . . . , (6, 6). So the probability of the event $A$ that both selected books are paperbacks is

$P(A) = \dfrac{N(A)}{N} = \dfrac{9}{36} = .25$
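The counting argument in Example 2.16 can be reproduced by listing all outcomes. The sketch below is a minimal illustration, assuming (as the example states) that books 4, 5, and 6 of each type are the paperbacks and that all pairs are equally likely; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

books = range(1, 7)     # books of each type are numbered 1..6
PAPERBACKS = {4, 5, 6}  # the last three of each type are paperback

# Each outcome is a pair (mystery number, science fiction number).
outcomes = list(product(books, books))

# Event A: both selected books are paperbacks.
event_a = [(m, s) for (m, s) in outcomes
           if m in PAPERBACKS and s in PAPERBACKS]

# Equally likely outcomes: P(A) = N(A) / N.
p_a = Fraction(len(event_a), len(outcomes))
print(len(outcomes), len(event_a), p_a)
```

The same pattern (enumerate the sample space, filter the outcomes in the event, take the ratio of counts) applies to any experiment with finitely many equally likely outcomes.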