Understanding Binomial Distribution: Success & Failure Probabilities in Bernoulli Trials, Study notes of Statistics

Study notes on the binomial distribution: the probabilities of success and failure in independent Bernoulli trials, the binomial distribution formula, permutations and combinations, with worked examples.

Typology: Study notes, 2021/2022. Uploaded on 09/12/2022 by alfred67.


The Binomial Distribution

A. It would be very tedious if, every time we had a slightly different problem, we had to determine the probability distributions from scratch. Luckily, there are enough similarities between certain types, or families, of experiments, to make it possible to develop formulas representing their general characteristics.

For example, many experiments share the common element that their outcomes can be classified into one of two events, e.g. a coin can come up heads or tails; a child can be male or female; a person can die or not die; a person can be employed or unemployed. These outcomes are often labeled as “success” or “failure.” Note that there is no connotation of “goodness” here - for example, when looking at births, the statistician might label the birth of a boy as a “success” and the birth of a girl as a “failure,” but the parents wouldn’t necessarily see things that way. The usual notation is

p = probability of success, q = probability of failure = 1 - p.

Note that p + q = 1. In statistical terms, a Bernoulli trial is a single repetition of an experiment involving only two possible outcomes.

We are often interested in the result of independent, repeated Bernoulli trials, i.e. the number of successes in repeated trials.

  1. independent - the result of one trial does not affect the result of another trial.
  2. repeated - conditions are the same for each trial, i.e. p and q remain constant across trials. Hayes refers to this as a stationary process. If p and q can change from trial to trial, the process is nonstationary. The term identically distributed is also often used.

B. A binomial distribution gives us the probabilities associated with independent, repeated Bernoulli trials. In a binomial distribution the probabilities of interest are those of receiving a certain number of successes, r, in n independent trials each having only two possible outcomes and the same probability, p, of success. So, for example, using a binomial distribution, we can determine the probability of getting 4 heads in 10 coin tosses.

How does the binomial distribution do this? Basically, a two part process is involved. First, we have to determine the probability of one possible way the event can occur, and then determine the number of different ways the event can occur. That is,

P(Event) = (Number of ways event can occur) * P(One occurrence).

Suppose, for example, we want to find the probability of getting 4 heads in 10 tosses. In this case, we’ll call getting a heads a “success.” Also, in this case, n = 10, the number of successes is r = 4, and the number of failures (tails) is n – r = 10 – 4 = 6. One way this can occur is if the first 4 tosses are heads and the last 6 are tails, i.e.

S S S S F F F F F F

The likelihood of this occurring is

P(S) * P(S) * P(S) * P(S) * P(F) * P(F) * P(F) * P(F) * P(F) * P(F)

More generally, if p = probability of success and q = 1 – p = probability of failure, the probability of a specific sequence of outcomes where there are r successes and n-r failures is

p^r q^(n-r)

So, in this particular case, p = q = .5, r = 4, n-r = 6, so the probability of 4 straight heads followed by 6 straight tails is .5^4 * .5^6 = .5^10 = 0.0009765625 (or 1 out of 1024).

Of course, this is just one of many ways that you can get 4 heads; further, because the repeated trials are all independent and identically distributed, each way of getting 4 heads is equally likely, e.g. the sequence S S S S F F F F F F is just as likely as the sequence S F S F F S F F S F. So, we also need to know how many different combinations produce 4 heads.
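The notes suggest we could "write them all out" by brute force. That is actually feasible here; the following sketch (not part of the original notes) enumerates all 2^10 equally likely sequences and confirms the two-part logic P(Event) = (number of ways) * P(one occurrence):

```python
from itertools import product

# Enumerate every possible sequence of 10 tosses ('S' = heads, 'F' = tails).
sequences = list(product("SF", repeat=10))                  # 2^10 = 1024 sequences
ways = sum(1 for seq in sequences if seq.count("S") == 4)   # sequences with exactly 4 successes

p_one_sequence = 0.5 ** 4 * 0.5 ** 6   # probability of any single specific sequence
p_four_heads = ways * p_one_sequence   # (number of ways) * P(one occurrence)

print(ways)           # 210 ways to get exactly 4 heads
print(p_four_heads)   # 210/1024, about .2051
```

Counting the 210 sequences by hand is exactly the tedium the counting rules below are meant to avoid.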

Well, we could just write them all out…but life will be much simpler if we take advantage of two counting rules:

1. The number of different ways that N distinct things may be arranged in order is N! = (1)(2)(3)...(N-1)(N), (where 0! = 1).

An arrangement in order is called a permutation, so that the total number of permutations of N objects is N!. The symbol N! is called N factorial.

EXAMPLE. Rank candidates A, B, and C in order. The possible permutations are: ABC ACB BAC BCA CBA CAB. Hence, there are 6 possible orderings. Note that 3! = (1)(2)(3) = 6.
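The factorial rule can be verified directly; this small Python sketch (an illustration, not from the notes) generates the orderings of the three candidates:

```python
from itertools import permutations
from math import factorial

# All orderings (permutations) of the three candidates A, B, C.
orderings = ["".join(p) for p in permutations("ABC")]
print(orderings)                      # the 6 orderings, e.g. ABC, ACB, BAC, ...
print(len(orderings), factorial(3))   # 6 6
```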

NOTE: Appendix E, Table 6, p. 19 contains a Table of the factorials for the integers 1 through 50. For example, 12! = 4.79002 * 10^8. (Or see Hayes Table 8, p. 947). Your calculator may have a factorial function labeled something like x!

2. The total number of ways of selecting r distinct combinations of N objects, irrespective of order, is

(N choose r) = N! / (r!(N-r)!)

We refer to this as "N choose r." Sometimes the number of combinations is known as a binomial coefficient, and sometimes the notation NCr is used. So, in the present example, the number of ways of getting exactly 4 heads in 10 tosses is 10!/(4!6!) = 210, and P(4 heads) = 210 * .5^10 ≈ .2051.

To put it another way, the random variable X in a binomial distribution can be defined as follows:

Let Xi = 1 if the ith Bernoulli trial is successful, 0 otherwise. Then,

X = ΣXi, where the Xi’s are independent and identically distributed (iid).

That is, X = the # of successes. Hence, any random variable X with probability function given by

P(X = r; N, p) = (N choose r) p^r q^(N-r) = [N! / (r!(N-r)!)] p^r q^(N-r),   X = 0, 1, 2, ..., N

is said to have a binomial distribution with parameters N and p.
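The probability function just defined translates directly into code; here is a minimal Python sketch (assuming Python 3.8+ for math.comb):

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial(n, p) variable: C(n, r) * p^r * q^(n - r)."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# 4 heads in 10 fair coin tosses
print(round(binomial_pmf(4, 10, 0.5), 4))   # 0.2051

# The probabilities over r = 0, ..., n sum to 1 (up to float rounding)
print(sum(binomial_pmf(r, 10, 0.5) for r in range(11)))
```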

EXAMPLE. In each of 4 races, the Democrats have a 60% chance of winning. Assuming that the races are independent of each other, what is the probability that:
a. The Democrats will win 0 races, 1 race, 2 races, 3 races, or all 4 races?
b. The Democrats will win at least 1 race?
c. The Democrats will win a majority of the races?

SOLUTION. Let X equal the number of races the Democrats win.

a. Using the formula for the binomial distribution,

P(X = 4) = [4!/(4!0!)] .60^4 .40^0 = .60^4 = .1296
P(X = 3) = [4!/(3!1!)] .60^3 .40^1 = 4 * .60^3 * .40 = .3456
P(X = 2) = [4!/(2!2!)] .60^2 .40^2 = 6 * .60^2 * .40^2 = .3456
P(X = 1) = [4!/(1!3!)] .60^1 .40^3 = 4 * .60 * .40^3 = .1536
P(X = 0) = [4!/(0!4!)] .60^0 .40^4 = .40^4 = .0256

b. P(at least 1) = P(X ≥ 1) = 1 - P(none) = 1 - P(0) = .9744. Or, P(1) + P(2) + P(3) + P(4) = .9744.

c. P(Democrats will win a majority) = P(X ≥ 3) = P(3) + P(4) = .3456 + .1296 = .4752.
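All three parts of this example can be checked with a few lines of Python; this is a sketch for verification, not part of the original notes:

```python
from math import comb

def binomial_pmf(r, n, p):
    # P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Democrats win each of 4 independent races with p = .60
probs = {r: binomial_pmf(r, 4, 0.60) for r in range(5)}
print(probs)   # r: 0 -> .0256, 1 -> .1536, 2 -> .3456, 3 -> .3456, 4 -> .1296

at_least_one = 1 - probs[0]      # part b
majority = probs[3] + probs[4]   # part c
print(at_least_one, majority)    # .9744 and .4752
```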

EXAMPLE. In a family of 11 children, what is the probability that there will be more boys than girls? Solve this problem WITHOUT using the complements rule.

SOLUTION. You could go through the same tedious process described above, which is what most students did when I first asked this question on an exam. You would compute P(6), P(7), P(8), P(9), P(10), and P(11).

Or, you can look at Appendix E, Table II (or Hays pp. 927-931). Here, both Hayes and I list binomial probabilities for values of N from 1 through 20, and for values of p that range from .05 through .50.

Thus, on page E-5, we see that for N = 11 and p = .50,

P(6) + P(7) + P(8) + P(9) + P(10) + P(11) = .2256 + .1611 + .0806 + .0269 + .0054 + .0005 = .50.

NOTE: Understanding the tables in Appendix E can make things a lot simpler for you!
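If no table is handy, the same sum is a one-liner; this sketch (not from the notes) reproduces the table lookup:

```python
from math import comb

# P(more boys than girls) in 11 births, assuming P(boy) = .50 on each birth:
# sum P(r) for r = 6, ..., 11 with N = 11, p = .50.
p_more_boys = sum(comb(11, r) * 0.5**11 for r in range(6, 12))
print(p_more_boys)   # 0.5
```

The answer is exactly .50 by symmetry: with p = .50 and an odd number of children, "more boys" and "more girls" are equally likely and ties are impossible.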

EXAMPLE. [WE MAY SKIP THIS EXAMPLE IF WE RUN SHORT OF TIME, BUT YOU SHOULD STILL GO OVER IT AND MAKE SURE YOU UNDERSTAND IT]

Use Appendix E, Table II, to once again solve this problem: In each of 4 races, the Democrats have a 60% chance of winning. Assuming that the races are independent of each other, what is the probability that:
a. The Democrats will win 0 races, 1 race, 2 races, 3 races, or all 4 races?
b. The Democrats will win at least 1 race?
c. The Democrats will win a majority of the races?

SOLUTION. It may seem like you can't do this, since the table doesn't list p = .60. However, all you have to do is redefine success and failure. Let success = P(opponents win a race) = .40. The question can then be recast as finding the probability that:
a. The opponents will win 4 races, 3 races, 2 races, 1 race, or none of the races?
b. The opponents will win 0, 1, 2, or 3 races; or, the opponents will not win all the races?
c. The opponents will not win a majority of the races?

We therefore look at page E-4 (or Hayes, p. 927), N = 4 and p = .40, and find that
a. P(4) = .0256, P(3) = .1536, P(2) = .3456, P(1) = .3456, and P(0) = .1296.
b. P(0) + P(1) + P(2) + P(3) = 1 - P(4) = .9744.
c. P(1) + P(0) = .3456 + .1296 = .4752.

In general, for p > .50: To use Table II, substitute 1 - p for p, and substitute N - r for r. Thus, for p = .60 and N = 4, the probability of 1 success can be found by looking up p = .40 and r = 3.
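The substitution rule works because the binomial pmf is symmetric under relabeling success and failure; a quick check (an illustrative sketch, not from the notes):

```python
from math import comb

def binomial_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Symmetry: P(r; N, p) equals P(N - r; N, 1 - p) for every r.
n, p = 4, 0.60
for r in range(n + 1):
    assert abs(binomial_pmf(r, n, p) - binomial_pmf(n - r, n, 1 - p)) < 1e-12

# e.g. 1 success at p = .60 matches 3 successes at p = .40
print(binomial_pmf(1, 4, 0.60), binomial_pmf(3, 4, 0.40))   # both .1536
```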

D. Mean of the binomial distribution. Recall that, for any discrete random variable, E(X) = Σxp(x). Therefore, E(Xi) = Σxp(x) = 0 * (1 - p) + 1 * p = p; that is, the mean of any single Bernoulli trial is p.

g. A MORE COMPLETE LISTING OF THE COUNTING RULES FOR PERMUTATIONS AND COMBINATIONS (OPTIONAL).

Here is a more extensive set of counting rules that can be useful for various problems in probability. They aren’t essential for our immediate purposes, so we probably won’t go over them in class unless we have extra time. But, these aren’t very hard, and they may come in handy for you some day, so I am including them here.

1. NUMBER OF POSSIBLE SEQUENCES FOR N TRIALS. Suppose that a series of N trials were carried out, and that on each trial any of K events might occur. Then the following rule holds:

If any one of K mutually exclusive and exhaustive events can occur on each of N trials, then there are K^N different sequences that may result from a set of such trials.

EXAMPLE. If you toss a die once, any of 6 numbers can show up (K = 6). Ergo, if you toss it 3 times, any of 6^3 = 216 sequences are possible (e.g., 111, 342, 652, etc.). [Your calculator probably has a y^x function or something similar; on mine, I press 6, then y^x, then 3, then =.]

2. SEQUENCES. Sometimes the number of possible events in the first trial of a series is different from the number possible in the second, the second different from the third, etc. That is, K1 ≠ K2 ≠ K3, etc. Under such conditions,

If K1, ..., KN are the numbers of distinct events that can occur on trials 1, ..., N in a series, then the number of different sequences of N events that can occur is (K1)(K2)...(KN).

EXAMPLE. Two occupations and three religions yield 6 combinations of occupation and religion. Tossing a coin (2 outcomes) and tossing a die (6 outcomes) yield 12 possible outcomes. Note that, when Ki = K for all i, rule 1 becomes a special case of rule 2. Note also that, when K1 = 1, K2 = 2, ..., KN = N, rule 3 becomes a special case of rule 2.
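Both rules are easy to confirm by enumeration; in this sketch the category labels are illustrative (the notes name only the counts):

```python
from itertools import product

# Rule 2: trial i has K_i possible outcomes, so sequences multiply.
occupations = ["blue collar", "white collar"]      # K1 = 2 (labels assumed)
religions = ["Protestant", "Catholic", "Jewish"]   # K2 = 3 (labels assumed)
n_pairs = len(list(product(occupations, religions)))
print(n_pairs)                                     # (2)(3) = 6

# Rule 1 as a special case: K = 6 die faces on each of N = 3 tosses.
n_seqs = len(list(product(range(1, 7), repeat=3)))
print(n_seqs)                                      # 6^3 = 216
```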

3. PERMUTATIONS. A rule of extreme importance in probability computations concerns the number of ways that objects may be arranged in order.

The number of different ways that N distinct things may be arranged in order is N! = (1)(2)(3)...(N-1)(N), (where 0! = 1).

An arrangement in order is called a permutation, so that the total number of permutations of N objects is N!. The symbol N! is called N factorial. The notation NPN is also sometimes used for N!, for reasons which will be clear in a moment.

EXAMPLE. Rank candidates A, B, and C in order. The possible permutations are: ABC ACB BAC BCA CBA CAB

There are 6 possible orderings. Note that 3! = (1)(2)(3) = 6.

NOTE: Appendix E, Table 6, p. 19 contains a Table of the factorials for the integers 1 through 50. For example, 12! = 4.79002 * 10^8. (Or see Hayes Table 8, p. 947). Your calculator may have a factorial function labeled something like x!

3B. PERMUTATIONS OF SIMILAR OBJECTS.

Suppose we have N objects, N1 alike, N2 alike, ..., Nk alike (ΣNi = N). Then, the number of ways of arranging these objects is

N! / (N1! N2! ... Nk!)

EXAMPLE. We have 4 balls, 2 red and 2 blue. The possible ways of arranging the balls are BBRR, BRBR, BRRB, RRBB, RBRB, RBBR, or 6 ways altogether. To confirm that there are 6 ways,

N! / (N1! N2!) = 4! / (2! 2!) = 6

EXAMPLE. If we have 6 balls, 2 red, 2 blue, and 2 green, the number of possible ways of arranging them is

N! / (N1! N2! N3!) = 6! / (2! 2! 2!) = 720/8 = 90
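Rule 3B can be checked both by formula and by brute force; a small sketch (not from the notes):

```python
from itertools import permutations
from math import factorial

def arrangements(counts):
    """Rule 3B: N! / (N1! N2! ... Nk!) for group sizes given in counts."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total

# 2 red + 2 blue balls: 4!/(2!2!) = 6; confirm by listing distinct orderings.
distinct = set(permutations("RRBB"))
print(len(distinct), arrangements([2, 2]))   # 6 6

# 2 red + 2 blue + 2 green: 6!/(2!2!2!) = 90
print(arrangements([2, 2, 2]))               # 90
```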

NOTE: I add this rule to Hayes's list because many of the other rules are special cases of it. When Ni = 1 for all i, Rules 3 (Permutations) and 3B are equivalent. When Ni = 1 for i = 1 to r, and Nr+1 = N - r, Rules 4 (Ordered combinations) and 3B are the same. When N1 = r and N2 = N - r, Rules 5 (Combinations) and 3B are the same.

4. ORDERED COMBINATIONS; or, PERMUTATIONS OF N OBJECTS TAKEN r AT A TIME.

Sometimes it is necessary to count the number of ways that r objects might be selected from among some N objects in all (r ≤ N). Further, each different arrangement of the r objects is considered separately.

The number of ways of selecting & arranging r objects from among N distinct objects is

N! / (N - r)!

Verbally, we refer to this as ordered combinations of N things taken r at a time. The notation NPr is sometimes used, and may appear on your calculator using similar notation. Note also that NPN = N!.
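Python exposes this rule directly (math.perm, available in Python 3.8+); a quick illustration, not from the notes:

```python
from math import factorial, perm

# Rule 4: ordered selections of r objects from N distinct objects = N!/(N-r)!
# e.g. select and order 2 of 4 distinct objects:
print(perm(4, 2))                          # 12
print(factorial(4) // factorial(4 - 2))    # same thing via the formula, 12

# And NPN = N!:
print(perm(4, 4), factorial(4))            # 24 24
```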

5. COMBINATIONS. The total number of ways of selecting r distinct combinations of N objects, irrespective of order, is N! / (r!(N-r)!).

EXAMPLE. Select 2 letters from A, B, C, D, ignoring order. The possibilities are AB AC AD BC BD CD, i.e. there are 6 possible combinations. Confirming this with rule #5, we get

N! / (r!(N-r)!) = 4! / (2! 2!) = 6

EXAMPLE. There are 100 applicants for 3 job openings. The number of possible combinations is

N! / (r!(N-r)!) = 100! / (3! 97!) = 161,700

NOTE: You could also use rule 3B. People are divided into 2 groups: the 3 best (N1 = 3) and all the rest (N2 = 97). Hence, rule 3B yields 100!/(3! 97!) = 161,700.

See Appendix E, Table 7, page 20 for NCr values for various values of N and r. (Or see Hayes, Appendix E, Table IX, p. 948.) Your calculator may have a function labeled nCr or something similar.
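The table lookup for combinations is one call in Python (math.comb, Python 3.8+); a sketch, not from the notes:

```python
from math import comb, factorial

# Rule 5 applied to the job example: 100 applicants, 3 openings.
print(comb(100, 3))                                       # 161700
print(factorial(100) // (factorial(3) * factorial(97)))   # same via N!/(r!(N-r)!)
```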

6. Binomial distribution. In sampling from a stationary Bernoulli process, with the probability of success equal to p, the probability of observing exactly r successes in N independent trials is

P(r) = (N choose r) p^r q^(N-r) = [N! / (r!(N-r)!)] p^r q^(N-r)

7. Binomially distributed variables. Let Xi = 1 if the ith Bernoulli trial is successful, 0 otherwise. If X = ΣXi, where the Xi's are independent and identically distributed (iid), then X has a binomial distribution, and E(X) = Np, V(X) = Npq.
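The claims E(X) = Np and V(X) = Npq can be verified directly from the pmf; here is a small check using the election example's parameters (a sketch, not from the notes):

```python
from math import comb

# Exact mean and variance from the pmf for N = 4, p = .60.
N, p = 4, 0.60
q = 1 - p
pmf = [comb(N, r) * p**r * q**(N - r) for r in range(N + 1)]

mean = sum(r * pr for r, pr in enumerate(pmf))
var = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
print(mean, var)   # 2.4 and 0.96, i.e. Np and Npq
```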

SUMMARY OF HIGHLIGHTS

1. NUMBER OF POSSIBLE SEQUENCES FOR N TRIALS. If any one of K mutually exclusive and exhaustive events can occur on each of N trials, then there are K^N different sequences that may result from a set of such trials.

2. SEQUENCES. If K1, ..., KN are the numbers of distinct events that can occur on trials 1, ..., N in a series, then the number of different sequences of N events that can occur is (K1)(K2)...(KN).

3. PERMUTATIONS. The number of different ways that N distinct things may be arranged in order is N! = (1)(2)(3)...(N-1)(N), (where 0! = 1).

3B. PERMUTATIONS OF SIMILAR OBJECTS. Suppose we have N objects, N1 alike, N2 alike, ..., Nk alike (ΣNi = N). Then, the number of ways of arranging these objects is

N! / (N1! N2! ... Nk!)

4. ORDERED COMBINATIONS; or, PERMUTATIONS OF N OBJECTS TAKEN r AT A TIME. The number of ways of selecting & arranging r objects from among N distinct objects is

N! / (N - r)!

5. COMBINATIONS. The total number of ways of selecting r distinct combinations of N objects, irrespective of order, is

(N choose r) = N! / (r!(N-r)!)

6. Binomial distribution. In sampling from a stationary Bernoulli process, with the probability of success equal to p, the probability of observing exactly r successes in N independent trials is

P(r) = (N choose r) p^r q^(N-r) = [N! / (r!(N-r)!)] p^r q^(N-r)

7. Binomially distributed variables. Let Xi = 1 if the ith Bernoulli trial is successful, 0 otherwise. If X = ΣXi, where the Xi's are independent and identically distributed (iid), then X has a binomial distribution, and E(X) = Np, V(X) = Npq.