






An introduction to probability theory, explaining key concepts such as probability rules, joint probability, additive probability, and conditional probability. It includes examples and calculations to help illustrate these concepts. The document also covers Bayes' Rule and the relationship between marginal and joint probabilities.
Typology: Study notes
Probability
a number between 0 and 1 that indicates how likely it is that a specific event or set of events will occur.
Simple experiment
some well-defined act or process that leads to a single well-defined outcome. For example, a coin toss will yield either heads or tails; a birth will yield either a boy or a girl. (NOTE: Statisticians do NOT use the term "experiment" in the same way a social psychologist or a chemist would.)
Sample space
the set of all possible distinct outcomes of an experiment. For example, if you toss a coin once, the possible outcomes are H or T; toss it twice, and the possible outcomes are HH, HT, TH, and TT.
Sample point, or elementary event
any member of the sample space. One possible result of a single trial of the experiment. e.g., getting a Heads when tossing a coin; getting the Ace of Hearts when pulling a card from a deck.
Event, or event class
some subset of the outcomes of an experiment; any set of elementary events. e.g. getting a “heart” when you pull a card from a deck is achieved by 13 different elementary events.
Mutually exclusive outcomes
Any set of events that cannot occur simultaneously. For example, for the variable GENDER, a person cannot be both male and female. Conversely, for ETHNICITY, an individual could claim both European and Asian ethnic heritages.
Independent events
events that have nothing to do with each other; the occurrence of one event in no way affects the occurrence of the other. For example, the result of one coin toss does not affect the possible value of the next.
NOTE: Mutually exclusive and independent are not one and the same!!! If someone is Male, we know they are not female; male and female are mutually exclusive events. But, if one coin toss comes up heads, we know nothing about the value of the next coin toss.
PROBABILITY AXIOMS:
P(Ei) = (Number of elementary events in Ei) / (Total number of possible events)
For example, if E1 = getting a 1 when you roll a fair die, P(E1) = 1/6; if E2 = getting an even number when you roll a fair die, P(E2) = 3/6.
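The counting formula can be checked with a few lines of Python. This is only a sketch: the helper name `classical_probability` is made up for illustration, and it assumes every elementary event in the sample space is equally likely (as with a fair die).

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Elementary events in the event divided by all possible elementary events.

    Assumes all elementary events are equally likely.
    """
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}                         # sample space for one roll
p_one = classical_probability({1}, die)          # getting a 1
p_even = classical_probability({2, 4, 6}, die)   # getting an even number

print(p_one)    # 1/6
print(p_even)   # 1/2 (Fraction reduces 3/6)
```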
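Summary of probability rules: Let A and B be two events of interest in a particular experiment. (The rules will make more sense once you see the examples.)

Complements - Prob that A does not occur
General rule: P(Ā) = 1 - P(A)

Conditional probability - Prob that A occurs given that B has or will occur ["Probability of A given B"]
General rule: P(A | B) = P(A ∩ B) / P(B)
Rule for mutually exclusive events: P(A | B) = 0
Rule for independence: P(A | B) = P(A)

Joint probability - Prob that both A & B occur in one replication of the experiment; the prob of the intersection of A & B ["Probability of A and B"]
General rule: P(A ∩ B) = P(B)P(A | B) = P(A)P(B | A)
Rule for mutually exclusive events: P(A ∩ B) = 0
Rule for independence: P(A ∩ B) = P(A)P(B)

Additive probability, aka probability of a union - Prob that either A or B or both occur in one replication of the experiment ["Probability of A or B"]
General rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Rule for mutually exclusive events: P(A ∪ B) = P(A) + P(B)
Rule for independence: P(A ∪ B) = P(A) + P(B) - P(A)P(B)

Marginal probability
P(A) = ∑i P(A ∩ Ei) = ∑i P(Ei)P(A | Ei), where the Ei are mutually exclusive and exhaustive events

Bayes' Rule (another formula for conditional probability)
P(Ei | A) = P(Ei)P(A | Ei) / ∑j P(Ej)P(A | Ej)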
[Figure: Venn diagram with three circles labeled A, B, and C]
Again, you are given marginal probabilities. These happen to be mutually exclusive events, so add them all up and you get P(Dying in a year) = 15%. Everything not contained in the three circles reflects the 85% of the population that does not die.
[Figure: population divided into four groups: White Females, White Males, Black Females, Black Males]
Here, you are given several joint probabilities, e.g. the probability of being both white and female = P(white ∩ female) = .50. From the information given, you could easily determine the marginal probabilities for race and gender, e.g. P(White) = P(White ∩ Female) + P(White ∩ Male) = .85. Similarly, P(Black) = .15, P(Female) = .59, P(Male) = .41.
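As a rough illustration, the marginals can be recovered from a joint table by summing over the other variable. Only P(White ∩ Female) = .50 is stated directly in the text; the other three cells below are assumptions backed out from the stated marginals (.85, .15, .59, .41), and the helper name `marginal` is hypothetical.

```python
# Joint probabilities keyed by (race, gender).
# Only the (White, Female) cell is quoted in the text; the remaining cells
# are inferred from the stated marginals and should be read as assumptions.
joint = {
    ("White", "Female"): 0.50,
    ("White", "Male"): 0.35,
    ("Black", "Female"): 0.09,
    ("Black", "Male"): 0.06,
}

def marginal(joint, position, value):
    """Sum the joint probabilities whose key matches `value` at `position`."""
    return sum(p for key, p in joint.items() if key[position] == value)

p_white = marginal(joint, 0, "White")
p_black = marginal(joint, 0, "Black")
p_female = marginal(joint, 1, "Female")
p_male = marginal(joint, 1, "Male")

for name, p in [("White", p_white), ("Black", p_black),
                ("Female", p_female), ("Male", p_male)]:
    print(f"P({name}) = {p:.2f}")
```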
[Figure: Venn diagram with two overlapping circles labeled A and B]
You have two marginal probabilities here: P(Football) = .10 and P(Track) = .08. There is one conditional probability: P(Track Team | Football team) = .50. And, there is one additive probability: P(Football ∪ Track) = .13.
Why isn’t this last number .18? Because track and football are not mutually exclusive events; some boys were in both sports, as is shown by the overlap in the two circles. Since 10% of the boys are in football and half of those are also in track, the probability of being on both the track and football teams is .05, i.e. P(Football ∩ Track) = P(Football) * P(Track Team | Football team) = .10 * .50 = .05. Hence, to find the probability of being on football or track, you take P(Football) + P(Track) – P(Football ∩ Track) = .10 + .08 - .05 = .13.
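The same arithmetic, written out as a short sketch (the variable names are just labels chosen for this example):

```python
# Given in the example
p_football = 0.10
p_track = 0.08
p_track_given_football = 0.50

# Joint probability: P(Football ∩ Track) = P(Football) * P(Track | Football)
p_both = p_football * p_track_given_football

# Additive rule: P(Football ∪ Track) = P(Football) + P(Track) - P(Football ∩ Track)
p_either = p_football + p_track - p_both

# What you would get by wrongly treating the events as mutually exclusive
p_naive = p_football + p_track

print(f"P(both)   = {p_both:.2f}")
print(f"P(either) = {p_either:.2f}")
print(f"naive sum = {p_naive:.2f}")
```

The naive sum overshoots by exactly the joint probability, which is the overlap that would otherwise be counted twice.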
Probability – Math Examples
1. In a family of 11 children, what is the probability that there will be more boys than girls?
SOLUTION. The easiest way to solve this is via the complements rule. Let event A = having more boys than girls. Ā is therefore having more girls than boys (with 11 children, a tie is impossible). If each child is equally likely to be a boy or a girl, these two events are equally likely, so P(A) = .50. Note that the problem would be a little more difficult if there were 10 children, since you would then have to figure out the probability that there were an equal number of boys and girls. We'll see later how to do that.
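For readers who want to verify the answer by brute force, here is a minimal sketch using the binomial formula. It assumes births are independent and that each child is a boy with probability .5; the function names are invented for this example.

```python
from math import comb

def prob_k_boys(n, k, p=0.5):
    """Binomial probability of exactly k boys among n children."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_more_boys(n, p=0.5):
    """Probability that boys strictly outnumber girls among n children."""
    return sum(prob_k_boys(n, k, p) for k in range(n // 2 + 1, n + 1))

p_11 = prob_more_boys(11)     # odd family size: no tie is possible
p_tie_10 = prob_k_boys(10, 5) # even family size: 5 boys and 5 girls
p_10 = prob_more_boys(10)     # equals (1 - P(tie)) / 2 by symmetry

print(f"11 children: {p_11:.4f}")
print(f"10 children: {p_10:.4f} (tie: {p_tie_10:.4f})")
```

With 10 children the tie takes up part of the probability, so "more boys" falls below one half.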
2. You are playing Let's Make a Deal with Monty Hall. You are offered your choice of door #1, door #2, or door #3. Monty tells you that goats are behind two of the doors, but behind the other door is a new car. You choose door #1. Monty, who knows what is behind each door, then opens door #3, revealing a goat. He then offers you the choice of either keeping your own door, #1, or else switching to Monty's remaining door, #2. Should you switch?
SOLUTION. The easiest way to see this is by using the complements rule. Let A = switch, Ā = doesn't switch. Note that resolving not to switch is the same as not having the option to
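One way to check the answer independently of the complements argument is to enumerate every case exactly. The sketch below is an illustration under stated assumptions, not the notes' own method: the car is equally likely to be behind each door, you always pick door #1, and when Monty has two goat doors to choose from he opens either with probability 1/2. The function name `win_probability` is hypothetical.

```python
from fractions import Fraction

def win_probability(switch):
    """Exact probability of winning the car when the player picks door 1."""
    doors = (1, 2, 3)
    pick = 1
    total = Fraction(0)
    for car in doors:  # car location, each with probability 1/3
        # Monty may only open a door that hides a goat and is not the pick
        options = [d for d in doors if d != pick and d != car]
        for opened in options:
            p = Fraction(1, 3) * Fraction(1, len(options))
            if switch:
                # move to the one door that is neither picked nor opened
                final = next(d for d in doors if d not in (pick, opened))
            else:
                final = pick
            if final == car:
                total += p
    return total

p_stay = win_probability(switch=False)
p_switch = win_probability(switch=True)
print(p_stay, p_switch)   # 1/3 2/3
```

Under these assumptions, staying wins only when the first pick was right (1/3), so switching wins the remaining 2/3.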
A little less than half (44%) of the readers are Protestant, and of those more than 2/3 (68%) are Democrats. That is, if a reader is Protestant, there is better than a 2:1 chance they are also a Democrat.
c. What proportion of the population is either Democrat or Protestant, or both?
SOLUTION. This is a question about the probability of a union. Note that P(Democrat ∪ Protestant) = P(Democrat) + P(Protestant) - P(Democrat ∩ Protestant).
Note that we can’t just add up the % Protestant and the % Democrat, because then the Protestant Democrats would get counted twice (once as Protestants and then again as Democrats).
4. Prove that Race and Gender are independent in the following table:
Race \ Gender    Male    Female    Total
White              35        35       70
Nonwhite           15        15       30
Total              50        50      100
SOLUTION. A = White, B = Male, P(A) = .70, P(B) = .50, P(A ∩ B) = .35. A and B are independent because P(A ∩ B) = P(A)P(B) = .70 * .50 = .35.
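The solution checks one cell of the table. As a hedged sketch, the same test can be run on every cell, comparing each joint probability with the product of its marginals; the function name `is_independent` and the tolerance are choices made for this example.

```python
# Cell counts from the table, keyed by (race, gender)
counts = {
    ("White", "Male"): 35, ("White", "Female"): 35,
    ("Nonwhite", "Male"): 15, ("Nonwhite", "Female"): 15,
}

def is_independent(counts, tol=1e-9):
    """True if P(row ∩ col) equals P(row) * P(col) for every cell."""
    total = sum(counts.values())
    rows = {r for r, _ in counts}
    cols = {c for _, c in counts}
    for r in rows:
        p_row = sum(counts[(r, c)] for c in cols) / total
        for c in cols:
            p_col = sum(counts[(x, c)] for x in rows) / total
            p_joint = counts[(r, c)] / total
            if abs(p_joint - p_row * p_col) > tol:
                return False
    return True

print(is_independent(counts))   # True

# A made-up table with the same marginals but dependent cells, for contrast
dependent = {
    ("White", "Male"): 40, ("White", "Female"): 30,
    ("Nonwhite", "Male"): 10, ("Nonwhite", "Female"): 20,
}
print(is_independent(dependent))   # False
```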
5. A new, less expensive method has been developed for testing for the AIDS virus. Fifty percent of the people who test positive actually have AIDS. Of those who test negative, 5% have AIDS. Twenty percent of the population tests positive.
a. What percentage of the population will receive false positive scores - that is, the test will say they have AIDS when they really don't? [i.e. what % will be unnecessarily scared?]
b. What percentage of the population will receive false negative scores - that is, the test will say they don't have AIDS when they really do? [i.e. what % will have a false sense of security?]
c. What percentage of the population has AIDS? (Don't worry, these are fictitious numbers.)
d. What is the probability that someone who has AIDS will test positive?
SOLUTION. Let us use the following terms:
A = has AIDS, Ā = does not have AIDS, E1 = positive test, E2 = negative test
We are told:
P(E1) = P(Positive test) = .20, implying P(E2) = P(Negative test) = .80.
P(A | E1) = P(having AIDS if you test positive) = .50, implying P(Ā | E1) = P(not having AIDS if you test positive) = .50.
P(A | E2) = P(having AIDS if you test negative) = .05, implying P(Ā | E2) = P(not having AIDS if you test negative) = .95.
a. We are asked to find what percentage of the population does not have AIDS and receives a positive score, that is, P(E1 ∩ Ā). Use joint probability.
P(E1 ∩ Ā) = P(E1)P(Ā | E1) = .20 * .50 = .10
That is, half of the 20% of the population that tests positive will needlessly worry that they have AIDS.
b. We are asked to find what percentage of the population has AIDS and receives a negative score, that is, P(A ∩ E2). Again use joint probability.
P(A ∩ E2) = P(E2)P(A | E2) = .80 * .05 = .04
That is, 5% of the 80% who test negative, or 4% altogether, will have AIDS but still test negative.
c. We are asked to find P(A), i.e. the probability of having AIDS. Use the marginal probability theorem:
P(A) = ∑ P(Ei)P(A | Ei) = (.20 * .50) + (.80 * .05) = .14
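The calculations for parts a through c can be reproduced in a few lines. Part d is not worked out in the text shown here; the sketch below assumes it is answered with Bayes' Rule, P(E1 | A) = P(E1)P(A | E1) / P(A), using the same fictitious numbers. Variable names are illustrative only.

```python
# Given (fictitious) numbers from the problem
p_pos = 0.20              # P(E1): tests positive
p_neg = 1 - p_pos         # P(E2): tests negative
p_aids_given_pos = 0.50   # P(A | E1)
p_aids_given_neg = 0.05   # P(A | E2)

# a. False positives: joint probability of testing positive and not having AIDS
false_positive = p_pos * (1 - p_aids_given_pos)

# b. False negatives: joint probability of testing negative and having AIDS
false_negative = p_neg * p_aids_given_neg

# c. Marginal probability of having AIDS, summed over both test results
p_aids = p_pos * p_aids_given_pos + p_neg * p_aids_given_neg

# d. Bayes' Rule (assumed approach): P(E1 | A)
p_pos_given_aids = p_pos * p_aids_given_pos / p_aids

print(f"a. false positives: {false_positive:.2f}")
print(f"b. false negatives: {false_negative:.2f}")
print(f"c. P(AIDS):         {p_aids:.2f}")
print(f"d. P(pos | AIDS):   {p_pos_given_aids:.3f}")
```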
6. A researcher is doing a study of gender discrimination in the American labor force. She has come up with a 3-part classification of occupations (Occupation 1, Occupation 2, and Occupation 3) and a 2-part classification for wages (“good” and “bad”). She finds that, by gender, the distribution of occupation and wages is as follows:
                     Women                      Men
Pay \ Occ    Occ 1   Occ 2   Occ 3    Occ 1   Occ 2   Occ 3
Good Pay       20%      7%     10%       7%     10%     60%
Bad Pay        50%      8%      5%       8%      5%     10%
From the table, it is immediately apparent that 37% of all women receive good pay, compared to 77% of the men. At the same time, it is also very clear that the types of occupations are very different for men and women. For women, 70% are in occupation 1, which pays poorly, while 70% of men are in occupation 3, which pays very well. Therefore, the researcher wants to know whether differences in the types of occupations held by men and women account for the wage differential between them. How can she address this question?
SOLUTION. This problem is best addressed by asking a “what if” sort of question: Suppose women were distributed across occupations the same way men were, but within each occupation had the same wage structure that they do now. If differences in types of occupations alone account for the wage discrepancies, then this approach should control for those differences and wage differentials should disappear.
We will use the following terms:
Event A = Receives good pay, Ā = Bad pay, Ei = Employed in occupation i.
Given these definitions, this problem requires that we combine the occupational distribution for men, P(Ei)^M, with the conditional probabilities that a woman receives good wages given the occupation she is in, P(A | Ei)^W.
For men, P(E1)^M = .15, P(E2)^M = .15, P(E3)^M = .70.
For women, P(A | E1)^W = 2/7, P(A | E2)^W = 7/15, P(A | E3)^W = 10/15.
Using the marginal probability theorem, we get
P(A) = ∑ P(Ei)^M P(A | Ei)^W = (.15 * 2/7) + (.15 * 7/15) + (.70 * 10/15) = .58
Hence, if women had the same occupational distribution as men while continuing to make the same salaries within occupations that they do now, 58% of women would make good wages. This is much more than the 37% of women who currently make good wages, but still well short of the male figure of 77%. Differences in occupational structure account for much of the difference between men and women, but not all.
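The "what if" calculation generalizes to any pairing of one group's distribution with another group's rates. The following sketch uses the percentages from the table; the function names (`occupation_share`, `good_pay_rate`, `standardized_good_pay`) are hypothetical helpers, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Percent of each gender in (good pay, bad pay) by occupation, from the table
women = {1: (20, 50), 2: (7, 8), 3: (10, 5)}
men = {1: (7, 8), 2: (10, 5), 3: (60, 10)}

def occupation_share(group):
    """P(Ei): share of the group employed in each occupation."""
    return {occ: Fraction(good + bad, 100) for occ, (good, bad) in group.items()}

def good_pay_rate(group):
    """P(A | Ei): share of those in each occupation who receive good pay."""
    return {occ: Fraction(good, good + bad) for occ, (good, bad) in group.items()}

def standardized_good_pay(distribution_from, rates_from):
    """Sum of P(Ei) from one group times P(A | Ei) from another group."""
    shares = occupation_share(distribution_from)
    rates = good_pay_rate(rates_from)
    return sum(shares[occ] * rates[occ] for occ in shares)

actual_women = standardized_good_pay(women, women)   # 37/100
actual_men = standardized_good_pay(men, men)         # 77/100
what_if = standardized_good_pay(men, women)          # men's jobs, women's pay rates

print(float(actual_women), float(actual_men), round(float(what_if), 2))
```

Swapping the arguments, `standardized_good_pay(women, men)`, would answer the mirror-image question of how men would fare with women's occupational distribution.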
NOTE: This sort of “what if” question comes up all the time. For example, in demography it is often difficult to compare death rates across populations, because one population might be relatively old (and hence has a lot of people at high risk of dying) while another is relatively young. Therefore, a common approach is to standardize using the age composition of one population and the age-specific death rates of the other. For example, Mexico and the United States have very similar crude death rates, i.e. about the same proportion of people die in both countries each year. This is highly deceptive because high fertility in recent years has caused Mexico's population to be much younger than is the United States'.