I. Confidence interval for π: Motivation
- Want to estimate a population proportion π.
- The sample proportion p is a good estimator.
- But we need to quantify the “uncertainty” in p.
- We’ve done this in an ad-hoc way, by computing the probability that p is within 0.03 or 0.05 or... of π.
- It would be more satisfying if we could construct an interval of values that we’re sure contains the population proportion π.
- For example, we’d like to be able to say “We’re sure that the population proportion is between 0.04 and 0.07.”
- Unfortunately, we can’t do this (unless we choose “useless” intervals like (0, 1)).
- Why not? Because there’s always a small chance that we’ll get a sample that is very unrepresentative of the population.
- For example:
- Population has 1000000 members.
- 10000 are female (so π = 0.01)
- Choose n = 500 at random.
- There’s a (very small) chance that all 500 will be female, leading to p = 1!
- So we’ll have to be satisfied with something like “We’re pretty sure” in place of “We’re sure.”
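To get a feel for just how small that chance is, here is a rough calculation in Python. It assumes the 500 people are drawn without replacement. It works with base-10 logarithms because the probability itself is far too small to store as an ordinary floating-point number. The function name is made up for this illustration.

```python
import math

def log10_prob_all_female(population, females, n):
    """log10 of the chance that n draws without replacement are all female."""
    total = 0.0
    for i in range(n):
        # On draw i, (females - i) females remain among (population - i) people.
        total += math.log10((females - i) / (population - i))
    return total

log10_p = log10_prob_all_female(1_000_000, 10_000, 500)
print(f"P(all 500 female) is about 10^{log10_p:.1f}")
```

The exponent comes out below −1000, so the event is possible but absurdly unlikely, which is why "we're sure" has to become "we're pretty sure."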
III. Confidence interval for π: Derivation
- We know (as long as n is large enough) that p is approximately normal with
- Mean π
- Standard deviation $\sqrt{\pi(1-\pi)/n}$.
- So $\dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$ is approximately standard normal.
- Chapter 4 problem: Find −c and c such that
$$P\left(-c \le \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}} \le c\right) = 0.98.$$
- Answer: c = 2.33. See Figure 1.
[Figure 1: standard normal density (horizontal axis x, vertical axis density) with the area between −c and c shaded. For a 98% confidence interval, we need to find c.]
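- So the interval $p \pm 2.33\sqrt{\pi(1-\pi)/n}$ will work: rearranging the inequality above shows that it contains π with probability approximately 0.98.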
- Failure? We can’t use this interval, because we don’t know π.
- Redemption? We’ll plug in p for π to get the interval $p \pm 2.33\sqrt{p(1-p)/n}$.
- Note: Now we have two potential inaccuracies:
- Approximating the distribution of p by a normal.
- Replacing π by p.
- Rule of thumb: If np > 10 and n(1 − p) > 10, then this method is typically valid, in the sense that the probability is close to 0.98.
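The claim that the probability is "close to 0.98" can be checked numerically. The sketch below is an illustration, not part of the original notes: it assumes the number of successes is exactly Binomial(n, π) and adds up the probability of every sample whose plug-in interval contains π. The function name and the two test cases are chosen here for illustration only.

```python
import math

def exact_coverage(n, pi, c):
    """Probability that p ± c*sqrt(p(1-p)/n) contains pi when the count is Binomial(n, pi)."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n
        margin = c * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= pi <= p_hat + margin:
            # Binomial pmf computed on the log scale to avoid overflow/underflow.
            log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                       + k * math.log(pi) + (n - k) * math.log(1 - pi))
            total += math.exp(log_pmf)
    return total

# Large sample, pi far from 0 and 1: coverage should be near the nominal 0.98.
print(exact_coverage(500, 0.3, 2.33))
# Small sample with a rare outcome (n*pi = 1): coverage falls well short.
print(exact_coverage(20, 0.05, 2.33))
```

In the second case the sample often contains no successes at all, so p = 0, the interval collapses to a single point, and it misses π. This is the kind of situation the rule of thumb is meant to screen out.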
IV. Confidence interval for π: Computation
Computing confidence intervals is easy!
Example: Death penalty opinion poll from last week.
- Poll of 1003 U.S. adults.
- Found p = 0.46 supported the death penalty instead of life in prison for convicted murderers.
- A 98% confidence interval for the proportion π of all U.S. adults who favor the death penalty is
$$p \pm 2.33\sqrt{p(1-p)/n} = 0.46 \pm 2.33\sqrt{0.46(1-0.46)/1003} = 0.46 \pm 0.0367.$$
- So we are “98% confident” that the proportion of U.S. adults who favor the death penalty is between 0.4233 and 0.4967.
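The same arithmetic can be sketched in a few lines of Python. The function name `proportion_ci` is made up for this illustration. The multiplier c = 2.33 is taken from the notes rather than computed.

```python
import math

def proportion_ci(p, n, c):
    """Interval p ± c*sqrt(p(1-p)/n) for a population proportion."""
    margin = c * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

p, n = 0.46, 1003
# Rule of thumb from the notes: np > 10 and n(1 - p) > 10.
assert n * p > 10 and n * (1 - p) > 10

low, high = proportion_ci(p, n, c=2.33)
print(f"{low:.4f} to {high:.4f}")  # prints "0.4233 to 0.4967"
```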
- For example, to get a 95% confidence interval:
- Find c and −c such that $P(-c \le Z \le c) = 0.95$, where Z is standard normal.
- The answer is c = 1.96. See Figure 2.
- Then replace 2.33 by 1.96 in the formula: $p \pm 1.96\sqrt{p(1-p)/n}$.
[Figure 2: standard normal density (horizontal axis x, vertical axis density) with the area between −c and c shaded. For a 95% confidence interval, we need to find c.]
- Changing confidence levels just changes the multiplier c. Some common values for c:

  Confidence level   Multiplier
  90%                c = 1.64
  95%                c = 1.96
  98%                c = 2.33
  99%                c = 2.58
- Increasing the confidence level increases the width of the interval. This makes intuitive sense!
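If a normal table is not at hand, the multipliers can be recovered from the standard normal quantile function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `multiplier` is an assumption of this example, not something from the notes.

```python
from statistics import NormalDist

def multiplier(confidence_level):
    """Return c such that P(-c <= Z <= c) equals confidence_level for standard normal Z."""
    tail = (1 - confidence_level) / 2      # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)  # quantile with area 1 - tail to its left

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  c = {multiplier(level):.2f}")
```

Rounded to two decimals, these agree with the values listed above. Because c grows with the confidence level, the interval widens.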
VI. Sample size determination
- The “margin of error” of an interval is the “something” we add and subtract.
- In the present case, the margin of error is $c\sqrt{p(1-p)/n}$, where c is the multiplier we get from Table B.2, which depends on the confidence level.
- A common question: How much data do I need to get a confidence interval with margin of error B or less?
- It’s easy to answer this.
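A sketch of where the sample size formula comes from. The worst-case choice p = 1/2 is inferred from the numbers in the worked examples of this section, not stated explicitly in the notes. Require the margin of error to be at most B and solve for n:

$$
c\sqrt{\frac{p(1-p)}{n}} \le B
\quad\Longleftrightarrow\quad
n \ge \frac{c^2\,p(1-p)}{B^2}.
$$

Since $p(1-p) \le \tfrac14$ for every p, with equality at $p = \tfrac12$, choosing

$$
n \ge \frac{c^2}{4B^2}
$$

keeps the margin of error at B or less no matter what p turns out to be.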
Example revisited: In the death penalty example we computed a 98% confidence interval to be 0.46 ± 0.0367, so the margin of error is 0.0367.
- Question: What sample size n guarantees a margin of error of 0.02 or less?
- Answer: Plug c = 2.33 and B = 0.02 into the formula $n = c^2/(4B^2)$, which uses the worst case p = 0.5:
$$n = \frac{(2.33)^2}{4(0.02)^2} = 3393.0625.$$
- To be safe we always round up. In this case, we get n = 3394.
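The calculation, including the rounding up, can be sketched as follows. The function name `sample_size` is hypothetical. The formula n = c²/(4B²) is the worst-case (p = 0.5) version that matches the numbers in this example.

```python
import math

def sample_size(c, B):
    """Smallest n with c*sqrt(p(1-p)/n) <= B for every p (worst case p = 0.5)."""
    return math.ceil(c**2 / (4 * B**2))

print(sample_size(c=2.33, B=0.02))  # prints 3394
```

Halving the margin of error roughly quadruples the required sample size, because B enters the formula squared.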
Example: A Washington Post/ABC News poll asked 1513 randomly selected adults “Would you support or oppose the federal government giving parents money to send their children to private or religious schools instead?” Of those surveyed, 48% said YES.
- Find a 96% confidence interval for the proportion of adults who support this.
- First find c = 2.05 from Table B.2 as in Figure 3.
- Then plug in to the formula:
$$0.48 \pm (2.05)\sqrt{0.48(1-0.48)/1513} \approx 0.48 \pm 0.0263,$$
that is, between about 0.4537 and 0.5063.
- How large would n have to be to assure that a 97% confidence interval would have margin of error 0.01 or less?
- First find c = 2.17 from Table B.2 as in Figure 4.
- Then plug c = 2.17 and B = 0.01 into the formula:
$$n = \frac{(2.17)^2}{4(0.01)^2} = 11772.25.$$
Rounding up gives n = 11773.
[Figure 4: standard normal density (horizontal axis x, vertical axis density) with the area between −c and c shaded. For a 97% confidence interval, we need to find c.]
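As a check on the school-funding example, both parts can be run end to end. This is a sketch under two assumptions. First, the multiplier is rounded to two decimals to mimic a printed table such as Table B.2. Second, the sample size uses the worst-case formula n = c²/(4B²). The helper name `table_multiplier` is made up for this illustration.

```python
import math
from statistics import NormalDist

def table_multiplier(level):
    """Two-sided standard normal multiplier, rounded to 2 decimals like a printed table."""
    return round(NormalDist().inv_cdf((1 + level) / 2), 2)

# Part 1: 96% confidence interval from the poll (p = 0.48, n = 1513).
p, n = 0.48, 1513
c96 = table_multiplier(0.96)
margin = c96 * math.sqrt(p * (1 - p) / n)
print(f"c = {c96}, interval = {p - margin:.4f} to {p + margin:.4f}")

# Part 2: sample size for a 97% interval with margin of error at most 0.01.
c97 = table_multiplier(0.97)
n_needed = math.ceil(c97**2 / (4 * 0.01**2))
print(f"c = {c97}, n = {n_needed}")
```

Using the unrounded multiplier instead of the two-decimal table value can shift the required n slightly, so answers computed by software may differ from table-based answers by a unit or two.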