Confidence Intervals for a Proportion: Lecture Notes, Lecture notes of Statistics

Appalachian Bible College (ABC)Statistics

Lecture notes on calculating confidence intervals for a proportion using R. It covers the formulas for 95%, 90%, and 85% confidence intervals, as well as ways to write and interpret the intervals. The document also includes examples and critical values for various confidence levels.

Typology: Lecture notes

2021/2022

Uploaded on 09/12/2022

explain 🇺🇸


150 Chapter 4. Statistics (LECTURE NOTES 8)

4.5 Confidence Intervals for a Proportion

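Let $Z$ be $N(0,1)$ and $p$ be a number between 0 and 1; the critical z-value $z_p$ satisfies

$$P(Z > z_p) = 1 - \Phi(z_p) = p.$$

Let $0 < \alpha < 1$ and $x$ be the number of successes in $n$ observed trials of a Bernoulli experiment with unknown probability of success $p$. For $\hat{p} = \frac{x}{n}$, the $100(1-\alpha)\%$ confidence interval for proportion $p$ is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right],$$

where

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \quad\text{and}\quad \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

are the margin of error and standard deviation of the proportion respectively and $\alpha$ is the level of significance. We assume a large random sample is chosen, both $np \ge 5$ and $np(1-p) \ge 5$, and the conditions of a binomial distribution are satisfied. Also, one-sided confidence interval estimates for $p$ include lower and upper bounds respectively:

$$\left[\hat{p} - z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ 1\right], \qquad \left[0,\ \hat{p} + z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right].$$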
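The one-sided bounds differ from the two-sided interval only in the critical value: $z_{\alpha}$ replaces $z_{\alpha/2}$ because all of $\alpha$ sits in one tail. A minimal sketch in R follows; the helper name `prop1.onesided` and the counts 54 and 180 are illustrative assumptions, not part of the notes' own code.

```r
# Hypothetical helper (not from the notes): one-sided bounds for a proportion.
prop1.onesided <- function(x, n, conf.level) {
  p.hat <- x / n                          # point estimate
  z.crit <- qnorm(conf.level)             # z_alpha: all of alpha in one tail
  se <- sqrt(p.hat * (1 - p.hat) / n)     # estimated standard deviation of p.hat
  c("Lower bound" = p.hat - z.crit * se,  # gives the interval [lower bound, 1]
    "Upper bound" = p.hat + z.crit * se)  # gives the interval [0, upper bound]
}

bounds <- prop1.onesided(54, 180, 0.95)   # illustrative counts
print(bounds)
```

Each bound is reported on its own: the lower bound pairs with 1 and the upper bound pairs with 0, so the two numbers should not be read together as a single two-sided interval.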

Exercise 4.5 (Confidence Intervals for a Proportion)

1. Confidence interval (CI) for proportion, $p$, of purchase slips made with Visa.
It is found 54 of 180 (or $\hat{p} = \frac{54}{180} = 0.3$) randomly selected from all credit card purchase slips are made with Visa where conditions of binomial distribution are satisfied. Calculate a 95% confidence interval (CI) of proportion $p$ of purchase slips made with Visa.

(a) Point estimate.
Point estimate of population (actual, true) proportion of all credit card purchase slips made with Visa, $p$, is
$\hat{p} =$ (i) 0.3 (ii) 54 (iii) 180.
Statistic $\hat{p} = 0.3$ probably does not exactly equal unknown parameter $p$.

(b) Check assumptions.
Since random sample chosen,
conditions of binomial distribution are satisfied,
and $np(1-p) \approx n\hat{p}(1-\hat{p}) = 180(0.3)(0.7) = 37.8 \ge 5$,
and $np \approx n\hat{p} = 180(0.3) = 54 \ge 5$,
assumptions (i) have (ii) have not been satisfied
and so it is appropriate to use $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ to estimate parameter $p$.
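The two numeric conditions in (b) can be checked mechanically. The following is a small sketch, assuming the threshold of 5 used in these notes; the function name `check.prop.assumptions` is hypothetical, and the check says nothing about whether the sample is random or the trials are binomial.

```r
# Hypothetical helper (not from the notes): numeric part of the assumption check.
check.prop.assumptions <- function(x, n, threshold = 5) {
  p.hat <- x / n
  checks <- c("n*p.hat"           = n * p.hat,
              "n*p.hat*(1-p.hat)" = n * p.hat * (1 - p.hat))
  list(values = checks, satisfied = all(checks >= threshold))
}

visa <- check.prop.assumptions(54, 180)   # Visa example: 54 of 180 slips
print(visa$values)
print(visa$satisfied)
```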


Section 5. Confidence Intervals for a Proportion (LECTURE NOTES 8) 151

(c) 95% Confidence Interval (CI) using R.
The 95% CI for proportion of all credit card purchase slips made with Visa, $p$, is
(i) (0.251, 0.349) (ii) (0.273, 0.367) (iii) (0.233, 0.367).

    prop1.interval <- function(x,n,conf.level) {   # function of 1-proportion CI for p
      p <- x/n                                      # point estimate p-hat
      z.crit <- -1*qnorm((1-conf.level)/2)          # critical value z_{alpha/2}
      margin.error <- z.crit*sqrt(p*(1-p)/n)
      ci.lower <- p - margin.error
      ci.upper <- p + margin.error
      dat <- c(p, z.crit, margin.error, ci.lower, ci.upper)
      names(dat) <- c("Mean", "Critical Value", "Margin of Error", "CI lower", "CI upper")
      return(dat)
    }
    prop1.interval(54,180,0.95) # 1-proportion 95% CI for p

               Mean  Critical Value Margin of Error        CI lower        CI upper
         0.30000000      1.95996398      0.06694551      0.23305449      0.36694551

where this interval includes not only smallest possible proportion of 0.233 and largest possible proportion of 0.367, but also other proportions in between these two extremes such as point estimate, $\hat{p} = 0.3$. Length of this CI is $L \approx 0.367 - 0.233 = 0.134$. So, we are 95% confident population parameter $p$ is in (0.233, 0.367).

(d) 90% CI using R.
The 90% CI for proportion of all credit card purchase slips made with Visa, $p$, is
(i) (0.251, 0.349) (ii) (0.244, 0.356) (iii) (0.233, 0.367).
Length of this CI is $L \approx 0.356 - 0.244 = 0.112$.

    prop1.interval(54,180,0.90) # 1-proportion 90% CI for p

               Mean  Critical Value Margin of Error        CI lower        CI upper
         0.30000000      1.64485363      0.05618245      0.24381755      0.35618245

(e) 85% CI using R.
The 85% CI for proportion of all credit card purchase slips made with Visa, $p$, is
(i) (0.251, 0.349) (ii) (0.273, 0.367) (iii) (0.233, 0.367).
Length of this CI is $L \approx 0.349 - 0.251 = 0.098$.

    prop1.interval(54,180,0.85) # 1-proportion 85% CI for p

               Mean  Critical Value Margin of Error        CI lower        CI upper
         0.30000000      1.43953147      0.04916936      0.25083064      0.34916936

(f) Comparing CI lengths.
Length of 95% CI for $p$, $L = 0.134$, is
(i) longer than (ii) same length as (iii) shorter than
length of 90% CI for $p$, $L = 0.112$, which is
(i) longer than (ii) same length as (iii) shorter than
length of 85% CI for $p$, $L = 0.098$.
Increasing confidence increases CI length.

(g) Margin of error.
Half of length, $L$, is margin of error, $E = \frac{L}{2}$. Consequently, for 95% CI for $p$,

Figure 4.5: Critical values. Panel (a), z critical value: 95% in middle of standard normal density $f(z)$, with 2.5% to left of the 2.5th percentile (−z critical value) and 2.5% to right of the 97.5th percentile (z critical value). Panel (b), $z_{0.05}$ critical value: 90% in middle of standard normal density, with 5% to left of the 5th percentile (−z critical value) and 5% to right of the 95th percentile (z critical value).

Critical value for 90% = $(1-\alpha)\cdot 100\%$ = $(1-0.10)\cdot 100\%$ CI is
$z_{\alpha/2} = z_{0.10/2} = z_{0.05} =$ (i) 1.96 (ii) 1.645 (iii) 1.44.

    qnorm(0.95) # critical value z_0.1/2
    [1] 1.644854

Critical value for 85% = $(1-\alpha)\cdot 100\%$ = $(1-0.15)\cdot 100\%$ CI is
$z_{\alpha/2} = z_{0.15/2} = z_{0.075} =$ (i) 1.96 (ii) 1.645 (iii) 1.44.

    qnorm(0.925) # critical value z_0.15/2
    [1] 1.439531
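The same `qnorm` call works for any confidence level, since the critical value $z_{\alpha/2}$ is the $(1-\alpha/2)$ quantile of the standard normal. As a brief sketch, the three levels used in this exercise can be handled in one vectorized call (the variable names are illustrative):

```r
# Critical values z_{alpha/2} for several confidence levels at once.
conf.levels <- c(0.95, 0.90, 0.85)
alpha <- 1 - conf.levels
z.crit <- qnorm(1 - alpha / 2)            # upper alpha/2 quantile of N(0,1)
names(z.crit) <- paste0(100 * conf.levels, "%")
print(round(z.crit, 3))
```

Lower confidence gives a smaller critical value, which is why the 85% interval is the shortest of the three.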

(k) CI using formula.
A 95% CI for proportion of Visa credit card purchase slips, $p$, is $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} =$

i. $0.3 \pm 1.96 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
ii. $0.3 \pm 1.645 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
iii. $0.3 \pm 1.44 \times \sqrt{\frac{0.3(1-0.3)}{180}}$

and a 90% CI for proportion of Visa credit card purchase slips, $p$, is

i. $0.3 \pm 1.96 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
ii. $0.3 \pm 1.645 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
iii. $0.3 \pm 1.44 \times \sqrt{\frac{0.3(1-0.3)}{180}}$

and an 85% CI for proportion of Visa credit card purchase slips, $p$, is

i. $0.3 \pm 1.96 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
ii. $0.3 \pm 1.645 \times \sqrt{\frac{0.3(1-0.3)}{180}}$
iii. $0.3 \pm 1.44 \times \sqrt{\frac{0.3(1-0.3)}{180}}$

(l) Population, Sample, Statistic and Parameter. Match columns.

| terms | credit card example |
| --- | --- |
| (a) population | (a) Visa or not, all purchase slips |
| (b) sample | (b) proportion of all slips made with Visa, $p$ |
| (c) statistic | (c) Visa or not, 180 purchase slips |
| (d) parameter | (d) proportion of 180 slips made with Visa, $\hat{p}$ |

| terms | (a) | (b) | (c) | (d) |
| --- | --- | --- | --- | --- |
| credit card example | | | | |

95% CI, proportion of student heights over 6 feet tall.
37 of 102 students, chosen at random from PNW, are over 6 feet tall.

(a) Point estimate.
Point estimate of proportion, $p$, of student heights over 6 feet tall is
$\hat{p} = \frac{37}{102} \approx$ (i) 0.363 (ii) 0.378 (iii) 0.391.

(b) Check assumptions.
Since $np \approx n\hat{p} = 102\left(\frac{37}{102}\right) = 37 \ge 5$
and $np(1-p) \approx n\hat{p}(1-\hat{p}) = 102\left(\frac{37}{102}\right)\left(1-\frac{37}{102}\right) \approx 23.6 \ge 5$,
assumptions (i) have (ii) have not been satisfied
and so it is appropriate to use $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ to estimate parameter $p$.

(c) Using R.
The 95% CI for $p$ is
(i) (0.269, 0.456) (ii) (0.273, 0.367) (iii) (0.233, 0.367).

    prop1.interval(37,102,0.95) # 1-proportion 95% CI for p

              Mean  Critical Value Margin of Error        CI lower        CI upper
         0.3627451       1.9599640       0.0933051       0.2694400       0.4560502

(d) Using formula: critical value using R.
Critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI for $p$ is
$z_{\alpha/2} = z_{0.05/2} = z_{0.025} =$ (i) 1.28 (ii) 1.96 (iii) 2.58.

    qnorm(0.975) # critical value z_0.05/2 for 95% CI
    [1] 1.959964

(e) Using formula: critical value using Table C.1.
Critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI for $p$ is
$z_{\alpha/2} = z_{0.05/2} = z_{0.025} =$ (i) 1.28 (ii) 1.96 (iii) 2.58.

(f) Using formula.
Since $\hat{p} = \frac{37}{102}$ and $n = 102$, the 95% CI for $p$ is $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} =$

$\mu$ is called a z-interval:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The $(1-\alpha)\cdot 100\%$ confidence interval for $\mu$ with unknown $\sigma$ is called a t-interval:

$$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$$

where $T = \frac{\bar{X}-\mu}{S/\sqrt{n}}$ has a Student-t distribution and where

$$E = t_{\alpha/2}\frac{s}{\sqrt{n}} \quad\text{and}\quad \frac{s}{\sqrt{n}}$$

are the margin of error and standard error of the mean respectively and $\alpha$ is the level of significance. We assume a random sample where either the underlying distribution is normal with no outliers or the sample size is large ($n > 30$). Also, one-sided confidence interval estimates for $\mu$ include lower and upper bounds respectively:

$$\left(\bar{x} - t_{\alpha}\frac{s}{\sqrt{n}},\ \infty\right), \qquad \left(-\infty,\ \bar{x} + t_{\alpha}\frac{s}{\sqrt{n}}\right).$$
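The practical difference between the z-interval and the t-interval is the critical value. As a short sketch (the sample sizes are illustrative choices, not from the notes), the t critical value is always larger than the z critical value and shrinks toward it as $n$ grows, which is one way to see why the large-sample condition matters less for large $n$:

```r
# Compare 95% critical values: t with n - 1 df versus standard normal z.
n <- c(5, 11, 31, 101, 1001)              # illustrative sample sizes
t.crit <- qt(0.975, df = n - 1)           # t_{0.025} with n - 1 degrees of freedom
z.crit <- qnorm(0.975)                    # z_{0.025}
print(data.frame(n = n, t.crit = round(t.crit, 3), z.crit = round(z.crit, 3)))
```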

Exercise 4.6 (Confidence Intervals for a Mean)

Estimates for population average weight of PNW students.
Average weight of simple random sample of 11 PNW students is $\bar{x} = 167$ pounds with sample SD $s = 20.1$ pounds. Weights are normally distributed, with no outliers.

(a) Point estimate.
Point estimate of population average weight of all students, $\mu$, is
$\bar{x} =$ (i) 11 (ii) 20.1 (iii) 167.
Also notice $\sigma$ is unknown and estimated by $s = 20.1$.

(b) 95% CI

i. Using R.
The 95% CI for $\mu$ is
(i) (143.5, 182.5) (ii) (151.5, 180.5) (iii) (153.5, 180.5).

    mean1.t.interval <- function(m,s,n,conf.level) {  # m: mean, s: SD, n: sample size
      t.crit <- -1*qt((1-conf.level)/2,n-1)           # critical value t_{alpha/2}, n-1 df
      margin.error <- t.crit*s/sqrt(n)
      ci.lower <- m - margin.error
      ci.upper <- m + margin.error
      dat <- c(m, t.crit, margin.error, ci.lower, ci.upper)  # m, not the function mean
      names(dat) <- c("Mean", "Critical Value", "Margin of Error", "CI lower", "CI upper")
      return(dat)
    }
    mean1.t.interval(167,20.1,11,0.95) # m: mean, s: SD, n: sample size, 95% t-interval

Section 6. Confidence Intervals for a Mean (LECTURE NOTES 8) 157

               Mean  Critical Value Margin of Error        CI lower        CI upper
         167.000000        2.228139       13.503364      153.496636      180.503364

So, we are 95% confident population parameter $\mu$ is in (153.5, 180.5).

ii. Using formula: degrees of freedom (df).
df = $n - 1 = 11 - 1 =$ (i) 10 (ii) 11.

iii. Using formula: critical value using R.
Critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI, 10 df, is
$t_{\alpha/2} = t_{0.05/2} = t_{0.025} \approx$ (i) 1.28 (ii) 2.23 (iii) 2.58.

    qt(0.975,10) # critical value t, 10 df, for 95% CI
    [1] 2.228139

iv. Using formula: critical value using Table C.3.
Critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI, 10 df, is
$t_{\alpha/2} = t_{0.05/2} = t_{0.025} \approx$ (i) 1.28 (ii) 2.23 (iii) 2.58.

v. Using formula.
The 95% CI for $\mu$ is $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} =$
(i) $20.1 \pm 167 \times \frac{2.23}{\sqrt{11}}$ (ii) $2.23 \pm 167 \times \frac{20.1}{\sqrt{11}}$ (iii) $167 \pm 2.23 \times \frac{20.1}{\sqrt{11}}$
which equals
(i) $20.1 \pm 12.51$ (ii) $2.23 \pm 13.51$ (iii) $167 \pm 13.51$
$\approx (153.5, 180.5)$.

(c) 99% CI

i. Using R.
The 99% CI for $\mu$ is
(i) (147.8, 186.2) (ii) (151.5, 180.5) (iii) (153.5, 180.5).

    mean1.t.interval(167,20.1,11,0.99) # m: mean, s: SD, n: sample size, 99% t-interval

               Mean  Critical Value Margin of Error        CI lower        CI upper
         167.000000        3.169273       19.206990      147.793010      186.206990

So, we are 99% confident population parameter $\mu$ is in (147.8, 186.2).

ii. Using formula: degrees of freedom.
df = $n - 1 = 11 - 1 =$ (i) 10 (ii) 11.

iii. Using formula: critical value.
Critical value for 99% = $(1-\alpha)\cdot 100\%$ = $(1-0.01)\cdot 100\%$ CI, 10 df, is
$t_{\alpha/2} = t_{0.01/2} = t_{0.005} \approx$ (i) 1.28 (ii) 2.23 (iii) 3.17.

    qt(0.995,10) # critical value t, 10 df, for 99% CI
    [1] 3.169273

iv. Using formula.
The 99% CI for $\mu$ is $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} =$
(i) $20.1 \pm 20.1 \times \frac{3.17}{\sqrt{11}}$ (ii) $3.17 \pm 167 \times \frac{20.1}{\sqrt{11}}$ (iii) $167 \pm 3.17 \times \frac{20.1}{\sqrt{11}}$
which equals
(i) $20.1 \pm 19.21$ (ii) $3.17 \pm 19.21$ (iii) $167 \pm 19.21$
$\approx (147.8, 186.2)$.

iv. Using formula.
The 95% CI for $\mu$ is $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} =$
(i) $21.6 \pm 2.15 \times \frac{2.97}{\sqrt{15}}$ (ii) $21.6 \pm 2.15 \times \frac{3.97}{\sqrt{15}}$ (iii) $21.6 \pm 3.15 \times \frac{2.97}{\sqrt{15}}$.

(c) 99% CI

i. Using R.
The 99% CI for $\mu$ is
(i) (19.23, 23.45) (ii) (19.96, 23.24) (iii) (19.32, 23.88).

    mean1.t.interval(m,s,n,0.99) # m: mean, s: SD, n: sample size, 99% t-interval

              Mean  Critical Value Margin of Error        CI lower        CI upper
         21.600000        2.976843        2.283786       19.316214       23.883786

ii. Using formula: degrees of freedom (df).
The df, here, for 99% CI is (i) same as (ii) different from degrees of freedom calculated for 95% CI above because same sample size is used in both cases.

iii. Using formula: critical value.
Critical value for 99% = $(1-\alpha)\cdot 100\%$ = $(1-0.01)\cdot 100\%$ CI, 14 df, is
$t_{\alpha/2} = t_{0.01/2} = t_{0.005} \approx$ (i) 1.76 (ii) 2.98.

    qt(0.995,14) # critical value t, 14 df, for 99% CI
    [1] 2.976843

iv. Using formula.
Thus, the 99% CI for $\mu$ is $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} =$
(i) $21.6 \pm 2.15 \times \frac{2.97}{\sqrt{15}}$ (ii) $21.6 \pm 2.15 \times \frac{3.97}{\sqrt{15}}$ (iii) $21.6 \pm 2.98 \times \frac{2.97}{\sqrt{15}}$
which equals
(i) $21.6 \pm 1.29$ (ii) $21.6 \pm 2.29$ (iii) $21.6 \pm 3.29$
$\approx (19.32, 23.88)$.

(d) Some comments

i. (i) True (ii) False. Longer 99% CI is better than shorter 95% CI in the sense we are more confident the 99% CI contains or "captures" unknown parameter $\mu$. However, 95% CI is better than longer 99% CI in the sense that, if unknown parameter $\mu$ is in the 95% interval estimate, we are more certain of location of this unknown parameter.

ii. Since sample size is small, we (i) can (ii) cannot use the central limit theorem.

iii. Match columns.

| terms | corn example |
| --- | --- |
| (a) population | (a) average length of 15 plants, $\bar{X}$ |
| (b) sample | (b) average length of all plants, $\mu$ |
| (c) statistic | (c) lengths of all plants |
| (d) parameter | (d) observed lengths of 15 plants |

| terms | (a) | (b) | (c) | (d) |
| --- | --- | --- | --- | --- |
| corn example | | | | |

Population, sample, statistic and parameter: CI for average corn cob length.
Simple random sample of 15 corn cobs is taken. Assume sample SD in length is $s = 2.97$ and, although we typically don't know it, population (not sample) average length is $\mu = 22$ inches. Assume normality.

(a) Population $\mu = 22$ length.
Population $\mu = 22$ is a (i) statistic (ii) parameter.
Population $\mu$ (i) changes (ii) remains same for every random sample.
Population $\mu$ is (usually) (i) known (ii) unknown to us (although we are pretending for this question we do know it).

(b) Sample $\bar{x}$ length.
Sample $\bar{x}$ is a (i) statistic (ii) parameter.
Sample $\bar{x}$ (i) changes (ii) remains same for every random sample.
Sample $\bar{x}$ is (usually) (i) known (ii) unknown to us: it may be $\bar{x} = 21.6$ for one sample, but $\bar{x} = 29.8$ for another sample.

(c) A 95% CI for $\mu$, if $\bar{x} = 21.6$, is
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 21.6 \pm 1.96\,\frac{2.97}{\sqrt{15}} =$
(i) (19.95, 23.24) (ii) (23.45, 27.80) (iii) (28.16, 31.44).

    mean1.t.interval(21.6,2.97,14,0.95) # m: mean, s: SD, n: sample size, 95% t-interval

              Mean  Critical Value Margin of Error        CI lower        CI upper
         21.600000        2.160369        1.714827       19.885173       23.314827

This 95% CI (i) contains (ii) does not contain $\mu = 22$.

(d) A 95% CI for $\mu$, if $\bar{x} = 29.8$, is
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 29.8 \pm 1.96\,\frac{2.97}{\sqrt{15}} =$
(i) (19.60, 23.60) (ii) (23.45, 27.80) (iii) (28.16, 31.44).

    mean1.t.interval(29.8,2.97,14,0.95) # m: mean, s: SD, n: sample size, 95% t-interval

              Mean  Critical Value Margin of Error        CI lower        CI upper
         29.800000        2.160369        1.714827       28.085173       31.514827

This 95% CI (i) contains (ii) does not contain $\mu = 22$.

(e) If sample average length, $\bar{x}$, changes, corresponding 95% CI, $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$, (i) changes (ii) remains the same. More than this,

i. all possible 95% CIs contain $\mu = 22$.
ii. none of all possible 95% CIs contain $\mu = 22$.
iii. ninety-nine percent of all possible 95% CIs contain $\mu = 22$, and so one percent of all possible 95% CIs do not contain $\mu = 22$.
iv. ninety-five percent of all possible 95% CIs contain $\mu = 22$, and so five percent of all possible 95% CIs do not contain $\mu = 22$.

This is demonstrated in figure below.

(f) Choose true or false.
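The long-run interpretation in (e) can be illustrated by simulation. The sketch below draws many samples of 15 from a normal population with $\mu = 22$ and counts how often the 95% t-interval captures $\mu$. It assumes, purely for illustration, a population SD of 2.97 (the notes give 2.97 only as a sample SD); the seed and the number of replications are arbitrary choices.

```r
# Simulated coverage of the 95% t-interval (illustrative assumptions noted above).
set.seed(1)
mu <- 22; sigma <- 2.97; n <- 15; reps <- 10000
covered <- replicate(reps, {
  x <- rnorm(n, mean = mu, sd = sigma)            # one random sample
  half <- qt(0.975, n - 1) * sd(x) / sqrt(n)      # margin of error for this sample
  abs(mean(x) - mu) <= half                       # TRUE if the interval contains mu
})
coverage <- mean(covered)                         # proportion of intervals containing mu
print(coverage)
```

Each simulated sample yields a different interval, and the proportion that contain $\mu$ should land close to 0.95.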

(a) Using R.
The 95% CI for $\sigma^2$ is
(i) (0.39, 1.22) (ii) (0.41, 1.25) (iii) (0.44, 1.30).

    var1.chi2.interval <- function(v,n,conf.level) {  # v: sample variance, n: sample size
      df <- n - 1
      chilower <- qchisq((1 - conf.level)/2, df)                      # lower critical value
      chiupper <- qchisq((1 - conf.level)/2, df, lower.tail = FALSE)  # upper critical value
      ci.lower <- df * v/chiupper
      ci.upper <- df * v/chilower
      margin.error <- (ci.upper - ci.lower)/2
      dat <- c(v, chilower, chiupper, margin.error, ci.lower, ci.upper)
      names(dat) <- c("Variance", "Lower Crit Val", "Upper Crit Val", "Margin of Error", "CI lower", "CI upper")
      return(dat)
    }
    var1.chi2.interval(0.7,28,0.95) # 95% CI for variance, n = 28

           Variance  Lower Crit Val  Upper Crit Val Margin of Error        CI lower        CI upper
          0.7000000      14.5733827      43.1945110       0.4296647       0.4375556       1.2968849

(b) Upper critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI is
$\chi^2_{\alpha/2} = \chi^2_{0.05/2} = \chi^2_{0.025} =$ (i) 8.7 (ii) 40.1 (iii) 43.2

    qchisq(0.975, 27) # 95% upper critical chi-square value
    [1] 43.19451

(c) Lower critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI is
$\chi^2_{1-\alpha/2} = \chi^2_{1-0.05/2} = \chi^2_{0.975} =$ (i) 14.6 (ii) 40.1 (iii) 43.2

    qchisq(0.025, 27) # 95% lower critical chi-square value
    [1] 14.57338

(d) Using Table C.4, lower critical value for 95% CI is
$\chi^2_{1-\alpha/2} = \chi^2_{1-0.05/2} = \chi^2_{0.975} =$ (i) between 13.12 and 16.79 (ii) 40.1 (iii) 43.2

(e) So, 95% CI for variance $\sigma^2$ is
$$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right) =$$
(i) (0.61, 1.65) (ii) (0.59, 1.29) (iii) (0.43, 1.29).

(f) Since 95% CI (0.43, 1.29) does not include 0.40, this indicates variance in distance between door and jamb (i) is (ii) is not 0.4 mm$^2$.

(g) Population, parameter, sample and statistic. Match columns.

| terms | jamb example |
| --- | --- |
| (a) population | (a) variance in jamb-door distance, of 28 cars, $s^2$ |
| (b) sample | (b) variance in jamb-door distance, of all cars, $\sigma^2$ |
| (c) statistic | (c) jamb-door distances, of all cars |
| (d) parameter | (d) jamb-door distances, of 28 cars |

| terms | (a) | (b) | (c) | (d) |
| --- | --- | --- | --- | --- |
| jamb example | | | | |

Section 8. Confidence Intervals for a Differences (LECTURE NOTES 8) 163

Estimation for variance: machine parts.
In a simple random sample of 18 machine parts, variance in lengths is $s^2 = 12^2$. Calculate 90% CI. Assume normality with no outliers.

(a) Using R.
The 90% CI for $\sigma^2$ is
(i) (88.1, 281.3) (ii) (88.7, 282.3) (iii) (88.2, 282.3).

    var1.chi2.interval(12^2,18,0.90) # 90% CI for variance, n = 18

           Variance  Lower Crit Val  Upper Crit Val Margin of Error        CI lower        CI upper
          144.00000         8.67176        27.58711        96.77927        88.73709       282.

(b) Upper critical value for 90% = $(1-\alpha)\cdot 100\%$ = $(1-0.10)\cdot 100\%$ CI is
$\chi^2_{\alpha/2} = \chi^2_{0.10/2} = \chi^2_{0.05} =$ (i) 8.7 (ii) 27.6 (iii) 43.2

    qchisq(0.95, 17) # 90% upper critical chi-square value
    [1] 27.58711

(c) Lower critical value for 90% = $(1-\alpha)\cdot 100\%$ = $(1-0.10)\cdot 100\%$ CI is
$\chi^2_{1-\alpha/2} = \chi^2_{1-0.10/2} = \chi^2_{0.95} =$ (i) 8.7 (ii) 40.1 (iii) 43.2

    qchisq(0.05, 17) # 90% lower critical chi-square value
    [1] 8.67176

(d) So, 90% CI for variance $\sigma^2$ is (there may be round-off error)
$$\left(\frac{(n-1)s^2}{\chi^2_U},\ \frac{(n-1)s^2}{\chi^2_L}\right) = \left(\frac{(18-1)12^2}{27.6},\ \frac{(18-1)12^2}{8.7}\right) =$$
(i) (80.5, 101.4) (ii) (100.5, 104.2) (iii) (88.7, 281.4).

(e) Since 90% CI (88.7, 281.4) includes test statistic $13^2 = 169$, this indicates variance in lengths (i) is (ii) is not $\sigma^2 = 13^2$ mm$^2$.

(f) Also, 90% CI for standard deviation $\sigma$ is
$$\left(\sqrt{\frac{(n-1)s^2}{\chi^2_U}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_L}}\right) = \left(\sqrt{\frac{(18-1)12^2}{27.6}},\ \sqrt{\frac{(18-1)12^2}{8.7}}\right) =$$
(i) (9.4, 16.8) (ii) (10.5, 14.2) (iii) (88.7, 281.4).
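The step from the variance interval to the standard deviation interval is just a square root applied to both endpoints. A minimal sketch for the machine-parts numbers follows (the variable names are illustrative). It uses unrounded critical values from `qchisq`, so the upper endpoint differs slightly from the hand calculation with 8.7.

```r
# 90% CI for the variance and for the SD, machine-parts example (n = 18, s^2 = 12^2).
n <- 18; s2 <- 12^2; conf.level <- 0.90
df <- n - 1
chi.upper <- qchisq(1 - (1 - conf.level) / 2, df)   # upper critical value, about 27.6
chi.lower <- qchisq((1 - conf.level) / 2, df)       # lower critical value, about 8.7
var.ci <- c(lower = df * s2 / chi.upper, upper = df * s2 / chi.lower)
sd.ci <- sqrt(var.ci)                               # square root of both endpoints
print(round(var.ci, 1))
print(round(sd.ci, 1))
```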

4.8 Confidence Intervals for Differences

Let $x_1$ and $x_2$ be number of successes in two independent samples of size $n_1$ and $n_2$ (with $\hat{p}_1 = \frac{x_1}{n_1}$ and $\hat{p}_2 = \frac{x_2}{n_2}$) taken from two populations with proportions $p_1$ and $p_2$. The $(1-\alpha)\cdot 100\%$ 2-proportion z-interval for $p_1 - p_2$ is

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

where we assume the samples are random and there are at least 5 successes and 5 failures in each sample.

| | military (1) | civilian (2) |
| --- | --- | --- |
| male doctors | 358 | 6786 |
| total doctors | 407 | 7363 |

From above, $\hat{p}_1 = \frac{358}{407}$, $\hat{p}_2 = \frac{6786}{7363}$; also critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI is
$z_{\alpha/2} = z_{0.05/2} = z_{0.025} \approx$ (i) 1.65 (ii) 1.96 (iii) 2.09,

    qnorm(0.975) # critical value z, for 95% CI
    [1] 1.959964

and so 95% CI for $p_1 - p_2$ is

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \frac{358}{407} - \frac{6786}{7363} \pm 1.96\sqrt{\frac{\frac{358}{407}\left(1-\frac{358}{407}\right)}{407} + \frac{\frac{6786}{7363}\left(1-\frac{6786}{7363}\right)}{7363}} =$$

(i) (−0.054, −0.008) (ii) (−0.064, −0.009) (iii) (−0.074, −0.010)

    prop2.interval <- function(x, n, conf.level) {   # x: successes, n: sample sizes
      x1 <- x[1]; x2 <- x[2]; n1 <- n[1]; n2 <- n[2]
      p.hat1 <- x1/n1; p.hat2 <- x2/n2
      z.crit <- -1*qnorm((1-conf.level)/2)           # critical value z_{alpha/2}
      margin.error <- z.crit*sqrt(p.hat1*(1-p.hat1)/n1+p.hat2*(1-p.hat2)/n2)
      ci.lower <- p.hat1-p.hat2 - margin.error
      ci.upper <- p.hat1-p.hat2 + margin.error
      dat <- c(p.hat1, p.hat2, z.crit, margin.error, ci.lower, ci.upper)
      names(dat) <- c("p.hat1", "p.hat2", "z crit", "Margin of Error", "CI lower", "CI upper")
      return(dat)
    }
    prop2.interval(c(358,6786), c(407,7363), 0.95) # approx 2-proportion z-interval for p1 - p2, two-sided

             p.hat1          p.hat2          z crit Margin of Error        CI lower        CI upper
        0.879606880     0.921635203     1.959963985     0.032205624    -0.074233948    -0.

Since confidence interval does not include zero (it lies, in fact, entirely below zero), this indicates population proportion of male military doctors (i) is less than (ii) equals (iii) is greater than (iv) is different from the population proportion of male civilian doctors.
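As a cross-check on the hand-built function, base R's `prop.test` reports a confidence interval for $p_1 - p_2$ when given two groups. The sketch below switches off the continuity correction (`correct = FALSE`) so that the interval uses the same unpooled standard error as the formula above; with the default `correct = TRUE` the interval would be slightly wider.

```r
# Cross-check of the 2-proportion z-interval using base R (stats::prop.test).
res <- prop.test(x = c(358, 6786), n = c(407, 7363),
                 conf.level = 0.95, correct = FALSE)
ci <- as.numeric(res$conf.int)   # lower and upper limits for p1 - p2
print(round(ci, 3))
```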

CI for $\mu_1 - \mu_2$, independent samples, unknown $\sigma_1^2 = \sigma_2^2$: progesterone.
A study is conducted to determine cellular response to progesterone in females. Blood cells from four females are injected with progesterone; blood cells from four different females are, for comparison purposes, left untreated. Calculate 95% CI. Assume normality with no outliers.

| female | progesterone (1) | female | control (2) |
| --- | --- | --- | --- |
| 1 | 5.85 | 5 | 5.23 |
| 2 | 2.28 | 6 | 1.21 |
| 3 | 1.51 | 7 | 1.40 |
| 4 | 2.12 | 8 | 1.38 |

    progesterone <- c(5.85, 2.28, 1.51, 2.12)
    control <- c(5.23, 1.21, 1.40, 1.38)

From R, $\bar{x}_1 \approx 2.94$, $s_1 \approx 1.97$, $\bar{x}_2 \approx 2.305$, $s_2 \approx 1.95$,

    m1 <- mean(progesterone); m1; s1 <- sqrt(var(progesterone)); s1
    [1] 2.94
    [1] 1.968163
    m2 <- mean(control); m2; s2 <- sqrt(var(control)); s2
    [1] 2.305
    [1] 1.951862

so pooled standard deviation is

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \approx$$

(i) 1.95 (ii) 1.96 (iii) 1.97
(which is not surprising since $s_1 \approx 1.97$, $s_2 \approx 1.95$)

    n1 <- length(progesterone); n2 <- length(control)
    s12 <- var(progesterone); s22 <- var(control)
    sp <- sqrt(((n1-1)*s12 + (n2-1)*s22)/(n1+n2-2)); sp
    [1] 1.96003

and critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI, with degrees of freedom = $n_1 + n_2 - 2 = 4 + 4 - 2 =$ (i) 4 (ii) 6 (iii) 8, so
$t_{\alpha/2} = t_{0.05/2} = t_{0.025} \approx$ (i) 2.31 (ii) 2.45 (iii) 3.09,

    qt(0.975,6) # critical t value, 95% CI, n1 + n2 - 2 = 6 df
    [1] 2.446912

and so 95% CI for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm s_p \cdot t_{\alpha/2}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} =$$

(i) (−2.52, 6.49) (ii) (−2.62, 6.39) (iii) (−2.76, 4.03)
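The pooled interval can also be obtained directly from the raw data with base R's `t.test`, which is a convenient check on the step-by-step calculation. In this sketch, `var.equal = TRUE` requests the pooled-variance version with $n_1 + n_2 - 2$ degrees of freedom; the object name `pooled` is illustrative.

```r
# Pooled-variance 95% CI for mu1 - mu2 from the raw data (stats::t.test).
progesterone <- c(5.85, 2.28, 1.51, 2.12)
control <- c(5.23, 1.21, 1.40, 1.38)
pooled <- t.test(progesterone, control, var.equal = TRUE, conf.level = 0.95)
pooled.ci <- as.numeric(pooled$conf.int)   # lower and upper limits
print(round(pooled.ci, 2))
```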

and critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI, with degrees of freedom =

$$r = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1-1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2-1}\left(\frac{s_2^2}{n_2}\right)^2} = \frac{\left(\frac{1.97^2}{4} + \frac{1.95^2}{4}\right)^2}{\frac{1}{4-1}\left(\frac{1.97^2}{4}\right)^2 + \frac{1}{4-1}\left(\frac{1.95^2}{4}\right)^2} \approx$$

(i) 4 (ii) 6 (iii) 8 (same as when $\sigma_1^2 = \sigma_2^2$); R reports df = 5.999585,
so $t_{\alpha/2} = t_{0.05/2} = t_{0.025} \approx$ (i) 2.31 (ii) 2.45 (iii) 3.09,

    qt(0.975,6) # critical t value, 95% CI, n1 + n2 - 2 = 6 df
    [1] 2.446912

and so 95% CI for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} =$$

(i) (−2.52, 6.49) (ii) (−2.62, 6.39) (iii) (−2.76, 4.03)

    mean2.t.interval(m1,m2,s1,s2,n1,n2, 0.95,"diff.var")

    Mean Difference              df  Critical Value Margin of Error        CI lower        CI upper
           0.635000        5.999585        2.446953        3.391355       -2.756355        4.026355

Since confidence interval does include zero, this indicates progesterone population mean cellular response (i) is less than (ii) equals (iii) is greater than (iv) is different from control population mean cellular response.
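The degrees-of-freedom formula for unequal variances is easy to mistype by hand, so a small helper can be useful. The sketch below is an illustration (the name `welch.df` is hypothetical and is not the notes' `mean2.t.interval`); it compares its result with the df that base R's `t.test` reports when `var.equal` is left at its default of `FALSE`.

```r
# Hypothetical helper: degrees of freedom r for the unequal-variance t-interval.
welch.df <- function(s1, s2, n1, n2) {
  v1 <- s1^2 / n1                      # s1^2 / n1
  v2 <- s2^2 / n2                      # s2^2 / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

progesterone <- c(5.85, 2.28, 1.51, 2.12)
control <- c(5.23, 1.21, 1.40, 1.38)
r <- welch.df(sd(progesterone), sd(control), length(progesterone), length(control))
r.ttest <- as.numeric(t.test(progesterone, control)$parameter)  # df from t.test
print(c(formula = r, t.test = r.ttest))
```

Because the two sample sizes are equal and the two sample SDs are nearly equal, $r$ comes out just under 6, which is why the pooled and unpooled intervals nearly coincide in this example.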

Inference for difference in dependent means, $\mu_d$: milk yield.
A study is conducted to determine effect of "gentech" animal feed on milk yield of 9 cows. Cow 1 is fed a control feed for three months and then gentech feed for next three months for comparison purposes. Other cows are treated in same way. Calculate 95% CI of mean paired differences in milk yield. Fill in blanks.

| cow | gentech (1) | control (2) | differences, $d_i$ |
| --- | --- | --- | --- |
| 1 | 62 | 54 | |
| 2 | 45 | 43 | |
| 3 | 53 | 55 | |
| 4 | 35 | 39 | |
| 5 | 71 | 65 | |
| 6 | 64 | 62 | |
| 7 | 63 | 56 | |
| 8 | 57 | 50 | |
| 9 | 43 | 52 | |

    gentech <- c(62, 45, 53, 35, 71, 64, 63, 57, 43)
    control <- c(54, 43, 55, 39, 65, 62, 56, 50, 52)
    diff <- gentech - control; diff
    [1]  8  2 -2 -4  6  2  7  7 -9

$\bar{d} \approx$ (i) 1.41 (ii) 1.89 (iii) 2.52,
$s_d \approx$ (i) 5.47 (ii) 5.86 (iii) 6.52,

    mean(diff); sqrt(var(diff))
    [1] 1.888889
    [1] 5.861835

with $n - 1 = 9 - 1 =$ (i) 6 (ii) 7 (iii) 8 degrees of freedom, and critical value for 95% = $(1-\alpha)\cdot 100\%$ = $(1-0.05)\cdot 100\%$ CI, so
$t_{\alpha/2} = t_{0.05/2} = t_{0.025} \approx$ (i) 2.31 (ii) 2.53 (iii) 3.09,

    qt(0.975,8) # critical t value, 95% CI, nd - 1 = 9 - 1 = 8 df
    [1] 2.306004

and so 95% CI for $\mu_d$ is

$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 1.89 \pm 2.31 \times \frac{5.86}{\sqrt{9}} =$$

(i) (−2.52, 6.49) (ii) (−2.62, 6.39) (iii) (−2.72, 6.29)

    mean1.t.interval <- function(m,s,n,conf.level) {  # m: mean, s: SD, n: sample size
      t.crit <- -1*qt((1-conf.level)/2,n-1)           # critical value t_{alpha/2}, n-1 df
      margin.error <- t.crit*s/sqrt(n)
      ci.lower <- m - margin.error
      ci.upper <- m + margin.error
      dat <- c(m, t.crit, margin.error, ci.lower, ci.upper)  # m, not the function mean
      names(dat) <- c("Mean", "Critical Value", "Margin of Error", "CI lower", "CI upper")
      return(dat)
    }
    mean1.t.interval(1.889,5.8618,9,0.95) # m: mean, s: SD, n: sample size, 95% t-interval

               Mean  Critical Value Margin of Error        CI lower        CI upper
           1.889000        2.306004        4.505778       -2.616778        6.394778

Since confidence interval does include zero, this indicates gentech population mean milk yield (i) is less than (ii) equals (iii) is greater than (iv) is different from control population mean milk yield.
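The paired interval is a one-sample t-interval applied to the differences, so base R's `t.test` with `paired = TRUE` should reproduce it from the raw data. A short sketch, with illustrative object names:

```r
# Paired 95% CI for the mean difference in milk yield (stats::t.test).
gentech <- c(62, 45, 53, 35, 71, 64, 63, 57, 43)
control <- c(54, 43, 55, 39, 65, 62, 56, 50, 52)
paired <- t.test(gentech, control, paired = TRUE, conf.level = 0.95)
paired.ci <- as.numeric(paired$conf.int)   # lower and upper limits for mu_d
print(round(paired.ci, 2))
```

The limits agree with the hand calculation up to the rounding of $\bar{d}$ and $s_d$ used in the `mean1.t.interval` call.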
