

An overview of inferential statistics for comparing means and proportions between two populations. It covers estimating and testing differences in means and in proportions, as well as the special case of matched samples, and gives formulas and procedures for calculating confidence intervals and performing hypothesis tests.
Inference (hypothesis tests and estimation): differences between means and proportions in two populations

The idea: we have two populations and a variable of interest. We may be interested in the difference in mean values (on this variable) in the two populations [difference of means], or in the difference in the proportions of the populations that have a particular value on the variable [difference of proportions].
In dealing with means, we want to either estimate the difference μ₁ − μ₂ or test a hypothesis about it. In dealing with proportions, we want to either estimate the difference p₁ − p₂ or test a hypothesis about it.
The methods parallel the methods for estimation and for tests on the mean of one population, but the calculations are different because we have different (and more complicated) distributions. There is a special situation [the “matched samples” case] which is usually discussed with (and often confused with) inference on two populations but is really a special case of inference on one population [of differences].
Difference of means (independent samples)
The basic important fact is that our best estimator of μ₁ − μ₂ (the difference between the population means — order of subtraction matters) is the difference between sample means x̄₁ − x̄₂. The mean of the difference in sample means (as long as we keep the sample sizes the same) is exactly the difference in the population means (in the same order):

μ_{x̄₁ − x̄₂} = μ₁ − μ₂

and the variance of x̄₁ − x̄₂ is the sum of the variances of x̄₁ and x̄₂, so

σ_{x̄₁ − x̄₂} = √( σ₁²/n₁ + σ₂²/n₂ )

In addition, if X₁ and X₂ are approximately normally distributed, or if the sample sizes are large enough, then the distribution of x̄₁ − x̄₂ is approximately normal, which means

( x̄₁ − x̄₂ − (μ₁ − μ₂) ) / √( σ₁²/n₁ + σ₂²/n₂ )  is a Z.
Thus, if we happen to know σ₁ and σ₂, and n₁, n₂ are large enough or X₁, X₂ are approximately normal, our 1 − α confidence interval for μ₁ − μ₂ is given by

x̄₁ − x̄₂ ± E  with  E = Z_{α/2} √( σ₁²/n₁ + σ₂²/n₂ )

In the usual situation we don’t know σ₁, σ₂; the standardized difference of sample means, computed using s₁, s₂, involves four quantities that vary from sample to sample, and is not even really a t. It is closely approximated by a t (if X₁, X₂ are normal or n₁, n₂ are large), but to make the approximation work we have to use an unusual value for the degrees of freedom. So our interval for confidence 1 − α is given by

x̄₁ − x̄₂ ± E  with  E = t_{α/2} √( s₁²/n₁ + s₂²/n₂ )

df = ( s₁²/n₁ + s₂²/n₂ )² / [ (1/(n₁−1)) (s₁²/n₁)² + (1/(n₂−1)) (s₂²/n₂)² ]
[This is the fractional degrees-of-freedom value that will be reported by your calculator or by Minitab if you use either of these for the calculation.] Testing follows the same six-step procedure as testing on one mean, but slightly different numbers appear. There are the same three forms for the alternative — the order in which the two populations are identified will matter for one-sided tests.
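For concreteness, the interval and the fractional degrees of freedom can be computed as in this minimal Python sketch (the function name and argument names are mine, not from the notes); the t_{α/2} value must still be looked up, from a table or software, at the df the function reports:

```python
import math

def welch_interval(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """1 - alpha confidence interval for mu1 - mu2 (sigma1, sigma2 unknown).

    t_crit is t_{alpha/2}, looked up at the Welch degrees of freedom,
    which is returned alongside the interval."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # s1^2/n1 and s2^2/n2
    # fractional (Welch) degrees of freedom
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    E = t_crit * math.sqrt(v1 + v2)          # margin of error
    diff = xbar1 - xbar2
    return (diff - E, diff + E), df
```

With equal sample sizes and equal sample standard deviations the formula collapses to n₁ + n₂ − 2; for example s₁ = s₂ = 2 and n₁ = n₂ = 10 gives df = 18.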
“Greater”: H₀: μ₁ = μ₂ vs Hₐ: μ₁ > μ₂ [or H₀: μ₁ − μ₂ = 0 vs Hₐ: μ₁ − μ₂ > 0]; reject H₀ if sample t > t_α
“Less”: H₀: μ₁ = μ₂ vs Hₐ: μ₁ < μ₂ [or H₀: μ₁ − μ₂ = 0 vs Hₐ: μ₁ − μ₂ < 0]; reject H₀ if sample t < −t_α
“Not equal”: H₀: μ₁ = μ₂ vs Hₐ: μ₁ ≠ μ₂ [or H₀: μ₁ − μ₂ = 0 vs Hₐ: μ₁ − μ₂ ≠ 0]; reject H₀ if sample t < −t_{α/2} or sample t > t_{α/2}
sample t = ( x̄₁ − x̄₂ − (μ₁ − μ₂) ) / √( s₁²/n₁ + s₂²/n₂ )

with df = ( s₁²/n₁ + s₂²/n₂ )² / [ (1/(n₁−1)) (s₁²/n₁)² + (1/(n₂−1)) (s₂²/n₂)² ]
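A matching Python sketch for the test statistic (names again mine); under H₀ the hypothesized difference μ₁ − μ₂ is 0, and the df is the same fractional value as for the interval:

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, diff0=0.0):
    """Sample t for H0: mu1 - mu2 = diff0 (0 in the tests above),
    returned together with the Welch degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (xbar1 - xbar2 - diff0) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df
```

The returned t is then compared with t_α (or ±t_{α/2}) at that df, per the rejection rules above.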
Difference of proportions
The basic important fact is that our best estimator of p₁ − p₂ (the difference between the population proportions — order of subtraction matters) is the difference between the sample proportions p̄₁ − p̄₂. The mean of the difference in sample proportions (as long as we don’t change the sample sizes) is exactly the difference in the population proportions (in the same order):

μ_{p̄₁ − p̄₂} = p₁ − p₂

and the variance of p̄₁ − p̄₂ is the sum of the variances, so

σ_{p̄₁ − p̄₂} = √( p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂ )

If the sample sizes are large enough for the proportions (that is, if n₁p₁, n₁ − n₁p₁, n₂p₂, n₂ − n₂p₂ are all at least 5), then p̄₁ − p̄₂ will be approximately normally distributed, which means

( p̄₁ − p̄₂ − (p₁ − p₂) ) / √( p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂ )  is a Z.

In working with proportions we don’t have an independent calculation of the standard deviation — it depends on the proportion — so we don’t get involved with t. For estimation, we have the problem that we don’t know p₁, p₂ to put into the formula, so we make do with p̄₁, p̄₂. If our sample sizes are large enough for our proportions, our 1 − α confidence interval for p₁ − p₂ is given by

p̄₁ − p̄₂ ± E  with  E = Z_{α/2} √( p̄₁(1−p̄₁)/n₁ + p̄₂(1−p̄₂)/n₂ )
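A Python sketch of this interval (function name is mine); the standard library’s NormalDist supplies Z_{α/2}, so no table lookup is needed here:

```python
import math
from statistics import NormalDist

def two_prop_interval(x1, n1, x2, n2, conf=0.95):
    """1 - alpha confidence interval for p1 - p2,
    from success counts x1, x2 out of n1, n2 trials."""
    pb1, pb2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # Z_{alpha/2}
    E = z * math.sqrt(pb1 * (1 - pb1) / n1 + pb2 * (1 - pb2) / n2)
    return pb1 - pb2 - E, pb1 - pb2 + E
```

For example, 40 successes in 100 trials versus 30 in 100 gives an interval centered at 0.1.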
Testing follows the same six-step procedure as testing on one mean, but slightly different numbers appear. There are the same three forms for the alternative — the order in which the two populations are identified will matter for one-sided tests. Since our null hypothesis is always “the difference between p₁ and p₂ is 0” (p₁ = p₂), we calculate the standard error of the difference using p̄ = (n₁p̄₁ + n₂p̄₂)/(n₁ + n₂) (= total number of successes / total number of trials) in place of both p₁ and p₂. [This is referred to as the “pooled estimate of the proportion”.]
“Greater”: H₀: p₁ = p₂ vs Hₐ: p₁ > p₂ [or H₀: p₁ − p₂ = 0 vs Hₐ: p₁ − p₂ > 0]; reject H₀ if sample Z > Z_α
“Less”: H₀: p₁ = p₂ vs Hₐ: p₁ < p₂ [or H₀: p₁ − p₂ = 0 vs Hₐ: p₁ − p₂ < 0]; reject H₀ if sample Z < −Z_α
“Not equal”: H₀: p₁ = p₂ vs Hₐ: p₁ ≠ p₂ [or H₀: p₁ − p₂ = 0 vs Hₐ: p₁ − p₂ ≠ 0]; reject H₀ if sample Z < −Z_{α/2} or sample Z > Z_{α/2}
sample Z = ( p̄₁ − p̄₂ − (p₁ − p₂) ) / √( p̄(1−p̄)/n₁ + p̄(1−p̄)/n₂ ) = ( p̄₁ − p̄₂ − (p₁ − p₂) ) / √( p̄(1−p̄)(1/n₁ + 1/n₂) )
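The pooled test statistic, in the same Python sketch style (names mine); under H₀ the hypothesized p₁ − p₂ is 0:

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Sample Z for H0: p1 = p2, using the pooled estimate pbar."""
    pb1, pb2 = x1 / n1, x2 / n2
    pbar = (x1 + x2) / (n1 + n2)   # total successes / total trials
    se = math.sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))
    return (pb1 - pb2) / se        # hypothesized p1 - p2 is 0
```

The result is compared with Z_α (or ±Z_{α/2}) per the rejection rules above.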
Matched samples (Paired data) - means
This is the situation in which we have two sets of values (for the same variable) but each value in one set is related to a corresponding value in the other, so it makes sense to talk about the differences [individual differences, not just the difference of the means]. We are interested in the mean of the differences. Examples:
Measurements of resting heart rates of people before and after an exercise program (pair is before and after on the same person)
Selling price of a selection of standard items at Wal-Mart and at Target (pair is prices at the two stores on the same item)
In this situation we work directly with the differences: d = x₁ − x₂. [It is usually necessary to keep track of the order of subtraction: “before” minus “after” gives a different sign from “after” minus “before” — and this will matter, especially for one-sided tests.] If the variable we observe is normally distributed, or if the sample size (note n = number of pairs) is large enough, then the sample mean of the differences will be approximately normally distributed (that is, (d̄ − μ_d)/σ_{d̄} will be a Z and (d̄ − μ_d)/(s_d/√n) will be a t). To estimate the mean difference we use:

d̄ ± E  with  E = t_{α/2} · s_d/√n
Our tests follow the same six steps and we have the same three cases:

“Greater”: H₀: μ_d = 0 vs Hₐ: μ_d > 0; reject H₀ if sample t > t_α
“Less”: H₀: μ_d = 0 vs Hₐ: μ_d < 0; reject H₀ if sample t < −t_α
“Not equal”: H₀: μ_d = 0 vs Hₐ: μ_d ≠ 0; reject H₀ if sample t < −t_{α/2} or sample t > t_{α/2}

sample t = (d̄ − 0) / (s_d/√n),  df = n − 1
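As a Python sketch (function and variable names are mine), with the standard library’s mean and stdev supplying d̄ and s_d:

```python
import math
from statistics import mean, stdev

def paired_t(x1, x2):
    """Sample t for H0: mu_d = 0 with d = x1 - x2; also returns df = n - 1."""
    d = [a - b for a, b in zip(x1, x2)]   # individual differences, in order
    n = len(d)                            # n = number of pairs
    t = (mean(d) - 0) / (stdev(d) / math.sqrt(n))
    return t, n - 1
```

For example, hypothetical heart rates before = [80, 75, 90, 70] and after = [78, 72, 85, 69] give d = [2, 3, 5, 1], d̄ = 2.75, and df = 3.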