Multiple Regression
Multiple regression is the extension of simple regression to the prediction of one dependent variable from more than one independent variable. The resulting models produce better predictions, but the calculations and interpretation are more complex.

In the case of several independent variables, “linear” means that the regression equation is linear in all of the independent variables and thus has the form y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + ... + b_p x_p (p represents the number of predictors).

We use a model matching the model used for simple regression: we assume that in the population y is given by y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p + ε, satisfying the conditions:

1. The errors are independent.
2. The errors (and so the y’s) have equal variance across the whole range of all the x_i’s.
3. The errors (that is, the ε’s) are normally distributed with mean 0.

In simple linear regression the geometric interpretation of the regression equation was a straight line. In multiple linear regression with two independent variables the regression equation, y = b_0 + b_1 x_1 + b_2 x_2, is represented by a plane in three-space. We cannot give a geometric representation for three or more independent variables.

An example:
The price of a house is to be predicted from the number of bedrooms and bathrooms.

    #bedrooms (x_1)    #bathrooms (x_2)    Price ($10,000s) (y)
          3                  2                   13.6
          2                  1                   11.6
          4                  3                   15.6
          2                  1                    9.8
          3                  2                   14.0
          2                  2                   12.2
          5                  3                   17.2
          4                  2                   15.4
          4                  2.5                 17.0
          5                  3.5                 18.2

The regression equation is y = 6.87 + 1.60 x_1 + 0.979 x_2.
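
The notes rely on Minitab for the fitting; as a rough cross-check, here is a minimal Python sketch (Python and numpy are my addition, not part of the notes) that fits the same ten houses by ordinary least squares and should reproduce coefficients close to 6.87, 1.60, and 0.979.

```python
import numpy as np

# House data from the table above: bedrooms (x1), bathrooms (x2), price in $10,000s (y)
x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5], dtype=float)
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])

# Design matrix with a leading column of 1's for the intercept b0
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates b0, b1, b2
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # roughly [6.87, 1.60, 0.979]
```
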
Standard deviation of the errors (residuals)

With two independent variables it is estimated by

    s = \sqrt{ \sum (y_i - \hat{y}_i)^2 / (n - 3) }

In general, with p predictors,

    s = \sqrt{ \sum (y_i - \hat{y}_i)^2 / (n - 1 - p) } = \sqrt{ SSE / (n - 1 - p) }

because there are p + 1 parameters estimated (to calculate \hat{y}_i): the numbers β_0, β_1, ..., β_p.
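
Continuing the illustrative sketch above (same assumed arrays), s follows directly from the residuals; here n = 10 and p = 2, so the divisor is n − 1 − p = 7.

```python
import numpy as np

# Same data and fit as in the earlier sketch
x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

n, p = len(y), 2                 # 10 observations, 2 predictors
resid = y - X @ b                # y_i - yhat_i
SSE = np.sum(resid**2)           # sum of squared errors
s = np.sqrt(SSE / (n - 1 - p))   # residual standard deviation, df = n - 1 - p
print(s)                         # roughly 0.8 for these data
```
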
The normal equations for two independent variables:

    \sum y     = n b_0   + b_1 \sum x_1     + b_2 \sum x_2
    \sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2   + b_2 \sum x_1 x_2
    \sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2

We will not write down the normal equations for regression with three or more independent variables; they are really matrix equations. We will rely on Minitab for calculation of regression coefficients.
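
For the two-predictor case the normal equations are small enough to assemble and solve directly; this sketch (again an illustration in Python rather than the Minitab route the notes take) should give the same b_0, b_1, b_2 as the least-squares fit above.

```python
import numpy as np

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
n = len(y)

# Coefficient matrix and right-hand side of the three normal equations above
A = np.array([[n,         x1.sum(),       x2.sum()],
              [x1.sum(),  (x1**2).sum(),  (x1*x2).sum()],
              [x2.sum(),  (x1*x2).sum(),  (x2**2).sum()]])
rhs = np.array([y.sum(), (x1*y).sum(), (x2*y).sum()])

b0, b1, b2 = np.linalg.solve(A, rhs)   # same b0, b1, b2 as the least-squares fit
print(b0, b1, b2)
```
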
Meaning of the coefficients:
The regression coefficients have a slightly more subtle interpretation with multiple predictors. As before, the coefficient b_i estimates β_i, but β_i gives the amount of change in y due to a change in x_i if all the other variables are held constant (no change in other variables), so it gives the effect that x_i has on y that is distinct from change due to the other variables. Sometimes this is different from (often less than) what you would see as the coefficient in simple regression.
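
With the house data this shows up clearly. The sketch below (my illustration, not from the notes) compares the bedroom coefficient from a simple regression on bedrooms alone with its coefficient in the two-predictor model; because bedrooms and bathrooms are correlated, the simple-regression slope should come out noticeably larger (roughly 2.2 versus 1.6).

```python
import numpy as np

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])

# Simple regression: y on bedrooms only
simple = np.linalg.lstsq(np.column_stack([np.ones_like(x1), x1]), y, rcond=None)[0]

# Multiple regression: y on bedrooms and bathrooms
multiple = np.linalg.lstsq(np.column_stack([np.ones_like(x1), x1, x2]), y, rcond=None)[0]

print(simple[1])    # slope for bedrooms alone (roughly 2.2)
print(multiple[1])  # bedroom coefficient with bathrooms held constant (roughly 1.6)
```
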
Coefficient of determination:
As with simple regression, R^2 [always capitalized; there is no multiple-variable correlation coefficient r] indicates the proportion of total variation that is “explained” by the regression equation.
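
A sketch of the R^2 computation for the house data, using R^2 = 1 − SSE/SST with SST the total sum of squares about the mean of y (the code itself is my illustration; the notes would read R^2 off the Minitab output).

```python
import numpy as np

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b  = np.linalg.lstsq(X, y, rcond=None)[0]

SSE = np.sum((y - X @ b)**2)      # unexplained variation
SST = np.sum((y - y.mean())**2)   # total variation about the mean
R2  = 1 - SSE / SST               # proportion of variation explained
print(R2)                         # roughly 0.93 for these data
```
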
Adjusted coefficient of determination:

Unfortunately, adding even a total nonsense variable x_k will increase (or at least never reduce) R^2. We use

    R_a^2 = 1 - (1 - R^2) (n - 1) / (n - 1 - p)

for comparing regression models with different numbers of predictors (so that adding non-useful variables does not give the appearance of a meaningful improvement).
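
The adjustment is a one-line formula; continuing the same illustrative sketch with n = 10 and p = 2:

```python
import numpy as np

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b  = np.linalg.lstsq(X, y, rcond=None)[0]

n, p = len(y), 2
R2     = 1 - np.sum((y - X @ b)**2) / np.sum((y - y.mean())**2)
R2_adj = 1 - (1 - R2) * (n - 1) / (n - 1 - p)   # penalizes extra predictors
print(R2, R2_adj)                                # roughly 0.93 and 0.91 here
```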

Test for significance: Our overall test for significance of the regression equation (“Do we have a linear part to any relation between y and the collection of x_i’s?”) is

    H_0: β_1 = β_2 = ... = β_p = 0   [all β_i’s are 0; in the population there is no linear part to any relationship between y and the x_i’s]
    H_a: not all β_i’s are 0   [in the population there is a linear part to a relationship between y and one or more of the x_i’s]

The test statistic is F = MSR / MSE, with numerator degrees of freedom p (degrees of freedom of the regression) and denominator degrees of freedom n − p − 1 [degrees of freedom of the error]. Here MSR and MSE [and the degrees of freedom] are the mean squares [and degrees of freedom] calculated by Minitab in the Analysis of Variance table. The square root of MSE is again s, the standard deviation of the residuals (but this time calculated with degrees of freedom n − p − 1).
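
A sketch of the overall F test for the house data (scipy is used here for the tail probability; the notes would read F, the mean squares, and the p-value off Minitab’s ANOVA table instead):

```python
import numpy as np
from scipy import stats

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b  = np.linalg.lstsq(X, y, rcond=None)[0]

n, p = len(y), 2
SSE = np.sum((y - X @ b)**2)
SST = np.sum((y - y.mean())**2)
SSR = SST - SSE                      # variation explained by the regression

MSR = SSR / p                        # mean square for regression, df = p
MSE = SSE / (n - p - 1)              # mean square error, df = n - p - 1
F   = MSR / MSE                      # large F -> reject H0
p_value = stats.f.sf(F, p, n - p - 1)   # P(F_{p, n-p-1} > observed F)
print(F, p_value)                    # F is large here (roughly 48)
```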

Test for significance on individual coefficients: If the F-test shows the regression is significant overall, we can test for significance of individual coefficients (notice the similarity to ANOVA here: we must test for the overall model first, and only go to individuals if the overall test shows significance). The test for the i-th coefficient (i-th variable) is

    H_0: β_i = 0   [no independent linear contribution of x_i to the prediction]
    H_a: β_i ≠ 0   [x_i contributes to the linear prediction of y independently of the other variables]

The test statistic is the sample t = b_i / s_{b_i}, with n − p − 1 degrees of freedom and with s_{b_i} = s / \sqrt{SS_{x_i}} [as in the one-predictor case]. [Minitab gives the t-values, as well as this [sample] standard error for b_i, in the regression printout.]

Estimation of a coefficient: We get a (1 − α) confidence interval for β_i using b_i ± t_{α/2} s_{b_i}.
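
The notes leave these computations to Minitab. As an illustrative sketch, the standard errors s_{b_i} can be taken from the diagonal of s^2 (XᵀX)⁻¹, which is the general matrix form regression software uses; the t statistics and the (1 − α) confidence intervals then follow the formulas above.

```python
import numpy as np
from scipy import stats

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b  = np.linalg.lstsq(X, y, rcond=None)[0]

n, p = len(y), 2
df = n - p - 1
s2 = np.sum((y - X @ b)**2) / df              # MSE = s^2
XtX_inv = np.linalg.inv(X.T @ X)
se_b = np.sqrt(s2 * np.diag(XtX_inv))         # standard errors s_{b_i}

t_stats  = b / se_b                            # t = b_i / s_{b_i}, df = n - p - 1
p_values = 2 * stats.t.sf(np.abs(t_stats), df) # two-sided p-values

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df)
ci_low, ci_high = b - t_crit * se_b, b + t_crit * se_b   # b_i ± t_{α/2} s_{b_i}
print(t_stats, p_values)
print(ci_low, ci_high)
```
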

In general, we want to eliminate from our model the variables whose coefficients are not significant; having non-useful variables costs us degrees of freedom and, in general, hides the significant relationships. It may happen, though, that two variables are each not significantly related to the response when both are in the equation, but each one is significant without the other; this indicates that they give some of the same information about the response (the independent contributions are small but the common contribution is real). In this case we would prefer to use one, but not both, of the variables as a predictor.

Confidence interval for μ_{y|x_p}: given by ŷ_p ± t_{α/2} s_{ŷ}.

Prediction interval for an individual value of y|x_p: given by ŷ_p ± t_{α/2} s_p. The standard error estimates s_{ŷ} and s_p involve matrix multiplications and the formulas will not be given here; we will use Minitab to calculate confidence intervals (for μ_{y|x_p}) and prediction intervals (for y|x_p).
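
Although the notes defer these standard errors to Minitab, the matrix form is short enough to sketch: for a new predictor row x_0 (with a leading 1 for the intercept), s_{ŷ} = s·\sqrt{x_0ᵀ (XᵀX)⁻¹ x_0} and s_p = s·\sqrt{1 + x_0ᵀ (XᵀX)⁻¹ x_0}. The example point (3 bedrooms, 2 bathrooms) is chosen only for illustration.

```python
import numpy as np
from scipy import stats

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5])
y  = np.array([13.6, 11.6, 15.6, 9.8, 14.0, 12.2, 17.2, 15.4, 17.0, 18.2])
X  = np.column_stack([np.ones_like(x1), x1, x2])
b  = np.linalg.lstsq(X, y, rcond=None)[0]

n, p = len(y), 2
df = n - p - 1
s = np.sqrt(np.sum((y - X @ b)**2) / df)
XtX_inv = np.linalg.inv(X.T @ X)

x0 = np.array([1.0, 3, 2])        # new point: intercept term, 3 bedrooms, 2 bathrooms
y_hat = x0 @ b                    # predicted mean price at x0
h = x0 @ XtX_inv @ x0             # the x0' (X'X)^{-1} x0 term

t_crit = stats.t.ppf(0.975, df)   # 95% intervals
ci = (y_hat - t_crit * s * np.sqrt(h),     y_hat + t_crit * s * np.sqrt(h))      # for mu_{y|x0}
pi = (y_hat - t_crit * s * np.sqrt(1 + h), y_hat + t_crit * s * np.sqrt(1 + h))  # for an individual y
print(ci, pi)
```
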

Qualitative [categorical] variables: “Dummy” or “indicator” variables [also called “binary” variables because the values are 0 or 1 for “yes” or “no”] are used for categorical variables. The coefficient indicates the effect of the “1” category (as compared to the “0” category) on the (average) value of the response. Tests for significance work as with other variables. The number of dummy variables needed to represent a categorical variable is, in general, one less than the number of categories [because one category can be represented by “none of the others”, i.e. all 0’s].
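
A small sketch of the encoding (the neighborhood variable and its categories are made up for illustration): three categories need only two 0/1 indicator columns, with the baseline category represented by both indicators being 0.

```python
import numpy as np

# Hypothetical categorical predictor with three categories
neighborhood = np.array(["north", "south", "east", "north", "east", "south"])

# Two dummy variables suffice for three categories; "east" is the baseline (both 0)
d_north = (neighborhood == "north").astype(float)
d_south = (neighborhood == "south").astype(float)

# These columns would be appended to the design matrix alongside the numeric predictors
print(np.column_stack([d_north, d_south]))
```
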

Multicollinearity: “Multicollinearity” refers to correlation among the independent variables (predictors). It makes t tests for the significance of the coefficients of the independent variables invalid, but does not harm the predictability of the dependent variable. A “rule of thumb” says that if the absolute value of the correlation coefficient (r_{x_i x_j}) between any pair of independent variables exceeds 0.7, then t-tests on the regression coefficients will not be reliable. [The coefficients will in general be less significant than indicated by the t-tests.]
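
The rule of thumb is easy to check from the correlation matrix of the predictors; for the house data the bedroom/bathroom correlation comes out around 0.89, well above 0.7, so the individual t-tests there should be read with caution.

```python
import numpy as np

x1 = np.array([3, 2, 4, 2, 3, 2, 5, 4, 4, 5.0])   # bedrooms
x2 = np.array([2, 1, 3, 1, 2, 2, 3, 2, 2.5, 3.5]) # bathrooms

r = np.corrcoef(x1, x2)[0, 1]   # pairwise correlation between the two predictors
print(r)                        # roughly 0.89, above the 0.7 rule-of-thumb threshold
```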