Understanding Simple Linear Regression: Model, Assumptions, and Estimation, Schemes and Mind Maps of Calculus

Oxford Brookes University (OBU), Calculus

These notes introduce simple linear regression, focusing on the model, its assumptions, and the estimation of the unknown intercept and slope parameters. Using the example of the relationship between height and salary, they discuss the normal distribution of errors, the least squares method for finding the best-fitting line, and the interpretation of the resulting regression equation.

Typology: Schemes and Mind Maps

2021/2022

Uploaded on 09/27/2022

melanycox 🇬🇧


17. SIMPLE LINEAR REGRESSION II

The Model

In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X. In real applications there will almost always be some random variability which blurs any underlying systematic relationship which might exist.

We formalize these ideas by writing down a model which specifies exactly how the systematic and random components come together to produce our data.

Note that a model is simply a set of assumptions about how the world works. Of course, all models are wrong, but they can help us in understanding our data, and (if judiciously selected) may serve as useful approximations to the truth.

We start by assuming that for each value of X, the corresponding value of Y is random and has a normal distribution.

[Figure: error probability distributions centered on the line E(Y|X) = α + βX at x1, x2, x3]

Even if we consider only graduates who are all the same height (say 6 feet), their salaries will fluctuate, and are therefore not perfectly predictable. Perhaps the mean salary for 6-foot-tall graduates is $6300 per month. Clearly, the mean salary for 5½-foot-tall graduates is less than it is for 6-foot-tall graduates.

Based on our data, we are going to try to estimate the mean value of y for a given value of x, denoted by E(Y|X).

(Can you suggest a simple way to do this?)

In general, E(Y|X) will depend on X.

Viewed as a function of X, E(Y|X) is called the true regression of Y on X.

In linear regression analysis, we assume that the true regression function E(Y|X) is a linear function of X:

$$E(Y \mid x) = \alpha + \beta x$$

The parameter β is the slope of the true regression line, and can be interpreted as the population mean change in y for a unit change in x.

The parameter α is the intercept of the true regression line and can be interpreted as the mean value of Y when X is zero.

For this interpretation to apply in practice, however, it is necessary to have data for X near zero.

In the advertising example, α would represent the mean baseline sales level without any advertisements.

Unfortunately, the notation α is traditionally used for both the significance level of a hypothesis test and for the intercept of the true regression line.

In practice, α and β will be unknown parameters, which must be estimated from our data. The reason is that our observed data will not lie exactly on the true regression line. (Why?)

Instead, we assume that the true regression line is observed with error, i.e., that $y_i$ is given by

$$y_i = \alpha + \beta x_i + \varepsilon_i, \qquad (1)$$

for $i = 1, \ldots, n$. Here, $\varepsilon_i$ is a normally distributed random variable (called the error) with mean 0 and variance $\sigma^2$, which does not depend on x.

We also assume that the values of ε associated with any two values of y are independent.

Thus, the error at one point does not affect the error at any other point.

The variance of the error, σ^2, measures how close the points are to the true line, in terms of expected squared vertical deviation.

Under the model (1), which has normal errors, there is approximately a 95% probability that a y value will fall within ±2σ of the true line (measured vertically).

So σ measures the "thickness" of the band of points as they scatter about the true line.
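The 95% figure can be checked directly from the normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def prob_within(k):
    """P(|eps| <= k * sigma) for eps ~ N(0, sigma^2).

    After dividing by sigma this is a standard normal probability,
    so the answer does not depend on sigma.
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

print(round(prob_within(2.0), 4))   # about 0.9545
print(round(prob_within(1.96), 4))  # about 0.95
```

So "within ±2σ" covers slightly more than 95% of the y values, and the exact 95% band is ±1.96σ.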

Given our data on x and y, we want to estimate the unknown intercept and slope α and β of the true regression line.

One approach is to find the line which best fits the data, in some sense.

In principle, we could try all possible lines a + bx and pick the line which comes "closest" to the data. This procedure is called "Least-Squares Fitting".

[R Demo: LeastSquaresFit]

The Least Squares Estimators

The resulting line,

$$\hat{y} = \hat{\alpha} + \hat{\beta} x,$$

is called the least squares line, since it makes the sum of squared vertical prediction errors as small as possible.

The least squares line is also referred to as the fitted line and the regression line, but do not confuse the regression line $\hat{y} = \hat{\alpha} + \hat{\beta} x$ with the true regression line

$$E(y \mid x) = \alpha + \beta x.$$
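The notes do not state the formulas for the least squares estimates. As a sketch, the standard closed-form solution is slope = Sxy / Sxx and intercept = ȳ − slope · x̄, which can be coded in a few lines of Python. The function names and the small data set below are made up for illustration.

```python
def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat) minimizing the sum of squared vertical errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

def sse(xs, ys, a, b):
    """Sum of squared vertical prediction errors for the line a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, not from the notes.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = least_squares(xs, ys)
print(round(a_hat, 3), round(b_hat, 3))

# Any other line has a larger sum of squared errors.
print(sse(xs, ys, a_hat, b_hat) < sse(xs, ys, a_hat + 0.1, b_hat))
print(sse(xs, ys, a_hat, b_hat) < sse(xs, ys, a_hat, b_hat - 0.1))
```

Comparing `sse` at the fitted line with `sse` at nearby lines is a small-scale version of "trying all possible lines" and keeping the closest one.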

It can be shown that the sample statistics $\hat{\alpha}$ and $\hat{\beta}$ are unbiased estimators of the population parameters α and β. That is,

$$E(\hat{\alpha}) = \alpha, \qquad E(\hat{\beta}) = \beta.$$

Give interpretations of $\hat{\alpha}$ and $\hat{\beta}$ in the Salary vs. Height example, and the Beer example (Calories / 12 oz serving vs. %Alcohol).

Instead of using complicated formulas, you can obtain the least squares estimates $\hat{\alpha}$ and $\hat{\beta}$ from Minitab.
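The unbiasedness property stated above can be illustrated by simulation: generate many data sets from model (1) with known parameters, fit a line to each, and average the estimates. This is a rough Python sketch; the true values (α = 2, β = 0.5, σ = 1), the x values, and the function names are assumptions made for the demonstration, and a simulation only suggests the result rather than proving it.

```python
import random

def fit_line(xs, ys):
    """Closed-form least squares estimates (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def average_estimates(alpha, beta, sigma, xs, n_reps=2000, seed=1):
    """Average the fitted intercept and slope over n_reps simulated data sets."""
    rng = random.Random(seed)
    sum_a = 0.0
    sum_b = 0.0
    for _ in range(n_reps):
        ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
        a_hat, b_hat = fit_line(xs, ys)
        sum_a += a_hat
        sum_b += b_hat
    return sum_a / n_reps, sum_b / n_reps

xs = [float(x) for x in range(1, 11)]  # hypothetical design points
mean_a, mean_b = average_estimates(alpha=2.0, beta=0.5, sigma=1.0, xs=xs)
print(mean_a, mean_b)  # should land close to 2.0 and 0.5
```

Individual estimates vary from sample to sample, but their averages settle near the true intercept and slope, which is what unbiasedness means.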

Regression Analysis: Salary versus Height

Coefficients

Term      Coef   SE Coef  T-Value  P-Value
Constant  -902   837      -1.08    0.
Height    100.4  12.0     8.35     0.

Regression Equation

Salary = -902 + 100.4 Height
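As a worked example of reading the regression equation, the short Python sketch below plugs heights into the fitted line. It assumes that Height is measured in inches (suggested by the 65 to 77.5 range on the plot axis) and Salary in dollars per month (as in the 6-foot example earlier); the function name `predicted_salary` is introduced here for illustration.

```python
def predicted_salary(height):
    """Fitted mean salary from the Minitab regression equation above.

    Assumes height in inches and salary in dollars per month.
    """
    return -902 + 100.4 * height

# Fitted mean salary for 6-foot-tall (72 in) and 5.5-foot-tall (66 in) graduates.
print(round(predicted_salary(72), 1))
print(round(predicted_salary(66), 1))

# The slope: estimated change in mean salary per additional inch of height.
print(round(predicted_salary(73) - predicted_salary(72), 1))
```

Under these assumptions, the fitted mean for 72 inches works out to -902 + 100.4 × 72 = 6326.8, in line with the rough $6300 per month figure used earlier, and each additional inch is associated with an estimated 100.4 increase in mean salary. The intercept of -902 is not a meaningful salary, since there are no data for heights near zero.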

[Figure: Fitted Line Plot for Salary vs. Height, Salary = -902.2 + 100.4 Height. Horizontal axis: Height; vertical axis: Salary. S 192., R-Sq 71.4%, R-Sq(adj) 70.3%]
