Linear Regression: Modeling Input-Outcome Relationship | Lecture notes Machine Learning

Unit 3 Linear Regression

• Linear regression is an analytical technique used to model the relationship between several input variables and a continuous outcome variable.
• A key assumption is that the relationship between an input variable and the outcome variable is linear.
• Although this assumption may appear restrictive, it is often possible to properly transform the input or outcome variables to achieve a linear relationship between the modified input and outcome variables.
• The physical sciences have well-known linear models, such as Ohm's Law, which states that the electrical current flowing through a resistive circuit is linearly proportional to the voltage applied to the circuit.
• Such a model is considered deterministic in the sense that if the input values are known, the value of the outcome variable is precisely determined.
• A linear regression model is a probabilistic one that accounts for the randomness that can affect any particular outcome.
• Given known input values, a linear regression model provides the expected value of the outcome variable, but some uncertainty may remain in predicting any particular outcome. Thus, linear regression models are useful in physical and social science applications where there may be considerable variation in a particular outcome based on a given set of input values.
• The following sections first present possible linear regression use cases and then the foundations of linear regression modelling.
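
The contrast between a deterministic model and a probabilistic one can be illustrated with a small simulation. The sketch below is only an illustration: the coefficient values, the Gaussian form of the random error, and the function names are assumptions chosen for the example, not part of the notes.

```python
import random


def ohms_law_current(voltage, resistance):
    """Deterministic model: the current is fully determined by the inputs."""
    return voltage / resistance


def noisy_outcome(x, beta0, beta1, sigma, rng):
    """Probabilistic model: expected value beta0 + beta1 * x plus a random error.

    The Gaussian error is an assumption made for this sketch.
    """
    return beta0 + beta1 * x + rng.gauss(0.0, sigma)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Deterministic: the same inputs always produce the same current.
currents = [ohms_law_current(12.0, 4.0) for _ in range(3)]

# Probabilistic: the same input produces outcomes scattered around the expected value.
beta0, beta1, sigma = 2.0, 0.5, 1.0  # hypothetical values for illustration
x = 10.0
expected = beta0 + beta1 * x
samples = [noisy_outcome(x, beta0, beta1, sigma, rng) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)

print("currents:", currents)
print("expected value:", expected)
print("mean of simulated outcomes:", round(sample_mean, 2))
print("smallest and largest outcome:", round(min(samples), 2), round(max(samples), 2))
```

Individual simulated outcomes differ from one another, while their average stays close to the expected value that the model provides.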

Use Cases

Linear regression is often used in business, government, and other scenarios. Some common practical applications of linear regression in the real world include the following:

• Real estate: A simple linear regression analysis can be used to model residential home prices as a function of the home's living area. Such a model helps set or evaluate the list price of a home on the market. The model could be further improved by including other input variables such as number of bathrooms, number of bedrooms, lot size, school district rankings, crime statistics, and property taxes.
• Demand forecasting: Businesses and governments can use linear regression models to predict demand for goods and services. For example, restaurant chains can appropriately prepare for the predicted type and quantity of food that customers will consume based upon the weather, the day of the week, whether an item is offered as a special, the time of day, and the reservation volume. Similar models can be built to predict retail sales, emergency room visits, and ambulance dispatches.


• Medical: A linear regression model can be used to analyze the effect of a proposed radiation treatment on reducing tumor sizes. Input variables might include duration of a single radiation treatment, frequency of radiation treatment, and patient attributes such as age or weight.
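
As a rough illustration of the real estate use case, the sketch below fits a simple linear regression of price on living area. The data points are made up for the example, and the closed-form least squares formulas used to pick the intercept and slope are an assumption of this sketch, since the notes do not prescribe a fitting method at this point.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative data: living area in square feet, list price in dollars.
living_area_sqft = [1000, 1500, 2000, 2500, 3000]
price_usd = [200_000, 260_000, 330_000, 380_000, 450_000]

intercept, slope = fit_simple_linear_regression(living_area_sqft, price_usd)

# Estimated list price for a hypothetical 1,800 square foot home.
predicted = intercept + slope * 1800

print("intercept:", intercept)
print("slope (dollars per square foot):", slope)
print("predicted price for 1800 sqft:", predicted)
```

The slope reads as the estimated change in price for each additional square foot of living area. Adding further inputs such as lot size would turn this into a multiple regression.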

Model Description

As the name of this technique suggests, the linear regression model assumes that there is a linear relationship between the input variables and the outcome variable. This relationship can be expressed as shown in Equation 6-1.

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_{p-1} x_{p-1} + \varepsilon \tag{6-1}$$

where:
• $y$ is the outcome variable
• $x_j$ are the input variables, for $j = 1, 2, \ldots, p-1$
• $\beta_0$ is the value of $y$ when each $x_j$ equals zero
• $\beta_j$ is the change in $y$ based on a unit change in $x_j$, for $j = 1, 2, \ldots, p-1$
• $\varepsilon$ is a random error term that represents the difference between the linear model and a particular observed value for $y$
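
The roles of the intercept and the coefficients can be checked numerically. The sketch below evaluates the linear part of the model (the expected value of y, without the random error term) for hypothetical coefficient values. The function name and the numbers are assumptions made for illustration.

```python
def expected_outcome(betas, xs):
    """Linear part of the model: beta0 + beta1 * x1 + ... + beta_{p-1} * x_{p-1}.

    betas holds [beta0, beta1, ..., beta_{p-1}] and xs holds [x1, ..., x_{p-1}].
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("need exactly one more coefficient than input variables")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


betas = [1.0, 2.0, -0.5]  # hypothetical coefficients, so p = 3

# With every input at zero, the expected outcome equals beta0.
at_zero = expected_outcome(betas, [0.0, 0.0])

# Raising x1 by one unit while holding x2 fixed changes the outcome by beta1.
base = expected_outcome(betas, [3.0, 4.0])
bumped = expected_outcome(betas, [4.0, 4.0])

print("value at zero inputs:", at_zero)
print("change for a unit increase in x1:", bumped - base)
```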

Suppose the goal is to build a linear regression model that estimates a person's annual income as a function of two variables, age and education, both expressed in years. In this case, income is the outcome variable, and the input variables are age and education. Although it may be an overgeneralization, such a model seems intuitively correct in the sense that people's income should increase as their skill set and experience expand with age. Also, the employment opportunities and starting salaries would be expected to be greater for those who have attained more education. However, it is also obvious that there is considerable variation in income levels for a group of people with identical ages and years of education. This variation is represented by $\varepsilon$ in the model. So, in this example, the model would be expressed as shown in Equation 6-2.

$$\text{Income} = \beta_0 + \beta_1 \, \text{Age} + \beta_2 \, \text{Education} + \varepsilon \tag{6-2}$$
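
To make the income example concrete, the sketch below simulates incomes from a model of this form and then recovers the coefficients from the simulated data. Everything numeric here is invented for illustration: the "true" coefficients, the noise level, and the age and education ranges. The estimation method, ordinary least squares solved through the normal equations, is also an assumption of this sketch and is only one common way to choose the estimates.

```python
import random


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_least_squares(rows, ys):
    """Return [beta0, beta1, ...] minimizing the sum of squared errors."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


rng = random.Random(42)  # fixed seed for repeatable output
true_betas = [10_000.0, 800.0, 3_000.0]  # hypothetical beta0, beta1 (age), beta2 (education)
noise_sd = 5_000.0  # hypothetical spread of the random error term

people, incomes = [], []
for _ in range(5_000):
    age = rng.uniform(20, 65)
    education = rng.uniform(8, 20)
    error = rng.gauss(0.0, noise_sd)
    incomes.append(true_betas[0] + true_betas[1] * age + true_betas[2] * education + error)
    people.append([age, education])

estimated = fit_least_squares(people, incomes)
print("estimated coefficients:", [round(b, 1) for b in estimated])
```

The estimates land near the values used to generate the data but do not match them exactly, because each simulated income includes its own random error.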

In the linear model, the $\beta_j$'s represent the $p$ unknown parameters. The estimates for these unknown parameters are chosen so that, on average, the model provides a reasonable estimate of a person's income based on age and education. In other words, the fitted model should minimize the overall error between
