

Introduction to Statistical Learning (James/Witten/Hastie/Tibshirani)
library(ISLR)
data("Auto")
attach(Auto)
(a). Create a binary variable, mpg01, that contains a 1 if mpg contains a value above its median, and a 0 if mpg contains a value below its median. You can compute the median using the median() function. Note you may find it helpful to use the data.frame() function to create a single data set containing both mpg01 and the other Auto variables.
set.seed(1)
mpg01 = rep(0, nrow(Auto))
mpg01[mpg > median(mpg)] = 1
Auto = data.frame(Auto, mpg01)
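As a quick sanity check of the median split, here is a minimal sketch on a toy vector. The values in `mpg_toy` are hypothetical and are not taken from Auto; the sketch only illustrates that an observation exactly equal to the median is coded 0 under the strict `>` comparison.

```r
# Hypothetical toy values standing in for Auto$mpg (not real data)
mpg_toy <- c(18, 15, 31, 22, 26)

# The median of these five values is 22
med_toy <- median(mpg_toy)

# 1 if strictly above the median, 0 otherwise (ties with the median get 0)
mpg01_toy <- as.integer(mpg_toy > med_toy)

# Count how many observations fall in each class
table(mpg01_toy)
```

With an odd number of distinct values, the two classes cannot be perfectly balanced, because the median observation itself falls in class 0.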
(b). Explore the data graphically in order to investigate the association between mpg01 and the other features. Which of the other features seem most likely to be useful in predicting mpg01? Scatterplots and boxplots may be useful tools to answer this question. Describe your findings.
cor(Auto[, -9])
pairs(Auto)
[Figure: scatterplot matrix produced by pairs(Auto), with panels for mpg, cylinders, displacement, horsepower, weight, acceleration, year, origin, name, and mpg01.]
mpg01 appears to have a strong negative association with cylinders, displacement, horsepower, and weight. Year is positively correlated with mpg01, indicating that newer cars tend to be more fuel efficient.
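The question also suggests boxplots. A minimal sketch of that approach follows, assuming the ISLR package is installed; the names `predictors` and `cors` are introduced here for illustration and are not part of the original solution. It draws one boxplot per quantitative predictor, split by mpg01, and then ranks the predictors by their correlation with mpg01.

```r
library(ISLR)
data("Auto")

# Rebuild the binary response so this sketch is self-contained
Auto$mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))

# Quantitative predictors to compare across the two mpg01 classes
predictors <- c("cylinders", "displacement", "horsepower",
                "weight", "acceleration", "year")

# One boxplot per predictor, grouped by mpg01
par(mfrow = c(2, 3))
for (v in predictors) {
  boxplot(Auto[[v]] ~ Auto$mpg01, xlab = "mpg01", ylab = v, main = v)
}
par(mfrow = c(1, 1))

# Correlation of each predictor with mpg01, sorted from most negative
cors <- sapply(predictors, function(v) cor(Auto[[v]], Auto$mpg01))
sort(cors)
```

Predictors whose boxes barely overlap between the two classes are the most promising candidates for the classifier in (d).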
(c). Split the data into a training set and a test set.
train.OBS = sample(1:nrow(Auto), .5 * nrow(Auto), replace = FALSE)
Auto.train = Auto[train.OBS, ]
Auto.test = Auto[-train.OBS, ]
mpg01.test = Auto.test$mpg01
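A small-scale sketch of the same index-based split may help confirm that the training and test sets are disjoint and together cover every row. The size `n` and the names `train.idx` and `test.idx` are hypothetical and chosen only for illustration.

```r
set.seed(1)

# Hypothetical number of rows, kept small so the split is easy to inspect
n <- 10

# Draw half of the row indices without replacement for training
train.idx <- sample(seq_len(n), size = floor(0.5 * n), replace = FALSE)

# Everything not drawn for training becomes the test set
test.idx <- setdiff(seq_len(n), train.idx)

# Checks: no overlap, and the two sets together cover all rows
length(intersect(train.idx, test.idx)) == 0
identical(sort(c(train.idx, test.idx)), seq_len(n))
```

Negative indexing, as in `Auto[-train.OBS, ]`, selects the same complement as `setdiff` does here.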
(d). Perform LDA on the training data in order to predict mpg01 using the variables that seemed most associated with mpg01 in (b). What is the test error of the model obtained?
library(MASS)
lda.fit = lda(mpg01 ~ cylinders + weight + displacement + horsepower, data = Auto, subset = train.OBS)
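Part (d) asks for the test error, which requires predicting on the held-out rows and comparing the predicted classes with the true mpg01 values. One way this could be done is sketched below, assuming the ISLR and MASS packages are installed and that the data preparation follows the steps in (a) and (c). The names `lda.pred`, `lda.class`, and `test.error` are introduced here for illustration, and no particular error value is claimed.

```r
library(ISLR)
library(MASS)
data("Auto")

# Rebuild the response and the split so this sketch is self-contained
set.seed(1)
Auto$mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))
train.OBS <- sample(1:nrow(Auto), 0.5 * nrow(Auto), replace = FALSE)
Auto.test <- Auto[-train.OBS, ]
mpg01.test <- Auto.test$mpg01

# Fit LDA on the training rows only
lda.fit <- lda(mpg01 ~ cylinders + weight + displacement + horsepower,
               data = Auto, subset = train.OBS)

# Predict classes for the held-out rows
lda.pred <- predict(lda.fit, newdata = Auto.test)
lda.class <- lda.pred$class

# Confusion matrix: predicted class versus true class
table(predicted = lda.class, actual = mpg01.test)

# Test error: share of held-out rows that are misclassified
test.error <- mean(as.character(lda.class) != as.character(mpg01.test))
test.error
```

The comparison goes through `as.character` because `predict` returns the class as a factor while `mpg01.test` is numeric.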