

































A project report submitted as part of a Master of Fashion Management degree at the National Institute of Fashion Technology (NIFT). The report applies cluster analysis to segment the passenger car market. It includes data analysis using logistic regression and conditional inference trees, with variables such as Age, Gender, EstimatedSalary, lb, ac, and fm. The goal is to identify customer segments based on their purchasing behavior.
What Are Business Analytics and Big Data?
Business analytics and big data are essential tools for businesses that want to stay competitive in today's data-driven economy. They enable organizations to make better decisions, optimize their operations, and gain insight into customer behavior, market trends, and other factors that affect their bottom line.
Business analytics (BA) refers to "the skills, technologies, and practices for continuously developing new insights and understanding of business performance based on data and statistical methods".
Highest precedence is of
data"
> data.frame(fruit_name=c("apple","banana","gauva"),fruit_cost+c(10,20,30))->fruits   # "+" typed instead of "="
Error in data.frame(fruit_name = c("apple", "banana", "gauva"), fruit_cost +  :
  object 'fruit_cost' not found
> data.frame(fruit_name=c("apple","banana","guava"),fruit_cost=c(10,20,30))->fruits
> fruits
  fruit_name fruit_cost
1      apple         10
2     banana         20
3      guava         30
> fruits$fruit_name
[1] "apple"  "banana" "guava"
> fruits$fruits_cost   # misspelled column name, so $ returns NULL
NULL
> fruits$fruit_cost
[1] 10 20 30
> View(iris)
> if(iris$Sepal.Length[1]>4){print("sepallengthisgreaterthan4")}
[1] "sepallengthisgreaterthan4"
> if(iris$Sepal.Length[3]>6){print}("sepallengthisgreaterthan6")}   # misplaced "}" after print
Error: unexpected '}' in "if(iris$Sepal.Length[3]>6){print}("sepallengthisgreaterthan6")}"
> if(iris$Sepal.Length[3]>6){print("sepallengthisgreaterthan6")}   # condition is FALSE, so nothing is printed
> if(iris$Sepal.Length[3]>6){print("sepal length is greater than 6")}
> if(iris$Sepal.Length[3]>6){print("sepal length is greater than 6")} else{print("sepal length is greater than 4")}
[1] "sepal length is greater than 4"
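The transcript above shows two things that are easy to miss: `$` quietly returns `NULL` for a misspelled column name, and the `else` message does not quite describe the condition being tested. The following is a minimal sketch of a stricter variant, assuming the same `fruits` data frame and R's built-in `iris` data. The helper name `get_column` is hypothetical and not part of the original report.

```r
# Rebuild the data frame from the transcript.
fruits <- data.frame(fruit_name = c("apple", "banana", "guava"),
                     fruit_cost = c(10, 20, 30))

# Hypothetical helper: stop with an error on an unknown column
# instead of silently returning NULL the way `$` does.
get_column <- function(df, name) {
  if (!name %in% names(df)) {
    stop("no column named '", name, "'")
  }
  df[[name]]
}

costs <- get_column(fruits, "fruit_cost")

# if/else whose messages match the condition being tested.
sepal <- iris$Sepal.Length[3]
if (sepal > 6) {
  msg <- "sepal length is greater than 6"
} else {
  msg <- "sepal length is not greater than 6"
}
print(msg)
```

Calling `get_column(fruits, "fruits_cost")` stops with an error, which makes the typo visible at the point where it happens.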
> vec1<-1:9
> for(i in vec1){print(i+6)}
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
> for(i in vec1){print(i+7)}
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
> for(i in vec1){print(i*7)}
[1] 7
[1] 14
[1] 21
[1] 28
[1] 35
[1] 42
[1] 49
[1] 56
[1] 63
> for(i in vec1){print(i/5)}
[1] 0.2
[1] 0.4
[1] 0.6
[1] 0.8
[1] 1
[1] 1.2
[1] 1.4
[1] 1.6
[1] 1.8
> for(i in vec1){print(i-5)}
[1] -4
[1] -3
[1] -2
[1] -1
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
> for(i in vec1){print(i*5)}
[1] 5
[1] 10
[1] 15
[1] 20
[1] 25
[1] 30
[1] 35
[1] 40
[1] 45
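The loops above apply one arithmetic operation to each element in turn. R arithmetic is already vectorised, so the same results can be obtained without a loop. This is a small sketch, assuming `vec1` holds the integers 1 to 9, as the printed output suggests; the result names are illustrative only.

```r
vec1 <- 1:9

# Arithmetic on a vector is applied element by element, so no loop is needed.
plus_six   <- vec1 + 6
times_five <- vec1 * 5
over_five  <- vec1 / 5

# The explicit loop from the transcript, collecting results instead of printing them.
loop_result <- numeric(length(vec1))
for (i in seq_along(vec1)) {
  loop_result[i] <- vec1[i] + 6
}

print(plus_six)
print(all(loop_result == plus_six))
```

The loop form is still useful when each step depends on the previous one; for independent element-wise arithmetic the vectorised form is shorter and usually faster.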
> mousedata
  size weight tail
1  1.4    0.9   0.
2  2.6    1.8   1.
3  1.0    2.4   0.
4  3.7    3.5   2.
> plot(mousedata$weight,mousedata$size)
> linearregression<-lm(size~weight,data=mousedata)
> summary(linearregression)

Call:
lm(formula = size ~ weight, data = mousedata)

Residuals:
      1       2       3       4       5
 0.4843  0.6019 -1.7197 -0.3427  0.

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.1666     1.3816  -0.121   0.
weight        1.2027     0.5060   2.377   0.0979 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.242 on 3 degrees of freedom
Multiple R-squared:  0.6531, Adjusted R-squared:  0.
F-statistic: 5.648 on 1 and 3 DF,  p-value: 0.
> abline(linearregression,col="pink",lwd=5)
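Several values in the `mousedata` listing are truncated, so the regression cannot be reproduced exactly from this text. As a stand-in, the sketch below runs the same steps (fit, summary, fitted line, prediction) on R's built-in `cars` data set. The data set and all variable names are assumptions made for illustration and do not come from the report.

```r
# `cars` ships with R: stopping distance (dist) against speed for 50 cars.
fit <- lm(dist ~ speed, data = cars)

# Pull individual values out of summary() instead of reading them off the console.
fit_summary <- summary(fit)
slope       <- coef(fit)[["speed"]]
r_squared   <- fit_summary$r.squared
slope_p     <- fit_summary$coefficients["speed", "Pr(>|t|)"]

# Scatter plot with the fitted line, as in the transcript.
plot(cars$speed, cars$dist)
abline(fit, col = "pink", lwd = 5)

# Predict the response for a new observation.
new_obs   <- data.frame(speed = 15)
predicted <- predict(fit, newdata = new_obs)

print(c(slope = slope, r_squared = r_squared, p_value = slope_p))
```

With a single predictor, the multiple R-squared equals the squared correlation between the two variables, which gives a quick sanity check on the fit.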
> plot(mousedata)
> multiple.regression<-lm(size~weight+tail,data=mousedata)
> summary(mousedata)
      size          weight         tail
 Min.   :1.00   Min.   :0.9   Min.   :0.
 1st Qu.:1.40   1st Qu.:1.8   1st Qu.:0.
 Median :2.60   Median :2.4   Median :1.
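Note that the transcript fits `multiple.regression` but then calls `summary()` on the data frame, which reports quartiles of the data instead of the coefficient table. A hedged sketch of summarising the fitted model itself follows, using R's built-in `mtcars` data as a stand-in because the `mousedata` values are truncated here. The choice of `mpg`, `wt`, and `hp` is an assumption for illustration.

```r
# `mtcars` ships with R; mpg is modelled on weight (wt) and horsepower (hp).
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary() of the fitted model (not of the data frame) gives the coefficient table.
multi_summary <- summary(multi_fit)
coef_table    <- multi_summary$coefficients

# Compare with the single-predictor model: adding a predictor never lowers R-squared.
single_fit <- lm(mpg ~ wt, data = mtcars)
r2_single  <- summary(single_fit)$r.squared
r2_multi   <- multi_summary$r.squared

print(coef_table)
print(c(single = r2_single, multiple = r2_multi))
```

Because R-squared cannot decrease when predictors are added, the adjusted R-squared reported by `summary()` is the fairer figure for comparing models of different size.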
 $ EstimatedSalary: int  19000 20000 43000 57000 76000 58000 84000 150000 33000 65000 ...
 $ Purchased      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...
> xtabs(~Gender+Purchased,data=logisticregression)
        Purchased
Gender     0   1
  Female 127  77
  Male   130  66
> xtabs(~Age+Purchased,data = logisticregression)
    Purchased
Age   0  1
  18  5  0
  19  7  0
  20  7  0
  21  4  0
  22  5  0
  23  6  0
  24  9  0
  25  6  0
  26 16  0
  27 11  2
  28 11  1
  29  9  1
  30  9  2
  31 10  1
  32  4  5
  33  8  1
  34  5  1
  35 29  3
  36  7  5
  37 13  7
  38 12  1
  39  9  6
  40 12  3
  41 15  1
  42 10  6
  43  1  2
  44  1  1
  45  1  6
  46  5  7
  47  2 12
  48  1 13
  49  2  8
  50  1  3
  51  1  2
  52  1  5
  53  0  5
  54  0  4
  55  0  3
  56  0  3
  57  0  5
  58  0  6
  59  2  5
  60  0  7
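Raw counts are hard to compare across groups of different size. As a rough illustration, the Gender by Purchased counts printed above can be turned into row proportions with `prop.table()`. The matrix below is typed in from the table in the transcript, since the underlying CSV file is not available here; the object names are illustrative.

```r
# Counts copied from the Gender x Purchased table in the transcript.
# matrix() fills column by column: first the "0" column, then the "1" column.
gender_tab <- matrix(c(127, 130, 77, 66), nrow = 2,
                     dimnames = list(Gender = c("Female", "Male"),
                                     Purchased = c("0", "1")))

# Row proportions: the share of each gender that did or did not purchase.
row_share <- prop.table(gender_tab, margin = 1)
print(round(row_share, 3))

# Row and column totals, useful for checking against the 400 observations.
with_totals <- addmargins(gender_tab)
print(with_totals)
```

With the original data frame loaded, the same proportions would come from `prop.table(xtabs(~Gender+Purchased, data = logisticregression), margin = 1)`.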
CLUSTER: collection of data objects.
> logisticregression<-read.csv(file.choose(),header = T)
> View(logisticregression)
> str(logisticregression)
'data.frame':	400 obs. of  5 variables:
 $ User.ID        : int  15624510 15810944 15668575 15603246 15804002 15728773 15598044 15694829 15600575 15727311 ...
 $ Gender         : chr  "Male" "Male" "Female" "Female" ...
 $ Age            : int  19 35 26 27 19 27 27 32 25 35 ...
 $ EstimatedSalary: int  19000 20000 43000 57000 76000 58000 84000 150000 33000 65000 ...
 $ Purchased      : int  0 0 0 0 0 0 0 1 0 0 ...
GLM FUNCTION - LOGISTIC REGRESSION
> mydata<-glm(Purchased~Age+Gender+EstimatedSalary,data=logisticregression,family='binomial')
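Once a logistic regression has been fitted with `glm()`, the usual next steps are to read the coefficients as odds ratios and to turn predicted probabilities into classes. The purchase data set is not included in this text, so the sketch below uses R's built-in `mtcars` data, with the binary column `am` standing in for `Purchased`. The 0.5 cut-off and all object names are assumptions for illustration.

```r
# Stand-in data: `am` (0 = automatic, 1 = manual) is a binary outcome like Purchased.
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Coefficients are on the log-odds scale; exponentiate to get odds ratios.
odds_ratios <- exp(coef(logit_fit))

# Predicted probabilities for the training rows, then a 0.5 cut-off.
prob      <- predict(logit_fit, type = "response")
predicted <- ifelse(prob > 0.5, 1, 0)

# Confusion table of predicted against observed classes.
confusion <- table(predicted = predicted, observed = mtcars$am)
print(odds_ratios)
print(confusion)
```

Applied to the model in the transcript, the equivalent calls would be `exp(coef(mydata))` and `predict(mydata, type = "response")`.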