Passenger Car Market Segmentation with Business Analytics & Big Data, Study Guides, Projects, Research of Database Management Systems (DBMS)

National Institute of Fashion Technology Database Management Systems (DBMS)

A project report submitted as part of a Master of Fashion Management degree at the National Institute of Fashion Technology (NIFT). The report applies cluster analysis to segment the passenger car market and includes data analysis using logistic regression and conditional inference trees, with variables such as Age, Gender, EstimatedSalary, lb, ac, and fm. The goal is to identify customer segments based on their purchasing behavior.

Typology: Study Guides, Projects, Research

2023/2024

Uploaded on 03/12/2024

yoshita-gupta 🇮🇳


BUSINESS ANALYTICS & BIG DATA

A Project Report Submitted

In Partial Fulfilment of the Requirements

For the degree Master of Fashion Management

End-term assignment

Submitted by -Yoshita Gupta (MFM/22/945)

Under the Supervision of Ms. Akanksha Dayma

(Assistant Professor)

Department of Fashion Management Studies

National Institute of Fashion Technology (NIFT)

Talpura, Chebb, Kangra, Himachal Pradesh, PIN- 176001


What are Business Analytics and Big Data? Business analytics and big data are essential tools for businesses that want to stay competitive in today's data-driven economy. They enable organizations to make better decisions, optimize their operations, and gain valuable insights into customer behavior, market trends, and other important factors that impact their bottom line.

Business analytics (BA) refers to "the skills, technologies, and practices for continuously developing new insights and understanding of business performance based on data and statistical methods".

Data Management
Data Visualization: making charts (pie charts and so on)
Machine Learning: Python, RStudio

Significance of Business Analytics:

● Data-driven decisions

● Converting data into valuable information

● Eliminating guesswork

● Faster answers

● Reduced costs

4. Complex: 7+5i

> A="MOBILE"
> A
[1] "MOBILE"
> A="YOSHITA"
> A
[1] "YOSHITA"
> NUM1=3
> NUM1
[1] 3
> CLASS(NUM1)
Error in CLASS(NUM1) : could not find function "CLASS"
> class(NUM1)
[1] "numeric"
> LOG1=TRUE
> class(LOG1)
[1] "logical"
> CHAR2="CHARACTER"
> class(CHAR2)
[1] "character"
> COMPLEX1=5+7i
> COMPLEX1
[1] 5+7i
> class(COMPLEX1)
[1] "complex"

OPERATORS IN R

1. Assignment operators

> B = 1
> B
[1] 1
> U <- 7
> U
[1] 7
> 7 -> U
> U
[1] 7

2. Arithmetic operators

> NUM1 =
> NUM2 =
> NUM1+NUM2
[1] 30
> NUM1 = 30
> NUM2 = 10
> NUM1+NUM2
[1] 40
> NUM1-NUM2
[1] 20
> NUM1*NUM2
[1] 300
> NUM1/NUM2
[1] 3
> NUM2-NUM1
[1] -20

3. Logical operators

> N=12
> N
[1] 12
> APPLE=TRUE
> BANANA=FALSE
> APPLE&BANANA
[1] FALSE
> ORANGE=FALSE
> PEAR=FALSE
> ORANGE&PEAR
[1] FALSE
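Comparisons (relational operators such as `>` and `==`) produce the same kind of logical values that `&` combines in the transcript. The following is a minimal sketch; the variable names `x` and `y` and their values are assumptions chosen for illustration, not part of the report.

```r
# Illustrative values; x and y are assumptions, not from the report
x <- 30
y <- 10

# Relational operators compare two values and return a logical
greater <- x > y      # is x larger than y?
equal   <- x == y     # are they equal?
not_eq  <- x != y     # are they different?

# Logical operators combine logical values
both   <- greater & equal    # AND: TRUE only if both are TRUE
either <- greater | equal    # OR: TRUE if at least one is TRUE
negate <- !equal             # NOT: flips TRUE and FALSE

print(c(greater, equal, not_eq, both, either, negate))
```

Note that `==` tests equality, while a single `=` assigns a value.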

DATA STRUCTURES IN R

1. Vector

2. List

3. Matrix

4. Array

5. Factor

6. Data frame

1. Vector:

A vector is a homogeneous, one-dimensional data structure.

> VEC1=C(1,3,5)
Error in C(1, 3, 5) : object not interpretable as a factor
(Keep the c lowercase; c() is used to combine values.)
> VEC1=c(1,3,5)
> VEC1
[1] 1 3 5
> class(VEC1)
[1] "numeric"
> NAME=c("YOSHITA","DOO","QWERTY")
> NAME
[1] "YOSHITA" "DOO"     "QWERTY"
> YUIOP=c(Y,U,Y)
> YUIOP
[1]  TRUE FALSE  TRUE
> MOBILE1=c(1,T,2,F)
> MOBILE1
[1] 1 1 2 0
> class(MOBILE1)
[1] "numeric"
> U=c(2,"A",3,"B")
> U
[1] "2" "A" "3" "B"
> CLASS(U)
Error in CLASS(U) : could not find function "CLASS"
(Caps Lock was on; R function names are case-sensitive.)
> class(U)
[1] "character"
> Y=c(1,"a",T)
> Y
[1] "1"    "a"    "TRUE"
> class(Y)
[1] "character"

When types are mixed in a vector, every element is converted to the type with the highest precedence:

CHARACTER > NUMERIC > LOGICAL

> Y[2]
[1] "a"
> U[3]
[1] "3"

Now we extract elements by their index:

> HAHA[[3]][1]
[1] FALSE
> HAHA[[2]][2]
[1] "b"
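The definition of `HAHA` is not part of this preview, so here is a minimal sketch of the same double-bracket indexing on a hypothetical list. The name `items` and its contents are assumptions chosen so that the two lookups return the same values as above.

```r
# Hypothetical list; the name and contents are assumptions for illustration
items <- list(c(1, 2, 3), c("a", "b", "c"), c(FALSE, TRUE))

# [[ ]] picks one component of the list, [ ] then indexes inside that component
second_of_second <- items[[2]][2]
first_of_third   <- items[[3]][1]

print(second_of_second)
print(first_of_third)

# Unlike a vector, a list lets each component keep its own type
print(class(items[[1]]))
print(class(items[[2]]))
```

A list is the heterogeneous counterpart of a vector, which is why no coercion to character happens here.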

3. Matrix

A matrix is a two-dimensional homogeneous data structure.

> m1=matrix(c(1,2,3,5,6,7,8,56))
> m1
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    5
[5,]    6
[6,]    7
[7,]    8
[8,]   56

This gives 1 column and 8 rows.

Now we set the number of rows and columns:

> m1=matrix(c(1,2,3,5,6,7,8,56),nrow = 2,ncol=4)
> m1
     [,1] [,2] [,3] [,4]
[1,]    1    3    6    8
[2,]    2    5    7   56

These values are stored column-wise.

To fill the matrix row-wise:

> m1=matrix(c(1,2,3,5,6,7,8,56),nrow = 2,ncol=4,byrow = T)
> m1
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    5
[2,]    6    7    8   56
> m1=matrix(c(1,2,3,5,6,7,8,56),nrow = 2,ncol=4, byrow=F)
> m1
     [,1] [,2] [,3] [,4]
[1,]    1    3    6    8
[2,]    2    5    7   56

Now we extract elements from the matrix using [row, column]:

> m1[1,2]
[1] 3
> m1[2,4]
[1] 56
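Beyond single elements, a whole row or column can be extracted by leaving one index empty. This is a small sketch using the same values as the matrix above; the names `first_row`, `second_col`, and `dims` are illustrative.

```r
# Same values as the matrix in the text, filled column-wise (the default)
m1 <- matrix(c(1, 2, 3, 5, 6, 7, 8, 56), nrow = 2, ncol = 4)

first_row  <- m1[1, ]   # empty column index: the whole first row
second_col <- m1[, 2]   # empty row index: the whole second column
dims       <- dim(m1)   # number of rows, then number of columns

print(first_row)
print(second_col)
print(dims)
```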

Inbuilt Functions in R

1. str()

2. head()

3. tail()

4. table()

5. min()

6. max()

7. range()

> View(iris)
> str(iris)
'data.frame': 150 obs. of 5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

A data frame is a table.

Observations are the rows.

Variables are the columns.

A factor is a categorical variable whose values are stored as levels.

str() stands for structure; it summarizes the table.

head() returns the first 6 entries and tail() returns the last 6.
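The remaining functions in the list (head(), tail(), table(), min(), max(), range()) can be tried on the same built-in iris data. This is a minimal sketch; the variable names are illustrative only.

```r
# iris ships with base R (datasets package), so no file is needed
first_rows <- head(iris)           # first 6 rows by default
last_rows  <- tail(iris, n = 3)    # n changes how many rows are returned

species_counts <- table(iris$Species)    # frequency of each factor level
shortest <- min(iris$Sepal.Length)
longest  <- max(iris$Sepal.Length)
span     <- range(iris$Sepal.Length)     # minimum and maximum in one call

print(species_counts)
print(span)
```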


> data.frame(fruit_name=c("apple","banana","gauva"),fruit_cost+c(10,20,30))->fruits
Error in data.frame(fruit_name = c("apple", "banana", "gauva"), fruit_cost + :
  object 'fruit_cost' not found
> data.frame(fruit_name=c("apple","banana","guava"),fruit_cost=c(10,20,30))->fruits
> fruits
  fruit_name fruit_cost
1      apple         10
2     banana         20
3      guava         30
> fruits$fruit_name
[1] "apple"  "banana" "guava"
> fruits$fruits_cost
NULL
> fruits$fruit_cost
[1] 10 20 30
> View(iris)
> if(iris$Sepal.Length[1]>4){print("sepallengthisgreaterthan4")}
[1] "sepallengthisgreaterthan4"
> if(iris$Sepal.Length[3]>6){print}("sepallengthisgreaterthan6")}
Error: unexpected '}' in "if(iris$Sepal.Length[3]>6){print}("sepallengthisgreaterthan6")}"
> if(iris$Sepal.Length[3]>6){print("sepallengthisgreaterthan6")}
> if(iris$Sepal.Length[3]>6){print("sepal length is greater than 6")}
> if(iris$Sepal.Length[3]>6){print("sepal length is greater than 6")} else{print("sepal length is greater than 4")}
[1] "sepal length is greater than 4"

> vec1<-1:9
> for(i in vec1){print(i+6)}
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
> for(i in vec1){print(i+7)}
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
> for(i in vec1){print(i*7)}
[1] 7
[1] 14
[1] 21
[1] 28
[1] 35
[1] 42
[1] 49
[1] 56
[1] 63
> for(i in vec1){print(i/5)}
[1] 0.2
[1] 0.4
[1] 0.6
[1] 0.8
[1] 1
[1] 1.2
[1] 1.4
[1] 1.6
[1] 1.8
> for(i in vec1){print(i-5)}
[1] -4
[1] -3
[1] -2
[1] -1
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4

> for(i in vec1){print(i*5)}
[1] 5
[1] 10
[1] 15
[1] 20
[1] 25
[1] 30
[1] 35
[1] 40
[1] 45
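Each loop in the transcript applies one arithmetic operation to every element. Because R arithmetic is vectorized, the same values can be produced without a loop. This is a minimal sketch, assuming `vec1` holds the integers 1 to 9 as the printed results imply; the names `plus_six`, `times_five`, and `looped` are illustrative.

```r
vec1 <- 1:9   # assumed from the printed results (7 to 15 for i + 6)

# Arithmetic on a vector is applied element by element, no loop needed
plus_six   <- vec1 + 6
times_five <- vec1 * 5

# The loop form collects the same values one at a time
looped <- numeric(0)
for (i in vec1) {
  looped <- c(looped, i + 6)
}

print(plus_six)
print(times_five)
print(identical(as.numeric(plus_six), looped))
```

The loop is still the right tool when each step depends on the result of the previous one.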

LINEAR REGRESSION (LM FUNCTION)

➢ linearregression<-lm(size~weight,data=mousedata)

> mousedata
  size weight tail
1  1.4    0.9  0.
2  2.6    1.8  1.
3  1.0    2.4  0.
4  3.7    3.5  2.

> plot(mousedata$weight,mousedata$size)
> linearregression<-lm(size~weight,data=mousedata)
> summary(linearregression)

Call:
lm(formula = size ~ weight, data = mousedata)

Residuals:
      1       2       3       4       5
 0.4843  0.6019 -1.7197 -0.3427  0.

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.1666     1.3816  -0.121   0.
weight        1.2027     0.5060   2.377   0.0979 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.242 on 3 degrees of freedom
Multiple R-squared:  0.6531, Adjusted R-squared:  0.
F-statistic: 5.648 on 1 and 3 DF,  p-value: 0.

> abline(linearregression,col="pink",lwd=5)
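The mousedata values are only partly visible in this preview, so the following sketch fits the same kind of model on a small hypothetical data frame. The name `toy` and all of its values are made up for illustration; only the `lm()` formula pattern comes from the report.

```r
# Hypothetical data: values are made up, not the report's mousedata
toy <- data.frame(
  weight = c(1, 2, 3, 4, 5),
  size   = c(1.2, 1.9, 3.2, 3.8, 5.1)
)

# size is the response, weight is the predictor
fit <- lm(size ~ weight, data = toy)

coefs <- coef(fit)                 # intercept and slope
r_sq  <- summary(fit)$r.squared    # share of variance explained

# Predict size for a new, hypothetical weight
pred <- predict(fit, newdata = data.frame(weight = 6))

print(coefs)
print(r_sq)
print(pred)
```

The slope plays the same role as the `weight` estimate in the summary output: the expected change in size for one extra unit of weight.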

> plot(mousedata)
> multiple.regression<-lm(size~weight+tail,data=mousedata)
> summary(mousedata)
      size          weight          tail
 Min.   :1.00   Min.   :0.9   Min.   :0.
 1st Qu.:1.40   1st Qu.:1.8   1st Qu.:0.
 Median :2.60   Median :2.4   Median :1.

 $ EstimatedSalary: int  19000 20000 43000 57000 76000 58000 84000 150000 33000 65000 ...
 $ Purchased      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...

> xtabs(~Gender+Purchased,data=logisticregression)
        Purchased
Gender     0   1
  Female 127  77
  Male   130  66
> xtabs(~Age+Purchased,data = logisticregression)
    Purchased
Age   0  1
  18  5  0
  19  7  0
  20  7  0
  21  4  0
  22  5  0
  23  6  0
  24  9  0
  25  6  0
  26 16  0
  27 11  2
  28 11  1
  29  9  1
  30  9  2
  31 10  1
  32  4  5
  33  8  1
  34  5  1
  35 29  3
  36  7  5
  37 13  7
  38 12  1
  39  9  6
  40 12  3
  41 15  1
  42 10  6
  43  1  2
  44  1  1
  45  1  6
  46  5  7
  47  2 12
  48  1 13
  49  2  8
  50  1  3
  51  1  2
  52  1  5
  53  0  5
  54  0  4
  55  0  3
  56  0  3
  57  0  5
  58  0  6
  59  2  5
  60  0  7

SEGMENTATION USING CLUSTER ANALYSIS:

OUTLINE -

About market segmentation

About cluster analysis
Example (segmenting the passenger car market)

Kotler: anything that is capable of satisfying a felt need is a PRODUCT.

CLUSTER: a collection of data objects.

Cluster analysis is a class of techniques used to classify objects into relatively homogeneous groups called clusters.

It finds similarities between data objects according to the characteristics found in the data.

BROAD STEPS IN CLUSTER ANALYSIS

Define the variables on which the clustering will be based.
Collect data on the selected variables.
Standardize the collected data.
Measure the inter-respondent distances.
Group the objects based on the distances between them.
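The five steps can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the built-in iris measurements stand in for respondent survey data, and the choice of complete linkage and of 3 groups is arbitrary.

```r
# Steps 1-2: choose the variables and collect data.
# iris measurements stand in for survey data here (an assumption for illustration).
vars <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# Step 3: standardize so every variable has mean 0 and standard deviation 1
scaled <- scale(vars)

# Step 4: pairwise distances between respondents (Euclidean by default)
d <- dist(scaled)

# Step 5: group objects based on those distances, then cut the tree into 3 groups
tree   <- hclust(d, method = "complete")
groups <- cutree(tree, k = 3)

print(table(groups))
```

Standardizing first matters because a variable measured on a large scale (such as salary) would otherwise dominate the distances.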

MAJOR CLUSTERING APPROACHES:

PARTITIONING APPROACH - the number of groups (for example 2 or 3) is fixed in advance, and each observation is assigned to the cluster whose centroid is nearest. K-means clustering is a partitioning method.

HIERARCHICAL APPROACH - each individual starts as a separate cluster, and clusters are merged step by step until the entire data set forms one cluster.
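As a rough illustration of the partitioning approach, here is a minimal k-means sketch with base R's `kmeans()`. The iris measurements again stand in for respondent data, and the choice of 3 clusters and 10 random starts is an assumption for the example, not something taken from the report.

```r
set.seed(42)   # k-means starts from random centroids, so fix the seed

# iris stands in for respondent data (an assumption for illustration)
scaled <- scale(iris[, 1:4])

# Partitioning: the number of clusters (centers) is chosen in advance
km <- kmeans(scaled, centers = 3, nstart = 10)

print(km$centers)          # one centroid per cluster
print(table(km$cluster))   # how many observations fell in each cluster
```

`nstart = 10` repeats the algorithm from several random starts and keeps the best solution, which makes the result less sensitive to the initial centroids.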

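K-MEANS CLUSTERING METHOD

EUCLIDEAN DISTANCE

DISTANCE BETWEEN CLUSTERS

Single linkage - the distance between the two closest members of the clusters
Complete linkage - the distance between the two farthest members of the clusters
Centroid or average linkage - the distance between the cluster centroids, or the average distance between members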
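To make the distance ideas concrete, the sketch below computes a Euclidean distance by hand and with `dist()`, then compares single and complete linkage on two tiny clusters. All points (`p`, `q`, and the members of `a` and `b`) are hypothetical values chosen so the arithmetic is easy to follow.

```r
# Two hypothetical points in a 2-variable space
p <- c(1, 2)
q <- c(4, 6)

by_hand <- sqrt(sum((p - q)^2))            # square root of the summed squared differences
by_dist <- as.numeric(dist(rbind(p, q)))   # dist() uses Euclidean distance by default

# Two hypothetical clusters with two members each
a <- rbind(c(0, 0), c(1, 0))
b <- rbind(c(3, 0), c(5, 0))

# Distances from each member of a (rows) to each member of b (columns)
cross <- as.matrix(dist(rbind(a, b)))[1:2, 3:4]

single_link   <- min(cross)   # closest pair across the two clusters
complete_link <- max(cross)   # farthest pair across the two clusters

print(c(by_hand, by_dist, single_link, complete_link))
```

In `hclust()`, the same choice is made through the `method` argument, for example `"single"` or `"complete"`.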

LOGISTIC REGRESSION - GLM FUNCTION

> logisticregression<-read.csv(file.choose(),header = T)
> View(logisticregression)
> str(logisticregression)
'data.frame': 400 obs. of 5 variables:
 $ User.ID        : int  15624510 15810944 15668575 15603246 15804002 15728773 15598044 15694829 15600575 15727311 ...
 $ Gender         : chr  "Male" "Male" "Female" "Female" ...
 $ Age            : int  19 35 26 27 19 27 27 32 25 35 ...
 $ EstimatedSalary: int  19000 20000 43000 57000 76000 58000 84000 150000 33000 65000 ...
 $ Purchased      : int  0 0 0 0 0 0 0 1 0 0 ...

> mydata<-glm(Purchased~Age+Gender+EstimatedSalary,data=logisticregression,family='binomial')
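The CSV used in the report is not available here, so the sketch below fits the same formula on simulated data. Everything about `sim` is an assumption: the column names mirror the `str()` output, but the values and the relationship used to generate `Purchased` are invented purely so the example runs on its own.

```r
set.seed(1)   # simulated data, so fix the seed for repeatability

# Hypothetical stand-in for the CSV: same column names, made-up values
n <- 200
sim <- data.frame(
  Age             = sample(18:60, n, replace = TRUE),
  Gender          = sample(c("Female", "Male"), n, replace = TRUE),
  EstimatedSalary = sample(seq(15000, 150000, by = 1000), n, replace = TRUE)
)

# Assumed relationship: purchase probability rises with age and salary
p <- plogis(-8 + 0.15 * sim$Age + 0.00003 * sim$EstimatedSalary)
sim$Purchased <- rbinom(n, size = 1, prob = p)

# Same formula and family as in the report
model <- glm(Purchased ~ Age + Gender + EstimatedSalary,
             data = sim, family = "binomial")

# type = "response" returns probabilities instead of log-odds
prob      <- predict(model, type = "response")
predicted <- ifelse(prob > 0.5, 1, 0)

print(coef(model))
print(table(Actual = sim$Purchased, Predicted = predicted))
```

The final table is a simple confusion matrix: it shows how often a 0.5 probability cutoff agrees with the actual purchase outcome. On the real data, `summary(mydata)` would report which predictors are statistically significant.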
