Decision Tree Algorithm in Supervised Learning (Machine Learning Lecture Notes)

Decision trees used for classification in supervised learning.

Decision Tree - Classification

A decision tree builds classification or regression models in the form of a tree structure. It breaks a dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node (e.g., Outlook) has two or more branches (e.g., Sunny, Overcast, and Rainy); a leaf node (e.g., Play) represents a classification or decision. The topmost decision node in the tree, which corresponds to the best predictor, is called the root node. Decision trees can handle both categorical and numerical data.
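For concreteness, such a tree can be pictured as nested mappings, as in this Python sketch; the specific lower branches (Humidity, Windy) are illustrative assumptions, not taken from the notes:

```python
# A decision node maps each attribute value (branch) to a subtree;
# a bare string is a leaf node holding the Play decision.
tree = {"Outlook": {"Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
                    "Overcast": "Yes",
                    "Rainy":    {"Windy": {"True": "No", "False": "Yes"}}}}
```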

Algorithm

The core algorithm for building decision trees, called ID3 and developed by J. R. Quinlan, employs a top-down, greedy search through the space of possible branches with no backtracking. ID3 uses entropy and information gain to construct a decision tree. In the ZeroR model there is no predictor; in the OneR model we try to find the single best predictor; naive Bayes includes all predictors, using Bayes' rule and independence assumptions between predictors; a decision tree also includes all predictors, but with dependence assumptions between predictors.

Entropy

A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar values (homogeneous). The ID3 algorithm uses entropy to calculate the homogeneity of a sample. If the sample is completely homogeneous, the entropy is zero; if the sample is equally divided, the entropy is one.

To build a decision tree, we need to calculate two types of entropy using frequency tables as follows:

a) Entropy using the frequency table of one attribute:
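In standard notation, for a sample S with c classes, where p_i is the proportion of instances belonging to class i:

E(S) = \sum_{i=1}^{c} -p_i \log_2 p_i

For example, a sample with 9 "Yes" and 5 "No" instances has entropy -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) \approx 0.940.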

b) Entropy using the frequency table of two attributes:
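This is the weighted average of the branch entropies after splitting the target T on attribute X, where P(c) is the proportion of instances taking value c of X:

E(T, X) = \sum_{c \in X} P(c) \, E(c)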

Information Gain

The information gain is based on the decrease in entropy after a dataset is split on an attribute; constructing a decision tree is about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).

Step 1: Calculate the entropy of the target.

Step 2: Split the dataset on each candidate attribute and calculate the entropy of each branch; add these proportionally to get the total entropy of the split, then subtract it from the entropy before the split. The result is the information gain, or decrease in entropy.

Step 3: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch.
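In symbols, the quantity computed in Step 2 for target T and attribute X is:

Gain(T, X) = E(T) - E(T, X)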

Step 4a: A branch with entropy of 0 is a leaf node.

Step 4b: A branch with entropy greater than 0 needs further splitting.

Step 5: The ID3 algorithm is run recursively on the non-leaf branches until all data is classified.
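As a rough illustration of Steps 1 through 5, here is a minimal ID3 sketch in Python. The function names and the list-of-dicts data layout are assumptions made for this example (they are not from the notes), and it handles only categorical attributes:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels: E(S) = sum_i -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def info_gain(rows, attr, target):
    """Decrease in entropy from splitting `rows` on attribute `attr` (Step 2)."""
    total = len(rows)
    after = 0.0
    for value in {row[attr] for row in rows}:
        branch = [row[target] for row in rows if row[attr] == value]
        after += (len(branch) / total) * entropy(branch)   # weighted E(T, X)
    return entropy([row[target] for row in rows]) - after

def id3(rows, attrs, target):
    """Recursively build a tree of nested dicts; a leaf is a class label."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:         # Step 4a: entropy 0, pure branch -> leaf
        return labels[0]
    if not attrs:                     # no attributes left: take majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))    # Step 3
    rest = [a for a in attrs if a != best]
    return {best: {value: id3([row for row in rows if row[best] == value],
                              rest, target)               # Steps 4b and 5
                   for value in {row[best] for row in rows}}}

# Toy subset of the weather data:
rows = [{"Outlook": "Sunny",    "Windy": "False", "Play": "No"},
        {"Outlook": "Sunny",    "Windy": "True",  "Play": "No"},
        {"Outlook": "Overcast", "Windy": "False", "Play": "Yes"},
        {"Outlook": "Rainy",    "Windy": "False", "Play": "Yes"},
        {"Outlook": "Rainy",    "Windy": "True",  "Play": "No"}]
print(id3(rows, ["Outlook", "Windy"], "Play"))
# Structure (branch order may vary):
# {'Outlook': {'Sunny': 'No', 'Overcast': 'Yes',
#              'Rainy': {'Windy': {'False': 'Yes', 'True': 'No'}}}}
```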

Decision Tree to Decision Rules

A decision tree can easily be transformed into a set of rules by mapping the paths from the root node to the leaf nodes one by one.
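As a sketch of this mapping, assuming the nested-dict tree representation used in the ID3 example above (the helper name tree_to_rules is hypothetical):

```python
def tree_to_rules(tree, conditions=()):
    """Walk each root-to-leaf path and emit one IF-THEN rule per leaf."""
    if not isinstance(tree, dict):                    # leaf: finish one rule
        return ["IF " + (" AND ".join(conditions) or "TRUE")
                + f" THEN class = {tree}"]
    (attr, branches), = tree.items()                  # single decision node
    rules = []
    for value, subtree in branches.items():
        rules += tree_to_rules(subtree, conditions + (f"{attr} = {value}",))
    return rules

# tree_to_rules({'Outlook': {'Sunny': 'No', 'Overcast': 'Yes'}})
# -> ['IF Outlook = Sunny THEN class = No',
#     'IF Outlook = Overcast THEN class = Yes']
```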