The decision tree is a very powerful tool in machine learning, widely used for both regression and classification.
There are many algorithms built on top of it, such as Random Forest and XGBoost, and they are among the leading methods in the machine learning world.
Before we dive into decision trees, we should first understand information entropy.
We can think of this entropy the same way as the entropy we learned about in physics. If the entropy is high, the data is less pure and less organized, and vice versa. If the data is well organized and highly pure, it will look like this: we can clearly separate it into two classes.
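To make this concrete, here is a minimal sketch (my own illustration, not code from the implementation) of how entropy can be computed for a set of class labels:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a set of class labels; 0 means the node is perfectly pure."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(entropy(np.array([0, 0, 0, 0])))   # 0.0 -> organized, high purity
print(entropy(np.array([0, 1, 0, 1])))   # 1.0 -> mixed, low purity
```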
In this implementation, I used "Gini", another measure related to information entropy.
As used here, Gini has an inverse relationship with information entropy: if the Gini score is high, the data has high purity.
Now, let's see how a decision tree is built. First, consider categorical data. For every candidate split we calculate a weighted Gini score like this:

Gini_split(D) = (|D1| / |D|) * Gini(D1) + (|D2| / |D|) * Gini(D2)

where D is all the data and D1, D2 are the two subsets produced by the split.
We calculate this Gini score for every factor (feature) and find which one gives the highest score. The factor with the highest Gini score is selected for the split, and the data is then divided based on that factor.
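A hedged sketch of this split search is shown below; the function names (`split_gini`, `best_split`) are my own and not necessarily those used in the actual implementation, and `gini` is repeated here so the snippet is self-contained:

```python
import numpy as np

def gini(labels):
    """Sum of squared class proportions (higher = purer), as defined above."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return np.sum(p ** 2)

def split_gini(feature_values, labels, value):
    """Weighted Gini of splitting a categorical feature into 'equals value' vs 'the rest'."""
    mask = feature_values == value
    d1, d2 = labels[mask], labels[~mask]
    n = len(labels)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)

def best_split(X, y):
    """Try every (feature, value) pair and keep the one with the highest weighted Gini."""
    best = None
    for col in range(X.shape[1]):
        for value in np.unique(X[:, col]):
            score = split_gini(X[:, col], y, value)
            if best is None or score > best[0]:
                best = (score, col, value)
    return best  # (gini score, feature index, split value)
```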
This process continues recursively until the tree reaches the height we want or there is no data left to split.
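Putting it together, a minimal recursive sketch (again just an illustration, reusing `best_split` from the snippet above and assuming a majority-vote prediction at the leaves) could look like this:

```python
import numpy as np
from collections import Counter

def build_tree(X, y, depth=0, max_depth=3):
    """Grow the tree recursively; stop at max_depth or when the node is already pure."""
    if depth >= max_depth or len(np.unique(y)) <= 1:
        return {"leaf": True, "prediction": Counter(y.tolist()).most_common(1)[0][0]}
    _, col, value = best_split(X, y)      # best_split() from the previous sketch
    mask = X[:, col] == value
    if mask.all() or (~mask).all():       # the chosen split does not actually separate anything
        return {"leaf": True, "prediction": Counter(y.tolist()).most_common(1)[0][0]}
    return {"leaf": False, "feature": col, "value": value,
            "left":  build_tree(X[mask],  y[mask],  depth + 1, max_depth),
            "right": build_tree(X[~mask], y[~mask], depth + 1, max_depth)}
```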