An Introduction To Decision Trees
Every AI model and agent in the world has at its core a strategy to make decisions.
Decisions on what to do next, what to do when low on battery, what to do when asked to open the door, etc.
Have you ever wondered what is at the core of that AI machine that enables it to make such decisions?
The decision tree algorithm is one such strategy.
What is a decision tree?
So what is a decision tree after all?
As the name suggests, a decision tree is a tree data structure that enables an AI machine to make decisions: each path from the root to a leaf represents a sequence of checks that ends in a final decision.
Components Of A Decision Tree
A decision tree, like any other tree, has four basic components:
The root is the starting point of a decision tree. This is the very first decision or situation-based check that the decision tree makes.
Internal nodes, just like the root, specify a condition on the basis of which the algorithm moves forward toward completion.
A branch is a label on an edge that specifies one possible result of the condition applied at the node above it.
A leaf node is the end of the tree. This consists of the result of the final decision that the machine or our intelligent agent has made.
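The four components above can be sketched in a few lines of code. This is a minimal illustration, not a real library: the attribute names (`battery`, `door_request`) and decisions are made up for the example.

```python
def predict(node, sample):
    """Walk from the root to a leaf, following the branch that
    matches the sample's value for each node's attribute."""
    # A leaf is anything that is not a dict: it holds the final decision.
    if not isinstance(node, dict):
        return node
    attribute = node["attribute"]      # the condition checked at this node
    value = sample[attribute]          # the sample's value for that attribute
    branch = node["branches"][value]   # the branch labelled with that result
    return predict(branch, sample)     # recurse into the subtree

# Root checks "battery"; an internal node then checks "door_request".
tree = {
    "attribute": "battery",
    "branches": {
        "low": "recharge",             # leaf: final decision
        "ok": {
            "attribute": "door_request",
            "branches": {"yes": "open door", "no": "idle"},
        },
    },
}

print(predict(tree, {"battery": "low"}))                        # recharge
print(predict(tree, {"battery": "ok", "door_request": "yes"}))  # open door
```

Following one path from the root through a branch to a leaf is all a prediction takes.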
How Are Attributes Selected?
At every node, there is a condition. Generally, this condition is applied to check the value of a particular attribute.
The result of that check determines which branch we follow next.
But this raises a valid question: does the order of attribute selection matter?
And the answer is yes! It does matter.
The order in which attributes are compared generally affects the overall efficiency and speed of our model.
So how do we decide which attributes to compare or check at a given node?
We have two basic measures for this, information gain and the Gini index, and a combination of both is generally used.
Information Gain
Information gain is defined as the reduction in entropy.
That was rude! Let's unpack it.
So basically, information gain measures how well an attribute can separate the given set into pure groups. For a set with two classes, its maximum value is 1 and its minimum is 0.
When a split turns an evenly mixed two-class set into perfectly pure subsets, the gain is 1; when the split leaves the set unchanged, the gain is 0.
For example, consider a set of humans. One attribute is whether they have a face; the other is their nationality.
Which one do you think will have more information gain?
And which one do you think we will use?
Their nationality, of course: everyone has a face, so that attribute cannot divide the set at all and gives an information gain of 0, while nationality actually splits the set.
So, we prefer attributes with high information gain.
Gini Index
The Gini index (also called Gini impurity) measures how often we would be mistaken if we classified a randomly chosen element of the set at random, according to the distribution of values in the set.
Let's consider the above example.
So what do you think, how sure are we to identify correctly whether a person has a face?
We are damn sure! Everyone has one, so the set is pure and its Gini index is 0.
And how sure can we be of his nationality?
The set is a mix of nationalities, so a random guess will sometimes be wrong.
So we can be sure, but not 100%: the Gini index is greater than 0.
Which attribute has the higher Gini index? Nationality.
And what do we want when splitting? Subsets that are as pure as possible, so we are not mistaken in our classification.
Thus, unlike information gain (where higher is better), we prefer splits whose resulting subsets have a low Gini index.
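The same toy example works here too. This sketch uses the standard Gini impurity formula, 1 minus the sum of squared class proportions; the labels are invented for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: the chance that a randomly picked element is
    mislabelled when labelled at random according to the class
    distribution of the set. 0 means the set is perfectly pure."""
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())

# "Has a face" is the same for everyone: a pure set, no chance of error.
print(gini(["face", "face", "face", "face"]))   # 0.0

# Nationalities are mixed, so random labelling sometimes errs.
print(gini(["IN", "IN", "US", "FR"]))           # 0.625
```

A split is scored by the weighted impurity of the subsets it produces, and the split with the lowest score wins.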