Decision Tree
Decision trees are non-parametric, supervised machine learning algorithms
- They have a hierarchical tree structure
- Internal nodes - decisions (tests on feature values)
- Leaf nodes - the possible outcomes (class labels or predicted values)
- They are used for both regression and classification
- They are rule-based ML algorithms
- DTs are explainable - easy to represent, and one can understand how and why a decision was made
- They follow a divide-and-conquer approach
- They perform a greedy search for the locally optimal split point at each node; this is fast, but does not guarantee a globally optimal tree
- The split points are decisions (for example, age > 30); see the split-selection sketch after this list
- DTs tend to generalize better when smaller (Occam’s Razor); complex DTs are prone to data fragmentation and overfitting
- Smaller DTs reach leaf nodes more quickly; in other words, they can more easily partition data points into single classes/homogeneous sets
- DTs may be great for less complex problems
- Overfitting can be reduced using pruning
- Alternatively, random forests can be used to enhance the accuracy of DTs (see the scikit-learn sketch after this list)
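A minimal sketch of how the greedy split search works: for each candidate threshold, compute the Gini impurity of the two resulting subsets and keep the threshold with the lowest weighted impurity. The dataset and names here are hypothetical, purely for illustration.

```python
def gini(labels):
    """Gini impurity: 0 for a homogeneous set, higher for mixed sets."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(ages, labels):
    """Greedily pick the threshold (e.g. age > 30) that minimizes the
    weighted impurity of the two child subsets."""
    n = len(labels)
    best = (None, float("inf"))
    for t in sorted(set(ages)):
        left = [y for a, y in zip(ages, labels) if a <= t]
        right = [y for a, y in zip(ages, labels) if a > t]
        if not left or not right:
            continue  # skip splits that leave one side empty
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (t, weighted)
    return best

ages = [22, 25, 28, 31, 35, 40, 45, 52]
labels = ["no", "no", "no", "yes", "yes", "yes", "no", "yes"]
print(best_split(ages, labels))  # (28, 0.2) -> split rule: age > 28
```

Note the greediness: each split is chosen in isolation, without looking ahead at how it affects later splits.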
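And a quick sketch of both remedies in scikit-learn, assuming it is installed; the `ccp_alpha` and `n_estimators` values are illustrative choices, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Cost-complexity pruning: a larger ccp_alpha prunes more aggressively,
# yielding a smaller tree (Occam's Razor) less prone to overfitting.
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0)
pruned_tree.fit(X_train, y_train)

# Random forest: bagging many trees reduces the variance of a single DT.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("pruned tree accuracy:", pruned_tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```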
Homogeneous Set
A homogeneous set is a set consisting of data points that all belong to one type/class. For example, in a class of 34 boys and 39 girls, the set of all 34 boys is one such set
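As a quick worked check, using Gini impurity (one common way to measure homogeneity):

- Whole class (mixed): 1 - (34/73)² - (39/73)² ≈ 0.498
- All-boys subset (homogeneous): 1 - (34/34)² = 0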
| Advantages | Disadvantages |
| --- | --- |
| Easy to interpret (explainable and transparent) | Prone to overfitting |
| Little to no data preparation required | High-variance estimators; bagging may help |
| Flexible; makes no assumptions about relationships within the data | Computationally expensive to train |
My notes
- Graph theory may be connected to DTs (a DT is a rooted tree, which is a special kind of graph)
- DTs and expander graphs are related?