Decision Tree

Decision trees are non-parametric and supervised machine learning algorithms

  • It has a hierarchical tree structure
    • Internal nodes - Decisions
    • Leaf nodes - All possible outcomes within the dataset
  • It is used for both regression and classification
  • It is a Rule-based ML algorithm
  • DTs are explainable - Easy to represent and one can better understand how and why a decision was made
  • It follows a divide-and-conquer approach
  • It performs a greedy search to identify the best split point at each node, choosing the locally best split rather than searching for a globally optimal tree (faster to build, but not guaranteed optimal)
    • The split points are decisions (for example, age > 30)
  • DTs perform better when smaller (Occam’s Razor). Complex DTs are prone to data fragmentation/overfitting
    • Smaller DTs can more quickly reach leaf nodes. In other words, they can more easily cluster data points into single classes/homogeneous sets
      • DTs may be great for less complex problems
    • Overfitting can be reduced using pruning
    • Alternatively, random forest algorithms can be used to enhance the accuracy of DTs (see the sketch after this list)
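
A minimal sketch of these ideas, assuming scikit-learn and its bundled iris toy dataset (the dataset, hyperparameters, and variable names are illustrative, not from the notes above): a small pruned tree whose internal nodes are split decisions, plus a random forest as the accuracy-boosting alternative.

```python
# Minimal sketch (assumes scikit-learn; dataset and hyperparameters are illustrative).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each internal node is a decision such as "feature_2 <= 2.45"; max_depth and
# ccp_alpha (cost-complexity pruning) keep the tree small, per Occam's Razor.
tree = DecisionTreeClassifier(max_depth=3, ccp_alpha=0.01, random_state=0)
tree.fit(X_train, y_train)
print(export_text(tree))                          # the learned decision rules
print("pruned tree accuracy:", tree.score(X_test, y_test))

# Random forest: many trees trained on bootstrap samples, predictions averaged.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("random forest accuracy:", forest.score(X_test, y_test))
```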

Homogeneous Set

A homogeneous set is a set consisting of data points of a single type/class. For example, in a class of 34 boys and 39 girls, the set of all 34 boys is one such homogeneous set.
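
As a quick illustration in code (a minimal Python sketch; the is_homogeneous helper and the labels are made up for this example):

```python
# A set is homogeneous when every data point in it belongs to the same class.
def is_homogeneous(labels):
    return len(set(labels)) <= 1

classroom = ["boy"] * 34 + ["girl"] * 39   # the mixed set from the example above
boys_only = ["boy"] * 34                   # the subset of all boys

print(is_homogeneous(classroom))  # False - two classes mixed together
print(is_homogeneous(boys_only))  # True  - a single class, i.e. a homogeneous set
```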

Advantages

  • Easy to interpret (explainable and transparent)
  • Little to no data prep required
  • More flexible and insensitive to relationships within datasets

Disadvantages

  • Prone to overfitting
  • High variance estimators; bagging may help
  • Computationally expensive to train
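
Since high variance is listed as a disadvantage and bagging as a possible fix, here is a minimal sketch of bagging decision trees, assuming scikit-learn (the dataset and hyperparameters are illustrative):

```python
# Minimal sketch: bagging many decision trees to reduce variance
# (assumes scikit-learn; dataset and hyperparameters are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

single_tree = DecisionTreeClassifier(random_state=0)
bagged_trees = BaggingClassifier(single_tree, n_estimators=50, random_state=0)

# Bagging trains each tree on a bootstrap sample and averages their votes,
# which typically lowers variance compared to a single deep tree.
print("single tree :", cross_val_score(single_tree, X, y, cv=5).mean())
print("bagged trees:", cross_val_score(bagged_trees, X, y, cv=5).mean())
```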

My notes

  • Graph theory may be connected to DTs

Decision Trees - IBM