Decision trees are an important class of machine learning models, and they serve as building blocks for many more advanced algorithms, such as Random Forest or the famous XGBoost. Trees are also a good starting point for a baseline model, which we can subsequently try to improve upon with more complex algorithms.
One of the biggest advantages of decision trees is their interpretability: after fitting, the model is effectively a set of rules that can be used to predict the target variable. That is also why it is easy to plot the rules and show them to stakeholders, so they can easily understand the model's underlying logic, provided, of course, that the tree is not too deep.
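To make the "set of rules" point concrete, here is a minimal sketch using scikit-learn's `export_text`, which prints a fitted tree as human-readable if/else rules. The Iris dataset and the `max_depth=2` setting are illustrative choices, not from the article:

```python
# Illustrative example: extract a fitted tree's rules as plain text.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=42)
clf.fit(iris.data, iris.target)

# export_text renders the decision path as indented if/else rules
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each leaf in the printed output corresponds to one rule a stakeholder can follow by hand, which is exactly what makes shallow trees so easy to explain.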
Visualizing decision trees is quite simple using a combination of scikit-learn and matplotlib. However, a library called dtreeviz brings much more to the table, creating visualizations that are not only prettier but also convey more information about the decision process.
In this article, I will first show the “old way” of plotting the decision trees and then introduce the improved approach using dtreeviz.