Boosting

Author: Kevin不会创作 | Published on 2020-12-06 08:07

    Table of Contents

    • AdaBoost
    • Gradient Boosting
    • XGBoost
    • LightGBM
    • CatBoost

    AdaBoost

    • Explain the AdaBoost algorithm. (A minimal code sketch follows the steps below.)

      1. Initially, all observations are given equal weights.

      2. A model is built on a subset of data.

      3. Using this model, predictions are made on the whole dataset.

      4. Errors are calculated by comparing the predictions and actual values.

      5. While creating the next model, higher weights are given to the data points which were predicted incorrectly.

      6. Weights are determined from the error value: the higher the error, the more weight is assigned to the observation.

      7. This process is repeated until the error stops improving or the maximum number of estimators is reached.
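
      A minimal from-scratch sketch of the loop above, assuming binary labels in {-1, +1}, decision stumps as weak learners fit on the full re-weighted dataset (rather than a sampled subset), and an illustrative n_estimators; the learner weight and re-weighting rule follow the classic discrete AdaBoost formulas:

      ```python
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def adaboost_fit(X, y, n_estimators=50):
          # y: NumPy array of labels in {-1, +1}
          n = len(y)
          w = np.full(n, 1.0 / n)                          # step 1: equal weights
          stumps, alphas = [], []
          for _ in range(n_estimators):
              stump = DecisionTreeClassifier(max_depth=1)
              stump.fit(X, y, sample_weight=w)             # steps 2-3: fit and predict
              pred = stump.predict(X)
              err = np.sum(w * (pred != y)) / np.sum(w)    # step 4: weighted error
              if err == 0 or err >= 0.5:                   # step 7: stop if no useful learner
                  break
              alpha = 0.5 * np.log((1 - err) / err)        # learner weight from its error
              w *= np.exp(-alpha * y * pred)               # steps 5-6: upweight misclassified points
              w /= w.sum()
              stumps.append(stump)
              alphas.append(alpha)
          return stumps, alphas

      def adaboost_predict(X, stumps, alphas):
          # Weighted vote of all weak learners.
          scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
          return np.sign(scores)
      ```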

    • Disadvantages of AdaBoost.

      1. Because boosting learns progressively, it is important to ensure that you have quality data.

      2. AdaBoost is also extremely sensitive to noisy data and outliers, so if you plan to use it, it is highly recommended to remove them first.

      3. AdaBoost is also generally slower than XGBoost.

    Gradient Boosting

    • Explain the Gradient Boosting algorithm. (A minimal code sketch follows the steps below.)

      1. A model is built on a subset of data.
      2. Using this model, predictions are made on the whole dataset.
      3. Errors are calculated by comparing the predictions and actual values.
      4. A new model is created using the calculated errors as the target variable. The objective is to find the best splits that minimize the error.
      5. The predictions made by this new model are combined with the predictions of the previous models.
      6. New errors are calculated from the combined predictions and the actual values.
      7. This process is repeated until the error stops improving or the maximum number of estimators is reached.
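
      A minimal sketch of these steps for regression with squared-error loss, where each new tree is fit to the residuals (errors) of the current ensemble; learning_rate, n_estimators, and max_depth are illustrative values:

      ```python
      import numpy as np
      from sklearn.tree import DecisionTreeRegressor

      def gbm_fit(X, y, n_estimators=100, learning_rate=0.1, max_depth=3):
          init = y.mean()                                    # start from a constant prediction
          pred = np.full(len(y), init)
          trees = []
          for _ in range(n_estimators):
              residuals = y - pred                           # step 3: errors = actual - predicted
              tree = DecisionTreeRegressor(max_depth=max_depth)
              tree.fit(X, residuals)                         # step 4: new model targets the errors
              pred = pred + learning_rate * tree.predict(X)  # step 5: combine predictions
              trees.append(tree)
          return init, trees

      def gbm_predict(X, init, trees, learning_rate=0.1):
          # Sum the (shrunken) contributions of all trees on top of the initial prediction.
          return init + learning_rate * sum(t.predict(X) for t in trees)
      ```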

    XGBoost

    • What is XGBoost?

      XGBoost is a decision-tree-based ensemble Machine Learning algorithm that uses a gradient boosting framework.
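
      A short usage sketch with the xgboost scikit-learn wrapper; the dataset and hyper-parameter values are illustrative only:

      ```python
      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import train_test_split
      from xgboost import XGBClassifier

      X, y = load_breast_cancer(return_X_y=True)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

      # Gradient-boosted trees with a few common knobs exposed.
      model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
      model.fit(X_train, y_train)
      print("test accuracy:", model.score(X_test, y_test))
      ```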

    • How does XGBoost optimize the standard GBM algorithm? (A short parameter sketch follows these points.)

      • System Optimization

        1. Parallelization: XGBoost parallelizes the split-finding work within the otherwise sequential tree-building process.

        2. Tree Pruning: The stopping criterion for tree splitting within the GBM framework is greedy in nature and depends on the negative loss criterion at the point of the split. XGBoost instead grows trees up to the specified ‘max_depth’ and then prunes them backward, removing splits that yield no positive gain.

        3. Hardware Optimization: The algorithm has been designed to make efficient use of hardware resources. This is accomplished through cache awareness: internal buffers are allocated in each thread to store gradient statistics.

      • Algorithmic Enhancements

        1. Regularization: It penalizes more complex models through both LASSO (L1) and Ridge (L2) regularization to prevent overfitting.

        2. Sparsity Awareness: XGBoost naturally admits sparse input features by automatically learning the best default direction for missing values from the training loss, and it handles different types of sparsity patterns in the data more efficiently.

        3. Weighted Quantile Sketch: XGBoost employs the distributed weighted Quantile Sketch algorithm to effectively find the optimal split points among weighted datasets.

        4. Cross-validation: The algorithm comes with a built-in cross-validation method at each iteration, removing the need to explicitly program this search or to specify the exact number of boosting iterations in a single run.
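
      A sketch mapping some of these points onto xgboost parameters and the built-in cross-validation helper; the parameter values are illustrative, not recommendations:

      ```python
      import xgboost as xgb
      from sklearn.datasets import load_breast_cancer

      X, y = load_breast_cancer(return_X_y=True)
      dtrain = xgb.DMatrix(X, label=y)   # sparsity-aware internal data structure

      params = {
          "objective": "binary:logistic",
          "max_depth": 4,        # grow to max_depth, then prune splits backward
          "reg_alpha": 0.1,      # L1 (LASSO) regularization
          "reg_lambda": 1.0,     # L2 (Ridge) regularization
          "nthread": 4,          # parallelized split finding
      }

      # Built-in cross-validation evaluates every boosting round and can stop
      # early, so the exact number of rounds need not be fixed in advance.
      cv_results = xgb.cv(params, dtrain, num_boost_round=500, nfold=5,
                          metrics="logloss", early_stopping_rounds=20)
      print(cv_results.tail())
      ```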

    • References

      1. XGBoost Algorithm: Long May She Reign!

    LightGBM

    • What is LightGBM?

      LightGBM is a fast, distributed, high-performance gradient boosting framework based on decision tree algorithms, used for ranking, classification, and many other machine learning tasks.

    • What's the difference between LightGBM and XGBoost?

      LightGBM uses a novel technique called Gradient-based One-Side Sampling (GOSS) to filter the data instances used for finding a split value, while XGBoost uses a pre-sorted algorithm and a histogram-based algorithm to compute the best split. Here, instances mean observations/samples.
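
      A brief usage sketch with the LightGBM scikit-learn wrapper; parameter values are illustrative, and depending on the LightGBM version GOSS is enabled with boosting_type="goss" or data_sample_strategy="goss" (the default shown here is the plain histogram-based GBDT):

      ```python
      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import train_test_split
      from lightgbm import LGBMClassifier

      X, y = load_breast_cancer(return_X_y=True)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

      # Leaf-wise, histogram-based gradient boosting.
      model = LGBMClassifier(n_estimators=200, num_leaves=31, learning_rate=0.1)
      model.fit(X_train, y_train)
      print("test accuracy:", model.score(X_test, y_test))
      ```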

    CatBoost

    • What is CatBoost?

      The name CatBoost comes from the two words "Category" and "Boosting". It works well with many categories of data, such as audio, text, and image, as well as historical data. Unlike XGBoost and LightGBM, CatBoost does not require the dataset to be converted to any specific numeric format.
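
      A minimal sketch showing CatBoost consuming raw string categorical columns directly via cat_features, without one-hot or label encoding; the tiny dataset and parameter values are purely illustrative:

      ```python
      import pandas as pd
      from catboost import CatBoostClassifier

      df = pd.DataFrame({
          "color":  ["red", "blue", "blue", "green", "red", "green"],
          "size":   ["S", "M", "L", "M", "L", "S"],
          "price":  [10.0, 12.5, 9.0, 14.0, 11.0, 8.5],
          "bought": [1, 0, 1, 0, 1, 0],
      })

      X, y = df.drop(columns="bought"), df["bought"]
      model = CatBoostClassifier(iterations=100, verbose=False)
      model.fit(X, y, cat_features=["color", "size"])   # categorical columns used as-is
      print(model.predict(X))
      ```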

    • Advantages of CatBoost Library.

      1. Performance
      2. Handling Categorical features automatically
      3. Robust
      4. Easy-to-use
    • References

      1. CatBoost: A machine learning library to handle categorical (CAT) data automatically
