Bias: The inability of a machine learning method to capture the true relationship in the data; it shows up as the difference between the model's fit and the observations.
Variance: The difference in how well the model fits the training dataset versus the testing dataset; a high-variance model fits the training data well but generalizes poorly to new data.
Capacity: How complex the model is, i.e., how flexible a relationship it can represent. E.g., linear regression is a low-capacity model.
Overfit: The model fits its training data too closely, learning the detail and noise in the training data to the extent that this hurts its performance on new data.
Underfit: The model is too simple to capture the relationship between the input and output variables accurately. (The sketch below illustrates both cases.)
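To make these terms concrete, here is a minimal NumPy sketch (the sine-curve data, noise level, and polynomial degrees are all made up for illustration) that fits polynomials of increasing capacity and compares training and testing error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a smooth "true relationship" y = sin(2*pi*x).
x_train = np.sort(rng.uniform(0.0, 1.0, 15))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)
x_test = np.sort(rng.uniform(0.0, 1.0, 200))
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.2, x_test.size)

for degree in (1, 4, 10):  # low, moderate, and high capacity
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# Degree 1 underfits (high bias: both errors stay large).
# Degree 10 tends to overfit (high variance: small training error, larger testing error).
```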
The ideal algorithm in machine learning:
The ideal algorithm has low bias and low variance. Three commonly used methods for finding it are regularization, boosting, and bagging.
Regularization: Adds a penalty on model complexity (for example, the size of the regression coefficients) to the training objective, trading a small increase in bias for a larger reduction in variance.
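A minimal sketch of regularization, assuming scikit-learn is available; the synthetic data and alpha value are chosen only for illustration. Ridge regression adds an L2 penalty on the coefficients to plain least squares:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 40))               # many features relative to the sample size
w = np.zeros(40)
w[:3] = [2.0, -1.0, 0.5]                    # only a few features truly matter
y = X @ w + rng.normal(0.0, 0.5, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [("plain least squares", LinearRegression()),
                    ("ridge (alpha=1.0)", Ridge(alpha=1.0))]:
    model.fit(X_tr, y_tr)
    print(f"{name}: train R^2 {model.score(X_tr, y_tr):.2f}, "
          f"test R^2 {model.score(X_te, y_te):.2f}")

# The penalty shrinks the coefficients, so the ridge model typically
# generalizes better than the unregularized fit on data like this.
```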
Boosting: Builds an ensemble sequentially, with each new weak learner concentrating on the examples the previous ones predicted poorly; this mainly reduces bias.
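A minimal sketch of boosting, assuming scikit-learn is available; the dataset and hyperparameters are illustrative, not tuned. Each shallow tree is fitted to the errors left by the trees built so far:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

booster = GradientBoostingRegressor(
    n_estimators=200,   # number of sequential weak learners
    learning_rate=0.1,  # how much each new tree is allowed to correct
    max_depth=3,        # keep the individual trees weak (shallow)
    random_state=0,
)
booster.fit(X_tr, y_tr)
print("boosting test R^2:", round(booster.score(X_te, y_te), 2))
```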
Bagging: Trains many copies of a model on bootstrap samples of the training data and averages their predictions; this mainly reduces variance.
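A minimal sketch of bagging, assuming scikit-learn is available; the dataset and number of estimators are illustrative. Many decision trees (a high-variance base learner) are trained on bootstrap samples and their predictions averaged:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

single_tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# BaggingRegressor's default base estimator is a decision tree; each of the
# 100 trees is trained on a different bootstrap sample of the training data.
bagged = BaggingRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print("single tree test R^2:", round(single_tree.score(X_te, y_te), 2))
print("bagged trees test R^2:", round(bagged.score(X_te, y_te), 2))
```

Averaging the bootstrap-trained trees smooths out their individual fluctuations, which is why the bagged ensemble usually scores better on the test set than a single tree.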
Reference:
https://www.youtube.com/watch?v=EuBBz3bI-aA