Machine Learning: 9.2 HPO algorithms

Author: Cache_wood | Published 2022-04-18 10:11

Hyper-Parameter	Range	Distribution
model(backbone)	[mobilenetv,resnet,vgg]	categorical
learning rate*	[1e-6,1e-1]	log-uniform
batch size*	[8,16,32,64,128,256,512]	categorical
momentum**	[0.85,0.95]	uniform
weight decay**	[1e-6,1e-2]	log-uniform
detector	[faster-rcnn,ssd,yolo-v3,center-net]	categorical

The search space can be exponentially large
- Need to carefully design the space to improve efficiency
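
To make the table concrete, here is a minimal sketch of how such a search space could be written down and sampled. The dictionary layout and the names `SEARCH_SPACE`, `sample_config` and `count_categorical_combinations` are assumptions made for illustration, not the format of any particular HPO library. Only the categorical and log-uniform rows of the table are included.

```python
import math
import random

# Search space taken from the table above. The (distribution, spec) layout
# is an assumption for illustration, not a library format.
SEARCH_SPACE = {
    "backbone": ("categorical", ["mobilenetv", "resnet", "vgg"]),
    "learning_rate": ("log-uniform", (1e-6, 1e-1)),
    "batch_size": ("categorical", [8, 16, 32, 64, 128, 256, 512]),
    "weight_decay": ("log-uniform", (1e-6, 1e-2)),
    "detector": ("categorical", ["faster-rcnn", "ssd", "yolo-v3", "center-net"]),
}


def sample_config(space, rng):
    """Draw one configuration, respecting each parameter's distribution."""
    config = {}
    for name, (dist, spec) in space.items():
        if dist == "categorical":
            config[name] = rng.choice(spec)
        elif dist == "log-uniform":
            low, high = spec
            # Sample uniformly in log space, then map back.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            raise ValueError(f"unknown distribution: {dist}")
    return config


def count_categorical_combinations(space):
    """Number of distinct combinations of the categorical parameters alone."""
    total = 1
    for dist, spec in space.values():
        if dist == "categorical":
            total *= len(spec)
    return total
```

The three categorical parameters alone already give 3 x 7 x 4 = 84 combinations, and every continuous parameter multiplies that by however many values are tried for it, which is why the size of the space grows exponentially with the number of hyperparameters.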

Black-box: treats each training job as a black box
- Runs the full training process for every trial
Multi-fidelity: modifies the training job to speed up the search
- Train on subsampled datasets
- Reduce the model size (e.g. fewer layers or channels)
- Stop bad configurations early

Grid search
- All combinations are evaluated
- Guarantees the best result among the grid points (not over the whole continuous space)
- Curse of dimensionality
Random search
- Random combinations are tried
- More efficient than grid search (empirically and in theory, as shown in "Random Search for Hyper-Parameter Optimization")
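
The two strategies can be compared on a toy problem. The sketch below is illustrative only: `objective` is a made-up stand-in for a validation score with its peak deliberately placed between grid points, and the two hyperparameters (learning-rate and weight-decay exponents) are assumptions chosen to match the ranges in the table.

```python
import itertools
import random


def objective(lr_exponent, wd_exponent):
    """Toy stand-in for a validation score; the peak is off the grid."""
    return -((lr_exponent + 3.2) ** 2) - ((wd_exponent + 4.3) ** 2)


def grid_search(lr_exponents, wd_exponents):
    """Evaluate every combination on the grid and return the best one."""
    trials = itertools.product(lr_exponents, wd_exponents)
    return max(trials, key=lambda t: objective(*t))


def random_search(n_trials, rng):
    """Evaluate n_trials independently sampled configurations."""
    trials = [
        (rng.uniform(-6.0, -1.0), rng.uniform(-6.0, -2.0))
        for _ in range(n_trials)
    ]
    return max(trials, key=lambda t: objective(*t))


# 6 x 5 = 30 grid trials versus 30 random trials (same budget).
grid_best = grid_search([-6, -5, -4, -3, -2, -1], [-6, -5, -4, -3, -2])
random_best = random_search(30, random.Random(0))
```

The grid can only ever return one of its own points, while random search tries 30 distinct values of each hyperparameter with the same budget. That matters most when only a few hyperparameters really affect the result.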

Bayesian optimization (BO): iteratively learns a mapping from hyperparameters (HP) to the objective function from previous trials, then selects the next trial based on the current estimate.
Surrogate model
- Estimate how the objective function depends on HP
- Probabilistic regression models: Random forest, Gaussian process
Acquisition function
- The acquisition function is highest where the predicted objective, its uncertainty, or both are high
- Sample the next trial according to the acquisition function
- Trade off exploration and exploitation
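Limitations of BO:
- In the early stages it behaves much like random search, because the surrogate has few trials to learn from
- The optimization process is sequential, since each new trial depends on the results of the previous ones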
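
As a rough illustration of how an acquisition function trades off exploration and exploitation, the sketch below implements two common choices, upper confidence bound and expected improvement, for a maximization problem. It assumes the surrogate model already provides a predicted mean and standard deviation for each candidate. The function names and the `kappa` default are illustrative, not taken from the notes.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_confidence_bound(mean, std, kappa=2.0):
    """High when the prediction is high (exploit) or uncertain (explore)."""
    return mean + kappa * std


def expected_improvement(mean, std, best_so_far):
    """Expected amount by which a candidate beats best_so_far (maximizing)."""
    if std <= 0.0:
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


def pick_next(candidates, best_so_far):
    """candidates: list of (config, mean, std) predicted by a surrogate."""
    best = max(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best_so_far),
    )
    return best[0]
```

For example, with a best score so far of 0.80, a candidate predicted at 0.78 with standard deviation 0.10 has a larger expected improvement than one predicted at 0.80 with standard deviation 0.01, so `pick_next` chooses the more uncertain one.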

In Successive Halving
- n (number of initial configurations): exploration
- m (epochs trained per round): exploitation
Hyperband runs Successive Halving multiple times, each time decreasing n and increasing m
- Explore more at first, then exploit more
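
The two procedures can be sketched as follows. This is a simplified illustration under stated assumptions, not the published Hyperband schedule: n is taken to be the number of starting configurations and m the number of extra epochs per round, each bracket halves n and doubles m, and `train(config, epochs)` is a hypothetical user-supplied function that returns a validation score (higher is better) after `epochs` total epochs.

```python
import random


def successive_halving(configs, train, m):
    """Keep the better half of the configurations after each round.

    Returns the surviving configuration and the total number of epochs
    it should be trained for in its final round.
    """
    survivors = list(configs)
    if not survivors:
        raise ValueError("need at least one configuration")
    epochs = m
    while len(survivors) > 1:
        # Rank by the score reached after `epochs` epochs, best first.
        ranked = sorted(survivors, key=lambda c: train(c, epochs), reverse=True)
        survivors = ranked[: len(ranked) // 2]
        epochs += m  # the survivors train for another m epochs
    return survivors[0], epochs


def hyperband_style(sample_config, train, n, m, brackets):
    """Run several Successive Halving brackets, halving n and doubling m."""
    best_config, best_score = None, float("-inf")
    for _ in range(brackets):
        configs = [sample_config() for _ in range(n)]
        winner, epochs = successive_halving(configs, train, m)
        score = train(winner, epochs)
        if score > best_score:
            best_config, best_score = winner, score
        # Later brackets explore fewer configurations but train them longer.
        n, m = max(1, n // 2), m * 2
    return best_config, best_score
```

Calling `train(config, epochs)` with a growing epoch count keeps the sketch short. A real implementation would more likely resume each surviving configuration from a checkpoint than retrain it from scratch.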

Black-box HPO: grid/random search, Bayesian optimization
Multi-fidelity HPO: Successive Halving, Hyperband
In practice, start with random search
Be aware that some configurations are consistent top performers
- You can find them by mining your training logs, or by checking which configurations are commonly used in papers and code
