Paper title: Importance Estimation for Neural Network Pruning
[link](Importance Estimation for Neural Network Pruning)
Purpose of reading:
- Examine the pruning algorithm in the paper:
(1) Is it weight pruning or neuron pruning?
(2) How are the significant units selected? Also note the experimental settings and results, mainly the impact on inference time.
Reading notes:
- Abstract
- Neuron (filter) pruning
- Uses first- and second-order Taylor expansions to approximate a filter's contribution
- Challenging a common belief
Conventional view: "Many of them rely on the belief that the magnitude of a weight and its importance are strongly correlated."
The authors' challenge: "We question this belief and observe a significant gap in correlation between weight-based pruning decisions and empirically optimal one-step decisions – a gap which our greedy criterion aims to fill."
Proposed criterion:
We define the importance as the squared change in loss induced by removing a specific filter from the network.
Difficulty in applying the new criterion:
computing the exact importance is extremely expensive for large networks
Solution:
approximate it with a Taylor expansion, resulting in a criterion computed from parameter gradients readily available during standard training
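A minimal LaTeX sketch of the idea above, in my own notation rather than the paper's (the symbols $\mathcal{L}$, $\mathbf{W}$, $w_m$, and $g_m$ are assumptions):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Importance of parameter m: squared change in loss when it is zeroed out,
% approximated to first order by the (gradient x weight) product that is
% already available during standard training.
\begin{align}
  \mathcal{I}_m
    &= \bigl(\mathcal{L}(\mathbf{W}) - \mathcal{L}(\mathbf{W} \mid w_m = 0)\bigr)^2
    \approx \bigl(g_m \, w_m\bigr)^2,
  \qquad g_m = \frac{\partial \mathcal{L}}{\partial w_m}
\end{align}
\end{document}
```

A filter's importance is obtained by aggregating the contributions of its weights; the second-order variant adds a Hessian term to the expansion.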
Algorithm
The input is a trained network; prune it, then retrain with a small learning rate.
(1) For each minibatch, we compute parameter gradients and update network weights by gradient descent.
We also compute the importance of each neuron (or filter) using the gradient averaged over the minibatch (introduced in the original paper; see Equations 7 and 8 in the figure below).
(Figure: Equations 7 and 8 from the paper)
(2) After a predefined number of minibatches, we average the importance score of each neuron (or filter) over the minibatches, and remove the N neurons with the smallest importance scores (see the sketch after this list).
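As a concrete illustration of steps (1) and (2), here is a minimal PyTorch-style sketch, not the authors' implementation. The function `taylor_importance_prune`, its default arguments, and the restriction to Conv2d filters are assumptions for illustration; the per-filter score uses the squared sum of gradient times weight over a filter's parameters, one form of the first-order Taylor criterion, and actually removing the selected filters is left out.

```python
import torch
import torch.nn as nn

def taylor_importance_prune(model, dataloader, loss_fn, optimizer,
                            num_minibatches=100, neurons_to_remove=16):
    """Rank Conv2d filters by a first-order Taylor importance estimate (sketch)."""
    scores = {}  # layer name -> running per-filter importance (1-D tensor)
    for step, (inputs, targets) in enumerate(dataloader):
        if step >= num_minibatches:
            break
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()                      # gradients averaged over the minibatch

        with torch.no_grad():
            for name, module in model.named_modules():
                if isinstance(module, nn.Conv2d) and module.weight.grad is not None:
                    w, g = module.weight, module.weight.grad
                    # Squared sum of (gradient * weight) over each output filter.
                    filter_scores = (g * w).sum(dim=(1, 2, 3)).pow(2)
                    scores[name] = scores.get(name, 0.0) + filter_scores

        optimizer.step()                     # the usual gradient-descent update

    # Average the accumulated scores over the minibatches, then rank all filters.
    ranked = sorted(
        ((s / num_minibatches).item(), name, idx)
        for name, per_layer in scores.items()
        for idx, s in enumerate(per_layer)
    )
    # Filters with the smallest averaged importance are the pruning candidates;
    # zeroing them out (and repairing downstream shapes) is not shown here.
    return ranked[:neurons_to_remove]
```

After pruning, the network is fine-tuned with a small learning rate, as noted above.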
Experimental results
Compares the number of neurons pruned vs. the resulting loss.
In the supplementary material, the authors evaluate the inference speed of the pruned networks.
- Pruning reduces inference time, especially at larger batch sizes.
- Pruning skip connections gives a larger time reduction than pruning all layers.
(Figure: inference time experiment results)