Approach
- Depthwise Separable Convolution
For MobileNets, the depthwise convolution applies a single filter to each input channel. The pointwise convolution then applies a 1×1 convolution to combine the outputs of the depthwise convolution.
data:image/s3,"s3://crabby-images/2ded9/2ded917d8312c5371ef0f5192fef6da912c9b9a3" alt=""
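The two steps can be sketched in plain Python. This is only a minimal illustration, assuming stride 1, no padding, and no bias, batch normalization, or activation; the function names and the toy input are hypothetical and not taken from the paper.

```python
def depthwise_conv(x, dw_filters):
    """Filtering step: apply one K x K filter to each input channel.

    x:          list of M channels, each an H x W list of lists
    dw_filters: list of M filters, each a K x K list of lists
    Assumes stride 1 and no padding.
    """
    out = []
    for channel, kernel in zip(x, dw_filters):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out


def pointwise_conv(x, pw_weights):
    """Combining step: a 1 x 1 convolution across channels.

    pw_weights: list of N rows, each holding M weights.
    Each output channel is a weighted sum of the M input channels
    at the same spatial position.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wt * x[m][i][j] for m, wt in enumerate(row)) for j in range(w)]
            for i in range(h)
        ]
        for row in pw_weights
    ]


def depthwise_separable_conv(x, dw_filters, pw_weights):
    """Depthwise filtering followed by pointwise combining."""
    return pointwise_conv(depthwise_conv(x, dw_filters), pw_weights)


# Toy input: M = 2 channels of size 3 x 3, K = 2, N = 3 output channels.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_filters = [
    [[1, 0], [0, 1]],  # filter for channel 0
    [[1, 1], [1, 1]],  # filter for channel 1
]
pw_weights = [[1, 0], [0, 1], [1, 1]]  # N = 3 rows of M = 2 weights

y = depthwise_separable_conv(x, dw_filters, pw_weights)
print(y)
```

Note that the depthwise step never mixes channels: the output still has M channels, and only the 1×1 step produces the N output channels.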
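Standard convolutions have a computational cost of: $D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$
Depthwise separable convolutions cost: $D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$
Here $D_K$ is the kernel width and height, $M$ the number of input channels, $N$ the number of output channels, and $D_F$ the width and height of the feature map.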
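To make the two cost expressions concrete, the sketch below evaluates them for a single layer. The function names and the layer sizes (3×3 kernel, 512 input and output channels, 14×14 feature map) are assumptions chosen for illustration, not values quoted in this post.

```python
def standard_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a standard convolution: D_K * D_K * M * N * D_F * D_F."""
    return d_k * d_k * m * n * d_f * d_f


def separable_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a depthwise separable convolution."""
    depthwise = d_k * d_k * m * d_f * d_f  # one D_K x D_K filter per input channel
    pointwise = m * n * d_f * d_f          # 1 x 1 convolution combining channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 512 -> 512 channels, 14 x 14 feature map.
d_k, m, n, d_f = 3, 512, 512, 14
standard = standard_conv_cost(d_k, m, n, d_f)
separable = separable_conv_cost(d_k, m, n, d_f)

print(f"standard:  {standard:,}")   # 462,422,016
print(f"separable: {separable:,}")  # 52,283,392
```

For this layer almost all of the separable cost comes from the pointwise term, since the depthwise term does not depend on N.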
By expressing convolution as a two-step process of filtering and combining, we get a reduction in computation of:
data:image/s3,"s3://crabby-images/4874d/4874d6298d23149c72546b09d01fac15ecd0bb89" alt=""
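Written out, and assuming the reduction is taken as the ratio of the depthwise separable cost to the standard cost given earlier, the common factor $M \cdot D_F \cdot D_F$ cancels:

$$
\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2}
$$

For example, with 3×3 kernels ($D_K = 3$) the ratio is $1/N + 1/9$, so when $N$ is large the depthwise separable form needs roughly 8 to 9 times less computation than a standard convolution.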
- Network Structure
data:image/s3,"s3://crabby-images/0e4cc/0e4cc3459c7be98a95f7e30ce2e64da973d43531" alt=""
data:image/s3,"s3://crabby-images/fe15c/fe15cc0ce12694ed1de8d352cece9e12ceb6a78e" alt=""
Experiment
data:image/s3,"s3://crabby-images/269cb/269cb3412966557851c97d1ac44d71461cd8c853" alt=""
References:
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Andrew G. Howard et al., 2017, arXiv:1704.04861