Introduction
Two challenges in semantic segmentation
The first one is the reduced feature resolution caused by consecutive pooling operations or convolution striding.
Solution:
Atrous convolution
Another difficulty comes from the existence of objects at multiple scales.
Solution:
Image pyramid
Encoder-Decoder
DenseCRF
Spatial pyramid pooling
Related Work
Image pyramid
Encoder-decoder
Context module
Spatial pyramid pooling
Atrous convolution
Methods
Atrous Convolution for Dense Feature Extraction
Extracting features with atrous convolution
Atrous (dilated) convolution explained
Multi-Scale Context Aggregation by Dilated Convolutions
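To make the idea concrete, here is a minimal 1-D sketch of atrous convolution in plain Python. It is not the code used by DeepLab; the function name `atrous_conv1d`, the zero padding, and the centered filter are assumptions chosen for illustration. With rate r, neighbouring filter taps are r samples apart, so the same three weights cover a wider span of the input without adding parameters.

```python
def atrous_conv1d(x, w, rate=1):
    """Zero-padded 1-D atrous convolution (hypothetical helper).

    Computes y[i] = sum_k x[i + rate * (k - center)] * w[k],
    where center is the index of the middle filter tap.
    """
    center = len(w) // 2
    y = []
    for i in range(len(x)):
        acc = 0.0
        for k in range(len(w)):
            j = i + rate * (k - center)
            if 0 <= j < len(x):  # taps outside the signal hit zero padding
                acc += x[j] * w[k]
        y.append(acc)
    return y


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 1.0, 1.0]

dense = atrous_conv1d(signal, kernel, rate=1)    # taps at i-1, i, i+1
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps at i-2, i, i+2
print(dense)
print(dilated)
```

With rate=1 this reduces to an ordinary convolution; with rate=2 the filter spans five input samples while still using only three weights.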
Going Deeper with Atrous Convolution
Cascaded structure
Multi-grid Method
We adopt different atrous rates within block4 to block7 in the proposed model. In particular, we define as MultiGrid = (r1, r2, r3) the unit rates for the three convolutional layers within block4 to block7.
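In the paper, the final atrous rate of each convolutional layer is the unit rate multiplied by the rate assigned to the block. A small sketch of that arithmetic follows; the function name `multi_grid_rates` and its argument names are hypothetical, not taken from any library.

```python
def multi_grid_rates(block_rate, multi_grid=(1, 2, 4)):
    """Final atrous rates for the three convolutions of one block.

    block_rate: the atrous rate assigned to the whole block.
    multi_grid: the unit rates (r1, r2, r3).
    """
    return tuple(block_rate * unit_rate for unit_rate in multi_grid)


# Example: a block with rate 2 and MultiGrid = (1, 2, 4).
print(multi_grid_rates(2, (1, 2, 4)))
# MultiGrid = (1, 1, 1) leaves every layer at the block rate.
print(multi_grid_rates(2, (1, 1, 1)))
```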
ASPP
In the extreme case where the rate value is close to the feature map size, the 3×3 filter, instead of capturing the whole image context, degenerates to a simple 1×1 filter since only the center filter weight is effective.
Valid weights: the weights that are applied to the valid feature region, instead of padded zeros.
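The degeneration can be checked by counting valid weights directly. The sketch below is an illustration under stated assumptions (a square feature map, zero padding, and a hypothetical helper `valid_weight_histogram`): for every position on the map it counts how many of the nine taps of a 3×3 atrous filter land inside the map rather than on padding.

```python
def valid_weight_histogram(size, rate):
    """Map {number of valid taps: number of positions} for a 3x3 atrous
    filter with the given rate on a zero-padded size x size feature map."""
    offsets = (-rate, 0, rate)
    histogram = {}
    for row in range(size):
        for col in range(size):
            valid = sum(
                1
                for d_row in offsets
                for d_col in offsets
                if 0 <= row + d_row < size and 0 <= col + d_col < size
            )
            histogram[valid] = histogram.get(valid, 0) + 1
    return histogram


small_rate = valid_weight_histogram(65, 1)   # almost every position uses all 9 weights
large_rate = valid_weight_histogram(65, 65)  # only the center weight is ever valid
print(small_rate)
print(large_rate)
```

Once the rate reaches the feature map size, every position sees exactly one valid weight, which is the 1×1 behaviour described above.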
We apply global average pooling on the last feature map of the model, feed the resulting image-level features to a 1×1 convolution with 256 filters (and batch normalization [38]), and then bilinearly upsample the feature to the desired spatial dimension.
Global average pooling
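# Global average pooling over the spatial axes (H, W); keepdims=True keeps a 1x1 map.
image_level_features = tf.reduce_mean(net, [1, 2], name='image_level_global_pool', keepdims=True)
# 1x1 convolution on the pooled features; `depth` is the number of filters (256 in the paper).
image_level_features = slim.conv2d(image_level_features, depth, [1, 1],
                                   scope="image_level_conv_1x1", activation_fn=None)
# Bilinearly upsample to the spatial size of `net` (`feature_map_size` is assumed to hold its shape).
image_level_features = tf.image.resize_bilinear(image_level_features, (feature_map_size[1], feature_map_size[2]))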
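A framework-free sketch of the same data flow may help. Because the pooled map is 1×1, bilinear upsampling has only one value per channel to interpolate from, so it amounts to copying that value to every spatial position. The helpers `global_average_pool` and `broadcast_to_map` are hypothetical names, and the 1×1 convolution and batch normalization are deliberately left out.

```python
def global_average_pool(feature_map):
    """Average an H x W x C feature map (nested lists) over H and W."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


def broadcast_to_map(vector, height, width):
    """Copy a per-channel vector to every position of an H x W map."""
    return [[list(vector) for _ in range(width)] for _ in range(height)]


# A 2 x 2 map with 2 channels.
fmap = [[[1.0, 10.0], [2.0, 20.0]],
        [[3.0, 30.0], [4.0, 40.0]]]
pooled = global_average_pool(fmap)
image_level = broadcast_to_map(pooled, 2, 2)
print(pooled)
```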
Parallel structure