Abstract: Traditional semantic segmentation methods for street scenes suffer from a large number of parameters, low computational efficiency, and low precision. To address these problems, an end-to-end multi-scale semantic segmentation model based on a fully convolutional DenseNet is proposed. First, convolution layers embedded with hybrid dilated convolution are stacked to form a dense module, and the modules are then cascaded along the channel dimension to extract features. Next, multi-scale visual information, treated as a supervisory signal, is passed back to the original channels. Finally, the prediction results are obtained by bilinear interpolation. Experimental results on the Cityscapes dataset demonstrate that the proposed method segments efficiently and achieves better accuracy in street scene parsing.
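The three building blocks named in the abstract (dilated convolutions stacked in a dense module, channel-wise cascading, and bilinear upsampling of the prediction) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no hyperparameters, so the dilation rates (1, 2, 5), the 3x3 kernel, the growth rate, the layer count, and all function names below are assumptions chosen only for illustration. The channel-wise cascade is read here as DenseNet-style concatenation, and the upsampling uses the corner-aligned variant of bilinear interpolation.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field along one axis of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dense_block_channels(in_channels, growth_rate, num_layers):
    """Channels leaving a dense block when every layer's output
    (growth_rate feature maps) is concatenated onto its input."""
    return in_channels + growth_rate * num_layers


def bilinear_upsample(grid, out_h, out_w):
    """Resize a 2D list of numbers to out_h x out_w by bilinear
    interpolation, mapping corner pixels onto corner pixels."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Source row coordinate for output row i.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Source column coordinate for output column j.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Hypothetical settings, not taken from the paper.
rf_hybrid = receptive_field((1, 2, 5))     # 17
rf_plain = receptive_field((1, 1, 1))      # 7
channels = dense_block_channels(48, 16, 4)  # 112
upsampled = bilinear_upsample([[0, 2], [4, 6]], 3, 3)
```

Under these assumed settings, three dilated layers cover a 17-pixel field where three ordinary 3x3 layers cover 7, with the same number of weights per layer. That is the usual motivation for using dilation to gather multi-scale context without adding parameters.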
蒋斌, 涂文轩, 杨超, 刘虹雨, 赵子龙. 基于DenseNet的复杂交通场景语义分割方法[J]. 模式识别与人工智能, 2019, 32(5): 472-480.
JIANG Bin, TU Wenxuan, YANG Chao, LIU Hongyu, ZHAO Zilong. Semantic Segmentation Method for Complex Traffic Scene Based on DenseNet[J]. Pattern Recognition and Artificial Intelligence, 2019, 32(5): 472-480.