Abstract: Existing part segmentation methods suffer from low precision and cannot balance generalization against precision. To address these problems, a part segmentation network based on DeepLab, DeepLab-MAFE-DSC, is proposed. A multi-scale adaptive-pattern feature extraction (MAFE) module is introduced in the encoder of the network. Deformable convolution is used to improve the handling of irregular contours, and a sampling scheme that combines cascaded and parallel-concatenated branches is adopted to balance global and local information. A decoder module based on skip connections (DSC) is designed to fuse high-level semantic information with low-level feature information. Experiments on the dataset show that DeepLab-MAFE-DSC is simple, achieves high part segmentation accuracy and generalizes well.
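As a rough illustration of why mixing cascaded and parallel sampling can balance global and local information, the sketch below compares the receptive fields of the two arrangements. It is not the MAFE module itself: it assumes ordinary dilated 3x3 convolutions with stride 1 (as in DeepLab) rather than deformable ones, and the dilation rates and function names are hypothetical, since the abstract does not specify them.

```python
from typing import List, Sequence


def kernel_extent(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) covered by a single dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def cascade_receptive_field(dilations: Sequence[int], kernel_size: int = 3) -> int:
    """Receptive field of stride-1 dilated convolutions applied one after another.

    Each layer widens the field by dilation * (kernel_size - 1).
    """
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf


def parallel_receptive_fields(dilations: Sequence[int], kernel_size: int = 3) -> List[int]:
    """Receptive field of each branch when the same input feeds all branches.

    The branch outputs would be concatenated, so small and large fields coexist.
    """
    return [kernel_extent(kernel_size, d) for d in dilations]


# Hypothetical dilation rates, chosen only for illustration.
rates = [1, 2, 4]
print("cascade :", cascade_receptive_field(rates))    # one large field (global context)
print("parallel:", parallel_receptive_fields(rates))  # several fields kept side by side
```

Under these assumptions the cascade yields a single wide field, while the parallel branches keep narrow and wide fields together, which is one way to read the abstract's claim about balancing global and local information.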