Abstract: To address the high cost of acquiring labels for urban scenes, a domain adaptation semantic segmentation algorithm for urban scenes that combines self-ensembling and adversarial learning is proposed. To reduce the inter-domain gap between the source and target domains, style transfer is used to convert the source domain into a new dataset with the style of the target domain. To reduce the intra-domain gap within the target domain, a self-ensembling method is introduced and a teacher network is constructed. The teacher network supervises and guides the student network through consistency constraints on the target-domain segmentation maps, which reduces the intra-domain gap and improves segmentation accuracy. Self-training is then used to obtain pseudo-labels for the target domain, and these pseudo-labels are added to the adversarial learning procedure to retrain the network and further improve its segmentation ability. Experiments on segmentation datasets verify the effectiveness of the proposed algorithm.
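As an illustration of the teacher-student consistency and pseudo-labeling steps described above, a minimal sketch in plain Python follows. It is not the implementation used in this work. The abstract does not specify how the teacher is built, which consistency measure is used, or how pseudo-labels are selected, so the sketch assumes an exponential-moving-average teacher, a mean-squared-error consistency term on softmax outputs, and a fixed confidence threshold. All function names, `alpha`, `threshold`, and the ignore index 255 are hypothetical.

```python
import math

IGNORE = 255  # assumed "ignore" index for low-confidence pixels (hypothetical)


def ema_update(teacher, student, alpha=0.99):
    """Assumed teacher update: exponential moving average of student weights."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(student_logits, teacher_logits):
    """Assumed consistency term: mean squared error between the student's and
    the teacher's per-pixel class probabilities on a target-domain image."""
    total, count = 0.0, 0
    for s_px, t_px in zip(student_logits, teacher_logits):
        for ps, pt in zip(softmax(s_px), softmax(t_px)):
            total += (ps - pt) ** 2
            count += 1
    return total / count


def pseudo_labels(logits, threshold=0.9):
    """Assumed selection rule: keep the arg-max class for a pixel only when
    its probability reaches the threshold, otherwise mark the pixel IGNORE."""
    labels = []
    for px in logits:
        probs = softmax(px)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE)
    return labels


if __name__ == "__main__":
    # Toy example: two "pixels", two classes, flat lists standing in for tensors.
    student_out = [[2.0, 0.0], [0.1, 0.0]]
    teacher_out = [[1.5, 0.0], [0.0, 0.2]]
    print(consistency_loss(student_out, teacher_out))
    print(pseudo_labels(teacher_out, threshold=0.8))
    print(ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.9))
```

In a real segmentation network the flat lists would be replaced by weight tensors and per-pixel logit maps. The sketch only shows how the consistency constraint and the pseudo-label filter relate to each other.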