Domain Adaptation Semantic Segmentation for Urban Scene Combining Self-ensembling and Adversarial Learning

doi:10.16451/j.cnki.issn1003-6059.202101006

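Abstract: To address the high cost of acquiring labels for urban scenes, this paper proposes a domain adaptation method for urban scene semantic segmentation that combines self-ensembling and adversarial learning. To handle the large inter-domain gap between the source and target domains, style transfer is used to synthesize, from the source-domain dataset, a new dataset with the style of the target domain. This dataset serves as the new source-domain dataset and effectively reduces the inter-domain gap. To handle the intra-domain gap within the target domain, a self-ensembling method is introduced: a teacher network is constructed and used to supervise and guide the student network through a consistency constraint on the target-domain segmentation maps, which reduces the intra-domain gap and improves segmentation accuracy. Self-training is then used to obtain pseudo-labels for the target domain; the pseudo-labels are added to the adversarial learning method and the network is retrained, further improving its segmentation ability. Segmentation experiments on datasets demonstrate the effectiveness of the proposed method.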
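The teacher-student consistency idea in the abstract can be illustrated with a minimal sketch. It assumes a mean-teacher style setup in which the teacher's weights are an exponential moving average (EMA) of the student's weights, and in which the consistency constraint is a mean squared error between the two networks' per-pixel class probabilities. The abstract states neither choice, and the names `ema_update`, `consistency_loss`, and the value `alpha=0.99` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ema_update(teacher, student, alpha=0.99):
    """Assumed teacher update: move each teacher weight toward the
    corresponding student weight by an exponential moving average."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_logits, teacher_logits):
    """Assumed consistency term: mean squared error between the per-pixel
    class probabilities predicted by the student and by the teacher."""
    total, count = 0.0, 0
    for s_pix, t_pix in zip(student_logits, teacher_logits):
        s_prob, t_prob = softmax(s_pix), softmax(t_pix)
        total += sum((a - b) ** 2 for a, b in zip(s_prob, t_prob))
        count += len(s_prob)
    return total / count


# Toy usage: two "pixels" with three classes each, weights as flat lists.
student_w = [0.5, -0.2, 1.0]
teacher_w = [0.0, 0.0, 0.0]
teacher_w = ema_update(teacher_w, student_w)
loss = consistency_loss(
    [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]],  # student logits on a target image
    [[1.8, 0.6, 0.2], [0.1, 0.3, 2.5]],  # teacher logits on the same image
)
print(teacher_w, loss)
```

In this sketch only the student would receive gradients; the teacher changes solely through `ema_update`, which is what lets it act as a smoother, more stable target on unlabeled target-domain images.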

Keywords: self-ensembling, adversarial learning, domain adaptation, urban scene, semantic segmentation

Abstract: To address the high cost of acquiring labels for urban scenes, a domain adaptation semantic segmentation algorithm for urban scenes combining self-ensembling and adversarial learning is proposed. To reduce the inter-domain gap between the source and target domains, style transfer is employed to convert the source-domain dataset into a new dataset with the style of the target domain. To address the intra-domain gap in the target domain, a self-ensembling method is introduced and a teacher network is constructed. The teacher network supervises and guides the student network through consistency constraints on the target-domain segmentation maps, reducing the intra-domain gap of the target domain and improving segmentation accuracy. Self-training is used to obtain pseudo-labels for the target domain, and the pseudo-labels are added to the adversarial learning method to retrain the network and further improve its segmentation ability. Experiments on segmentation datasets verify the effectiveness of the proposed algorithm.
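The self-training step could be sketched as follows, assuming pseudo-labels are chosen by a fixed confidence threshold on the softmax output. The abstract does not say how pseudo-labels are selected, so `pseudo_labels`, `IGNORE_LABEL = 255`, and `threshold=0.9` are illustrative assumptions rather than the authors' procedure.

```python
import math

IGNORE_LABEL = 255  # hypothetical marker for pixels left unlabeled


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(logits_per_pixel, threshold=0.9):
    """Assign each pixel its most probable class, but only when the
    predicted probability reaches the threshold; otherwise mark the
    pixel with IGNORE_LABEL so it is skipped during retraining."""
    labels = []
    for logits in logits_per_pixel:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE_LABEL)
    return labels


# Toy usage: the first pixel is confidently class 0, the second is ambiguous.
print(pseudo_labels([[5.0, 0.0], [0.1, 0.0]]))
```

Ignoring low-confidence pixels is a common way to limit the noise that wrong pseudo-labels would otherwise feed back into retraining.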

Key words: Self-ensembling, Adversarial Learning, Domain Adaptation, Urban Scene, Semantic Segmentation

Received: 2020-09-28

Chinese Library Classification (ZTFLH): TP 319.4

Funding: Supported by the National Natural Science Foundation of China (No. 61462065, 61661036)

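Corresponding author: ZHANG Guimei, Ph.D., professor. Main research interests: computer vision, image processing, and pattern recognition. E-mail: guimei.zh@163.com.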

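About the authors: LU Feifei, master's student. Main research interests: computer vision, image processing, and pattern recognition. E-mail: 1323568545@qq.com.
LONG Bangyao, master's student. Main research interests: computer vision, image processing, and pattern recognition. E-mail: lby609527215@163.com.
MIAO Jun, Ph.D., professor. Main research interests: image-based 3D reconstruction, image processing, and pattern recognition. E-mail: miaojun@nchu.edu.cn.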

Cite this article:

ZHANG Guimei, LU Feifei, LONG Bangyao, MIAO Jun. Domain Adaptation Semantic Segmentation for Urban Scene Combining Self-ensembling and Adversarial Learning[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2021, 34(1): 58-67.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202101006 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2021/V34/I1/58
