Deep Snake with 2D-Circular Convolution and Difficulty Sensitive Contour-IoU Loss
LI Hao1, YUAN Guanglin1, LI Congli2, QIN Xiaoyan1, ZHU Hong1
1. Department of Information Engineering, Army Academy of Artillery and Air Defense of People's Liberation Army of China, Hefei 230031
2. Department of Ordnance Engineering, Army Academy of Artillery and Air Defense of People's Liberation Army of China, Hefei 230031
Abstract Deep Snake deforms an initial bounding box into the object contour end-to-end and significantly improves instance segmentation performance. However, it is sensitive to the initial bounding box and regresses the contour parameters independently of one another. To address these issues, Deep Snake with 2D-circular convolution and a difficulty-sensitive contour intersection over union (contour-IoU) loss is proposed. Firstly, 2D-circular convolution is designed based on the spatial context information of the contour to address the sensitivity to the initial bounding box. Secondly, a difficulty-sensitive contour-IoU loss function is proposed, based on the geometric meaning of the definite integral and the difficulty of each sample, so that the contour parameters are regressed as a whole. Finally, instance segmentation is performed with the proposed 2D-circular convolution and difficulty-sensitive contour-IoU loss function. Experiments on the Cityscapes, KINS and SBD datasets show that the proposed method achieves better segmentation accuracy.
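For orientation, the circular convolution that the original Deep Snake applies along a closed contour can be sketched as below. This is only an illustration of the wrap-around idea on a 1D ring of contour vertices, assuming NumPy; it is not the 2D-circular convolution proposed in this paper, whose details the abstract does not give. The function name `circular_conv1d` and the array shapes are hypothetical choices made for this sketch.

```python
import numpy as np


def circular_conv1d(features: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a kernel around a closed contour with wrap-around padding.

    features: (N, C) array, one C-dimensional feature per contour vertex.
    kernel:   (K, C) array with odd K (hypothetical layout for this sketch).
    Returns an (N,) array with one response per vertex. As in most deep
    learning code, this is technically a cross-correlation (no kernel flip).
    """
    n = features.shape[0]
    k = kernel.shape[0]
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")
    r = k // 2
    # The contour is closed, so the first and last vertices are neighbours:
    # pad by wrapping around instead of padding with zeros.
    padded = np.pad(features, ((r, r), (0, 0)), mode="wrap")
    out = np.empty(n, dtype=np.result_type(features, kernel))
    for i in range(n):
        out[i] = np.sum(padded[i:i + k] * kernel)
    return out


# Four vertices with scalar features 0, 1, 2, 3 and a length-3 box kernel.
feats = np.arange(4, dtype=float).reshape(4, 1)
box = np.ones((3, 1))
response = circular_conv1d(feats, box)
```

Because of the wrap-around padding, shifting the starting vertex of the contour only shifts the output by the same amount, so the result does not depend on where the vertex ordering begins.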
Received: 08 July 2021
Fund: Natural Science Foundation of Anhui Province (No. 2008085QF325)
Corresponding Author: YUAN Guanglin, Ph.D., associate professor. His research interests include computer vision, machine learning and its application.
About authors: LI Hao, master's student. His research interests include instance segmentation, object tracking and object detection. LI Congli, master, professor. His research interests include image computer vision. QIN Xiaoyan, master, associate professor. Her research interests include object detection, machine learning and its application. ZHU Hong, master, lecturer. Her research interests include image processing and computer vision.