Underwater Image Generation Method Based on Contrastive Learning with Hard Negative Samples
LIU Zijian1, WANG Xingmei1,2, CHEN Weijing1, ZHANG Wansong1, ZHANG Tianzi1
1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001; 2. National Key Laboratory of Underwater Acoustic Technology, Harbin Engineering University, Harbin 150001
Abstract: Image generation is essential for acquiring scarce underwater images, but it typically relies on paired data. Since such paired data distributions are difficult to obtain in practice in marine environments, contrastive learning-based generative adversarial networks (CL-GAN) are introduced to remove the bijection constraint between image domains. However, because random sampling yields low-quality negative samples, such models struggle to learn complex content features from noisy images. To address this issue, a hard negative sample contrastive learning-based feature-level GAN (HCFGAN) for underwater image generation is proposed. To improve the quality of negative samples, a hard negative sampling module (HNS) is designed to mine the feature similarity between samples, and the hard negative samples closest to the anchor sample are incorporated into the contrastive loss for complex feature learning. To ensure the complexity and comprehensiveness of the negative samples, a negative sample generation module (NSG) is constructed, and adversarial training between NSG and HNS guarantees the validity of the hard negative samples. To enhance the feature extraction capability and training stability of the model on blurred underwater images, a contextual feature generator and a global feature discriminator are designed. Experiments show that the underwater images generated by HCFGAN exhibit good authenticity and richness, demonstrating practical value for underwater image generation.
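The hard negative weighting described in the abstract can be sketched as an InfoNCE-style contrastive loss restricted to the negatives most similar to the anchor. The following minimal sketch is an illustrative assumption, not the paper's exact formulation: the function names, the cosine similarity measure, and the top-k hardness rule are stand-ins for the HNS module's learned mining strategy.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def hard_negative_nce(anchor, positive, negatives, tau=0.07, k=2):
    """InfoNCE loss computed over only the k hardest negatives,
    i.e. the negative features most similar to the anchor."""
    sim_pos = cosine(anchor, positive) / tau
    # Sort negatives by similarity to the anchor; the closest are "hard".
    sims = sorted((cosine(anchor, n) / tau for n in negatives), reverse=True)
    hard = sims[:k]
    denom = math.exp(sim_pos) + sum(math.exp(s) for s in hard)
    return -math.log(math.exp(sim_pos) / denom)
```

As expected of a contrastive objective, the loss grows as the mined negatives move closer to the anchor in feature space, which is precisely why hard negatives provide a stronger learning signal than randomly sampled ones.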