Pattern Recognition and Artificial Intelligence
2022 Vol. 35 Issue 7, Published 2022-07-25

Deep Learning Based Image Processing
575 Uneven Hazy Image Dehazing Based on Transmitted Attention Mechanism
WANG Keping, DUAN Yumeng, YANG Yi, FEI Shumin
It is difficult to model uneven hazy images accurately and to resolve the residual haze left by the dehazing process. Therefore, an uneven hazy image dehazing method based on a transmitted attention mechanism is proposed in this paper. To handle the heterogeneity of the haze distribution, a transmitted attention mechanism is designed in the network, so that the weight information in different modules can flow and cooperate to locate and handle the noise in the uneven hazy image. To reduce the loss of detail information caused by common deep convolution, a sparse smoothed dilated convolution is built to extract image features, which enlarges the receptive field while retaining more details. Finally, a lightweight residual block is used in parallel to supplement the color and detail information of the reconstructed image. Experiments on uneven hazy image datasets and synthetic hazy image datasets show that, compared with mainstream methods, the proposed method has advantages in both subjective effects and objective evaluations.
2022 Vol. 35 (7): 575-588
589 Face Sketch Synthesis Based on Lmser-in-Lmser Bidirectional Network
SHENG Qingjie, SU Ruidan, TU Shikui, XÜ Lei
Face sketch synthesis aims to transform a face photo into a face sketch. Existing methods generate over-smooth sketches and require pre-training on additional large-scale datasets. In this paper, a deep bidirectional network based on the least mean square error reconstruction (Lmser) self-organizing network is constructed, using the duality in paired neurons (DPN) feature to generate face sketches. DPN is realized with bidirectional shortcuts between the encoder and the decoder. It helps transfer features learned at different layers of the Lmser to improve texture details in the synthesized sketch. Another sketch-to-photo mapping network is built as a complementary Lmser that shares the same structure but runs in the converse direction. The bidirectional mappings form an outer Lmser network in which DPN enforces consistency between the paired blocks in a global manner. Experiments on benchmark datasets demonstrate that the proposed method achieves superior performance, is more widely applicable, and does not need pre-training on additional datasets.
2022 Vol. 35 (7): 589-601
602 Combining Visual Saliency and Attention Mechanism for Low-Light Image Enhancement
SHANG Xiaoke, AN Nan, SHANG Jingjie, ZHANG Shaomin, DING Nai
Low-light image enhancement is a fundamental and core step for solving various visual analysis tasks in low-light environments. However, existing mainstream methods generally fail to characterize structural information effectively, resulting in problems such as unbalanced exposure and color distortion. Therefore, a low-light image enhancement network combining visual saliency and an attention mechanism is proposed in this paper. First, a low-light image enhancement framework is constructed by introducing an attention mechanism that considers both local details and global information, so that the color information in the enhancement results is characterized correctly. To achieve refined construction, a progressive process is designed to refine the enhancement in stages, following the concept of gradual optimization from coarse to fine. A feature fusion module guided by visual saliency is introduced to enhance the ability of the network to perceive salient objects in images and to improve the expression of structural information in a way that is better aligned with visual cognitive needs. Thus, noise, artifacts and other problems are avoided effectively. Experiments show that the proposed method effectively solves the problems of unbalanced exposure and color distortion and achieves superior performance.
2022 Vol. 35 (7): 602-613
614 Scene Text Removal Based on Multi-scale Attention Mechanism
HE Ping, ZHANG Heng, LIU Chenglin
Scene text removal is of great significance for privacy protection and image editing in image communication. However, existing scene text removal models cannot extract sufficiently robust features from images with complex backgrounds and multi-scale texts, resulting in incomplete text detection and background repair. To solve this problem, a scene text removal framework based on a multi-scale attention mechanism is proposed for robust background repair and text detection. The proposed framework is mainly composed of a background repair network and a text detection network, which share a backbone network. In the background repair network, a texture adaptive module is designed to encode the channel and spatial features and to adaptively integrate local and global features, effectively repairing shadow parts in text reconstruction. To improve text detection, a context aware module is designed to learn the discriminative features between text and non-text regions in the image. In addition, to enhance the receptive field of the network and improve the removal of multi-scale texts, a multi-scale feature loss function is designed to optimize the background repair and text detection modules. Experimental results on the SCUT-SYN and SCUT-EnsText datasets show that the proposed method achieves state-of-the-art performance in text removal.
2022 Vol. 35 (7): 614-624
625 Lightweight Image Super-Resolution Network Based on Regional Complementary Attention and Multi-dimensional Attention
ZHOU Dengwen, WANG Wanjun, MA Yu, GAO Dandan
Lightweight convolutional neural networks have the advantages of few parameters, low computational cost and fast inference speed. However, their performance is greatly limited. To improve the performance of the lightweight image super-resolution network, a lightweight image super-resolution network based on regional complementary attention and multi-dimensional attention is proposed. Its basic component, the dual-branch multiple interactive residual block, can fuse multi-scale features effectively. To improve the utilization and expression ability of features, an effective lightweight regional complementary attention is designed to make the information in different regions of the feature map complement each other. Multi-dimensional attention is designed to model the dependencies between pixels in the channel and spatial dimensions. Experimental results demonstrate that the proposed network is superior to current lightweight super-resolution methods in the balance between complexity and performance.
2022 Vol. 35 (7): 625-636
637 Multi-field Features Representation Based Colorization of Grayscale Images
LI Hong'an, ZHENG Qiaoxue, MA Tian, ZHANG Jing, LI Zhanli, KANG Baosheng
Image colorization improves image quality by predicting the color information of grayscale images. Although grayscale images can be colored automatically by deep learning methods, the colorization quality for targets of different scales in the images is not satisfactory. In particular, existing colorization methods suffer from color overflow, mis-coloring and inconsistent image colors when dealing with complex objects and small target objects. To address these problems, an image colorization method based on multi-field feature representation is proposed in this paper. Firstly, the multi-field feature representation block (MFRB) is designed and combined with the upgraded U-Net to obtain a multi-field feature representation U-Net. Then, a grayscale image is input into the U-Net and the color image is obtained by adversarial training with PatchGAN. Finally, the VGG-19 network is employed to compute the perceptual loss of images at different scales to enhance the overall consistency of the colorization results. Experimental results on six distinct datasets demonstrate that the proposed method successfully enhances the quality of colorized images and creates color images with richer colors and more consistent tones. The proposed method outperforms the mainstream colorization algorithms in both quantitative assessment and subjective perception.
2022 Vol. 35 (7): 637-648
649 Thyroid Nodule Segmentation Model Integrating Global Reasoning and MLP Architecture
LI Binrong, XIE Jun, LI Gang, XU Xinying, LAN Zijun
To address the large noise interference in ultrasound images, the variable nodule size and the high computational complexity of existing thyroid nodule segmentation methods, a segmentation model combining global reasoning and multi-layer perceptron (MLP) architecture is proposed. The model is based on the axial shift MLP module, so that the interaction between features at different spatial locations is realized with less computational complexity. An end-to-end global reasoning unit is integrated into the encoder, and global information interaction is conducted based on graph convolutional networks to alleviate the interference of image noise. A pyramid feature layer is introduced into the decoder, and multi-scale feature interaction is performed to deal with the problem of variable nodule size. Experimental results on DDIT datasets show that the proposed model yields better performance and that it can be applied to other medical image segmentation tasks, such as breast nodule segmentation and retinal vessel segmentation.
2022 Vol. 35 (7): 649-660
661 Multi-stage Image Fusion Method Based on Differential Dual-Branch Encoder
HONG Yulu, WU Xiaojun, XU Tianyang
Existing infrared and visible image fusion methods lose much of the detail in the fused image and produce poor visual effects. To address these problems, a multi-stage image fusion method based on a differential dual-branch encoder is proposed. The features of multi-modal images are extracted by two encoders with different network structures to enhance the diversity of features. A multi-stage fusion strategy is designed to achieve refined image fusion. Firstly, primary fusion is performed on the differential features extracted by the two encoding branches in the differential dual-branch encoder. Then, mid-level fusion of the saliency features of the multi-modal images is conducted in the fusion stage. Finally, long-range lateral connections are adopted to transmit shallow features of the differential dual-branch encoder to the decoder and to guide the fusion process and the image reconstruction simultaneously. Experimental results show that the proposed method enhances the detailed information of the fused images and achieves better performance in both visual effect and objective evaluation.
2022 Vol. 35 (7): 661-670

Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press
 
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax: 0551-65591176 Email: bjb@iim.ac.cn