Swin Transformer-Based Skin Disease Segmentation Network via Dynamic Agent Bottleneck and Multi-scale Dilated Attention
SUN Lin1, XUE Hongke1, LÜ Juan1
1. College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457

Abstract Accurate segmentation of skin lesion areas is critical for the diagnosis and treatment of dermatological diseases. Existing networks struggle with diverse lesion morphologies, high similarity between lesions and surrounding tissues, and blurred lesion boundaries. To address these challenges, a Swin Transformer-based skin disease segmentation network via dynamic agent bottleneck and multi-scale dilated attention (STNDA) is proposed. First, a Swin Transformer-based backbone network is constructed to overcome the limitations of traditional convolutions in capturing global context. The hierarchical architecture of the backbone enables multi-scale feature fusion and establishes long-range dependencies, enhancing the network's ability to extract semantic features from skin lesions with varying morphologies. Second, to improve the feature expression ability of STNDA, a dynamic agent bottleneck module is designed. The module adaptively generates agent vectors and positional biases from the input features, allowing the network to dynamically adjust its focus on local receptive fields and thereby mitigating segmentation errors caused by interference from highly similar skin tissues. Finally, a multi-scale dilated attention fusion module is proposed to enhance the edge perception ability of the network. It combines a multi-branch parallel architecture of multi-scale dilated convolutions with spatial-channel attention mechanisms to improve the network's sensitivity to lesion boundaries. Experiments on the ISIC2017, PH2, and ISIC2018 datasets demonstrate that STNDA achieves superior segmentation performance, confirming its effectiveness.
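
The agent-based attention underlying the dynamic agent bottleneck can be illustrated with a toy NumPy sketch. This is not the STNDA implementation: the abstract does not specify how agent vectors or positional biases are generated, so the sketch assumes agents are obtained by average-pooling groups of input tokens and omits positional biases entirely. The names `pooled_agents` and `agent_attention` and all sizes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def pooled_agents(tokens, num_agents):
    """Derive agent vectors from the input by average-pooling token groups.

    Assumption: pooling stands in for the paper's adaptive agent generator,
    whose details are not given in the abstract.
    """
    groups = np.array_split(tokens, num_agents, axis=0)
    return np.stack([g.mean(axis=0) for g in groups])


def agent_attention(q, k, v, agents):
    """Two-step agent attention: agents gather from (k, v), queries read from agents."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    # Step 1 (aggregation): each agent attends over all keys.
    agent_values = softmax(agents @ k.T * scale) @ v      # shape (A, d_v)
    # Step 2 (broadcast): each query attends over the small set of agents.
    return softmax(q @ agents.T * scale) @ agent_values   # shape (N, d_v)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))   # 16 tokens with 8 channels (toy sizes)
agents = pooled_agents(tokens, num_agents=4)
out = agent_attention(tokens, tokens, tokens, agents)
```

With N tokens and A agents, the two attention maps have sizes A×N and N×A, so the cost grows linearly in N for a fixed number of agents rather than quadratically as with a full N×N map.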

Received: 06 June 2025
Fund: National Natural Science Foundation of China (No. 62576245, 62076089), Natural Science Foundation of Tianjin (No. 24JCYBJC00890)
Corresponding Author: LÜ Juan, Ph.D., associate professor. Her research interests include machine learning and deep learning.
About authors: SUN Lin, Ph.D., professor. His research interests include granular computing, graph machine learning and data mining. XUE Hongke, master's student. His research interests include machine learning and deep learning.