Collaborative Inference Acceleration Strategy via Interleaved Operator Partitioning for Edge Intelligence
LIU Zhibang¹,², WU Fan¹,², XU Chaonong¹,², ZHANG Zixiao¹,², MA Dan¹,²
1. College of Artificial Intelligence, China University of Petroleum, Beijing 102249; 2. Beijing Key Laboratory of Petroleum Data Mining, China University of Petroleum, Beijing 102249
Abstract Collaborative inference is an effective way to deploy models and accelerate inference on resource-constrained edge devices. However, existing operator partitioning strategies still suffer from high inter-device communication overhead. To address this problem, an interleaved operator partitioning (IOP) collaborative inference acceleration strategy for edge intelligence is proposed. Its core mechanism is to partition two adjacent operators along the output channel dimension and the input channel dimension, respectively. Because the channel counts of consecutive operators are matched, fewer output activations need to be concatenated across devices, which reduces the time overhead of collaborative inference. First, the computation and communication costs of the devices are modeled from the operator information in the model, and an integer programming model is established to minimize the total inference time. Second, a heuristic operator pairing algorithm is designed: adjacent operators are enumerated in forward order, the inference time overhead of IOP is compared with that of traditional output channel partitioning (OCP), and the operator pair with the highest benefit is selected. Finally, interleaved partitioning and distributed deployment are applied to the selected operator pairs. Experiments demonstrate that IOP achieves superior performance in terms of inference time, memory usage, and energy consumption, while remaining robust under sudden link fluctuations.
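The interleaving idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two dense (fully connected) operators with a ReLU in between as a stand-in for the convolutional operators the strategy targets, it simulates the devices with a loop, and the names `matvec`, `relu`, `reference`, and `interleaved` are hypothetical. The first operator is split by output channel (rows of `w1`) and the second by the matching input channels (columns of `w2`), so each device keeps its slice of the intermediate activation local and only the final partial results are summed.

```python
def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def relu(v):
    """Element-wise ReLU; it acts per channel, so it commutes with the split."""
    return [max(0, a) for a in v]


def reference(w1, w2, x):
    """Single-device result: operator 1, ReLU, then operator 2."""
    return matvec(w2, relu(matvec(w1, x)))


def interleaved(w1, w2, x, num_devices):
    """Simulate interleaved partitioning of two adjacent operators.

    Operator 1 is split along its output channels and operator 2 along the
    matching input channels. The intermediate activation is never
    concatenated; only the final partial outputs are added together.
    """
    hidden = len(w1)
    # Slice boundaries of the hidden channels assigned to each device.
    bounds = [hidden * k // num_devices for k in range(num_devices + 1)]
    total = [0] * len(w2)
    for lo, hi in zip(bounds, bounds[1:]):
        local_hidden = relu(matvec(w1[lo:hi], x))   # output-channel slice
        w2_slice = [row[lo:hi] for row in w2]        # input-channel slice
        partial = matvec(w2_slice, local_hidden)
        total = [t + p for t, p in zip(total, partial)]  # sum, not concat
    return total


# Illustrative weights (assumed values): 3 inputs, 4 hidden channels, 2 outputs.
w1 = [[1, -2, 3], [0, 1, -1], [2, 0, 1], [-1, 1, 0]]
w2 = [[1, 0, -1, 2], [3, 1, 0, -2]]
x = [1, 2, 3]

print(reference(w1, w2, x))       # [3, 16]
print(interleaved(w1, w2, x, 2))  # [3, 16]
```

In this sketch the only data combined across devices is the final sum of partial outputs. Under plain output channel partitioning of both operators, every device would instead need the full intermediate activation before running the second operator.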
Received: 23 June 2025
Fund: National Key Research and Development Program of China (No. 2022YFB4501600)
Corresponding author: XU Chaonong, Ph.D., associate professor. His research interests include edge intelligence, Internet of Things, and embedded systems.
About the authors: LIU Zhibang, Ph.D. candidate. His research interests include edge intelligence and cooperative inference. WU Fan, Ph.D. candidate. His research interests include wireless communications and Internet of Things. ZHANG Zixiao, Master's student. Her research interests include edge intelligence and embedded systems. MA Dan, Master's student. Her research interests include edge intelligence and Internet of Things.