[1] WIGHTMAN R, TOUVRON H, JÉGOU H. ResNet Strikes Back: An Improved Training Procedure in Timm[C/OL]. [2024-02-16]. https://arxiv.org/pdf/2110.00476.
[2] DING X H, ZHANG X Y, MA N N, et al. RepVGG: Making VGG-Style ConvNets Great Again // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2021: 13728-13737.
[3] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 770-778.
[4] SRIVASTAVA S, SHARMA G. OmniVec: Learning Robust Representations with Cross Modal Sharing // Proc of the IEEE/CVF Winter Conference on Applications of Computer Vision. Washington, USA: IEEE, 2024: 1225-1237.
[5] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional Networks for Biomedical Image Segmentation // Proc of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Germany: Springer, 2015: 234-241.
[6] YE P, LI B P, CHEN T, et al. Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation. International Journal of Computer Vision, 2022, 130(11): 2674-2694.
[7] WANG W H, DAI J F, CHEN Z, et al. InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 14408-14419.
[8] ZHAO H S, SHI J P, QI X J, et al. Pyramid Scene Parsing Network // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2017: 6230-6239.
[9] GIRSHICK R. Fast R-CNN // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2015: 1440-1448.
[10] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[11] REDMON J, DIVVALA S, GIRSHICK R, et al. You Only Look Once: Unified, Real-Time Object Detection // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 779-788.
[12] WANG C C, HE W, NIE Y, et al. Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism[C/OL]. [2024-02-16]. https://arxiv.org/pdf/2309.11331.
[13] HE K M, ZHANG X Y, REN S Q, et al. Identity Mappings in Deep Residual Networks // Proc of the 14th European Conference on Computer Vision. Berlin, Germany: Springer, 2016: 630-645.
[14] BALDUZZI D, FREAN M, LEARY L, et al. The Shattered Gradients Problem: If ResNets Are the Answer, Then What Is the Question? // Proc of the 34th International Conference on Machine Learning. San Diego, USA: JMLR, 2017: 342-350.
[15] VEIT A, WILBER M, BELONGIE S. Residual Networks Behave Like Ensembles of Relatively Shallow Networks // Proc of the 30th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2016: 550-558.
[16] SUN T F, DING S F, GUO L L. Low-Degree Term First in ResNet, Its Variants and the Whole Neural Network Family. Neural Networks, 2022, 148: 155-165.
[17] CHANG S N, WANG P C, LUO H, et al. Revisiting Vision Transformer from the View of Path Ensemble // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2023: 19832-19842.
[18] VASWANI A, SHAZEER N, PARMAR N, et al. Attention Is All You Need // Proc of the 31st International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2017: 6000-6010.
[19] HAN D C, PAN X R, HAN Y Z, et al. Flatten Transformer: Vision Transformer Using Focused Linear Attention // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2023: 5938-5948.
[20] LI F, ZHANG H, XU H Z, et al. Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 3041-3050.
[21] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
[22] YANG T, ZHU S J, CHEN C. GradAug: A New Regularization Method for Deep Neural Networks // Proc of the 34th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2020: 14207-14218.
[23] HUANG G, SUN Y, LIU Z, et al. Deep Networks with Stochastic Depth // Proc of the 14th European Conference on Computer Vision. Berlin, Germany: Springer, 2016: 646-661.
[24] PENG Y, TANG S J, LI B P, et al. Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing // Proc of the 36th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2022: 3596-3608.
[25] TANG S J, YE P, LI B P, et al. Boosting Residual Networks with Group Knowledge. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(6): 5162-5170.
[26] CHO J H, HARIHARAN B. On the Efficacy of Knowledge Distillation // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2019: 4793-4801.
[27] MIRZADEH S I, FARAJTABAR M, LI N, et al. Improved Knowledge Distillation via Teacher Assistant. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(4): 5191-5198.
[28] LI X C, FAN W S, SONG S M, et al. Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again // Proc of the 36th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2022: 3830-3842.
[29] LAN X, ZHU X T, GONG S G. Knowledge Distillation by On-the-Fly Native Ensemble // Proc of the 32nd International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2018: 7528-7538.
[30] WU G L, GONG S G. Peer Collaborative Learning for Online Knowledge Distillation. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(12): 10302-10310.
[31] CHEN D F, MEI J P, WANG C, et al. Online Knowledge Distillation with Diverse Peers. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(4): 3430-3437.
[32] YANG C G, AN Z L, ZHOU H L, et al. Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(8): 10212-10227.
[33] ZHANG H Y, CISSE M, DAUPHIN Y N, et al. Mixup: Beyond Empirical Risk Minimization[C/OL]. [2024-02-16]. https://arxiv.org/pdf/1710.09412.
[34] YUN S D, HAN D, CHUN S, et al. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2019: 6022-6031.
[35] WALAWALKAR D, SHEN Z Q, LIU Z C, et al. Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification // Proc of the IEEE International Conference on Acoustics, Speech and Signal Processing. Washington, USA: IEEE, 2020. DOI: 10.1109/ICASSP40776.2020.9053994.
[36] KIM G, HAN D K, KO H. SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features[C/OL]. [2024-02-16]. https://arxiv.org/abs/2108.03020.
[37] HE K M, FAN H Q, WU Y X, et al. Momentum Contrast for Unsupervised Visual Representation Learning // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2020: 9726-9735.
[38] ARNSTRÖM D, BEMPORAD A, AXEHILL D. A Dual Active-Set Solver for Embedded Quadratic Programming Using Recursive LDLT Updates. IEEE Transactions on Automatic Control, 2022, 67(8): 4362-4369.
[39] DOMAHIDI A, CHU E, BOYD S. ECOS: An SOCP Solver for Embedded Systems // Proc of the European Control Conference. Washington, USA: IEEE, 2013: 2071-2076.
[40] PANDALA A G, DING Y R, PARK H W. qpSWIFT: A Real-Time Sparse Quadratic Program Solver for Robotic Applications. IEEE Robotics and Automation Letters, 2019, 4(4): 3355-3362.
[41] AMOS B, KOLTER J Z. OptNet: Differentiable Optimization as a Layer in Neural Networks // Proc of the 34th International Conference on Machine Learning. San Diego, USA: JMLR, 2017: 136-145.
[42] KRIZHEVSKY A, HINTON G. Learning Multiple Layers of Features from Tiny Images[C/OL]. [2024-02-16]. http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
[43] DENG J, DONG W, SOCHER R, et al. ImageNet: A Large-Scale Hierarchical Image Database // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2009: 248-255.
[44] ZAGORUYKO S, KOMODAKIS N. Wide Residual Networks[C/OL]. [2024-02-16]. https://bmva-archive.org.uk/bmvc/2016/papers/paper087/paper087.pdf.
[45] HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3 // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2019: 1314-1324.
[46] YUN S, PARK J, LEE K, et al. Regularizing Class-Wise Predictions via Self-Knowledge Distillation // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2020: 13873-13882.
[47] KIM K, JI B, YOON D, et al. Self-Knowledge Distillation with Progressive Refinement of Targets // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 6547-6556.
[48] DENG X, ZHANG Z F. Learning with Retrospection. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(8): 7201-7209.
[49] SHEN Y Q, XU L W, YANG Y Z, et al. Self-Distillation from the Last Mini-Batch for Consistency Regularization // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2022: 11933-11942.
[50] ZHOU B L, KHOSLA A, LAPEDRIZA A, et al. Learning Deep Features for Discriminative Localization // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 2921-2929.
[51] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 618-626.