[1] LECUN Y, BENGIO Y. Convolutional Networks for Images, Speech, and Time Series // The Handbook of Brain Theory and Neural Networks. Cambridge, USA: The MIT Press, 1995.
[2] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet Classification with Deep Convolutional Neural Networks[C/OL]. [2017-11-20]. https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
[3] SIMONYAN K, ZISSERMAN A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1409.1556.pdf.
[4] SZEGEDY C, LIU W, JIA Y Q, et al. Going Deeper with Convolutions[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1409.4842.pdf.
[5] HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 770-778.
[6] DO M N, VETTERLI M. The Finite Ridgelet Transform for Image Representation. IEEE Transactions on Image Processing, 2003, 12(1): 16-28.
[7] CANDÈS E J, DONOHO D L. Curvelets, Multiresolution Representation, and Scaling Laws[J/OL]. [2017-11-20]. http://statweb.stanford.edu/~candes/papers/SPIE_Curvelets.pdf.
[8] DO M N, VETTERLI M. The Contourlet Transform: An Efficient Directional Multiresolution Image Representation. IEEE Transactions on Image Processing, 2005, 14(12): 2091-2106.
[9] DONOHO D L. Wedgelets: Nearly Minimax Estimation of Edges. The Annals of Statistics, 1999, 27(3): 859-897.
[10] LE PENNEC E, MALLAT S. Sparse Geometric Image Representations with Bandelets. IEEE Transactions on Image Processing, 2005, 14(4): 423-438.
[11] VELISAVLJEVIC V, BEFERULL-LOZANO B, VETTERLI M, et al. Directionlets: Anisotropic Multidirectional Representation with Separable Filtering. IEEE Transactions on Image Processing, 2006, 15(7): 1916-1933.
[12] HAMMOND D K, VANDERGHEYNST P, GRIBONVAL R. Wavelets on Graphs via Spectral Graph Theory. Applied and Computational Harmonic Analysis, 2011, 30(2): 129-150.
[13] BRUNA J, MALLAT S. Invariant Scattering Convolution Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1872-1886.
[14] 徐璨. 可逆深度卷积网络的构建. 硕士学位论文. 上海: 上海交通大学, 2017.
(XU C. The Construction of Reversible Deep Convolution Networks. Master Dissertation. Shanghai, China: Shanghai Jiao Tong University, 2017.)
[15] BAMBERGER R H, SMITH M J. A Filter Bank for the Directional Decomposition of Images: Theory and Design. IEEE Transactions on Signal Processing, 1992, 40(4): 882-893.
[16] CANDÈS E J, DEMANET L, DONOHO D, et al. Fast Discrete Curvelet Transforms. Multiscale Modeling & Simulation, 2006, 5(3): 861-899.
[17] ZHOU Z H, FENG J. Deep Forest: Towards an Alternative to Deep Neural Networks[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1702.08835.pdf.
[18] ZEILER M D, KRISHNAN D, TAYLOR G W, et al. Deconvolutional Networks // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2010: 2528-2535.
[19] ZEILER M D, TAYLOR G W, FERGUS R. Adaptive Deconvolutional Networks for Mid and High Level Feature Learning // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2011: 2018-2025.
[20] DOSOVITSKIY A, BROX T. Inverting Visual Representations with Convolutional Networks[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1506.02753.pdf.
[21] ZHANG J, ZHONG P, CHEN Y, et al. L1/2-Regularized Deconvolution Network for the Representation and Restoration of Optical Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing, 2014, 52(5): 2617-2627.
[22] SIMON M, RODNER E. Neural Activation Constellations: Unsupervised Part Model Discovery with Convolutional Networks // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2015: 1143-1151.
[23] ZHANG Q S, WU Y N, ZHU S C. Interpretable Convolutional Neural Networks[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1710.00935.pdf.
[24] BAKIR G H, HOFMANN T, SCHÖLKOPF B, et al. Predicting Structured Data. Cambridge, USA: The MIT Press, 2007.
[25] MESHI O, SONTAG D, JAAKKOLA T, et al. Learning Efficiently with Approximate Inference via Dual Losses // Proc of the 27th International Conference on Machine Learning. New York, USA: Omnipress, 2010: 783-790.
[26] HAZAN T, URTASUN R. A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction // Proc of the 23rd International Conference on Neural Information Processing Systems. Cambridge, USA: The MIT Press, 2010: 838-846.
[27] DOMKE J. Parameter Learning with Truncated Message-Passing // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2011: 2937-2943.
[28] ROSS S, MUNOZ D, HEBERT M, et al. Learning Message-Passing Inference Machines for Structured Prediction // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2011: 2737-2744.
[29] STOYANOV V, ROPSON A, EISNER J. Empirical Risk Minimization of Graphical Model Parameters Given Approximate Inference, Decoding, and Model Structure // Proc of the 14th International Conference on Artificial Intelligence and Statistics. New York, USA: ACM, 2011: 725-733.
[30] DAUMÉ III H, LANGFORD J, MARCU D. Search-Based Structured Prediction. Machine Learning, 2009, 75(3): 297-325.
[31] ROSS S, GORDON G J, BAGNELL J A. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning // Proc of the 14th International Conference on Artificial Intelligence and Statistics. New York, USA: ACM, 2011: 627-635.
[32] TOMPSON J, JAIN A, LECUN Y, et al. Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1406.2984.pdf.
[33] TOMPSON J, GOROSHIN R, JAIN A, et al. Efficient Object Localization Using Convolutional Networks // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2015: 648-656.
[34] GAL Y, GHAHRAMANI Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1506.02142.pdf.
[35] CHU X, OUYANG W L, LI H S, et al. Structured Feature Learning for Pose Estimation // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 4715-4723.
[36] ZHENG S, JAYASUMANA S, ROMERA-PAREDES B, et al. Conditional Random Fields as Recurrent Neural Networks // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2015: 1529-1537.
[37] BELANGER D, MCCALLUM A. Structured Prediction Energy Networks[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1511.06350.pdf.
[38] HAN S, POOL J, TRAN J, et al. Learning both Weights and Connections for Efficient Neural Network // CORTES C, LAWRENCE N D, LEE D D, et al., eds. Advances in Neural Information Processing Systems 28. Cambridge, USA: The MIT Press, 2015: 1135-1143.
[39] LI H, KADAV A, DURDANOVIC I, et al. Pruning Filters for Efficient ConvNets[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1608.08710v1.pdf.
[40] WEN W, WU C P, WANG Y D, et al. Learning Structured Sparsity in Deep Neural Networks[C/OL]. [2017-11-20]. https://arxiv.org/pdf/1608.03665.pdf.
[41] MAIRAL J, BACH F, PONCE J. Sparse Modeling for Image and Vision Processing[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1411.3230.pdf.
[42] KIROS R, SALAKHUTDINOV R, ZEMEL R. Multimodal Neural Language Models // Proc of the 31st International Conference on Machine Learning. New York, USA: ACM, 2014: 595-603.
[43] VINYALS O, TOSHEV A, BENGIO S, et al. Show and Tell: A Neural Image Caption Generator[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1411.4555v1.pdf.
[44] KARPATHY A, LI F F. Deep Visual-Semantic Alignments for Generating Image Descriptions // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2015: 3128-3137.
[45] DONAHUE J, HENDRICKS L A, GUADARRAMA S, et al. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 677-691.
[46] MAO J H, XU W, YANG Y, et al. Deep Captioning with Multimodal Recurrent Neural Networks(m-RNN)[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1412.6632.pdf.
[47] XU K, BA J, KIROS R, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1502.03044.pdf.
[48] JIA X, GAVVES E, FERNANDO B, et al. Guiding Long-Short Term Memory for Image Caption Generation[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1509.04942.pdf.
[49] YOU Q Z, JIN H L, WANG Z W, et al. Image Captioning with Semantic Attention[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1603.03925.pdf.
[50] CHEN L, ZHANG H W, XIAO J, et al. SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1611.05594.pdf.
[51] LU J S, XIONG C M, PARIKH D, et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning[J/OL]. [2017-11-20]. https://arxiv.org/pdf/1612.01887.pdf.
[52] WU Q, SHEN C H, LIU L Q, et al. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? [J/OL]. [2017-11-20]. https://arxiv.org/pdf/1506.01144.pdf. |