Zero-Shot Image Recognition Algorithm via Semantic Auto-Encoder Combining Relation Network
LIN Kezheng1, LI Haotian1, BAI Jingxuan1, LI Ao1
1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080

Abstract: A semantic auto-encoder improved by a relation network is proposed for zero-shot recognition. It addresses the projection domain shift problem and improves the robustness of the distance-based similarity measure used in traditional zero-shot recognition models. Based on the semantic auto-encoder, the proposed algorithm first constructs a mapping between image visual features and semantic vectors. The reconstructed vector is then concatenated with the corresponding ground-truth vector and fed into a neural network, and the predicted category is determined by the network's scalar output. Experimental results show that, compared with the traditional distance measurement method, the proposed algorithm improves the recognition rate on the public datasets AWA, CUB and ImageNet-2, and that its semantic-visual projection performs better than the back projection on some datasets.
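
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard semantic auto-encoder objective ||X - WᵀS||² + λ||WX - S||², whose stationarity condition is the Sylvester equation AW + WB = C, and it assumes a two-layer scoring network with a sigmoid output. The names `fit_sae` and `relation_scores`, the value of `lam`, the toy dimensions, and the random (untrained) network weights are all hypothetical.

```python
import numpy as np


def fit_sae(X, S, lam=0.2):
    """Closed-form projection W (k x d) for an assumed SAE objective.

    X: d x N visual features, S: k x N semantic vectors.
    Solves A W + W B = C with A = S S^T, B = lam X X^T, C = (1 + lam) S X^T.
    """
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1.0 + lam) * (S @ X.T)
    k, d = C.shape
    # Column-major vec: vec(A W) = (I_d kron A) vec(W), vec(W B) = (B^T kron I_k) vec(W).
    # A dense Kronecker solve is only practical at toy sizes.
    K = np.kron(np.eye(d), A) + np.kron(B.T, np.eye(k))
    w = np.linalg.solve(K, C.reshape(-1, order="F"))
    return w.reshape((k, d), order="F")


def relation_scores(s_hat, prototypes, params):
    """Score one reconstructed semantic vector against every class vector.

    Each pair [s_hat, prototype] is concatenated and passed through a small
    network that outputs one scalar in [0, 1] per class (assumed architecture).
    """
    W1, b1, w2, b2 = params
    pairs = np.hstack([np.tile(s_hat, (prototypes.shape[0], 1)), prototypes])
    hidden = np.maximum(pairs @ W1 + b1, 0.0)            # ReLU layer
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))     # sigmoid output


# Toy data standing in for real visual features and class semantics.
rng = np.random.default_rng(0)
d, k, n, n_classes = 6, 3, 20, 4
X = rng.normal(size=(d, n))
S = rng.normal(size=(k, n))
W = fit_sae(X, S)

prototypes = rng.normal(size=(n_classes, k))
# Untrained weights; in practice these would be learned from seen classes.
params = (0.1 * rng.normal(size=(2 * k, 8)), np.zeros(8),
          0.1 * rng.normal(size=8), 0.0)
s_hat = W @ X[:, 0]                       # project one image into semantic space
scores = relation_scores(s_hat, prototypes, params)
pred = int(np.argmax(scores))             # class with the highest relation score
```

The point of the design is that the final decision comes from a learned scalar score rather than a fixed distance between the projected vector and each class vector, which is the replacement for the distance measure that the abstract describes.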

Received: 25 October 2018

Fund: Supported by National Natural Science Foundation of China (No. 61501147), Natural Science Foundation of Heilongjiang Province (No. F2015040), University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (No. 2018203)
Corresponding author: LIN Kezheng, Ph.D., professor. His research interests include image processing, machine vision and pattern recognition.

About the authors:
LI Haotian, master student. His research interests include machine learning and computer vision.
BAI Jingxuan, master student. Her research interests include image processing and computer vision.
LI Ao, Ph.D., lecturer. His research interests include sparse representation, image restoration and computer vision.