Sparse Graph Based Transductive Multi-Label Learning for Video Concept Detection
ZHAO Ying-Hai1,2, CAI Jun-Jie1, WU Xiu-Qing1, SUN Fu-Ming3
1. School of Information Science and Technology, University of Science and Technology of China, Hefei 230027
2. The 35th Research Institute of China Aerospace Science and Industry Corp., Beijing 100013
3. College of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001
Abstract: A sparse graph based transductive multi-label learning method is proposed for video concept detection. First, sparse signal representation theory is used to mine the point-wise similarity relationships between samples and the concept-wise distribution correlations between labels. Then, a multi-label sparse graph is constructed on a discrete hidden Markov random field to perform transductive semi-supervised video concept detection. Representing the correlative information sparsely removes the negative effect of redundant information, reduces the complexity of the graph-based classification problem, and improves the efficiency and discriminability of the model. The proposed method is evaluated on the TRECVID 2005 dataset and compared extensively with multiple supervised and semi-supervised classification methods. The experimental results demonstrate its effectiveness.
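As a rough illustration of the point-wise part of this idea, the sketch below builds a sparse similarity graph by reconstructing each sample from all the others under an l1 penalty and keeping the resulting coefficients as edge weights. It is a minimal sketch under stated assumptions, not the authors' formulation: the abstract does not specify the objective or the solver, so the lasso objective, the iterative soft-thresholding (ISTA) solver, the use of absolute coefficients as weights, the parameter `lam`, and the function names `soft_threshold`, `sparse_code`, and `sparse_graph` are all hypothetical choices made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal step of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(target, atoms, lam=0.05, n_iter=500):
    """Approximately minimize 0.5*||target - sum_j w[j]*atoms[j]||^2 + lam*||w||_1
    with ISTA. `lam` and `n_iter` are illustrative values (assumptions)."""
    n_atoms, dim = len(atoms), len(target)
    # The squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so 1/lipschitz is a safe step size.
    lipschitz = sum(v * v for atom in atoms for v in atom) or 1.0
    step = 1.0 / lipschitz
    weights = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual of the current reconstruction, computed once per iteration.
        residual = [
            sum(weights[j] * atoms[j][d] for j in range(n_atoms)) - target[d]
            for d in range(dim)
        ]
        for j in range(n_atoms):
            grad_j = sum(atoms[j][d] * residual[d] for d in range(dim))
            weights[j] = soft_threshold(weights[j] - step * grad_j, step * lam)
    return weights


def sparse_graph(samples, lam=0.05):
    """Return a weight matrix whose row i holds the absolute sparse coefficients
    that reconstruct sample i from all other samples (no self-loops)."""
    n = len(samples)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        coeffs = sparse_code(samples[i], [samples[j] for j in others], lam)
        for j, coeff in zip(others, coeffs):
            graph[i][j] = abs(coeff)  # assumption: magnitude used as edge weight
    return graph


# Toy data: sample 2 lies along sample 0 and is orthogonal to sample 1.
samples = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.0]]
graph = sparse_graph(samples)
```

In this toy example, sample 2 receives a nonzero edge only to sample 0, and sample 1 receives no edges at all, which is the kind of pruning of redundant neighbours that the abstract attributes to the sparse representation. A practical system would use an optimized l1 solver rather than this pure-Python loop.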