Cross-Language Sentiment Classification Algorithm Based on Dependency Analysis Parser and Weight on Property Probability
ZHANG Ling-Ling1, JI Jun-Zhong1, BEI Fei1, WU Chen-Sheng2
1. Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing 100124; 2. Beijing Institute of Science and Technology Information, Beijing 100048
Abstract: Document-level sentiment classification methods typically take only the distribution of emotion information into account and ignore semantic emotion knowledge. To address this problem, a cross-language sentiment classification algorithm based on dependency analysis and property probability weights is proposed. First, dependency relations are obtained by dependency parsing before translation. Then, based on the correlation between the distribution of dictionary polarity and the document-level sentiment class, a property probability weight feature is incorporated into the naive Bayes classifier to improve classification performance. Finally, extensive experiments are performed using English datasets for training and standard Chinese datasets for testing. The results show that the proposed algorithm outperforms other existing algorithms.
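The idea of merging a polarity-based weight into naive Bayes classification can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the scheme below is an assumption rather than the proposed algorithm: the log-likelihood of any word found in a polarity dictionary is simply multiplied by a constant factor. The function names `train_nb` and `classify`, the parameter `weight`, and the toy data are all hypothetical, and the dependency parsing and translation steps are omitted.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d)
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        log_like[c] = {w: math.log((counts[c][w] + 1) / total) for w in vocab}
    return log_prior, log_like


def classify(doc, log_prior, log_like, lexicon, weight=2.0):
    """Pick the class with the highest weighted log score.

    Assumption (not from the paper): words in the polarity lexicon have
    their log-likelihood scaled by `weight`; other words use a factor of 1.
    """
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in doc:
            if w not in log_like[c]:
                continue  # skip words never seen in training
            factor = weight if w in lexicon else 1.0
            score += factor * log_like[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training data (hypothetical), standing in for the labeled documents.
train_docs = [["good", "movie"], ["great", "plot"],
              ["bad", "movie"], ["awful", "plot"]]
train_labels = ["pos", "pos", "neg", "neg"]
polarity_lexicon = {"good", "great", "bad", "awful"}

log_prior, log_like = train_nb(train_docs, train_labels)
predicted = classify(["good", "plot"], log_prior, log_like, polarity_lexicon)
```

With `weight` set to 1.0 the classifier reduces to ordinary multinomial naive Bayes, which makes the contribution of the polarity dictionary easy to isolate in an experiment.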