User Interest Related Information Diffusion Pattern Mining in Microblog<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201610006


Abstract Information diffusion modeling is the basis of community mining and community influence research. Based on an information diffusion model that incorporates user interest, this paper proposes a microscopic pattern mining method that detects information diffusion features by frequent subtree mining. Firstly, mining microscopic information diffusion patterns is converted into frequent subtree mining by formulating the microblog social network as a series of graphs whose user nodes carry multiple labels. Secondly, because a single node in a microblog social network carries multiple labels, an efficient frequent subtree mining algorithm for trees with multiple labels, the multiple labels tree miner (MLTreeMiner), is proposed. Finally, combined with a topic information extraction method, MLTreeMiner is used to mine information diffusion patterns. Experiments on synthetic data demonstrate that MLTreeMiner is efficient for frequent subtree mining on trees with multiple labels. Experiments on real data from Sina Weibo verify the validity of MLTreeMiner.
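
To make the notion of a frequent pattern on multi-label trees concrete, the following is a minimal Python sketch. It is not MLTreeMiner itself, whose details are not given in the abstract. It counts by brute force only the smallest possible patterns: two-node subtrees (a parent label followed by a child label). The tree encoding, the function names, and the example data are all hypothetical. A pattern's support is taken to be the number of diffusion trees in which it occurs at least once.

```python
from collections import Counter
from itertools import product

# Hypothetical encoding: a diffusion tree is a dict mapping
# node id -> (set of interest labels, list of child node ids).

def edge_patterns(tree):
    """Return every (parent_label, child_label) pair found on any edge of one tree."""
    patterns = set()
    for labels, children in tree.values():
        for child in children:
            child_labels = tree[child][0]
            # A multi-label edge yields one pattern per label combination.
            patterns.update(product(labels, child_labels))
    return patterns

def frequent_edge_patterns(forest, min_support):
    """Count the trees containing each label pair and keep pairs meeting min_support."""
    support = Counter()
    for tree in forest:
        # edge_patterns returns a set, so each tree counts at most once per pattern.
        support.update(edge_patterns(tree))
    return {p: s for p, s in support.items() if s >= min_support}

# Two toy retweet trees (illustrative data only).
forest = [
    {"u1": ({"sports", "tech"}, ["u2", "u3"]),
     "u2": ({"tech"}, []),
     "u3": ({"music"}, [])},
    {"v1": ({"tech"}, ["v2"]),
     "v2": ({"tech", "music"}, [])},
]

result = frequent_edge_patterns(forest, min_support=2)
```

With `min_support=2`, only the pairs `("tech", "tech")` and `("tech", "music")` remain, since each appears in both trees. A full frequent subtree miner would grow such small patterns into larger subtrees and prune candidates whose support falls below the threshold.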

Key words: Social Network; User Interest; Diffusion Pattern; Frequent Subtree Mining

Received: 30 May 2016

CLC number (ZTFLH): TP 391.4

Fund: Supported by the National Natural Science Foundation of China (No. 61572143, 61472089, 61202269), the Natural Science Foundation of Guangdong Province (No. 2014A030306004, 2014A030308008), and the Science and Technology Planning Project of Guangdong Province (No. 2015B010108006, 2013B051000076, 2012B01010029).

About the authors:
HAO Zhifeng, born in 1968, Ph.D., professor. His research interests include machine learning and artificial intelligence.
HUANG Canjin (corresponding author), born in 1990, master student. His research interests include data mining and user behavior analysis.
CAI Ruichu, born in 1983, Ph.D., professor. His research interests include machine learning and data mining.
WEN Wen, born in 1981, Ph.D., associate professor. Her research interests include SVM and pattern recognition.
HUANG Yupeng, born in 1990, master student. His research interests include social network data mining and massive data processing in cloud computing.
CHEN Bingfeng, born in 1983, Ph.D. candidate. His research interests include public opinion analysis and data mining.


Cite this article:

HAO Zhifeng, HUANG Canjin, CAI Ruichu, et al. User Interest Related Information Diffusion Pattern Mining in Microblog[J]. , 2016, 29(10): 924-935.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201610006 OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2016/V29/I10/924
