|
|
Link Prediction Method Based on PU Learning |
LI Qi1, WANG Zhiqiang1, LIANG Jiye1 |
1.Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Institute of Intelligent Information Processing, Shanxi University, Taiyuan 030006 |
|
|
Abstract In classification-based link prediction methods, it is difficult to choose reliable negative examples to construct the link prediction classifier due to the large-scale and uncertainty of the unknown node pairs. Therefore, a link prediction method based on positive and unlabeled(PU) learning is proposed. Firstly, topological information of node pairs is extracted to construct example sets. Secondly, distribution of candidate negative examples is determined by community structure, and several candidate negative example sets are obtained through multiple under-sampling based on the distribution. Then, the classifiers constructed from multiple negative example sets and positive example sets are integrated to select reliable negative examples. Finally, the link prediction classifier is constructed based on positive examples and reliable negative examples. Experiments on four datasets show that the proposed link prediction method produces better prediction results than other related methods.
|
Received: 13 May 2019
|
|
Fund:Supported by National Natural Science Foundation of China(No.61876103, 61906111), Science and Technological Innovation Programs of Higher Education Institutions in Shanxi(No.2019L0023), 1331 Engineering Project of Shanxi Province |
Corresponding Authors:
LIANG Jiye, Ph.D., professor. His research interests include artificial intelligence, granular computing, data mining and machine learning.
|
About author:: LI Qi, master student. Her research interests include social network analysis;WANG Zhiqiang, Ph.D., lecturer. His research interests include social network analysis and machine learning. |
|
|
|
[1] 吕琳媛.复杂网络链路预测.电子科技大学学报, 2010, 39(5): 651-661. (LÜ L Y. Link Prediction on Complex Networks. Journal of University of Electronic Science and Technology of China, 2010, 39(5): 651-661.) [2] 王智强,梁吉业,李 茹.基于信息融合的概率矩阵分解链路预测方法.计算机研究与发展, 2019, 56(2): 306-318. (WANG Z Q, LIANG J Y, LI R. Probability Matrix Factorization for Link Prediction Based on Information Fusion. Journal of Computer Research and Development, 2019, 56(2): 306-318.) [3] WANG Z Q, LIANG J Y, LI R. A Fusion Probability Matrix Factorization Framework for Link Prediction. Knowledge-Based Systems, 2018, 159: 72-85. [4] WANG P, XU B W, WU Y R, et al. Link Prediction in Social Networks: the State-of-the-Art. Science China(Information Sciences), 2015, 58(1). DOI: 10.1007/s11432-014-5237-y. [5] LICHTENWALTER R N, CHAWIA N V. Vertex Collocation Profiles: Subgraph Counting for Link Analysis and Prediction // Proc of the 21th International Conference on World Wide Web. New York, USA: ACM, 2012: 1019-1028. [6] LESKOVEC J, HUTTENLOCHER D, KLEINBERG J. Predicting Positive and Negative Links in Online Social Networks // Proc of the 19th International Conference on World Wide Web. New York, USA: ACM, 2010: 641-650. [7] CHIANG K Y, NATARAJAN N, TEWARI A, et al. Exploiting Longer Cycles for Link Prediction in Signed Networks // Proc of the 20th ACM International Conference on Information and Knowledge Management. New York, USA: ACM, 2011: 1157-1162. [8] DE SÁHIALLY R, PRUDÉNCIO R B C. Supervised Link Prediction in Weighted Networks // Proc of the 21th International Joint Conference on Neural Networks. Washington, USA: IEEE, 2011: 2281-2288. [9] CHUAN P M, GIAP C N, LE SON H, et al. Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification // BHATEJA V, LE NGUYEN N, NGUYEN N G, et al., eds. Information Systems Design and Intelligent Applications. Berlin, Germany: Springer, 2018: 823-833. [10] BROUARD C, D′ALCHÉ-BUC F, SZAFRANSKI M. Semi-supervised Penalized Output Kernel Regression for Link Prediction // Proc of the 28th International Conference on Machine Learning. New York, USA: ACM, 2011: 593-600. [11] KASHIMA H, KATO T, YAMANISHI Y, et al. Link Propagation: A Fast Semi-supervised Learning Algorithm for Link Prediction // Proc of the 9th SIAM International Conference on Data Mining. Philadelphia, USA: SIAM, 2009: 1100-1111. [12] HU H, ZHU C Y, AI H X, et al. LPI-ETSLP: IncRNA-Protein Interaction Prediction Using Eigenvalue Transformation-Based Semi-supervised Link Prediction. Molecular BioSystems, 2017, 13(9): 1781-1787. [13] DENIS F. PAC Learning from Positive Statistical Queries // Proc of the 9th International Conference on Algorithmic Learning Theory. Berlin, Germany: Springer, 1998: 112-126. [14] LIU B, LEE W S, YU P S, et al. Partially Supervised Classification of Text Documents // Proc of the 19th International Conference on Machine Learning. San Francisco, USA: Morgan Kaufmann, 2002: 387-394. [15] LU L, TAO P. Clustering-Based Method for Positive and Unlabeled Text Categorization Enhanced by Improved TFIDF. Journal of Information Science and Engineering, 2014, 30(5): 1463-1481. [16] YU H, HAN J W, CHANG K C C. PEBL: Positive Example Ba-sed Learning for Web Page Classification Using SVM // Proc of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2002: 239-248. [17] ZENG Z Z, CHEN K J, ZHANG S B, et al. A Link Prediction Approach Using Semi-supervised Learning in Dynamic Networks // Proc of the 6th International Conference on Advanced Computational Intelligence. Washington, USA: IEEE, 2013: 276-280. [18] PENG G J, CHEN K J, XUE S J, et al. A Relation Prediction Method Based on PU Learning // Proc of the 12th International Conference on Intelligent Systems and Knowledge Engineering. Washington, USA: IEEE, 2017. DOI: 10.1109/ISKE.2017.8258752. [19] GUO H X, LI Y J, SHANG J, et al. Learning from Class-Imbalanced Data: Review of Methods and Applications. Expert Systems with Applications, 2017, 73: 220-239. [20] BREIMAN L. Bagging Predictors. Machine Learning, 1996, 24(2): 123-140. [21] MOTHE J, MKHITARYAN K, HAROUTUNIAN M. Community Detection: Comparison of State of the Art Algorithms // Proc of the 11th Conference on Computer Science and Information Technologies. Washington, USA: IEEE, 2017: 125-129. [22] BLONDEL V D, GUILLAUME J L, LAMBIOTTE R, et al. Fast Unfolding of Communities in Large Networks[C/OL]. [2019-03-01]. https://arxiv.org/pdf/0803.0476.pdf. [23] HE F X, LIU T L, WEBB G I, et al. Instance-Dependent PU Learning by Bayesian Optimal Relabeling[C/OL]. [2019-03-01]. https://arxiv.org/pdf/1808.02180.pdf. [24] MANEVITZ L M, YONSEF M. One-Class SVMs for Document Classification. Journal of Machine Learning Research, 2001, 2: 139-154. |
|
|
|