Optimal Scale Selection and Attribute Reduction of Multi-scale Multiset-Valued Information Systems Based on Entropy
WANG Leixi1,2, WU Weizhi1,2, XIE Zhenhuang1,2
1. School of Information Engineering, Zhejiang Ocean University, Zhoushan 316022; 2. Key Laboratory of Oceanographic Big Data Mining and Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan 316022
Abstract: Existing information systems have difficulty representing and handling the data duplication that arises during data fusion. To address this, the concept of a multi-scale multiset-valued information system is introduced, and optimal scale selection and attribute reduction in such systems are studied. First, for any attribute subset, a similarity relation on the universe of discourse is defined using the Hellinger distance between multisets over the domain of each attribute. Information granules are then constructed in the form of similarity classes, and knowledge rough entropy is introduced for multi-scale multiset-valued information systems. Two notions of optimal scale are defined, one based on the similarity relation and one based on knowledge rough entropy, and the two are shown to be equivalent. Finally, reducts and entropy reducts at the optimal scale are discussed, and algorithms are designed for computing the entropy optimal scale and an entropy reduct.
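The pipeline summarized in the abstract (Hellinger distance between multisets, similarity classes, knowledge rough entropy) can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: each multiset is normalized to a relative-frequency distribution before the distance is taken, two objects are treated as similar when the distance is at most a hypothetical threshold `theta` on every chosen attribute, and the rough entropy uses the form (1/|U|) Σ log2 |S(x)| common in the rough-set literature. The toy table and all function names are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def hellinger(m1: Counter, m2: Counter) -> float:
    """Hellinger distance between two non-empty multisets.

    Assumption: each multiset is first normalised to relative frequencies.
    """
    n1, n2 = sum(m1.values()), sum(m2.values())
    support = set(m1) | set(m2)
    # Counter returns 0 for values that do not occur in a multiset.
    total = sum((sqrt(m1[v] / n1) - sqrt(m2[v] / n2)) ** 2 for v in support)
    return sqrt(total / 2)


def similarity_classes(table, attrs, theta):
    """Map each object x to S(x) = {y : distance <= theta on every attribute}.

    `theta` is a hypothetical similarity threshold, not taken from the paper.
    """
    return {
        x: {
            y
            for y in table
            if all(hellinger(table[x][a], table[y][a]) <= theta for a in attrs)
        }
        for x in table
    }


def rough_entropy(classes) -> float:
    """Assumed rough entropy: average of log2 |S(x)| over all objects."""
    return sum(log2(len(s)) for s in classes.values()) / len(classes)


# Toy single-scale table: three objects, one multiset-valued attribute "a".
table = {
    "x1": {"a": Counter({"low": 2, "high": 1})},
    "x2": {"a": Counter({"low": 4, "high": 2})},  # same proportions as x1
    "x3": {"a": Counter({"high": 3})},
}

classes = similarity_classes(table, attrs=["a"], theta=0.3)
entropy = rough_entropy(classes)
```

In this toy table `x1` and `x2` have identical value proportions, so they fall into the same similarity class while `x3` stays alone. Under these assumptions, coarser granules give a larger entropy, which is the kind of quantity that could be compared across scales when looking for an optimal scale.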
WANG Leixi, WU Weizhi, XIE Zhenhuang. Optimal Scale Selection and Attribute Reduction of Multi-scale Multiset-Valued Information Systems Based on Entropy[J]. Pattern Recognition and Artificial Intelligence, 2023, 36(6): 495-510.