An Algorithm for Mining Complement-Alternative Relationship Based on Frequent Itemsets
CHAI Yu-Mei1, WANG Chun-Li2,WANG Li-Ming1
School of Information Engineering,Zhengzhou University,Zhengzhou 450001 Department of Computer Science and Engineering,Henan University of Urban Construction,Pingdingshan 467036
Abstract:Based on the TOP-k-Closed Miner algorithm, an algorithm for mining frequent itemsets based on index (Index-FIM) is proposed. Bit-vector is used to represent dataset, the breadth expansion pruning strategy and region-index skimming strategy are simultaneously introduced. The experimental results show that Index-FIM has high execution efficiency for mining frequent itemsets in sparse datasets. In order to acquire useful information for direct prediction, another algorithm for complement-alternative relationship mining based on frequent itemsets (CARM) is proposed. The relevant coefficient between the items included in frequent itemsets is computed to get their complement-alternative relationship, and the complementary-alternative relationship is represented by complement-alternative relationship graph (CAG), which is convenient for the decision maker to make reasonable and accurate judgment. Experimental results show that CAG is more efficient and precise than frequent itemsets in expressing information.
柴玉梅,王春丽,王黎明. 基于频繁项集的互补替代关系挖掘算法[J]. 模式识别与人工智能, 2012, 25(1): 157-165.
CHAI Yu-Mei, WANG Chun-Li,WANG Li-Ming. An Algorithm for Mining Complement-Alternative Relationship Based on Frequent Itemsets. , 2012, 25(1): 157-165.
[1] Han Jiawei,Pei Jian,Yin Yiwen.Mining Frequent Patterns without Candidate Generation // Proc of the ACM SIGMOD International Conference on Management of Data.Dallas,USA,2000: 1-12 [2] Tsay Y J,Chiang J Y.CBAR: An Efficient Method for Mining Association Rules.Knowledge Based Systems,2005,18(2/3): 99-105 [3] Bashir S,Jan Z,Baig A R.Fast Algorithms for Mining Interesting Frequent Itemsets without Minimum Support [EB/OL].[2009-04-21].http://arxiv.org/ftp/arxiv/papers/0904/0904.3319.pdf [4] Verhein F,Chawla S.Using Significant,Positively Associated and Relatively Class Correlated Rules for Associative Classification of Imbalanced Datasets // Proc of the 7th IEEE International Conference on Data Mining.Omaha,USA,2007: 679 - 684 [5] Tsay Y J,Hsu T J,Yu J R.FIUT: A New Method for Mining Frequent Itemsets.Information Sciences: An International Journal,2009,179(11): 1724-1737 [6] Burdick D,Calimlim M,Gehrke J.MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases // Proc of the 17th International Conference on Data Engineering.Heidelberg,Germany,2001: 443-452 [7] Gouda K,Zaki M J.Efficiently Mining Maximal Frequent Itemsets // Proc of the IEEE International Conference on Data Mining.San Jose,USA,2001: 163-170 [8] Wang Liming,Li Kun.Algorithm of Studying the Structure of Graphical Utility Based on Nearest-Biclusters Collaboration Filtering Technology.Chinese Journal of Computers,2010,33(12): 2291-2299 (in Chinese) (王黎明,李 琨.基于Nearest-Biclusters 协作过滤技术的效用图结构学习算法.计算机学报,2010,33(12): 2291-2299) [9] Luo Xiangfeng,Liang Guoning,Liu Shijun.Generating Associated Relation between Documents // Proc of the 10th IEEE International Conference on High Performance Computing and Communications.Dalian,China,2008: 831-836 [10] Natarajan R,Shekar B.Understandability of Association Rules: A Heuristic Measure to Enhance Rule Quality // Guillet F J,Hamilton H J,eds.Quality Measures in Data Mining.New York,USA: Springer Uerlag,2007,43: 179-203 [11] Hilderman R J,Peckham T.Statistical Methodologies for Mining Potentially Interesting Contrast Sets // Guillet F J,Hamilton H J,eds. Quality Measures in Data Mining.New York,USA: Springer Uerlag,2007,43: 153-177 [12] Hilderman R J,Hamilton H J.Measuring the Interestingness of Discovered Knowledge: A Principled Approach.Intelligent Data Analysis,2003,7(4): 347-382 [13] Hilderman R J,Hamilton H J.Evaluation of Interestingness Measures for Ranking Discovered Knowledge // Proc of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining.Shenzhen,China,2001: 247-259 [14] Zhang Yulei,Zhao Liping.Quantitative Analysis of the Relationship of Bacteria Based on the Dinucleotide Frequency Profile.Chinese Journal of Microecology,2006,18(4): 261-263 (in Chinese) (张宇镭,赵立平.基于双核酸频率分布特征的细菌亲缘关系定量分析.中国微生态学杂志,2006,18(4): 261-263) [15] Gonzales C,Perny P,Queiroz S.Preference Aggregation with Graphical Utility Models // Proc of the 23rd AAAI Conference on Artificial Intelligence.Chicago,USA,2008: 1037-1042 [16] Wang Liming,Li Kun.A Model of Negotiation Based on GAI Multi-Attribute Dependencies.Pattern Recognition and Artificial Intelligence,2008,21(5): 569-576 (in Chinese) (王黎明,李 琨.基于GAI多属性依赖的协商模型.模式识别与人工智能,2008,21(5): 569-576) [17] Wang Liming,Huang Houkuan,Chai Yumei.An Algorithm for Resource Negotiation Based on Progresses Reduction in Process of Calculating.Pattern Recognition and Artificial Intelligence,2005,18(3): 273-280 (in Chinese) (王黎明,黄厚宽,柴玉梅.推测计算中基于进程约简的资源协商算法.模式识别与人工智能,2005,18(3): 273-280)