An Algorithm for Mining Closed Frequent Patterns Based on Projection Sum Tree
YANG ChuanYao1, ZHANG ChengHong2, HU YunFa1
1.Department of Computing and Information Technology, Fudan University, Shanghai 2004332. Department of Information Management and Information System, Fudan University, Shanghai 200433
Abstract:In this paper, a new algorithm for mining closed frequent patterns is presented based on a projection sum frequent items tree. This algorithm projects the transaction base into a projection sum frequent items tree and stores the patterns compactly with the help of tiers. When mining, it can make full use of the existing computational result which has been done without repeat computation. It traverses the projection tree only once and does not need to generate the conditional FP trees dynamically and recursively and it avoids much timeconsuming I/O. The experiment shows that it has a high efficiency on dense datasets.
杨传耀,张成洪,胡运发. 一种基于投影和树的闭合频繁模式算法*[J]. 模式识别与人工智能, 2008, 21(1): 6-11.
YANG ChuanYao, ZHANG ChengHong, HU YunFa. An Algorithm for Mining Closed Frequent Patterns Based on Projection Sum Tree. , 2008, 21(1): 6-11.
[1] Stumme G, Taouil R, Bastide Y, et al. Computing Iceberg Concept Lattices with Titanic. Data and Knowledge Engineering, 2002, 42(2): 189222 [2] Boulicaut J F, Bykowski A, Rigotti C. FreeSets: A Condensed Representation of Boolean Data for the Approximation of Frequency Queries. Data Mining and Knowledge Discovery, 2003, 7(1): 522 [3] Pasquier N, Bastide Y, Taouil R, et al. Efficient Mining of Association Rules Using Closed Itemset Lattices. Information Systems, 1999, 24(1): 2546 [4] Pasquier N, Bastide Y, Touil R, et al. Discovering Frequent Closed Itemsets for Association Rules // Beeri C, Buneman P, eds. Proc of the 7th International Conference on Database Theory. Jerusalem, Israel, 1999: 398416 [5] Pei Jian, Han Jiawei, Mao Runying. Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets // Proc of the ACMSIGMOD International Workshop on Data Mining and Knowledge Discovery. Dallas, USA, 2000: 2130 [6] Han Jiawei, Pei Jian, Yin Yiwen. Mining Frequent Patterns without Candidate Generation // Proc of the ACM SIGMOD International Conference on Management of Data. Dallas, USA, 2000: 112 [7] Wang Jianyong, Han Jiawei, Pei Jian. Closet+: Searching for the Best Strategies for Mining Frequent Closed Itemsets // Proc of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington, USA, 2003: 236245 [8] Grahne G, Zhu Jianfen. Efficiently Using PrefixTrees in Mining Frequent Itemsets // Goethals B, Zaki M J, eds. Proc of the 1st IEEE ICDM Workshop on Frequent Itemset Mining Implementations. Melbourne, USA, 2003: 135143 [9] Zaki M J, Hsiao C J. ChARM: An Efficient Algorithm for Closed Itemset Mining // Proc of the 2nd SIAM International Conference on Data Mining. Arlington, USA, 2002: 3443