Mining Algorithms of N-Most Frequent Itemsets<sup>*</sup>


Abstract Both the computational complexity of frequent itemset mining and the number of frequent itemsets grow exponentially with the number of items in a transaction set. The minimum support threshold is the key to controlling this growth. In practice, however, it is difficult to control the number of frequent itemsets with a support threshold alone. The problem of mining the N-most frequent itemsets is introduced, and two algorithms based on a dynamic minimum support threshold are presented to solve it: the breadth-first search algorithm NApriori and the depth-first search algorithm IntvMatrix. Experimental results show that the proposed algorithms are faster than the naive method, and that the speedup is remarkable when N is small.
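The idea of a dynamic minimum support threshold can be illustrated with a minimal Python sketch. This is not the paper's NApriori or IntvMatrix, for which the abstract gives no pseudocode. It is a generic level-wise (breadth-first) search written under stated assumptions: support is an absolute transaction count, ties in support are broken arbitrarily, and the names `n_most_frequent`, `offer`, and `threshold` are hypothetical. The threshold starts at 1 and is raised to the N-th best support seen so far. Any itemset below it, and by anti-monotonicity every superset of that itemset, can then be pruned.

```python
import heapq
from itertools import combinations


def n_most_frequent(transactions, n):
    """Return up to n (itemset, support) pairs with the highest support.

    Illustrative sketch only: a level-wise search whose minimum support
    threshold rises as better itemsets are found. Not the paper's NApriori.
    """
    if n <= 0:
        return []
    transactions = [frozenset(t) for t in transactions]
    top = []        # min-heap of (support, sorted itemset tuple), size <= n
    threshold = 1   # dynamic minimum support (absolute count)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def offer(itemset, sup):
        # Keep the n best entries; once the heap is full, its smallest
        # support becomes the new minimum support threshold.
        nonlocal threshold
        entry = (sup, tuple(sorted(itemset)))
        if len(top) < n:
            heapq.heappush(top, entry)
        elif entry > top[0]:
            heapq.heapreplace(top, entry)
        if len(top) == n:
            threshold = max(threshold, top[0][0])

    # Level 1: single items.
    level = {}
    for item in sorted({i for t in transactions for i in t}):
        s = frozenset([item])
        sup = support(s)
        if sup >= threshold:
            level[s] = sup
            offer(s, sup)

    k = 2
    while level:
        # Only itemsets that still meet the (possibly raised) threshold
        # may be extended; their supersets cannot have higher support.
        survivors = {s for s, sup in level.items() if sup >= threshold}
        candidates = set()
        for a, b in combinations(survivors, 2):
            c = a | b
            if len(c) == k and all(c - {x} in survivors for x in c):
                candidates.add(c)
        level = {}
        for c in candidates:
            sup = support(c)
            if sup >= threshold:
                level[c] = sup
                offer(c, sup)
        k += 1

    return [(items, sup) for sup, items in sorted(top, key=lambda e: (-e[0], e[1]))]


sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}, {"b", "c"}]
top3 = n_most_frequent(sample, 3)
```

For the five sample transactions, the three single items fill the heap and raise the threshold to 3, so every 2-itemset (support 2) is discarded and the search stops after the second level. A naive approach would instead mine all itemsets at a fixed low threshold and sort them afterwards.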

Key words: Data Mining; N-Most Frequent Itemsets; Support Threshold; Inverted Matrix

Received: 22 November 2005

CLC number: TP311


Cite this article:

CHEN XiaoYun,HU YunFa. Mining Algorithms of NMost Frequent Itemsets[J]. , 2007, 20(4): 512-518.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/ OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2007/V20/I4/512
