Abstract: A feature extraction method for customer purchase behavior based on a genetic algorithm (GA) is proposed. First, Tanimoto similarity is used to measure the similarity of purchase behavior between customers, and a GA-based clustering method is designed to group customers with similar purchase behavior into the same subpopulation. Then, a customer feature extraction method based on a multi-population genetic algorithm is presented to extract knowledge from the various subpopulations. To promote coevolution within the population and improve the quality of the rule set, a q-nearest neighbor replacement policy and local search are adopted. The proposed algorithm is validated on real-world retail data and compared with the Apriori algorithm. Experimental results show that the proposed algorithm efficiently yields condensed rule sets without generating frequent itemsets and is also more flexible in rule form. Finally, the experimental results are analyzed in detail.
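The similarity measure named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how purchase behavior is encoded, so this example assumes each customer is represented by the set of items purchased; the function name `tanimoto`, the sample customers, and the handling of empty baskets are illustrative assumptions, not the authors' implementation.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two item sets: |A & B| / (|A| + |B| - |A & B|)."""
    if not a and not b:
        # Assumption: two empty purchase histories are treated as dissimilar.
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical purchase histories, one item set per customer.
purchases = {
    "c1": {"milk", "bread", "eggs"},
    "c2": {"milk", "bread", "beer"},
    "c3": {"wine", "cheese"},
}

sim_12 = tanimoto(purchases["c1"], purchases["c2"])  # 2 shared items out of 4 distinct
sim_13 = tanimoto(purchases["c1"], purchases["c3"])  # no shared items
```

Under this encoding, customers whose pairwise similarity is high would be candidates for the same subpopulation; how the GA-based clustering uses these values is not described in the abstract.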