Abstract: This paper proposes a foremost-policy reinforcement learning based ART2 neural network (FPRL-ART2) and its learning algorithm. To meet the requirements of real-time learning, Foremost-Policy Reinforcement Learning (FPRL) selects the first rewarded behavior for the present state instead of the optimal behavior chosen in 1-step Q-learning. The FPRL algorithm is given and integrated with the ART2 neural network, so that the stored weights of a classified pattern in ART2 are increased or decreased by reinforcement learning. FPRL-ART2 is applied to collision avoidance for a mobile robot, and simulation experiments indicate that the number of collisions between the robot and obstacles is effectively reduced.
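The contrast between the two selection rules can be illustrated with a minimal sketch. The abstract does not give the details of the FPRL algorithm, so everything below is an assumption made for illustration: the function names `greedy_action` and `foremost_action`, the `reward_fn` callback, the "reward greater than zero" test, and the fallback when no behavior is rewarded are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Hashable, Sequence


def greedy_action(q_values: Sequence[float]) -> int:
    """1-step Q-learning style selection: compare every action, take the best."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def foremost_action(
    actions: Sequence[int],
    reward_fn: Callable[[Hashable, int], float],
    state: Hashable,
) -> int:
    """Foremost-policy style selection (assumed form).

    Scan the actions in order and return the first one whose immediate
    reward for the present state is positive. The search stops early,
    which is the property that would suit real-time learning.
    If no action is rewarded, fall back to the least-penalized one
    (this fallback is an assumption, not part of the source).
    """
    best_action, best_reward = actions[0], float("-inf")
    for action in actions:
        reward = reward_fn(state, action)
        if reward > 0:
            return action  # first rewarded behavior: stop searching
        if reward > best_reward:
            best_action, best_reward = action, reward
    return best_action


if __name__ == "__main__":
    # Hypothetical immediate rewards for three behaviors in one state.
    rewards = {0: -1.0, 1: 0.5, 2: 1.0}
    chosen = foremost_action([0, 1, 2], lambda s, a: rewards[a], state=None)
    print("foremost:", chosen)
    print("greedy:", greedy_action([rewards[a] for a in (0, 1, 2)]))
```

In this toy case the foremost rule settles on behavior 1 as soon as it sees a positive reward, while the greedy rule examines all three behaviors and picks behavior 2. The trade-off is a possibly suboptimal choice in exchange for a shorter search.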