A Method to Design Reinforcement Function Based on Fuzzy Rules in Q-Learning
ZHAO XiaoHua1, LI ZhenLong2, CHEN YangZhou2, RONG Jian1
1. Key Laboratory of Transportation Engineering in Beijing, Beijing University of Technology, Beijing 100022
2. School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022
Abstract: Q-learning is a reinforcement learning method for solving Markov decision problems with incomplete information. The design of the reward function is an important factor affecting the learning results of Q-learning. A method for designing the reward function of Q-learning based on fuzzy rules is introduced to improve the performance of reinforcement learning, and the method is applied to optimal traffic signal control. The switching time and switching sequence of the signal phases are adapted to the prevailing traffic conditions. The performance of the system is evaluated with the Paramics microscopic traffic simulation software. The results show that Q-learning with the fuzzy-rule-based reward function learns better than conventional Q-learning for traffic signal control.
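The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fuzzy sets, the rule base, or the state and action definitions, so everything below is an assumption made for illustration. The sketch assumes a single input (a queue length normalized to [0, 1]), three hypothetical fuzzy sets ("short", "medium", "long"), one hypothetical reward value per rule, weighted-average defuzzification, and the standard tabular Q-learning update.

```python
from collections import defaultdict


def memberships(q):
    """Membership degrees of a normalized queue length q in [0, 1].

    The three triangular/shoulder sets are hypothetical; they are not
    taken from the paper.
    """
    q = min(max(q, 0.0), 1.0)  # clamp to the assumed universe of discourse
    return {
        "short": max(0.0, 1.0 - 2.0 * q),
        "medium": max(0.0, 1.0 - abs(2.0 * q - 1.0)),
        "long": max(0.0, 2.0 * q - 1.0),
    }


# Hypothetical rule base: "IF queue is <set> THEN reward is <value>".
RULE_REWARD = {"short": 1.0, "medium": 0.0, "long": -2.0}


def fuzzy_reward(q):
    """Reward as the membership-weighted average of the rule outputs."""
    mu = memberships(q)
    total = sum(mu.values())  # equals 1 for these sets on [0, 1]
    return sum(mu[name] * RULE_REWARD[name] for name in mu) / total


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(s, a)."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Usage: one learning step with hypothetical states and phase actions.
Q = defaultdict(float)
ACTIONS = ["keep_phase", "switch_phase"]
q_update(Q, "s0", "keep_phase", fuzzy_reward(0.25), "s1", ACTIONS)
```

Compared with a reward that only distinguishes "good" from "bad" outcomes, a graded reward of this kind gives the learner a different signal for each level of congestion, which is the motivation the abstract gives for designing the reward with fuzzy rules.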
赵晓华,李振龙,陈阳舟,荣建. Q学习中基于模糊规则的强化函数设计方法[J]. 模式识别与人工智能, 2008, 21(2): 254-259.
ZHAO XiaoHua, LI ZhenLong, CHEN YangZhou, RONG Jian. A Method to Design Reinforcement Function Based on Fuzzy Rules in Q-Learning. Pattern Recognition and Artificial Intelligence, 2008, 21(2): 254-259.