Pattern Recognition and Artificial Intelligence
  2017, Vol. 30 Issue (7): 663-672    DOI: 10.16451/j.cnki.issn1003-6059.201707009
Original Article
Residual Value Iteration Algorithm Based on Function Approximation
CHEN Jianping1,2,3, HU Wen1,2,3, FU Qiming1,2,3,4
1.School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009
2.Jiangsu Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou 215009
3.Suzhou Key Laboratory of Mobile Networking and Applied Technologies, Suzhou University of Science and Technology, Suzhou 215009
4.Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012

Abstract  To address the unstable and slow convergence of the traditional value iteration algorithm, an improved residual value iteration algorithm based on function approximation is proposed. The algorithm combines the traditional value iteration algorithm with the value iteration algorithm based on the Bellman residual: weight factors are introduced, and new rules are constructed for updating the parameter vector of the value function. Theoretically, the new update rule guarantees convergence of the algorithm and thereby resolves the unstable convergence of the traditional value iteration algorithm. In addition, a forgetting factor is introduced to speed up convergence. Experimental results on the Grid World problem show that the proposed algorithm performs well and is robust.
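The abstract does not state the update rule itself, so the following is only a minimal sketch of the general idea, assuming the classic weighted formulation in which the update direction is (1 - beta) times the direct value iteration update plus beta times the Bellman residual gradient update. The names `BETA`, `features`, `step` and `residual_value_iteration`, the one-hot features, and the five-cell corridor standing in for the Grid World task are all assumptions made for illustration; they are not taken from the paper. The forgetting factor is left out because the abstract does not specify its form.

```python
GAMMA = 0.9      # discount factor (assumed)
ALPHA = 0.2      # step size (assumed)
BETA = 0.5       # weight factor: 0 = direct update only, 1 = pure residual gradient
N_STATES = 5     # hypothetical corridor with cells 0..4; cell 4 is the terminal goal


def features(s):
    """One-hot features; the terminal state maps to the zero vector, so V = 0 there."""
    phi = [0.0] * (N_STATES - 1)
    if s < N_STATES - 1:
        phi[s] = 1.0
    return phi


def value(w, s):
    """Linear value estimate V(s) = w . phi(s)."""
    return sum(wi * fi for wi, fi in zip(w, features(s)))


def step(s, a):
    """Deterministic model: a = -1 moves left, a = +1 moves right.
    The reward is 1 when the goal cell is reached and 0 otherwise."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)


def residual_value_iteration(beta=BETA, sweeps=2000):
    """Sweep the non-terminal states and apply the weighted update to w."""
    w = [0.0] * (N_STATES - 1)
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            # Greedy one-step backup: the value iteration target.
            s2, r = max((step(s, a) for a in (-1, 1)),
                        key=lambda sr: sr[1] + GAMMA * value(w, sr[0]))
            delta = r + GAMMA * value(w, s2) - value(w, s)  # Bellman residual
            phi, phi2 = features(s), features(s2)
            # (1 - beta) * delta * phi              (direct update)
            # + beta * delta * (phi - GAMMA * phi2) (residual gradient update)
            # simplifies to delta * (phi - beta * GAMMA * phi2).
            for i in range(len(w)):
                w[i] += ALPHA * delta * (phi[i] - beta * GAMMA * phi2[i])
    return w


if __name__ == "__main__":
    print(residual_value_iteration())
```

In this toy corridor the learned weights should approach the optimal values 0.9^(3 - s) for cells s = 0 to 3. Setting `beta=0.0` recovers the direct update alone, and `beta=1.0` the pure residual gradient update, which makes the trade-off described in the abstract easy to explore.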
Key words: Reinforcement Learning, Value Iteration, Function Approximation, Gradient Descent, Bellman Residual
Received: 02 November 2016     
CLC number: TP 181
Fund: Supported by National Natural Science Foundation of China (No. 61602334, 61672371, 61502329), Natural Science Foundation of Jiangsu Province (No. BK20140283), Funding of Suzhou Science and Technology (No. SZS201609)
Corresponding Author: CHEN Jianping
About authors: CHEN Jianping (corresponding author), born in 1963, Ph.D., professor. His research interests include big data and analytics, building energy efficiency and intelligent information processing.
HU Wen, born in 1992, master student. Her research interests include reinforcement learning and building energy efficiency.
FU Qiming, born in 1985, Ph.D., lecturer. His research interests include reinforcement learning, pattern recognition and building energy efficiency.
Cite this article:
CHEN Jianping, HU Wen, FU Qiming. Residual Value Iteration Algorithm Based on Function Approximation[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(7): 663-672.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201707009      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2017/V30/I7/663