Residual Value Iteration Algorithm Based on Function Approximation
CHEN Jianping1,2,3, HU Wen1,2,3, FU Qiming1,2,3,4
1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009
2. Jiangsu Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou 215009
3. Suzhou Key Laboratory of Mobile Networking and Applied Technologies, Suzhou University of Science and Technology, Suzhou 215009
4. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012
|
|
Abstract: Aiming at the unstable and slow convergence of the traditional value iteration algorithm, an improved residual value iteration algorithm based on function approximation is proposed. The traditional value iteration algorithm and the value iteration algorithm with Bellman residual are combined: a weight factor is introduced, and a new rule for updating the value-function parameter vector is constructed. Theoretically, the new parameter vector guarantees the convergence of the algorithm and resolves the convergence instability of the traditional value iteration algorithm. Moreover, a forgetting factor is introduced to speed up convergence. Experimental results on the Grid World problem show that the proposed algorithm achieves good performance and robustness.
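The update rule described in the abstract can be illustrated with a short sketch. The Python fragment below is a minimal illustration, assuming a linear value function V(s) = θᵀφ(s); the parameter names beta and lam, and the trace-style use of the forgetting factor, are illustrative assumptions rather than the paper's exact notation or derivation.

```python
import numpy as np

def residual_vi_update(theta, trace, phi_s, phi_s_next, reward,
                       gamma=0.95, alpha=0.1, beta=0.5, lam=0.9):
    """One parameter update of a linear value function V(s) = theta . phi(s).

    beta is the weight factor mixing the direct (traditional value-iteration)
    gradient with the Bellman-residual gradient; lam is a forgetting factor
    applied to an accumulator of past update directions (an assumed mechanism,
    since the abstract only states that a forgetting factor speeds up convergence).
    """
    # Bellman error of the sampled transition
    delta = reward + gamma * np.dot(theta, phi_s_next) - np.dot(theta, phi_s)
    # Direct-gradient direction (semi-gradient used by traditional value iteration)
    direct = delta * phi_s
    # Residual-gradient direction (descent on the squared Bellman residual)
    residual = delta * (phi_s - gamma * phi_s_next)
    # Weighted combination of the two directions via the weight factor beta
    combined = (1.0 - beta) * direct + beta * residual
    # Forgetting factor discounts older directions while accumulating new ones
    trace = lam * trace + combined
    theta = theta + alpha * trace
    return theta, trace
```

With beta = 0 the step reduces to the conventional semi-gradient value-iteration update, and with beta = 1 it becomes the pure Bellman-residual gradient update; intermediate values trade off the two, which is the mixing the abstract describes.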
|
Received: 02 November 2016
|
|
Fund: Supported by National Natural Science Foundation of China (No.61602334, 61672371, 61502329), Natural Science Foundation of Jiangsu Province (No.BK20140283), Funding of Suzhou Science and Technology (No.SZS201609)
Corresponding Author: CHEN Jianping
|
About the authors:
CHEN Jianping (corresponding author), born in 1963, Ph.D., professor. His research interests include big data and analytics, building energy efficiency and intelligent information processing.
HU Wen, born in 1992, master student. Her research interests include reinforcement learning and building energy efficiency.
FU Qiming, born in 1985, Ph.D., lecturer. His research interests include reinforcement learning, pattern recognition and building energy efficiency.
|
|
|
|
|
|