Kernel-Based Continuous-Action Actor-Critic Learning<sup>*</sup>


Abstract: Reinforcement learning algorithms often have to handle both continuous state spaces and continuous action spaces to achieve accurate control. This paper combines the strength of kernel methods in handling continuous state spaces with the advantage of actor-critic methods in handling continuous action spaces, and proposes kernel-based continuous-action actor-critic learning (KCACL) on that basis. In KCACL, the actor updates each action probability according to a reward-inaction scheme, and the critic updates the state value function using online selective kernel-based temporal difference (OSKTD) learning. Experimental results demonstrate the effectiveness of the proposed algorithm.
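
The abstract names the two updates but gives no formulas, so the following is only a minimal sketch of how such an actor and critic could fit together, not the paper's KCACL or OSKTD algorithm. It assumes a finite set of candidate actions, a Gaussian kernel, a simple novelty threshold for deciding which states to store, and the sign of the TD error as the actor's success signal. All names and parameters (`KernelTDCritic`, `reward_inaction_update`, `novelty`, `width`) are hypothetical.

```python
import math

def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two state vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

class KernelTDCritic:
    """TD(0) critic with V(s) = sum_i w_i * k(s, c_i) over a sparse set of stored states."""

    def __init__(self, gamma=0.9, lr=0.5, novelty=0.1, width=1.0):
        self.gamma, self.lr, self.novelty, self.width = gamma, lr, novelty, width
        self.centers, self.weights = [], []

    def value(self, s):
        return sum(w * gaussian_kernel(s, c, self.width)
                   for w, c in zip(self.weights, self.centers))

    def update(self, s, reward, s_next, done=False):
        # Assumed selection rule: store s only if it is not too similar
        # to any state already in the dictionary.
        if all(gaussian_kernel(s, c, self.width) < 1.0 - self.novelty
               for c in self.centers):
            self.centers.append(tuple(s))
            self.weights.append(0.0)
        target = reward + (0.0 if done else self.gamma * self.value(s_next))
        td_error = target - self.value(s)
        # Gradient step on the kernel weights in the direction of the TD error.
        for i, c in enumerate(self.centers):
            self.weights[i] += self.lr * td_error * gaussian_kernel(s, c, self.width)
        return td_error

def reward_inaction_update(probs, chosen, td_error, lr=0.1):
    """Linear reward-inaction: reinforce the chosen action on success, otherwise do nothing."""
    if td_error <= 0:  # assumption: a non-positive TD error counts as "no reward"
        return list(probs)
    return [p + lr * (1.0 - p) if i == chosen else (1.0 - lr) * p
            for i, p in enumerate(probs)]
```

In a training loop under these assumptions, the critic's `update` would be called on each transition and its returned TD error passed to `reward_inaction_update` for the action that was taken. The reward-inaction rule keeps the probabilities normalized, because the mass added to the chosen action equals the mass removed from the others.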

Received: 13 May 2013

ZTFLH: TP 181


Cite this article:

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/ OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2014/V27/I2/103