Target-Directed Locomotion of a Snake-Like Robot Based on Path Integral Reinforcement Learning

doi:10.16451/j.cnki.issn1003-6059.201901001

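Abstract (translated from the Chinese): The path integral method originates from stochastic optimal control. It is a numerical iterative method that solves optimal control problems for continuous nonlinear systems, does not depend on a system model, and converges quickly. In this paper, a policy improvement method based on path integral reinforcement learning is applied to the target-directed locomotion of a snake-like robot. Path integral reinforcement learning is used to learn the parameters of the snake-like robot's gait equation. The robot can thereby avoid obstacles and reach the target point in a simulation environment, and, using prior knowledge from the simulation, it can also quickly complete the same task in a real environment. Experimental results verify the correctness of the method.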

Keywords: Path Integral, Reinforcement Learning, Stochastic Optimal Control, Snake-Like Robot, Target-Directed Locomotion

Abstract: The path integral method is derived from stochastic optimal control. It is a numerical iterative method that solves optimal control problems for continuous nonlinear systems with fast convergence and without a system model. In this paper, a policy improvement algorithm based on path integral reinforcement learning is proposed for the target-directed locomotion of a snake-like robot. The path integral reinforcement learning approach is employed to learn the parameters of the snake-like robot's serpentine gait equation, so that the robot reaches the target position quickly without contacting obstacles in a simulation environment. Moreover, with the prior knowledge gained in simulation, the robot can complete the same task well in a real environment. Experimental results verify the validity of the proposed algorithm.
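To make the idea of learning gait parameters by path integral policy improvement more concrete, the following is a minimal sketch, not the authors' implementation. It uses a simplified, episode-level variant: perturb the parameter vector, evaluate a cost per rollout, and average the perturbations with weights proportional to exp(-cost). The full path integral method weights perturbations per time step, which this sketch omits. The serpenoid-style joint-angle formula, the parameter names (`alpha`, `omega`, `beta`, `gamma`), the toy cost, and the target values are all assumptions introduced for illustration. This page does not give the paper's gait equation or cost function.

```python
import math
import random


def serpenoid_angle(i, t, alpha, omega, beta, gamma):
    """Joint angle of joint i at time t for a serpenoid-style gait (assumed form)."""
    return alpha * math.sin(omega * t + i * beta) + gamma


def toy_cost(theta):
    """Hypothetical stand-in for a rollout cost (e.g. distance to target plus penalties).

    A real cost would come from running the robot or a simulator with these
    gait parameters; here it is a quadratic around made-up target values.
    """
    target = [0.5, 2.0, 0.6, 0.0]
    return sum((a - b) ** 2 for a, b in zip(theta, target))


def pi2_style_update(theta, cost_fn, rng, n_rollouts=20, sigma=0.1, h=10.0):
    """One simplified update: exponentially weighted average of perturbations."""
    # Sample Gaussian exploration noise for each rollout.
    noise = [[rng.gauss(0.0, sigma) for _ in theta] for _ in range(n_rollouts)]
    costs = [cost_fn([p + e for p, e in zip(theta, eps)]) for eps in noise]

    # Normalize costs to [0, 1] before exponentiating (a common normalization).
    c_min, c_max = min(costs), max(costs)
    span = (c_max - c_min) or 1.0
    weights = [math.exp(-h * (c - c_min) / span) for c in costs]
    total = sum(weights)

    # Low-cost rollouts dominate the weighted mean of the perturbations.
    delta = [
        sum(w * eps[k] for w, eps in zip(weights, noise)) / total
        for k in range(len(theta))
    ]
    return [p + d for p, d in zip(theta, delta)]


def learn(theta0, iterations=200, seed=0):
    """Repeat the update from an initial guess; returns the learned parameters."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(iterations):
        theta = pi2_style_update(theta, toy_cost, rng)
    return theta


if __name__ == "__main__":
    theta0 = [0.2, 1.0, 0.3, 0.2]  # hypothetical initial [alpha, omega, beta, gamma]
    theta = learn(theta0)
    print("initial cost:", toy_cost(theta0))
    print("final cost:  ", toy_cost(theta))
```

Because the update needs only rollout costs, not gradients or a dynamics model, parameters learned this way in simulation could in principle serve as the starting point for further rollouts on real hardware, which is the kind of transfer the abstract describes.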

Received: 2018-09-10

Chinese Library Classification (ZTFLH): TP 242.6

Funding: Supported by the National Natural Science Foundation of China (No. 61603200, U1613210)

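Corresponding author: FANG Yongchun, Ph.D., professor. Research interests include robot visual control, unmanned aerial vehicles, underactuated crane systems, and micro/nano manipulation. E-mail: fangyc@nankai.edu.cn.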

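About the authors: ZHU Wei, master's student. Research interests include snake-like robots and deep reinforcement learning. E-mail: zhuwei@mail.nankai.edu.cn. GUO Xian, Ph.D., lecturer. Research interests include snake-like robots and deep reinforcement learning. E-mail: guoxian@nankai.edu.cn.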

Cite this article:

FANG Yongchun, ZHU Wei, GUO Xian. Target-Directed Locomotion of a Snake-Like Robot Based on Path Integral Reinforcement Learning[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2019, 32(1): 1-9.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201901001 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2019/V32/I1/1