Emotion-Based Heterogeneous Multi-agent Reinforcement Learning with Sparse Reward
FANG Baofu1,2, MA Yunting1,2, WANG Zaijun3, WANG Hao1,2
1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
2. Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei 230601, China
3. Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Guanghan 618307, China
Abstract: In reinforcement learning, an agent's convergence speed and learning efficiency degrade sharply when a sparse reward distribution prevents it from acquiring effective experience. To address this sparse reward problem, an emotion-based heterogeneous multi-agent reinforcement learning method is proposed in this paper. Firstly, an emotion model based on personality is established to provide an intrinsic incentive mechanism for multiple heterogeneous agents as an effective supplement to the external reward. Then, building on this mechanism, a deep deterministic policy gradient reinforcement learning algorithm with an intrinsic emotional incentive under sparse rewards is proposed to accelerate the agents' convergence. Finally, multi-robot pursuit is used as the simulation platform, sparse reward scenarios of different difficulty levels are constructed, and the effectiveness and superiority of the proposed method in terms of pursuit success rate and convergence speed are verified.
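The abstract outlines the core mechanism: a personality-parameterized emotion model produces an intrinsic incentive that supplements the sparse external reward before the policy update. The sketch below illustrates one plausible wiring of such reward shaping; the class and parameter names (PersonalityEmotionModel, shaped_reward, beta) and the simple decay dynamics are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class PersonalityEmotionModel:
    """Minimal sketch of a personality-based emotion model: a fixed
    personality vector weights per-step appraisal features, and the
    resulting emotional intensity decays over time. These dynamics
    are an assumption for illustration only."""

    def __init__(self, personality, decay=0.9):
        self.personality = np.asarray(personality, dtype=float)
        self.decay = decay    # how quickly emotion fades between steps
        self.intensity = 0.0  # current emotional intensity

    def update(self, appraisal):
        # appraisal: per-transition features the agent can compute locally,
        # e.g. state novelty, progress toward the evader, teammate proximity.
        appraisal = np.asarray(appraisal, dtype=float)
        self.intensity = (self.decay * self.intensity
                          + float(self.personality @ appraisal))
        return self.intensity


def shaped_reward(r_extrinsic, appraisal, emotion_model, beta=0.1):
    """Sum of the sparse external reward and the intrinsic emotional
    incentive; beta (assumed hyperparameter) balances the two terms.
    The result would stand in for the raw reward in the critic target."""
    return r_extrinsic + beta * emotion_model.update(appraisal)


# Usage: each heterogeneous pursuer carries its own personality profile.
pursuer = PersonalityEmotionModel(personality=[0.8, 0.2, 0.5])
r = shaped_reward(r_extrinsic=0.0,            # sparse: zero on most steps
                  appraisal=[0.3, 0.6, 0.1],  # hypothetical appraisal features
                  emotion_model=pursuer)
print(r)  # non-zero learning signal even when the external reward is absent
```

Because the emotion state is kept per agent, identical transitions yield different intrinsic rewards under different personality vectors, which is what makes the incentive heterogeneous across pursuers.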