Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2023, Vol. 36, Issue (2): 108-119    DOI: 10.16451/j.cnki.issn1003-6059.202302002
Knowledge-Guided Adaptive Sequence Reinforcement Learning Model
LI Yinggang1, TONG Xiangrong1
1. School of Computer and Control Engineering, Yantai University, Yantai 264005

Abstract: Sequence recommendation can be formalized as a Markov decision process and thus transformed into a deep reinforcement learning problem. The key step is mining critical information from user sequences, such as preference drift and dependencies between sequences, yet most current deep reinforcement learning recommendation systems take a fixed sequence length as input. Inspired by knowledge graphs, a knowledge-guided adaptive sequence reinforcement learning model is proposed. Firstly, using the entity relations of the knowledge graph, a partial sequence is intercepted from the complete user feedback sequence as a drift sequence, in which the item set represents the user's current preference and the sequence length represents how quickly that preference changes. Then, a gated recurrent unit extracts the user's preference changes and the dependencies between items, while a self-attention mechanism selectively focuses on key item information. Finally, a compound reward function, comprising a discounted sequence reward and a knowledge graph reward, is designed to alleviate the sparse reward problem. Experiments on four real-world datasets demonstrate that the proposed model achieves superior recommendation accuracy.
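The compound reward described in the abstract combines a discounted sequence reward with a knowledge graph reward. The paper does not give the formulas here, so the following is a minimal sketch under stated assumptions, not the authors' implementation: the discount factor `gamma`, the weighting coefficient `lambda_kg`, and the normalization of the knowledge graph reward are all hypothetical placeholders.

```python
# Sketch of a compound reward: discounted per-step feedback plus a
# knowledge-graph bonus for items linked to the user's drift sequence.
# All names and weights are illustrative, not the paper's formulation.

def discounted_sequence_reward(step_rewards, gamma=0.9):
    """Sum per-step feedback rewards with exponential discounting."""
    return sum(r * gamma ** t for t, r in enumerate(step_rewards))

def knowledge_graph_reward(recommended_item, drift_items, kg_edges):
    """Fraction of drift-sequence items connected to the recommended
    item in the knowledge graph; densifies the sparse feedback signal."""
    linked = sum(1 for item in drift_items
                 if (recommended_item, item) in kg_edges
                 or (item, recommended_item) in kg_edges)
    return linked / max(len(drift_items), 1)

def compound_reward(step_rewards, recommended_item, drift_items, kg_edges,
                    gamma=0.9, lambda_kg=0.5):
    """Combine both signals; lambda_kg trades off KG guidance."""
    return (discounted_sequence_reward(step_rewards, gamma)
            + lambda_kg * knowledge_graph_reward(recommended_item,
                                                 drift_items, kg_edges))
```

For example, with per-step feedback `[1, 0, 1]`, `gamma=0.9`, and a recommended item linked to one of two drift items, the reward is approximately 1 + 0.81 + 0.5 * 0.5 = 2.06.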
Key words: Adaptive Sequence; Deep Reinforcement Learning; Knowledge Graph; Self-Attention Mechanism; Recurrent Neural Network
Received: 2022-09-13
CLC Number: TP391
Fund: Supported by National Natural Science Foundation of China (No.62072392, 61972360) and Major Science and Technology Innovation Project of Shandong Province (No.2019522Y020131)
Corresponding author: TONG Xiangrong, Ph.D., professor. His research interests include computer science, intelligent information processing and social networks. E-mail: txr@ytu.edu.cn.
About the author: LI Yinggang, master student. His research interests include deep reinforcement learning and recommendation systems. E-mail: lyg565795678@163.com.
Cite this article:
LI Yinggang, TONG Xiangrong. Knowledge-Guided Adaptive Sequence Reinforcement Learning Model. Pattern Recognition and Artificial Intelligence, 2023, 36(2): 108-119.
Link to this article:
http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202302002 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2023/V36/I2/108
Copyright © Editorial Office of Pattern Recognition and Artificial Intelligence