Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2025, Vol. 38, Issue 1: 22-35    DOI: 10.16451/j.cnki.issn1003-6059.202501002
Section: Papers and Reports
Multi-hop Knowledge Reasoning Based on Adversarial Reinforcement Learning
CHENG Lingyun¹, GUO Yinzhang¹, LIU Qingfang¹
1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024

Abstract: Existing multi-hop reasoning models for knowledge graph question answering suffer from insufficient representation of complex relations, data sparsity, and spurious (false) paths in reinforcement learning based reasoning. To address these problems, a multi-hop knowledge reasoning model based on adversarial reinforcement learning is proposed. First, relation vectors are decomposed at high order so that entity and relation features can be combined in a parameterized way, and an attention mechanism assigns different weights to neighboring nodes during aggregation, which strengthens the representation of complex relations. A knowledge graph embedding framework is also designed to measure the credibility of triples of the form (topic entity, question, answer entity) in the embedding space. Second, multi-dimensional information is integrated into the state representation of the reinforcement learning framework, so that the agent still has a reliable basis for its decisions when data are sparse. The generator computes the probability of each candidate entity from the state information and generates an answer, while the discriminator evaluates the plausibility of both the answer and the reasoning path. Feedback is refined through soft rewards and path rewards to alleviate the false path problem, and adversarial training alternately optimizes the generator and the discriminator. Finally, the model is applied to a multi-hop question-answering system for cloud manufacturing product design knowledge to verify its effectiveness. Comparative experiments, ablation experiments, and case studies on multiple public datasets show that the proposed model performs better.
Keywords: Complex Relation Representation; Multi-hop Reasoning; Adversarial Reinforcement Learning; False Path
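The abstract names three mechanisms without giving formulas: attention-weighted neighbor aggregation, a generator that turns scores into probabilities over candidate entities, and a reward that combines a soft answer reward with a path reward. The sketch below illustrates one plausible reading of each, under stated assumptions. The function names, the `path_weight` parameter, and the exact reward form are hypothetical and are not taken from the paper; in particular, the soft reward here simply falls back to an embedding-based plausibility score when the agent misses the gold answer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(neighbor_vecs, attn_logits):
    """Weighted sum of neighbor vectors; weights are softmax(attn_logits)."""
    weights = softmax(attn_logits)
    dim = len(neighbor_vecs[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, neighbor_vecs))
        for i in range(dim)
    ]


def shaped_reward(reached, gold, plausibility, path_score, path_weight=0.5):
    """Hypothetical reward (an assumption, not the paper's formula).

    Answer part: 1.0 for a correct answer, otherwise a soft score in [0, 1]
    supplied by an embedding model. Path part: a discriminator score for the
    reasoning path, scaled by path_weight.
    """
    answer_reward = 1.0 if reached == gold else plausibility
    return answer_reward + path_weight * path_score


# Generator-style distribution over three candidate entities.
probs = softmax([2.0, 1.0, 0.1])

# Equal attention logits reduce to a plain average of the neighbors.
agg = attention_aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# A correct answer reached via a well-rated path versus a miss via a poor path.
r_hit = shaped_reward("e1", "e1", plausibility=0.3, path_score=0.8)
r_miss = shaped_reward("e2", "e1", plausibility=0.3, path_score=0.2)
```

In this reading, a path that reaches the right entity through a chain the discriminator rates poorly earns less than one it rates highly, which is one way a path reward could discourage false paths.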
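Received: 2024-11-04
CLC number: TP391.1
Funding: Supported by the Central Government Guided Local Science and Technology Development Fund Project (No. YDZJSX1A044), the Open Project Fund of the Shanxi Key Laboratory of Intelligent Information Processing (No. CICIP2023001), and the Shanxi Province Graduate Practice Innovation Project (No. 2024SJ320).
Corresponding author: GUO Yinzhang, Ph.D., professor. His research interests include crowd intelligence computing, cloud computing, and deep learning. E-mail: guoyinzhang@tyust.edu.cn
About the authors: CHENG Lingyun, master's student. Research interests include cloud computing and cloud security, and knowledge graphs. E-mail: s202220210949@stu.tyust.edu.cn. LIU Qingfang, master's student. Research interests include crowd intelligence computing, cloud computing, and deep learning. E-mail: s202220210951@stu.tyust.edu.cn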
Cite this article:
CHENG Lingyun, GUO Yinzhang, LIU Qingfang. Multi-hop Knowledge Reasoning Based on Adversarial Reinforcement Learning[J]. Pattern Recognition and Artificial Intelligence, 2025, 38(1): 22-35.
Link to this article:
http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202501002 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2025/V38/I1/22