Policy Gradient Algorithm in Two-Player Zero-Sum Markov Games
LI Yongqiang, ZHOU Jian, FENG Yu, FENG Yuanjing
Pattern Recognition and Artificial Intelligence (模式识别与人工智能). 2023, (1): 81-91. DOI: 10.16451/j.cnki.issn1003-6059.202301007