Pattern Recognition and Artificial Intelligence
  2016, Vol. 29 Issue (11): 961-968    DOI: 10.16451/j.cnki.issn1003-6059.201611001
Papers and Reports
Hybrid Heuristic Value Iteration POMDP Algorithm
LIU Feng
Software Institute, Nanjing University, Nanjing 210093
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093

Abstract  Point-based value iteration methods are a class of algorithms for efficiently solving partially observable Markov decision process (POMDP) models. However, most of these algorithms explore the belief point set with a single heuristic criterion, which limits their efficiency. This paper presents a hybrid heuristic value iteration algorithm (HHVI) for exploring the belief point set. HHVI maintains upper and lower bounds on the value function and expands only those belief points whose gap between the two bounds exceeds a threshold. Among the successor belief points whose gap also exceeds the threshold, it then explores the one furthest from the already explored point set. Convergence of HHVI is ensured by distributing the explored point set throughout the reachable belief space. Experimental results on four benchmarks show that HHVI maintains convergence efficiency while obtaining a better global optimal solution.
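The two selection criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the callables `upper` and `lower` standing in for the value function bounds, and the use of the L1 distance to measure how far a belief lies from the explored set are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, List, Optional, Sequence

Belief = Sequence[float]  # a probability distribution over states


def l1_distance(b1: Belief, b2: Belief) -> float:
    """L1 distance between two beliefs (the metric is an assumption)."""
    return sum(abs(x - y) for x, y in zip(b1, b2))


def distance_to_set(b: Belief, explored: List[Belief]) -> float:
    """Distance from b to its nearest neighbour in the explored set."""
    return min(l1_distance(b, e) for e in explored)


def select_next_belief(
    successors: List[Belief],
    explored: List[Belief],
    upper: Callable[[Belief], float],
    lower: Callable[[Belief], float],
    epsilon: float,
) -> Optional[Belief]:
    """Hypothetical hybrid selection step.

    Keep only successors whose bound gap exceeds epsilon, then return
    the one furthest from the explored set, or None if none qualifies.
    """
    candidates = [b for b in successors if upper(b) - lower(b) > epsilon]
    if not candidates:
        return None
    return max(candidates, key=lambda b: distance_to_set(b, explored))


# Toy two-state example with made-up bounds: the gap equals b[1].
explored = [(1.0, 0.0)]
successors = [(0.9, 0.1), (0.5, 0.5), (0.0, 1.0)]
chosen = select_next_belief(
    successors,
    explored,
    upper=lambda b: 1.0,
    lower=lambda b: 1.0 - b[1],
    epsilon=0.2,
)
print(chosen)
```

In this toy setup the first successor is filtered out by the gap threshold, and the remaining two are ranked by distance from the explored set. Combining the criteria this way means a distant belief is skipped when its bounds are already tight.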
Key words: Partially Observable Markov Decision Process (POMDP); Hybrid Heuristic Value Iteration; Reachable Belief Space; Exploration Value
Received: 04 May 2016     
ZTFLH: TP 319  
Fund: Supported by the General Project of the State Key Laboratory for Novel Software Technology (No.ZZKT2016B07)
About author: LIU Feng, born in 1976, Ph.D., lecturer. His research interests include intelligent planning and reinforcement learning.
Cite this article:   
LIU Feng. Hybrid Heuristic Value Iteration POMDP Algorithm[J]. Pattern Recognition and Artificial Intelligence, 2016, 29(11): 961-968.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201611001      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2016/V29/I11/961
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn