Task Scheduling Algorithm Based on Q-Learning and Programming for Sensor Nodes
WEI Zhenchun1,2, XU Xiangwei1, FENG Lin1,2, DING Bei1
1. School of Computer and Information, Hefei University of Technology, Hefei 230009; 2. Engineering Research Center of Safety Critical Industrial Measurement and Control Technology of Ministry of Education, Hefei University of Technology, Hefei 230009
|
|
Abstract To improve the learning policy and achieve better application performance of sensor nodes, a task scheduling algorithm based on Q-learning and programming (QP) for sensor nodes is proposed on the basis of the task model of data collection applications. Specifically, the basic learning elements, such as the state space, the delayed reward and the exploration-exploitation policy, are defined in QP. Moreover, according to the characteristics of wireless sensor networks (WSNs), a programming process based on an expiration mechanism and a priority mechanism is established to improve the learning policy by making full use of empirical knowledge. Experimental results show that QP is able to perform task scheduling dynamically according to the current WSN environment. Compared with other task scheduling algorithms, QP achieves better application performance with reasonable energy consumption.
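The abstract describes QP only at a high level. The sketch below shows one plausible reading of it, assuming a Dyna-style planning loop: real interactions drive ordinary Q-learning updates under an epsilon-greedy exploration-exploitation policy, while a "programming" step replays stored transitions, discarding stale samples (expiration mechanism) and replaying large-TD-error samples first (priority mechanism). All names and parameters here (TASKS, EXPIRY, program, the constants) are illustrative assumptions, not the authors' implementation.

import heapq
import random

# Hypothetical task set and parameters for illustration only (not from the paper).
TASKS = ["sense", "aggregate", "transmit", "sleep"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate
EXPIRY = 50                            # stored experience older than this is treated as expired

Q = {}        # Q[(state, task)] -> action value
model = {}    # model[(state, task)] -> (reward, next_state, time_stored)
pqueue = []   # min-heap of (-priority, state, task): largest TD error replayed first
clock = 0     # logical time used by the expiration mechanism

def q(s, a):
    return Q.get((s, a), 0.0)

def choose_task(state):
    """Exploration-exploitation policy: epsilon-greedy over the task set."""
    if random.random() < EPSILON:
        return random.choice(TASKS)
    return max(TASKS, key=lambda a: q(state, a))

def learn(state, task, reward, next_state):
    """One Q-learning update from a real interaction, then store the
    transition as empirical knowledge for the programming step."""
    global clock
    clock += 1
    td_error = reward + GAMMA * max(q(next_state, a) for a in TASKS) - q(state, task)
    Q[(state, task)] = q(state, task) + ALPHA * td_error
    model[(state, task)] = (reward, next_state, clock)
    # Priority mechanism: transitions with larger TD error are replayed first.
    heapq.heappush(pqueue, (-abs(td_error), state, task))

def program(n_steps=5):
    """Programming (planning) step: replay stored experience by priority,
    dropping samples that the expiration mechanism marks as stale."""
    steps = 0
    while pqueue and steps < n_steps:
        _, s, a = heapq.heappop(pqueue)
        if (s, a) not in model:
            continue
        r, s2, t = model[(s, a)]
        if clock - t > EXPIRY:  # expiration mechanism: discard outdated knowledge
            del model[(s, a)]
            continue
        td_error = r + GAMMA * max(q(s2, b) for b in TASKS) - q(s, a)
        Q[(s, a)] = q(s, a) + ALPHA * td_error
        steps += 1

Under this reading, a node would call choose_task and learn at each scheduling decision and run program during idle slots, so planning consumes spare processor cycles rather than radio time.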
|
Received: 12 April 2016
|
Fund: Supported by the National Natural Science Foundation of China (Nos. 61502142, 61370088) and the International Science & Technology Cooperation Program of China (No. 2014DFB10060)
About the authors: WEI Zhenchun, born in 1978, Ph.D., associate professor. His research interests include distributed control in wireless sensor networks and embedded systems. XU Xiangwei, born in 1990, master student. His research interests include wireless sensor networks and reinforcement learning. FENG Lin (corresponding author), born in 1979, Ph.D., senior engineer. Her research interests include vehicular ad hoc networks and wireless networks. DING Bei, born in 1991, master student. His research interests include wireless sensor networks and reinforcement learning.
|
|
|
|
|
|