Task Scheduling Algorithm Based on Q-Learning and Programming for Sensor Nodes<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201611008


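Summary: To improve the learning policy of sensor nodes and raise their application performance, a task model is built for data collection applications, and a task scheduling algorithm for sensor nodes based on Q-learning and programming is proposed. The algorithm defines the basic learning elements, including the state space, the delayed reward, and the exploration-exploitation policy. Based on the characteristics of wireless sensor networks (WSNs), a programming process built on a priority mechanism and an expiration mechanism is established, so that nodes can make effective use of empirical knowledge and improve the learning policy. Experiments show that the algorithm can schedule tasks dynamically according to the current WSN environment. Compared with other task scheduling algorithms, it consumes a reasonable amount of energy and achieves better application performance.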
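The basic elements named above (state space, delayed reward, exploration-exploitation policy) fit a standard tabular Q-learning loop. The Python sketch below is illustrative only. The task names, the string-valued states, the epsilon-greedy rule, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions, since the abstract does not give the paper's actual definitions.

```python
import random
from collections import defaultdict

# Hypothetical task set for a data-collection node; the abstract does not
# list the paper's actual tasks.
TASKS = ("sleep", "sample", "transmit", "receive")


class QScheduler:
    """Tabular Q-learning over (state, task) pairs (illustrative sketch)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor for delayed rewards (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.q = defaultdict(float)   # (state, task) -> estimated value
        self.rng = random.Random(seed)

    def choose_task(self, state):
        # Explore with probability epsilon, otherwise exploit the best-known task.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(TASKS)
        return max(TASKS, key=lambda t: self.q[(state, t)])

    def update(self, state, task, reward, next_state):
        # One-step Q-learning update; returns the temporal-difference error.
        best_next = max(self.q[(next_state, t)] for t in TASKS)
        td_error = reward + self.gamma * best_next - self.q[(state, task)]
        self.q[(state, task)] += self.alpha * td_error
        return td_error
```

A node would call `choose_task` at each scheduling step, execute the chosen task, observe the reward, and then call `update` with the resulting state.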



Abstract: To improve the learning policy of sensor nodes and obtain better application performance, a task scheduling algorithm based on Q-learning and programming (QP) is proposed, using a task model built for data collection applications. The basic learning elements, such as the state space, the delayed reward, and the exploration-exploitation policy, are defined in QP. According to the characteristics of wireless sensor networks (WSNs), a programming process based on an expiration mechanism and a priority mechanism is established to improve the learning policy by making full use of empirical knowledge. Experimental results show that QP can perform task scheduling dynamically according to the current WSN environment. Compared with other task scheduling algorithms, QP achieves better application performance with reasonable energy consumption.
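One way to picture a programming (planning) process with a priority mechanism and an expiration mechanism is a simplified Dyna-style replay of stored experience. This is only a sketch under stated assumptions. The abstract does not describe how priorities or expiry are computed, so the use of the absolute temporal-difference error as priority, the `max_age` expiry rule, the task names, and the class `PlanningModel` are all hypothetical. Unlike full prioritized sweeping, this version does not propagate updates to predecessor states.

```python
import heapq
from collections import defaultdict

TASKS = ("sleep", "sample", "transmit")  # hypothetical task set


class PlanningModel:
    """Replays stored experience by priority and discards stale entries."""

    def __init__(self, alpha=0.5, gamma=0.9, max_age=50, threshold=1e-3):
        self.alpha, self.gamma = alpha, gamma
        self.max_age = max_age        # assumed expiry horizon, in time steps
        self.threshold = threshold    # minimum priority worth queueing
        self.q = defaultdict(float)   # (state, task) -> estimated value
        self.model = {}               # (state, task) -> (reward, next_state, time)
        self.queue = []               # max-heap emulated with negated priorities

    def _td_error(self, state, task, reward, next_state):
        best_next = max(self.q[(next_state, t)] for t in TASKS)
        return reward + self.gamma * best_next - self.q[(state, task)]

    def record(self, state, task, reward, next_state, now):
        # Store the experience with a timestamp and queue it by |TD error|.
        # States are assumed to be orderable (e.g. strings) so heap ties resolve.
        self.model[(state, task)] = (reward, next_state, now)
        priority = abs(self._td_error(state, task, reward, next_state))
        if priority > self.threshold:
            heapq.heappush(self.queue, (-priority, state, task))

    def plan(self, now, steps=5):
        # Apply up to `steps` simulated updates, highest priority first.
        updates = 0
        while self.queue and updates < steps:
            _, state, task = heapq.heappop(self.queue)
            entry = self.model.get((state, task))
            if entry is None:
                continue
            reward, next_state, recorded = entry
            if now - recorded > self.max_age:
                del self.model[(state, task)]  # expired: too old to trust
                continue
            td = self._td_error(state, task, reward, next_state)
            self.q[(state, task)] += self.alpha * td
            updates += 1
        return updates
```

The expiry check reflects the intuition that experience gathered under old network conditions may no longer describe the current WSN environment, while the priority queue spends the limited planning budget on the entries expected to change the value estimates most.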

Key words: Wireless Sensor Network (WSN), Sensor Node, Task Scheduling, Q-Learning, Programming Process

Received: 2016-04-12

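Funding: Supported by the National Natural Science Foundation of China (No. 61502142, 61370088) and the International Science and Technology Cooperation Program of China (No. 2014DFB10060)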

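About the authors: WEI Zhenchun, male, born in 1978, Ph.D., associate professor. His research interests include wireless sensor networks, distributed control, and embedded systems. E-mail: weizc@hfut.edu.cn.
XU Xiangwei, male, born in 1990, master's student. His research interests include wireless sensor networks and reinforcement learning. E-mail: 1259148495@qq.com.
FENG Lin (corresponding author), female, born in 1979, Ph.D., senior engineer. Her research interests include vehicular ad hoc networks and wireless networks. E-mail: fenglin@hfut.edu.cn.
DING Bei, male, born in 1991, master's student. His research interests include wireless sensor networks and reinforcement learning. E-mail: 462618683@qq.com.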

Cite this article:

WEI Zhenchun, XU Xiangwei, FENG Lin, DING Bei. Task Scheduling Algorithm Based on Q-Learning and Programming for Sensor Nodes[J]. Pattern Recognition and Artificial Intelligence, 2016, 29(11): 1028-1036.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201611008 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2016/V29/I11/1028
