基于半马尔科夫决策过程的虚拟传感网络资源分配策略

王汝言; 李宏娟; 吴大鹏; 李红霞

doi:10.11999/JEIT190016

基于半马尔科夫决策过程的虚拟传感网络资源分配策略

doi: 10.11999/JEIT190016 cstr: 32379.14.JEIT190016

王汝言^{1, 2, 3},
李宏娟^{1, 2, 3, ,},
吴大鹏^{1, 2, 3},
李红霞⁴

1.
重庆邮电大学通信与信息工程学院重庆 400065
2.
重庆高校市级光通信与网络重点实验室重庆 400065
3.
泛在感知与互联重庆市重点实验室重庆 400065
4.
中国联合网络通信有限公司重庆市分公司重庆 401123

基金项目: 国家自然科学基金(61871062，61771082)，重庆市高校创新团队建设计划资助项目(CXTDX201601020)

详细信息

作者简介:
王汝言：男，1969年生，教授，博士，研究方向为泛在网络、多媒体信息处理等

李宏娟：女，1993年生，硕士生，研究方向为虚拟化、无线传感网络

吴大鹏：男，1979年生，教授，博士，研究方向为泛在无线网络、无线网络服务质量控制等

李红霞：女，1969年生，高级工程师，研究方向为光无线融合网络、无线传感网络

通讯作者:
李宏娟　ilihj@foxmail.com

中图分类号: TP393
计量
- 文章访问数: 2547
- HTML全文浏览量: 1310
- PDF下载量: 94
- 被引次数: 0
出版历程
- 收稿日期: 2019-01-07
- 修回日期: 2019-04-16
- 网络出版日期: 2019-05-22
- 刊出日期: 2019-12-01

Semi-Markov Decision Process-based Resource Allocation Strategy for Virtual Sensor Network

Ruyan WANG^{1, 2, 3},
Hongjuan LI^{1, 2, 3
, ,},
Dapeng WU^{1, 2, 3},
Hongxia LI⁴

1.
School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2.
Optical Communication and Network Key Laboratory of Chongqing, Chongqing 400065, China
3.
Key Laboratory of Ubiquitous Sensing and Networking in Chongqing, Chongqing 400065, China
4.
Chongqing Branch, China Unicom, Chongqing 401123, China

Funds: The National Natural Science Foundation of China (61871062, 61771082), The Chongqing Funded Project of Chongqing University Innovation Team Construction (CXTDX201601020)

摘要

摘要: 针对传统无线传感网络(WSN)中资源部署与特定任务的耦合关系密切，造成较低的资源利用率，进而给资源提供者带来较低的收益问题，根据虚拟传感网络请求(VSNR)的动态变化情况，该文提出虚拟传感网络(VSN)中基于半马尔科夫决策过程(SMDP)的资源分配策略。定义VSN的状态集、行为集、状态转移概率，考虑传感网能量受限以及完成VSNR的时间，给出奖赏函数的表达式，并使用免模型强化学习算法求解特定状态下的行为，从而最大化网络资源提供者的长期收益。数值结果表明，该文的资源分配策略能有效提高传感网资源提供者的收益。
- 虚拟传感网络 /
- 资源分配 /
- 半马尔科夫决策过程
Abstract: The close relationship between resource deployment and specific tasks in traditional Wireless Sensor Network(WSN) leads to low resource utilization and revenue. According to the dynamic changes of Virtual Sensor Network Request(VSNR), the resource allocation strategy based on Semi-Markov Decision Process(SMDP) is proposed in Virtual Sensor Network(VSN). Then, difining the state, action, and transition probability of the VSN, the expected reward is given by considering the energy and time to complete the VSNR, and the model-free reinforcement learning approach is used to maximize the long-term reward of the network resource provider. The numerical results show that the resource allocation strategy of this paper can effectively improve the revenue of the sensor network resource providers.
- Virtual Sensor Network(VSN) /
- Resource allocation /
- Semi-Markov Decision Process(SMDP)

HTML全文

图 1 不同${\lambda _{\rm{p}}}$的收益对比

下载: 全尺寸图片幻灯片

图 2 不同资源总量的收益对比图

下载: 全尺寸图片幻灯片

图 3 不同${\lambda _{\rm{p}}}$的收益对比图

下载: 全尺寸图片幻灯片

图 4 不同${\mu _{\rm{p}}}$的收益对比图

下载: 全尺寸图片幻灯片

图 5 不同${\lambda _{\rm{p}}}$的拒绝率

下载: 全尺寸图片幻灯片

表 1 仿真参数设置表

参数	数值	参数	数值
$K$	20～30	${\lambda _{\rm{p}}}$	1～18
${\omega _{\rm{e}}}$	0.5	${\omega _{\rm{d}}}$	0.5
${\beta _{\rm{e}}}$	2	${\beta _{\rm{d}}}$	2
${E_{\rm{l}}}$	20	${P_{\rm{l}}}$	5
${D_{\rm{l}}}$	20	$\delta $	2
$\gamma $	2	$\alpha $	0.1

下载: 导出CSV

参考文献(16)

YETGIN H, CHEUNG K T K, El-HAJJAR M, et al. A survey of network lifetime maximization techniques in wireless sensor networks[J]. IEEE Communications Surveys & Tutorials, 2017, 19(2): 828–854. doi: 10.1109/COMST.2017.2650979

WU Dapeng, ZHANG Feng, WANG Honggang, et al. Security-oriented opportunistic data forwarding in mobile social networks[J]. Future Generation Computer Systems, 2018, 87: 803–815. doi: 10.1016/j.future.2017.07.028

DELGADO C, CANALES M, ORTÍN J, et al. Joint application admission control and network slicing in virtual sensor networks[J]. IEEE Internet of Things Journal, 2018, 5(1): 28–43. doi: 10.1109/JIOT.2017.2769446

WU Dapeng, ZHANG Zhihao, WU Shaoen, et al. Biologically inspired resource allocation for network slices in 5G-enabled internet of things[J]. IEEE Internet of Things Journal, 2018. doi: 10.1109/JIOT.2018.2888543

GUO Lei, NING Zhaolong, SONG Qingyang, et al. A QoS-oriented high-efficiency resource allocation scheme in wireless multimedia sensor networks[J]. IEEE Sensors Journal, 2017, 17(5): 1538–1548. doi: 10.1109/JSEN.2016.2645709

ZHANG Yueyue, ZHU Yaping, YAN Feng, et al. Energy-efficient radio resource allocation in software-defined wireless sensor networks[J]. IET Communications, 2018, 12(3): 349–358. doi: 10.1049/iet-com.2017.0937

HASSAN M M and ALSANAD A. Resource provisioning for cloud-assisted software defined wireless sensor network[J]. IEEE Sensors Journal, 2016, 16(20): 7401–7408. doi: 10.1109/JSEN.2016.2582339

DELGADO C, GÁLLEGO J R, CANALES M, et al. On optimal resource allocation in virtual sensor networks[J]. Ad Hoc Networks, 2016, 50: 23–40. doi: 10.1016/j.adhoc.2016.04.004

WU Dapeng, LIU Qianru, WANG Honggang, et al. Cache less for more: Exploiting cooperative video caching and delivery in D2D communications[J]. IEEE Transactions on Multimedia, 2018. doi: 10.1109/TMM.2018.2885931

ZHENG Kan, MENG Hanlin, CHATZIMISIOS P, et al. An SMDP-based resource allocation in vehicular cloud computing systems[J]. IEEE Transactions on Industrial Electronics, 2015, 62(12): 7920–7928. doi: 10.1109/TIE.2015.2482119

SCHOLLIG A, CAINES P E, EGERSTEDT M, et al. A hybrid Bellman equation for systems with regional dynamics[C]. The 200746th IEEE Conference on Decision and Control, New Orleans, USA, 2007: 3393–3398. doi: 10.1109/CDC.2007.4434952.

GOSAVI A. Relative value iteration for average reward semi-Markov control via simulation[C]. 2013 Winter Simulations Conference, Washington, USA, 2013: 623–630. doi: 10.1109/WSC.2013.6721456.

WU Dapeng, SHI Hang, WANG Honggang, et al. A feature-based learning system for internet of things applications[J]. IEEE Internet of Things Journal, 2019, 6(2): 1928–1937. doi: 10.1109/JIOT.2018.2884485

CHEN Yueyun and JIA Cuixia. An improved call admission control scheme based on reinforcement learning for multimedia wireless networks[C]. 2009 International Conference on Wireless Networks and Information Systems, Shanghai, China, 2009: 322–325. doi: 10.1109/WNIS.2009.91.

ABUNDO M, DI VALERIO V, CARDELLINI V, et al. QoS-aware bidding strategies for VM spot instances: A reinforcement learning approach applied to periodic long running jobs[C]. 2015 IFIP/IEEE International Symposium on Integrated Network Management, Ottawa, Canada, 2015: 53–61. doi: 10.1109/INM.2015.7140276.

DARKEN C, CHANG J, and MOODY J. Learning rate schedules for faster stochastic gradient search[C]. 1992 IEEE Workshop on Neural Networks for Signal Processing II, Helsingoer, Denmark, 1992: 3–12. doi: 10.1109/NNSP.1992.253713.

施引文献

资源附件(0)

访问统计