变量名 | 变量符号 |
决策周期长度 | $\sigma $ |
决策周期 | $t$ |
边缘服务器数量 | $i$ |
车载用户位置 | ${\nu _t}$ |
边缘服务器位置 | ${\mu _m}$ |
路径损失参数 | $\delta $ |
路旁单元覆盖半径 | $r$ |
任务大小 | ${q_{\rm s}}$ |
任务最大容忍时延 | ${q_{\rm d}}$ |
传输功率 | ${P_{\rm s}}$ |
时延 | $T$ |
虚拟机所在位置 | ${ D}$ |

Citation: Jun PENG, Chenglong WANG, Fu JIANG, Xin GU, Yueyue MU, Weirong LIU. A Fast Deep Q-learning Network Edge Cloud Migration Strategy for Vehicular Service[J]. Journal of Electronics and Information Technology, doi: 10.11999/JEIT190612

一种车载服务的快速深度Q学习网络边云迁移策略
English
A Fast Deep Q-learning Network Edge Cloud Migration Strategy for Vehicular Service
-
-
[1]
ZHU Li, YU F R, WANG Yige, et al. Big data analytics in intelligent transportation systems: A survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(1): 383–398. doi: 10.1109/TITS.2018.2815678
-
[2]
D’OREY P M and FERREIRA M. ITS for sustainable mobility: A survey on applications and impact assessment tools[J]. IEEE Transactions on Intelligent Transportation Systems, 2014, 15(2): 477–493. doi: 10.1109/TITS.2013.2287257
-
[3]
彭军, 马东, 刘凯阳, 等. 基于LTE D2D技术的车联网通信架构与数据分发策略研究[J]. 通信学报, 2016, 37(7): 62–70. doi: 10.11959/j.issn.1000-436x.2016134
PENG Jun, MA Dong, LIU Kaiyang, et al. LTE D2D based vehicle networking communication architecture and data distributing strategy[J]. Journal on Communications, 2016, 37(7): 62–70. doi: 10.11959/j.issn.1000-436x.2016134 -
[4]
GAO Kai, Han Farong, Dong Pingping, et al. Connected vehicle as a mobile sensor for real time queue length at signalized intersections[J]. Sensors, 2019, 19(9): 2059. doi: 10.3390/s19092059
-
[5]
KONG Yue, ZHANG Yikun, WANG Yichuan, et al. Energy saving strategy for task migration based on genetic algorithm[C]. 2018 International Conference on Networking and Network Applications, Xi’an, China, 2018: 330–336.
-
[6]
CHEN Xianfu, ZHANG Honggang, WU C, et al. Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning[J]. IEEE Internet of Things Journal, 2019, 6(3): 4005–4018. doi: 10.1109/JIOT.2018.2876279
-
[7]
SAHA S, HASAN M S. Effective task migration to reduce execution time in mobile cloud computing[C]. The 23rd International Conference on Automation and Computing, Huddersfield, UK, 2017: 1-–5.
-
[8]
GONÇALVES D, VELASQUEZ K, CURADO M, et al. Proactive virtual machine migration in fog environments[C]. 2018 IEEE Symposium on Computers and Communications, Natal, Brazil, 2018: 742–745.
-
[9]
KIKUCHI J, WU C, JI Yusheng, et al. Mobile edge computing based VM migration for QoS improvement[C]. The 6th IEEE Global Conference on Consumer Electronics, Nagoya, Japan, 2017: 1–5.
-
[10]
CHOWDHURY M, STEINBACH E, KELLERER W, et al. Context-Aware task migration for HART-Centric collaboration over FiWi based tactile internet infrastructures[J]. IEEE Transactions on Parallel and Distributed Systems, 2018, 29(6): 1231–1246. doi: 10.1109/TPDS.2018.2791406
-
[11]
LU Wei, MENG Xianyu, and GUO Guanfei. Fast service migration method based on virtual machine technology for MEC[J]. IEEE Internet of Things Journal, 2019, 6(3): 4344–4354. doi: 10.1109/JIOT.2018.2884519
-
[12]
WANG Yanting, SHENG Min, WANG Xijun, et al. Mobile-edge computing: Partial computation offloading using dynamic voltage scaling[J]. IEEE Transactions on Communications, 2016, 64(10): 4268–4282. doi: 10.1109/TCOMM.2016.2599530
-
[13]
SUTTON R S and BARTO A G. Reinforcement Learning: An Introduction[M]. Cambridge: MIT Press, 1998: 25–42.
-
[14]
SNIA trace data[EB/OL]. http://iotta.snia.org/traces, 2018.
-
[1]
-
表 1 变量表
表 2 DQN-TP算法
算法1: DQN-TP算法 (1) Repeat: (2) 车载用户上传车载决策神经网络的经验$({X_t},{a_t},{U_t},{X_{t + 1}})$到经验回放池; (3) While $t \ne $最后一个周期do (4) 从经验回放池中随机抽取$n$个经验作为一个mini-batch; (5) 将${X_t},{a_t}$作为评估神经网络的输入获得${Q_{\pi} }({X_t},{a_t};\theta )$,将${X_{t + 1}}$作为决策神经网络的输入获得${Q_{\pi } }({X_{t + 1} },{a_{t + 1} };{\theta ^-})$; (6) 根据式(13)、式(14)训练神经网络; (7) End While (8) 每训练$c$次将云端的神经网络参数更新给车载神经网络$\theta \to {\theta ^{^\_}}$; (9) 车载用户使用$\varepsilon {\rm{ - }}$贪婪算法选择动作-状态值函数最高的动作作为车载用户动作执行; (10) End 表 3 仿真参数设定
参数名 参数符号 参数值 决策周期 $\sigma $ 10–3 s 边缘服务器数量 $i$ 10 路径损失参数 $\delta $ 1.5 带宽 $W$ 4 MHz 路旁单元覆盖半径 $r$ 500 m 效用函数参数 $k$ 1.3 效用函数参数 $b$ 0.1 记忆回放池最大存储数 $o$ 3000 Mini-batch大小 $n$ 500 参数更新间隔步长 $c$ 80 神经网络层数 无 4 神经元总数 无 100 -