Citation: LU Xianling, LI Dekang. Task Offloading Algorithm for Large-scale Multi-access Edge Computing Scenarios[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT240624

Task Offloading Algorithm for Large-scale Multi-access Edge Computing Scenarios

doi: 10.11999/JEIT240624
Funds:  The National Natural Science Foundation of China (61773181)
  • Received Date: 2024-07-18
  • Rev Recd Date: 2024-12-02
  • Available Online: 2024-12-09
Objective   Task offloading techniques based on reinforcement learning in Multi-access Edge Computing (MEC) have recently attracted considerable attention and are increasingly used in industrial applications. Task offloading algorithms based on single-agent reinforcement learning are typically designed within a decentralized framework, preferred for its relatively low computational complexity. In large-scale MEC environments, however, each agent forms its offloading policy solely from local observations, which creates a partial observability problem: agents interfere with one another and their offloading policies degrade. Traditional multi-agent reinforcement learning algorithms such as Multi-Agent Deep Deterministic Policy Gradient (MADDPG) concatenate the observation and action vectors of all agents, which resolves the partial observability issue, and optimal joint offloading policies are then derived through online training. Under MADDPG's centralized-training, decentralized-execution paradigm, however, computational complexity grows linearly with the number of Mobile Devices (MDs). This limits how many devices an MEC system can accommodate and undermines its overall scalability.

Methods   First, a task offloading queue model for large-scale MEC systems is developed to handle delay-sensitive tasks with deadlines. The model covers both the transmission process, in which tasks are offloaded over wireless channels to the edge server, and the computation process, in which tasks are executed on the edge server. Second, the offloading process is formulated as a Partially Observable Markov Decision Process (POMDP) with a specified observation space, action space, and reward function for the agents, and the Mean-Field Multi-Agent Task Offloading (MF-MATO) algorithm is proposed. Long Short-Term Memory (LSTM) networks predict the current state vector of the MEC system from historical observation vectors, and the predicted state vector is fed into fully connected networks to determine the task offloading policy; the LSTM networks thereby address the partial observability agents face when making offloading decisions. Moreover, mean field theory is used to approximate the Q-value function of MADDPG through linear decomposition, yielding an approximate Q-value function and a mean-field action approximation that replaces the joint action of the agents. The MF-MATO algorithm interacts with the MEC environment to gather experience over one episode, stores it in an experience replay buffer, and after each episode samples experiences from the buffer to train both the policy network and the Q-value network. Illustrative sketches of the queue model, the mean-field decomposition, and the network structure follow below.
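To make the two-stage queue model concrete, the following is a minimal sketch under assumed parameters; all names and units here are illustrative rather than the paper's notation, and the per-slot queue dynamics of the actual model are omitted. An offloaded task first waits in the MD's transmission queue, then in the edge server's computation queue, and is dropped if its deadline expires:

```python
# Minimal sketch of the transmission-plus-computation delay for one offloaded
# task (assumed parameterization, not the paper's model).
def offloading_delay(size_bits: float, cycles: float, rate_bps: float,
                     cpu_hz: float, tx_backlog_s: float, cpu_backlog_s: float,
                     deadline_s: float):
    """Return the end-to-end delay in seconds, or None if the task is dropped."""
    tx_delay = tx_backlog_s + size_bits / rate_bps   # wireless transmission stage
    cpu_delay = cpu_backlog_s + cycles / cpu_hz      # edge computation stage
    total = tx_delay + cpu_delay
    return total if total <= deadline_s else None    # deadline miss => drop

# Example: a 1 Mbit task needing 1 Gcycle, sent over a 10 Mbit/s channel to a
# 5 GHz server, with 0.05 s of queued work at each stage and a 0.5 s deadline.
print(offloading_delay(1e6, 1e9, 1e7, 5e9, 0.05, 0.05, 0.5))  # 0.4 -> accepted
```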
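The linear decomposition invoked above follows the standard mean-field multi-agent reinforcement learning formulation; the abstract does not give the exact equations, so the following is a reconstruction rather than the paper's own notation. Agent $i$'s Q-value over the joint action $\boldsymbol{a}$ is decomposed into pairwise terms over its neighbor set $\mathcal{N}(i)$ and then approximated through the neighbors' mean action $\bar{a}_i$:

$$ Q_i(s,\boldsymbol{a}) \approx \frac{1}{|\mathcal{N}(i)|}\sum_{j\in\mathcal{N}(i)} Q_i\big(s,a_i,a_j\big) \approx Q_i\big(s,a_i,\bar{a}_i\big), \qquad \bar{a}_i = \frac{1}{|\mathcal{N}(i)|}\sum_{j\in\mathcal{N}(i)} a_j. $$

Because the critic now conditions on $(s, a_i, \bar{a}_i)$ instead of the full joint action, its input dimension, and hence the training complexity, no longer grows with the number of MDs.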
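A hedged PyTorch sketch of the network structure described above: an LSTM encodes the history of local observations into a predicted state vector, fully connected layers map it to an offloading policy, and the Q-network takes the agent's own action plus the mean action rather than the joint action. Layer sizes, the softmax policy head, and all identifiers are assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """LSTM over the observation history, then fully connected layers that
    output a probability over offloading actions (local vs. edge servers)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        # obs_history: (batch, T, obs_dim) -- the agent's past local observations.
        _, (h, _) = self.lstm(obs_history)            # h: (1, batch, hidden)
        return torch.softmax(self.head(h[-1]), dim=-1)

class MeanFieldCritic(nn.Module):
    """Q(s, a_i, mean_a): the joint action of the other agents is replaced by
    their mean action, so the input size does not grow with the number of MDs."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.q = nn.Sequential(
            nn.Linear(hidden + 2 * n_actions, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs_history, own_action, mean_act):
        _, (h, _) = self.lstm(obs_history)
        return self.q(torch.cat([h[-1], own_action, mean_act], dim=-1))

def mean_action(actions_onehot: torch.Tensor) -> torch.Tensor:
    # Average the one-hot actions of the neighboring agents: (n_agents, n_actions) -> (n_actions,)
    return actions_onehot.mean(dim=0)

if __name__ == "__main__":
    obs_dim, n_actions, n_agents, T = 8, 4, 50, 10
    actor, critic = Actor(obs_dim, n_actions), MeanFieldCritic(obs_dim, n_actions)
    history = torch.randn(1, T, obs_dim)              # one agent's observation history
    pi = actor(history)                               # offloading action probabilities
    others = torch.eye(n_actions)[torch.randint(n_actions, (n_agents - 1,))]
    a_bar = mean_action(others).unsqueeze(0)          # mean action of the other agents
    q = critic(history, pi, a_bar)                    # Q(s, a_i, a_bar)
    print(pi.shape, a_bar.shape, q.shape)
```

Note that the critic's input stays fixed at `hidden + 2 * n_actions` however many agents participate, which is the source of the scalability claim; in full MADDPG the corresponding input would grow with every added MD.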
Results and Discussions   Simulation results indicate that the average cumulative reward of the MF-MATO algorithm is comparable to that of MADDPG and exceeds those of the other comparison algorithms during training. (1) The task offloading delay curves of the MDs under MF-MATO and MADDPG decline in step throughout training, and after convergence their delays remain consistently lower than those of the single-agent task offloading algorithm. By contrast, the average delay curve of the single-agent algorithm varies markedly across scenarios with different numbers of MDs; because it cannot handle mutual interference among agents, the policies of some agents degrade under the influence of others. (2) As the number of MDs increases, the delay and task drop rate of MF-MATO align ever more closely with those of MADDPG while outperforming all other comparison algorithms, since the mean-field approximation becomes more accurate as the number of MDs grows. (3) As the task arrival probability rises, the average delay and task drop rate curves of both MF-MATO and MADDPG increase gradually; when the task arrival probability reaches its maximum, both metrics rise sharply for all algorithms, because the heavy task load saturates the available computational resources. (4) As the number of edge servers increases, the average delay and task drop rate curves of MF-MATO and MADDPG decline gradually, whereas the other comparison algorithms improve markedly as soon as computational resources increase even slightly; this suggests that MF-MATO and MADDPG already use the computational resources efficiently through cooperative decision-making among agents. Overall, the simulations confirm that, while reducing computational complexity, the MF-MATO algorithm matches MADDPG in both delay and task drop rate.

Conclusions   The proposed task offloading algorithm, built on LSTM networks and mean-field approximation theory, effectively addresses task offloading in large-scale MEC scenarios. The LSTM networks alleviate the partial observability suffered by single-agent approaches, improve the efficiency of experience utilization in the multi-agent system, and accelerate convergence. Mean-field approximation theory reduces the dimensionality of the multi-agent action space, avoiding the computational complexity of traditional MADDPG, which grows linearly with the number of MDs. As a result, the computational complexity of MF-MATO is independent of the number of MDs, significantly improving the scalability of large-scale MEC systems.
