LIN Peng, WANG Jun, LIU Yan, ZHANG Zhizhong. Multi-dimensional Performance Adaptive Content Caching in Mobile Networks Based on Meta Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2025, 47(8): 2598-2607. doi: 10.11999/JEIT250100
1. Key Laboratory of Intelligent Support Technology for Complex Environments, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing 210044, China
2. School of Future Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
3. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
Funds: The National Natural Science Foundation of China (62201271, 62303232), the China Postdoctoral Science Foundation (2023M741781), the Natural Science Foundation of Jiangsu Province of China (BK20220441), and the Foundation of the Key Laboratory of Intelligent Support Technology for Complex Environments, Ministry of Education, Nanjing University of Information Science and Technology (NUIST-IST-JJ-2024-004)
Objective: Content caching enhances the efficiency of video services in mobile networks. However, most existing studies optimize caching strategies for a single performance objective, overlooking the combined effect of caching decisions on key metrics such as content delivery latency, cache hit rate, and redundancy rate. An effective caching strategy must satisfy multiple performance requirements simultaneously and adapt to their dynamic changes over time. This study addresses these limitations by investigating the joint optimization of content delivery latency, cache hit rate, and redundancy rate. To capture the interdependencies and temporal variations among these metrics, a meta-reinforcement-learning-based caching decision algorithm is proposed. Built on conventional reinforcement learning frameworks, the proposed method enables adaptive optimization across multiple performance dimensions, supporting a dynamic and balanced content caching strategy.

Methods: To address the multi-dimensional objectives of content caching, namely content delivery latency, cache hit rate, and redundancy rate, this study proposes a performance-aware adaptive caching strategy. Given the uncertainty and temporal variability of the interrelationships among performance metrics in real-world environments, dynamic correlation parameters are introduced to model the evolving behavior of these metrics. The caching problem is formulated as a dynamic joint optimization task involving delivery latency efficiency, cache hit rate, and a cache redundancy index. This problem is further modeled as a Markov Decision Process (MDP), where the state comprises the content popularity distribution and the caching state of the previous time slot, the action represents the caching decision at the current time slot, and the reward is a cumulative metric that weights latency, hit rate, and redundancy by the dynamic correlation parameters. To solve the MDP, a Model-Agnostic Meta-Learning-based Deep Deterministic Policy Gradient algorithm (MAML-DDPG) is proposed. This algorithm reformulates the joint optimization task as a multi-task reinforcement learning problem, enabling adaptation to dynamically changing optimization targets and improving decision-making efficiency.
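To make the reward structure concrete, the following is a minimal Python sketch of a per-slot reward that weights latency efficiency, hit rate, and redundancy by the dynamic correlation parameters. The specific metric formulas (slot_reward, the edge/core delay model, and the replica-based redundancy index) are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def slot_reward(cache, requests, d_edge, d_core, weights):
    """Hedged sketch of a per-slot caching reward.

    cache    : (num_nodes, num_contents) 0/1 matrix of caching decisions
    requests : (num_nodes, num_contents) request counts in the current slot
    d_edge   : delay of serving a request from the local edge cache
    d_core   : delay of fetching a missed content from the core network
    weights  : (w_lat, w_hit, w_red), the dynamic correlation parameters
    """
    w_lat, w_hit, w_red = weights

    hits = cache * requests                     # requests served locally
    total = requests.sum()
    hit_rate = hits.sum() / max(total, 1)       # cache hit rate in [0, 1]

    # Latency efficiency: hits incur the edge delay, misses the core delay,
    # normalized so that an all-hit slot yields efficiency 1.0.
    delay = hits.sum() * d_edge + (total - hits.sum()) * d_core
    lat_eff = (total * d_edge) / max(delay, 1e-9)

    # Redundancy index: fraction of cached copies that are extra replicas.
    copies = cache.sum(axis=0)
    redundancy = np.clip(copies - 1, 0, None).sum() / max(cache.sum(), 1)

    # Redundancy is penalized, the other two metrics are rewarded.
    return w_lat * lat_eff + w_hit * hit_rate - w_red * redundancy

# Example usage with 2 edge nodes and 3 contents (hypothetical numbers):
cache = np.array([[1, 0, 1], [1, 1, 0]])
req = np.array([[5, 2, 0], [3, 0, 4]])
r = slot_reward(cache, req, d_edge=5.0, d_core=50.0, weights=(0.4, 0.4, 0.2))
```

Changing the weights tuple over training slots is one simple way to emulate the time-varying correlation parameters described above.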
Results and Discussions: This study compares the performance of MAML-DDPG with baseline algorithms under a gradually changing Zipf popularity parameter (0.5 to 1.5). Results show that MAML-DDPG maintains more stable system performance throughout the change, indicating superior adaptability. The algorithm's response to abrupt shifts in optimization objectives is further evaluated by modifying the weight parameters during training. Specifically, the experiments compare DDPG, $\mathrm{DDPG}|_{100}$, $\mathrm{MAML\text{-}DDPG}|_{100}$, and $\mathrm{MAML\text{-}DDPG}|_{150}$, where $\mathrm{DDPG}|_{100}$ denotes a change in the weight parameters at the 100th training cycle to simulate a task mutation. Results show that the DDPG model exhibits a sharp drop in convergence value following the change and stabilizes at a lower performance level. In contrast, MAML-DDPG, although initially affected by the shift, recovers rapidly owing to its meta-learning capability and ultimately converges to a higher-performing caching strategy.

Conclusions: This study addresses the content caching problem in mobile edge networks by formulating it as a joint optimization task involving cache hit rate, cache redundancy index, and delivery latency efficiency. To handle the dynamic uncertainty associated with these performance metrics, the MAML-DDPG algorithm is proposed. The algorithm enables rapid adaptation to changing optimization targets, improving decision-making efficiency. Simulation results confirm that MAML-DDPG adapts effectively to dynamic performance objectives and outperforms existing methods across multiple caching metrics. The findings demonstrate the algorithm's capability to meet evolving performance requirements while maintaining strong overall performance.
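As an illustration of the adaptation mechanism discussed above, the following is a minimal Python sketch of a meta-training loop wrapped around DDPG updates. The agent interface (get_params, set_params, update) and the task list are hypothetical placeholders, and a Reptile-style first-order outer update is used in place of the full second-order MAML gradient for brevity; the paper's MAML-DDPG implementation may differ.

```python
import copy
import random

def maml_ddpg_meta_train(agent, tasks, meta_steps=200, inner_steps=5, meta_lr=0.1):
    """First-order meta-training sketch for a DDPG caching agent.

    agent : assumed to expose get_params() / set_params() over a dict of
            numpy arrays, and update(task) performing one DDPG step
    tasks : caching tasks, each defined by different dynamic correlation
            parameters (e.g. different (w_lat, w_hit, w_red) in the reward)
    """
    for _ in range(meta_steps):
        theta = agent.get_params()          # current meta-initialization
        task = random.choice(tasks)         # sample one caching task

        # Inner loop: adapt a copy of the agent to the sampled task
        # with a few ordinary DDPG updates.
        fast_agent = copy.deepcopy(agent)
        for _ in range(inner_steps):
            fast_agent.update(task)
        phi = fast_agent.get_params()       # task-adapted parameters

        # Outer loop (Reptile-style first-order update): move the
        # meta-parameters toward the task-adapted parameters.
        agent.set_params({k: theta[k] + meta_lr * (phi[k] - theta[k])
                          for k in theta})
    return agent
```

Under this scheme, a weight change such as the task mutation at the 100th training cycle corresponds to sampling a new task; a few inner-loop updates then re-specialize the meta-initialization, which matches the rapid recovery observed for MAML-DDPG in the experiments.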