Research on Federated Unlearning Approach Based on Adaptive Model Pruning
-
Abstract: With the exponential growth in the number of Internet of Things (IoT) devices and the successive enforcement of regulations such as the Personal Information Protection Law, federated unlearning has become a key technology for safeguarding the "right to be forgotten" in Edge Computing (EC). However, edge nodes exhibit pervasive resource heterogeneity, i.e., differences in computing capability, storage capacity, and network bandwidth, which causes class-level unlearning methods (forgetting the training data of one class) that rely on a globally uniform pruning strategy to suffer degraded training efficiency. To address this challenge, this paper proposes a Federated unlearning framework based on Adaptive Model Pruning (FunAMP) for the class-level unlearning scenario, which improves training efficiency by reducing the waiting time between nodes. First, a quantitative relationship among model training time, node resources, and the model pruning ratio is established, based on which the adaptive model pruning problem is formally defined. Then, a greedy pruning-ratio decision algorithm is designed to assign each node an appropriate pruning ratio according to its computation and communication resources, and the approximation ratio of the algorithm is analyzed to provide a theoretical performance guarantee. Next, a Term Frequency-Inverse Document Frequency (TF-IDF) based relevance metric is established to measure the relationship between model parameters and target-class data; guided by this metric and the assigned pruning ratio, the parameters related to the target-class data are removed, thereby forgetting the target-class data while minimizing model training time. Experimental results show that, at the same accuracy, FunAMP achieves up to 11.8× speedup over existing methods.
Abstract:
Objective  The rapid proliferation of Internet of Things (IoT) devices and the enforcement of data privacy regulations, including the General Data Protection Regulation (GDPR) and the Personal Information Protection Act, have positioned Federated Unlearning (FU) as a critical mechanism to safeguard the "right to be forgotten" in Edge Computing (EC). Existing class-level unlearning approaches often adopt uniform model pruning strategies. However, because edge nodes vary substantially in computational capacity, storage, and network bandwidth, these methods suffer from efficiency degradation, leading to imbalanced training delays and decreased resource utilization. This study proposes FU with Adaptive Model Pruning (FunAMP), a framework that minimizes training time while reliably eliminating the influence of target-class data. FunAMP dynamically assigns pruning ratios according to node resources and incorporates a parameter correlation metric to guide pruning decisions. In doing so, it addresses the challenge of resource heterogeneity while preserving compliance with privacy regulations.
Methods  The proposed framework establishes a quantitative relationship among model training time, node resources, and pruning ratios, on the basis of which an optimization problem is formulated to minimize overall training time. To address this problem, a greedy algorithm (Algorithm 2) is designed to adaptively assign appropriate pruning ratios to each node. The algorithm discretizes the pruning ratio space and applies a binary search strategy to balance computation and communication delays across nodes. Additionally, a Term Frequency–Inverse Document Frequency (TF–IDF)-based metric is introduced to evaluate the correlation between model parameters and the target-class data. For each parameter, the TF score reflects its activation contribution to the target class, whereas the IDF score measures its specificity across all classes.
Parameters with high TF–IDF scores are iteratively pruned until the assigned pruning ratio is satisfied, thereby ensuring the effective removal of target-class data.
Results and Discussions  Simulation results confirm the effectiveness of FunAMP in balancing training efficiency and unlearning performance under resource heterogeneity. Fig. 2 shows the effect of pruning granularity on model accuracy: fine granularity (e.g., 0.01) preserves model integrity, whereas coarse settings degrade accuracy because of excessive parameter removal. Under a fixed training time, FunAMP consistently achieves higher accuracy than FunUMP and Retrain (Fig. 3), as adaptive pruning ratios reduce inter-node waiting delays. For instance, FunAMP attains 76.48% accuracy on LeNet and 83.60% on AlexNet with FMNIST, outperforming the baseline methods by 5.91% and 4.44%, respectively. The TF–IDF-driven pruning mechanism fully removes the contributions of target-class data, achieving 0.00% accuracy on the target data while maintaining competitive performance on the remaining data (Table 2). Robustness under varying heterogeneity levels is further verified (Fig. 4). Compared with the baselines, FunAMP markedly reduces the training time required to reach a predefined accuracy and delivers up to 11.8× speedup across four models. These results demonstrate FunAMP's capability to harmonize resource utilization, preserve model performance, and ensure unlearning efficacy in heterogeneous edge environments.
Conclusions  To mitigate the training inefficiency caused by resource heterogeneity in FU, this study proposes FunAMP, a framework that integrates adaptive pruning with parameter relevance analysis. A system model is constructed to formalize the relationship among node resources, pruning ratios, and training time. A greedy algorithm dynamically assigns pruning ratios to edge nodes, thereby minimizing global training time while balancing computational and communication delays.
Furthermore, a TF–IDF-driven metric quantifies the correlation between model parameters and target-class data, enabling the selective removal of critical parameters to erase target-class contributions. Theoretical analysis verifies the stability and reliability of the framework, while empirical results demonstrate that FunAMP achieves complete removal of target-class data and sustains competitive accuracy on the remaining classes. This work is limited to single-class unlearning, and extending the approach to scenarios requiring the simultaneous removal of multiple classes remains an important direction for future research. -
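As a concrete illustration of the adaptive pruning-ratio assignment described above, the following sketch discretizes the ratio space and gives each node the smallest ratio that meets the best achievable common round time. The cost model `per_node_time`, its linear dependence on (1 − α), and the cap `alpha_max` are illustrative assumptions, not the paper's Eqs. (6)–(8); because per-node time is monotone in the ratio here, computing the deadline directly is equivalent to the binary search used in Algorithm 2.

```python
import math

def per_node_time(alpha, c_i, b_i, work=1.0, size=1.0):
    # Hypothetical cost model (an assumption, not the paper's Eqs. (6)-(8)):
    # pruning a fraction alpha of the model shrinks both the local compute
    # load (work / c_i) and the model transfer time (size / b_i) in proportion.
    return (1.0 - alpha) * (work / c_i + size / b_i)

def assign_pruning_ratios(C, B, theta=0.1, alpha_max=0.5):
    """Give each node the smallest pruning ratio that meets the best
    achievable common round time (spirit of Algorithm 2)."""
    # Discretize the ratio space Phi_theta = {z * theta}, capped so the
    # model keeps some capacity (the cap is an illustrative assumption).
    k = math.ceil((1.0 - theta) / theta)
    ratios = [z * theta for z in range(1, k + 1) if z * theta <= alpha_max]
    times = [[per_node_time(a, c, b) for a in ratios] for c, b in zip(C, B)]
    # Per-node time decreases as the ratio grows, so the tightest common
    # deadline is the slowest node's time at maximum pruning.
    t_star = max(row[-1] for row in times)
    assigned = []
    for row in times:
        # Smallest ratio whose round time already meets the deadline.
        z = next(i for i, t in enumerate(row) if t <= t_star + 1e-12)
        assigned.append(ratios[z])
    return assigned, t_star
```

For two nodes where the second has twice the compute and bandwidth of the first, the slower node receives the maximum ratio and the faster node the minimum one, so neither waits long for the other at the end of a round.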
2 Greedy-strategy-based pruning-ratio decision algorithm
Input: pruning-ratio adjustment granularity $ \theta \in (0,1) $; computing capability $ {C_i} $ and communication bandwidth $ {B_i} $ of each node
Output: per-round pruning ratio $ \alpha _i^e \in (0,1) $ of each node
(1) According to $ \theta $, discretize the pruning-ratio interval into $ {\varPhi _\theta } = \{ {\alpha _z} = z \cdot \theta \mid z = 1,2, \cdots ,k\} $, where $ k = \left\lceil {(1 - \theta )/\theta } \right\rceil $
(2) For each node $ i $, compute its time $ T_{i,z}^e $ under pruning ratio $ {\alpha _z} $ according to Eqs. (6)–(8)
(3) Initialize the global lower and upper time bounds as $ T_{{\text{low}}}^e = \min_i T_{i,1}^e $ and $ T_{{\text{high}}}^e = \max_i T_{i,z}^e $
(4) While $ T_{{\text{high}}}^e \gt T_{{\text{low}}}^e $ do:
(5)  Let $ T_{{\text{mid}}}^e = (T_{{\text{high}}}^e + T_{{\text{low}}}^e)/2 $
(6)  For each node $ i $, select the largest $ {z_i} $ such that $ T_{i,{z_i}}^e \ge T_{{\text{mid}}}^e $
(7)  If a feasible solution exists (every node can find such a $ {z_i} $), update $ T_{{\text{high}}}^e = T_{{\text{mid}}}^e $; otherwise update $ T_{{\text{low}}}^e = T_{{\text{mid}}}^e $
(8) End While
(9) Denote the resulting feasible $ T_{{\text{high}}}^e $ by $ {T^*} $
(10) For each node $ i $, select $ {z_i} = \arg\max_z \{ T_{i,z}^e \ge {T^*}\} $
(11) Return the pruning ratios $ \{ {\alpha _{{z_1}}},{\alpha _{{z_2}}}, \cdots ,{\alpha _{{z_N}}}\} $ of all nodes
3 TF-IDF-based adaptive model pruning algorithm
Input: pruning ratio $ \alpha _i^e $; model parameters to be pruned $ {{\boldsymbol{w}}^e} $; target class $ l $
Output: pruned model $ {{\boldsymbol{\tilde w}}^e} $
(1) For $ m = 1 $ to $ M $
(2)  Compute the TF-IDF scores between the parameters of layer $ m $ and target class $ l $ according to Eqs. (10)–(12)
(3) End For
(4) Initialize the cumulative pruning ratio $ r = 0 $
(5) While $ r \lt \alpha _i^e $ do:
(6)  Remove, among the remaining parameters, those most relevant to target class $ l $
(7)  Update the cumulative pruning ratio $ r $
(8) End While
(9) Return the pruned model $ {{\boldsymbol{\tilde w}}^e} $
Table 1 Accuracy and unlearning effect of different methods within a given training time
| Method | LeNet (FMNIST) Accuracy / Unlearning effect | AlexNet (FMNIST) Accuracy / Unlearning effect | VGG11 (CIFAR-100) Accuracy / Unlearning effect | VGG16 (CIFAR-100) Accuracy / Unlearning effect |
| --- | --- | --- | --- | --- |
| FunAMP | 76.48% / 0.00% | 83.60% / 0.00% | 46.93% / 0.00% | 55.69% / 0.00% |
| FunUMP | 70.55% / 0.00% | 79.16% / 0.00% | 43.50% / 0.00% | 54.02% / 0.00% |
| ReTrain | 60.48% / 0.00% | 72.68% / 0.00% | 39.82% / 0.00% | 44.95% / 0.00% |
Note: bold in the original indicates the highest accuracy among all schemes.
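To make the TF-IDF relevance metric above concrete, the sketch below scores each channel of a layer by the share of its activation mass on the target class (TF) and by how class-specific its firing is (IDF), then prunes the highest-scoring channels until the assigned ratio is met. The activation matrix `act`, the 0.5 × peak firing threshold, and the exact score form are illustrative assumptions; the paper's Eqs. (10)–(12) are not reproduced here.

```python
import math

def tfidf_scores(act, target):
    """TF-IDF-style relevance of each channel to class `target`.

    act[c][j] is the mean activation of channel j on class-c data (an
    assumed input format, standing in for the paper's Eqs. (10)-(12)).
    """
    n_classes, n_channels = len(act), len(act[0])
    scores = []
    for j in range(n_channels):
        col = [act[c][j] for c in range(n_classes)]
        # TF: share of this channel's activation mass on the target class.
        tf = col[target] / (sum(col) + 1e-12)
        # IDF: penalize channels that fire strongly on many classes; the
        # 0.5 * peak firing threshold is an assumed heuristic.
        peak = max(col)
        df = sum(1 for v in col if v >= 0.5 * peak)
        idf = math.log(n_classes / (1.0 + df))
        scores.append(tf * idf)
    return scores

def prune_by_tfidf(act, target, ratio):
    """Indices of channels to remove: highest TF-IDF score first, until
    the assigned pruning ratio is met (spirit of Algorithm 3)."""
    scores = tfidf_scores(act, target)
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    n_prune = math.ceil(ratio * len(scores))
    return set(order[:n_prune])
```

With target class 0 and a channel that fires almost exclusively on class-0 inputs, that channel receives the highest score and is pruned first, while channels that fire uniformly across classes score low and are retained.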
-