WANG Yumeng, LIU Zhenbing, LIU Zaiyi. Privacy-Preserving Federated Weakly-Supervised Learning for Cancer Subtyping on Histopathology Images[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250842

Privacy-Preserving Federated Weakly-Supervised Learning for Cancer Subtyping on Histopathology Images

doi: 10.11999/JEIT250842 cstr: 32379.14.JEIT250842
Funds:  The National Natural Science Foundation of China (82272075, U22A20345), Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application (2022B1212010011)
  • Accepted Date: 2025-11-17
  • Rev Recd Date: 2025-11-17
  • Available Online: 2025-11-25
Objective: Data-driven deep learning methods have demonstrated superior performance in computational pathology. However, developing robust and accurate models often requires a large amount of training data with fine-grained annotations, which incurs high annotation costs for gigapixel whole slide images (WSIs) in histopathology. Healthcare data typically exists in "data silos", and the complex data-sharing process may raise privacy concerns. Federated Learning (FL) is a promising approach that trains a global model from data spread across numerous medical centers without exchanging the data itself. In traditional FL algorithms, however, the inherent data heterogeneity across medical centers significantly degrades the performance of the global model.

Methods: To address these challenges, this work proposes a privacy-preserving FL method for gigapixel WSIs in computational pathology. The method integrates weakly supervised attention-based multiple instance learning (MIL) with differential privacy techniques. On each client, a multi-scale attention-based MIL method is used for local training on histopathology WSIs with only slide-level labels available. This weakly supervised setting avoids the high cost of pixel-level annotation for histopathology WSIs. In the federated model update phase, local differential privacy is used to further reduce the risk of sensitive data leakage. Specifically, random noise that follows a Gaussian or Laplace distribution is added to the model parameters after local training on each client. Furthermore, a novel federated adaptive reweighting strategy is adopted to handle the heterogeneity of pathological images across clients. This strategy dynamically balances the contributions of the quantity and the quality of local data to each client's weight.

Results and Discussions: The proposed FL framework is evaluated on two clinical diagnostic tasks: Non-small Cell Lung Cancer (NSCLC) histologic subtyping and Breast Invasive Carcinoma (BRCA) histologic subtyping. As shown in Table 1, Table 2, and Fig. 4, the proposed FL method (Ours with DP and Ours w/o DP) exhibits superior accuracy and generalization compared with both localized models and other FL methods. Notably, its classification performance remains competitive even when compared with the centralized model (Fig. 3). These results demonstrate that privacy-preserving FL is a feasible and effective method for multicenter histopathology images, and that it may also mitigate the performance degradation typically caused by data heterogeneity across centers. When the intensity of the added noise is kept within a limited range, the model still achieves stable classification (Table 3). The two key components (i.e., the multi-scale representation attention network and the federated adaptive reweighting strategy) are each shown to contribute to consistent performance improvement (Table 4). In addition, the proposed FL method maintains stable classification performance across different hyperparameter settings (Table 5, Table 6). These results further demonstrate that the proposed FL method is robust.

Conclusions: The proposed FL method tackles two critical issues in multicenter computational pathology: data silos and privacy concerns. Moreover, it effectively alleviates the performance degradation induced by inter-center data heterogeneity. Given the challenges in balancing model accuracy and privacy protection, future work will explore new methods that preserve privacy while maintaining model performance.
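
To make the federated update step described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) adding Gaussian or Laplace noise to a client's parameters after local training and (b) aggregating clients with weights that blend data quantity and data quality. The function names, the `alpha` blending coefficient, the `quality_scores` input (for example, a per-client validation metric), and the linear blending rule are all assumptions made for illustration; the abstract does not specify the actual reweighting formula or how the noise scale is calibrated. A formal differential privacy guarantee would additionally require bounding the sensitivity of the parameters (for example, by clipping), which this sketch omits.

```python
import numpy as np


def add_local_dp_noise(params, scale, mechanism="gaussian", rng=None):
    """Return a noised copy of one client's parameters.

    params: dict mapping parameter name -> np.ndarray (hypothetical layout).
    scale:  standard deviation (Gaussian) or scale b (Laplace) of the noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    noised = {}
    for name, value in params.items():
        if mechanism == "gaussian":
            noise = rng.normal(0.0, scale, size=value.shape)
        elif mechanism == "laplace":
            noise = rng.laplace(0.0, scale, size=value.shape)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        noised[name] = value + noise
    return noised


def client_weights(num_samples, quality_scores, alpha=0.5):
    """Blend normalized data quantity and normalized data quality.

    ASSUMPTION: a simple convex combination controlled by alpha; the
    paper's adaptive rule may differ.
    """
    n = np.asarray(num_samples, dtype=float)
    q = np.asarray(quality_scores, dtype=float)
    w = alpha * n / n.sum() + (1.0 - alpha) * q / q.sum()
    return w / w.sum()


def aggregate(client_params, weights):
    """Weighted average of client parameter dicts (server side)."""
    names = client_params[0].keys()
    return {
        name: sum(w * p[name] for w, p in zip(weights, client_params))
        for name in names
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical clients, each holding a single weight matrix.
    clients = [{"fc": np.ones((2, 2))}, {"fc": 3.0 * np.ones((2, 2))}]
    noised = [add_local_dp_noise(c, scale=0.01, rng=rng) for c in clients]
    weights = client_weights(num_samples=[100, 300], quality_scores=[0.9, 0.9])
    global_params = aggregate(noised, weights)
    print(weights)
    print(global_params["fc"])
```

With `alpha=1.0` the weights reduce to the sample-count weighting used by standard federated averaging, so the blend can be read as a generalization of that baseline under the stated assumptions.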