DONG Qingwei, FU Xueting, ZHANG Benkui. MCL-PhishNet: A Multi-Modal Contrastive Learning Network for Phishing URL Detection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250758

MCL-PhishNet: A Multi-Modal Contrastive Learning Network for Phishing URL Detection

doi: 10.11999/JEIT250758 cstr: 32379.14.JEIT250758
  • Accepted Date: 2025-12-03
  • Rev Recd Date: 2025-12-03
  • Available Online: 2025-12-09
  • Objective: As phishing attacks grow more complex and dynamic, traditional detection methods suffer from feature redundancy, multi-modal mismatch, and insufficient robustness to adversarial samples when confronting emerging attacks. Methods: This paper proposes MCL-PhishNet, a multi-modal contrastive learning framework that detects phishing URLs through a hierarchical syntactic encoder, bidirectional cross-modal attention, and a curriculum contrastive learning strategy. Specifically, multi-scale residual convolutions and Transformers jointly model the local grammatical patterns and global dependency relationships of URLs, while 17-dimensional statistical features enhance robustness to adversarial samples. The dynamic contrastive learning mechanism optimizes the feature-space distribution via semantic subspace partitioning based on online spectral clustering, together with boundary margin constraints. Results and Discussions: Experiments show that MCL-PhishNet achieves an accuracy of 99.41% and an F1-score of 99.65% on datasets including EBUU17 and PhishStorm (Fig. 4 and Fig. 5), significantly outperforming traditional machine learning and deep learning approaches. Conclusions: The framework provides an end-to-end technical paradigm for detecting dynamically evolving adversarial attacks.
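
The abstract mentions boundary margin constraints in the contrastive objective but does not give the loss itself. As a rough illustration only, the sketch below uses the standard pairwise margin-based contrastive loss (squared distance for same-class pairs, squared hinge on the margin for different-class pairs). It is a generic stand-in, not the curriculum or spectral-clustering formulation of MCL-PhishNet; the function names, the margin value of 1.0, and the toy embeddings are all assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def margin_contrastive_loss(u, v, same_class, margin=1.0):
    """Standard pairwise contrastive loss with a margin (generic form).

    same_class=True : loss = d^2, pulling the pair together.
    same_class=False: loss = max(0, margin - d)^2, pushing the pair apart
                      until it is at least `margin` away.
    The margin value is a hypothetical choice, not taken from the paper.
    """
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy 2-D embeddings with made-up values, for illustration only.
benign_a = [0.0, 0.0]
benign_b = [0.3, 0.4]  # distance 0.5 from benign_a
phish = [0.0, 2.0]     # distance 2.0 from benign_a

# Same-class pair: penalized by its squared distance.
pull = margin_contrastive_loss(benign_a, benign_b, same_class=True)
# Different-class pair already beyond the margin: no penalty.
push_far = margin_contrastive_loss(benign_a, phish, same_class=False)
# Different-class pair inside the margin: penalized until pushed apart.
push_near = margin_contrastive_loss(benign_a, benign_b, same_class=False)

print(pull, push_far, push_near)
```

In this generic form the margin bounds the push-apart term: a benign/phishing pair that is already farther apart than the margin contributes nothing, so training effort concentrates on pairs that still sit close to the class boundary.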
