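Data-Driven Secure Control for Cyber-Physical Systems under Denial-of-Service Attacks: An Online Mode-Dependent Switching-Q-Learning Strategy

ZHANG Ruifeng, YANG Rongni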

Citation: ZHANG Ruifeng, YANG Rongni. Data-Driven Secure Control for Cyber-Physical Systems under Denial-of-Service Attacks: An Online Mode-Dependent Switching-Q-Learning Strategy[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250746

doi: 10.11999/JEIT250746 cstr: 32379.14.JEIT250746
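Funding: National Natural Science Foundation of China (62273208)
Details
    About the authors:

    ZHANG Ruifeng: male, Ph.D. candidate; research interests include switched system theory, and secure control and data-driven control of cyber-physical systems

    YANG Rongni: female, professor; research interests include networked control of complex dynamic systems, data-driven control, and their applications in smart grids and unmanned electromechanical systems

    Corresponding author:

    YANG Rongni rnyang@sdu.edu.cn

  • CLC number: TP273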

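  • Abstract: Based on learning strategies and switched system theory, this paper studies the security analysis and control problem for unknown cyber-physical systems (CPSs) under denial-of-service (DoS) attacks. Because the attacker's energy is limited, DoS attacks are characterized by their attack frequency and duration. In particular, unlike existing secure learning methods, this paper uses switched system theory to propose a new online mode-dependent switching-Q-learning control algorithm and a corresponding new data-driven security evaluation criterion. First, the unknown CPSs under dormant and active DoS attacks are transformed, respectively, into the stable and unstable subsystems of a class of unknown switched systems. Next, a novel online mode-dependent switching-Q-learning algorithm is designed to obtain the data-driven optimal secure control gain. At the same time, by constraining the energy functions in the subsystem phase and the switching phase, a data-driven security evaluation criterion with attack frequency and duration constraints is proposed. Finally, comparative experiments on a networked wheeled robot system verify the efficiency and superiority of the proposed method.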
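
The abstract characterizes DoS attacks by their frequency and duration. The sketch below shows one way such bounds could be checked on a recorded attack sequence. It assumes a common formalization from the DoS control literature: up to step k, the number of attack onsets is at most n0 + k/tau_f and the number of attacked steps is at most d0 + k/tau_d. The paper's exact inequalities are not reproduced in this excerpt. The function name, the offsets `n0` and `d0`, and the restriction to windows that start at step 0 are assumptions made for illustration.

```python
from typing import Sequence


def satisfies_dos_constraints(
    attack: Sequence[int],
    tau_f: float,
    tau_d: float,
    n0: float = 1.0,
    d0: float = 1.0,
) -> bool:
    """Check assumed frequency and duration bounds on a 0/1 attack sequence.

    attack[k-1] == 1 means that step k is attacked (DoS active).
    Assumed bounds, checked on every window [0, k]:
        onsets(k)   <= n0 + k / tau_f   (attack frequency)
        duration(k) <= d0 + k / tau_d   (attack duration)
    """
    onsets = 0     # dormant -> active transitions seen so far
    duration = 0   # attacked steps seen so far
    prev = 0
    for k, flag in enumerate(attack, start=1):
        if flag and not prev:
            onsets += 1
        if flag:
            duration += 1
        prev = flag
        if onsets > n0 + k / tau_f or duration > d0 + k / tau_d:
            return False
    return True


if __name__ == "__main__":
    sparse = [0, 0, 0, 1] * 10   # one attacked step in every four
    jammed = [1] * 20            # channel attacked at every step
    print(satisfies_dos_constraints(sparse, tau_f=4.0, tau_d=4.0))  # True
    print(satisfies_dos_constraints(jammed, tau_f=4.0, tau_d=4.0))  # False
```

Larger values of `tau_f` and `tau_d` in this form mean a more tightly restricted attacker; a full check would also consider windows that start at later steps.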
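  • Figure 1  Networked wheeled robot platform

    Figure 2  Model-based iteration process of $\boldsymbol{P}_i$ and $\boldsymbol{L}_a$

    Figure 3  Data-driven iteration process of $\boldsymbol{P}_i$ and $\boldsymbol{L}_a$

    Figure 4  Switching-Q-learning process

    Figure 5  State response of the error system under the switching-Q-learning controller

    Figure 6  Robot trajectory under the switching-Q-learning controller of this paper

    Figure 7  Robot trajectory under the Q-learning controllers of references [25,26]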

    Algorithm 1  Online mode-dependent switching-Q-learning control algorithm

     1: Set the initial values $\boldsymbol{L}_a^0$, $\boldsymbol{\mathcal{P}}_i^0$, $\boldsymbol{P}_i^0$ and $j = 0$, where $i \in \{a, b\}$.
     2: Set the learning error threshold $\varepsilon_i$.
     3: Perform the policy update (32).
     4: Perform the policy evaluation (31).
     5: If $\|\boldsymbol{P}_i^{j+1} - \boldsymbol{P}_i^j\| < \varepsilon_i$, then
     6:  Output the parameter matrix $\boldsymbol{P}_i^{j+1}$ and the optimal control gain $\boldsymbol{L}_a^{j+1}$
     7: Else
     8:  $j = j + 1$
     9:  Return to step 3
     10: End if
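
The loop structure of Algorithm 1 (initialize, update the policy, evaluate it, stop when successive parameter matrices are close) can be sketched as follows. Equations (31) and (32) are not reproduced in this excerpt, so the sketch substitutes the standard Riccati recursion for a single scalar linear system with a known model. It is therefore a model-based, single-mode stand-in, not the authors' data-driven mode-dependent scheme. The function name, the system parameters `a` and `b`, and the weights `q` and `r` are hypothetical.

```python
def riccati_iteration(a, b, q, r, p0=0.0, eps=1e-10, max_iter=10_000):
    """Scalar Riccati recursion for x[k+1] = a*x[k] + b*u[k], u[k] = -l*x[k].

    Stage cost: q*x^2 + r*u^2. Returns (p, l, iterations).
    Stand-in for steps 3 to 9 of Algorithm 1; NOT equations (31)/(32).
    """
    p = p0
    for j in range(max_iter):
        # Step 3 analogue (policy update): gain that is greedy for current p.
        l = (a * b * p) / (r + b * b * p)
        # Step 4 analogue (policy evaluation, one-step form):
        # cost of applying l once and then incurring the cost-to-go p.
        p_next = q + r * l * l + (a - b * l) ** 2 * p
        # Step 5 analogue: stop when |p^{j+1} - p^j| < eps.
        if abs(p_next - p) < eps:
            return p_next, l, j + 1
        p = p_next  # steps 8 and 9: increment j and repeat
    raise RuntimeError("no convergence within max_iter")


if __name__ == "__main__":
    # Hypothetical open-loop unstable plant (|a| > 1).
    p, l, iters = riccati_iteration(a=1.2, b=1.0, q=1.0, r=1.0)
    print(f"p = {p:.6f}, l = {l:.6f}, closed loop = {1.2 - l:.6f}, iters = {iters}")
```

For this example the returned gain places the closed-loop coefficient `a - b*l` inside the unit interval, which is the scalar counterpart of a stabilizing control gain.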

    Table 1  Comparison of the data-driven security evaluation criteria of this paper and reference [24]

    | Parameter | Theorem 2 of this paper (based on Algorithm 1 of this paper) | Corollary 1 of [24] (based on Algorithm 1 in [24]) |
    | Switching parameters | $\mu_a = 1.04$, $\mu_b = 1.74$ | None |
    | Subsystem parameters | $\eta_a = 0.8$, $\eta_b = 1.55$ | $\mu_1 = 0.11$, $\mu_2 = 18.5$ |
    | Attack constraints | $\tau_{\text{F}} = 10.26$, $\tau_{\text{D}} = 4$ | $\tau \ge 6$ (for $\text{T} = 6$) |
  • [1] YANG Ting, LIU Yachuang, LIU Yuzhe, et al. Review on cyber-physical system: Technology analysis and trends[J]. Journal of Electronics & Information Technology, 2021, 43(12): 3393–3406. doi: 10.11999/JEIT211135.
    [2] YANG Guanghong, LU Anyang, and AN Liwei. A survey on secure state estimation of cyber-physical systems under cyber attacks[J]. Control and Decision, 2023, 38(8): 2093–2105. doi: 10.13195/j.kzyjc.2023.0885.
    [3] LU Kangdi and WU Zhengguang. Resilient event-triggered load frequency control for cyber-physical power systems under DoS attacks[J]. IEEE Transactions on Power Systems, 2023, 38(6): 5302–5313. doi: 10.1109/TPWRS.2022.3229667.
    [4] OBAYYA M, AL-WESABI F N, ALABDAN R, et al. Artificial intelligence for traffic prediction and estimation in intelligent cyber-physical transportation systems[J]. IEEE Transactions on Consumer Electronics, 2024, 70(1): 1706–1715. doi: 10.1109/TCE.2023.3320513.
    [5] LI Yunpeng, ZHANG Lixian, HAN Yuejiang, et al. Model predictive control-based formation flight control method for composite UAVs[J]. Acta Automatica Sinica, 2025, 51(2): 312–326. doi: 10.16383/j.aas.c240405.
    [6] LANGNER R. Stuxnet: Dissecting a cyberwarfare weapon[J]. IEEE Security & Privacy, 2011, 9(3): 49–51. doi: 10.1109/MSP.2011.67.
    [7] KHAN S. Distributed sensors, computation and AI for automation, protection and maintenance of power grid[C]. Proceedings of the 2022 18th International Computer Engineering Conference, Cairo, Egypt, 2022: 130–135. doi: 10.1109/ICENCO55801.2022.10032522.
    [8] XU Hang, BARBOT S, and WANG Teng. Remote sensing through the fog of war: Infrastructure damage and environmental change during the Russian-Ukrainian conflict revealed by open-access data[J]. Natural Hazards Research, 2024, 4(1): 1–7. doi: 10.1016/j.nhres.2024.01.006.
    [9] WANG Zhe, ZHANG Heng, YANG Chaoqun, et al. Improved zero-dynamics attack scheduling with state estimation[J]. IEEE/CAA Journal of Automatica Sinica, 2025, 12(2): 472–474. doi: 10.1109/JAS.2024.124737.
    [10] ZHAO Rui, ZUO Zhiqiang, SHI Yang, et al. DoS and stealthy deception attacks for switched systems: A cooperative approach[J]. IEEE Transactions on Automatic Control, 2024, 69(7): 4396–4410. doi: 10.1109/TAC.2023.3321248.
    [11] DE PERSIS C and TESI P. Input-to-state stabilizing control under denial-of-service[J]. IEEE Transactions on Automatic Control, 2015, 60(11): 2930–2944. doi: 10.1109/TAC.2015.2416924.
    [12] SU Lei and YE Dan. Observer-based output feedback control for cyber-physical systems under randomly occurring packet dropout and periodic DoS attacks[J]. ISA Transactions, 2019, 95: 58–67. doi: 10.1016/j.isatra.2019.05.008.
    [13] TAN Wen, HOU Zhongsheng, and LI Yuanxin. Data-driven containment control for unknown MIMO nonlinear MASs under aperiodic DoS attacks[J]. IEEE Transactions on Automation Science and Engineering, 2025, 22: 7762–7772. doi: 10.1109/TASE.2024.3469153.
    [14] DEBRUHL B and TAGUE P. Digital filter design for jamming mitigation in 802.15.4 communication[C]. Proceedings of 20th International Conference on Computer Communications and Networks, Lahaina, USA, 2011: 1–6. doi: 10.1109/ICCCN.2011.6006020.
    [15] SHI Ting, SHI Peng, and CHAMBERS J. Dynamic event-triggered model predictive control under channel fading and denial-of-service attacks[J]. IEEE Transactions on Automation Science and Engineering, 2024, 21(4): 6448–6459. doi: 10.1109/TASE.2023.3325534.
    [16] YUAN Yuan, YUAN Huanhuan, GUO Lei, et al. Resilient control of networked control system under DoS attacks: A unified game approach[J]. IEEE Transactions on Industrial Informatics, 2016, 12(5): 1786–1794. doi: 10.1109/TII.2016.2542208.
    [17] SAEEDI M, ZAREI J, RAZAVI-FAR R, et al. Event-triggered adaptive optimal fast terminal sliding mode control under denial-of-service attacks[J]. IEEE Systems Journal, 2022, 16(2): 2684–2692. doi: 10.1109/JSYST.2021.3073816.
    [18] ZHU Yanzheng and ZHENG Weixing. Observer-based control for cyber-physical systems with periodic DoS attacks via a cyclic switching strategy[J]. IEEE Transactions on Automatic Control, 2020, 65(8): 3714–3721. doi: 10.1109/TAC.2019.2953210.
    [19] WU Chengwei, WU Ligang, LIU Jianxing, et al. Active defense-based resilient sliding mode control under denial-of-service attacks[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 237–249. doi: 10.1109/TIFS.2019.2917373.
    [20] SHEN Hao, LIU Xinmiao, MA Qian, et al. Observer-based control for interval type-2 fuzzy systems under PDT-based DoS attacks[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2025, 55(6): 3780–3790. doi: 10.1109/TSMC.2025.3547320.
    [21] WANG Fuxing, LONG Yue, and LI Tieshan. Thruster fault detection for unmanned marine vehicles under DoS attacks: An asynchronous switched method[C]. Proceedings of the 14th International Conference on Information Science and Technology, Chengdu, China, 2024: 554–559. doi: 10.1109/ICIST63249.2024.10805417.
    [22] HUA He’an, FANG Yongchun, QIAN Chen, et al. Reinforcement learning control strategy of quadrotor unmanned aerial vehicles based on linear filter[J]. Journal of Electronics & Information Technology, 2021, 43(12): 3407–3417. doi: 10.11999/JEIT210251.
    [23] REN Yan, ZHANG Heng, YANG Wei, et al. Transferable adversarial attack against deep reinforcement learning-based smart grid dynamic pricing system[J]. IEEE Transactions on Industrial Informatics, 2024, 20(6): 9015–9025. doi: 10.1109/TII.2024.3379645.
    [24] YIN Liyuan, XU Lezhong, HOU Fusheng, et al. Security analysis and control under periodic DoS attacks[J]. IEEE Internet of Things Journal, 2024, 11(5): 8473–8484. doi: 10.1109/JIOT.2023.3319703.
    [25] LIU Jinliang, DONG Yanhui, ZHA Lijuan, et al. Reinforcement learning-based tracking control for networked control systems with DoS attacks[J]. IEEE Transactions on Information Forensics and Security, 2024, 19: 4188–4197. doi: 10.1109/TIFS.2024.3376250.
    [26] GAO Weinan, DENG Chao, JIANG Yi, et al. Resilient reinforcement learning and robust output regulation under denial-of-service attacks[J]. Automatica, 2022, 142: 110366. doi: 10.1016/j.automatica.2022.110366.
    [27] LI Hao, CHEN Hua, and ZHANG Wei. On model-free reinforcement learning for switched linear systems: A subspace clustering approach[C]. Proceedings of the 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Monticello, USA, 2018: 123–130. doi: 10.1109/ALLERTON.2018.8635985.
    [28] CHEN Hua, ZHANG Linfang, and ZHANG Wei. Optimal control inspired Q-learning for switched linear systems[C]. Proceedings of the 2020 American Control Conference (ACC), Denver, USA, 2020: 4003–4010. doi: 10.23919/ACC45564.2020.9147818.
    [29] ZHANG Xuewen, WANG Yun, XIA Jianwei, et al. Optimal tracking control for discrete-time modal persistent dwell time switched systems based on Q-learning[J]. Optimal Control Applications and Methods, 2023, 44(6): 3327–3341. doi: 10.1002/oca.3040.
    [30] SUN Jiayue, ZHANG Huaguang, WANG Yingchun, et al. Optimal tracking control of switched systems applied in grid-connected hybrid generation using reinforcement learning[J]. Neural Computing and Applications, 2021, 33(15): 9363–9374. doi: 10.1007/s00521-021-05696-2.
    [31] WU Jiacheng, LIAN Bosen, SU Hongye, et al. Data-driven weighted $H_\infty$ control of persistent dwell time switched systems with optimal disturbance attenuation guaranteed[J]. IEEE Transactions on Automation Science and Engineering, 2025, 22: 8162–8173. doi: 10.1109/TASE.2024.3480449.
    [32] ZHAI Lijing and VAMVOUDAKIS K G. Data-based and secure switched cyber-physical systems[J]. Systems & Control Letters, 2021, 148: 104826. doi: 10.1016/j.sysconle.2020.104826.
    [33] ZHANG Wei, ABATE A, HU Jianghai, et al. Exponential stabilization of discrete-time switched linear systems[J]. Automatica, 2009, 45(11): 2526–2536. doi: 10.1016/j.automatica.2009.07.018.
    [34] AL-TAMIMI A, LEWIS F L, and ABU-KHALAF M. Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control[J]. Automatica, 2007, 43(3): 473–481. doi: 10.1016/j.automatica.2006.09.019.
    [35] FEI Zhongyang, SHI Shuang, and SHI Peng. Analysis and Synthesis for Discrete-Time Switched Systems: A Quasi-Time-Dependent Method[M]. Cham: Springer, 2020: 23–25. doi: 10.1007/978-3-030-25812-2.
Publication history
  • Received: 2025-08-12
  • Revised: 2025-11-05
  • Accepted: 2025-11-05
  • Published online: 2025-11-13
