Domain-Independent Intelligent Planning Technology and Its Application to Automated Penetration Testing Oriented Attack Path Discovery
-
摘要: 攻击路径发现是自动化渗透测试领域的重要研究方向。该文综合论述了领域独立智能规划技术在面向自动化渗透测试的攻击路径发现上的研究进展及应用前景。首先介绍了攻击路径发现的基本概念并按照技术原理将其划分为基于领域相关和领域独立规划技术的攻击路径发现方法。然后介绍了领域独立智能规划算法,包括确定性规划算法、非确定性规划算法和博弈规划的技术原理和发展状况并就各类方法在攻击路径发现中的应用进行了综述。接着分析总结了渗透测试过程的特点,对比了领域独立智能规划算法应用在面向自动化渗透测试的攻击路径发现时的优缺点。最后对攻击路径发现将来的发展方向进行了总结和展望,希望对未来进一步的研究工作有一定的参考价值。
-
关键词:
- 领域独立智能规划技术 /
- 自动化渗透测试 /
- 攻击路径发现
Abstract: Attack path discovery is an important research direction in automated penetration testing area. This paper introduces the research progress of domain independent intelligent planning technology and its application to the field of automated penetration testing oriented attack paths discovery. Firstly, the basic concept of attack path discovery is introduced and the related algorithms are divided into domain-specific and domain-independent intelligent planning based attack path discovery algorithms separately. Secondly, the domain-independent planning algorithms are classified into deterministic planning, uncertain planning and game planning, where each of which is described from principle, development and application aspect in detail. Thirdly, this paper summarizes the characteristics of automated penetration testing and compares the advantages and disadvantages of domain independent intelligent planning algorithms adopted in automated penetration testing. Lastly, the development of automated penetration testing oriented attack path discovery is prospected. It is hoped that this paper could contribute future research works on attack path discovery. -
表 1 领域独立智能规划算法进行攻击路径发现时的适用性总结
类型 文献 O U D R M 优点 缺点 确定性攻击路径发现 规划图 [18] √ × × × √ 能够显示描述所有可能攻击路径,可解释性强 时间复杂度高,为O(mnk),m为状态空间大小,n为动作空间大小,k为层数 [20] √ × × × √ 基于规划图构建启发函数,提高攻击路径发现效率 时间复杂度高,为O(mn),不适用于大规模场景m为状态空间大小,n为动作空间大小 偏序规划 [22] √ × × × × 能够发现所有动作对之间的约束关系 需要遍历动作空间,构建约束集合,造成额外时间开销 [24] √ × × × √ 构造启发函数选择动作,并利用约束关系缩减规模,提高路径搜索效率 分层任务网络 [30] √ × × × × 可解释性更强 需要专家制定分解方法 [31] √ × × × √ 利用标准优化算法提高路径发现效率 非确定性攻击路径发现 Determinizing [36] √ √ × × × 可扩展性好,适用多种非确定性场景 无法进行重规划 概率优化 [41] √ √ × × × 能够根据实际执行结果进行重规划 需要删除非确定性信息进行规划,无法利用规划反馈信息 [44] √ × × × √ 构造规划图启发函数,求解效率高 构建多个规划图,造成大量冗余 马尔可夫
决策过程[52] √ √ × × √ 能存储大规模网络空间状态策略,策略求解效率更高 容易陷入局部极小值 [53] √ √ × × √ 基于数据确定模型的参数个数和函数形式,无需人工设定,灵活方便 在较大数据集的情况下训练时间较长 部分观测的马尔可夫决策过程 [55] √ √ × × √ 精确求解算法,是后续近似求解算法的基础 求解复杂度极高,当状态空间较大时无法进行规划求解 [57] √ √ × × √ 首个基于点迭代的近似求解方法,求解效率相对于精确求解效率高 仅能对单主机进行规划,时间复杂度O(|N||A|(|S||B|+|O|)),其中S为状态集合,A为动作集合,O为观测状态集合,B为信念状态点集合,N为上限点集合 [58] √ √ × × √ 采用前向搜索策略,采样效率更高,适合短序列场景 仅能对单主机进行规划,时间复杂度O(|N|(|S|2+|A|+|O|))其中S为状态集合,A为动作集合,O为观测状态集合,N为上限点集合 [59] √ √ × × √ 采样效率高 仅能对单主机进行规划,无法扩展到网络层面,时间复杂度为O(|S|3|A||O||B||N|)其中S为状态集合,A为动作集合,O为观测状态集合,B为信念状态点集合,N为上限点集合 [60] √ √ × × √ 能够实现网络层面攻击路径发现 假定网络拓扑结构及策略稳定不变 博弈攻击路径发现 静态博弈模型 [62] √ × × × √ 首次将博弈模型引入到攻防对抗 要求完全信息且攻防双方为完全理性,并且要求攻防对抗策略保持不变 [63] √ × × × √ 求解效率高 动态博弈模型 [67] √ × √ × √ 多轮次博弈条件下的攻击路径发现 要求完全信息且攻防双方为完全理性 [68] √ × √ × √ 摒弃了完全理性和完全信息假设 复杂度较高,为O((m+n)2),m和n分别为攻防策略集合大小 [71] √ × √ × √ 摒弃了攻防双方对等信息的假设 模型复杂,求解难,现实应用场景受限 注:O:状态空间完备性;U:行为不确定性;D:过程动态性;R:资源约束性;M:路径最优性。 -
KRUTZ R L and VINES R D. The CISSP and CAP Prep guide: Platinum Edition[M]. New Jersey: Wiley, 2007. STEFINKO Y, PISKOZUB A, and BANAKH R. Manual and automated penetration testing. Benefits and drawbacks. Modern tendency[C]. The 13th International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science, Lviv, Ukraine, 2016: 488–491. doi: 10.1109/tcset.2016.7452095. ABU-DABASEH F and ALSHAMMARI E. Automated penetration testing: An overview[C]. The 4th International Conference on Natural Language Computing, Copenhagen, Denmark, 2018: 121–129. MCDERMOTT J P. Attack net penetration testing[C]. 2000 Workshop on New Security Paradigms, Ballycotton, Ireland, 2001: 15–21. doi: 10.1145/366173.366183. 诸葛建伟, 陈力波, 孙松柏, 等. Metasploit渗透测试魔鬼训练营[M]. 北京: 机械工业出版社, 2013: 3–4.ZHUGE Jianwei, CHEN Libo, SUN Songbai, et al. Penetration Testing Devil Training Camp Based on Metasploit[M]. Beijing: China Machine Press, 2013: 3–4. POLATIDIS N, PAVLIDIS M, and MOURATIDIS H. Cyber-attack path discovery in a dynamic supply chain maritime risk management system[J]. Computer Standards & Interfaces, 2018, 56: 74–82. doi: 10.1016/j.csi.2017.09.006 李庆华, 尤越, 沐雅琪, 等. 一种针对大型凹型障碍物的组合导航算法[J]. 电子与信息学报, 2020, 42(4): 917–923. doi: 10.11999/JEIT190179LI Qinghua, YOU Yue, MU Yaqi, et al. Integrated navigation algorithm for large concave obstacles[J]. Journal of Electronics &Information Technology, 2020, 42(4): 917–923. doi: 10.11999/JEIT190179 BIALEK Ł, DUNIN-KĘPLICZ B, and SZAŁAS A. A paraconsistent approach to actions in informationally complex environments[J]. Annals of Mathematics and Artificial Intelligence, 2019, 86(4): 231–255. doi: 10.1007/s10472-019-09627-9 AMMANN P, WIJESEKERA D, and KAUSHIK S. Scalable, Graph-based network vulnerability analysis[C]. The 9th ACM Conference on Computer and Communications Security, Washington, USA, 2002: 217–224. doi: 10.1145/586110.586140. CHEN Feng, LIU Dehui, ZHANG Yi, et al. A scalable approach to analyzing network security using compact attack graphs[J]. Journal of Networks, 2010, 5(5): 543–550. doi: 10.4304/jnw.5.5.543-550 OU Xinming, GOVINDAVAJHALA S, and APPEL A W. MulVAL: A logic-based network security analyzer[C]. The 14th Conference on USENIX Security Symposium, Baltimore, USA, 2005: 113–128. WANG Lingyu, YAO Chao, SINGHAL A, et al. Interactive analysis of attack graphs using relational queries[C]. The 20th Annual Conference on Data and Applications Security and Privacy, Sophia Antipolis, France, 2006: 119–132. doi: 10.1007/11805588_9. LI Wei, VAUGHN R B, and DANDASS Y S. An approach to model network exploitations using exploitation graphs[J] Simulation, 2006, 82(8): 523–541. doi: 10.1177/0037549706072046. MAHDAVI A and CARVALHO M. Optimal trajectory and schedule planning for autonomous guided vehicles in flexible manufacturing system[C]. The 2nd IEEE International Conference on Robotic Computing, Laguna Hills, USA, 2018: 167–172. doi: 10.1109/irc.2018.00034. MA Xiaobai, JIAO Ziyuan, WANG Zhenkai, et al. 3-D decentralized prioritized motion planning and coordination for high-density operations of micro aerial vehicles[J]. IEEE Transactions on Control Systems Technology, 2018, 26(3): 939–953. doi: 10.1109/tcst.2017.2699165 ZANG Yichao, ZHOU Tianyang, GE Xiaoyue, et al. An improved attack path discovery algorithm through compact graph planning[J]. IEEE Access, 2019, 7: 59346–59356. doi: 10.1109/access.2019.2915091 BODDY M S, GOHDE J, HAIGH T, et al. Course of action generation for cyber security using classical planning[C]. The 15th International Conference on Automated Planning and Scheduling, Monterey, USA, 2005: 12–21. GARRETT C R, LOZANO-PÉREZ T, and KAELBLING L P. FFRob: Leveraging symbolic planning for efficient task and motion planning[J]. The International Journal of Robotics Research, 2018, 37(1): 104–136. doi: 10.1177/0278364917739114 KAUTZ H A and SELMAN B. Unifying SAT-based and graph-based planning[C]. The 16th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 1999: 318–325. DO M B and KAMBHAMPATI S. Planning as constraint satisfaction: Solving the planning graph by compiling it into CSP[J]. Artificial Intelligence, 2001, 132(2): 151–182. doi: 10.1016/s0004-3702(01)00128-x BAIOLETTI M, MARCUGINI S, and MILANI A. DPPlan: An algorithm for fast solutions extraction from a planning graph[C]. The 5th International Conference on Artificial Intelligence Planning Systems, Breckenridge, USA, 2000: 13–21. BARRETT A and WELD D S. Partial-order planning: Evaluating possible efficiency gains[J]. Artificial Intelligence, 1994, 67(1): 71–112. doi: 10.1016/0004-3702(94)90012-4 NGUYEN X L and KAMBHAMPATI S. Reviving partial order planning[C]. The 17th International Joint Conference on Artificial Intelligence, Seattle, USA. 2001: 459–466. YOUNES H L S and SIMMONS R G. VHPOP: Versatile heuristic partial order planner[J]. Journal of Artificial Intelligence Research, 2003, 20: 405–430. doi: 10.1613/jair.1136 COLES A J, COLES A, FOX M, et al. Forward-chaining partial-order planning[C]. The 20th International Conference on Automated Planning and Scheduling, Toronto, Canada, 2010: 42–49. BOUTILIER C and BRAFMAN R I. Partial-order planning with concurrent interacting actions[J]. Journal of Artificial Intelligence Research, 2001, 14: 105–136. doi: 10.1613/jair.740 MOHR F, WEVER M, and HÜLLERMEIER E. ML-Plan: Automated machine learning via hierarchical planning[J]. Machine Learning, 2018, 107(8–10): 1495–1515. doi: 10.1007/s10994-018-5735-z DE SILVA L, PADGHAM L, and SARDINA S. HTN-like solutions for classical planning problems: An application to BDI agent systems[J]. Theoretical Computer Science, 2019, 763: 12–37. doi: 10.1016/j.tcs.2019.01.034 SOHN S, OH J, and LEE H. Hierarchical reinforcement learning for zero-shot generalization with subtask dependencies[C]. The 32nd Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 7156–7166. MU Chengpo and LI Yingjiu. An intrusion response decision-making model based on hierarchical task network planning[J]. Expert Systems with Applications, 2010, 37(3): 2465–2472. doi: 10.1016/j.eswa.2009.07.079 ONTAÑÓN S and BURO M. Adversarial hierarchical-task network planning for complex real-time games[C]. The 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 2015: 1652–1658. FU JICHENG, NG V, BASTANI F B, et al. Simple and fast strong cyclic planning for fully-observable nondeterministic planning problems[C]. The 22nd International Joint Conference on Artificial Intelligence, Barcelona, Spain, 2011: 1949–1954. doi: 10.1007/s10472-016-9517-7. KOLOBOV A, MAUSAM M, and Weld D S. LRTDP versus UCT for online probabilistic planning[C]. The 26th AAAI Conference on Artificial Intelligence, Toronto, Canada, 2012: 1786–1792. YOON S, FERN A, GIVAN R, et al. Probabilistic planning via determinization in hindsight[C]. The 23rd AAAI Conference on Artificial Intelligence, Chicago, USA, 2008: 1010–1016. CIMATTI A, PISTORE M, ROVERI M, et al. Weak, strong, and strong cyclic planning via symbolic model checking[J]. Artificial Intelligence, 2003, 147(1/2): 35–84. doi: 10.1016/s0004-3702(02)00374-0 MUISE C J, MCILRAITH S A, and BECK J C. Improved non-deterministic planning by exploiting state relevance[C]. The 22nd International Conference on Automated Planning and Scheduling, Atibaia, Brazil, 2012: 172–180. MUISE C J, MCILRAITH S A, and BELLE V. Non-deterministic planning with conditional effects[C]. The 24th International Conference on Automated Planning and Scheduling, Portsmouth, USA, 2014: 370–374. 李洋, 文中华, 伍小辉, 等. 求最小期望权值强循环规划解[J]. 计算机科学, 2015, 42(4): 217–220, 257. doi: 10.11896/j.issn.1002-137X.2015.04.044LI Yang, WEN Zhonghua, WU Xiaohui, et al. Solving strong cyclic planning with minimal expectation weight[J]. Computer Science, 2015, 42(4): 217–220, 257. doi: 10.11896/j.issn.1002-137X.2015.04.044 唐杰, 文中华, 汪泉, 等. 不确定可逆规划的强循环规划解[J]. 计算机研究与发展, 2013, 50(9): 1970–1980. doi: 10.7544/issn1000-1239.2013.20130404TANG Jie, WEN Zhonghua, WANG Quan, et al. Solving strong cyclic planning in nondeterministic reversible planning domain[J]. Journal of Computer Research and Development, 2013, 50(9): 1970–1980. doi: 10.7544/issn1000-1239.2013.20130404 KUSHMERICK N, HANKS S, and WELD D S. An algorithm for probabilistic planning[J]. Artificial Intelligence, 1995, 76(1/2): 239–286. doi: 10.1016/0004-3702(94)00087-H YOON S W, FERN A, and GIVAN R. FF-Replan: A baseline for probabilistic planning[C]. The 17th International Conference on Automated Planning and Scheduling, Providence, USA, 2007: 352–359. YOON S, RUML W, BENTON J, et al. Improving determinization in hindsight for on-line probabilistic planning[C]. The 20th International Conference on Automated Planning and Scheduling, Toronto, Canada, 2010: 209–216. ISSAKKIMUTHU M, FERN A, KHARDON R, et al. Hindsight optimization for probabilistic planning with factored actions[C]. The 25th International Conference on Automated Planning and Scheduling, Jerusalem, Israel, 2015: 120–128. BRYCE D, KAMBHAMPATI S, and SMITH D E. Sequential Monte Carlo in probabilistic planning reachability heuristics[C]. The 16th International Conference on Automated Planning and Scheduling, Cumbria, UK, 2006: 233–242. TREVIZAN F W, THIÉBAUX S, and HASLUM P. Occupation measure heuristics for probabilistic planning[C]. The 27th International Conference on Automated Planning and Scheduling, Pittsburgh, USA, 2017: 306–315. DURKOTA K and LISÝ V. Computing optimal policies for attack graphs with action failures and costs[C]. The 7th European Starting AI Researcher Symposium, Prague, Czech Republic, 2014: 101–110. SUN Wen, GORDON G J, BOOTS B, et al. Dual policy iteration[C]. The 32nd Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 7059–7069. HOUTHOOFT R, CHEN R Y, ISOLA P, et al. Evolved policy gradients[C]. The 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 5405–5414. LIU Huaping, WU Yupei, and SUN Fuchun. Extreme trust region policy optimization for active object recognition[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(6): 2253–2258. doi: 10.1109/TNNLS.2017.2785233 SRINIVASAN S, LANCTOT M, ZAMBALDI V, et al. Actor-critic policy optimization in partially observable multiagent environments[C]. The 32nd Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 3422–3435. KANG Qinma, ZHOU Huizhuo, and KANG Yunfan. An asynchronous advantage actor-critic reinforcement learning method for stock selection and portfolio management[C]. The 2nd International Conference on Big Data Research, Weihai, China, 2018: 141–145. doi: 10.1145/3291801.3291831. TAN Fuxiao and GUAN Xinping. Kernel-based adaptive critic designs for optimal control of nonlinear discrete-time system[C]. The 37th Chinese Control Conference, Wuhan, China, 2018: 2167–2172. doi: 10.23919/chicc.2018.8482778. TAYLOR G and PARR R. Kernelized value function approximation for reinforcement learning[C]. The 26th Annual International Conference on Machine Learning, Montreal, Canada, 2009: 1017–1024. doi: 10.1145/1553374.1553504. SARRAUTE C, BUFFET O, and HOFFMANN J. Penetration testing== POMDP solving?[C]. 2011 IJCAI Workshop on Intelligent Security, Barcelona, Spain, 2011: 66–73. SMALLWOOD R D and SONDIK E J. The optimal control of partially observable Markov processes over a finite horizon[J]. Operations Research, 1973, 21(5): 1071–1088. doi: 10.1287/opre.21.5.1071 CHENG H T. Algorithms for partially observable Markov decision processes[D]. [Ph. D. dissertation], The University of British Columbia, 1988. doi: 10.14288/1.0098252. PINEAU J, GORDON G, and THRUN S. Point-based value iteration: An anytime algorithm for POMDPs[C]. The 18th International Joint Conference on Artificial Intelligence, Acapulco, Mexico, 2003: 1025–1032. LIU Bingbing, KANG Yu, JIANG Xiaofeng, et al. A fast approximation method for partially observable Markov decision processes[J]. Journal of Systems Science and Complexity, 2018, 31(6): 1423–1436. doi: 10.1007/s11424-018-7038-7 KURNIAWATI H, HSU D, LEE W S. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces[C]. The Robotics: Science and Systems IV, Zurich, Switzerland, 2008: 65–72. doi: 10.15607/RSS.2008.IV.009. SARRAUTE C, BUFFET O, and HOFFMANN J. POMDPs make better hackers: Accounting for uncertainty in penetration testing[C]. The 26th AAAI Conference on Artificial Intelligence, Toronto, Canada, 2012: 1816–1824. 王刚, 胡鑫, 马润年, 等. 集体防御机制下的网络行动同步建模和稳定性[J]. 电子与信息学报, 2018, 40(6): 1515–1519. doi: 10.11999/JEIT170619WANG Gang, HU Xin, MA Runnian, et al. Synchronization modeling and stability of cyberspace operation based on collective defensive mechanism[J]. Journal of Electronics &Information Technology, 2018, 40(6): 1515–1519. doi: 10.11999/JEIT170619 LYE K W and WING J M. Game strategies in network security[J]. International Journal of Information Security, 2005, 4(1/2): 71–86. doi: 10.1007/s10207-004-0060-x 姜伟, 方滨兴, 田志宏, 等. 基于攻防博弈模型的网络安全测评和最优主动防御[J]. 计算机学报, 2009, 32(4): 817–827. doi: 10.3724/SP.J.1016.2009.00817JIANG Wei, FANG Binxing, TIAN Zhihong, et al. Evaluating network security and optimal active defense based on attack-defense game model[J]. Chinese Journal of Computers, 2009, 32(4): 817–827. doi: 10.3724/SP.J.1016.2009.00817 王晋东, 余定坤, 张恒巍, 等. 静态贝叶斯博弈主动防御策略选取方法[J]. 西安电子科技大学学报: 自然科学版, 2016, 43(1): 144–150. doi: 10.3969/j.issn.1001-2400.2016.01.026WANG Jindong, YU Dingkun, ZHANG Hengwei, et al. Active defense strategy selection based on the static Bayesian game[J]. Journal of Xidian University, 2016, 43(1): 144–150. doi: 10.3969/j.issn.1001-2400.2016.01.026 王元卓, 林闯, 程学旗, 等. 基于随机博弈模型的网络攻防量化分析方法[J]. 计算机学报, 2010, 33(9): 1748–1762. doi: 10.3724/SP.J.1016.2010.01748WANG Yuanzhuo, LIN Chuang, CHENG Xueqi, et al. Analysis for network attack-defense based on stochastic game model[J]. Chinese Journal of Computers, 2010, 33(9): 1748–1762. doi: 10.3724/SP.J.1016.2010.01748 CUI Xiaolin, TAN Xiaobin, ZHANG Yong, et al. A Markov game theory-based risk assessment model for network information system[C]. 2008 International Conference on Computer Science and Software Engineering, Hubei, China, 2008: 1057–1061. doi: 10.1109/csse.2008.949. LI Tao, WANG Jindong, CHEN Yu, et al. A multi-stage game approach applied to network security risk controlling[C]. The 2nd IEEE Advanced Information Technology, Electronic and Automation Control Conference, Chongqing, China, 2017: 2518–2522. doi: 10.1109/iaeac.2017.8054477. 黄健明, 张恒巍, 王晋东, 等. 基于攻防演化博弈模型的防御策略选取方法[J]. 通信学报, 2017, 38(1): 168–176. doi: 10.11959/j.issn.1000-436x.2017019HUANG Jianming, ZHANG Hengwei, Wang Jindong, et al. Defense strategies selection based on attack-defense evolutionary game model[J]. Journal on Communications, 2017, 38(1): 168–176. doi: 10.11959/j.issn.1000-436x.2017019 张恒巍, 李涛. 基于多阶段攻防信号博弈的最优主动防御[J]. 电子学报, 2017, 45(2): 431–439. doi: 10.3969/j.issn.0372-2112.2017.02.023ZHANG Hengwei and LI Tao. Optimal active defense based on multi-stage attack-defense signaling game[J]. Acta Electronica Sinica, 2017, 45(2): 431–439. doi: 10.3969/j.issn.0372-2112.2017.02.023 朱建明, 宋彪, 黄启发. 基于系统动力学的网络安全攻防演化博弈模型[J]. 通信学报, 2014, 35(1): 54–61. doi: 10.3969/j.issn.1000-436x.2014.01.007ZHU Jianming, SONG Biao, and HUANG Qifa. Evolution game model of offense-defense for network security based on system dynamics[J]. Journal on Communications, 2014, 35(1): 54–61. doi: 10.3969/j.issn.1000-436x.2014.01.007 VADLAMUDI S G, SENGUPTA S, TAGUINOD M, et al. Moving target defense for web applications using Bayesian stackelberg games: (Extended Abstract)[C]. The 2016 International Conference on Autonomous Agents & Multiagent Systems, Singapore, 2016: 1377–1378. YUAN Yuan, SUN Fuchun, and LIU Huaping. Resilient control of cyber-physical systems against intelligent attacker: A hierarchal stackelberg game approach[J]. International Journal of Systems Science, 2016, 47(9): 2067–2077. doi: 10.1080/00207721.2014.973467
表(1)
计量
- 文章访问数: 2587
- HTML全文浏览量: 1165
- PDF下载量: 282
- 被引次数: 0