Current Articles

2026, Volume 48, Issue 5

2026, 48(5): 1-1.
Abstract:
2026, 48(5): 1-4.
Abstract:
Excellence Leadership Column
A Fast and Accurate Programming Strategy for Analog In-Memory Computing Validated With a Transposable RRAM Macro and 0.64% Fully-Parallel RMS Error
XIE Lifan, WEI Songtao, YAO Peng, WU Dong, TANG Jianshi, QIAN He, GAO Bin, WU Huaqiang
2026, 48(5): 1875-1883. doi: 10.11999/JEIT251174
Abstract:
  Objective  Non-Volatile Memory (NVM)-based Compute-in-Memory (CIM) is considered a promising candidate for next-generation artificial intelligence accelerators because of its high energy efficiency and instant wake-up capability. However, the conventional Write-and-Verify (W&V) scheme cannot satisfy the speed and precision requirements of highly parallel CIM macros. The main limitation arises from the inefficient verification stage. Cell-by-cell reading must be repeated for the entire array, which significantly increases programming time. In addition, switching from the verify state, where only one row is active, to the compute state, where all rows are active, introduces systematic errors such as reference drift and IR-drop-induced weight inaccuracy. Analog CIM macros with on-chip programming must also tolerate large and non-uniform offsets under massive parallel operation. This work proposes three techniques: (1) a Back-Propagation-Assisted Programming (BPAP) scheme that rapidly and accurately locates failing cells without full-array verification; (2) an Analog-domain Offset-Canceling Structure (AOSC) that compensates channel-wise offsets in situ; and (3) a transposable Resistive Random-Access Memory (RRAM) macro equipped with parallel Two-Channel current-domain Analog-to-Digital Converters (TC-ADC), which doubles the effective sampling rate with only 15% additional ADC area.  Methods  As shown in Fig. 2, the transposable RRAM macro contains two processing elements (PEs) and a shared backward-processing ADC (BP-ADC). Each PE includes an input loader (IL), a Digital-to-Analog Converter (DAC) array, a Bit-Line (BL) buffer and switch array, and 32 TC-ADCs. This configuration supports fully parallel forward computation. An Error Loader (EL) and a Source-Line (SL) buffer are also included to provide an error input vector for transposed matrix-vector multiplication (MVM). Fig. 3 illustrates the programming flow of the BPAP scheme. 
After AOSC calibration, a forward calculation is first executed. The differences between the expected outputs (yexp) and the measured outputs (yreal) are then computed on chip and used as inputs for the following back-propagation phase. The derivatives of the RRAM weights are calculated using several validation patterns. This training-like process adapts to the actual RRAM states and detects programming failures under the highly parallel computing condition. Weights with derivatives exceeding a predefined error threshold are selected for remapping. This approach enables accurate programming without performing cell-by-cell verification across the entire array. In the forward phase (Fig. 4a), each 2T2R cell is configured as a signed weight, and the SLs are clamped at VCM by the TC-ADCs. For each PE, a fully parallel 4b-IN/4b-W MVM operation is completed with 320 active rows of 2T2R cells, and 32 ADCs perform simultaneous conversions. In the backward phase (Fig. 4b), only the upper half of the reference voltages drives the SL buffers, and the weight is configured in 1T1R mode. Differential computation between the positive and negative 1T1R cells is performed by an external processor. Fig. 5 shows the operation of the AOSC scheme. Redundant rows in the RRAM array are programmed to compensate the analog computing offsets in situ. Offset currents are first measured by applying an all-zero input pattern to the regular weights. The redundant RRAM weights are then programmed to minimize the offset currents under a constant input voltage. During normal computation, these programmed redundancy rows receive the same input voltage to cancel the offsets. The macro supports this AOSC operation with only about 1% additional array area. Fig. 6 shows the TC-ADC architecture. A class-AB output stage, together with associated switches and capacitors, enables two-channel conversion and reduces the computation latency by half. 
This design increases the ADC area by only about 15% while achieving a 2× sampling rate.  Conclusions  Replacing the conventional W&V procedure with BPAP, together with AOSC calibration and TC-ADC acceleration, enables reliable and high-precision programming of analog RRAM-CIM macros under massive parallel operation. The measured results show 96.5% classification accuracy on MNIST and a 4.8% improvement on ImageNet. The proposed techniques are compatible with standard 2T2R and 1T1R RRAM bit cells and can be extended to larger arrays and deeper neural networks.
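The BPAP flow described in the abstract above (forward pass, on-chip error, backward pass, threshold selection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the macro is replaced by a software `measure` callback, the validation patterns are assumed to be orthogonal ±1 Hadamard rows so that the averaged derivative equals the weight deviation, and `hadamard`, `locate_failing_cells`, and the threshold value are hypothetical names and numbers chosen for illustration.

```python
import numpy as np


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def locate_failing_cells(w_target, measure, patterns, threshold):
    """Flag weights whose averaged error derivative exceeds `threshold`.

    w_target : (rows, cols) intended weights
    measure  : callable x -> measured output vector (stand-in for the macro)
    patterns : iterable of input vectors of length `rows`
    """
    grad = np.zeros_like(w_target, dtype=float)
    for x in patterns:
        y_exp = w_target.T @ x        # expected output from the ideal weights
        y_real = measure(x)           # forward pass on the (simulated) array
        err = y_real - y_exp          # error fed to the backward phase
        grad += np.outer(x, err)      # dL/dW for L = 0.5 * ||err||^2
    grad /= len(patterns)
    return np.abs(grad) > threshold, grad


# Simulated array with one programming failure (assumed values).
rng = np.random.default_rng(0)
w_target = rng.integers(-7, 8, size=(8, 4)).astype(float)
w_actual = w_target.copy()
w_actual[3, 1] += 2.0

failing, grad = locate_failing_cells(
    w_target, lambda x: w_actual.T @ x, hadamard(8), threshold=0.5
)
print(np.argwhere(failing))  # indices of cells selected for remapping
```

With correlated or unsigned patterns the derivative still isolates the failing column, but the rows are mixed by the input correlation, so the choice of validation patterns affects how sharply a failing cell can be located.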
Circuit and System Design
A Miniaturized Steady-State Visual Evoked Potential Brain-Computer Interface System
CAI Yu, WANG Junyang, JIANG Chuanli, LUO Ruixin, LÜ Zhengchao, YU Haiqing, HUANG Yongzhi, ZHONG Ziping, XU Minpeng
2026, 48(5): 1884-1893. doi: 10.11999/JEIT251223
Abstract:
  Objective  The practical use of Brain-Computer Interface (BCI) systems in daily settings is limited by bulky acquisition hardware and the cables required for stable performance. Although portable systems exist, achieving compact hardware, full mobility, and high decoding performance at the same time remains difficult. This study aims to design, implement, and validate a wearable Steady-State Visual Evoked Potential (SSVEP) BCI system. The goal is to create an integrated system with ultra-miniaturized and concealable acquisition hardware and a stable cable-free architecture, and to show that this approach provides online performance comparable with laboratory systems.  Methods  A system-level solution was developed based on a distributed architecture to support wearability and hardware simplification. The core component is an ultra-miniaturized acquisition node. Each node functions as an independent EEG acquisition unit and integrates a Bluetooth Low Energy (BLE) system-on-chip (CC2640R2F), a high-precision analog-to-digital converter (ADS1291), a battery, and an electrode in one encapsulated module. Through an optimized 6-layer PCB design and stacked assembly, the module size was reduced to 15.12 mm × 14.08 mm × 14.31 mm (3.05 cm³) with a weight of 3.7 g. Each node uses one active electrode, and all nodes share a common reference electrode connected by a thin short wire. This structure reduces scalp connections and allows concealed placement in hair using a hair-clip form factor. Multiple nodes form a star network coordinated by a master device that manages communication with a stimulus computer. A cable-free synchronization strategy was implemented to handle timing uncertainties in distributed wireless operation. Hardware-event detection and software-based clock management were combined to align stimulus markers with multi-channel EEG data without dedicated synchronization cables. 
The master device coordinates this process and streams synchronized data to the computer for real-time processing. System evaluation was conducted in two phases. Foundational performance metrics included physical characteristics, electrical parameters (input-referred noise: 3.91 mVpp; common-mode rejection ratio: 132.99 dB), and synchronization accuracy under different network scales. Application-level performance was assessed using a 40-command online SSVEP spelling task with six subjects in an unshielded room with common RF interference. Four nodes were placed at Pz, PO3, PO4, and Oz. EEG epochs (0.14–3.14 s post-stimulus) were analyzed using Canonical Correlation Analysis (CCA) and ensemble Task-Related Component Analysis (e-TRCA) to compute recognition accuracy and Information Transfer Rate (ITR).  Results and Discussions  The system met its design objectives. Each acquisition node achieved an ultra-compact form factor (3.05 cm³, 3.7 g) suitable for concealed wear and provided more than 5 hours of battery life at a 1 000 Hz sampling rate. Electrical performance supported high-quality SSVEP acquisition. The cable-free synchronization strategy ensured stable operation. More than 95% of event markers aligned with the EEG stream with less than 1 ms error (Fig. 4), meeting SSVEP-BCI requirements. This stability supported the quality of recorded neural signals. Grand-averaged SSVEP responses showed clear and stable waveforms with precise phase alignment (Fig. 5). The signal-to-noise ratio at the fundamental stimulation frequency exceeded 10 dB for all 40 commands (Fig. 6). In the online spelling experiment, the system showed strong decoding performance. With the e-TRCA algorithm and a 3-s window, the average accuracy was (95.00 ± 2.04)%. The system reached a peak ITR of (147.24 ± 30.52) bit/min with a 0.4-s data length (Fig. 7). 
Comparison with existing SSVEP-BCI systems (Table 1) indicates that, despite constraints of miniaturization, cable-free use, and four channels, the system achieved accuracy comparable with several cable-dependent laboratory systems while offering improved wearability.  Conclusions  This work presents a wearable SSVEP-BCI system that integrates ultra-miniaturized hardware with a distributed cable-free architecture. The results show that coordinated hardware and system design can overcome tradeoffs between device size, user mobility, and decoding capability. The acquisition node (3.7 g, 3.05 cm³) supports concealable wearability, and the synchronization strategy provides reliable cable-free operation. In a realistic environment, the system produced online performance comparable with many cable-dependent setups, achieving 95.00% accuracy and a peak ITR of 147.24 bit/min in a 40-target task. Therefore, this study provides a practical system-level solution that supports progress toward wearable high-performance BCIs.
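The abstract above reports ITR values but does not state the formula used. A common choice for SSVEP spellers is the Wolpaw definition; the sketch below assumes that definition and treats the selection time (data length plus any gaze-shift interval) as a free parameter, so it is not expected to reproduce the reported figures without the paper's exact timing. The function name `itr_bits_per_min` is hypothetical.

```python
import math


def itr_bits_per_min(n_targets, accuracy, selection_time_s):
    """Wolpaw information transfer rate in bit/min.

    n_targets        : number of selectable commands (N)
    accuracy         : classification accuracy P, with 0 < P <= 1
    selection_time_s : time per selection in seconds (assumed to include
                       any gaze-shift interval)
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    if selection_time_s <= 0:
        raise ValueError("selection time must be positive")

    p = accuracy
    bits = math.log2(n_targets)
    if p < 1.0:
        # The error term vanishes in the limit P -> 1, so it is skipped there.
        bits += p * math.log2(p)
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits * 60.0 / selection_time_s


# Example call: 40 targets, 95% accuracy, 3 s per selection (no gaze shift).
print(round(itr_bits_per_min(40, 0.95, 3.0), 2))
```

Because the rate scales with the inverse of the selection time, short data lengths can yield a higher ITR than long windows even when accuracy is lower, which is consistent with the peak ITR being reported at a 0.4-s data length.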
Design of an Aerospace-grade Radiation-hardened SRAM Cell for High-speed Read/Write Applications
CAI Shuo, SHUAI Wei, HU Xing, LIANG Xinjie, HUANG Zhu, YU Fei
2026, 48(5): 1894-1904. doi: 10.11999/JEIT251287
Abstract:
  Objective  With the continued scaling of Complementary Metal-Oxide-Semiconductor (CMOS) technology nodes and the reduction in supply voltage, Static Random Access Memory (SRAM) in aerospace environments becomes increasingly sensitive to high-energy particle radiation and is prone to Single-Node Upset (SNU) and Double-Node Upset (DNU). This sensitivity poses a serious challenge to the reliability of spaceborne Systems-on-Chip (SoC). Existing Radiation-Hardened-By-Design (RHBD) structures, however, usually cannot balance strong radiation tolerance with high-speed access performance. This work therefore aims to design an aerospace-grade radiation-hardened SRAM cell for high-speed read/write applications that provides both strong radiation resistance and fast access performance.  Methods  The proposed Read Fast and Write Fast 16-Transistor (RFWF16T) SRAM is built on a dual-source isolation architecture composed of 16 transistors (8 PMOS and 8 NMOS) (Fig. 1, Fig. 2). By using a symmetric recovery mechanism, the RFWF16T reduces the number of key sensitive nodes to only two. Redundant transistors (P2 and P6) are used to establish a stable high-level isolation state, which isolates the storage nodes from potential disturbances during the non-access phase. To achieve high-speed operation, the RFWF16T combines a short feedback path with a low-impedance voltage discharge loop. Unlike conventional hardened cells that rely on stacked transistors, which increase resistance and delay, the RFWF16T adopts a parallel access topology connected to word lines and bit lines. This configuration forms a low-impedance path during write operations and significantly accelerates node voltage switching (Fig. 3). Performance verification confirms the self-recovery capability of the four data nodes. A comprehensive variation analysis is conducted, including Process-Voltage-Temperature (PVT) variations and 2,000-point Monte Carlo simulations. 
Additionally, an improved Electrical Quality Metric (EQM) is proposed to evaluate multidimensional performance quantitatively.  Results and Discussions  The RFWF16T exhibits strong overall performance, particularly in overcoming the speed bottleneck of hardened SRAM cells. In terms of access speed, the RFWF16T performs substantially better than typical models such as S8P8N, SAW16T, and RH20T. Under standard conditions (28 nm CMOS process, 1.0 V, 25 °C, TT corner), the RFWF16T achieves a Read Access Time (RAT) of 20.97 ps and a Write Access Time (WAT) of 2.72 ps. These values correspond to average speed improvements of 46.65% and 14.77%, respectively, over eight comparable hardened structures (Table 2). PVT analysis confirms that the RFWF16T maintains the lowest latency across voltages from 0.7 V to 1.1 V and temperatures from –25 °C to 125 °C (Fig. 6). This write-speed advantage is attributed to the removal of write contention through optimized discharge paths. In terms of noise margin and stability, the RFWF16T demonstrates strong robustness and achieves the highest Write Word-line Toggle Voltage (WWTV) among nine comparative structures. Its Hold Static Noise Margin (HSNM) and Read Static Noise Margin (RSNM) also rank among the best, which ensures stability under disturbances (Fig. 7). In radiation hardening, the RFWF16T achieves a 100% self-recovery rate for SNUs and an 83.3% recovery rate for DNUs, reaching the state-of-the-art level among DNU-recoverable units (Table 1). Monte Carlo simulations confirm that the average recovery times of the internal nodes range from 1.09 ns to 1.19 ns (Fig. 4, Fig. 5). In terms of overhead, the RFWF16T maintains a normalized area of 1.00× (4.3 μm × 1.9 μm) (Table 3, Fig. 2) and an average power consumption of 23.45 nW (Table 4). Although the power consumption is slightly higher, this increase is a reasonable trade-off for the substantial speed advantage. 
In the EQM evaluation, the RFWF16T obtains the highest score, which confirms its overall advantage in balancing reliability, speed, and stability (Fig. 9).  Conclusions  A radiation-hardened SRAM cell, RFWF16T, is proposed for aerospace-grade high-speed read/write applications. The cell contains only two sensitive nodes and achieves 100% self-recovery for SNUs and an 83.3% recovery rate for DNUs, which demonstrates strong radiation tolerance. Compared with eight other SRAM cells, the RFWF16T significantly reduces read and write delay with only a slight increase in area and power consumption, while maintaining good noise immunity and the best electrical quality metric. PVT and Monte Carlo simulations further confirm the stability and robustness of the proposed cell under different operating conditions. Future work will focus on array-level integration and tape-out verification, and on its application in satellite-borne high-speed data processing.
Design of a Narrowband Energy-Selective Protective Antenna Integrating Electromagnetic Protection and Out-of-Band Interference Suppression
GAI Longjie, XU Yanlin, WANG Sijun, LIU Peiguo, HU Ning, HE Zhengwei
2026, 48(5): 1905-1915. doi: 10.11999/JEIT251363
Abstract:
  Objective  With the rapid development of wireless communication technologies, the ElectroMagnetic (EM) environment is becoming increasingly complex. Electronic information equipment is facing growing challenges from High-Intensity Radiation Fields (HIRFs) and out-of-band interference. This trend makes the co-design of EM protection and out-of-band interference suppression in electronic information systems an urgent issue. As the front end of the radio-frequency channel, antennas provide the main path by which EM waves in free space are converted into guided waves in microwave circuits. High-power EM waves can couple into a system through an antenna and cause EM damage. In single-frequency applications, if an antenna does not exhibit narrowband characteristics, out-of-band interference signals may also enter the system through the antenna and disrupt normal operation. A narrowband energy-selective protective antenna should therefore be developed to provide both out-of-band interference suppression and in-band EM protection against strong EM threats, thereby improving the operational stability and environmental adaptability of electronic information equipment in complex EM environments.  Methods  A coaxial-fed microstrip patch antenna is designed, and its structure is optimized through simulation for operation at 915 MHz. The antenna structure is designed to provide both narrowband behavior and EM protection, thereby achieving integrated EM protection and out-of-band interference suppression. A high dielectric constant is used to support both antenna miniaturization and narrowband operation. Accordingly, a TP-2 substrate with a dielectric constant of 20 is selected to obtain the required narrowband response. In a conventional coaxial-fed microstrip patch antenna, the probe passes directly through the dielectric substrate and connects to the radiating patch, which leaves insufficient space for the integration of a protective structure. 
To solve this problem, a layered-substrate design with a central hollow cavity is adopted. This configuration forms a layered cavity protective structure and enables the antenna itself to exhibit energy-selective protection characteristics.  Results and Discussions  To verify the performance of the proposed antenna, physical fabrication and experimental measurements are carried out (Fig. 16). The measured center frequency is 928.5 MHz, and the operating bandwidth is 927.0–930.0 MHz. Although the measured center frequency is shifted by 12.8 MHz from the simulated design value, the antenna still exhibits favorable narrowband characteristics (Fig. 17). The measured radiation pattern agrees well with the simulated result. In the φ = 0° plane, a stable omnidirectional radiation pattern is observed, and the measured maximum gain reaches 2.5 dBi (Fig. 12 and Fig. 18). The Shielding Effectiveness (SE) is measured by a high-power injection method. As the injected power increases, the radiated power increases linearly. When the injected power reaches 22 dBm, the increase in radiated power begins to saturate, which indicates that the diodes in the protective structure start to conduct and that the energy-selective mechanism is activated. As the injected power increases further, the SE rises gradually. When the injected power reaches 48 dBm, the radiated power rises sharply to the level of the original linear radiation curve, and the SE drops abruptly, which indicates diode breakdown and failure of the protective structure. In summary, the activation threshold of the protection function is 26 dBm, and the device failure threshold is 48 dBm. Within this range, the maximum SE reaches 26 dB (Fig. 20).  Conclusions  Based on a coaxial-fed microstrip patch antenna, a narrowband energy-selective protective antenna with integrated EM protection and out-of-band interference suppression is designed and demonstrated. 
The complete process is covered, including theoretical analysis, structural simulation and optimization, prototype fabrication, and experimental verification. First, Characteristic Mode Analysis (CMA) is used to examine the potential operating modes of the microstrip patch antenna. By analyzing the electric- and magnetic-field modal distributions, the impedance-matching characteristics are clarified, and the optimal coaxial feed position is determined. Next, the use of a high-permittivity substrate enables both antenna miniaturization and narrowband performance, and an Interference Suppression Capability (ISC) better than 22.1 dB is achieved. A layered-substrate structure with a central hollow cavity is then proposed, and a cavity-based protective structure integrated into the feed-probe region is established. An equivalent-circuit model is also developed to explain the operating mechanisms of the antenna in both the normal and protective states. Finally, the antenna prototype is fabricated and tested. The measured results show favorable narrowband characteristics, good agreement between the measured and simulated radiation patterns, and a measured maximum gain of 2.5 dBi. In addition, by applying the reciprocity principle and using a high-power injection method for SE testing, a maximum SE of 26 dB is obtained, which confirms the excellent EM protection capability of the antenna. Compared with existing protective antennas, the proposed structure achieves both out-of-band interference suppression and EM protection within the antenna itself. This design advances the integration of frequency-domain interference suppression and energy-domain protection. It should also be noted that the deviation between the measured and simulated center frequencies is caused in part by nonuniform substrate permittivity and fabrication tolerances, which reflects the sensitivity of narrowband antennas to structural parameters. 
In future work, a tunable mechanism may be adopted to develop a frequency-reconfigurable narrowband energy-selective protective antenna, so that frequency deviations can be compensated dynamically and the design robustness and environmental adaptability can be improved.
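The high-power injection measurement described in the abstract above can be read as follows: the low-power region gives a linear injected-versus-radiated curve, and SE at higher power is the gap between that extrapolated line and the measured radiated power. The sketch below assumes this reading, which the abstract implies but does not state as a formula. All numbers are synthetic and are not the paper's measurements; `shielding_effectiveness` and `n_linear` are hypothetical names.

```python
import numpy as np


def shielding_effectiveness(p_in_dbm, p_rad_dbm, n_linear=3):
    """Estimate SE (dB) from an injected-vs-radiated power sweep.

    The first `n_linear` points are assumed to lie in the linear,
    unprotected region. A straight line fitted to them is extrapolated,
    and SE is the extrapolated value minus the measured radiated power.
    """
    p_in = np.asarray(p_in_dbm, dtype=float)
    p_rad = np.asarray(p_rad_dbm, dtype=float)
    slope, intercept = np.polyfit(p_in[:n_linear], p_rad[:n_linear], 1)
    return slope * p_in + intercept - p_rad


# Synthetic sweep: linear up to 20 dBm, then compression once protection acts.
p_in = [0.0, 10.0, 20.0, 30.0, 40.0]
p_rad = [-30.0, -20.0, -10.0, -8.0, -6.0]
print(np.round(shielding_effectiveness(p_in, p_rad), 2))
```

Under this reading, a return of the radiated power to the extrapolated line, as reported at the failure threshold, corresponds to SE falling back toward 0 dB.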
Crosstalk-Free Frequency-Spin Multiplexed Multifunctional Device Realized by Nested Meta-Atoms
ZHANG Ming, DONG Peng, TAO En, YANG Lin, HAN Qi, HE Yuhang, HOU Weimin, LI Kang
2026, 48(5): 1916-1926. doi: 10.11999/JEIT251202
Abstract:
  Objective  To address high fabrication costs and signal crosstalk in existing multidimensional multiplexed metasurfaces, a crosstalk-free, frequency-spin multiplexed single-layer metasurface based on nested bi-spectral meta-atoms is proposed. Two C-shaped split-ring resonators are physically superimposed to target the Ku band (12.5 GHz) and the K band (22 GHz). This configuration enables four fully independent information channels, defined by two frequencies and two spin states, without spatial division or multilayer stacking. The objective is to demonstrate independent, high-performance vortex beam generation and holographic imaging, providing a simplified and cost-effective solution for advanced 6G communication and sensing systems.  Methods  A reflective metal-dielectric-metal metasurface architecture is adopted, in which each unit cell integrates an Outer C-Shaped Split-Ring Resonator (OCSRR) and an Inner C-Shaped Split-Ring Resonator (ICSRR). Parameter sweeps performed using CST Microwave Studio are used to select structures that provide high cross-polarization conversion at the target frequencies while maintaining negligible responses in non-target bands. Independent spin multiplexing is achieved through the combined use of transmission phase and geometric phase, controlled by resonator rotation. Two prototypes are fabricated using printed circuit board technology. MS1 is designed for focused vortex beam generation with topological charges l = +1, +2, +3, and +4, whereas MS2 is designed for holographic imaging of the letters “H”, “B”, “K”, and “D”. Device performance is validated by near-field scanning measurements under oblique incidence using a vector network analyzer.  Results and Discussions  Simulation and experimental results confirm strong frequency selectivity and effective spin decoupling enabled by the nested meta-atom design. 
The OCSRR and ICSRR dominate the electromagnetic responses at 12.5 GHz and 22 GHz, respectively, and exhibit linear superposition behavior with minimal crosstalk. MS1 generates four focused vortex beams with clearly separated topological charges, achieving an average mode purity of 88.25%. MS2 reconstructs four independent and well-defined holographic images with high channel isolation. The close agreement between measured and simulated results demonstrates the robustness of the device and validates the effectiveness of the crosstalk-free design strategy under practical illumination conditions.  Conclusions  A reliable approach for realizing crosstalk-free frequency-spin multiplexed metasurfaces using nested meta-atoms is demonstrated. Simultaneous and independent manipulation of electromagnetic waves across four channels is achieved on a single metasurface layer, substantially reducing design complexity and fabrication cost. The successful demonstration of multi-channel vortex beam generation and holographic imaging indicates strong potential for integrated multifunctional applications in next-generation wireless communication and optical systems.
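As an illustration of the focused vortex beams mentioned in the abstract above, the sketch below computes a target phase map that combines a focusing term with a spiral term of topological charge l. It is a generic textbook profile rather than the authors' design procedure; the focal length, aperture, sign convention, and the function name `focused_vortex_phase` are assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s


def focused_vortex_phase(x, y, freq_hz, focal_len_m, charge):
    """Target phase in radians, wrapped to [0, 2*pi), for a focused vortex.

    focus  : hyperbolic lens profile that converges the beam at focal_len_m
    spiral : charge * azimuth, which imprints the orbital angular momentum
    """
    wavelength = C / freq_hz
    r2 = x ** 2 + y ** 2
    focus = -(2.0 * np.pi / wavelength) * (np.sqrt(r2 + focal_len_m ** 2) - focal_len_m)
    spiral = charge * np.arctan2(y, x)
    return np.mod(focus + spiral, 2.0 * np.pi)


# Example: 12.5 GHz channel, assumed 0.2 m focal length, charge l = +1,
# sampled on an assumed 0.3 m x 0.3 m aperture with 31 x 31 cells.
coords = np.linspace(-0.15, 0.15, 31)
xx, yy = np.meshgrid(coords, coords)
phase_map = focused_vortex_phase(xx, yy, 12.5e9, 0.2, charge=1)
print(phase_map.shape)
```

In a geometric-phase design, a phase of this kind is typically mapped to a resonator rotation angle of half the required phase for the cross-polarized circular component; how that is combined with the transmission phase to decouple the two spins is specific to the paper and is not reproduced here.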
Image and Intelligent Information Processing
Dynamic Scale Perception-Driven Multi-UAV Collaborative 3D Object Detection Method
DUAN Shujing, WANG Zhirui, CHENG Peirui, FU Kun
2026, 48(5): 1927-1935. doi: 10.11999/JEIT251378
Abstract:
  Objective  Multi-UAV collaborative 3D object detection is a core technology for low-altitude intelligent perception, and the Bird’s-Eye View (BEV) feature representation paradigm provides support for global spatial consistency. However, in practical UAV remote-sensing scenarios, targets are extremely small, sparsely distributed, and embedded in a large proportion of background regions. Existing Transformer-based BEV perception methods adopt a homogeneous full-image feature-processing strategy. This strategy not only wastes computing resources because of excessive computation in large background areas, but also tends to dilute small-target features with background noise, making it difficult to balance computational efficiency and detection accuracy. Meanwhile, multi-UAV collaboration requires cross-device information interaction to achieve view complementarity and information gain, but this process is prone to redundant information and even feature conflicts. Traditional fixed-weight aggregation methods cannot accurately identify effective information or suppress redundancy, resulting in poor consistency of global BEV features and reduced collaborative detection accuracy. Therefore, the development of a detection network that is adaptive to multi-UAV aerial scenarios is of clear practical value.  Methods  A dynamic scale-aware detection network is proposed for efficient and accurate 3D object detection through two core modules: the Dynamic Scale-aware BEV Generation (DSBG) module and the Adaptive Collaborative BEV-Feature Aggregation (ACFA) module. The network establishes an end-to-end pipeline of “multi-view image input-dynamic scale adaptive feature encoding-BEV space 3D detection” (Fig. 1). First, the observed images collected by each UAV are processed independently by a parameter-sharing ResNet-50 backbone network to generate feature maps with a consistent structure. 
The DSBG module then takes these feature maps as input, calculates the amplitude of feature responses in each spatial region through the Local Scale-Aware Unit, and estimates the target distribution. On this basis, differentiated BEV grid encoding is dynamically allocated: high-resolution dense grids are assigned to high-response target regions to preserve fine-grained features, whereas low-resolution sparse grids are assigned to low-response background regions to reduce invalid computation. At the same time, target query vectors with spatial position priors are generated. The ACFA module receives the multi-resolution BEV features generated by the DSBG module, concatenates the dual-resolution features from different UAVs in the channel dimension, upsamples the low-resolution features to align them with the high-resolution features, models the local correlations of two-scale features through 3×3 convolution, and obtains a globally consistent BEV feature map through element-wise weighted summation. Finally, the global BEV features are fed into the DETR decoder for 3D target prediction, with Focal Loss used for classification and Smooth L1 Loss used for regression (Eqs. 5–6).  Results and Discussions  Extensive experiments are conducted on two public multi-UAV collaborative simulation datasets, AeroCollab3D and Air-Co-Pred. The results show that the proposed method achieves strong performance on both datasets. Compared with current state-of-the-art methods and baseline models, it improves mean Average Precision (mAP) by up to 7.2 percentage points and substantially reduces the key error metrics: mean localization error, mean orientation error, and mean size error, the last by more than 48%. In particular, clear advantages are observed in small-target detection and fine-grained category recognition, with pedestrian detection accuracy improved by nearly 10 percentage points. 
Ablation experiments verify the effectiveness of both the DSBG and ACFA modules. The proposed method steadily improves detection accuracy while significantly reducing computational cost by up to 41.6%, thereby achieving coordinated optimization of accuracy and efficiency. Visualization results (Fig. 3) show that the predicted bounding boxes have higher spatial alignment with the ground truth, effectively alleviating the common problems of target overlap and missed detection in traditional methods. Fig. 4 further illustrates the technical advantages of multi-UAV collaborative detection. Even for targets occluded by obstacles, the proposed method achieves efficient detection, thereby enhancing the comprehensive perception capability of the global region.  Conclusions  A dynamic scale-aware detection network is proposed for multi-UAV collaborative 3D object detection to address the core challenges of the efficiency-accuracy tradeoff and poor feature consistency in traditional methods. The DSBG module achieves dynamic matching between the BEV encoding scale and target distribution, thereby reducing redundant computation, whereas the ACFA module improves multi-scale and multi-view feature aggregation to ensure global feature consistency and accuracy. Experimental results on two datasets confirm that the proposed method outperforms existing advanced methods in detection accuracy, computational efficiency, and robustness. Future work will focus on optimizing dynamic scale-adjustment strategies with temporal information and exploring multi-sensor fusion with lightweight LiDAR data to improve detection stability in complex scenarios.
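The two ideas in the abstract above, response-driven grid allocation (DSBG) and dual-resolution fusion (ACFA), can be sketched in a few lines of NumPy. This is a simplified stand-in and not the proposed network: the response measure, the relative threshold, the nearest-neighbour upsampling, and the fixed blend weight replace the learned Local Scale-Aware Unit, 3×3 convolution, and learned weights, and all function names are hypothetical.

```python
import numpy as np


def region_response(feat, block):
    """Mean absolute activation per block x block region of a (C, H, W) map.

    H and W are assumed to be multiples of `block`.
    """
    _, h, w = feat.shape
    amp = np.abs(feat).mean(axis=0)  # (H, W) response amplitude
    return amp.reshape(h // block, block, w // block, block).mean(axis=(1, 3))


def allocate_dense_regions(feat, block=4, ratio=0.5):
    """Boolean mask of regions that receive the high-resolution grid.

    A region is dense when its response exceeds `ratio` times the
    strongest regional response (an assumed rule).
    """
    resp = region_response(feat, block)
    return resp > ratio * resp.max()


def fuse_dual_resolution(high, low, weight_high=0.5):
    """Upsample the low-resolution BEV map and blend it with the high one."""
    scale = high.shape[-1] // low.shape[-1]
    low_up = low.repeat(scale, axis=-2).repeat(scale, axis=-1)
    return weight_high * high + (1.0 - weight_high) * low_up


# Toy feature map with one small high-response target region.
feat = np.zeros((8, 16, 16))
feat[:, 4:8, 8:12] = 1.0
dense_mask = allocate_dense_regions(feat)
print(np.argwhere(dense_mask))
```

Only the regions flagged in `dense_mask` would be encoded at fine resolution, which is the mechanism the abstract credits for reducing computation in background areas.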
Multi-agent Reinforcement Learning Method for Trajectory Optimization in Dual-UAV Cooperative Railway Inspection
HUANG Gaoyong, SONG Jun, FANG Xuming, YAN Li, HE Rong
2026, 48(5): 1936-1947. doi: 10.11999/JEIT251321
Abstract:
  Objective  Conventional railway inspection methods, including manual inspection and dedicated inspection vehicles, suffer from low efficiency, limited coverage, and safety risks, especially in hazardous or inaccessible areas. Unmanned Aerial Vehicles (UAVs) offer a promising alternative. However, deployment in strictly regulated railway protection zones remains challenging. In particular, single-UAV inspection is limited by restricted viewpoints, coverage blind spots, and poor data synchronization. To address these issues, this paper proposes a dual-UAV cooperative railway inspection framework. The objective is to jointly optimize the flight trajectories and inspection task sequence of two UAVs to maximize inspection task quality under coupled constraints, including energy consumption, obstacle avoidance, communication-rate constraints, and cooperative synchronization.  Methods  To solve this high-dimensional, non-convex, NP-hard problem, a two-stage hierarchical framework is proposed. In the first stage, the optimal cooperative observation positions for each inspection task are determined. Particle Swarm Optimization (PSO) is used to obtain the optimal three-dimensional coordinates of the two UAVs, thereby improving coverage and inspection quality. In the second stage, continuous trajectory optimization is formulated as a Multi-Agent Deep Reinforcement Learning (MADRL) problem. To improve convergence stability under strong safety constraints, a Risk-Adaptive Exploration Noise Mechanism (RAENM) is incorporated into the training process. The problem is then solved by an improved Multi-Agent Twin Delayed Deep Deterministic policy gradient (MATD3) algorithm under the Centralized Training with Decentralized Execution (CTDE) paradigm. Each UAV is modeled as an independent agent. Its state includes kinematic information, target position, remaining energy, and obstacle distance. Its action space defines the flight control variables. 
A composite reward function is designed to balance multiple objectives, including target approaching, energy saving, obstacle avoidance, railway-protection-zone compliance, and synchronized cooperative arrival.  Results and Discussions  The proposed framework is evaluated through simulations against several baseline algorithms. The results show that the improved MATD3 method achieves faster and more stable convergence, especially as the number of inspection tasks increases. In path planning, it generates more compact trajectories and the shortest total path length. For example, in the two-task scenario, the total path length is reduced to 13,025 m, about 4.5% shorter than that of the next best method. In addition, the proposed method achieves the lowest cumulative energy consumption in all tested scenarios. It also yields the smallest navigation error and the shortest arrival-time difference between the two UAVs at shared inspection points, indicating higher control accuracy and better spatiotemporal coordination. By reducing position deviation and improving synchronization, the proposed method achieves the highest inspection task quality in all evaluation settings.  Conclusions  This paper proposes a two-stage hierarchical framework for dual-UAV cooperative trajectory optimization in railway inspection. The framework combines PSO-based cooperative observation position optimization with improved MATD3-based trajectory learning. Simulation results show that the proposed method outperforms baseline methods in path efficiency, energy saving, cooperative synchronization, and inspection task quality. This study provides support for the deployment of intelligent multi-UAV systems in railway infrastructure inspection. Future work will consider more realistic factors, including communication uncertainty and dynamic environments.
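The abstract does not give the form of the Risk-Adaptive Exploration Noise Mechanism. As a minimal sketch only, the following assumes a linear schedule in which the exploration noise shrinks as the nearest obstacle gets closer. The function names, the `safe_distance` parameter, and the noise bounds are hypothetical and are not taken from the paper.

```python
import random


def risk_adaptive_sigma(obstacle_distance, safe_distance=50.0,
                        sigma_min=0.02, sigma_max=0.2):
    """Hypothetical schedule: smaller exploration noise near obstacles.

    The noise standard deviation grows linearly from sigma_min (distance 0)
    to sigma_max (distance >= safe_distance).
    """
    ratio = max(0.0, min(1.0, obstacle_distance / safe_distance))
    return sigma_min + (sigma_max - sigma_min) * ratio


def noisy_action(action, obstacle_distance, rng=random):
    """Add Gaussian exploration noise to a deterministic policy action.

    Each action component is clipped to [-1, 1], a common (assumed)
    normalization for continuous flight-control variables.
    """
    sigma = risk_adaptive_sigma(obstacle_distance)
    return [max(-1.0, min(1.0, a + rng.gauss(0.0, sigma))) for a in action]
```

Under this assumption an agent explores freely in open airspace and behaves almost deterministically close to obstacles or the protection-zone boundary, which is one plausible way to stabilize training under strong safety constraints.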
Context-Aware Fine-Grained Multimodal Emotion Recognition Based on Mamba
SUN Linhui, CHENG Leyang, YANG Xinyue, CHEN Shuaitong, LI Pingan, SHAO Xi
2026, 48(5): 1948-1959. doi: 10.11999/JEIT251307
Abstract:
  Objective  Multimodal Emotion Recognition (MER) aims to infer human emotional states by integrating speech and text signals. Existing MER methods often fail to use temporal and speaker context effectively and lack fine-grained intra- and inter-modal interaction modeling. These limitations reduce the ability to distinguish similar emotions. This study proposes a Context-Aware Fine-Grained Multimodal Emotion Recognition model based on the Mamba State Space Model (SSM), termed CA-FGMER-Mamba, to improve recognition accuracy in complex scenarios.  Methods  The CA-FGMER-Mamba model consists of five modules. First, text features are encoded using RoBERTa with explicit speaker identity injection and a three-segment contextual input. Audio features are extracted using OpenSMILE and reduced to 512 dimensions. Second, a Bidirectional Gated Recurrent Unit (Bi-GRU) integrates historical and future contextual dependencies. Third, intra-modal fine-grained filtering applies multi-head self-attention to emphasize key emotional cues and suppress redundancy. Fourth, inter-modal fine-grained fusion uses a Mamba SSM module to recalibrate features across time steps. This stage includes higher-order outer-product fusion, mean pooling, and a cross-modal interaction modulation module to adaptively adjust modality contributions. Finally, fused features are processed by a Bi-LSTM, followed by a self-attention layer and a fully connected network for classification. The model is optimized using a joint triplet loss and cross-entropy loss.  Results and Discussions  Experiments are conducted on the IEMOCAP and MELD datasets. On the IEMOCAP four-class task, CA-FGMER-Mamba achieves a Weighted Accuracy (WA) of 0.781 and an Unweighted Accuracy (UA) of 0.790, outperforming seven representative methods. On the six-class task, the model achieves a Weighted F1-score of 0.703 and shows strong performance in distinguishing similar emotions such as “happy” (0.646) and “excited” (0.803). 
On the MELD dataset, the model achieves a Weighted F1-score of 0.665, indicating strong generalization. Ablation experiments confirm that combining intra-modal and inter-modal fusion improves performance.  Conclusions  The CA-FGMER-Mamba model addresses key limitations in existing MER methods by integrating context-aware modeling with fine-grained intra- and inter-modal fusion based on the Mamba SSM. The Bi-GRU with speaker identity enhances modeling of temporal and role-related context and alleviates recency bias. Intra-modal self-attention and Mamba-based inter-modal recalibration improve feature extraction and cross-modal interaction modeling, enabling accurate discrimination of similar emotions. The cross-modal interaction modulation module adaptively adjusts modality contributions and enhances robustness. Experimental results demonstrate strong performance in WA, UA, and Weighted F1-score, with good generalization. Future work will explore multi-scale interaction mechanisms, multi-task learning strategies, and noise-aware modeling to further improve fusion accuracy and robustness.
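The joint objective mentioned in the Methods (triplet loss plus cross-entropy loss) can be sketched as follows. The weighting factor `weight`, the margin, and the use of Euclidean distance are assumptions made for illustration; the abstract does not state them.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss with Euclidean distance."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)


def joint_loss(logits, label, anchor, positive, negative,
               weight=0.5, margin=1.0):
    """Assumed combination: cross-entropy + weight * triplet loss."""
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

The triplet term pulls embeddings of the same emotion together and pushes different emotions apart, which is consistent with the stated goal of separating similar classes such as “happy” and “excited”.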
Facial Expression Recognition Model Based on an Improved YOLO12n
HAN Chuang, HUANG Jingyao, LAN Chaofeng
2026, 48(5): 1960-1973. doi: 10.11999/JEIT250936
Abstract:
  Objective  Facial Expression Recognition (FER) is a key technology in affective computing and intelligent human-computer interaction. In practical scenarios, recognition performance is often degraded by low resolution, complex illumination, partial occlusion, and class imbalance. Although deep learning-based methods have made substantial progress, lightweight models such as You Only Look Once version 12 nano (YOLO12n) still have limited feature extraction ability and reduced robustness under degraded imaging conditions. To address these limitations, this paper proposes an improved FER model, termed YOLO-FER. The model is designed to enhance feature representation, improve the discrimination of similar expressions, and maintain real-time detection performance in low-quality environments.  Methods  Based on the YOLO12n model, YOLO-FER introduces several targeted improvements. First, a C3k2_star module is constructed by embedding NewStarBlock into the original bottleneck structure. This design enhances high-dimensional nonlinear feature representation and alleviates feature loss during fusion, as shown in Fig. 2 and Fig. 3. Second, Multidimensional Collaborative Attention (MCA) is integrated with the A2C2f module to form A2C2f_MCA. This module performs joint modeling across the channel, height, and width dimensions to capture fine-grained facial features (Fig. 4). Third, a Low Resolution Feature Extractor (LRFE) module is placed at the end of the backbone. It enhances pixel-level feature representation under low-resolution and low-light conditions through dilated convolution and pixel attention (Fig. 5). Finally, Adaptive Threshold Focal Loss (ATFL) is used to dynamically adjust the contributions of easy and hard samples. This function mitigates class imbalance and improves the discrimination of similar expressions. The overall model structure is shown in Fig. 1. Experiments are conducted on the RAF-DB and Low Light Dataset (LLD) datasets. 
Precision (P), recall (R), F1 score, and mAP@0.5 are used as evaluation metrics.  Results and Discussions  Extensive experiments show that YOLO-FER outperforms the baseline YOLO12n and other YOLO-series models. As shown in Table 2, on the RAF-DB dataset, YOLO-FER achieves P=81.8%, R=81.9%, and mAP@0.5=87.6%, with a 3.8% improvement in mAP@0.5 over the baseline. On the LLD dataset (Table 3), YOLO-FER achieves an mAP@0.5 of 95.9%, representing a 5.0% improvement. These results indicate strong robustness under low-light conditions. The ablation studies in Table 2 and Table 3 confirm that each proposed module contributes to performance improvement. C3k2_star, A2C2f_MCA, LRFE, and ATFL all lead to consistent gains in detection accuracy. Their combination achieves the best performance with only a slight increase in parameters. The comparison with other YOLO variants in Table 5 further shows that YOLO-FER achieves a favorable balance between accuracy and model complexity. The mAP@0.5 curves in Fig. 8 show that the proposed model maintains consistent performance gains during training. The confusion matrix analysis in Fig. 9 and Table 4 demonstrates that the MCA module improves the discrimination of similar expressions, such as Angry and Disgust, and reduces misclassification. Grad-CAM visualization results (Fig. 13) indicate that YOLO-FER focuses more accurately on key facial regions, including the eyes, eyebrows, and mouth, than the baseline model. Experiments under degraded conditions (Fig. 14 and Table 13) further show that YOLO-FER maintains higher detection performance than YOLO12n and has a smaller overall performance drop. These findings confirm its robustness in low-quality scenarios. Although the number of parameters increases slightly from 2.5 M to 3.0 M, the inference speed remains competitive (Table 7), indicating that the proposed method retains real-time capability.  Conclusions  This paper proposes YOLO-FER, an improved FER model based on YOLO12n. 
The model improves feature extraction and robustness in low-quality image scenarios. By integrating C3k2_star, MCA, LRFE, and ATFL, YOLO-FER improves recognition performance and generalization ability. Experimental results on the RAF-DB and LLD datasets confirm that the model achieves high detection performance while maintaining efficient inference speed. The proposed method provides a practical solution for real-time FER applications in complex environments. Future work will focus on improving performance under extremely low-resolution conditions and exploring cross-domain generalization and micro-expression recognition.
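The abstract lists the F1 score as a metric but reports only precision and recall for RAF-DB. As a worked example, and assuming F1 is computed as the harmonic mean of the reported aggregate values (the paper may instead average per-class F1 scores), the corresponding figure can be derived as follows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for YOLO-FER on RAF-DB: P = 81.8%, R = 81.9%.
f1_raf_db = f1_score(0.818, 0.819)
print(f"F1 = {f1_raf_db:.4f}")
```

Because the two inputs are nearly equal, the harmonic mean lies between them, at roughly 81.85%.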
Mamba-YOWO: An Efficient Spatio-Temporal Representation Framework for Action Detection
MA Li, XIN Jiangbo, WANG Lu, DAI Xinguan, SONG Shuang
2026, 48(5): 1974-1984. doi: 10.11999/JEIT251124
Abstract:
  Objective  Spatio-temporal action detection aims to localize and recognize action instances in untrimmed videos. This task is essential for applications such as intelligent surveillance and human-computer interaction. Existing methods, particularly those based on 3D convolutional neural networks (3D CNNs) or Transformers, often face difficulty balancing computational cost and the ability to model long-range temporal dependencies. The YOWO series provides efficient detection but relies on 3D convolutions with limited receptive fields. The Mamba architecture, based on a Selective State Space Model (SSM) with linear computational complexity, has shown strong capability for long-sequence modeling. This study integrates Mamba into the YOWO framework to improve temporal modeling efficiency and representation ability while reducing computational cost, addressing the limited application of Mamba in spatio-temporal action detection.  Methods  The proposed Mamba-YOWO framework is built on the lightweight YOWOv3 architecture. It adopts a dual-branch heterogeneous design for feature extraction. The 2D branch, derived from YOLOv8 with CSPDarknet and PANet structures, processes keyframes to extract multi-scale spatial features. The temporal branch replaces conventional 3D convolutions with a hierarchical architecture composed of a Stem layer and three stages (Stage1–Stage3). Stage1 and Stage2 apply Patch Merging for spatial downsampling and stack Decomposed Bidirectionally Fractal Mamba (DBFM) blocks. The DBFM block employs a bidirectional Mamba structure to capture temporal dependencies in both past-to-future and future-to-past directions. A Spatio-Temporal Interleaved Scan (STIS) strategy is introduced within the DBFM block. This strategy combines bidirectional temporal scanning with spatial Hilbert quad-directional scanning, enabling serialized video representation while maintaining spatial locality and temporal consistency. 
Stage3 applies 3D average pooling to compress temporal features. An Efficient Multi-scale Spatio-Temporal Fusion (EMSTF) module is designed to integrate features from the 2D and temporal branches. This module applies group convolution–guided hierarchical interaction for preliminary fusion and a parallel dual-branch structure for refined fusion, generating an adaptive spatio-temporal attention map. A lightweight detection head with decoupled classification and regression subnetworks produces the final action tubes.  Results and Discussions  Extensive experiments were conducted on the UCF101-24 and JHMDB datasets. Compared with the YOWOv3/L baseline on UCF101-24, Mamba-YOWO achieved a Frame-mAP of 90.24% and a Video-mAP@0.5 of 87.90%, which correspond to improvements of 2.1% and 6.0%, respectively (Table 1). These improvements were obtained while reducing model parameters by 7.3% and computational cost (GFLOPs) by 5.4%. On JHMDB, Mamba-YOWO achieved a Frame-mAP of 83.2% and a Video-mAP@0.5 of 86.7% (Table 2). Ablation experiments verified the contribution of key components. The optimal number of DBFM blocks in Stage2 was four, whereas additional blocks reduced performance, likely due to overfitting (Table 3).  Conclusions  This study proposes Mamba-YOWO, an efficient spatio-temporal action detection framework that integrates the Mamba architecture into YOWOv3. The model replaces conventional 3D convolutions with a DBFM-based temporal branch that incorporates the STIS scanning strategy, which improves long-range temporal modeling with linear computational complexity. The EMSTF module further improves feature representation through group convolution and dynamic gating mechanisms. Experimental results on UCF101-24 and JHMDB show that Mamba-YOWO achieves higher detection accuracy, such as 90.24% Frame-mAP on UCF101-24, while reducing model parameters and computational cost. 
Future work will examine the theoretical mechanism of Mamba for temporal modeling, extend its capability to longer video sequences, and support lightweight deployment on edge devices.
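To illustrate why a Hilbert scan preserves spatial locality when a frame is serialized for a sequence model, the sketch below uses the standard index-to-coordinate Hilbert mapping and pairs it with forward and backward temporal orders. It shows a single Hilbert direction only; how the paper builds its four spatial directions and interleaves them with time is not described in the abstract, so `scan_video` is a hypothetical simplification.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. This is the standard iterative conversion.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:  # reflect the quadrant
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Visiting order of all n*n patch positions along the curve."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def scan_video(num_frames, n):
    """Hypothetical serialization: (frame, x, y) tokens, forward and backward."""
    forward = [(t, x, y) for t in range(num_frames) for x, y in hilbert_order(n)]
    return forward, forward[::-1]
```

Consecutive tokens within a frame are always neighboring patches, unlike a raster scan, where the last patch of one row is followed by the first patch of the next.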
Spherical Geometry-guided and Frequency-Enhanced Segment Anything Model for 360° Salient Object Detection
CHEN Xiaolei, SHEN Yujie, ZHONG Zhihua
2026, 48(5): 1985-1996. doi: 10.11999/JEIT251254
Abstract:
  Objective  With the rapid development of Virtual Reality (VR) and Augmented Reality (AR) technologies and the increasing demand for omnidirectional visual applications, accurate salient object detection in complex 360° scenes has become critical for system stability and intelligent decision-making. The Segment Anything Model (SAM) demonstrates strong transferability across two-dimensional vision tasks. However, it is primarily designed for planar images and lacks explicit modeling of spherical geometry, which limits its direct application to 360° Salient Object Detection (360° SOD). To address this limitation, this study integrates the generalization capability of SAM with spherical-aware multi-scale geometric modeling to improve 360° SOD. Specifically, a Multi-Cognitive Adapter (MCA), Spherical Geometry Guided Attention (SGGA), and Spatial-Frequency Joint Perception Module (SFJPM) are proposed to enhance multi-scale structural representation, mitigate projection-induced geometric distortions and boundary discontinuities, and strengthen joint global and local feature modeling.  Methods  The proposed 360° SOD framework is built on SAM and consists of an image encoder and a mask decoder. During encoding, spherical geometry modeling is incorporated into patch embedding by mapping image patches onto a unit sphere and explicitly modeling spatial relationships between patch centers. This strategy injects geometric priors into the attention mechanism, which improves sensitivity to non-uniform geometric characteristics and reduces information loss caused by omnidirectional projection distortion. The encoder adopts a partial freezing strategy and is organized into four stages, each containing three encoder blocks. Each block integrates the MCA for multi-scale contextual fusion and the SGGA to model long-range dependencies in spherical space. Multi-level features are concatenated along the channel dimension to form a unified representation. 
The representation is then refined by the SFJPM, which jointly captures spatial structures and frequency-domain global information. The fused features are subsequently fed into the SAM mask decoder. Saliency maps are optimized under ground-truth supervision to achieve accurate object localization and boundary refinement.  Results and Discussions  Experiments are conducted using the PyTorch framework on an RTX 3090 GPU with an input resolution of 512 × 512. Evaluations are performed on two public datasets, 360-SOD and 360-SSOD, and compared with 14 state-of-the-art methods. The proposed approach consistently achieves superior performance across six evaluation metrics. On the 360-SOD dataset, the model achieves a Mean Absolute Error (MAE) of 0.015 2 and a maximum F-measure of 0.849 2, outperforming representative methods such as MDSAM and DPNet. Qualitative results show that the proposed method produces saliency maps that are highly consistent with ground-truth annotations. The model handles challenging scenarios effectively, including projection distortion, boundary discontinuity, multi-object scenes, and complex backgrounds. Ablation studies further show that MCA, SGGA, and SFJPM each contribute to performance improvement and operate complementarily.  Conclusions  This study proposes an SAM-based framework for 360° salient object detection that jointly addresses multi-scale representation, spherical distortion awareness, and spatial-frequency feature modeling. The MCA improves multi-scale feature fusion, the SGGA compensates for Equirectangular Projection (ERP)-induced geometric distortion, and the SFJPM enhances long-range dependency modeling. Extensive experiments verify the effectiveness and feasibility of applying SAM to 360° SOD. Future research will extend this framework to omnidirectional video and multi-modal scenarios to further improve spatiotemporal modeling and scene understanding.
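The idea of mapping patches onto a unit sphere can be sketched with the usual equirectangular-to-sphere conversion. The code below is an illustration under stated assumptions: the longitude and latitude conventions, the patch size of 16, and the use of angular distance as the spatial relationship between patch centers are choices made here, not details given in the abstract.

```python
import math


def erp_to_sphere(u, v, width, height):
    """Map an ERP pixel (u, v) to a point on the unit sphere.

    Assumed convention: u spans longitude [-pi, pi), v spans latitude
    [pi/2, -pi/2] from top to bottom.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def angular_distance(p, q):
    """Great-circle distance (radians) between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


def patch_centers(width, height, patch):
    """Pixel coordinates of the centers of non-overlapping square patches."""
    return [((col + 0.5) * patch, (row + 0.5) * patch)
            for row in range(height // patch)
            for col in range(width // patch)]


# Two patches at the left and right image borders are far apart in the
# plane (496 px) but adjacent on the sphere.
left = erp_to_sphere(8, 256, 512, 512)
right = erp_to_sphere(504, 256, 512, 512)
print(angular_distance(left, right))
```

A distance of this kind could serve as a geometric bias in attention, so that patches separated by the left-right seam of the projection are treated as neighbors.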
Wavelet Transform and Attentional Dual-Path EEG Model for Virtual Reality Motion Sickness Detection
CHEN Yuechi, HUA Chengcheng, DAI Zhian, FU Jingqi, ZHU Min, WANG Qiuyu, YAN Ying, LIU Jia
2026, 48(5): 1997-2007. doi: 10.11999/JEIT251233
Abstract:
  Objective  Virtual Reality Motion Sickness (VRMS) presents a barrier to the wider adoption of immersive Virtual Reality (VR). It is primarily caused by sensory conflict between the vestibular and visual systems. Existing assessments rely on subjective reports that disrupt immersion and do not provide real-time measurements. An objective detection method is therefore needed. This study proposes a dual-path fusion model, the Wavelet Transform ATtentional Network (WTATNet), which integrates wavelet transform and attention mechanisms. WTATNet is designed to classify resting-state ElectroEncephaloGraph (EEG) signals collected before and after VR motion stimulus exposure to support VRMS detection and research on the mechanisms and mitigation strategies.  Methods  WTATNet contains two parallel pathways for EEG feature extraction. The first applies a Two-Dimensional Discrete Wavelet Transform (2D-DWT) to both the time and electrode dimensions of the EEG, reshaping the signal into a two-dimensional matrix based on the spatial layout of the scalp electrodes in horizontal or vertical form. This decomposition captures multi-scale spatiotemporal features, which are then processed using Convolutional Neural Network (CNN) layers. The second pathway applies a one-dimensional CNN for initial filtering followed by a dual-attention structure consisting of a channel attention module and an electrode attention module. These modules recalibrate the importance of features across channels and electrodes to emphasize task-relevant information. Features from both pathways are fused and passed through fully connected layers to classify EEGs into pre-exposure (non-VRMS) and post-exposure (VRMS) states based on subjective questionnaire validation. EEG data were collected from 22 subjects exposed to VRMS using the game “Ultrawings2.” Ten-fold cross-validation was used for training and evaluation with accuracy, precision, recall, and F1-score as metrics.  
Results and Discussions  WTATNet achieved high VRMS-related EEG classification performance, with an average accuracy of 98.39%, F1-score of 98.39%, precision of 98.38%, and recall of 98.40%. It outperformed classical and state-of-the-art EEG models, including ShallowConvNet, EEGNet, Conformer, and FBCNet (Table 2). Ablation experiments (Tables 3 and 4) showed that removing the wavelet transform path, the electrode attention module, or the channel attention module reduced accuracy by 1.78%, 1.36%, and 1.01%, respectively. The 2D-DWT performed better than the one-dimensional DWT, supporting the value of joint spatiotemporal analysis. Experiments with randomized electrode ordering (Table 4) produced lower accuracy than spatially coherent layouts, indicating that 2D-DWT leverages inherent spatial correlations among electrodes. Feature visualizations using t-SNE (Figures 5 and 6) showed that WTATNet produced more discriminative features than baseline and ablated variants.  Conclusions  The dual-path WTATNet model integrates wavelet transform and attention mechanisms to achieve accurate VRMS detection using resting-state EEG. Its design combines interpretable, multi-scale spatiotemporal features from 2D-DWT with adaptive channel-level and electrode-level weighting. The experimental results confirm state-of-the-art performance and show that WTATNet offers an objective, robust, and non-intrusive VRMS detection method. It provides a technical foundation for studies on VRMS neural mechanisms and countermeasure development. WTATNet also shows potential for generalization to other EEG decoding tasks in neuroscience and clinical research.
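A single-level 2D-DWT over the time and electrode dimensions can be sketched as below. The Haar wavelet is assumed purely for simplicity; the abstract does not name the wavelet basis or the number of decomposition levels, and the row/column layout (rows as electrodes, columns as time samples) is likewise an assumption.

```python
import math


def haar_1d(values):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def haar_2d(matrix):
    """One-level 2D Haar DWT of an electrodes x time matrix.

    Returns (ll, lh, hl, hh): the first letter is the filter applied along
    time, the second the filter applied across electrodes.
    """
    low, high = [], []
    for row in matrix:  # filter along the time axis
        a, d = haar_1d(row)
        low.append(a)
        high.append(d)

    def split_columns(block):  # filter across the electrode axis
        a_cols, d_cols = [], []
        for col in transpose(block):
            a, d = haar_1d(col)
            a_cols.append(a)
            d_cols.append(d)
        return transpose(a_cols), transpose(d_cols)

    ll, lh = split_columns(low)
    hl, hh = split_columns(high)
    return ll, lh, hl, hh
```

Because the second filtering step runs across neighboring rows, the result depends on electrode ordering, which is consistent with the reported drop in accuracy under randomized electrode order.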
A Lightweight Semi-supervised Brain Tumor Segmentation Network with Counterfactual Reasoning
FAN Yawen, WANG Chaoyuan, WANG Xin, ZHANG Xinchen, ZHOU Quan
2026, 48(5): 2008-2019. doi: 10.11999/JEIT251130
Abstract:
  Objective  Brain tumor segmentation plays a key role in clinical diagnosis and treatment planning. However, reliable annotation of medical images is costly and time-consuming, which limits the availability of large annotated datasets. To address this problem, this paper proposes a semi-supervised brain tumor segmentation method that combines a lightweight multimodal fusion segmentation network with counterfactual reasoning. The aim is to improve segmentation accuracy while maintaining sufficient efficiency for deployment in resource-limited clinical scenarios.  Methods  A parameter-sharing multimodal encoder-decoder network is designed to reduce model size and computational cost. An anatomical-structure consistency prior is incorporated to improve alignment with brain anatomy. During training, a teacher-student framework is used to generate counterfactual samples from model predictions. These samples guide learning from unlabeled MRI scans through a counterfactual consistency loss that enforces pixel-level consistency and feature-level semantic stability. This strategy helps the model extract structural information from unlabeled data while reducing the risk of boundary distortion caused by conventional data augmentation.  Results and Discussions  Experiments on the BraTS 2019 and BraTS 2021 datasets show that the proposed method consistently outperforms comparison models under limited-label conditions. On BraTS 2019, the proposed method achieves the best average Dice Similarity Coefficient (DSC) of 66.06%, and its average Intersection over Union (IoU) of 53.16% is comparable to those of other models. More importantly, it obtains the lowest average 95% Hausdorff Distance (HD95) of 7.60 mm, representing reductions of approximately 11% and 6% compared with UNet3D and LightMUnet, respectively (Tables 3 and 4). 
On BraTS 2021, the semi-supervised model improves the average DSC and IoU by 4.51% and 5.29%, respectively, and reduces the average HD95 by 0.68 mm compared with the baseline model (Tables 5 and 6). With only 10% labeled data, the proposed method achieves approximately 94% of the fully supervised performance in the main segmentation metrics. The model is also efficient, with only 1.657M parameters, a computational cost of 0.440 2 T, and an inference time of 0.093 7 s (Table 7). These results indicate that the proposed design achieves a favorable balance among segmentation accuracy, computational efficiency, and clinical deployment. The improvement is attributed to both the lightweight multimodal fusion segmentation network and the counterfactual mechanism, which guides the model to learn anatomically meaningful representations.  Conclusions  The proposed framework provides an effective solution for semi-supervised brain tumor segmentation. It balances accuracy, efficiency, and interpretability, and shows that causal reasoning can be integrated into medical image analysis in a practical manner.
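For reference, the two overlap metrics reported above follow the standard definitions, shown here for flattened binary masks. This is a generic sketch, not the authors' evaluation code, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def dice_and_iou(pred, target):
    """Dice Similarity Coefficient and IoU for two flat binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement
    dice = 2.0 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou
```

For a single mask pair the two metrics are linked by IoU = DSC / (2 − DSC); this identity does not carry over to values averaged across cases or tumor sub-regions.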
PSAQNet: A Perceptual Structure Adaptive Quality Network for Authentic Distortion Oriented No-reference Image Quality Assessment
JIA Huizhen, ZHAO Yuxuan, FU Peng, WANG Tonghan
2026, 48(5): 2020-2030. doi: 10.11999/JEIT251220
Abstract:
  Objective  No-Reference Image Quality Assessment (NR-IQA) is critical for practical imaging systems when pristine reference images are unavailable. However, many existing methods face three major challenges: limited robustness under complex distortions, weak generalization when distortion distributions shift (e.g., from synthetic to real-world settings), and insufficient modeling of geometric or structural degradations such as spatially varying blur, misalignment, and texture-structure coupling. These limitations cause models to rely excessively on dataset-specific statistics and reduce their effectiveness when applied to diverse scenes with mixed degradations. To address these issues, the Perceptual Structure Adaptive Quality Network (PSAQNet) is proposed to improve the accuracy and adaptability of NR-IQA under complex distortion conditions.  Methods  PSAQNet is designed as a unified CNN-Transformer framework that preserves hierarchical perceptual cues and supports global context reasoning. Instead of relying on late-stage pooling, distortion evidence is progressively enhanced throughout the network. The architecture contains several key components. The Advanced Distortion Enhanced Module (ADEM) operates on multi-scale features extracted from a pre-trained backbone. It adopts multi-branch gating and a distortion-aware adapter to emphasize degradation-related signals and reduce interference from dominant image content. This mechanism dynamically selects feature branches that correspond to perceptual degradation patterns, which is beneficial for spatially non-uniform or mixed distortions. To model geometric degradations, PSAQNet integrates Spatial-Guided Convolution (SGC) and Channel-Aware Adaptive Kernel convolution (CA_AK). SGC improves spatial sensitivity by guiding convolutional responses with structure-aware cues and focusing on regions where geometric distortions are prominent. 
CA_AK further improves geometric modeling by adaptively adjusting receptive behavior and recalibrating channels to preserve distortion-sensitive components. Additionally, PSAQNet incorporates efficient feature fusion strategies. Group Convolutional Block Attention Module (GroupCBAM) enables lightweight attention-based fusion of multi-level CNN features, whereas AttInjector selectively injects local distortion cues into global Transformer representations. This design allows global semantic reasoning to be guided by localized degradation evidence without introducing redundancy or instability.  Results and Discussions  Extensive experiments on six benchmark datasets containing both synthetic and real-world distortions demonstrate that PSAQNet achieves strong performance and stable agreement with human subjective judgments. The proposed method outperforms several recent approaches, particularly on real-world distortion datasets. These results indicate that PSAQNet effectively enhances distortion evidence, models geometric degradation, and integrates local distortion cues with global semantic representations. Such capabilities improve robustness under distribution shifts and reduce reliance on narrow distortion priors. Ablation studies confirm the contribution of each module. ADEM increases distortion saliency, SGC and CA_AK improve sensitivity to geometric degradations, and GroupCBAM and AttInjector strengthen the interaction between local and global features. Cross-dataset evaluations further demonstrate the generalization capability of PSAQNet across different content categories and distortion types. Scalability experiments also show that the framework benefits from stronger pretrained backbones without compromising its modular design.  Conclusions  PSAQNet addresses several key limitations in NR-IQA by integrating local distortion enhancement, geometric-aware feature modeling, and global semantic fusion within a unified framework. 
The modular architecture improves robustness and generalization across diverse distortion conditions and supports practical deployment in real-world scenarios. Future work will explore vision–language pre-training to improve cross-scene adaptability.
Entropy-driven Adaptive Fusion Network for Scene Classification of High-Resolution Remote Sensing Images
SONG Wanying, LIU Yuchen, WANG Jie, WANG Anyi
2026, 48(5): 2031-2040. doi: 10.11999/JEIT251147
Abstract:
  Objective  Remote sensing image scene classification is intended to assign semantic labels to aerial or satellite images. With the rapid development of Earth observation technologies, high-resolution remote sensing images provide abundant detail but also present major challenges, including complex spatial structures, large scale variations, high intra-class variance, and strong inter-class similarity. Traditional Convolutional Neural Networks (CNNs) have achieved notable success in local spatial modeling, but they cannot adequately capture long-range dependencies because of their fixed receptive fields. To address this limitation, CNN-Transformer hybrid architectures have been proposed to balance local detail and global semantics. However, these models usually adopt simple concatenation for multi-scale feature fusion, which introduces redundancy and reduces discriminability. In addition, although the Swin Transformer uses window-based self-attention to capture contextual information, it still shows clear limitations in the analysis of complex high-resolution images. Specifically, long-range dependency modeling across windows is constrained by the fixed window size. The extraction of fine-grained local features is also limited because deep networks tend to overlook crucial fine-texture information from low- and mid-level features. Moreover, existing multi-level feature fusion strategies lack semantic guidance and therefore readily introduce background noise. Therefore, a network that can balance global contextual modeling and local discriminability while enabling adaptive fusion is still needed.  Methods  To address limited cross-window interaction and the absence of semantic guidance in multi-level feature fusion, an Entropy-driven Adaptive Fusion Swin Transformer (E-AF-ST) network is proposed. 
The architecture uses a lightweight Swin-Tiny backbone and incorporates two key modules: the Attention-guided region Selection and feature Optimization module (ASO) and the Entropy-driven Gated Fusion module (EGF) (Fig. 1). The ASO module addresses weak cross-window interaction and insufficient fine-grained feature extraction in the Swin Transformer through three consecutive stages (Fig. 2a). First, cross-window sparse attention is computed to remove physical window boundaries. By enlarging the patch partition size, sparse attention is applied to the entire image sequence, allowing global contextual correlations across the whole image to be captured. Second, dynamic region selection is performed. On the basis of pixel-level entropy measurement, a multilayer perceptron maps entropy features to attention scores, and a Top-k masking strategy dynamically selects the most informative discriminative regions. Third, recursive feature optimization is performed. Multi-head self-attention and layer normalization are applied at the local scale to progressively enhance boundaries and microstructural information. The EGF module then integrates the Swin Transformer output features, the globally enhanced contextual features, and the locally optimized features to reduce semantic discrepancies (Fig. 2b). First, energy normalization is performed using the Frobenius norm to obtain a probabilistic energy distribution. Next, an entropy-driven gated fusion mechanism calculates the Shannon entropy for each branch. A learnable soft-normalization gating function then maps the entropy information to normalized fusion weights, automatically reducing the weight of branches with high entropy caused by cluttered backgrounds. Finally, the fused representations undergo lightweight recursive optimization using depthwise separable convolutions and GELU activation functions with residual connections to suppress redundant information. 
The forward propagation process is systematically summarized in Algorithm 1.  Results and Discussions  To validate the discriminative capability of the proposed network, extensive experiments were conducted on two widely used public datasets, AID and NWPU-RESISC45. The proposed E-AF-ST network shows superior classification performance compared with existing advanced methods (Table 1). On the AID dataset, the model achieves state-of-the-art overall accuracies of 95.56% and 97.21% at training ratios of 20% and 50%, respectively. On the challenging NWPU-RESISC45 dataset, it achieves the highest accuracies of 92.45% and 94.59% at training ratios of 10% and 20%, respectively. The confusion matrices show that the recognition accuracy of most categories exceeds 95% (Figs. 3, 4), and the misclassification proportions for classes with complex backgrounds are significantly lower than those of the baseline model (Table 2). Visual analysis based on Grad-CAM further confirms the advantages of the E-AF-ST network in global contextual modeling and critical region selection. Compared with the Swin-Tiny baseline, the proposed network demonstrates more precise semantic focus (Fig. 5). In “airport” and “port” scenes, background noise is effectively suppressed and key targets are accurately highlighted. In structurally complex scenes such as “viaducts” and “railway stations”, extension directions and texture characteristics are comprehensively captured. Ablation experiments confirm that the cross-window sparse attention in the ASO module and the dynamic weight allocation in the EGF module are highly complementary. Furthermore, this performance gain is achieved with only a minimal increase in model complexity, with a total of 30.45M parameters and 4.72 GFLOPs.  
Conclusions  An E-AF-ST network is proposed to address insufficient extraction of local discriminative information, cross-scale feature inconsistency, and semantic redundancy in high-resolution remote sensing image scene classification. With information entropy used as a guiding metric, the ASO module enables precise selection and recursive optimization of discriminative regions, whereas the EGF module achieves adaptive and redundancy-reduced integration of multi-source features. Experimental and visual results show that the proposed method effectively reduces interference from complex backgrounds and outperforms existing mainstream CNN-Transformer hybrid architectures. This study provides a new theoretical perspective and technical route for multi-scale target perception and feature semantic alignment.
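The entropy-driven gating idea in the abstract above (energy normalization by the Frobenius norm, per-branch Shannon entropy, then normalized fusion weights that favor low-entropy branches) can be illustrated with a minimal sketch. It is not the authors' EGF module: the function names, the fixed softmax over negative entropy, and the `temperature` parameter are assumptions standing in for the learnable gating function described in the paper.

```python
import numpy as np

def branch_entropy(feat, eps=1e-12):
    """Shannon entropy of a feature map's normalized energy distribution."""
    energy = feat ** 2
    # Dividing by the squared Frobenius norm turns energies into probabilities.
    p = energy / (energy.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def entropy_gated_fusion(branches, temperature=1.0):
    """Fuse same-shaped branches; lower-entropy branches get larger weights.

    Assumption: a plain softmax over negative entropy replaces the paper's
    learnable soft-normalization gate.
    """
    h = np.array([branch_entropy(b) for b in branches])
    logits = -h / temperature
    logits = logits - logits.max()          # numerical stability
    w = np.exp(logits) / np.exp(logits).sum()
    fused = sum(wi * b for wi, b in zip(w, branches))
    return fused, w

# A sharply peaked branch versus a flat (cluttered) branch.
peaked = np.zeros((4, 4))
peaked[0, 0] = 1.0
flat = np.ones((4, 4))
fused, weights = entropy_gated_fusion([peaked, flat])
```

In this toy case the flat branch has the higher entropy, so it receives the smaller fusion weight, which mirrors the stated goal of down-weighting branches dominated by cluttered backgrounds.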
Multimodal Pedestrian Trajectory Prediction with Multi-Scale Spatio-Temporal Group Modeling and Diffusion
KONG Xiangyan, GAO YuLong, WANG Gang
2026, 48(5): 2041-2052. doi: 10.11999/JEIT250900
Abstract:
  Objective  The rapid development of autonomous driving and social robotics has increased the need for accurate pedestrian trajectory prediction to improve safety and interaction efficiency. Existing group-based methods mainly emphasize local spatial interaction and often overlook latent grouping characteristics across time. This study proposes a multi-scale spatiotemporal feature construction method that separates trajectory shape from absolute spatiotemporal coordinates. This enables the model to capture latent group associations across different temporal intervals. A spatiotemporal interaction three-element encoding mechanism is incorporated to extract dynamic relationships between individuals and groups. By integrating the reverse process length mechanism of diffusion models, the system progressively reduces prediction uncertainty. This approach provides an effective solution for multimodal trajectory prediction in complex, crowded scenes and offers theoretical support for improving the accuracy and stability of long-range trajectory forecasting.  Methods  The algorithm performs deep modeling of pedestrian trajectories through multi-scale spatiotemporal group modeling across three components: group construction, interaction modeling, and trajectory generation. First, to address the limitations of methods that focus on local spatiotemporal patterns but overlook cross-dimensional latent characteristics, a multiscale trajectory grouping model is developed. Its core design extracts trajectory offsets to represent trajectory shapes, separating motion features from absolute positions. This enables the system to identify latent group associations among agents who follow similar motion patterns across different periods. Second, a spatiotemporal interaction three-element encoding method is proposed. 
By defining neural interaction strength, interaction categories, and category functions, the method captures detailed individual interactions and the global dynamic evolution of collective behavior. Finally, a diffusion model is introduced for multimodal prediction. Through the reverse process length mechanism, the model converges gradually, reduces uncertainty, and transforms a diffuse prediction space into plausible future trajectories.  Results and Discussions  The proposed model was evaluated against 11 state-of-the-art baselines on the NBA dataset (Table 1). The results show clear advantages in minADE20. It achieves substantial gains over GroupNet+CVAE in long-term prediction tasks, improving minADE20 and minFDE20 by 0.19 and 0.36, respectively, at the 4-second horizon. Although it is slightly inferior to MID in long-term trend prediction, possibly because group dynamics shift rapidly and intensely in NBA scenarios, the model maintains strong instantaneous accuracy. This supports the effectiveness of the multi-scale grouping strategy, which uses historical trajectories to capture complex dynamic interactions. On the ETH/UCY datasets (Table 2), MSGD provides consistent improvements across all five sub-scenes. In the dense and highly interactive UNIV scene, the method exceeds all baselines by leveraging the strengths of multi-scale modeling. Although MSGD is marginally behind PPT in long-distance endpoint constraints, it maintains a lead in minADE20. It also outperforms Trajectory++ in velocity smoothness and directional coherence (std dev: 0.7012) (Table 3), indicating that the generated trajectories maintain natural smoothness aligned with human motion. Ablation studies verify the independent effects of the diffusion model, spatiotemporal feature extraction, and multi-scale grouping modules (Table 4). 
Grouping sensitivity analysis on the NBA dataset shows that full-court grouping (group size 11) enhances long-term stability, reducing minFDE20 by 0.026–0.03 at 4 seconds (Table 5). Configurations with group sizes of 5 or 2 further support the importance of team formations and “one-on-one” local offensive and defensive dynamics (Table 6). Diffusion-step and training-epoch sensitivity analysis reveals a complementary relationship: moderate diffusion steps (30–40) refine denoising and improve accuracy, whereas excessive steps may cause overfitting (Table 7). Qualitative visualization confirms that MSGD generates multimodal trajectories with high overlap with ground truth (Fig. 2).  Conclusions  This study presents a trajectory prediction algorithm that improves performance in two primary ways: (1) it captures pedestrian interactions by extracting spatiotemporal features, and (2) it strengthens collective behavior modeling through multi-scale grouping. Experiments show that the method achieves state-of-the-art performance on the NBA and ETH/UCY datasets, and ablation studies confirm the effectiveness of all modules. Two limitations remain. First, explicit environmental information, such as maps or obstacles, is not yet incorporated. Second, the diffusion model requires substantial computational cost during inference. Future research will address these issues.
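The minADE20 and minFDE20 figures quoted in the abstract above are best-of-K displacement errors with K = 20 sampled trajectories. The sketch below shows how such metrics are commonly computed; the function name and array layout are assumptions for illustration, and the paper's exact evaluation protocol (units, horizon sampling) is not given here.

```python
import numpy as np

def min_ade_fde(samples, truth):
    """Best-of-K average and final displacement errors.

    samples: array of shape (K, T, 2), K predicted trajectories of T steps.
    truth:   array of shape (T, 2), the ground-truth trajectory.
    """
    # Per-sample, per-step Euclidean distance to the ground truth: (K, T).
    dist = np.linalg.norm(samples - truth[None], axis=-1)
    ade = dist.mean(axis=1)   # average displacement error of each sample
    fde = dist[:, -1]         # final-step displacement error of each sample
    # Each metric takes its own minimum over the K samples.
    return float(ade.min()), float(fde.min())

truth = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
sample_a = truth + np.array([0.0, 1.0])   # offset by 1 at every step
sample_b = truth.copy()
sample_b[-1, 1] = 1.5                     # exact except for the endpoint
min_ade, min_fde = min_ade_fde(np.stack([sample_a, sample_b]), truth)
```

In this toy case the two minima come from different samples, which is why a model can lead on minADE20 while trailing on minFDE20, as the abstract reports for some baselines.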
Vision-Guided and Force-Controlled Method for Robotic Screw Assembly
ZHANG Chunyun, MENG Xintong, TAO Tao, ZHOU Huaidong
2026, 48(5): 2053-2065. doi: 10.11999/JEIT251193
Abstract:
  Objective  With the rapid development of intelligent manufacturing and industrial automation, robots are increasingly applied to high-precision assembly tasks, especially screw assembly. However, current systems still face several challenges. The pose of assembly objects is often uncertain, which makes initial localization difficult. Small features such as threaded holes are blurred and difficult to identify accurately. Conventional vision-based open-loop control may also cause assembly deviation or jamming. This study proposes a vision–force cooperative method for robotic screw assembly. The method establishes a closed-loop assembly system that covers coarse positioning and fine alignment. A semantic-enhanced 6D pose estimation algorithm and a lightweight hole detection model are used to improve perception accuracy. Force-feedback control then adjusts the end-effector posture dynamically. This approach improves the accuracy and stability of screw assembly.  Methods  The proposed screw-assembly method is based on a vision-force cooperative strategy that forms a closed-loop process. In the visual perception stage, a semantic-enhanced 6D pose estimation algorithm addresses disturbances and pose uncertainty in complex industrial environments. During initial pose estimation, Grounding DINO and SAM2 generate pixel-level masks that provide semantic priors for the FoundationPose module. In the continuous tracking stage, semantic cues from Grounding DINO support translational correction. To detect small threaded holes, an improved lightweight hole detection algorithm based on NanoDet is designed. It uses MobileNetV3 as the backbone and adds a CircleRefine module in the detection head to estimate hole centers precisely. In the assembly positioning stage, a hierarchical vision-guided strategy is used. The global camera performs coarse positioning for overall guidance, while the hand–eye camera conducts local correction using hole detection results. 
In the closed-loop assembly stage, force-feedback control adjusts the posture to achieve accurate alignment between the screw and the threaded hole.  Results and Discussions  The method is validated experimentally in robotic screw assembly scenarios. The improved 6D pose estimation algorithm reduces the average position error by 18% and the orientation error by 11.7% compared with the baseline (Table 1). The tracking success rate in dynamic sequences increases from 72% to 85% (Table 2). For threaded hole detection, the lightweight NanoDet-based algorithm is evaluated on a dataset collected from assembly environments. It achieves 98.3% precision, 99.2% recall, and 98.7% mAP (Table 3). The model size is 11.7 MB and the computational cost is 2.9 GFLOPs, which are both lower than most benchmark models while maintaining high accuracy. A circular branch is introduced to fit hole edges (Fig. 8), providing accurate center predictions for visual guidance. Under different inclination angles (Fig. 10), the assembly success rate remains above 91.6% (Table 4). For screws of different sizes (M4, M6, and M8), the success rate remains above 90% (Table 5). Under small external disturbances (Fig. 12), the success rates reach 93.3%, 90%, and 83.3% for translational, rotational, and mixed disturbances, respectively (Table 6). Force-feedback comparison experiments show that the success rate is 66.7% under visual guidance alone. With force-feedback control, the rate increases to 96.7% (Table 7). The system maintains stable performance throughout complete screw-assembly cycles and achieves an average cycle time of 9.53 s (Table 8), meeting industrial assembly requirements.  Conclusions  This study presents a vision-force cooperative method that addresses key challenges in robotic screw assembly. The approach enhances target localization accuracy through a semantic-enhanced 6D pose estimation algorithm and a lightweight threaded hole detection network. 
The integration of hierarchical vision guidance and force-feedback control enables precise alignment between screws and threaded holes. Experimental results show that the method ensures reliable assembly under varied conditions, providing a practical solution for intelligent robotic assembly. Future work will focus on adaptive force control, multimodal perception fusion, and intelligent task planning to further improve generalization and self-optimization in complex industrial environments.
Wireless Communication and Internet of Things
Model-Free Adaptive Resilient Control of Vehicle Platoons Against Hybrid Cyberattacks
HAN Qiaoni, MA Jianguo, LI Peng, ZUO Zhiqiang
2026, 48(5): 2066-2076. doi: 10.11999/JEIT251135
Abstract:
  Objective  Connected and automated vehicle platoons represent a key technology for improving traffic efficiency, driving safety, and fuel economy in intelligent transportation systems. Through inter-vehicle information exchange and cooperative control, vehicle platoons achieve safe and efficient car-following operations. However, the strong dependence on vehicular communication networks makes such systems vulnerable to cyberattacks, particularly hybrid threats that combine Denial-of-Service (DoS) and False Data Injection (FDI) attacks. These attacks may interrupt communication or tamper with transmitted information, which threatens the safety and stability of vehicle platoon systems. In addition, vehicle platoon control is affected by environmental disturbances, parametric uncertainties, and nonlinear vehicle dynamics. Existing model-based control methods often experience performance degradation under such complex conditions. Therefore, a resilient data-driven control strategy that does not rely on accurate mechanical models is required. This paper develops an attack-compensated Model-Free Adaptive Control (MFAC) framework to ensure secure and stable operation of heterogeneous nonlinear vehicle platoons under hybrid cyberattacks.  Methods  To address the resilient control problem of connected vehicle platoons under cyberattacks, an MFAC method with attack compensation is proposed for hybrid attacks that include both DoS and FDI attacks. First, a nonlinear longitudinal vehicle dynamics model of the platoon is established. Using the dynamic linearization technique, the model is converted into an equivalent compact-form dynamic linearized data model. This transformation decouples controller design from the specific mechanical model of the vehicle. An output tuning factor is further introduced to balance the tracking of position and velocity states. 
Second, a hybrid attack model is constructed to represent persistent FDI attacks that inject malicious data and aperiodic DoS attacks that interrupt communication. A pseudo-gradient estimator is then designed to capture system dynamics from real-time input-output data. The influence of hybrid attacks on this estimator is analyzed, and an adaptive update strategy is proposed for operation during DoS attacks. Finally, an intelligent attack compensation mechanism is designed. During DoS attack periods, the mechanism utilizes historical control input information to maintain controller operation. This design enables the system to operate continuously even when real-time vehicle state information is unavailable and further improves control performance under DoS attacks.  Results and Discussions  Rigorous theoretical analysis proves that the tracking error of the closed-loop system remains bounded under specific conditions on the frequency and duration of cyberattacks (Theorem 1). Extensive simulations verify the effectiveness of the proposed method. During cyberattacks, the MFAC method with the proposed compensation mechanism adaptively adjusts the attenuation rate of the control input and maintains system control performance (Fig. 3). Follower vehicles successfully track the leader’s velocity variations and maintain the desired inter-vehicle spacing (Fig. 4a, 4b). The tracking error exhibits satisfactory convergence behavior (Fig. 4d), which confirms the stability of the closed-loop system. Comparative studies highlight the role of the compensation mechanism. When the mechanism is disabled, the platoon experiences clear performance degradation during cyberattacks (Fig. 5). In contrast, the proposed method maintains higher tracking accuracy and faster error recovery. Additional simulations analyze the effect of FDI attack intensity. As attack intensity increases, the steady-state error bound expands (Fig. 6). 
This observation quantitatively supports the theoretical robustness analysis and provides useful guidance for determining security thresholds in applications.  Conclusions  This paper advances secure control of heterogeneous nonlinear connected vehicle platoons by proposing an attack-compensated MFAC framework. The framework addresses the combined challenges of hybrid cyberattacks (DoS and FDI attacks) and nonlinear system dynamics. Specifically, three key contributions are made: (1) A data-driven dynamic linearization framework is developed, and an output tuning factor is introduced to enable simultaneous position and velocity tracking based on the nonlinear longitudinal vehicle dynamics model and its equivalent data-based linearized model. (2) A hybrid attack model is established that includes aperiodic DoS attacks that interrupt communication and bounded additive FDI attacks that inject malicious data, capturing their essential characteristics. (3) An intelligent historical input-driven compensation mechanism is designed and integrated with a pseudo-gradient estimator to improve control performance during DoS-induced communication interruptions. Theoretical analysis and simulation results confirm the effectiveness of the proposed method. When attack parameters satisfy specific conditions, the system tracking error remains bounded, and follower vehicles accurately track the leader’s states. The proposed method also achieves better velocity tracking accuracy and faster error convergence than the compensation-free baseline scheme. By focusing on hybrid scenarios with aperiodic DoS and bounded additive FDI attacks, this study provides a practical model-free approach to improve cybersecurity in connected vehicle platoons. Future work will examine more stealthy hybrid attack modes, including non-additive FDI, spoofing, and DoS attacks, to analyze their coupling mechanisms and develop targeted defense strategies. 
In addition, a communication-efficient MFAC strategy that integrates an event-triggered mechanism will be investigated to reduce network load and improve scalability.
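The compact-form dynamic linearization and pseudo-gradient estimation mentioned in the abstract above follow a well-known model-free adaptive control pattern. The sketch below uses the standard single-input single-output update laws for illustration only. The parameter names (`eta`, `mu`, `rho`, `lam`) and the `hold_with_decay` fallback for DoS intervals are assumptions; the paper's actual compensation mechanism and output tuning factor are not reproduced here.

```python
def update_pseudo_gradient(phi_prev, du_prev, dy, eta=0.5, mu=1.0):
    """Estimate the pseudo-gradient from the latest input/output increments.

    phi_prev: previous estimate; du_prev: u(k-1) - u(k-2); dy: y(k) - y(k-1).
    """
    return phi_prev + eta * du_prev / (mu + du_prev ** 2) * (dy - phi_prev * du_prev)

def mfac_control(u_prev, phi, y_ref_next, y, rho=0.6, lam=1.0):
    """Control law driven by the tracking error and the pseudo-gradient."""
    return u_prev + rho * phi / (lam + phi ** 2) * (y_ref_next - y)

def hold_with_decay(u_prev, du_prev, decay=0.5):
    """Hypothetical DoS fallback: reuse the last control increment, attenuated.

    This stands in for the paper's historical-input compensation and is an
    assumption, not the published mechanism.
    """
    du = decay * du_prev
    return u_prev + du, du

# One normal step, followed by one step during a DoS interval.
phi = update_pseudo_gradient(phi_prev=1.0, du_prev=1.0, dy=2.0, eta=1.0, mu=1.0)
u = mfac_control(u_prev=0.0, phi=1.0, y_ref_next=2.0, y=0.0, rho=1.0, lam=1.0)
u_dos, du_dos = hold_with_decay(u_prev=1.0, du_prev=0.4, decay=0.5)
```

Both update laws use only measured increments, which is what lets the controller design avoid an explicit mechanical model of the vehicle.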
A Closed-loop Feedback Adaptive Beam Alignment Algorithm for Shipborne Low Earth Orbit Satellite Communication Terminals
CHEN Haotian, MA Zixian, XIE Xinhong, LI Nayu, LI Baozhu, SONG Chunyi, XU Zhiwei
2026, 48(5): 2077-2088. doi: 10.11999/JEIT251324
Abstract:
  Objective  The 6G-based SATellite COMmunication (SATCOM) network has become a primary solution for ubiquitous and oceanic communications. Compared with traditional Geostationary Earth Orbit (GEO) satellites, the latest generation of Low Earth Orbit (LEO) satellites offers higher throughput, lower end-to-end latency, and lower deployment cost. Phased arrays are therefore widely used in LEO SATCOM because of their beam agility. However, maritime wind-wave disturbances cause nonlinear relative motion between shipborne terminals and LEO satellites, which creates major challenges for high-precision satellite acquisition and tracking. To address this issue, a new beam alignment algorithm is required for LEO SATCOM systems. Such an algorithm should first obtain the instantaneous target state and motion characteristics through target acquisition, and then use a multi-target tracking method to predict satellite trajectories on the basis of the target states, thereby compensating for estimation errors caused by severe coupled motions.  Methods  The proposed closed-loop feedback adaptive beam alignment algorithm consists of two tightly coupled components: target acquisition and target state updating. In the target acquisition stage, a RAnk Reduction Estimator (RARE) is first used to decompose the array factor matrix and convert the original two-dimensional Direction Of Arrival (DOA) estimation problem into two sequential one-dimensional estimation problems. This process greatly reduces the computational complexity of each Sparse Bayesian Learning (SBL) iteration. On the basis of the coarse grid generated by RARE, an Adaptive Newton Sparse Bayesian Learning (ANSBL) method is developed. ANSBL uses block-sparse Bayesian learning to achieve initial target acquisition on the coarse grid, and then performs two-stage Newton refinement to reduce off-grid mismatch. 
This strategy provides high-accuracy DOA estimation in both $\theta$ and $\varphi$ and improves angular observation precision. In the target state updating stage, an Unscented Kalman Filter (UKF)-based ternary joint prediction mechanism is proposed. The UKF simultaneously predicts the target motion state, signal variance, and noise variance for the next target acquisition process. These predicted probability distributions are then used to update the initial grid and hyperparameters of the subsequent SBL acquisition stage, providing more consistent and comprehensive initial values. Through this closed-loop interaction, target acquisition and state tracking are deeply integrated, which substantially reduces the number of SBL iterations required for convergence. This advantage is particularly evident under high sea-state conditions, where reduced beam alignment time is critical.  Results and Discussions  The proposed closed-loop feedback adaptive beam alignment algorithm first uses on-grid DOA estimation to reduce array factor correlation and improve target acquisition efficiency, and then uses Newton iteration to achieve higher off-grid accuracy (Fig. 3). The proposed method is subsequently validated using real ship attitude data collected from a 28000-DWT bulk carrier under actual sea conditions (Fig. 4). The UKF refines the DOA results through state updating. Its predictions of signal position, signal variance, and noise variance provide accurate initial values for the hyperparameters, thereby reducing the number of iterations and enabling faster convergence than other algorithms (Fig. 5). Under low sea-state conditions, the proposed method not only achieves satellite alignment in less than 0.2 s, but also reduces the satellite position estimation error from ±1° to ±0.5° (Fig. 6(a)). 
Under high sea-state conditions, the UKF effectively predicts satellite positions and reduces the satellite position estimation error from ±2.5° to ±0.65°, which verifies the robust tracking accuracy and error mitigation capability of the proposed method in harsh marine environments (Fig. 6(b)).  Conclusions  To meet the performance requirements of beam alignment algorithms for LEO communication satellites, this paper proposes a closed-loop feedback adaptive beam alignment algorithm. The algorithm first uses a block-based SBL algorithm to obtain grid-based DOA estimation results, and then achieves super-resolution direction estimation under off-grid conditions through adaptive Newton iteration. Through the UKF, the estimation results are dynamically calibrated in real time. The UKF further predicts the target motion state, signal variance, and noise variance for the next target acquisition process, thereby improving tracking continuity and alignment accuracy. Numerical simulations show that the proposed algorithm outperforms traditional beam alignment methods in both numerical accuracy and robustness, and effectively mitigates severe terminal shaking under complex sea conditions.
Prior-guided Temporal Fusion Method for Multi-UAV Cooperative Obstacle-avoidance Route Planning
WANG Ao, LI Dapeng, XU Yifan, FAN Bingyang, HAN Guang, ZHAO Haitao
2026, 48(5): 2089-2101. doi: 10.11999/JEIT251231
Abstract:
  Objective  Traditional multi-agent reinforcement learning methods for multi-Unmanned Aerial Vehicle (UAV) cooperative obstacle-avoidance route planning in cluttered 3D environments often suffer from slow convergence, weak coordination, and limited global awareness under partial observability. To address these limitations, this paper proposes a prior-guided temporal fusion value-decomposition framework, termed Prior-Guided-LSTM-QMIX (PGL-QMIX). The method uses local heuristic scores derived from offline A* reference paths to guide decision-making under partial observability. The aim is to reduce route length, avoid collisions, and preserve real-time planning capability.  Methods  The multi-UAV cooperative obstacle-avoidance route-planning task is formulated as a Partially Observable Markov Decision Process (POMDP). In the offline stage, A* is used to generate a reference path for each UAV. During online execution, only the locally visible path segment is extracted, and heuristic scores are constructed from this local prior information and fused with each UAV’s local observation. An individual-level Long Short-Term Memory (LSTM) network is used to capture temporal dependencies in local perception and prior guidance, whereas a system-level LSTM-based mixing network dynamically generates the mixing weights and bias for value decomposition, thereby enabling coordinated joint action-value estimation. Potential-based reward shaping is further adopted to improve training stability.  Results and Discussions  Simulation results in 3D grid environments show that PGL-QMIX converges faster and more stably than QMIX, VDN, and MAPPO. Compared with the corresponding second-best result in each scenario, PGL-QMIX reduces the number of convergence steps by 3.0%, 7.2%, and 7.4%, improves the steady-state task success rate by 1.26, 4.41, and 8.12 percentage points, and shortens the average route length by 6.2%, 8.5%, and 10.0%, respectively. 
In addition, the generated trajectories are shorter and more efficient across different map sizes.  Conclusions  PGL-QMIX improves coordination, safety, and route efficiency for multi-UAV cooperative obstacle avoidance in cluttered 3D environments. By integrating heuristic prior guidance, recurrent temporal fusion, and value decomposition, the proposed method achieves faster convergence, higher success rates, and better generalization than existing baselines. Future work will incorporate real UAV dynamic constraints and communication-aware cooperative obstacle avoidance.
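The abstract above mentions potential-based reward shaping as a training stabilizer. In its standard form, the shaping term is the discounted potential of the next state minus the potential of the current state. The sketch below illustrates that form only; the choice of potential (negative Euclidean distance to a goal cell) and all names are assumptions, since the abstract does not state which potential PGL-QMIX uses.

```python
import math

def potential(pos, goal):
    """Hypothetical potential: negative Euclidean distance to the goal."""
    return -math.dist(pos, goal)

def shaped_reward(reward, pos, next_pos, goal, gamma=0.99):
    """Environment reward plus the shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_pos, goal) - potential(pos, goal)

goal = (3.0, 4.0, 0.0)
# Jumping from distance 5 straight onto the goal earns a +5 shaping bonus.
r_reach = shaped_reward(-1.0, (0.0, 0.0, 0.0), goal, goal)
# Staying in place with gamma = 1 leaves the reward unchanged.
r_stay = shaped_reward(-1.0, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), goal, gamma=1.0)
```

A shaping term of this difference form is known to leave the optimal policy unchanged while giving denser feedback, which is the usual motivation for adopting it.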
Jointly Improving Information Timeliness and Fidelity under Finite-Blocklength Source Coding in a Wireless IoT System
DUAN Jianxin, ZHANG Tianci, CHEN Zhengchuan, ZHANG Di, ZHU Xu, TIAN Zhong, WANG Min, ZHANG Lütianyang
2026, 48(5): 2102-2112. doi: 10.11999/JEIT251057
Abstract:
  Objective  Wireless Internet of Things (IoT) information update systems are essential for time-sensitive applications. In these systems, timely information delivery with high fidelity is critical for accurate sensing, estimation, and decision-making. However, short-packet transmission and strict latency requirements make classical asymptotic rate-distortion theory insufficient for characterizing practical system performance. Under finite-blocklength source coding, shorter source-coding blocklengths reduce latency but increase distortion, whereas longer source-coding blocklengths improve information fidelity at the cost of higher delay. This leads to a fundamental trade-off between information timeliness and information fidelity, which remains insufficiently characterized in the non-asymptotic regime.  Methods  Age of Information (AoI) and Mean Squared Error (MSE) are used to quantify information timeliness and information fidelity, respectively. Closed-form expressions for time-average AoI and time-average MSE are derived under finite-blocklength source coding. Based on distortion tolerance, excess distortion probability, and transmission rate, a joint optimization problem is formulated to minimize the weighted-sum objective of time-average AoI and time-average MSE. The monotonicity and convexity of the objective function are analyzed with respect to these design variables. An alternating iterative algorithm is then developed to jointly optimize distortion tolerance, excess distortion probability, and transmission rate.  Results and Discussions  Numerical simulations are conducted under different weight settings to examine the trade-off between information timeliness and information fidelity in representative operating scenarios. The proposed framework reveals the effect of finite-blocklength parameters on system performance. The results show that the proposed method balances AoI and MSE under different design priorities. 
At a transmit power of 20, the weighted-sum metric of the scheme with the highest distortion tolerance is improved by approximately 33.7% compared with that of the scheme with the lowest distortion tolerance. The maximum relative error between the theoretical analysis and Monte Carlo simulations remains below 0.3%, verifying the accuracy of the derived analytical expressions.  Conclusions  This paper presents a non-asymptotic analysis of the timeliness-fidelity trade-off in a wireless IoT information update system by explicitly considering finite-blocklength source coding. By treating distortion tolerance, excess distortion probability, and transmission rate as design variables, the proposed framework verifies the necessity of finite-blocklength modeling and the advantage of joint parameter optimization. The results provide theoretical guidance for the design and optimization of timely and high-fidelity wireless IoT systems.
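As a concrete illustration of the timeliness metric used in the abstract above, the sketch below computes a time-average Age of Information from update generation and reception times, then combines it with an MSE value in a weighted sum. The in-order delivery assumption, the averaging window, the convex-combination form of the objective, and all function names are assumptions for illustration; the paper's closed-form expressions are not reproduced here.

```python
def time_average_aoi(gen_times, recv_times):
    """Average age over [recv_times[0], recv_times[-1]].

    Assumes updates are delivered in order, so between two receptions the
    age grows linearly from (recv[i] - gen[i]) to (recv[i + 1] - gen[i]).
    """
    area = 0.0
    for i in range(len(recv_times) - 1):
        start_age = recv_times[i] - gen_times[i]
        end_age = recv_times[i + 1] - gen_times[i]
        # Area under the linear age curve (a trapezoid) on this interval.
        area += (end_age ** 2 - start_age ** 2) / 2.0
    return area / (recv_times[-1] - recv_times[0])

def weighted_objective(aoi, mse, weight):
    """Hypothetical weighted-sum objective with weight in [0, 1]."""
    return weight * aoi + (1.0 - weight) * mse

# Updates generated every 2 time units, each delivered 1 unit later.
aoi = time_average_aoi([0.0, 2.0, 4.0], [1.0, 3.0, 5.0])
cost = weighted_objective(aoi, mse=0.5, weight=0.5)
```

Here the age oscillates between 1 and 3, giving an average of 2. Longer source-coding blocks would increase the delivery delay and therefore the average age, while reducing the MSE term, which is the trade-off the abstract describes.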
Multi-path Resource Allocation for Confidential Services Based on Network Coding and Fragmentation Awareness in EONs
LIU Huanlin, AN Dongxin, CHEN Yong, CHEN Haonan, MA Bing, ZOU Jiachen
2026, 48(5): 2113-2121. doi: 10.11999/JEIT251222
Abstract:
  Objective  Each fiber in Elastic Optical Networks (EONs) provides enormous bandwidth capacity and carries a large volume of services and data. If any element in EONs is eavesdropped on or attacked, even for a short period, a large amount of data may be leaked or lost, which significantly reduces network performance. Moreover, confidential services are increasingly sensitive to data leakage and loss during transmission. Network attacks may therefore compromise a large number of confidential services. Network Coding (NC) combines data from different services using the XOR operation and transmits the coded data through EONs. Decoding is then performed at the receiver to recover the original information, providing a potential method to mitigate data eavesdropping during transmission. However, NC requires encryption constraints in EONs. Specifically, the routing and Frequency Slot (FS) allocation of other services must overlap with those of the confidential service to be encrypted. Therefore, routing and spectrum allocation for confidential services should consider both NC constraints and the efficiency of resource allocation.  Methods  A Multi-path Resource Allocation based on Network Coding and Fragmentation Awareness (MRA-NCFA) method is proposed to support secure and reliable transmission of confidential services under eavesdropping attacks. First, the proposed method applies NC to encrypt service data and adopts multi-path protection to improve transmission reliability. Second, in the routing stage, different strategies are designed for confidential and non-confidential services. For non-confidential services, the objective is to balance network load and improve resource utilization. A path weight function based on path load is designed. This function considers path hop count, the maximum idle spectrum block on the path, and the required FS of the service. The path with the largest function value is selected as the transmission path. 
For confidential services, routing selection focuses on preventing information leakage while considering path resource availability. Therefore, a path cost function based on eavesdropping probability is designed, and a routing strategy that considers this probability is adopted. Finally, different resource allocation strategies are applied. For non-confidential services, the objective is to maximize spectrum efficiency. Spectrum fragmentation should be minimized to maintain resource continuity and consistency. Therefore, a fragmentation-aware spectrum allocation strategy is designed. A fragmentation measurement formula evaluates the effect of service allocation on link resources. For confidential services, encryption constraints and FS matching must be satisfied. Therefore, a spectrum allocation strategy based on FS and fragmentation sensing is designed. This strategy considers both the effect of spectrum fragments and the effect of established service resources, which improves transmission security for confidential services.  Results and Discussions  The proposed MRA-NCFA algorithm achieves the lowest service blocking probability (Fig. 1). During routing selection, both confidential and non-confidential services consider path resource conditions. During resource allocation, fragmentation effects are also considered, which preserves idle resources for subsequent services as much as possible. In addition, confidential services adopt a multi-path transmission method. Large services can be divided into multiple sub-services, which improves spectrum resource utilization. As the number of services increases, the spectrum utilization of the MRA-NCFA algorithm improves significantly. This improvement results from the multi-path transmission mechanism, which divides large services into smaller ones and allows efficient use of small spectrum fragments. 
In addition, both confidential and non-confidential services consider path resource quantity during routing and prefer paths with lower spectrum consumption. During resource allocation, fragmentation effects are considered to avoid generating new fragments, which improves spectrum utilization (Fig. 2). As the number of services increases, the proposed MRA-NCFA algorithm shows the slowest and smallest increase in spectrum fragmentation ratio compared with the other two algorithms. This result occurs because the algorithm combines multi-path transmission with fragmentation-aware resource allocation, which improves the utilization of small spectrum fragments and reduces fragmentation in EONs. Moreover, both confidential and non-confidential services consider fragmentation effects during resource allocation and apply strategies to reduce fragmentation. Therefore, the proposed algorithm performs better than the Survivable Multipath Fragmentation-Sensitive Fragmentation-Aware Routing and Spectrum Assignment (SM-FSFA-RSA) algorithm and the Network Coding-based Routing and Spectrum Allocation (NC-RSA) algorithm (Fig. 3).  Conclusions  This study examines resource allocation for services that require protection against eavesdropping attacks in elastic optical networks. The objective is to satisfy the security requirements of confidential services and reduce spectrum fragmentation. The proposed MRA-NCFA algorithm applies NC to encrypt confidential services and adopts multi-path protection to improve transmission reliability. For non-confidential services, a path weight function based on path resources is designed for routing selection, and fragmentation-aware spectrum metrics are used for resource allocation. For confidential services, a path cost function that considers both path resources and eavesdropping probability is designed for routing selection. 
A bandwidth segmentation strategy based on eavesdropping probability supports multi-path transmission, and an FS and fragmentation sensing function based on encryption constraints is used for spectrum allocation. These mechanisms improve both reliability and security for confidential services. As the number of security-sensitive services on the Internet increases, the proposed MRA-NCFA algorithm can effectively reduce traffic blocking probability and improve spectrum resource utilization.
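The XOR-based coding idea in this abstract can be illustrated with a minimal sketch. It is not the MRA-NCFA algorithm itself: the payloads and the helper name `xor_bytes` are hypothetical, and the sketch assumes the two services carry equal-length data and that the receiver already holds the second payload.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length (pad the shorter one first)")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical payloads: one confidential service and one other service.
confidential = b"secret-data"
other = b"public-data"

# The coded stream is what travels over the shared route and frequency slots.
coded = xor_bytes(confidential, other)

# A receiver that also holds `other` recovers the confidential payload,
# because (c XOR o) XOR o == c.
recovered = xor_bytes(coded, other)
```

An eavesdropper who captures only `coded` sees neither payload in the clear, which is why the abstract requires the routing and FS allocation of the other service to overlap with those of the confidential service.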
Routing and Resource Scheduling Algorithm Driven by Mixture of Experts in Large-scale Heterogeneous Local Power Communication Network
JING Chuanfang, ZHU Xiaorong
2026, 48(5): 2122-2131. doi: 10.11999/JEIT251176
Abstract:
  Objective  Emerging power services, such as distributed energy consumption, place stringent performance requirements on Large-Scale Heterogeneous Local Power Communication Networks (LHLPCNs). Limited communication resources and increasing service demands make it challenging to provide on-demand services and improve network capacity while ensuring Quality of Service (QoS). Conventional routing and resource scheduling algorithms based on optimization or heuristics depend on precise mathematical models and parameters, and their computational cost increases as network size and variables grow. These limitations reduce their adaptability to expanding power application scenarios. Advances in Mixture-of-Experts (MoE) frameworks offer a promising direction because they reduce the need to train task-specific models by using an ensemble of specialized AI experts. Motivated by these challenges, this study proposes an MoE-based routing and resource scheduling algorithm (RASMoE) for LHLPCNs integrating High-Power Line Carrier (HPLC) and Radio Frequency (RF). RASMoE is designed to meet personalized QoS requirements and support more power services within limited resources.  Methods  An optimization problem that minimizes the difference between QoS supply and demand in LHLPCNs is formulated as a 0-1 integer linear programming model considering multimodal links, channels, and modulation methods. To solve this NP-hard problem, a new MoE framework comprising expert networks and gated networks is designed. The framework supports personalized service requirements in terms of data rate, delay, and reliability, while improving convergence. The expert networks include shared and QoS-specific experts that generate optimal next hops and compute allocation strategies for links, channels, and modulation modes between node pairs. The gated networks dynamically combine and reuse these experts to support known and unforeseen service types. 
Extensive comparative experiments are conducted, and RASMoE shows improved resource utilization, reduced delay, and higher reliability relative to multiple baselines.  Results and Discussions  The performance supply-demand differences of five algorithms under varying service numbers are compared (Fig. 3). RASMoE consistently achieves the smallest differences across scenarios due to its gating network, which combines QoS-specific experts to align resource allocation with service requirements. Because control and compute-intensive services have strict delay requirements, their average End-to-End (E2E) latency under different service numbers is evaluated (Fig. 4). The proposed algorithm achieves the lowest average E2E latency because its GAT-enhanced expert networks extract node load states and interact with the network environment in real time through a Multi-Armed Bandit (MAB) mechanism. This supports adaptive allocation strategies. The average reliability of E2E paths for different numbers of control, compute-intensive, and acquisition services is also illustrated (Fig. 5).  Conclusions  This study proposes an MoE-driven routing and resource scheduling algorithm for LHLPCNs. The framework integrates expert networks and a gating network. The expert networks include GAT-based shared experts for E2E path selection and MAB-based QoS-specific experts for adaptive allocation of links, channels, and modulation schemes according to QoS demands and link states. The gated networks orchestrate and reuse these experts to support services with single or multiple QoS requirements, including previously unseen service types. Theoretical analysis shows that the method improves resource utilization in LHLPCNs, with notable advantages in multi-service scenarios characterized by diverse QoS demands. 
Future work will examine integrating the MoE framework with domain-specific models, including power load forecasting and predictive analytics, to enhance the use of renewable energy sources.
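As a rough illustration of how a gating network can combine expert outputs, the sketch below applies softmax weights to per-expert scores. The abstract does not specify the gating function used in RASMoE, so the softmax form, the function names, and the example numbers are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_and_combine(gate_scores, expert_outputs):
    """Weight each expert's output vector by its gate weight and sum them."""
    weights = softmax(gate_scores)
    combined = [0.0] * len(expert_outputs[0])
    for w, out in zip(weights, expert_outputs):
        for i, v in enumerate(out):
            combined[i] += w * v
    return weights, combined


# Hypothetical example: three QoS-specific experts (rate, delay, reliability)
# each score the same three candidate next hops for one service.
expert_outputs = [
    [0.7, 0.2, 0.1],  # rate expert
    [0.1, 0.8, 0.1],  # delay expert
    [0.3, 0.3, 0.4],  # reliability expert
]
gate_scores = [0.5, 2.0, 0.1]  # a delay-sensitive service favours the delay expert

weights, combined = gate_and_combine(gate_scores, expert_outputs)
next_hop = combined.index(max(combined))  # index of the preferred candidate
```

Because the gate weights sum to one, the combined scores stay on the same scale as the individual experts, and a new service type only needs a new set of gate scores rather than a new expert.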
Joint Optimization of Service Placement and Task Offloading for QoS Balancing in Satellite-Terrestrial Integrated Networks
DAI Cuiqin, WANG Hongyun, LIAO Rongpeng, CHEN Qianbin
2026, 48(5): 2132-2143. doi: 10.11999/JEIT251294
Abstract:
  Objective  Satellite-Terrestrial Integrated Networks (STIN) integrate multi-source and multi-dimensional services from terrestrial and satellite networks, providing wide coverage, large capacity, and flexible networking. These features support global coverage and ubiquitous access for diverse services. However, the dynamic topology and heterogeneous, resource-constrained nodes in STIN complicate service placement at satellite-terrestrial edge nodes. This further increases the difficulty of matching user service requests with edge computing resources during task offloading, making it difficult to satisfy Quality of Service (QoS) requirements. To address this issue, a joint optimization scheme for QoS-Balanced Service Placement and Task Offloading (BQSPTO) is proposed. The scheme integrates a Delay, Security, and Privacy-aware QoS (DSPQoS) evaluation model with satellite-terrestrial collaboration, inter-satellite cooperation, and service migration. It enables joint optimization of service placement and task offloading in a cloud-edge-end architecture, while satisfying task latency, security, and privacy requirements.  Methods  The proposed scheme integrates service placement, task offloading, and QoS evaluation into a unified framework. First, a cloud-edge-end collaborative STIN model is constructed, including terminal devices, terrestrial edge servers, satellite edge nodes, and cloud servers. Task security is quantified using the attack avoidance probability derived from key-cracking capability, and task privacy is characterized by usage-pattern privacy and location privacy. A DSPQoS evaluation model is established by combining task completion latency, attack avoidance probability, and privacy level. Second, a service placement strategy is designed based on task popularity prediction and service migration. 
A cloud-edge-end collaborative full offloading strategy is developed by determining offloading locations and multi-node cooperation modes according to QoS performance. Based on the service placement strategy and task offloading decisions, an optimization problem is formulated to maximize the total QoS performance under communication and computation resource constraints. Third, the joint optimization problem is decomposed into service placement and task offloading subproblems. A Non-dominated Sorting Genetic Algorithm II (NSGA-II) is applied to the service placement subproblem, while a hybrid Grey Wolf Optimization (GWO) and Whale Optimization Algorithm (WOA) is applied to the task offloading subproblem. Alternating optimization is employed to iteratively update both decisions and obtain the final solution.  Results and Discussions  The QoS performance of the proposed BQSPTO scheme is evaluated through MATLAB simulations. The cloud-edge-end collaborative task processing model (Fig. 2) and the overall BQSPTO framework (Fig. 3) are analyzed. The proposed scheme is compared with three baseline methods: GWOBQ (Grey Wolf Optimization Algorithm-based BQSPTO Scheme), BSSLM (BQSPTO Scheme Without Service Migration), and HWGWTO (Hybrid Grey Wolf Optimization with Whale Algorithm Fusion for Task Offloading). Results show that BQSPTO achieves faster convergence and better avoids local optima, resulting in higher QoS performance (Fig. 4). Compared with GWOBQ, HWGWTO, and BSSLM, the QoS performance is improved by approximately 2.1%, 5.4%, and 4.8%, respectively. As the number of tasks increases, QoS performance improves for all methods, while BQSPTO consistently achieves the highest performance (Fig. 5). Latency, security, and privacy metrics increase with task volume, and BQSPTO maintains superior performance across these metrics, although trade-offs appear due to multi-objective optimization (Fig. 6). 
QoS performance decreases as the number of malicious users increases, while BQSPTO shows stronger robustness and stability (Fig. 7). As satellite capacity increases, the number of deployable service types grows, and QoS performance improves for all methods. BQSPTO remains superior under different capacity settings (Fig. 8).  Conclusions  A joint optimization scheme for service placement and task offloading in STIN is proposed under multi-objective QoS constraints. The DSPQoS evaluation model integrates latency, security, and privacy into a unified evaluation framework. The joint optimization problem is decomposed and solved using alternating optimization, enabling effective coordination between service placement and task offloading. Simulation results demonstrate that the proposed scheme achieves higher QoS performance, better convergence stability, and improved multi-objective balance under varying task loads, malicious user scales, and satellite capacities.
SCUNet-Based Decoding Algorithm for Rayleigh Fading Channels Integrating Feature Extraction and Recovery Mechanisms
WANG Leijun, WANG Kuan, XIE Jinfa, PENG Xidong, LI Jiawen, CHEN Rongjun
2026, 48(5): 2144-2153. doi: 10.11999/JEIT251138
Abstract:
  Objective  This study examines limitations of conventional Deep Neural Network (DNN) decoding algorithms in Rayleigh fading channels, including constrained performance, limited generalization, and weak fading resistance. To address these issues, a decoding algorithm based on the SCUNet (Swin Conv UNet) architecture, termed SCUNetDec, is proposed. In 6G communication scenarios, wireless channels exhibit strong dynamics and complexity, which restrict the ability of traditional decoding methods to meet requirements for high reliability, low latency, and robustness. Intelligent decoding methods with adaptive feature learning are therefore valuable. SCUNetDec integrates multi-dimensional feature extraction and recovery modules and uses a noise-level map to strengthen channel-state perception. These components enable the network to learn channel characteristics, reduce fading effects, and improve decoding performance. The study provides an approach for intelligent decoding in complex channel environments and supports the development of efficient 6G communication systems.  Methods  The SCUNetDec network combines three mechanisms—data preprocessing, feature extraction and recovery, and noise-level mapping—to enhance signal representation learning and decoding in Rayleigh fading channels. In the preprocessing stage, dimensionality expansion converts the one-dimensional received signal into a two-dimensional feature map, improving structural visibility and supporting spatial correlation learning. The feature extraction and recovery module uses multi-layer convolution and attention mechanisms to capture essential channel features, whereas deconvolution layers and residual connections suppress interference introduced during dimensionality transformation. This improves reconstruction quality and decoding accuracy. 
A noise-level map embeds SNR (Signal to Noise Ratio)-related information aligned with the feature maps, allowing the model to adjust to channel variation and adapt decoding strength. The combined effect of these mechanisms increases noise robustness, generalization, and decoding stability, offering a systematic decoding solution for complex 6G wireless environments.  Results and Discussions  SCUNetDec enhances signal learning and decoding in Rayleigh fading channels through its feature extraction-recovery module and noise-level map. Simulations under different coding schemes validate its effectiveness. For the (7,4) Hamming code, SCUNetDec outperforms conventional DNN decoding and approaches Maximum Likelihood (ML) performance; at BER (Bit Error Rate) = 10^-4, the gap to ML is about 1.5 dB, and at FER (Frame Error Rate) = 10^-3, the gap is about 2.0 dB (Fig. 4). This indicates that SCUNetDec captures complex signal relationships and learns associations between information and parity-check nodes. For the (2,1,3) convolutional code, SCUNetDec performs close to the Viterbi algorithm at BER = 10^-3, with a gap of roughly 2.0 dB, while conventional DNN decoding degrades at high SNRs (Fig. 5). For Polar codes with a rate of 0.5, SCUNetDec shows a gain of about 4.0 dB over successive cancellation (SC) decoding at BER = 10^-4 and maintains an advantage of about 1.0 dB at FER = 10^-3, with SC performing slightly better only in the low-SNR region (Fig. 6). Decoding-time comparisons show that SCUNetDec reduces decoding latency relative to traditional methods (Table 1). Ablation experiments confirm that integrating the feature extraction and recovery modules into SCUNet improves decoding performance (Fig. 7). Overall, results show that SCUNetDec provides robust decoding performance across coding schemes and SNR levels.  Conclusions  This study proposes SCUNetDec to address performance limitations of DNN decoders in Rayleigh fading channels. 
The method enhances SCUNet using signal feature extraction and recovery modules. Simulations and ablation experiments on Hamming, convolutional, and Polar codes show strong generalization capability and effectiveness. Compared with traditional DNN models, SCUNetDec achieves decoding performance close to optimal decoding algorithms and reduces decoding time. These findings indicate that SCUNetDec has practical potential for complex channel environments. Future work will examine fusion of neural and traditional algorithms to balance performance and complexity through dynamic parameter optimization and explore intelligent decoding strategies for long codes. Research will also investigate joint modulation-decoding modeling and end-to-end architectures to improve adaptability under high-order modulation and complex channels.
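For readers unfamiliar with the Maximum Likelihood baseline mentioned for the (7,4) Hamming code, the following is a minimal sketch of brute-force ML decoding over a fading channel. It is not SCUNetDec. It assumes BPSK signalling, real-valued channel gains known at the receiver, Gaussian noise, and one common systematic form of the code; the gains and noise offsets in the example are hypothetical.

```python
from itertools import product


def hamming74_encode(data_bits):
    """Systematic (7,4) Hamming encoder: 4 data bits followed by 3 parity bits."""
    d1, d2, d3, d4 = data_bits
    return (d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4)


# All 16 (message, codeword) pairs; small enough to search exhaustively.
CODEBOOK = {bits: hamming74_encode(bits) for bits in product((0, 1), repeat=4)}


def bpsk(bit):
    """Map bit 0 to +1.0 and bit 1 to -1.0."""
    return 1.0 - 2.0 * bit


def ml_decode(received, gains):
    """Return the message whose faded codeword is closest to `received`.

    With known gains and Gaussian noise, minimising the squared Euclidean
    distance is equivalent to maximising the likelihood.
    """
    best_bits, best_metric = None, float("inf")
    for bits, codeword in CODEBOOK.items():
        metric = sum((y - h * bpsk(c)) ** 2
                     for y, h, c in zip(received, gains, codeword))
        if metric < best_metric:
            best_bits, best_metric = bits, metric
    return best_bits


# Hypothetical fading gains and a small fixed disturbance on every sample.
message = (1, 0, 1, 1)
gains = [0.9, 0.3, 1.2, 0.7, 1.1, 0.5, 0.8]
received = [h * bpsk(c) + 0.1 for h, c in zip(gains, hamming74_encode(message))]
decoded = ml_decode(received, gains)
```

The exhaustive search is feasible here only because the code has 16 codewords; its cost doubles with every added information bit, which is one motivation for learned decoders on longer codes.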
Research on Time Slots Aggregation and Topology Aggregation Model for Unmanned Aerial Vehicle Swarm Overall Time Synchronization
WANG Zhenling, TAO Haihong, WEI Haitao, WANG Zhengyong
2026, 48(5): 2154-2165. doi: 10.11999/JEIT251274
Abstract:
  Objective  Unmanned Aerial Vehicle (UAV) swarms overcome the technical and performance limitations of individual UAVs and enable complex missions that cannot be accomplished by a single platform. High-precision time synchronization among swarm nodes serves as a fundamental requirement for key swarm operations, including resource scheduling, cooperative positioning, and multi-node data fusion. Existing research on UAV time synchronization mainly focuses on improving the accuracy of basic synchronization approaches. However, limitations remain in adapting to topological changes during swarm formation flights and in achieving global synchronization among multiple nodes. As the scale of UAV swarms increases, the connectivity of time-comparison links between nodes during formation flights exhibits clear time-varying characteristics. These characteristics create challenges for maintaining continuous, reliable, and precise overall time synchronization. To address stable formation flight and formation transformation scenarios in different mission stages of UAV swarms, an Observation Time Slots Aggregation (OTSA) model and a Time-Varying Topology Aggregation (TVTA) model are proposed to enhance the robustness of global time synchronization among swarm nodes and to improve Time Synchronization Accuracy (TSA). This study proposes an effective solution for Leader-Following Consistency Time Synchronization (LFCTS) in UAV swarms and provides references for time synchronization applications in heterogeneous and distributed systems.  Methods  Compared with the traditional Quasi Real-time Bidirectional Time Comparison (QRBTC) scheme, the time synchronization method based on the OTSA model fully uses all synchronization signal transmission and reception link resources within each time slot of the system synchronization period. 
Based on the “one transmission and multiple receptions” mechanism of all nodes, the Follower Node (FN) performs direct synchronization or single-hop indirect synchronization with the Leader Node (LN) in each time slot according to the OTSA model. This process produces tens of times more clock-skew observation samples than the traditional QRBTC scheme. The OTSA method improves the robustness of global time synchronization. It also enables secondary data processing using multi-slot synchronization samples, which further improves TSA compared with the QRBTC method. Based on the LFCTS results obtained during the system signal synchronization period, the TVTA model extends the direct comparison and single-hop indirect comparison mechanism of the OTSA model to cross-period multi-hop comparison. This extension addresses overall time synchronization instability caused by the time-varying characteristics of synchronization link relationships during UAV swarm takeoff, assembly, and formation transformation.  Results and Discussions  In the OTSA method, all time-comparison link resources of the total time slots are fully used during the synchronization period (Fig. 2). Based on the constructed error model and simulation analysis, for a UAV swarm with 50 nodes and a time slot allocation of 20 ms, time synchronization using the OTSA model achieves a single-slot TSA of 4.10 to 4.27 ns (Fig. 6). Within a complete time synchronization period, the overall TSA reaches 2.46 to 2.56 ns, which is better than the QRBTC scheme under the same conditions (Fig. 5(a)). The TVTA method uses cross-period synchronization comparison relationships to construct multi-hop time comparison links (Figs. 3 and 4). When the FN obtains external comparison relationships of other nodes through aggregation processing, one-way or two-way Dijkstra’s algorithm is applied to determine the multi-hop comparison link with optimal connectivity. 
Time tracing and comparison with the LN are then completed through edge computing. Error analysis indicates that during UAV swarm takeoff, assembly, and transitions to triangle or rhombus formations, time synchronization based on the TVTA model achieves an overall TSA better than 8.6 ns, which provides stronger global time synchronization capability.  Conclusions  This study addresses the robustness of time synchronization in UAV swarm formation flights. For stable formation flight and formation transformation scenarios during different mission stages, the OTSA and TVTA models are proposed. An error model is constructed and performance is analyzed. The results show the following. (1) The OTSA model improves the robustness of overall time synchronization through direct comparison and single-hop indirect comparison across multiple time slots within one synchronization period. The model achieves an overall TSA better than 2.56 ns and performs better than the traditional QRBTC method. (2) The TVTA model achieves overall UAV swarm time synchronization through multi-hop relay between nodes. Even when time-comparison links change, the model maintains global TSA better than 8.6 ns. (3) These two methods consider the time-varying characteristics of comparison links among UAV swarm nodes and have been verified through small-scale UAV swarm flight tests. They maintain synchronization robustness and performance and provide necessary support for coordinated UAV swarm operations. Future work will focus on practical flight verification, adaptation in complex scenarios, and further improvement of overall synchronization accuracy.
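The multi-hop link selection described for the TVTA model can be sketched with a standard one-way Dijkstra search from a follower node to the leader node. The topology, the node names, and the use of an additive per-hop cost are assumptions made for illustration; the abstract does not state how link quality is weighted.

```python
import heapq


def best_multihop_link(links, leader, follower):
    """Return (path, total_cost) of the cheapest comparison chain to the leader.

    `links` maps each node to a dict {neighbour: hop_cost}. Returns
    (None, inf) when the leader cannot be reached from the follower.
    """
    dist = {follower: 0.0}
    prev = {}
    heap = [(0.0, follower)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == leader:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for nbr, cost in links.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if leader not in dist:
        return None, float("inf")
    path = [leader]
    while path[-1] != follower:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[leader]


# Hypothetical snapshot of comparison links during a formation change.
links = {
    "FN3": {"FN1": 2.0, "FN2": 1.0},
    "FN2": {"FN1": 0.5, "LN": 4.0, "FN3": 1.0},
    "FN1": {"LN": 1.0, "FN3": 2.0},
    "LN": {"FN1": 1.0, "FN2": 4.0},
}
path, cost = best_multihop_link(links, leader="LN", follower="FN3")
```

In this snapshot the three-hop chain through FN2 and FN1 (cost 2.5) beats the two-hop chain through FN1 alone (cost 3.0), which shows why hop count by itself is not a sufficient selection criterion.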
Multi-dimensional Resource Joint Optimization Algorithm for UAV Inspection of Collaborative Tasks of Perception and AI
LI Shiyang, ZHU Xiaorong
2026, 48(5): 2166-2177. doi: 10.11999/JEIT251284
Abstract:
  Objective  With increasing demand for aerial operations, the capabilities of various aircraft are steadily expanding across all airspace levels and multiple industries. The application of Unmanned Aerial Vehicles (UAVs) now spans multiple altitude layers, from low to high altitudes, and covers micro, medium, and large models. UAVs are widely used in public safety, transportation, emergency management, logistics and distribution, geographic surveying and mapping, and other fields, thereby promoting innovation and transformation in production and daily life. Compared with traditional manual inspection, UAV inspection, as an emerging operational approach, can acquire image information that is difficult for the human eye to capture. Labor costs are therefore significantly reduced, and the accuracy and efficiency of inspection operations are improved. However, UAV inspection also creates new challenges for multidimensional resource allocation and task scheduling. In power system inspection, for example, transmission lines are exposed to outdoor environments for long periods and are vulnerable to corrosion, aging, and even damage. Regular inspections are therefore required to ensure operational safety.   Methods  A four-stage multidimensional resource inspection and scheduling collaborative optimization algorithm is proposed. The original optimization problem is decomposed into four subproblems according to the inspection process. After mathematical analysis of each subproblem, a corresponding solution method is proposed. For the node selection problem, a dual-aided Mixed-Integer Linear Programming (MILP) transformation method is used. For the UAV data acquisition problem, a data-driven boundary learning method is adopted. For UAV communication resource allocation, a bandwidth-power joint optimization algorithm based on Successive Convex Approximation (SCA) is used. For node computing power allocation, a lower-bound analytical allocation method is adopted. 
Finally, the original problem is solved by an alternating optimization method across the subproblems, thereby forming the complete algorithm.  Results and Discussions  Simulation results show that the proposed algorithm reduces overall UAV energy consumption compared with the benchmark algorithms. Simulation training is conducted for visual positioning and fault detection services to examine the relationship among compression ratio, data volume, and service performance. Figures 2 to 5 show that fault detection accuracy reaches its optimum at 60% data volume and 60% compression ratio. Visual positioning accuracy reaches its optimum at 80% data volume and 80% compression ratio. Figure 6 shows that the proposed algorithm achieves higher accuracy than the benchmark algorithms for AI services. As shown in Figures 7 and 8, under varying bandwidth, computing power, and other resource conditions, the proposed algorithm consistently performs better than the benchmark algorithms in terms of energy consumption and effectively reduces total energy consumption.  Conclusions  A multidimensional resource joint optimization algorithm is proposed for intelligent UAV inspection with collaborative perception and AI tasks. An optimization problem is formulated with the objective of minimizing UAV energy consumption, using bandwidth, power, computing power, node selection, data volume, and actual compression ratio as variables. The algorithm jointly minimizes UAV energy consumption for two AI services, fault detection and visual localization. Simulation results show that the algorithm reduces total UAV energy consumption and improves model training accuracy. This study focuses on the application scenario of single-UAV inspection. More complex multi-UAV collaborative inspection scenarios can be examined in future work, and additional services can be incorporated for a more comprehensive analysis.
Cell-Free Joint Beamforming and AP-User/Target Association Optimization for Integrated Sensing and Communication
FANG Zhiyu, XIA Xiaochen, XU Kui, WEI Chen, XIE Wei, YE Zilü
2026, 48(5): 2178-2187. doi: 10.11999/JEIT250574
Abstract:
  Objective  Integrated Sensing And Communication (ISAC) is a key technology for Sixth-Generation (6G) networks. The cell-free architecture is a promising regional coverage paradigm for 6G. Cooperation among Access Points (APs) mitigates coverage imbalance, interference, and capacity limitations in conventional cellular systems, while enabling communication and sensing services for low-altitude targets with wide-area continuous coverage. However, existing studies on cell-free systems often rely on statistical channel models, which fail to capture realistic propagation characteristics in complex environments. The global Channel State Information (CSI) required for transmission optimization is difficult to obtain, and instantaneous CSI cannot be guaranteed due to the high mobility of low-altitude targets. To address these issues, a joint beamforming and AP-user/target association optimization method based on a Binary Radio Map (BRM) is proposed. The environmental information provided by the BRM is used to predict channels between APs and users/targets, thereby providing global channel information for joint optimization. On this basis, an ISAC satisfaction-based optimization model is constructed, and an iterative optimization algorithm for beamforming design and AP-user/target association is developed using a genetic algorithm.  Methods  First, the channels between APs and users/targets are predicted using environmental information derived from the BRM. An ISAC satisfaction-based optimization model is then established to unify communication and sensing performance. Due to the coupling between communication and sensing and the non-convex nature of the problem, the optimization problem is decomposed into two subproblems corresponding to communication and sensing beamforming. In each iteration, the beamforming design is reformulated as a Second-Order Cone Program (SOCP) to obtain beamforming matrices that maximize the satisfaction function. 
An iterative solution algorithm is applied to compute the communication and sensing beamforming matrices efficiently. Subsequently, based on the optimized satisfaction function, an AP-user/target association optimization method is designed using a genetic algorithm.  Results and Discussions  Simulation results verify the effectiveness of the BRM-assisted channel prediction and association optimization method. Compared with the conventional AP association method based on the shortest path, the proposed approach reduces the required transmission power by approximately 5 dBm while achieving higher user/target satisfaction (Fig. 7). As the transmission power increases, the satisfaction of users/targets gradually improves and approaches 1. In contrast, under the conventional scheme, a large gap remains between the maximum and minimum satisfaction values at the same transmission power (Fig. 8). When the transmission power is 40 dBm, the proposed method effectively reduces this disparity and balances performance among different users/targets. Although the null-space projection scheme leads to some degradation in sensing performance, the minimum received sensing power remains stable. This indicates that the overall system satisfaction is not affected and that sensing requirements are still satisfied (Fig. 9).  Conclusions  This study addresses the AP-user/target association problem in low-altitude airspace. The BRM is used to predict channels between APs and users/targets and to provide global channel information for joint optimization. By maximizing the minimum user/target satisfaction, ISAC beamforming is optimized, and AP-user/target association is iteratively refined using a genetic algorithm. Simulation results show that the proposed method effectively improves AP-user/target association and enhances integrated communication and sensing performance compared with existing approaches.
Semi-passive Intelligent Reflecting Surface-assisted Integrated Sensing and Communication for Distributed and High-precision Joint Localization
HUANG Yi, XIONG Chaorui, TANG Xiaowei, SHI Yunmei
2026, 48(5): 2188-2198. doi: 10.11999/JEIT251039
Abstract:
  Objective   Integrated Sensing And Communication (ISAC) enables communication and sensing on a shared radio platform, supporting emerging applications such as autonomous driving and smart city infrastructure while improving spectral efficiency and reducing system cost. A key feature of ISAC systems is the reuse of communication signals for sensing and localization, which enables high-precision positioning without dedicated localization pilots. In semi-passive Intelligent Reflecting Surface (IRS)-aided ISAC systems, sensing performance is improved while low hardware complexity and power consumption are maintained. Compared with fully passive IRSs, semi-passive IRSs provide limited signal-processing capability for more flexible beam control, while avoiding the high hardware cost of fully active IRSs. In addition, a semi-passive IRS can cooperate with the sensing array at the Base Station (BS) to form a distributed sensing architecture. Through joint processing of the signals received at the BS and the IRS sensing arrays, the effective sensing aperture is enlarged, which improves the accuracy and robustness of channel-parameter estimation. However, existing studies mainly address fully passive or fully active IRSs in communication scenarios, whereas the sensing capability of semi-passive IRSs and their cooperation with BS arrays for high-precision localization remain insufficiently studied. Therefore, high-precision Three-Dimensional (3D) target localization under semi-passive IRS-assisted cooperative sensing is investigated.  Methods  A semi-passive IRS-assisted ISAC framework is proposed for cooperative 3D target localization. Sensing arrays are deployed at both the BS and IRS to jointly receive target-reflected Orthogonal Frequency Division Multiplexing (OFDM) signals, which are then delivered through reliable backhaul links to a central processor for joint processing. Two localization algorithms are proposed. 
The first is a parameter-decoupled two-step localization method. In this method, the Angle of Arrival (AoA) is first estimated by Fast Fourier Transform (FFT) with a refinement procedure, and the propagation delay is then estimated by the Spatial Smoothing MUltiple SIgnal Classification (MUSIC) algorithm. The target position is subsequently obtained by solving linear equations constructed from the estimated channel parameters and the geometric relationships among the arrays. The second is a Direct Position Determination (DPD) method, in which a maximum-likelihood optimization problem is formulated and a Newton-like algorithm is used to estimate the target position directly. By jointly using prior information, including spatial correlation among arrays, communication symbols, beamforming vectors, and IRS reflection coefficients, this method reduces the error propagation of the two-step localization method and improves localization accuracy and robustness. Furthermore, the Cramér-Rao Lower Bound (CRLB) for target-position estimation is derived under circularly symmetric complex Gaussian noise to provide a theoretical benchmark. Monte Carlo simulations are conducted to verify the proposed algorithms, examine the effect of the Rician K-factor on localization performance, and compare the proposed methods with conventional AoA/ToA-based localization methods.  Results and Discussions  Under the proposed semi-passive IRS-assisted ISAC framework, the two-step localization method achieves statistically efficient channel-parameter estimation, and its estimation error approaches the CRLB at high Signal-to-Noise Ratio (SNR) (Figs. 2-4). At low BS transmit power, severe path loss and noise distortion cause a clear gap between the Root Mean Square Error (RMSE) and the CRLB. As the transmit power increases, the sensing SNR increases and parameter-estimation accuracy is improved. 
Because the target position in the two-step localization method is obtained from linear equations constructed from the estimated channel parameters and known array geometry, the final localization accuracy follows the same trend as the intermediate parameter-estimation performance. However, because of error propagation in the two-stage process, the localization error deviates more clearly from the CRLB (Fig. 5). Increasing the number of OFDM symbols improves localization accuracy, but also increases latency, which indicates a trade-off between accuracy and delay in practical systems. Compared with the two-step localization method, the DPD method achieves higher localization accuracy under the same number of OFDM symbols (Fig. 5). By jointly processing the signals received from all sensing arrays and directly optimizing the target position under the maximum-likelihood criterion, error propagation is effectively avoided. In addition, spatial correlation among arrays, communication symbols, beamforming vectors, and IRS reflection coefficients are fully used, which further improves estimation performance. For the same localization accuracy, the DPD method requires fewer OFDM symbols or lower transmit power than the two-step localization method, which shows clear advantages in latency and energy efficiency. Simulation results also show that both proposed methods benefit from a larger Rician K-factor (Fig. 6), because a stronger line-of-sight component suppresses multipath interference. This effect is more evident in the high-SNR region, where small-scale fading becomes the main factor limiting performance. Finally, compared with conventional AoA/ToA-based localization methods, the proposed methods provide better localization accuracy and robustness (Fig. 7).  Conclusions  A semi-passive IRS-assisted ISAC system is proposed for 3D cooperative localization with reduced localization pilot overhead. 
Two localization algorithms are developed: a low-complexity two-step localization method and a high-accuracy DPD method. The theoretical performance limit is established through derivation of the CRLB. Simulation results verify that the two-step localization method enables high-precision localization, whereas the DPD method provides better performance, and its RMSE approaches the CRLB at high SNR. Both methods also show good scalability and robustness. Future work will address multi-target scenarios and resource optimization.
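The first stage of the two-step method (coarse AoA estimation by FFT) can be illustrated with a minimal sketch. The code assumes a noise-free, single-source snapshot on a half-wavelength uniform linear array and uses only a zero-padded spatial DFT peak search; the refinement procedure, the MUSIC delay estimation, and the actual array geometry of the paper are not reproduced. The names `ula_snapshot`, `fft_aoa`, and `n_grid` are hypothetical.

```python
import cmath
import math


def ula_snapshot(theta_deg, n_ant):
    """Noise-free snapshot of a half-wavelength ULA for one source (assumed model)."""
    s = math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * math.pi * n * s) for n in range(n_ant)]


def fft_aoa(snapshot, n_grid=512):
    """Coarse AoA estimate (degrees) from the peak of a zero-padded spatial DFT."""
    best_k, best_mag = 0, -1.0
    for k in range(n_grid):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_grid)
                  for n, x in enumerate(snapshot))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    # Map the bin index to a signed spatial frequency in [-0.5, 0.5).
    freq = best_k / n_grid if best_k < n_grid // 2 else best_k / n_grid - 1.0
    # With half-wavelength spacing, spatial frequency equals sin(theta) / 2.
    sin_theta = max(-1.0, min(1.0, 2.0 * freq))
    return math.degrees(math.asin(sin_theta))


estimate = fft_aoa(ula_snapshot(20.0, n_ant=16))
```

The grid size bounds the accuracy of this coarse estimate, which is one reason a refinement step is typically applied afterwards.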
Design of Dynamic Resource Awareness and Task Offloading Schemes in Multi-Access Edge Computing Networks
ZHANG Bingxue, LI Xisheng, YOU Jia
2026, 48(5): 2199-2209. doi: 10.11999/JEIT250640
Abstract:
  Objective  With the growth of the industrial Internet of Things and the widespread use of multimode terminals, multi-access edge computing has become a key technology that supports low-latency and energy-efficient industrial applications. Task offloading is central to addressing the large volume and complex processing requirements of multimode terminals. In multi-access edge computing systems, end-user network selection strongly affects offloading and resource allocation. However, existing network-selection mechanisms emphasize user decisions while neglecting the effects of task execution, task-data transmission, and processing on network performance. Current studies on offloading design emphasize delay, energy optimization, and resource allocation, but overlook how collaborative computing across heterogeneous networks affects resource cost and dynamic resource balance. To address these issues, this study considers users’ diverse requirements and the differentiated capabilities of heterogeneous resource providers. It focuses on cost-efficient task-execution decisions and dynamic-resource allocation in multi-access heterogeneous networks to reduce system cost, improve service quality, and support cooperative use of heterogeneous resources.  Methods  Following the MEC network model, this study establishes cost-calculation models for task-execution time, energy consumption, and communication-resource consumption for different networks during end-user task selection. Using auction theory, it constructs a cost-effectiveness model for task evaluation and bidding between users and edge servers, and formulates the objective optimization problem based on combinatorial two-way auction theory. A dynamic resource-sensing and task offloading algorithm based on an auction mechanism is then proposed. 
Through two-way broadcasting of pending tasks and required resources, the algorithm performs network-selection assessment and dynamic allocation of computing and communication resources. Servers submit valid bids only when their available resources satisfy user constraints. Servers that issue valid bids compete for task-execution opportunities until the user obtains the optimal bid and corresponding server, which completes the auction-matching process.  Results and Discussions  The proposed dynamic-resource allocation and task offloading algorithm accounts for heterogeneous-network conditions and resource usage, and selects offloading locations based on resource availability. By setting simulation parameters, a heterogeneous wireless-network cooperation model is constructed. The effects of network size on offloading cost and offloaded data volume are analyzed. Simulation results show that the algorithm reduces system cost by at least 5% compared with benchmark algorithms (Fig. 3), with larger advantages when the number of end users increases. Changes in the number of servers influence users’ network-selection behavior (Figs. 4, 5, 6). Across algorithms, the proposed method increases the amount of offloaded data by approximately 10% relative to benchmark schemes (Figs. 7, 8). Finally, the study analyzes how variation in communication-resource cost parameters affects users’ preference for offloading via the 5G public network. Higher communication-cost parameters markedly reduce the data volume offloaded through the 5G network (Fig. 9).  Conclusions  To address complex data-processing demands from multimode terminals, this study develops a cooperative multi-access edge computing architecture for multimode devices. Flexible and intelligent wireless-network selection provides additional resources for end-user task offloading. 
A server-bidding and user-target bidding model is built using an auction framework, and a dynamic resource-perception and task offloading algorithm is proposed. The algorithm first adjusts and selects the offloading network and allocates computing and communication resources according to incoming tasks. It then determines the offloading location with minimum execution cost based on competition among edge servers. Results indicate that the proposed algorithm lowers system cost compared with benchmark approaches, increases the amount of data offloaded to multiple edge servers, improves utilization of edge-computing resources, and enhances system energy efficiency and operational efficiency.
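The matching rule described above (only servers whose free resources satisfy the task constraints bid, and the user takes the lowest-cost valid bid) can be sketched as follows. This is a simplified single-task illustration, not the combinatorial two-way auction of the paper; the `Server` and `Task` fields and the linear cost rule are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Server:
    name: str
    cpu_free: float   # available computing resource (hypothetical units)
    bw_free: float    # available communication resource (hypothetical units)
    unit_cost: float  # assumed price per unit of requested resource


@dataclass
class Task:
    cpu_need: float
    bw_need: float


def valid_bid(server: Server, task: Task) -> Optional[float]:
    """Return the server's bid, or None if it cannot satisfy the task constraints."""
    if server.cpu_free < task.cpu_need or server.bw_free < task.bw_need:
        return None
    return server.unit_cost * (task.cpu_need + task.bw_need)


def match(task: Task, servers: List[Server]) -> Optional[Tuple[str, float]]:
    """Pick the server with the lowest valid bid; None if no server can bid."""
    best = None
    for server in servers:
        bid = valid_bid(server, task)
        if bid is not None and (best is None or bid < best[1]):
            best = (server.name, bid)
    return best


servers = [
    Server("edge-a", cpu_free=4, bw_free=10, unit_cost=2.0),
    Server("edge-b", cpu_free=8, bw_free=10, unit_cost=1.5),
    Server("edge-c", cpu_free=1, bw_free=10, unit_cost=0.5),  # too little CPU to bid
]
winner = match(Task(cpu_need=2, bw_need=3), servers)
```

Here `edge-c` has the lowest unit cost but cannot bid, so the task goes to the cheapest server that meets the constraints.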
Radar, Sonar and Array Signal Processing
SAR Saturated Interference Suppression Method Guided by Precise Saturation Model
DUAN Lunhao, LU Xingyu, TAN Ke, LIU Yushuang, YANG Jianchao, YU Jing, GU Hong
2026, 48(5): 2210-2222. doi: 10.11999/JEIT251283
Abstract:
  Objective  With the increasing number of electromagnetic devices, Synthetic Aperture Radar (SAR) is highly susceptible to Radio Frequency Interference (RFI) within the same frequency band. RFI typically appears as bright streaks in SAR images and severely degrades image quality. Considerable research has been conducted on interference suppression, and many effective methods have been proposed. However, most existing approaches do not consider the nonlinear saturation of interfered echoes. In practical scenarios, the interference power is usually high, and the gain controller in the SAR receiver cannot effectively regulate the amplitude of interfered echoes. Therefore, the input signal amplitude of the Analog-to-Digital Converter (ADC) exceeds its dynamic range. This condition drives the SAR receiver into saturation and leads to nonlinear distortion in the interfered echoes. Such phenomena have been observed in multiple SAR systems. Documented cases include receiver saturation in the LuTan-1 satellite and several airborne SAR platforms. Analyses of SAR data further confirm the presence of saturated interference in systems such as Sentinel-1, Gaofen-3, and other spaceborne SAR platforms. After saturation occurs, the echo spectrum exhibits spurious components and spectral artifacts. These effects cause a mismatch between existing suppression methods and the actual characteristics of saturated interference. Therefore, many current methods cannot effectively mitigate this type of interference. Moreover, accurate models that precisely describe the output components of saturated interfered echoes remain limited. To address these issues, a precise analytical model for saturated interference is established, and an effective saturated interference suppression method is proposed based on this model.  
Methods  Based on the processing of the basic saturation model, a mathematical model is first developed to accurately characterize the output components of saturated interference. The accuracy of the model in describing amplitude and phase is validated through simulations. A detailed analysis of the output components of interfered echoes under saturation conditions is also conducted. Compared with the one-bit sampling model and the traditional tanh saturation model, the proposed model provides higher accuracy in describing amplitude information. In addition, the model is not limited by the sampling bit width of ADCs and can theoretically be extended to describe saturation outputs in other radar receivers. Based on the observation that harmonic phases can be expressed as a linear combination of the phases of the original signal components, and by exploiting the high-power characteristic of the interference fundamental harmonic, a saturated interference suppression method is proposed. First, because the interference fundamental harmonic has relatively high power, it is extracted using eigen-subspace decomposition. Then, based on harmonic phase relationships, the extracted interference fundamental harmonic, and the SAR transmitted signal, various interference harmonics are systematically constructed. These include higher-order interference harmonics, target harmonics, and intermodulation harmonics, which together form a complete dictionary. Finally, a sparse optimization problem is solved to achieve separation and suppression of saturated interference. The effectiveness of the proposed method is verified using measured Gaofen-3 data.  Results and Discussions  Experiments are conducted using both simulated and measured data to verify the effectiveness of the proposed method in suppressing saturated interference. For simulated data, the proposed method completely removes interference stripes in the SAR image (Fig. 7). 
Analysis of the time-frequency spectra of the processed echoes (Fig. 8 and Fig. 9) shows that traditional methods cannot effectively eliminate higher-order harmonics. Thus, the proposed method improves the Target-to-Background Ratio (TBR) by 1.76 dB and achieves the lowest Root Mean Square Error (RMSE) of 0.078 3 (Table 3). For the measured Gaofen-3 data, analysis of the processed images and the time-frequency spectra of echoes confirms that the proposed method effectively suppresses interference. Conventional methods still exhibit residual interference in the processed results (Fig. 10 and Fig. 11).  Conclusions  With the increasing deployment of electromagnetic devices, SAR systems are increasingly susceptible to in-band interference. High-power interference can drive the SAR receiver into saturation and cause nonlinear distortion, which reduces the effectiveness of traditional interference suppression methods. To address this issue, a model that precisely characterizes the saturated output components of interfered echoes is established. Based on this model, an interference suppression method for saturated interference is proposed. Simulation and experimental results show that the model accurately describes saturation behavior and that the proposed method effectively suppresses saturated interference.
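The spectral effect of receiver saturation can be illustrated with a minimal sketch. A hard clipper is used here as a stand-in nonlinearity; it is not the saturation model proposed in the paper, and the tone frequency, clipping level, and function names are assumptions chosen for illustration. Clipping a single tone creates odd-order harmonics that are absent from the unclipped signal.

```python
import cmath
import math


def hard_clip(samples, level):
    """Symmetric hard clipper used as an assumed stand-in for ADC saturation."""
    return [max(-level, min(level, v)) for v in samples]


def tone_amplitude(samples, k):
    """Amplitude of the component at DFT bin k of a real-valued sequence."""
    n = len(samples)
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / n)
              for i, v in enumerate(samples))
    return 2.0 * abs(acc) / n


N, K0 = 256, 5  # record length and fundamental bin (illustrative values)
tone = [math.sin(2 * math.pi * K0 * i / N) for i in range(N)]
clipped = hard_clip(tone, 0.5)

third_before = tone_amplitude(tone, 3 * K0)     # no third harmonic before clipping
third_after = tone_amplitude(clipped, 3 * K0)   # third harmonic appears after clipping
second_after = tone_amplitude(clipped, 2 * K0)  # even harmonics stay absent (symmetric clipper)
```

With several input components (interference plus target echo), the same nonlinearity also produces intermodulation terms, which is why a dictionary of harmonics is needed for separation.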
Cryptography and Information Security
A Quantum-resistant Threshold Signature Scheme for Database Audit Logs
CHEN Dajiang, ZHANG Yiwen, JIAO Lihua, WANG Baizheng, CHEN Ruidong
2026, 48(5): 2223-2232. doi: 10.11999/JEIT251320
Abstract:
  Objective  Database audit logs are a core basis for ensuring data integrity, accountability, and traceability in distributed systems. However, current audit-log protection mechanisms still rely on classical public-key signature algorithms such as RSA and ECDSA, which are vulnerable to quantum attacks. Shor’s algorithm can break integer-factorization- and discrete-logarithm-based cryptography in polynomial time, while Grover’s algorithm reduces the brute-force security of hash-based and symmetric primitives. These threats weaken the long-term reliability of existing database audit-log protection mechanisms in cloud and data-intensive environments. To address this issue, a quantum-resistant framework for database audit logs is proposed to satisfy practical requirements for efficiency, real-time verification, scalable deployment, and distributed trust management. The goal is to provide a robust cryptographic foundation for next-generation database audit-log systems with unforgeability and tamper resistance under quantum threats.  Methods  A hybrid hash-based signature layer is constructed by combining the few-time signature scheme Forest Of Random Subsets (FORS) and eXtended Merkle Signature Scheme-Tree (XMSS-T). FORS supports efficient signing for high-frequency log events, whereas XMSS-T organizes authentication paths in a Merkle-tree hierarchy for scalable state management. This combination yields a multi-level quantum-resistant signing structure. A Shamir (r,n) threshold secret-sharing mechanism is then adopted to split the signing key into multiple shares managed by independent audit agents. This design avoids a single point of failure, supports collaborative attestation, and ensures that no single party holds complete signing authority. In addition, a chained-hash structure is used to bind consecutive log entries through one-way linkage, thereby ensuring tamper evidence and chronological integrity. 
The framework further defines a complete set of system algorithms, including setup, key distribution, partial-signature generation, signature aggregation, log-chain update, and verification, all of which operate efficiently in a distributed setting. For formal security analysis, the scheme is modeled in the Quantum Random Oracle Model (QROM), and adversarial capabilities are characterized through UF-CMA, IND-CCA2, and IND-CKA2 games to capture forgery, decryption misuse, and index-indistinguishability attacks. A prototype implementation is developed and evaluated under realistic multi-node settings across different log scales, message sizes, interval configurations, and threshold ratios.  Results and Discussions  Experimental results show that the proposed scheme achieves a good balance between quantum-resistant security and system performance. For large-scale logs, the average signing latency increases linearly with log volume, which supports the efficiency of the chained-hash structure (Table 2). Compared with representative quantum-resistant signatures such as Dilithium and SPHINCS+, the threshold-signing design reduces the peak computational burden on individual nodes while preserving strong security guarantees. The system also maintains a stable throughput of about 2 000 operations per second. The message-size analysis shows that latency increases with message size but remains manageable even when the message exceeds 4 kB (Fig. 2(b)). Additionally, variation in the threshold ratio (r/n) has a measurable but moderate effect on system latency. A higher threshold improves resistance to collusion, but slightly increases delay (Fig. 2(e)). The interval-based chained-signing strategy further reduces the signing frequency and improves throughput without weakening log-integrity guarantees. These results indicate that the proposed scheme is well suited to cloud-based and distributed database environments that require real-time auditing and high-volume log processing.  
Conclusions  A quantum-resistant mechanism for database audit logs is presented by integrating hash-based signatures, threshold secret sharing, and chained log-integrity protection. The scheme provides strong quantum-resistant security guarantees, including provable unforgeability, confidentiality, and tamper resistance, supported by formal proofs in the QROM. Experimental results show that the mechanism maintains high signing and verification efficiency under large-scale deployment, with good scalability across different log volumes, message sizes, and threshold settings. Owing to its distributed trust model and quantum-resistant cryptographic basis, the proposed scheme offers a practical and secure solution for next-generation database audit systems in cloud computing, big-data processing, and compliance-critical environments.
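The chained-hash linkage of consecutive log entries can be sketched as follows. This is a minimal illustration that assumes SHA-256 and an all-zero initial chain value; it omits the threshold signing and the interval-based strategy of the scheme, and the names `GENESIS`, `link`, `build_chain`, and `verify_chain` are hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed initial chain value


def link(prev_digest: bytes, entry: bytes) -> bytes:
    """Bind a log entry to its predecessor: H(prev_digest || entry)."""
    return hashlib.sha256(prev_digest + entry).digest()


def build_chain(entries):
    """Return the running chain digest after each log entry."""
    digests, prev = [], GENESIS
    for entry in entries:
        prev = link(prev, entry)
        digests.append(prev)
    return digests


def verify_chain(entries, digests) -> bool:
    """Recompute the chain and compare it with the stored digests."""
    if len(entries) != len(digests):
        return False
    prev = GENESIS
    for entry, digest in zip(entries, digests):
        prev = link(prev, entry)
        if prev != digest:
            return False
    return True


log = [b"INSERT t1", b"UPDATE t1", b"DELETE t1"]
chain = build_chain(log)
tampered = [b"INSERT t1", b"UPDATE t2", b"DELETE t1"]
```

Because each digest depends on all earlier entries, modifying one entry invalidates every later digest, so only the latest chain value needs to be signed to cover the preceding entries.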
A Dimension-reduction Attack on Shortest Vector Problem Using Hints
YIN Risheng, CAO Jinzheng, MA Yongliu, WANG Hong, CHENG Qingfeng
2026, 48(5): 2233-2241. doi: 10.11999/JEIT251277
Abstract:
  Objective  Cryptographic algorithms based on the Learning With Errors (LWE) problem and its variants are widely used, including the key encapsulation mechanism Kyber and the digital signature scheme Dilithium. In many applications, the LWE secret is a short vector. Therefore, reducing LWE to the Shortest Vector Problem (SVP) is a common approach to cryptanalysis. Traditional SVP algorithms, including enumeration, lattice sieving, and lattice basis reduction, become difficult to apply directly in high-dimensional lattices because of their high computational cost. With the use of side-channel attacks, hints about the secret vector provide a new way to solve SVP. This paper proposes a dimension-reduction attack based on such hints. The method uses hints to reduce the problem dimension, thereby extending the practical range of enumeration and sieving.  Methods  Two types of hints are analyzed: integer hints and modular hints. For integer hints, which provide exact inner-product information about the shortest vector, the problem is formulated as a system of integer equations. The solution space of this system is then used to represent the shortest vector in a shorter linear form. Hermite normal form and Gaussian elimination are applied to obtain a particular solution and a fundamental solution system. This representation reduces the number of unknown coefficients that must be searched in enumeration or sampled in sieving. Thus, the search space is reduced, and the original SVP instance is transformed into a lower-dimensional problem. For modular hints, which provide inner-product information about the shortest vector modulo an integer, a conversion mechanism based on Coppersmith’s lemma is developed. For common-modulus modular equations, Lenstra-Lenstra-Lovász (LLL) lattice basis reduction is first used to reduce the norms of row vectors. Gaussian elimination is then applied to decrease the number of nonzero terms. 
Each resulting modular equation is screened according to Coppersmith’s lemma. Equations that satisfy the conversion condition are transformed into integer equations. For non-common-modulus modular equations, the moduli are first factorized into prime-power moduli. Equations with the same modulus are grouped and processed in the same manner. The resulting integer equations are then solved using the dimension-reduction enumeration or sieving method.  Results and Discussions  To evaluate the proposed dimension-reduction attack, the enumeration-based and sieving-based algorithms are compared with the lattice basis reduction algorithm in Algorithm 5 in terms of runtime and solution exactness. The effect of key parameters on dimension reduction is first analyzed. These parameters include the number of screening rounds (Fig. 2), the small-root bound (Fig. 3), and the modulus size (Fig. 4). The conversion efficiency of Algorithm 3 under different parameter settings is summarized in Table 1. The results show that more screening rounds generally improve the reduction effect, but this improvement has a saturation point. Beyond this point, additional rounds provide limited benefit. Finally, the computational efficiency of the proposed methods is compared with that of lattice basis reduction (Fig. 5). The results show that the computational cost of enumeration and sieving increases rapidly with dimension. However, up to dimension 90, the dimension-reduction attack can still use hints to reduce the dimension and obtain exact solutions more efficiently. Lattice basis reduction shows a slower increase in runtime as the dimension grows and is therefore more suitable for higher-dimensional SVP instances.  Conclusions  The proposed dimension-reduction attack provides a simple and effective method for solving SVP using hints. For integer hints, the solution space of the corresponding equation system is used to reduce the number of variables in enumeration and sieving. 
For modular hints, Coppersmith’s lemma is used to convert selected modular equations into integer equations, reducing the problem to the integer-hint case. The experiments show that, when sufficient hints are available, the method can effectively reduce the lattice dimension and extend the practical range of enumeration and sieving. Compared with lattice basis reduction, enumeration and sieving after dimension reduction can provide exact solutions within their applicable dimension range. Although the reduction effect tends to saturate as the number of hints increases, a moderate number of hints is sufficient to achieve effective dimension reduction. These results indicate that hint-based dimension-reduction attacks offer a practical route for exact SVP solving and provide useful evidence for the security evaluation of lattice-based cryptographic schemes.
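The integer-hint idea (a particular solution plus a basis of the homogeneous solutions, which leaves fewer coefficients to search) can be illustrated at toy scale. The sketch below handles a single equation a*x + b*y = c with positive coefficients using the extended Euclidean algorithm, so two unknowns collapse to one integer parameter t. It does not implement the Hermite normal form procedure or the lattice setting of the paper; all names and the search range are hypothetical.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for a, b >= 0."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)


def parametrize(a, b, c):
    """Integer solutions of a*x + b*y = c as particular + t * kernel."""
    g, u, v = ext_gcd(a, b)
    if c % g != 0:
        raise ValueError("no integer solution")
    particular = (u * (c // g), v * (c // g))
    kernel = (b // g, -(a // g))  # generates all solutions of a*x + b*y = 0
    return particular, kernel


def shortest_solution(a, b, c, t_range=50):
    """Search the single remaining parameter t for the shortest solution."""
    (x0, y0), (kx, ky) = parametrize(a, b, c)
    candidates = [(x0 + t * kx, y0 + t * ky) for t in range(-t_range, t_range + 1)]
    return min(candidates, key=lambda p: p[0] ** 2 + p[1] ** 2)


particular, kernel = parametrize(6, 10, 8)
best = shortest_solution(6, 10, 8)
```

The search runs over one parameter instead of two coordinates, which is the effect that the hints provide in higher dimensions.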
Aperiodic Total Squared Ambiguity Function: Theoretical Bounds for Binary Sequence Sets and Optimal Constructions
WEI Wenbo, SHEN Bingsheng, YANG Yang, ZHOU Zhengchun
2026, 48(5): 2242-2250. doi: 10.11999/JEIT251327
Abstract:
  Objective  In direct-sequence code division multiple access systems, the performance of spreading sequence sets is commonly evaluated using the total squared correlation metric. Traditional metrics such as total squared correlation and aperiodic total squared correlation are applicable only to synchronous communication systems and asynchronous systems with time shifts only, respectively. In modern high-speed mobile and satellite communications, the Doppler effect becomes significant. It causes both time and Doppler shifts in the received signal and leads to severe signal distortion. In communication scenarios that consider only time shift, the one-dimensional correlation function is typically used to measure system interference. However, in high-speed mobile environments the Doppler effect appears during signal transmission. Both time shift and Doppler shift of the sequence must therefore be considered simultaneously. In such cases, the two-dimensional ambiguity function should replace the one-dimensional correlation function. To mitigate Doppler effects, recent studies have focused on the design of Doppler-resilient sequences for mobile channels. Existing work mainly studies theoretical bounds of the ambiguity function, particularly the maximum ambiguity magnitude. Sequence sets are then constructed to achieve or asymptotically approach these bounds. This study instead examines the overall ambiguity function performance of binary sequence sets in asynchronous communication, namely the Aperiodic Total Squared Ambiguity Function (ATSAF). The objectives are as follows. First, the theoretical lower bound for the ATSAF of binary sequence sets is derived. Second, several classes of optimal binary sequence sets that achieve this bound are constructed based on the derived ATSAF bound.  
Methods  The aperiodic time-phase cycling extension matrix ${\boldsymbol{S}}_{\rm{a}}$ is defined for a binary sequence set $\boldsymbol{S}$ consisting of $K$ sequences of length $L$ to account for both time shifts and Doppler shifts. This definition converts the computation of the ATSAF for the sequence set $\boldsymbol{S}$ into the calculation of the total squared correlation of the matrix ${\boldsymbol{S}}_{\rm{a}}$. The theoretical lower bounds for the ATSAF of the binary sequence set $\boldsymbol{S}$ are then derived for different combinations of the set size $K$, sequence length $L$, and Doppler shift $V$. To design binary sequence sets that achieve these ATSAF lower bounds, it is first proven that binary aperiodic complementary sets form ATSAF-optimal binary sequence sets. Furthermore, two additional classes of optimal binary sequence sets are constructed using Hadamard matrices and specific sequences. These sets are proven to achieve the theoretical ATSAF lower bound.  Results and Discussions  Existing studies mainly examine the maximum ambiguity magnitude of sequence sets, whereas this study analyzes the overall ambiguity function performance. The one-dimensional aperiodic total squared correlation analysis for asynchronous communication with delay only, studied by Ganapathy et al., is extended to the two-dimensional ATSAF, which considers both time delay and Doppler shift. First, the aperiodic time-phase cycling extension matrix ${\boldsymbol{S}}_{\rm{a}}$ is defined for a binary sequence set $\boldsymbol{S}$ (Definition 3). 
The theoretical lower bounds for the ATSAF of the binary sequence set $\boldsymbol{S}$ are then derived for different parameters, including set size $K$, sequence length $L$, and Doppler shift $V$ (Theorem 1). When the Doppler shift $V=1$, the derived ATSAF bound reduces to the aperiodic total squared correlation bound. Binary sequence sets that achieve these ATSAF bounds maintain the overall cross-interference energy in the two-dimensional delay-Doppler domain at its theoretical minimum. To construct such sequence sets, it is first proven that binary aperiodic complementary sets are ATSAF-optimal binary sequence sets (Theorem 2). Furthermore, two further classes of ATSAF-optimal binary sequence sets are constructed using Hadamard matrices and specific sequences (Theorems 3 and 4). Finally, an example demonstrates that the sequence set constructed in Theorem 4 is ATSAF-optimal (Example 1).  Conclusions  In high-speed mobile communication scenarios, Doppler effects cause distortion in received signals. By defining the aperiodic time-phase cycling extension matrix ${\boldsymbol{S}}_{\rm{a}}$ for a binary sequence set $\boldsymbol{S}$, the theoretical lower bound for the ATSAF is derived. This bound specifies the minimum theoretical value of the total energy of the binary sequence set $\boldsymbol{S}$ in the two-dimensional delay-Doppler domain. When Doppler shifts are not considered, the derived ATSAF bound reduces to the aperiodic total squared correlation bound. Furthermore, three classes of ATSAF-optimal binary sequence sets that achieve this theoretical bound are constructed using binary aperiodic complementary sets, Hadamard matrices, and specific sequences. 
These sequence sets maintain the overall cross-interference energy at the theoretical minimum in the two-dimensional delay-Doppler domain.
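As an illustration of the quantity discussed in this abstract, the following minimal sketch sums the squared magnitudes of an aperiodic cross-ambiguity function over all sequence pairs, delays, and Doppler indices. The abstract does not give the exact ATSAF definition, so the Doppler grid (indices 0 to V-1 with phase step 2π/L), the normalization, and the function names `aperiodic_ambiguity` and `total_squared_ambiguity` are assumptions made here, not the authors' formulation. With V = 1 the sum reduces to an aperiodic total squared correlation, which matches the reduction stated above.

```python
import numpy as np


def aperiodic_ambiguity(x, y, tau, v, n_freq):
    """Aperiodic cross-ambiguity of x and y at delay tau and Doppler index v.

    Assumed form: sum_t x[t] * conj(y[t + tau]) * exp(2j*pi*v*t / n_freq),
    with terms outside the sequence support treated as zero.
    """
    L = len(x)
    total = 0j
    for t in range(L):
        if 0 <= t + tau < L:
            phase = np.exp(2j * np.pi * v * t / n_freq)
            total += x[t] * np.conj(y[t + tau]) * phase
    return total


def total_squared_ambiguity(S, V):
    """Sum of |ambiguity|^2 over all ordered pairs, delays, and V Doppler bins."""
    S = np.asarray(S, dtype=complex)
    _, L = S.shape
    total = 0.0
    for x in S:
        for y in S:
            for tau in range(-(L - 1), L):
                for v in range(V):
                    total += abs(aperiodic_ambiguity(x, y, tau, v, L)) ** 2
    return total


# A length-4 binary complementary (Golay) pair used as a small test set.
GOLAY_PAIR = np.array([[1, 1, 1, -1], [1, 1, -1, 1]])
```

Calling `total_squared_ambiguity(GOLAY_PAIR, 1)` gives the delay-only total squared correlation of the pair, and larger `V` adds the Doppler-shifted terms under the assumed grid.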
Construction of Maximum Distance Separable Codes and Near Maximum Distance Separable Codes Based on Cyclic Subgroup of $ \mathbb{F}_{{q}^{2}}^{*} $
DU Xiaoni, XUE Jing, QIAO Xingbin, ZHAO Ziwei
2026, 48(5): 2251-2258. doi: 10.11999/JEIT251204
Abstract:
  Objective  The demand for higher performance and efficiency in error-correcting codes has increased with the rapid development of modern communication technologies. These codes detect and correct transmission errors. Because of their algebraic structure, straightforward encoding and decoding, and ease of implementation, linear codes are widely used in communication systems. Their parameters follow classical bounds such as the Singleton bound: for a linear code with length $n$ and dimension $k$, the minimum distance $d$ satisfies $d\leq n-k+1$. When $d=n-k+1$, the code is a Maximum Distance Separable (MDS) code. MDS codes are applied in distributed storage systems and random error channels. If $d=n-k$, the code is Almost MDS (AMDS); when both a code and its dual are AMDS, the code is Near MDS (NMDS). NMDS codes have geometric properties that are useful in cryptography and combinatorics. Extensive research has focused on constructing structurally simple, high-performance MDS and NMDS codes. This paper constructs several families of MDS and NMDS codes of length $q+3$ over the finite field $\mathbb{F}_{q^{2}}$ of even characteristic using the cyclic subgroup $U_{q+1}$. Several families of optimal Locally Repairable Codes (LRCs) are also obtained. LRCs support efficient failure recovery by accessing a small set of local nodes, which reduces repair overhead and improves system availability in distributed and cloud-storage settings.  Methods  In 2021, Wang et al. constructed NMDS codes of dimension 3 using elliptic curves over $\mathbb{F}_{q}$. In 2023, Heng et al. 
obtained several classes of dimension-4 NMDS codes by appending appropriate column vectors to a base generator matrix. In 2024, Ding et al. presented four classes of dimension-4 NMDS codes, determined the locality of their dual codes, and constructed four classes of distance-optimal and dimension-optimal LRCs. Building on these works, this paper uses the unit circle $U_{q+1}$ in $\mathbb{F}_{q^{2}}$ and elliptic curves to construct generator matrices. By augmenting these matrices with two additional column vectors, several classes of MDS and NMDS codes of length $q+3$ are obtained. The locality of the constructed NMDS codes is also determined, yielding several classes of optimal LRCs.  Results and Discussions  In 2023, Heng et al. constructed generator matrices with second-row entries in $\mathbb{F}_{q}^{*}$ and with the remaining entries given by nonconsecutive powers of the second-row elements. In 2025, Yin et al. extended this approach by constructing generator matrices using elements of $U_{q+1}$ and obtained infinite families of MDS and NMDS codes. Following this direction, the present study expands these matrices by appending two column vectors whose elements lie in $\mathbb{F}_{q^{2}}$. The resulting matrices generate several classes of MDS and NMDS codes of length $q+3$. Several classes of NMDS codes with identical parameters but different weight distributions are also obtained. Computing the minimum locality of the constructed NMDS codes shows that some are optimal LRCs satisfying the Singleton-like, Cadambe-Mazumdar, Plotkin-like, and Griesmer-like bounds. All constructed MDS codes are Griesmer codes, and the NMDS codes are near Griesmer. 
These results show that the proposed constructions are more general and unified than earlier approaches.  Conclusions  This paper constructs several families of MDS and NMDS codes of length $q+3$ over $\mathbb{F}_{q^{2}}$ using elements of the unit circle $U_{q+1}$ and oval polynomials, and by appending two additional column vectors with entries in $\mathbb{F}_{q}$. The minimum locality of the constructed NMDS codes is analyzed, and some of these codes are shown to be optimal LRCs. The framework generalizes earlier constructions, and the resulting codes are optimal or near-optimal with respect to the Griesmer bound.
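The MDS, AMDS, and NMDS definitions in this abstract can be checked by brute force on very small codes. The sketch below is only an illustration over the binary field, not the paper's construction over $\mathbb{F}_{q^{2}}$. The matrices `G_HAMMING` and `H_HAMMING` (a systematic generator matrix of the binary [7,4] Hamming code and a generator matrix of its dual) and the helper names are chosen here for the example.

```python
from itertools import product


def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords of a binary code."""
    k, n = len(G), len(G[0])
    best = n
    for msg in product([0, 1], repeat=k):
        if not any(msg):
            continue  # skip the all-zero message
        word = [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]
        best = min(best, sum(word))
    return best


def classify(n, k, d):
    """Place [n, k, d] relative to the Singleton bound d <= n - k + 1."""
    if d > n - k + 1:
        raise ValueError("parameters violate the Singleton bound")
    if d == n - k + 1:
        return "MDS"
    if d == n - k:
        return "AMDS"
    return "neither"


def is_nmds(G, H):
    """NMDS check: the code generated by G and its dual (generated by H) are both AMDS."""
    n = len(G[0])
    code = classify(n, len(G), min_distance(G))
    dual = classify(n, len(H), min_distance(H))
    return code == "AMDS" and dual == "AMDS"


# Systematic generator matrix [I | P] of the binary [7,4] Hamming code.
G_HAMMING = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# Parity-check matrix [P^T | I], which generates the dual [7,3] code.
H_HAMMING = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
# The [3,2] single-parity-check code, a small MDS example.
G_PARITY = [[1, 0, 1], [0, 1, 1]]
```

Here `is_nmds(G_HAMMING, H_HAMMING)` evaluates the two AMDS conditions directly. The exhaustive search grows as 2^k, so this approach only suits toy parameters.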
Full-round Integral Cryptanalysis of the Lightweight Block Cipher INLEC
YU Bin, LIU Wenfen, CHEN Wen, GUO Ying, LU Yongcan, HUANG Yuehua
2026, 48(5): 2259-2267. doi: 10.11999/JEIT251131
Abstract:
  Objective  With the rapid development of telecommunication technology, Internet of Things (IoT) devices have been widely deployed in modern applications. However, their limited computing resources and energy supply create challenges for data privacy and security. To address these issues, Feng et al. proposed INLEC, a low-energy lightweight block cipher designed for resource-constrained IoT environments. The designers claimed that INLEC can resist differential, linear, impossible differential, and side-channel attacks. However, its security against integral cryptanalysis has not yet been evaluated. This paper presents a comprehensive full-round integral cryptanalysis of INLEC to assess its actual resistance to integral cryptanalysis.  Methods  The monomial prediction technique proposed by Hu et al. is used to construct a Mixed Integer Linear Programming (MILP) model for the monomial trails of INLEC. Based on this model, a 9-round integral distinguisher for INLEC is obtained. By further using the structural properties of the diffusion layer, the 9-round integral distinguisher is extended to a 10-round integral distinguisher by adding an initial round. This is the first 10-round integral distinguisher constructed for INLEC. To reduce the complexity of key recovery, a multi-key guessing method is proposed. Combined with the partial-sum technique, this method enables the first 14-round key recovery attack on INLEC. An integral cryptanalysis framework for the full-round INLEC cipher is therefore established.  Results and Discussions  The analysis shows that the 10-round integral distinguisher provides exploitable balanced bits for key recovery. Based on this distinguisher, the proposed 14-round key recovery attack achieves a data complexity of $2^{63}$ chosen plaintexts and a time complexity of $2^{89.843}$ 14-round encryptions. These results indicate that the diffusion layer of INLEC does not fully eliminate integral properties within 10 rounds. 
The remaining structural properties can be used to support key recovery. This finding challenges the original security claims for INLEC and shows that integral properties should be considered when evaluating lightweight block ciphers for IoT applications.  Conclusions  This paper evaluates the resistance of the lightweight block cipher INLEC to integral cryptanalysis based on monomial prediction. A 9-round integral distinguisher is first constructed using an MILP model of monomial trails. The 9-round integral distinguisher is then extended to a 10-round integral distinguisher by exploiting the structural properties of the diffusion layer. A 14-round key recovery attack is further achieved by combining the partial-sum technique with the multi-key guessing method. The results show that INLEC has insufficient resistance to integral cryptanalysis and that its practical security may be lower than expected. Therefore, more rounds should be considered in the design of such ciphers to resist known integral attacks.
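The "balanced bits" that an integral distinguisher exploits can be illustrated on a toy cipher. The sketch below is not INLEC and does not use monomial prediction or MILP. It assumes a hypothetical 8-bit state made of two nibbles, a 4-bit bijective S-box, and no mixing between nibbles, and it only shows that the XOR sum over a full set of 16 inputs is zero regardless of the round keys.

```python
# A 4-bit bijective S-box (any permutation of 0..15 gives the same property).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def toy_round(state, round_key):
    """One round of the toy cipher: key addition, then the S-box on each nibble."""
    state ^= round_key
    hi, lo = state >> 4, state & 0xF
    return (SBOX[hi] << 4) | SBOX[lo]


def xor_sum_after_rounds(constant_nibble, round_keys):
    """XOR of all ciphertexts when the low nibble takes all 16 values
    and the high nibble is held constant."""
    acc = 0
    for active in range(16):
        state = (constant_nibble << 4) | active
        for rk in round_keys:
            state = toy_round(state, rk)
        acc ^= state
    return acc
```

Because key addition and the S-box are bijections on each nibble, the active nibble still takes every value exactly once after each round, so `xor_sum_after_rounds` returns 0 for any key choice. In a real cipher the diffusion layer eventually destroys this property, and a distinguisher records how many rounds some output bits remain balanced.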
A Testability Evaluation Method Based on Reconvergent Fan-Out
WU Wenjun, LIANG Huaguo, YOU Chang, DOU Xianrui, XIAO Jiahui, LU Yingchun
2026, 48(5): 2268-2276. doi: 10.11999/JEIT251286
Abstract:
  Objective  As the scale and structural complexity of integrated circuits continue to increase, accurate testability evaluation becomes essential for Trojan detection, fault diagnosis, and test-point optimization in modern Design-for-Testability (DFT) flows. Metrics such as controllability, observability, and fault coverage depend on reliable probabilistic modeling of signal propagation. However, existing analytical and learning-based approaches often lose accuracy in circuits with dense Reconvergent Fan-Out (RFO) structures, where strong signal correlation invalidates classical independence assumptions and causes substantial estimation bias. Although several enhanced techniques attempt to incorporate structural information, many have high computational cost or limited scalability in deeper or highly reconvergent logic networks. This work addresses these limitations by proposing a testability evaluation method that incorporates RFO structural characteristics to improve modeling accuracy while maintaining practical computational efficiency.  Methods  The proposed approach starts with a structural analysis algorithm that identifies RFO regions through topological traversal of the circuit. A dedicated RFO recognition mechanism maps each root fan-out node to its corresponding RFO nodes, capturing the structural dependencies that govern correlated signal behavior and providing the basis for accurate probabilistic modeling. Building on this structural extraction, a weighted conditional probability model is formulated to correct testability distortion in reconvergent regions. Unlike previous optimization schemes, the weighting strategy assigns influence-based weights derived from the contribution of each root node to the target node, yielding probability estimates that more accurately reflect actual testability behavior. 
An efficient computational framework is also developed to integrate conditional probability propagation and weight selection into a single topological traversal process, thereby maintaining low algorithmic complexity while improving accuracy.  Results and Discussions  The proposed method is evaluated on representative benchmark circuits from the ISCAS85, ISCAS89, ITC99, and EPFL suites. Performance is assessed in terms of controllability accuracy, ordering consistency, fault coverage estimation, and runtime efficiency. For controllability prediction, the method achieves an average RMSE of 0.0568, which corresponds to an average reduction of 25% relative to existing techniques, as reported in Table 2. Ordering consistency also improves, with the average Spearman correlation coefficient reaching 0.935, outperforming existing techniques. Fault coverage estimation shows similarly strong performance, with an average relative error of 3.64%, which is lower than that of previously reported methods, as shown in Table 1. Runtime analysis further indicates that the proposed framework maintains practical computational efficiency. Across all benchmark circuits, the method achieves an average speedup of 7× while preserving high accuracy, as illustrated in Figure 5.  Conclusions  This work addresses the degradation in testability evaluation accuracy caused by RFO structures in integrated circuits by proposing a reconvergent-fan-out-aware testability analysis method. The presented RFO structure identification algorithm extracts reconvergent information at the topological level and establishes explicit mappings between root nodes and RFO nodes. On this structural basis, a weighted conditional probability model is constructed to mitigate probability distortion induced by signal correlation in RFO regions. An efficient computational framework is further developed to integrate the full computation into a streamlined traversal-based process. 
Experimental results show that the proposed technique achieves a low controllability RMSE and high ordering consistency relative to simulation-based ground truth. The predicted fault coverage values also closely match the simulation results. The method maintains this accuracy with low computational overhead.
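The bias that reconvergent fan-out introduces under the independence assumption can be seen on a three-gate example. The sketch below is not the weighted conditional probability model of the paper. It assumes a hypothetical circuit in which input `a` fans out to two AND gates whose outputs reconverge at a third AND gate, and it compares an independence-based estimate with an estimate conditioned on the fan-out stem.

```python
from itertools import product


def circuit(a, b, c):
    """Stem `a` fans out to g1 and g2, which reconverge at the output gate."""
    g1 = a & b
    g2 = a & c
    return g1 & g2


def exact_probability():
    """Exact probability that the output is 1, by enumerating uniform inputs."""
    outcomes = [circuit(*bits) for bits in product([0, 1], repeat=3)]
    return sum(outcomes) / len(outcomes)


def independent_estimate(p=0.5):
    """Treats g1 and g2 as independent, ignoring their shared input `a`."""
    p_g1 = p * p
    p_g2 = p * p
    return p_g1 * p_g2


def stem_conditioned_estimate(p=0.5):
    """Conditions on the stem: once `a` is fixed, g1 and g2 are independent."""
    p_out_given_a1 = p * p   # g1 = b and g2 = c when a = 1
    p_out_given_a0 = 0.0     # both gates are forced to 0 when a = 0
    return p * p_out_given_a1 + (1 - p) * p_out_given_a0
```

For uniform inputs the exact value is 0.125, the independence-based estimate is 0.0625, and conditioning on the stem recovers 0.125. Full conditioning doubles the work for every stem it covers, which is one reason practical methods approximate it.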
Reconfigurable Intelligent Surface Assisted Key Generation Resistant to Signal Injection Attacks
YANG Lijun, WANG Haomin, ZHU Tiancheng, WU Meng
2026, 48(5): 2277-2287. doi: 10.11999/JEIT251281
Abstract:
  Objective  This study examines the potential threat of signal injection attacks to Physical Layer Key Generation (PLKG) in Reconfigurable Intelligent Surface (RIS)-assisted wireless systems. The threat is especially pronounced in quasi-static channels, where the channel state remains highly correlated across multiple probing rounds. From both attack and defense perspectives, the study clarifies how spatial correlation between RIS reflection channels and eavesdropping channels can be exploited to improve key inference. A channel-randomization mechanism is designed that uses the controllability of RIS to suppress key leakage, reduce the eavesdropper’s key capacity, and improve the security of RIS-assisted PLKG in future 6G scenarios. Quantitative analysis further examines the relationships among injection power, Signal-to-Noise Ratio (SNR), and spatial correlation. These results provide reference guidance for robust RIS configuration and secure system design.  Methods  An RIS-assisted Time-Division Duplex (TDD) system is considered. Single-antenna Alice and Bob generate symmetric keys from a reciprocal channel, whereas a two-antenna active eavesdropper, Eve, injects signals using previously observed Channel State Information (CSI) (Fig. 1). The links follow quasi-static Rayleigh block fading. CSI for Alice, Bob, and Eve is defined for each time slot within a coherence interval. A conventional injection attack is first modeled. Eve estimates the eavesdropping channel in one slot, precodes an injected waveform, and contaminates the subsequent probing at Alice and Bob, partially steering their key source. A joint key inference strategy is then proposed. This strategy exploits the spatial correlation between RIS reflection channels and eavesdropping channels, as well as the common RIS-induced subchannel shared by legitimate and eavesdropping links (Table 1). As a defense, a channel-randomization PLKG scheme is proposed. 
Alice randomly reconfigures RIS coefficients at each probing round. Therefore, the effective channels of Alice-Bob, Alice-Eve, and Bob-Eve vary independently across rounds, whereas Alice-Bob reciprocity within a single round is preserved. Injection signals precoded with outdated CSI therefore appear as uncorrelated interference at the legitimate nodes. Mutual-information-based bounds on secret-key capacity are derived to obtain key capacities. The eavesdropper’s Key Recovery Rate (KRR) is defined for performance evaluation. The theoretical results are validated through MATLAB Monte Carlo simulations with 10,000 trials using an information-theoretic estimator toolbox. The simulations examine different SNR levels, injection power values, and spatial correlation conditions (Figs. 2~5, Table 2).  Results and Discussions  Analysis of the conventional injection attack without RIS defense shows that at high SNR, Alice and Bob observe nearly identical reciprocal channels due to channel reciprocity. Eve’s estimate, derived from injected signals, follows a similar trend but shows noticeable mismatch (Fig. 2). Eve can therefore recover some key bits, although errors remain, and the KRR remains moderate. When the proposed joint key inference strategy is applied, Eve’s reconstructed channel more closely matches the legitimate response (Fig. 3). This effect arises because RIS-assisted PLKG causes legitimate and eavesdropping links to share an RIS-induced subchannel. The resulting spatial correlation provides additional exploitable information beyond the known injected signal. Therefore, Eve’s key capacity and KRR increase significantly, which indicates a stronger RIS-specific security threat. At fixed SNR (Fig. 4), Eve’s key capacity without defense increases rapidly with injection power and may approach or exceed the legitimate key capacity. 
Under RIS randomization, the legitimate capacity decreases slightly, whereas Eve’s capacity remains small and nearly constant. This result indicates that randomization converts structured injection signals into noise. Spatial-correlation analysis in Fig. 5 shows that Eve’s capacity without defense increases rapidly and becomes critical as correlation approaches one. In contrast, under RIS randomization the increase is gradual, and the capacity may remain near zero at moderate correlation levels. Table 2 confirms these trends in terms of KRR. The KRR is about 50% without correlation and injection. It increases to about 62.5% when injection is applied but spatial correlation is zero, whereas the defense keeps the value close to random guessing. When spatial correlation and injection power are higher, the KRR exceeds 80%. The proposed defense reduces this value to approximately 57%~66%.  Conclusions  This study examines the dual role of RIS in PLKG security. RIS can increase vulnerability but can also serve as an effective defensive mechanism. By exploiting the correlation between RIS reflection channels and eavesdropping channels, a joint key inference attack is developed that increases the eavesdropper’s key capacity and recovery rate compared with conventional injection attacks. This result reveals a new attack vector in RIS-assisted systems. A channel-randomization PLKG scheme is then proposed by exploiting the dynamic controllability of RIS. The scheme shortens the effective coherence time to a single probing round and decorrelates successive channel realizations from the attacker’s perspective. Theoretical analysis and Monte Carlo simulations show that the proposed scheme converts malicious injection signals into uncorrelated interference, reduces the eavesdropping key capacity, and pushes the eavesdropper’s KRR close to random guessing. This property remains effective even under high SNR, strong spatial correlation, and high injection power. 
The scheme achieves these security improvements with low hardware overhead compared with reconfigurable antenna-based solutions, because RIS devices are expected to serve as infrastructure elements in future 6G networks. The results provide guidance for the secure design of RIS-assisted PLKG systems and suggest that the controllable characteristics of RIS should be used for both performance improvement and security protection.
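The link between channel correlation and the Key Recovery Rate can be illustrated with a toy Monte Carlo experiment. The sketch below does not model the RIS, signal injection, or the paper's MATLAB setup. It assumes real Gaussian channel samples, a one-bit sign quantizer, and an eavesdropper observation with correlation coefficient `rho` to the legitimate channel. The function name `key_recovery_rate` is hypothetical.

```python
import numpy as np


def key_recovery_rate(rho, n_samples, rng):
    """Fraction of key bits an eavesdropper matches under a toy model.

    Assumptions: the legitimate channel is standard Gaussian, the
    eavesdropper sees a sample with correlation `rho` to it, and both
    sides derive one bit per sample from the sign.
    """
    h = rng.standard_normal(n_samples)        # legitimate channel samples
    w = rng.standard_normal(n_samples)        # independent component
    h_eve = rho * h + np.sqrt(1.0 - rho**2) * w
    key_bits = h > 0
    guessed_bits = h_eve > 0
    return float(np.mean(key_bits == guessed_bits))
```

Under these assumptions the expected rate is 1/2 + arcsin(rho)/π, so an uncorrelated observation gives about 50% (random guessing) and the rate climbs as the correlation approaches one. This is the qualitative reason a defense that decorrelates the eavesdropper's view pushes the rate back toward 50%.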
A Risk-modulated Learning Framework for Physical-layer RFID Authentication under Dynamic Interference
WU Haifeng, YU Wenbo, ZENG Yu, YANG JiangFeng
2026, 48(5): 2288-2303. doi: 10.11999/JEIT251108
Abstract:
  Objective  Dynamic interference and metallic reflections severely affect the reliability of coupled Radio Frequency IDentification (RFID) authentication. Conventional static models cannot adapt to time-varying noise and multipath effects, which leads to unstable recognition. To address this problem, this paper proposes a Risk-Modulated Learning Identification Framework (RMLIF) that integrates stochastic channel modeling, adaptive risk regulation, and risk-regularized classification. The aim is to achieve stable and interpretable physical-layer authentication under nonstationary interference, thereby improving the anti-counterfeiting reliability of RFID systems.  Methods  A Stochastic Differential Equation (SDE)-based coupled channel model is first established to jointly characterize drift, diffusion, and impulsive interference (Eq.(1)), and the existence and uniqueness of its solution are proved. A Target-Driven Adaptive Risk (TDAR) algorithm is then designed to dynamically adjust physical-layer parameters based on the Recognition Risk Index (RRI). The RRI is derived from classification posterior probabilities (Eq.(3)), and its exponential mapping to the Signal-to-Interference-plus-Noise Ratio (SINR) is characterized analytically (Eq.(11), Fig. 3), which enables real-time risk estimation and closed-loop control. For feature representation, a difference-based compressive feature modeling method is used to capture the perturbation between normalized and reference signals (Fig. 1), and Theorem 1 establishes the stability of the compressed mapping. Parallel steady-state and perturbation feature paths are further designed (Table 1), and their joint robustness is proved in Corollary 4. In addition, the framework shows that TDAR regulation is equivalent to a risk-regularized classification process (Theorem 3), which effectively enlarges the classification margin without modifying the classifier structure.  
Results and Discussions  Theoretical analysis derives the generalization error bound, sample complexity, and robustness limits (Theorems 4~7), showing that filtering high-risk samples reduces redundancy and improves learning efficiency. The Asymptotic Real Risk Index (ARRI) is further defined to explain long-term convergence and structural self-consistency (Theorem 8). Experiments conducted on a USRP N2000 platform (Table 3) use six types of EPC C1 Gen2 tags under four interference conditions, namely no copper plate and small, medium, and large copper plates (Fig. 4). Compared with conventional methods, including Coupling_14, Hu_Fu, CNN_Vgg19, and PCFM, the corresponding RMLIF-enhanced versions achieve clear gains in classification accuracy (Fig. 5). In all no/small/medium/large copper-plate interference scenarios, the proposed framework achieves accuracy above 90%, with an average improvement of 10%~20% over traditional methods. PCFM_RMLIF achieves the best overall performance. PCA visualization confirms the stability of the compressed features (Fig. 6) and the clearer class separation after risk regulation (Fig. 7). The TDAR algorithm converges rapidly, generally within two iterations (Fig. 9). As the effective sample ratio and feature dimension increase, the RRI decreases monotonically (Fig. 10), in agreement with Theorem 6. Entropy analysis (Fig. 11) shows that risk regulation reduces system uncertainty and improves stability. Cross-condition tests further verify the robustness and generalization ability of the framework (Fig. 12).  Conclusions  This paper develops a unified risk-modulated learning framework for physical-layer RFID authentication under dynamic interference. The RMLIF framework combines SDE-based channel modeling, adaptive TDAR regulation, and compressive feature reconstruction into a closed-loop mechanism that links physical signals with recognition risk. 
Both theoretical analysis and experimental results show that risk-driven regulation effectively suppresses disturbance, improves feature separability, and reduces generalization error. The proposed approach achieves high accuracy, rapid convergence, and strong robustness, and provides an effective solution for dynamic RFID anti-counterfeiting authentication.
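A channel model with drift, diffusion, and impulsive interference can be simulated with an Euler-Maruyama scheme. The sketch below is a generic jump-diffusion, not the paper's Eq. (1), whose exact form is not given in the abstract. The mean-reverting drift, the Gaussian jump sizes, and every parameter name are assumptions made for illustration.

```python
import numpy as np


def simulate_jump_diffusion(x0, theta, mu, sigma, jump_rate, jump_scale,
                            dt, n_steps, rng):
    """Euler-Maruyama path of dX = theta*(mu - X) dt + sigma dW + dJ.

    J is a compound Poisson process with intensity `jump_rate` and
    zero-mean Gaussian jump sizes of standard deviation `jump_scale`.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        drift = theta * (mu - x[i]) * dt
        diffusion = sigma * np.sqrt(dt) * rng.standard_normal()
        n_jumps = rng.poisson(jump_rate * dt)   # impulses arriving in this step
        jumps = rng.normal(0.0, jump_scale, size=n_jumps).sum()
        x[i + 1] = x[i] + drift + diffusion + jumps
    return x
```

Setting `sigma` and `jump_rate` to zero leaves only the deterministic drift, which is a convenient sanity check before adding noise and impulses.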
Intelligent Protection Method for Personalized Location Privacy in 3D MCS Scenario
MIN Minghui, YE Jun, WEI Xipeng, MIN Bo, LI Shiyin
2026, 48(5): 2304-2316. doi: 10.11999/JEIT251237
Abstract:
  Objective  With the widespread adoption of intelligent mobile devices and growing reliance on location-based services, Mobile CrowdSensing (MCS) systems have become a critical infrastructure for urban sensing and smart city applications. In complex 3D environments such as hospitals and shopping malls, real-time user location data uploaded during task execution can be exploited by untrusted servers or external attackers, resulting in severe privacy risks. Existing location privacy protection methods are largely designed for 2D spaces and rely on fixed privacy budgets, lacking adaptability to dynamic user energy states, personalized privacy requirements, and inference attacks. These limitations hinder the simultaneous optimization of location privacy and service quality in 3D MCS systems. This paper proposes a personalized privacy-protection task assignment mechanism that integrates 3D Geo-Indistinguishability (3DGI) and distortion privacy, enabling dynamic optimization of location perturbation strategies and task allocation in complex 3D environments.  Methods  A dynamic 3D MCS system model is established, incorporating user energy states, task execution costs, individual privacy preferences, and attacker Bayesian inference behaviors. A reinforcement learning approach is adopted to learn personalized location perturbation strategies through continuous interaction with the environment. Specifically, a Proximal Policy Optimization (PPO)-based mechanism, PPOM, is proposed. It employs an Actor-Critic architecture to operate in a continuous action space for effective policy learning. A utility-driven reward function integrating user privacy feedback and server profit allows the system to optimize privacy protection and economic benefit simultaneously.  
Results and Discussions  Extensive simulations on synthetic and GeoLife datasets demonstrate that PPOM outperforms 3DGI, 3DGI-PPOM, and LEAPER under Single-user Single-task (S-S) and Single-user Multi-task (S-M) scenarios. PPOM achieves superior 3D location privacy protection through personalized perturbation and two-dimensional action space design. Server net profit is maintained at a level comparable to 3DGI-PPOM while system utility is significantly improved, even under high user privacy preferences. LEAPER underperforms due to its 2D-oriented design. Overall, PPOM dynamically balances personalized privacy protection and server economic benefits in complex 3D MCS environments.  Conclusions  This study presents a reinforcement learning-based mechanism for personalized 3D location privacy protection and task assignment in dynamic MCS systems. Key contributions include: (1) a personalized privacy protection framework integrating 3DGI and distortion privacy, accounting for user energy status, task costs, privacy preferences, and attacker Bayesian inference in real time; (2) a perturbation policy optimization mechanism, PPOM, based on the PPO with an Actor-Critic structure, Gaussian sampling, and advantage-based learning to enhance robustness and stability in continuous high-dimensional action spaces; (3) a privacy-aware task assignment model using inferred locations from perturbed data, with a utility function jointly quantifying privacy protection and server profit, achieving dynamic trade-offs between user privacy and service quality under resource constraints.
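As background for the 3DGI baseline mentioned above, a three-dimensional Laplace-style perturbation can be sampled by drawing a uniform direction and a Gamma-distributed radius. The sketch below is a fixed-budget mechanism with a noise density proportional to exp(-epsilon * distance), not the learned PPOM policy. The function name `perturb_3d` and the choice of units are assumptions.

```python
import numpy as np


def perturb_3d(location, epsilon, rng):
    """Return a perturbed 3D location with noise density ∝ exp(-epsilon * r).

    In three dimensions the radial density is ∝ r^2 * exp(-epsilon * r),
    which is a Gamma distribution with shape 3 and scale 1/epsilon.
    """
    direction = rng.standard_normal(3)
    direction /= np.linalg.norm(direction)      # uniform point on the unit sphere
    radius = rng.gamma(shape=3.0, scale=1.0 / epsilon)
    return np.asarray(location, dtype=float) + radius * direction
```

The mean displacement is 3/epsilon, so a smaller budget gives stronger privacy at the cost of larger location error. A personalized scheme would choose the budget per user and per step rather than fixing it.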