Articles in press have been peer-reviewed and accepted; they have not yet been assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Research on Non-cooperative Interference Suppression Technology for Dual Antennas without Channel Prior Information
YAN Cheng, LI Tong, PAN Wensheng, DUAN Baiyu, SHAO Shihai
 doi: 10.11999/JEIT250378
Abstract:
  Objective  In electronic countermeasures, friendly communication links are vulnerable to interference from adversaries. The auxiliary antenna scheme is employed to extract reference signals for interference cancellation, which improves communication quality. Although the auxiliary antenna is designed to capture interference signals, it often receives communication signals at the same time, and this reduces the suppression capability. Typical approaches for non-cooperative interference suppression include interference rejection combining and spatial domain adaptive filtering. These approaches rely on the uncorrelated nature of the interference and desired signals to achieve suppression. They also require channel information and interference noise information, which restricts their applicability in some scenarios.  Methods  This paper proposes the Fast ICA-based Simulated Annealing Algorithm for SINR Maximization (FSA) to address non-cooperative interference suppression in communication systems. Designed for scenarios without prior channel information, FSA applies a weighted reconstruction cancellation method implemented through a Finite Impulse Response (FIR) filter. The method operates in a dual-antenna system in which one antenna supports communication and the other provides an auxiliary reference for interference. Its central innovation is the optimization of weighted reconstruction coefficients using the Simulated Annealing algorithm, together with Fast Independent Component Analysis (Fast ICA) for SINR estimation. The FIR filter reconstructs interference from the auxiliary antenna signal using optimized coefficients and then subtracts this reconstructed interference from the main received signal to improve communication quality. Accurate SINR estimation in non-cooperative settings is difficult because the received signals contain mixed components. 
FSA addresses this through blind source separation based on Fast ICA, which extracts sample signals of both communication and interference components. SINR is then calculated from cross-correlation results between these separated signals and the signals after interference suppression. The Simulated Annealing algorithm functions as a probabilistic optimization process that adjusts reconstruction coefficients to maximize the output SINR. Starting from initial coefficients, the algorithm perturbs them and evaluates the resulting SINR. Using the Monte Carlo acceptance rule, it allows occasional acceptance of perturbations that do not yield immediate improvement, which supports escape from local optima and promotes convergence toward global solutions. This iterative process identifies optimal filter coefficients within the search range. The combined use of Fast ICA and Simulated Annealing enables interference suppression without prior channel information. By pairing blind estimation with robust optimization, the method provides reliable performance in dynamic interference environments. The FIR-based structure offers a practical basis for real-time interference cancellation. FSA is therefore suitable for electronic countermeasure applications where channel conditions are unknown and change rapidly. This approach advances beyond conventional techniques that require channel state information and offers improved adaptability in non-cooperative scenarios while maintaining computational efficiency through the combined use of blind source separation and intelligent optimization.  Results and Discussions  The performance of the proposed FSA is assessed through simulations and experiments. The output SINR is improved under varied conditions. In simulations, a maximum SINR improvement of 27.2 dB is achieved when the communication and auxiliary antennas have a large SINR difference and are placed farther apart (Fig. 5). 
The performance is reduced when the channel correlation between the antennas increases. Experimental results confirm these observations, and an SINR improvement of 19.6 dB is measured at a 2 m antenna separation (Fig. 7). The method is shown to be effective for non-cooperative interference suppression without prior channel information, although its performance is affected by antenna configuration and channel correlation.  Conclusions  The proposed FSA method provides an effective solution for non-cooperative interference suppression in communication systems. The method applies weighted reconstruction cancellation optimized by the Simulated Annealing algorithm and uses Fast ICA-based SINR estimation to improve communication quality without prior channel information. The results from simulations and experiments show that the method performs well across varied conditions and has potential for practical electronic warfare applications. The study finds that the performance of the FSA method depends on the SINR difference and the channel correlation between the communication and auxiliary antennas. Future research will focus on refining the algorithm for more complex scenarios and examining the effect of system parameters on its performance. These findings support the development of communication systems that operate reliably in challenging interference environments.
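The Metropolis-style acceptance step described above can be sketched as follows. This is a minimal illustration, not the paper's FSA implementation: the `objective` callable stands in for the Fast ICA-based SINR estimate, and the toy quadratic objective, step size, and cooling schedule are assumptions chosen for demonstration.

```python
import math
import random

def simulated_annealing(objective, w0, step=0.05, t0=1.0, cooling=0.95, iters=500):
    """Maximize `objective` over a coefficient vector using the Metropolis
    (Monte Carlo) acceptance rule: a worse candidate is still accepted with
    probability exp(delta / T), which lets the search escape local optima."""
    w = list(w0)
    best, f_best = list(w), objective(w)
    f_w, t = f_best, t0
    for _ in range(iters):
        cand = [x + random.uniform(-step, step) for x in w]  # perturb coefficients
        f_c = objective(cand)
        delta = f_c - f_w
        if delta >= 0 or random.random() < math.exp(delta / t):
            w, f_w = cand, f_c
            if f_w > f_best:
                best, f_best = list(w), f_w
        t *= cooling  # geometric cooling schedule
    return best, f_best

random.seed(1)  # illustrative run; a toy quadratic stands in for the SINR estimate
coeffs, value = simulated_annealing(lambda w: -sum((x - 0.3) ** 2 for x in w), [0.0] * 4)
```

As the temperature decays, the acceptance probability for worsening moves vanishes and the search becomes greedy, which is what drives convergence toward the best coefficients found.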
Modeling, Detection, and Defense Theories and Methods for Cyber-Physical Fusion Attacks in Smart Grid
WANG Wenting, TIAN Boyan, WU Fazong, HE Yunpeng, WANG Xin, YANG Ming, FENG Dongqin
 doi: 10.11999/JEIT250659
Abstract:
  Significance   Smart Grid (SG), the core of modern power systems, enables efficient energy management and dynamic regulation through cyber–physical integration. However, its high interconnectivity makes it a prime target for cyberattacks, including False Data Injection Attacks (FDIAs) and Denial-of-Service (DoS) attacks. These threats jeopardize the stability of power grids and may trigger severe consequences such as large-scale blackouts. Therefore, advancing research on the modeling, detection, and defense of cyber–physical attacks is essential to ensure the safe and reliable operation of SGs.  Progress   Significant progress has been achieved in cyber–physical security research for SGs. In attack modeling, discrete linear time-invariant system models effectively capture diverse attack patterns. Detection technologies are advancing rapidly, with physical-based methods (e.g., physical watermarking and moving target defense) complementing intelligent algorithms (e.g., deep learning and reinforcement learning). Defense systems are also being strengthened: lightweight encryption and blockchain technologies are applied to prevention, security-optimized Phasor Measurement Unit (PMU) deployment enhances equipment protection, and response mechanisms are being continuously refined.  Conclusions  Current research still requires improvement in attack modeling accuracy and real-time detection algorithms. Future work should focus on developing collaborative protection mechanisms between the cyber and physical layers, designing solutions that balance security with cost-effectiveness, and validating defense effectiveness through high-fidelity simulation platforms. This study establishes a systematic theoretical framework and technical roadmap for SG security, providing essential insights for safeguarding critical infrastructure.  
Prospects   Future research should advance in several directions: (1) deepening synergistic defense mechanisms between the information and physical layers; (2) prioritizing the development of cost-effective security solutions; (3) constructing high-fidelity information–physical simulation platforms to support research; and (4) exploring the application of emerging technologies such as digital twins and interpretable Artificial Intelligence (AI).
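As a concrete illustration of the residual-based bad-data checks that much FDIA detection work builds on, the sketch below flags measurements whose least-squares residual norm exceeds a threshold. It is a generic textbook example, not a method from the surveyed literature; the single-state measurement model, the values of `H`, and the threshold `tau` are all illustrative.

```python
def residual_test(H, z, tau):
    """Least-squares state estimate for z ≈ H·x (single scalar state) and a
    residual-norm test: a large ||z - H·x_hat||^2 suggests injected data."""
    x_hat = sum(h * m for h, m in zip(H, z)) / sum(h * h for h in H)
    r2 = sum((m - h * x_hat) ** 2 for h, m in zip(H, z))
    return x_hat, r2, r2 > tau

H = [1.0, 1.0, 0.5]            # measurement model (one state, three meters)
clean = [10.0, 10.1, 5.0]      # mutually consistent measurements
attacked = [10.0, 14.0, 5.0]   # one meter reading falsified
_, r2_clean, flag_clean = residual_test(H, clean, tau=0.5)
_, r2_attack, flag_attack = residual_test(H, attacked, tau=0.5)
```

A stealthy FDIA crafted as a = H·c leaves the residual unchanged, which is precisely why the surveyed detection approaches go beyond simple residual tests toward physical watermarking, moving target defense, and learning-based methods.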
Design and Optimization for Orbital Angular Momentum-based Wireless-Powered NOMA Communication System
CHEN Ruirui, CHEN Yu, RAN Jiale, SUN Yanjing, LI Song
 doi: 10.11999/JEIT250634
Abstract:
  Objective  The Internet of Things (IoT) requires not only interconnection among devices but also seamless connectivity among users, information, and things. Ensuring stable operation and extending the lifespan of IoT Devices (IDs) through continuous power supply have become urgent challenges in IoT-driven Sixth-Generation (6G) communications. Radio Frequency (RF) signals can simultaneously transmit information and energy, forming the basis for Simultaneous Wireless Information and Power Transfer (SWIPT). Non-Orthogonal Multiple Access (NOMA), a key technology in Fifth-Generation (5G) communications, enables multiple users to share the same time and frequency resources. Efficient wireless-powered NOMA communication requires a Line-of-Sight (LoS) channel. However, the strong correlation in LoS channels severely limits the degree of freedom, making it difficult for conventional spatial multiplexing to achieve capacity gains. To address this limitation, this study designs an Orbital Angular Momentum (OAM)-based wireless-powered NOMA communication system. By exploiting OAM mode multiplexing, multiple data streams can be transmitted independently through orthogonal OAM modes, thereby significantly enhancing communication capacity in LoS channels.  Methods  The OAM-based wireless-powered NOMA communication system is designed to enable simultaneous energy transfer and multi-channel information transmission for IDs under LoS conditions. Under the constraints of the communication capacity threshold and the harvested energy threshold, this study formulates a sum-capacity maximization problem by converting harvested energy into the achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal Power-Splitting (PS) factor is derived, and the optimal power allocation is obtained using the subgradient method. 
The transmitting Uniform Circular Array (UCA) employs the Movable Antenna (MA) technique to adjust both position and array angle. To maintain system performance under typical parallel misalignment conditions, a beam-steering method is investigated.  Results and Discussions  Simulation results demonstrate that the proposed OAM-based wireless-powered NOMA communication system effectively enhances capacity performance compared with conventional wireless communication systems. As the OAM mode increases, the sum capacity of the ID decreases. This occurs because higher OAM modes exhibit stronger hollow divergence characteristics, resulting in greater energy attenuation of the received OAM signals (Fig. 3). The sum capacity of the ID increases with the PS factor (Fig. 4). However, as the harvested energy threshold increases, the system’s sum capacity decreases (Fig. 5). When the communication capacity threshold increases, the sum capacity first rises and then gradually declines (Fig. 6). In power allocation optimization, allocating more power to the ID with the best channel condition further improves the total system capacity.  Conclusions  To enhance communication capacity under LoS conditions, this study designs an OAM-based wireless-powered NOMA communication system that employs mode multiplexing to achieve independent multi-channel information transmission. On this basis, a sum-capacity maximization problem is formulated under communication capacity and harvested energy threshold constraints by transforming harvested energy into achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal PS factor is derived, and the optimal power allocation is obtained using the subgradient method. 
In future work, the MA technique will be integrated into the proposed OAM-based wireless-powered NOMA system to further optimize sum-capacity performance based on the three-dimensional spatial configuration and adjustable array angle.
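The power-allocation subproblem above is solved in the paper with the subgradient method; the sketch below illustrates the same dual-variable idea in its simplest water-filling form, bisecting the water level until the total power budget is met. The channel gains and budget are illustrative values, and this is a stand-in for, not a reproduction of, the paper's formulation.

```python
def water_filling(gains, total_power, iters=60):
    """Bisection on the water level mu, with per-user power p_k = max(0, mu - 1/g_k).
    The allocated power is nondecreasing in mu, so bisection converges."""
    lo, hi = 0.0, total_power + max(1.0 / g for g in gains)
    for _ in range(iters):
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > total_power:
            hi = mu  # over budget: lower the water level
        else:
            lo = mu  # under budget: raise it
    return [max(0.0, mu - 1.0 / g) for g in gains]

# Three channels with decreasing gain; total transmit power budget of 3.
p = water_filling([2.0, 1.0, 0.25], total_power=3.0)
```

Consistent with the observation in the abstract, the allocation gives the most power to the strongest channel and may shut off very weak ones entirely.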
Speaker Verification Based on Tide-Ripple Convolutional Neural Network
CHEN Chen, YI Zhixin, LI Dongyuan, CHEN Deyun
 doi: 10.11999/JEIT250713
Abstract:
  Objective  State-of-the-art speaker verification models typically rely on fixed receptive fields, which limits their ability to represent multi-scale acoustic patterns while increasing parameter counts and computational loads. Speech contains layered temporal–spectral structures, yet the use of dynamic receptive fields to characterize these structures is still not well explored. The design principles for effective dynamic receptive field mechanisms also remain unclear.  Methods  Inspired by the non-linear coupling behavior of tidal surges, a Tide-Ripple Convolution (TR-Conv) layer is proposed to form a more effective receptive field. TR-Conv constructs primary and auxiliary receptive fields within a window by applying power-of-two interpolation. It then employs a scan-pooling mechanism to capture salient information outside the window and an operator mechanism to perceive fine-grained variations within it. The fusion of these components produces a variable receptive field that is multi-scale and dynamic. A Tide-Ripple Convolutional Neural Network (TR-CNN) is developed to validate this design. To mitigate label noise in training datasets, a total loss function is introduced by combining a NoneTarget with Dynamic Normalization (NTDN) loss and a weighted Sub-center AAM Loss variant, improving model robustness and performance.  Results and Discussions  The TR-CNN is evaluated on the VoxCeleb1-O/E/H benchmarks. The results show that TR-CNN achieves a competitive balance of accuracy, computation, and parameter efficiency (Table 1). Compared with the strong ECAPA-TDNN baseline, the TR-CNN (C=512, n=1) model attains relative EER reductions of 4.95%, 4.03%, and 6.03%, and MinDCF reductions of 31.55%, 17.14%, and 17.42% across the three test sets, while requiring 32.7% fewer parameters and 23.5% less computation (Table 2). The optimal TR-CNN (C=1024, n=1) model further improves performance, achieving EERs of 0.85%, 1.10%, and 2.05%. 
Robustness is strengthened by the proposed total loss function, which yields consistent improvements in EER and MinDCF during fine-tuning (Table 3). Additional evaluations, including ablation studies (Tables 5 and 6), component analyses (Fig. 3 and Table 4), and t-SNE visualizations (Fig. 4), confirm the effectiveness and robustness of each module in the TR-CNN architecture.  Conclusions  This research proposes a simple and effective TR-Conv layer built on the T-RRF mechanism. Experimental results show that TR-Conv forms a more expressive and effective receptive field, reducing parameter count and computational cost while exceeding conventional one-dimensional convolution in speech feature modeling. It also exhibits strong lightweight characteristics and scalability. Furthermore, a total loss function combining the NTDN loss and a Sub-center AAM loss variant is proposed to enhance the discriminability and robustness of speaker embeddings, particularly under label noise. TR-Conv shows potential as a general-purpose module for integration into deeper and more complex network architectures.
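For readers unfamiliar with the reported metrics, the sketch below computes the Equal Error Rate (EER) cited above from raw trial scores by sweeping a decision threshold until the false-accept and false-reject rates cross. The score lists are made up for illustration; evaluation toolkits typically also interpolate between thresholds.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Sweep candidate thresholds over all observed scores; the EER is the
    operating point where false-reject and false-accept rates are closest."""
    best_gap, eer = 1.0, None
    for thr in sorted(target_scores + nontarget_scores):
        fr = sum(s < thr for s in target_scores) / len(target_scores)
        fa = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        if abs(fr - fa) < best_gap:
            best_gap, eer = abs(fr - fa), (fr + fa) / 2
    return eer

# Illustrative cosine-similarity scores for target and non-target trials.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.5, 0.3, 0.2, 0.1])
```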
Dynamic Target Localization Method Based on Optical Quantum Transmission Distance Matrix Construction
ZHOU Mu, WANG Min, CAO Jingyang, HE Wei
 doi: 10.11999/JEIT250020
Abstract:
  Objective  Quantum information research has grown rapidly with the integration of quantum mechanics, information science, and computer science. Grounded in principles such as quantum superposition and quantum entanglement, quantum information technology can overcome the limitations of traditional approaches and address problems that classical information technologies and conventional computers cannot resolve. As a core technology, space-based quantum information technology has advanced quickly, offering new possibilities to overcome the performance bottlenecks of conventional positioning systems. However, existing quantum positioning methods mainly focus on stationary targets and have difficulty addressing the dynamic variations in the transmission channels of entangled photon pairs caused by particles, scatterers, and noise photons in the environment. These factors hinder the detection of moving targets and increase positioning errors because of reduced data acquisition at fixed points during target motion. Traditional wireless signal-based localization methods also face challenges in dynamic target tracking, including signal attenuation, multipath effects, and noise interference in complex environments. To address these limitations, a dynamic target localization method based on constructing an optical quantum transmission distance matrix is proposed. This method achieves high-precision and robust dynamic localization, meeting the requirements for moving target localization in practical scenarios. It provides centimeter-level positioning accuracy and significantly enhances the adaptability and stability of the system for moving targets, supporting the future practical application of quantum-based dynamic localization technology.  
Methods  To improve the accuracy of the dynamic target localization system, a dynamic threshold optical quantum detection model based on background noise estimation is proposed, utilizing the characteristics of optical quantum echo signals. A dynamic target localization optical path is established in which two entangled optical signals are generated through the Spontaneous Parametric Down-Conversion (SPDC) process. One signal is retained as a reference in a local Single-Photon Detector (SPD), and the other is transmitted toward the moving target as the signal light. The optical quantum echo signals are analyzed, and the background noise is estimated using a coincidence counting algorithm. The detection threshold is then dynamically adjusted and compared with the signals from the detection unit, enabling rapid detection of dynamic targets. To accommodate variations in quantum echo signals caused by target motion, an adaptive optical quantum grouping method based on velocity measurement is introduced. The time pulse sequence is initially coarsely grouped to calculate the rough velocity of the target. The grouping size is subsequently adjusted according to the target’s speed, updating the time grouping sequence and further optimizing the distance measurement accuracy to generate an updated velocity matrix. The photon transmission distance matrix is refined using the relative velocity error matrix. By constructing a system of equations involving the coordinates of the light source, the optical quantum transmission distance matrix, and the dynamic target coordinate sequence, the target position is estimated through the least squares method. This approach improves localization accuracy and effectively reduces errors arising from target motion.  Results and Discussions  The effectiveness of the proposed method is verified through both simulations and experimental validation on a practical measurement platform. 
The experimental results demonstrate that the dynamic threshold detection approach based on background noise estimation achieves high-sensitivity detection performance (Fig. 7). When a moving target enters the detection range, rapid identification is realized, enabling subsequent dynamic localization. The adaptive grouping method based on velocity measurement significantly improves the performance of the quantum dynamic target localization system. Through grouped coincidence counting, the problem of blurred coincidence counting peaks caused by target movement is effectively mitigated (Fig. 8), achieving high-precision velocity measurement (Table 1) and reducing localization errors associated with motion. Centimeter-level positioning accuracy is attained (Fig. 9). Furthermore, an entangled optical quantum experimental platform is established, with analyses focusing on measurement results under different velocities and localization performances across various methods. The findings confirm the reliability and adaptability of the proposed approach in improving distance measurement accuracy (Fig. 12).  Conclusions  A novel method for dynamic target localization in entangled optical quantum dynamics is proposed based on constructing an optical quantum transmission distance matrix. The method enhances distance measurement accuracy and optimizes the overall positioning accuracy of the localization system through a background noise estimation-based dynamic threshold detection model and a velocity measurement-based adaptive grouping approach. By integrating the optical quantum transmission distance matrix with the least squares optimization method, the proposed framework offers a promising direction for achieving more precise quantum localization systems and demonstrates strong potential for real-time dynamic target tracking. 
This approach not only improves the accuracy of dynamic quantum localization systems but also broadens the applicability of quantum localization technology in complex environments. It is expected to provide solid support for real-time quantum dynamic target localization and find applications in intelligent health monitoring, the Internet of Things, and autonomous driving.
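The final position estimate described above solves a system of range equations by least squares. The sketch below shows the standard 2-D linearization trick with three sources, where subtracting the first range equation makes the problem linear and, for exactly three anchors, the least-squares solution reduces to a 2x2 system. The anchor geometry is illustrative, and the sketch omits the paper's velocity-based refinement of the distance matrix.

```python
import math

def locate(anchors, dists):
    """Subtract the first range equation (x-xi)^2 + (y-yi)^2 = di^2 from the
    others to obtain linear equations in (x, y), then solve by Cramer's rule."""
    (x0, y0), d0 = anchors[0], dists[0]
    rows = []
    for (xi, yi), di in zip(anchors[1:], dists[1:]):
        a, b = 2 * (xi - x0), 2 * (yi - y0)
        c = (xi**2 + yi**2 - x0**2 - y0**2) - (di**2 - d0**2)
        rows.append((a, b, c))
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Known source positions and noiseless ranges to a target at (2, 1).
anchors = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
target = (2.0, 1.0)
dists = [math.dist(a, target) for a in anchors]
est = locate(anchors, dists)
```

With more than three sources the same linearized rows form an overdetermined system, solved in the least-squares sense as in the paper.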
Performance Optimization of UAV-RIS-assisted Communication Networks Under No-Fly Zone Constraints
XU Junjie, LI Bin, YANG Jingsong
 doi: 10.11999/JEIT250681
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RIS) mounted on Unmanned Aerial Vehicles (UAVs) are considered an effective approach to enhance wireless communication coverage and adaptability in complex or constrained environments. However, two major challenges remain in practical deployment. The existence of No-Fly Zones (NFZs), such as airports, government facilities, and high-rise areas, restricts the UAV flight trajectory and may result in communication blind spots. In addition, the continuous attitude variation of UAVs during flight causes dynamic misalignment between the RIS and the desired reflection direction, which reduces signal strength and system throughput. To address these challenges, a UAV-RIS-assisted communication framework is proposed that simultaneously considers NFZ avoidance and UAV attitude adjustment. In this framework, a quadrotor UAV equipped with a bottom-mounted RIS operates in an environment containing multiple polygonal NFZs and a group of Ground Users (GUs). The aim is to jointly optimize the UAV trajectory, RIS phase shift, UAV attitude (represented by Euler angles), and Base Station (BS) beamforming to maximize the system sum rate while ensuring complete obstacle avoidance and stable, high-quality service for GUs located both inside and outside NFZs.  Methods  To achieve this objective, a multi-variable coupled non-convex optimization problem is formulated, jointly capturing UAV trajectory, RIS configuration, UAV attitude, and BS beamforming under NFZ constraints. The RIS phase shifts are dynamically adjusted according to the UAV orientation to maintain beam alignment, and UAV motion follows quadrotor dynamics while avoiding polygonal NFZs. Because of the high dimensionality and non-convexity of the problem, conventional optimization approaches are computationally intensive and lack real-time adaptability. 
To address this issue, the problem is reformulated as a Markov Decision Process (MDP), which enables policy learning through deep reinforcement learning. The Soft Actor-Critic (SAC) algorithm is employed, leveraging entropy regularization to improve exploration efficiency and convergence stability. The UAV-RIS agent interacts iteratively with the environment, updating actor-critic networks to determine UAV position, RIS phase configuration, and BS beamforming. Through continuous learning, the proposed framework achieves higher throughput and reliable NFZ avoidance, outperforming existing benchmarks.  Results and Discussions  As shown in Fig. 3, the proposed SAC algorithm achieves higher communication rates than PPO, DDPG, and TD3 during training, benefiting from entropy-regularized exploration that prevents premature convergence. Although DDPG converges faster, it exhibits instability and inferior long-term performance. As illustrated in Fig. 4, the UAV trajectories under different conditions demonstrate the proposed algorithm’s capability to achieve complete obstacle avoidance while maintaining reliable communication. Regardless of variations in initial UAV positions, BS locations, or NFZ configurations, the UAV consistently avoids all NFZs and dynamically adjusts its trajectory to serve users located both inside and outside restricted zones, indicating strong adaptability and scalability of the proposed model. As shown in Fig. 5, increasing the number of BS antennas enhances system performance. The proposed framework significantly outperforms fixed phase shift, random phase shift, and non-RIS schemes because of improved beamforming flexibility.  Conclusions  This paper investigates a UAV-RIS-assisted wireless communication system in which a quadrotor UAV carries an RIS to enhance signal reflection and ensure NFZ avoidance.
Unlike conventional approaches that emphasize avoidance alone, a path integral-based method is proposed to generate obstacle-free trajectories while maintaining reliable service for GUs both inside and outside NFZs. To improve generality, NFZs are represented as prismatic obstacles with regular n-sided polygonal cross-sections. The system jointly optimizes UAV trajectory, RIS phase shifts, UAV attitude, and BS beamforming. A DRL framework based on the SAC algorithm is developed to enhance system efficiency. Simulation results demonstrate that the proposed approach achieves reliable NFZ avoidance and a maximized sum rate, and outperforms benchmarks in communication performance, scalability, and stability.
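Checking whether a candidate waypoint falls inside a polygonal NFZ cross-section is a prerequisite for any such trajectory design. The sketch below is the standard ray-casting point-in-polygon test, not the paper's path-integral method; the square NFZ and the test points are illustrative.

```python
def inside_polygon(pt, poly):
    """Ray-casting test: count crossings of a rightward ray from pt with the
    polygon's edges; an odd crossing count means the point is inside."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

nfz = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]  # square no-fly zone
inside_pt = inside_polygon((2.0, 2.0), nfz)
outside_pt = inside_polygon((6.0, 2.0), nfz)
```

In a trajectory planner, a waypoint for which this test returns True would be rejected or penalized before the RIS and beamforming variables are optimized.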
Minimax Robust Kalman Filtering under Multistep Random Measurement Delays and Packet Dropouts
YANG Chunshan, ZHAO Ying, LIU Zheng, QIU Yuan, JING Benqin
 doi: 10.11999/JEIT250741
Abstract:
  Objective  Networked Control Systems (NCSs) provide advantages such as flexible installation, convenient maintenance, and reduced cost, but they also present challenges arising from random measurement delays and packet dropouts caused by communication network unreliability and limited bandwidth. Moreover, system noise variance may fluctuate significantly under strong electromagnetic interference. In NCSs, time delays are random and uncertain. When a set of Bernoulli-distributed random variables is used to describe multistep random measurement delays and packet dropouts, the fictitious noise method in existing studies introduces autocorrelation among different components, which complicates the computation of fictitious noise variances and makes it difficult to establish robustness. This study presents a solution for minimax robust Kalman filtering in systems characterized by uncertain noise variance, multistep random measurement delays, and packet dropouts.  Methods  The main challenges lie in model transformation and robustness verification. When a set of Bernoulli-distributed random variables is employed to represent multistep random measurement delays and packet dropouts, a series of strategies are applied to address the minimax robust Kalman filtering problem. First, a new model transformation method is proposed based on the flexibility of the Hadamard product in multidimensional data processing, after which a robust time-varying Kalman estimator is designed in a unified framework following the minimax robust filtering principle. Second, the robustness proof is established using matrix elementary transformation, strictly diagonally dominant matrices, the Gershgorin circle theorem, and the Hadamard product theorem within the framework of the generalized Lyapunov equation method.
Additionally, by converting the Hadamard product into a matrix product through matrix factorization, a sufficient condition for the existence of a steady-state estimator is derived, and the robust steady-state Kalman estimator is subsequently designed.  Results and Discussions  The proposed minimax robust Kalman filter extends the robust Kalman filtering framework and provides new theoretical support for addressing the robust fusion filtering problem in complex NCSs. The curves (Fig. 5) present the actual accuracy ${\text{tr}}\bar{\mathbf{P}}^l(N)$, $l = a,b,c,d$ as a function of $0.1 \le \alpha_0, \alpha_1, \alpha_2 \le 1$. It is observed that situation (1) achieves the highest robust accuracy, followed by situations (2) and (3), whereas situation (4) exhibits poorer accuracy. This difference arises because the estimators in situation (1) receive measurements with one-step random delay, whereas situation (4) experiences a higher packet loss rate. The curves (Fig. 5) confirm the validity and effectiveness of the proposed method. Another simulation is conducted for a mass-spring-damper system. The comparison between the proposed approach and the optimal robust filtering method (Table 2, Fig. 7) indicates that although the proposed method ensures that the actual prediction error variance attains the minimum upper bound, its actual accuracy is slightly lower than the optimal prediction accuracy.  Conclusions  The minimax robust Kalman filtering problem is investigated for systems characterized by uncertain noise variance, multistep random measurement delays, and packet dropouts.
The system noise variance is uncertain but bounded by known conservative upper limits, and a set of Bernoulli-distributed random variables with known probabilities is used to represent the multistep random measurement delays and packet dropouts between the sensor and the estimator. The Hadamard product is used to enhance the model transformation method, followed by the design of a minimax robust time-varying Kalman estimator. Robustness is demonstrated through matrix elementary transformation, the Gershgorin circle theorem, the Hadamard product theorem, matrix factorization, and the Lyapunov equation method. A sufficient condition is established for the time-varying generalized Lyapunov equation to possess a unique steady-state positive semidefinite solution, based on which a robust steady-state estimator is constructed. The convergence between the time-varying and steady-state estimators is also proven. Two simulation examples verify the effectiveness of the proposed approach. The presented methods overcome the limitations of existing techniques and provide theoretical support for solving the robust fusion filtering problem in complex NCSs.
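The Gershgorin-based argument rests on strict diagonal dominance: when every diagonal entry exceeds its off-diagonal row sum in magnitude, all Gershgorin discs exclude zero, so the matrix is nonsingular. A minimal numeric check of this property, with an illustrative matrix:

```python
def gershgorin_discs(A):
    """Return (center, radius) per row: each eigenvalue of A lies in at least
    one disc centred at A[i][i] with radius equal to the off-diagonal row sum."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def strictly_diagonally_dominant(A):
    # No disc contains zero iff |center| > radius for every row.
    return all(abs(c) > r for c, r in gershgorin_discs(A))

A = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.5],
     [0.0, 1.0, 2.0]]
dominant = strictly_diagonally_dominant(A)
```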
Short Packet Secure Covert Communication Design and Optimization
TIAN Bo, YANG Weiwei, SHA Li, SHANG Zhihui, CAO Kuo, LIU Changming
 doi: 10.11999/JEIT250800
Abstract:
  Objective  The study addresses the dual security threats of eavesdropping and detection in Multiple-Input Single-Output (MISO) communication systems under short packet transmission conditions. An integrated secure and covert transmission scheme is proposed, combining physical layer security with covert communication techniques. The approach aims to overcome the limitations of conventional encryption in short packet scenarios, enhance communication concealment, and ensure information confidentiality. The optimization objective is to maximize the Average Effective Secrecy and Covert Rate (AESCR) through the joint optimization of packet length and transmit power, thereby providing robust security for low-latency Internet of Things (IoT) applications.  Methods  A MISO system model employing Maximum Ratio Transmission (MRT) beamforming is adopted to exploit spatial degrees of freedom for improved security. Through theoretical analysis, closed-form expressions are derived for the warden’s (Willie’s) optimal detection threshold and minimum detection error probability. A statistical covertness constraint based on Kullback–Leibler (KL) divergence is formulated to convert intractable instantaneous requirements into a tractable average constraint. A new performance metric, the AESCR, is proposed to comprehensively assess system performance in terms of covertness, secrecy, and reliability. The optimization strategy centers on the joint design of packet length and transmit power. By utilizing the inherent coupling between these variables, the original dual-variable maximization problem is reformulated into a tractable form solvable through an efficient one-dimensional search.  Results and Discussions   Simulation results confirm the theoretical analysis, showing close consistency between the derived expressions and Monte Carlo simulations for Willie’s detection error probability.
The findings indicate that multi-antenna configurations markedly enhance the AESCR by directing signal energy toward the legitimate receiver and reducing eavesdropping risk. The proposed joint optimization of transmit power and packet length achieves a substantially higher AESCR than power-only optimization, particularly under stringent covertness constraints. The study further reveals key trade-offs: an optimal packet length exists that balances coding gain and exposure risk, while relaxed covertness constraints yield continuous improvements in AESCR. Moreover, multi-antenna technology is shown to be crucial for mitigating the inherent low-power limitations of covert communication.  Conclusions  This study presents an integrated framework for secure and covert communication in short packet MISO systems, achieving notable performance gains through the joint optimization of transmit power and packet length. The main contributions include: (1) a transmission architecture that combines security and covertness, supported by closed-form solutions for the warden’s detection threshold and error probability under a KL divergence-based constraint; (2) the introduction of the AESCR metric, which unifies the assessment of secrecy, covertness, and reliability; and (3) the formulation and efficient resolution of the AESCR maximization problem. Simulation results verify that the proposed joint optimization strategy exceeds power-only optimization, particularly under stringent covertness conditions. The AESCR increases monotonically with the number of transmit antennas, and an optimal packet length is identified that balances transmission efficiency and covertness.
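The reduction of the dual-variable AESCR problem to a one-dimensional search can be illustrated with a small Python sketch. The functions `p_max` and `aescr` below are hypothetical placeholders (the paper's derived closed-form expressions are not reproduced here); only the search structure, in which the covertness constraint ties the admissible power to each candidate packet length, reflects the strategy described above.

```python
import math

# Hypothetical stand-ins for the closed-form expressions: the covertness
# constraint is assumed to pin the maximum admissible transmit power to each
# candidate packet length n, so the joint (power, length) maximization
# collapses to a one-dimensional search over n.

def p_max(n, covert_budget=0.1):
    # Placeholder: admissible power shrinks with packet length, reflecting
    # that longer packets give the warden more observations.
    return covert_budget / math.sqrt(n)

def aescr(n, p):
    # Placeholder rate: log-gain from power minus a finite-blocklength
    # penalty decaying as 1/sqrt(n); not the paper's AESCR expression.
    return max(0.0, math.log2(1.0 + 10.0 * p) - 1.0 / math.sqrt(n))

def optimize_packet_length(lengths):
    # One-dimensional search: power follows from the covertness constraint.
    return max(((n, p_max(n)) for n in lengths),
               key=lambda pair: aescr(*pair))

best_n, best_p = optimize_packet_length(range(50, 1001, 50))
```

With these illustrative forms the search simply returns the packet length whose constraint-limited power yields the largest rate; the paper's actual trade-off produces an interior optimal packet length.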
Coalition Formation Game based User and Networking Method for Status Update Satellite Internet of Things
GAO Zhixiang, LIU Aijun, HAN Chen, ZHANG Senbai, LIN Xin
 doi: 10.11999/JEIT250838
[Abstract](48) [FullText HTML](21) [PDF 2055KB](4)
Abstract:
  Objective  Satellite communication has become a major focus in the development of next-generation wireless networks due to its advantages of wide coverage, long communication distance, and high flexibility in networking. Short-packet communication represents a critical scenario in the Satellite Internet of Things (S-IoT). However, research on the status update problem for massive users remains limited. It is necessary to design reasonable user-networking schemes to address the contradiction between massive user access demands and limited communication resources. In addition, under the condition of large-scale user access, the design of user-networking schemes with low complexity remains a key research challenge. This study presents a solution for status updates in S-IoT based on dynamic orthogonal access for massive users.  Methods  In the S-IoT, a state update model for user orthogonal dual-layer access is established. A dual-layer networking scheme is proposed in which users dynamically allocate bandwidth to access the base station, and the base station adopts time-slot polling to access the satellite. The closed-form expression of the average Age of Information (aAoI) for users is derived based on short-packet communication theory, and a simplified approximate expression is further obtained under high signal-to-noise ratio conditions. Subsequently, a distributed Dual-layer Coalition Formation Game User-base Station-Satellite Networking (DCFGUSSN) algorithm is proposed based on the coalition formation game framework.  Results and Discussions  The approximate aAoI expression effectively reduces computational complexity. The exact potential game is used to demonstrate that the proposed DCFGUSSN algorithm achieves stable networking formation. Simulation results verify the correctness of the theoretical analysis of user aAoI in the proposed state update model (Fig. 5). 
The results further indicate that with an increasing number of iterations, the user aAoI gradually decreases and eventually converges (Fig. 6). Compared with other access schemes, the proposed dual-layer access scheme achieves a lower aAoI (Figs. 7~9).  Conclusions  This study investigates the networking problem of massive users assisted by base stations in the status update S-IoT. A dynamic dual-layer user access framework and the corresponding status update model are first established. Based on this framework, the DCFGUSSN algorithm is proposed to reduce user aAoI. Theoretical and simulation results show strong consistency, and the proposed algorithm demonstrates significant performance improvement compared with traditional algorithms.
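The scaling that a closed-form aAoI for time-slot polling captures can be checked numerically with a toy simulation. The one-slot delivery age and unit slot length below are simplifying assumptions, not the paper's short-packet model.

```python
def simulated_avg_aoi(num_users, slot=1.0, rounds=200):
    # Round-robin polling: user u is served in slot u of every round, and a
    # delivered update arrives with age equal to one slot (its service time).
    # Ages start from their steady-state values, so the average is exact.
    age = [slot * (num_users - u) for u in range(num_users)]
    area = 0.0
    for _ in range(rounds):
        for u in range(num_users):
            for v in range(num_users):
                # Trapezoid: each age rises linearly by one slot per slot.
                area += slot * (age[v] + slot / 2.0)
                age[v] += slot
            age[u] = slot  # user u's fresh update is delivered
    elapsed = rounds * num_users * slot
    return area / (elapsed * num_users)
```

Under these assumptions the simulated average equals the sawtooth value slot * (1 + num_users / 2): the aAoI grows linearly with the number of users sharing the polling cycle, which is exactly the contention the dual-layer coalition scheme is designed to relieve.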
Vegetation Height Prediction Dataset Oriented to Mountainous Forest Areas
YU Cuilin, ZHONG Zixuan, PANG Hongyi, DING Yusheng, LAI Tao, HUANG Haifeng, WANG Qingsong
 doi: 10.11999/JEIT250941
[Abstract](64) [FullText HTML](33) [PDF 6680KB](7)
Abstract:
  Objective   Vegetation height is a key ecological parameter that reflects forest vertical structure, biomass, ecosystem functions, and biodiversity. Existing open-source vegetation height datasets are often sparse, unstable, and poorly suited to mountainous forest regions, which limits their utility for large-scale modeling. This study constructs the Vegetation Height Prediction Dataset (VHP-Dataset) to provide a standardized large-scale training resource that integrates multi-source remote sensing features and supports supervised learning tasks for vegetation height estimation.  Methods   The VHP-Dataset is constructed by integrating Landsat 8 multispectral imagery, the digital elevation model AW3D30 (ALOS World 3D, 30 m), land cover data CGLS-LC100 (Copernicus Global Land Service, Land Cover 100 m), and tree canopy cover data GFCC30TC (Global Forest Canopy Cover 30 m Tree Canopy). Canopy height from GEDI L2A (Global Ecosystem Dynamics Investigation, Level 2A) footprints is used as the target variable. A total of 18 input features is extracted, covering spatial location, spectral reflectance, topographic structure, vegetation indices, and vegetation cover information (Table 4, Fig. 4). For model validation, five representative approaches are applied: Extremely Randomized Trees (ExtraTree), Random Forest (RF), Artificial Neural Network (ANN), Broad Learning System (BLS), and Transformer. Model performance is assessed using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Standard Deviation (SD), and Coefficient of Determination (R2).  Results and Discussions   The experimental results show that the VHP-Dataset supports stable vegetation height prediction across regions and terrain conditions, which reflects its scientific validity and practical applicability. 
Model comparisons indicate that ExtraTree achieves the best performance in most regions, and Transformer performs well in specific areas, which confirms that the dataset is compatible with different approaches (Table 6). Stratified analyses show that prediction errors increase under high canopy cover and steep slope conditions, and predictions remain more stable at higher elevations (Figs. 6~9). These findings indicate that the dataset captures the effects of complex topography and canopy structure on model accuracy. Feature importance analysis shows that spatial location, topographic factors, and canopy cover indices are the primary drivers of prediction accuracy, while spectral and land cover information provide complementary contributions (Fig. 10).  Conclusions   The results show that the VHP-Dataset supports vegetation height prediction across regions and terrain types, which reflects its scientific validity and applicability. The dataset enables robust predictions with traditional machine learning methods such as tree-based models, and it also provides a foundation for deep learning approaches such as Transformers, which reflects broad methodological compatibility. Stratified analyses based on vegetation cover and terrain show the effects of complex canopy structures and topographic factors on prediction accuracy, and feature importance analysis identifies spatial location, topographic attributes, and canopy cover indices as the primary drivers. Overall, the VHP-Dataset fills the gap in large-scale high-quality datasets for vegetation height prediction in mountainous forests and provides a standardized benchmark for cross-regional model evaluation and comparison. This offers value for research on vegetation height prediction and forest ecosystem monitoring.
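Readers who wish to benchmark their own models against the VHP-Dataset can compute the four reported scores directly. The sketch below is a plain-Python implementation; population (divide-by-n) formulas are assumed for RMSE and SD.

```python
import math

def regression_metrics(y_true, y_pred):
    # RMSE, MAE, residual SD, and coefficient of determination R^2: the
    # four scores used to compare models on the VHP-Dataset.
    n = len(y_true)
    resid = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(r * r for r in resid) / n)
    mae = sum(abs(r) for r in resid) / n
    mean_r = sum(resid) / n
    sd = math.sqrt(sum((r - mean_r) ** 2 for r in resid) / n)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - sum(r * r for r in resid) / ss_tot
    return {"RMSE": rmse, "MAE": mae, "SD": sd, "R2": r2}
```

A constant bias of one unit, for example, gives RMSE = MAE = 1 with zero residual SD, while R2 falls below zero once the residual sum of squares exceeds the target variance.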
Comparison of DeepSeek-V3.1 and ChatGPT-5 in Multidisciplinary Team Decision-making for Colorectal Liver Metastases
ZHANG Yangzi, XU Ting, GAO Zhaoya, SI Zhenduo, XU Weiran
 doi: 10.11999/JEIT250849
[Abstract](69) [FullText HTML](29) [PDF 787KB](7)
Abstract:
  Objective   ColoRectal Cancer (CRC) is the third most commonly diagnosed malignancy worldwide. Approximately 25~50% of patients with CRC develop liver metastases during the course of their disease, which increases the disease burden. Although the MultiDisciplinary Team (MDT) model improves survival in ColoRectal Liver Metastases (CRLM), its broader implementation is limited by delayed knowledge updates and regional differences in medical standards. Large Language Models (LLMs) can integrate multimodal data, clinical guidelines, and recent research findings, and can generate structured diagnostic and therapeutic recommendations. These features suggest potential to support MDT-based care. However, the actual effectiveness of LLMs in MDT decision-making for CRLM has not been systematically evaluated. This study assesses the performance of DeepSeek-V3.1 and ChatGPT-5 in supporting MDT decisions for CRLM and examines the consistency of their recommendations with MDT expert consensus. The findings provide evidence-based guidance and identify directions for optimizing LLM applications in clinical practice.  Methods   Six representative virtual CRLM cases are designed to capture key clinical dimensions, including colorectal tumor recurrence risk, resectability of liver metastases, genetic mutation profiles (e.g., KRAS/BRAF mutations, HER2 amplification status, and microsatellite instability), and patient functional status. Using a structured prompt strategy, MDT treatment recommendations are generated separately by the DeepSeek-V3.1 and ChatGPT-5 models. Independent evaluations are conducted by four MDT specialists from gastrointestinal oncology, gastrointestinal surgery, hepatobiliary surgery, and radiation oncology. The model outputs are scored using a 5-point Likert scale across seven dimensions: accuracy, comprehensiveness, frontier relevance, clarity, individualization, hallucination risk, and ethical safety. 
Statistical analysis is performed to compare the performance of DeepSeek-V3.1 and ChatGPT-5 across individual cases, evaluation dimensions, and clinical disciplines.  Results and Discussions   Both LLMs, DeepSeek-V3.1 and ChatGPT-5, show robust performance across all six virtual CRLM cases, with an average overall score of ≥ 4.0 on a 5-point scale. This performance indicates that clinically acceptable decision support is provided within a complex MDT framework. DeepSeek-V3.1 shows superior overall performance compared with ChatGPT-5 (4.27±0.77 vs. 4.08±0.86, P=0.03). Case-by-case analysis shows that DeepSeek-V3.1 performs significantly better in Cases 1, 4, and 6 (P=0.04, P<0.01, and P=0.01, respectively), whereas ChatGPT-5 receives higher scores in Case 2 (P<0.01). No significant differences are observed in Cases 3 and 5 (P=0.12 and P=1.00, respectively), suggesting complementary strengths across clinical scenarios (Table 3). In the multidimensional assessment, both models receive high scores (range: 4.12~4.87) in clarity, individualization, hallucination risk, and ethical safety, confirming that readable, patient-tailored, reliable, and ethically sound recommendations are generated. Improvements are still needed in accuracy, comprehensiveness, and frontier relevance (Fig. 1). DeepSeek-V3.1 shows a significant advantage in frontier relevance (3.90±0.65 vs. 3.24±0.72, P=0.03) and ethical safety (4.87±0.34 vs. 4.58±0.65, P=0.03) (Table 4), indicating more effective incorporation of recent evidence and more consistent delivery of ethically robust guidance. For the case with concomitant BRAF V600E and KRAS G12D mutations, DeepSeek-V3.1 accurately references a phase III randomized controlled study published in the New England Journal of Medicine in 2025 and recommends a triple regimen consisting of a BRAF inhibitor + EGFR monoclonal antibody + FOLFOX.
By contrast, ChatGPT-5 follows conventional recommendations for RAS/BRAF-mutant populations (FOLFOXIRI plus bevacizumab) without integrating recent evidence on targeted combination therapy. This difference shows the effect of timely knowledge updates on the clinical value of LLM-generated recommendations. For MSI-H CRLM, ChatGPT-5’s recommendation of “postoperative immunotherapy” is not supported by phase III evidence or existing guidelines. Direct use of such recommendations may lead to overtreatment or ineffective therapy, representing a clear ethical concern and illustrating hallucination risks in LLMs. Discipline-specific analysis shows notable variation. In radiation oncology, DeepSeek-V3.1 provides significantly more precise guidance on treatment timing, dosage, and techniques than ChatGPT-5 (4.55±0.67 vs. 3.38±0.91, P<0.01), demonstrating closer alignment with clinical guidelines. In contrast, ChatGPT-5 performs better in gastrointestinal surgery (4.48±0.67 vs. 4.17±0.85, P=0.02), with experts rating its recommendations on surgical timing and resectability as more concise and accurate. No significant differences are identified in gastrointestinal oncology and hepatobiliary surgery (P=0.89 and P=0.14, respectively), indicating comparable performance in these areas (Table 5). These findings show a performance bias across medical sub-specialties, demonstrating that LLM effectiveness depends on the distribution and quality of training data.  Conclusions   Both DeepSeek-V3.1 and ChatGPT-5 demonstrated strong capabilities in providing reliable recommendations for CRLM-MDT decision-making. Specifically, DeepSeek-V3.1 showed notable advantages in integrating cutting-edge knowledge, ensuring ethical safety, and performing in the field of radiation oncology, whereas ChatGPT-5 excelled in gastrointestinal surgery, reflecting a complementary strength between the two models.
This study confirms the feasibility of leveraging LLMs as “MDT collaborators”, offering a readily applicable and robust technical solution to bridge regional disparities in clinical expertise and enhance the efficiency of decision-making. However, model hallucination and insufficient evidence grading remain key limitations. Moving forward, mechanisms such as real-world clinical validation, evidence traceability, and reinforcement learning from human feedback are expected to further advance LLMs into more powerful auxiliary tools for CRLM-MDT decision support.
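Paired per-case comparisons of Likert ratings such as those above can be reproduced with simple nonparametric machinery. The sketch below implements a two-sided sign test on paired scores; it is a stand-in illustration, not necessarily the statistical test used in the study.

```python
from math import comb

def paired_sign_test(scores_a, scores_b):
    # Two-sided sign test on paired Likert ratings: tied pairs are dropped,
    # and the p-value is the binomial tail probability of a win/loss split
    # at least this extreme under the null of no systematic difference.
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    n = len(diffs)
    wins = sum(d > 0 for d in diffs)
    k = max(wins, n - wins)
    p = 2.0 * sum(comb(n, i) for i in range(k, n + 1)) * 0.5 ** n
    return min(1.0, p)
```

Six untied pairs all favoring one model, for instance, give p = 2/64 ≈ 0.031, crossing the conventional 0.05 threshold.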
Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation
WU Ting, WEN Shulin, YAN Zhaoli, FU Gaoyuan, LI Linfeng, LIU Xudu, CHENG Xiaobin, YANG Jun
 doi: 10.11999/JEIT250934
[Abstract](75) [FullText HTML](27) [PDF 16618KB](12)
Abstract:
  Objective  Hydro-Turbine Generator Units (HTGUs) require reliable early fault detection to maintain operational safety and reduce maintenance cost. Acoustic signals provide a non-intrusive and sensitive monitoring approach, but their use is limited by complex structural acoustics, strong background noise, and the scarcity of abnormal data. An unsupervised acoustic anomaly detection framework is presented, in which a large-scale pretrained audio model is integrated with density-based k-nearest neighbors estimation. This framework is designed to detect anomalies using only normal data and to maintain robustness and strong generalization across different operational conditions of HTGUs.  Methods  The framework performs unsupervised acoustic anomaly detection for HTGUs using only normal data. Time-domain signals are preprocessed with Z-score normalization and Fbank features, and random masking is applied to enhance robustness and generalization. A large-scale pretrained BEATs model is used as the feature encoder, and an Attentive Statistical Pooling module aggregates frame-level representations into discriminative segment-level embeddings by emphasizing informative frames. To improve class separability, an ArcFace loss replaces the conventional classification layer during training, and a warm-up learning rate strategy is adopted to ensure stable convergence. During inference, density-based k-nearest neighbors estimation is applied to the learned embeddings to detect acoustic anomalies.  Results and Discussions  The effectiveness of the proposed unsupervised acoustic anomaly detection framework for HTGUs is examined using data collected from eight real-world machines. As shown in Fig. 7 and Table 2, large-scale pretrained audio representations show superior capability compared with traditional features in distinguishing abnormal sounds. 
With the FED-KE algorithm, the framework attains high accuracy across six metrics, with Hmean reaching 98.7% in the wind tunnel and exceeding 99.9% in the slip-ring environment, indicating strong robustness under complex industrial conditions. As shown in Table 4, ablation studies confirm the complementary effects of feature enhancement, ASP-based representation refinement, and density-based k-NN inference. The framework requires only normal data for training, reducing dependence on scarce fault labels and enhancing practical applicability. Remaining challenges include computational cost introduced by the pretrained model and the absence of multimodal fusion, which will be addressed in future work.  Conclusions  An unsupervised acoustic anomaly detection framework is proposed for HTGUs, addressing the scarcity of fault samples and the complexity of industrial acoustic environments. A pretrained large-scale audio foundation model is adopted and fine-tuned with turbine-specific strategies to improve the modeling of normal operational acoustics. During inference, a density-estimation-based k-NN mechanism is applied to detect abnormal patterns using only normal data. Experiments conducted on real-world hydropower station recordings show high detection accuracy and strong generalization across different operating conditions, exceeding conventional supervised approaches. The framework introduces foundation-model-based audio representation learning into the hydro-turbine domain, provides an efficient adaptation strategy tailored to turbine acoustics, and integrates a robust density-based anomaly scoring mechanism. These components jointly reduce dependence on labeled anomalies and support practical deployment for intelligent condition monitoring. 
Future work will examine model compression, such as knowledge distillation, to enable on-device deployment, and explore semi-/self-supervised learning and multimodal fusion to enhance robustness, scalability, and cross-station adaptability.
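The inference-time scoring step admits a compact illustration. The sketch below scores a test embedding by its mean distance to the k nearest normal-only training embeddings; the toy 2-D vectors and the threshold are hypothetical placeholders for the BEATs segment-level features and a calibration on held-out normal clips.

```python
import math

def knn_anomaly_score(x, normal_embeddings, k=2):
    # Density proxy: mean Euclidean distance from x to its k nearest
    # neighbours among embeddings of normal machine sounds. Points in
    # sparse regions of the normal manifold receive large scores.
    dists = sorted(math.dist(x, e) for e in normal_embeddings)
    return sum(dists[:k]) / k

def is_anomalous(x, normal_embeddings, threshold, k=2):
    # In practice the threshold would be calibrated on held-out normal data.
    return knn_anomaly_score(x, normal_embeddings, k) > threshold
```

Because the score depends only on distances to normal training points, no abnormal recordings are needed at training time, which is the property the framework exploits.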
Research on Proximal Policy Optimization for Autonomous Long-Distance Rapid Rendezvous of Spacecraft
LIN Zheng, HU Haiying, DI Peng, ZHU Yongsheng, ZHOU Meijiang
 doi: 10.11999/JEIT250844
[Abstract](5) [FullText HTML](2) [PDF 3092KB](0)
Abstract:
  Objective  With the increasing demands of deep-space exploration, on-orbit servicing, and space debris removal missions, autonomous long-range rapid rendezvous capabilities have become critical for future space operations. Traditional trajectory planning approaches based on analytical methods or heuristic optimization often exhibit limitations when dealing with complex dynamics, strong disturbances, and uncertainties, which makes it difficult to balance efficiency and robustness. Deep Reinforcement Learning (DRL), by combining the approximation capabilities of deep neural networks with the decision-making strengths of reinforcement learning, enables adaptive learning and real-time decision-making in high-dimensional continuous state and action spaces. In particular, the Proximal Policy Optimization (PPO) algorithm, with its training stability, sample efficiency, and ease of implementation, has emerged as a representative policy gradient method that enhances policy exploration while ensuring stable policy updates. Therefore, integrating DRL with PPO into spacecraft long-range rapid rendezvous tasks can not only overcome the limitations of conventional methods but also provide an intelligent, efficient, and robust solution for autonomous guidance in complex orbital environments.  Methods  This study first establishes a spacecraft orbital dynamics model incorporating the effects of J2 perturbation, while also modeling uncertainties such as position and velocity measurement errors and actuator deviations during on-orbit operations. Subsequently, the long-range rapid rendezvous problem is formulated as a Markov Decision Process (MDP), with the state space defined by variables including position, velocity, and relative distance, and the action space characterized by impulse duration and direction. The model further integrates fuel consumption and terminal position and velocity constraints.
Based on this formulation, a DRL framework leveraging PPO was constructed, in which the policy network outputs maneuver command distributions and the value network estimates state values to improve training stability. To address convergence difficulties arising from sparse rewards, an enhanced dense reward function was designed, combining a position potential function with a velocity-guidance function to guide the agent toward the target while gradually decelerating and ensuring fuel efficiency. Finally, the optimal maneuver strategy for the spacecraft was obtained through simulation-based training, and its robustness was validated under various uncertainty conditions.  Results and Discussions  Based on the aforementioned DRL framework, a comprehensive simulation was conducted to evaluate the effectiveness and robustness of the proposed improved algorithm. In Case 1, three reward structures were tested: sparse reward, traditional dense reward, and an improved dense reward integrating a relative position potential function and a velocity guidance term. The results indicate that the design of the reward function significantly impacts convergence behavior and policy stability. With a sparse reward structure, the agent lacks process feedback, which hinders effective exploration of feasible actions. The traditional dense reward provides continuous feedback, allowing for gradual convergence toward local optima. However, terminal velocity deviations remain uncorrected in the later stages, leading to suboptimal convergence and incomplete satisfaction of terminal constraints. In contrast, the improved dense reward effectively guides the agent toward favorable behaviors from the early training stages while penalizing undesirable actions at each step, thereby accelerating convergence and enhancing robustness.
The velocity guidance term enables the agent to anticipate necessary adjustments during the mid-to-late phases of the approach, rather than postponing corrections until the terminal phase, resulting in more fuel-efficient maneuvers. The simulation results further demonstrate the actual performance: the maneuvering spacecraft executed 10 impulsive maneuvers throughout the mission, achieving a terminal relative distance of 21.326 km, a relative velocity of 0.0050 km/s, and a total fuel consumption of 111.2123 kg. Furthermore, to validate the robustness of the trained model against realistic uncertainties in orbital operations, 1000 Monte Carlo simulations were performed. As presented in Table 5, the mission success rate reached 63.40%, with fuel consumption in all trials remaining within acceptable bounds. Finally, to verify the superiority of the PPO algorithm, its performance was compared with that of DDPG in a multi-impulse fast-approach rendezvous mission in Case 2. The results from PPO training show that the maneuvering spacecraft performed 5 impulsive maneuvers, achieving a terminal separation of 2.2818 km, a relative velocity of 0.0038 km/s, and a total fuel consumption of 4.1486 kg. The DDPG training results indicate that the maneuvering spacecraft consumed 4.3225 kg of fuel, achieving a final separation of 4.2731 km and a relative velocity of 0.0020 km/s. Both algorithms successfully fulfill mission requirements, with comparable fuel usage. However, it is noted that DDPG required a training duration of 9 hours and 23 minutes, incurring significant computational resource consumption. In contrast, the PPO training process was relatively more efficient, converging within 6 hours and 4 minutes. Therefore, although DDPG exhibits higher sample efficiency, its longer training cycle and greater computational burden make it less efficient than PPO in practical applications. 
The comparative analysis demonstrates that the proposed PPO with the improved dense reward significantly enhances learning efficiency, policy stability, and robustness.  Conclusions  This study addressed the problem of autonomous long-range rapid rendezvous for spacecraft under J2 perturbation and uncertainties, and proposed a PPO-based trajectory optimization method. The results demonstrated that the proposed approach could generate maneuver trajectories satisfying terminal constraints under limited fuel and transfer time, while outperforming conventional methods in terms of convergence speed, fuel efficiency, and robustness. The main contributions of this work are: (1) the development of an orbital dynamics framework that incorporates J2 perturbation and uncertainty modeling, and the formulation of the rendezvous problem as an MDP; (2) the design of an enhanced dense reward function combining a position potential function and a velocity-guidance function, which effectively improved training stability and convergence efficiency; (3) simulation-based validation of PPO’s applicability and robustness in complex orbital environments, providing a feasible solution for future autonomous rendezvous and on-orbit servicing missions. Future work will consider sensor noise, environmental disturbances, and multi-spacecraft cooperative rendezvous in complex mission scenarios, aiming to enhance the algorithm’s practical applicability and generalization to real-world operations.
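The structure of the improved dense reward, a position potential plus a velocity-guidance term and a fuel penalty, can be sketched in a few lines. The weights and the distance-proportional desired closing speed below are illustrative assumptions, not the values used in the paper.

```python
import math

def shaped_reward(rel_pos, rel_vel, fuel_used,
                  w_pos=1.0, w_vel=0.5, w_fuel=0.01):
    # Position potential: less negative as the chaser closes the relative
    # distance, giving dense progress feedback at every step.
    dist = math.hypot(*rel_pos)
    potential = -w_pos * dist
    # Velocity guidance: reward speeds near a desired closing speed that
    # decays with distance, so the agent decelerates while approaching
    # instead of deferring corrections to the terminal phase.
    speed = math.hypot(*rel_vel)
    desired_speed = 0.1 * dist
    guidance = -w_vel * abs(speed - desired_speed)
    # Fuel penalty keeps the learned policy impulse-efficient.
    return potential + guidance - w_fuel * fuel_used
```

States closer to the target, with velocity matched to the decaying desired closing speed and less fuel spent, receive strictly larger rewards, which is the gradient the agent follows from the earliest training episodes.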
Optimization of Energy Consumption in Semantic Communication Networks for Image Recovery Tasks
CHEN Yang, MA Huan, JI Zhi, LI Ying Qi, LIANG Jia Yu, GUO Lan
 doi: 10.11999/JEIT250915
[Abstract](5) [FullText HTML](3) [PDF 4932KB](0)
Abstract:
  Objective  With the rapid development of semantic communication and the increasing demand for high-fidelity image recovery, high computation and transmission energy consumption have become critical issues limiting network deployment. However, existing resource management strategies are mostly static and have limitations in adapting to dynamic wireless environments and user mobility. To address these challenges, a robust energy optimization strategy driven by a modified Multi-Agent Proximal Policy Optimization (MAPPO) algorithm has emerged as a promising approach. By jointly optimizing communication and computing resources, it is possible to minimize the total network energy consumption while strictly satisfying multi-dimensional constraints such as latency and image recovery quality.  Methods  First, a theoretical model for the semantic communication network is constructed, and a closed-form expression for the user Symbol Error Rate (SER) is derived via asymptotic analysis of the uplink Signal-to-Interference-plus-Noise Ratio (SINR). Subsequently, the coupling relationship among semantic extraction rate, transmit power, computing resources, and network energy consumption is quantified. Based on this, a joint optimization model is established to minimize the total energy under constraints of delay, accuracy, and reliability. To solve this complex mixed-integer nonlinear programming problem, a modified MAPPO algorithm is designed. This algorithm incorporates Long Short-Term Memory (LSTM) networks to capture temporal dynamics of user positions and channel states, and introduces a noise mechanism into the global state and advantage function to enhance policy exploration and robustness.  Results and Discussions  Simulation results demonstrate that the proposed algorithm significantly outperforms baseline methods (including standard MAPPO, NOISE-MAPPO, LSTM-MAPPO, MADDPG, and Greedy algorithms).
Specifically, the proposed strategy accelerates the training convergence speed by 66.7%–80% compared to benchmarks. Furthermore, the algorithm exhibits superior stability in dynamic environments, improving the stability of network energy consumption by approximately 50% and user latency stability by over 96%. Additionally, the average SER is effectively reduced by 4%–16.33% without compromising the ultimate image recovery performance, verifying the algorithm's capability to balance energy efficiency and task reliability.  Conclusions   This paper addresses the challenge of energy optimization in semantic communication networks by integrating theoretical modeling with a modified deep reinforcement learning framework. The proposed decision-making method enhances the standard MAPPO algorithm by leveraging LSTM for temporal feature extraction and noise mechanisms for robust exploration. The method is evaluated through simulations in dynamic single-cell and multi-cell scenarios, and the results show that: (1) The proposed method significantly improves convergence efficiency and system stability over the baselines; (2) A better trade-off between energy consumption and service quality is achieved, providing a theoretical foundation and an efficient resource management framework for future energy-constrained semantic communication systems.
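The coupling among semantic extraction rate, transmit power, and computing resources can be illustrated with a toy energy/delay model and an exhaustive baseline search. All constants and functional forms below are illustrative assumptions, not the paper's derived expressions, and the grid search stands in for the policy that the modified MAPPO agent learns online.

```python
import math

def energy_and_delay(rho, p, f, bits=1e6, B=1e6, g=1e-7, n0=1e-13,
                     kappa=1e-27, cyc=500):
    # rho: semantic extraction rate (fraction of bits sent over the air),
    # p: transmit power in W, f: server CPU frequency in Hz. A smaller rho
    # lightens the uplink while the recovery compute load is kept fixed.
    rate = B * math.log2(1.0 + p * g / n0)           # Shannon uplink rate
    t_tx = rho * bits / rate
    t_cmp = cyc * bits / f
    energy = p * t_tx + kappa * cyc * bits * f ** 2  # transmit + compute
    return energy, t_tx + t_cmp

def best_config(delay_budget, rhos, powers, freqs):
    # Exhaustive baseline: the minimum-energy configuration meeting the
    # delay constraint, which a learned policy is meant to approximate.
    candidates = [((r, p, f),) + energy_and_delay(r, p, f)
                  for r in rhos for p in powers for f in freqs]
    feasible = [(cfg, e) for cfg, e, d in candidates if d <= delay_budget]
    return min(feasible, key=lambda item: item[1]) if feasible else None
```

With these numbers, a tight delay budget forces the higher CPU frequency, after which the search trims transmit energy by choosing the lowest feasible power and extraction rate, showing how the constraints couple the three variables.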
Differentiable Sparse Mask Guided Infrared Small Target Fast Detection Network
SHENG Weidong, WU Shuanglin, XIAO Chao, LONG Yunli, LI Xiaobin, ZHANG Yiming
 doi: 10.11999/JEIT250989
[Abstract](38) [FullText HTML](12) [PDF 3515KB](8)
Abstract:
  Objective  Infrared small target detection holds significant and irreplaceable application value across various critical domains, including infrared guidance, environmental monitoring, and security surveillance. Its importance is underscored by tasks such as early warning systems, precision targeting, and pollution tracking, where timely and accurate detection is paramount. The core challenges in this domain stem from the inherent characteristics of infrared small targets: their extremely small size (typically less than 9×9 pixels), limited spatial features due to long imaging distance, and the high probability of being overwhelmed by complex and cluttered backgrounds, such as cloud cover, sea glint, or urban thermal noise. These factors make it difficult to distinguish genuine targets from background clutter using conventional methods. Existing approaches to infrared small target detection can be broadly categorized into traditional model-based methods and modern deep learning techniques. Traditional methods often rely on manually designed background suppression operators, such as morphological filters (e.g., Top-Hat) or low-rank matrix recovery (e.g., IPI). While these methods are interpretable in simple scenarios, they struggle to adapt to dynamic and complex real-world environments, leading to high false alarm rates and limited robustness. On the other hand, deep learning-based methods, particularly those employing dense Convolutional Neural Networks (CNNs), have shown improved detection performance by leveraging data-driven feature learning. However, these networks often fail to fully account for the extreme imbalance between target and background pixels, where targets typically constitute less than 1% of the entire image. This imbalance results in significant computational redundancy, as the network processes vast background regions that contribute little to the detection task, thereby hampering efficiency and real-time performance.
To address these challenges, exploiting the sparsity of infrared small targets offers a promising direction. By designing a sparse mask generation module that capitalizes on target sparsity, it becomes feasible to coarsely extract potential target regions while filtering out the majority of redundant background areas. This coarse target region can then be refined through subsequent processing stages to achieve satisfactory detection performance. This paper presents an intelligent solution that effectively balances high detection accuracy with computational efficiency, making it suitable for real-time applications.  Methods  This paper proposes an end-to-end infrared small target detection network guided by a differentiable sparse mask. First, an input infrared image is preprocessed with convolution to generate raw features. A differentiable sparse mask generation module then uses two convolution branches to produce a probability map and a threshold map, and outputs a binary mask via a differentiable binarization function to extract target candidate regions and filter background redundancy. Next, a target region sampling module converts dense raw features into sparse features based on the binary mask. A sparse feature extraction module with a U-shaped structure (composed of encoders, decoders, and skip connections) using Minkowski Engine sparse convolution performs refined processing only on non-zero target regions to reduce computation. Finally, a pyramid pooling module fuses multi-scale sparse features, and the fused features are fed into a target-background binary classifier to output detection results.  Results and Discussions  To fully validate the effectiveness of the proposed method, comprehensive experiments were conducted on two mainstream infrared small target datasets: NUAA-SIRST, which contains 427 real-world infrared images extracted from actual videos, and NUDT-SIRST, a large-scale synthetic dataset with 1327 diverse images. 
The method was compared against 3 representative traditional algorithms (e.g., Top-Hat, IPI) and 6 state-of-the-art deep learning methods (e.g., DNA-Net, ACM). Results demonstrate the method achieves competitive detection performance: on NUAA-SIRST, it attains 74.38% IoU, 100% Pd, and 7.98×10⁻⁶ Fa; on NUDT-SIRST, it reaches 83.03% IoU, 97.67% Pd, and 9.81×10⁻⁶ Fa, matching the performance of leading deep learning methods. Notably, it excels in efficiency: with only 0.35M parameters, 11.10G FLOPs, and 215.06 FPS, its FPS is 4.8 times that of DNA-Net, significantly cutting computational redundancy. Ablation experiments (Fig.6) confirm that the differentiable sparse mask module effectively filters most backgrounds while preserving target regions. Visual results (Fig.5) show fewer false alarms than traditional methods such as PSTNN, as the "coarse-to-fine" scheme reduces background interference, verifying a balance of performance and efficiency.  Conclusions  This paper addresses the massive computational redundancy of existing dense computing methods in infrared small target detection, caused by the extremely unbalanced target-background proportion (targets usually occupy less than 1% of the whole image), by proposing a fast infrared small target detection network guided by a differentiable sparse mask. The network adaptively extracts candidate target regions and filters background redundancy via a differentiable sparse mask generation module, and constructs a feature extraction module based on Minkowski Engine sparse convolution to reduce computation, forming an end-to-end "coarse-to-fine" detection framework. Experiments on NUDT-SIRST and NUAA-SIRST datasets demonstrate that the proposed method achieves detection performance comparable to existing deep learning methods while significantly improving computational efficiency, balancing detection accuracy and speed. 
It offers a new, sparsity-based approach to reducing redundancy in infrared small target detection, is applicable to scenarios such as remote sensing, infrared guidance, and environmental monitoring that demand both real-time performance and accuracy, and provides a useful reference for the lightweight development of the field.
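The differentiable binarization at the heart of the sparse mask module is not spelled out in the abstract; a common formulation (used, for example, in DBNet-style detectors) replaces the hard threshold with a steep sigmoid, B = σ(k(P − T)). A minimal sketch, where the gain k and the toy probability/threshold maps are illustrative choices, not the paper's parameters:

```python
import numpy as np

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization B = sigmoid(k * (P - T)): approaches a hard
    step function as k grows, yet stays differentiable so the mask
    generator can be trained end to end."""
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))

# Toy 2x2 probability map against a uniform 0.5 threshold map.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
T = np.full((2, 2), 0.5)
mask = differentiable_binarization(P, T)   # close to [[1, 0], [0, 1]]
```

Pixels where the probability map clearly exceeds the learned threshold map saturate toward 1 and are kept as candidate target regions; the rest saturate toward 0 and are dropped before the sparse convolution stage.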
Excellence Action Plan Leading Column
A Survey on System and Architecture Optimization Techniques for Mixture-of-Experts Large Language Models
WANG Zehao, ZHU Zhenhua, XIE Tongxin, WANG Yu
2025, 47(11): 4055-4078.   doi: 10.11999/JEIT250407
[Abstract](1045) [FullText HTML](533) [PDF 5876KB](294)
Abstract:
The Mixture-of-Experts (MoE) framework has become a pivotal approach for enhancing the knowledge capacity and inference efficiency of Large Language Models (LLMs). Conventional methods for scaling dense LLMs have reached significant limitations in training and inference due to computational and memory constraints. MoE addresses these challenges by distributing knowledge representation across specialized expert sub-networks, enabling parameter expansion while maintaining efficiency through sparse expert activation during inference. However, the dynamic nature of expert activation introduces substantial challenges in resource management and scheduling, necessitating targeted optimization at both the system and architectural levels. This survey focuses on the deployment of MoE-based LLMs. It first reviews the definitions and developmental trajectory of MoE, followed by an in-depth analysis of current system-level optimization strategies and architectural innovations tailored to MoE. The paper concludes by summarizing key findings and proposing prospective optimization techniques for MoE-based LLMs.  Significance   The MoE mechanism offers a promising solution to the computational and memory limitations of dense LLMs. By distributing knowledge representation across specialized expert sub-networks, MoE facilitates model scaling without incurring prohibitive computational costs. This architecture alleviates the bottlenecks associated with training and inference in traditional dense models, marking a notable advance in LLM research. Nonetheless, the dynamic expert activation patterns inherent to MoE introduce new challenges in resource scheduling and management. Overcoming these challenges requires targeted system- and architecture-level optimizations to fully harness the potential of MoE-based LLMs.  Progress   Recent advancements in MoE-based LLMs have led to the development of various optimization strategies. 
At the system level, approaches such as automatic parallelism, communication-computation pipelining, and communication operator fusion have been adopted to reduce communication overhead. Memory management has been improved through expert prefetching, caching mechanisms, and queue scheduling policies. To address computational load imbalance, both offline scheduling methods and runtime expert allocation strategies have been proposed, including designs that leverage heterogeneous CPU-GPU architectures. In terms of hardware architecture, innovations include dynamic adaptation to expert activation patterns, techniques to overcome bandwidth limitations, and near-memory computing schemes that improve deployment efficiency. In parallel, the open-source community has developed supporting tools and frameworks that facilitate the practical deployment and optimization of MoE-based models.  Conclusions  This survey presents a comprehensive review of system and architectural optimization techniques for MoE-based LLMs. It highlights the importance of reconciling parameter scalability with computational efficiency through the MoE framework. The dynamic nature of expert activation poses significant challenges in scheduling and resource management, which are systematically addressed in this survey. By evaluating current optimization techniques across both system and hardware layers, the paper offers key insights into the state of the field. It also proposes directions for future work, providing a reference for researchers and practitioners seeking to improve the performance and scalability of MoE-based models. The findings emphasize the need for continued innovation across algorithm development, system engineering, and architectural design to fully realize the potential of MoE in real-world applications.  Prospects   Future research on MoE-based LLMs is expected to advance the integration of algorithm design, system optimization, and hardware co-design. 
Key research directions include resolving load imbalance and maximizing resource utilization through adaptive expert scheduling algorithms, refining system frameworks to support dynamic sparse computation more effectively, and exploring hardware paradigms such as near-memory computing and hierarchical memory architectures. These developments aim to deliver more efficient and scalable MoE model deployments by fostering deeper synergy between software and hardware components.
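The sparse expert activation that the survey centers on can be illustrated with a minimal top-k routing sketch. The router shape, identity experts, and softmax-over-selected-logits renormalization below are illustrative simplifications, not any surveyed system's design:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Sparse MoE layer: route a token to its top-k experts only.

    x: (d,) token embedding; gate_w: (n_experts, d) router weights;
    experts: callables mapping (d,) -> (d,). Skipping the remaining
    experts is the source of MoE's inference-time savings, and the
    data-dependent choice of `top` is what makes memory management
    and scheduling hard.
    """
    logits = gate_w @ x
    top = np.argsort(logits)[-top_k:]            # indices of the k largest gate logits
    w = np.exp(logits[top] - logits[top].max())  # softmax over the selected experts
    w /= w.sum()
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

# Demo with identity experts: the convex combination returns x itself.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
gate_w = rng.normal(size=(4, 8))
experts = [lambda v: v for _ in range(4)]
y = moe_forward(x, gate_w, experts)
```

Because `top` differs per token, which expert weights must be resident changes at runtime; this is exactly the dynamic activation pattern that the prefetching, caching, and load-balancing techniques in the survey target.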
Overviews
A Review of Clutter Suppression Techniques in Ground Penetrating Radar: Mechanisms, Methods, and Challenges
LEI Wentai, WANG Yiming, ZHONG Jiwei, XU Qiguo, JIANG Yuyin, LI Cheng
2025, 47(11): 4079-4095.   doi: 10.11999/JEIT250524
[Abstract](323) [FullText HTML](137) [PDF 3397KB](54)
Abstract:
  Significance   Ground Penetrating Radar (GPR) is a widely adopted non-destructive subsurface detection technology, extensively applied in urban subsurface exploration, transportation infrastructure monitoring, geophysical surveys, and military operations. It is employed to detect underground pipelines, structural foundations, road voids, and concealed defects in roadbeds, railway tracks, and tunnels, as well as shallow geological formations and military targets such as unexploded ordnance. However, the presence of clutter—unwanted signals including direct coupling waves, ground reflections, and non-target echoes—severely degrades GPR data quality and complicates target detection, localization, imaging, and parameter estimation. Effective clutter suppression is therefore essential to enhance the accuracy and reliability of GPR data interpretation, making it a central research focus in improving GPR performance across diverse application domains.  Progress   Significant progress has been achieved in GPR clutter suppression, largely through two main approaches: signal model-based and neural network-based methods. Signal model-based techniques, such as time-frequency analysis, subspace decomposition, and dictionary learning, rely on physical modeling to distinguish clutter from target signals. These methods provide clear interpretability but are limited in addressing complex and non-linear clutter patterns. Neural network-based methods, employing architectures such as Convolutional Neural Networks, U-Net, and Generative Adversarial Networks, are more effective in capturing non-linear features through data-driven learning. Recent advances, including multi-scale convolutional autoencoders, attention mechanisms, and hybrid models, have further enhanced clutter suppression under challenging conditions. 
Quantitative metrics such as Mean Squared Error, Peak Signal-to-Noise Ratio, and Structural Similarity Index are commonly used for performance evaluation, often complemented by qualitative visual assessment.  Conclusion  The complexity and diversity of GPR clutter, originating from direct coupling, ground reflections, equipment imperfections, non-uniform media, and non-target scatterers, demand robust suppression strategies. Signal model-based methods provide strong theoretical foundations but are constrained by simplified assumptions, whereas neural network-based approaches offer greater adaptability at the expense of large data requirements and high computational cost. Hybrid approaches that integrate the strengths of both paradigms show considerable potential in addressing complex clutter scenarios. The selection of evaluation metrics plays a pivotal role in algorithm design, with quantitative measures offering objective assessment and qualitative analyses providing intuitive validation. Despite recent advances, significant challenges remain in suppressing non-linear clutter, enabling real-time processing, and reducing reliance on labeled data.  Prospect   Future research in GPR clutter suppression is likely to emphasize integrating the strengths of signal model-based and neural network-based methods to develop robust and adaptive solutions. Real-time processing and online learning will be prioritized to meet the requirements of dynamic applications. Self-supervised and unsupervised learning approaches are expected to reduce dependence on costly labeled datasets and improve model adaptability. Cross-task learning and multi-modal fusion, combining data from multiple sensors or frequencies, are expected to enhance robustness and precision. 
Furthermore, deeper integration of physical principles, including electromagnetic wave propagation and media properties, into algorithm design is expected to improve suppression accuracy and computational efficiency, advancing the development of more intelligent and effective GPR systems.
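Among the subspace-decomposition methods surveyed, a representative baseline subtracts the dominant singular components of a B-scan, since trace-to-trace correlated clutter (direct coupling, ground reflection) concentrates there. A toy sketch, with synthetic data and the single-clutter-component assumption chosen purely for illustration:

```python
import numpy as np

def svd_clutter_removal(bscan, n_clutter=1):
    """Subspace clutter suppression: subtract the n_clutter largest
    singular components of the B-scan (n_samples x n_traces), where
    horizontally correlated clutter concentrates."""
    U, s, Vt = np.linalg.svd(bscan, full_matrices=False)
    clutter = (U[:, :n_clutter] * s[:n_clutter]) @ Vt[:n_clutter, :]
    return bscan - clutter

# Synthetic B-scan: a "ground reflection" identical in every trace,
# plus a weak localized target echo in a few traces.
t = np.arange(64, dtype=float)
ground = np.exp(-((t - 10.0) ** 2) / 8.0)
bscan = np.tile(ground[:, None], (1, 32))
bscan[40:44, 15:18] += 0.5
filtered = svd_clutter_removal(bscan)
# The flat reflection is largely removed; the target echo survives.
```

This also shows the limitation the survey notes: the method relies on a low-rank clutter model, so undulating ground or non-uniform media that spread clutter across many singular components degrade the separation.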
Research Progress of Deep Learning Enabled Automatic Modulation Classification Technology
ZHENG Qinghe, LI Binglin, YU Zhiguo, JIANG Weiwei, ZHU Zhengyu, XU Chi, HUANG Chongwen, GUI Guan
2025, 47(11): 4096-4111.   doi: 10.11999/JEIT250674
[Abstract](246) [FullText HTML](138) [PDF 3939KB](58)
Abstract:
  Significance   With the advancement of Sixth-Generation (6G) wireless communication systems towards the terahertz frequency band and space-air-ground integrated networks, the communication environment is becoming increasingly heterogeneous and densely deployed. This evolution imposes stringent precision requirements at the sub-symbol period level for Automatic Modulation Classification (AMC). Under complex channel conditions, AMC faces several challenges: feature mixing and distortion caused by time-varying multipath channels, substantial degradation in recognition accuracy of traditional methods under low Signal-to-Noise Ratio (SNR) conditions, and elevated complexity in detecting mixed modulation signals introduced by Sparse Code Multiple Access (SCMA) techniques. To address these challenges, this paper first analyzes the fundamental constraints on AMC method design from the perspective of signal transmission characteristics in communication models. It then systematically reviews Deep Learning (DL)-based AMC approaches, summarizes the difficulties these methods encounter in different wireless communication scenarios, evaluates the performance of representative DL models, and concludes with a discussion of current limitations in AMC together with promising research directions.  Progress   Current research on AMC technology under complex channel conditions mainly focuses on three methodological categories: Likelihood-Based (LB), Feature-Based (FB), and DL, emphasizing both theoretical exploration and algorithmic innovation. Among these, end-to-end DL approaches have demonstrated superior performance in AMC tasks. By stacking multiple layers of nonlinear activation functions, DL models establish strong nonlinear fitting capabilities that allow them to uncover hidden patterns in radio signals. This enables DL to achieve high robustness and accuracy in complex environments. 
Convolutional Neural Networks (CNNs), leveraging their hierarchical local perception mechanism, can effectively capture amplitude and phase distortion characteristics of modulated signals, showing distinctive advantages in spatial feature extraction. Recurrent Neural Networks (RNNs), through the temporal memory function of gated units, exhibit theoretical superiority in modeling dynamic signal impairments such as inter-symbol interference, carrier frequency offset, carrier phase offset, and timing errors. More recently, Transformer architectures have achieved global feature association modeling through self-attention mechanisms, thereby enhancing the ability to identify key features and markedly improving AMC accuracy under low SNR conditions. The application potential of Transformers in AMC can be further extended by integrating multi-scale feature fusion, optimizing computational efficiency, and improving generalization.  Prospects   With the continuous growth of communication demands and the increasing complexity of application scenarios, the efficient and reliable management and utilization of wireless spectrum resources has become a central research focus. AMC enables mobile communication systems to achieve dynamic channel adaptation and heterogeneous network integration. Driven by the development of space-air-ground integrated networks, the application scope of AMC has expanded beyond traditional terrestrial cellular systems to emerging domains such as satellite communication and vehicular networking. DL-based AMC frameworks can capture dynamic channel responses through joint time-frequency domain representations, enhance transient feature extraction via attention mechanisms, and effectively decouple the coupling effects of multipath fading and Doppler shifts. 
By applying neural architecture search and model quantization-compression techniques, DL models can achieve low-complexity, real-time inference at the edge, thereby supporting end-to-end latency control in Vehicle-to-Everything (V2X) communication links. Furthermore, advanced DL architectures introduce feature enhancement mechanisms to preserve signal phase integrity, improving resilience against channel distortion. In dynamic optical network monitoring, feature extraction networks tailored to time-varying channels can adaptively capture the evolution of nonlinear phase shifts. Through implicit channel compensation, DL enables collaborative learning of time-domain and frequency-domain features. At present, AMC technology is progressing towards elastic architectures that support dynamic reconstruction of model parameters through online knowledge distillation and meta-learning frameworks, offering adaptive and lightweight solutions for Internet-of-Things (IoT) scenarios.  Conclusions  This paper systematically reviews the current research and challenges of AMC technology in the context of 6G networks. First, the applications of CNNs, RNNs, Transformers, and hybrid DL models in AMC are discussed in detail, with analysis of the technical advantages and limitations of each approach. Next, three representative application scenarios are examined: mobile communication, optical communication, and the IoT, highlighting the specific challenges faced by AMC technology. The development of DL-driven AMC has now moved beyond model design to include deployment and application challenges in real wireless communication environments. For example, constructing DL architectures with continuous learning capabilities is essential for adapting to dynamic communication conditions, while developing large-scale DL models provides an effective way to improve cross-scenario generalization. 
Future research should emphasize directions that integrate prior knowledge of the physical layer with DL architectures, strengthen feature fusion strategies, and advance hardware-algorithm co-design frameworks.
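As a concrete instance of the Feature-Based (FB) category discussed above, higher-order cumulants separate some constellations analytically: for unit-power, noise-free symbols, the normalized cumulant |C40|/C21² is about 2 for BPSK and about 1 for QPSK. A sketch (the sample size and seed are arbitrary, and real channels add noise and impairments that DL methods handle better):

```python
import numpy as np

def c40_feature(x):
    """Normalized fourth-order cumulant |C40| / C21^2 of a zero-mean
    complex baseband signal: ~2 for BPSK, ~1 for QPSK (noise-free)."""
    c21 = np.mean(np.abs(x) ** 2)
    c40 = np.mean(x ** 4) - 3 * np.mean(x ** 2) ** 2
    return np.abs(c40) / c21 ** 2

rng = np.random.default_rng(1)
bpsk = rng.choice([-1.0, 1.0], size=4096).astype(complex)
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=4096) / np.sqrt(2)
```

A simple threshold on this feature classifies the two schemes; the fragility of such hand-crafted features under low SNR and multipath is precisely what motivates the DL approaches the survey reviews.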
Advances in Deep Neural Network Based Image Compression: A Survey
BAI Yuanchao, LIU Wenchang, JIANG Junjun, LIU Xianming
2025, 47(11): 4112-4128.   doi: 10.11999/JEIT250567
[Abstract](486) [FullText HTML](271) [PDF 7384KB](99)
Abstract:
  Significance   With the continuous advancement of information technology, digital images are evolving toward ultra-high-definition formats characterized by increased resolution, dynamic range, color depth, sampling rates, and multi-viewpoint support. In parallel, the rapid development of artificial intelligence is reshaping both the generation and application paradigms of digital imagery. As visual big data converges with AI technologies, the volume and diversity of image data expand exponentially, creating unprecedented challenges for storage and transmission. As a core technology in digital image processing, image compression reduces storage costs and bandwidth requirements by eliminating internal information redundancy, thereby serving as a fundamental enabler for visual big data applications. However, traditional image compression standards increasingly struggle to meet rising industrial demands due to limited modeling capacity, inadequate perceptual adaptability, and poor compatibility with machine vision tasks. Deep Neural Network (DNN)-based image compression methods, leveraging powerful modeling capabilities, end-to-end optimization mechanisms, and compatibility with both human perception and machine understanding, are progressively surpassing conventional coding approaches. These methods demonstrate clear advantages and broad potential across diverse application domains, drawing growing attention from both academia and industry.  Progress   This paper systematically reviews recent advances in DNN-based image compression from three core perspectives: signal fidelity, human visual perception, and machine analysis. First, in signal fidelity-oriented compression, the rate-distortion optimization framework is introduced, with detailed discussion of key components in lossy image compression, including nonlinear transforms, quantization strategies, entropy coding mechanisms, and variable-rate techniques for multi-rate adaptation. 
The synergistic design of these modules underpins the architecture of modern DNN-based image compression systems. Second, in perceptual quality-driven compression, the principles of joint rate-distortion-perception optimization models are examined, together with a comparative analysis of two major perceptual paradigms: Generative Adversarial Network (GAN)-based models and diffusion model–based approaches. Both strategies employ perceptual loss functions or generative modeling techniques to markedly improve the visual quality of reconstructed images, aligning them more closely with the characteristics of the human visual system. Finally, in machine analysis-oriented compression, a co-optimization framework for rate-distortion-accuracy trade-offs is presented, with semantic fidelity as the primary objective. From the perspective of integrating image compression with downstream machine analysis architectures, this section analyzes how current methods preserve essential semantic information that supports tasks such as object detection and semantic segmentation during the compression process.  Conclusions  DNN-based image compression shows strong potential across signal fidelity, human visual perception, and machine analysis. Through end-to-end jointly optimized neural network architectures, these methods provide comprehensive modeling of the encoding process and outperform traditional approaches in compression efficiency. By leveraging the probabilistic modeling and image generation capabilities of DNNs, they can accurately estimate distributional differences between reconstructed and original images, quantify perceptual losses, and generate high-quality reconstructions that align with human visual perception. Furthermore, their compatibility with mainstream image analysis frameworks enables the extraction of semantic features and the design of collaborative optimization strategies, allowing efficient compression tailored to machine vision tasks.  
Prospects   Despite significant progress in compression performance, perceptual quality, and task adaptability, DNN-based image compression still faces critical technical challenges and practical limitations. First, computational complexity remains high. Most high-performance models rely on deep and sophisticated architectures (e.g., attention mechanisms and Transformer models), which enhance modeling capability but also introduce substantial computational overhead and long inference latency. These limitations are particularly problematic for deployment on mobile and embedded devices. Second, robustness and generalization continue to be major concerns. DNN-based compression models are sensitive to input perturbations and vulnerable to adversarial attacks, which can lead to severe reconstruction distortions or even complete failure. Moreover, while they perform well on training data and similar distributions, their performance often degrades markedly under cross-domain scenarios. Third, the evaluation framework for perceptual- and machine vision-oriented compression remains immature. Although new evaluation dimensions have been introduced, no unified and objective benchmark exists. This gap is especially evident in machine analysis-oriented compression, where downstream tasks vary widely and rely on different visual models. Therefore, comparability across methods is limited and consistent evaluation metrics are lacking, constraining both research and practical adoption. Overall, DNN-based image compression is in transition from laboratory research to real-world deployment. Although it demonstrates clear advantages over traditional approaches, further advances are needed in efficiency, robustness, generalization, and standardized evaluation protocols. 
Future research should strengthen the synergy between theoretical exploration and engineering implementation to accelerate widespread adoption and continued progress in areas such as multimedia communication, edge computing, and intelligent image sensing systems.
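The rate-distortion optimization framework that underpins signal fidelity-oriented compression trains the codec on L = R + λD. A minimal numerical sketch, using a discretized Gaussian entropy model and toy data; the prior, λ, and quantizer here are illustrative, not any particular codec's:

```python
import math
import numpy as np

def gaussian_cdf(v, sigma=1.0):
    return 0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0))))

def rate_bits(y_hat, sigma=1.0):
    """Rate of integer-quantized latents under a discretized Gaussian
    prior: p(y) = CDF(y + 1/2) - CDF(y - 1/2)."""
    p = np.array([gaussian_cdf(y + 0.5, sigma) - gaussian_cdf(y - 0.5, sigma)
                  for y in y_hat])
    return float(-np.sum(np.log2(p)))

def rd_loss(x, x_hat, y_hat, lam=0.01, sigma=1.0):
    """End-to-end training objective L = R + lambda * D (MSE distortion)."""
    return rate_bits(y_hat, sigma) + lam * float(np.mean((x - x_hat) ** 2))

x = np.linspace(0.0, 1.0, 8)        # toy "image"
x_hat = np.round(x * 4) / 4         # toy reconstruction after coarse quantization
y_hat = np.array([0.0, 1.0, -1.0])  # toy quantized latents
loss = rd_loss(x, x_hat, y_hat)
```

Sweeping λ traces out the rate-distortion curve; the perception- and machine-analysis-oriented variants surveyed above add perceptual or task-accuracy terms to this same objective.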
A Survey on Physical Layer Security in Near-Field Communication
XU Yongjun, LI Jing, LUO Dongxin, WANG Ji, LI Xingwang, YANG Long, CHEN Li
2025, 47(11): 4129-4143.   doi: 10.11999/JEIT250336
[Abstract](382) [FullText HTML](208) [PDF 5806KB](68)
Abstract:
  Significance   Traditional wireless communication systems have relied on far-field plane-wave models to support wide-area coverage and long-distance transmission. However, emerging Sixth-Generation (6G) applications—such as extended reality, holographic communication, pervasive intelligence, and smart factories—demand ultra-high bandwidth, ultra-low latency, and sub-centimeter-level localization accuracy. These requirements exceed the spatial multiplexing gains and interference suppression achievable under far-field assumptions. Enabled by extremely large-scale antenna arrays and terahertz technologies, the near-field region has expanded to hundreds of meters, where spherical-wave propagation enables precise beam focusing and flexible spatial resource management. The additional degrees of freedom in the angle and distance domains, however, give rise to new Physical Layer Security (PLS) challenges, including joint angle-distance eavesdropping, beam-split-induced information leakage caused by frequency-dependent focusing, and security-interference conflicts in hybrid near- and far-field environments. This paper provides a comprehensive survey of near-field PLS techniques, advancing theoretical understanding of spherical-wave propagation and associated threat models while offering guidance for designing robust security countermeasures and informing the development of future 6G security standards.  Progress   This paper presents a comprehensive survey of recent advances in PLS for near-field communications in 6G networks, with an in-depth discussion of key enabling technologies and optimization methodologies. Core security techniques, including beam focusing, Artificial Noise (AN), and multi-technology integration, are first examined in terms of their security objectives. 
Beam focusing exploits ultra-large-scale antenna arrays and the spherical-wave propagation characteristics of near-field communication to achieve precise spatial confinement, thereby reducing information leakage. AN introduces deliberately crafted noise toward undesired directions to hinder eavesdropping. Multi-technology integration combines terahertz communications, Reconfigurable Intelligent Surfaces (RIS), and Integrated Sensing And Communication (ISAC), markedly enhancing overall security performance. Tailored strategies are then analyzed for different transmission environments, including Line-of-Sight (LoS), Non-Line-of-Sight (NLoS), and hybrid near–far-field conditions. In LoS scenarios, beamforming optimization strengthens interference suppression. In NLoS scenarios, RIS reconstructs transmission links, complicating unauthorized reception. For hybrid near-far-field environments, multi-beam symbol-level precoding spatially distinguishes users and optimizes beamforming patterns, ensuring robust security for mixed-distance user groups. Finally, critical challenges are highlighted, including complex channel modeling, tradeoffs between security and performance, and interference management in converged multi-network environments. Promising directions for future research are also identified, such as Artificial Intelligence (AI)-assisted security enhancement, cooperative multi-technology schemes, and energy-efficient secure communications in near-field systems.  Conclusions  This paper provides a comprehensive survey of PLS techniques for near-field communications, with particular emphasis on enabling technologies and diverse transmission scenarios. The fundamentals and system architecture of near-field communications are first reviewed, highlighting their distinctions from far-field systems and their unique channel characteristics. 
Representative PLS approaches are then examined, including beam focusing, AN injection, and multi-technology integration with RIS and ISAC. Secure transmission strategies are further discussed for LoS, NLoS, and hybrid near-far-field environments. Finally, several open challenges are identified, such as accurate modeling of complex channels, balancing security and performance, and managing interference in multi-network integration. Promising research directions are also outlined, including hybrid near-far-field design and AI-enabled security. These directions are expected to provide theoretical foundations for advancing and standardizing near-field communication security in future 6G networks.  Prospects   Research on PLS for near-field communications remains at an early stage, with no unified or systematic framework established to date. As communication scenarios become increasingly diverse and complex, future studies should prioritize hybrid far-field and near-field environments, where channel coupling and user heterogeneity raise new security challenges. AI-driven PLS techniques show strong potential for adaptive optimization and improved resilience against adversarial threats. In parallel, integrating near-field PLS with advanced technologies such as RIS and ISAC can deliver joint improvements in security, efficiency, and functionality. Moreover, low-power design will be essential to balance security performance with energy efficiency, enabling the development of high-performance, intelligent, and sustainable near-field secure communication systems.
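The beam-focusing property that distinguishes near-field PLS can be seen directly from the spherical-wave steering vector, whose phases match the exact element-to-point distances, making the beam selective in range as well as angle. A sketch with illustrative array and carrier parameters (not drawn from the survey):

```python
import numpy as np

wavelength = 0.01                           # illustrative 30 GHz carrier
array_x = np.arange(256) * wavelength / 2   # half-wavelength linear array

def nearfield_steering(focus_x, focus_y):
    """Spherical-wave steering vector: phases match the exact distance
    from each element (on the x-axis) to the focal point."""
    d = np.sqrt((array_x - focus_x) ** 2 + focus_y ** 2)
    return np.exp(-1j * 2 * np.pi * d / wavelength) / np.sqrt(array_x.size)

def gain(w, x, y):
    return np.abs(np.vdot(w, nearfield_steering(x, y))) ** 2

# Focus 5 m broadside of the array center; at 20 m on the same bearing
# the gain collapses, a degree of freedom plane-wave beams lack.
w = nearfield_steering(array_x.mean(), 5.0)
```

An eavesdropper on the same bearing but at a different range therefore receives far less signal energy, which is the spatial confinement that beam-focusing-based PLS exploits and that joint angle-distance eavesdropping tries to defeat.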
Wireless Communication and Internet of Things
Efficient Storage Method for Real-Time Simulation of Wide-Range Multipath Delay Spread Channels
LI Weishi, ZHOU Hui, JIAO Xun, XU Qiang, TANG Youxi
2025, 47(11): 4144-4152.   doi: 10.11999/JEIT250525
[Abstract](116) [FullText HTML](68) [PDF 4898KB](3)
Abstract:
  Objective  The real-time channel emulator is a critical tool in wireless device research and development, enabling accurate and repeatable experiments in controlled laboratory environments. This capability reduces testing costs by avoiding extensive field trials and accelerates development cycles by allowing rapid iteration and validation of wireless devices under realistic conditions. With the rapid advancement of aerial platforms—including drones, High-Altitude Pseudo-Satellites (HAPS), and Unmanned Aerial Vehicles (UAVs)—for integrated sensing and communication, high-resolution imaging, and environmental reconstruction in complex wireless environments, the challenges of channel modeling have increased considerably. In particular, there is growing demand for real-time simulation of wide-range multipath delay spread channels. Existing simulation methods, although effective in traditional scenarios, face substantial limitations in hardware storage resources when handling such channels. This study addresses these limitations by proposing an efficient storage method for real-time emulation of wide-range multipath channels. The method reduces memory overhead while preserving high fidelity in channel reproduction, thereby offering a practical and optimized solution for next-generation wireless communication research.  Methods  In conventional real-time channel emulation, a combined simulation approach is adopted, employing cascaded common delay and multipath delay spread components. The common delay component is implemented using a single high-capacity memory module, whereas the multipath delay spread component is implemented using a Dense Tapped Delay Line (D-TDL). This design reduces storage resource requirements by multiplexing the common delay component, but the achievable multipath delay spread range remains limited. 
Moreover, the multipath delay is constrained by the common delay component, reducing flexibility and compromising the ability to emulate complex scenarios. The Sparse Tapped Delay Line (S-TDL) scheme is used in some algorithms to extend the multipath delay emulation range by cascading block memory modules. However, this method introduces inter-tap delay dependencies and cannot adapt to the requirements of wide-range multipath delay spread channels. Alternatively, Time-Division Multiplexing (TDM) is applied in other algorithms to improve the utilization efficiency of block memory modules and decouple multipath delay control. Despite this, TDM is constrained by the read/write bandwidth of memory, making it unsuitable for real-time channel emulation of large-bandwidth signals. To overcome the multi-tap delay coupling issue in the S-TDL algorithm, an Optimized Sparse Tapped Delay Line (OS-TDL) algorithm is proposed. By analyzing delay-dependent relationships among multipath taps, theoretical derivation establishes an analytical relationship between the number of multipaths and the delay spread range achievable under decoupling constraints. Redundant taps are introduced to eliminate inter-tap delay dependencies, enabling flexible configuration of arbitrary multipath delay combinations. The algorithm formulates a joint optimization model that balances hardware memory allocation and multipath delay spread fidelity, supports wide-range multipath scenarios without being limited by memory read/write bandwidth, and allows real-time emulation of large-bandwidth signals. The central innovation lies in dynamically constraining tap activation and sparsity patterns to reduce redundant memory while preserving wide-range multipath delay spread channel characteristics. Compared with conventional approaches, the proposed algorithm significantly enhances storage resource utilization efficiency in wide-range multipath channel emulation. 
On this basis, a concrete algorithmic procedure is developed, in which an input multipath delay sequence is computationally processed to derive delay configuration parameters and activation sequences for multiple cascaded memory units. Comprehensive validation procedures for the algorithm are presented in later sections.  Results and Discussions  Conventional S-TDL algorithms are constrained by inter-tap delay coupling, which limits their ability to achieve high-fidelity emulation of wide-range multipath delay variations. To overcome this limitation, a comparative simulation of three algorithms—the memory resource exclusive algorithm, the TDM memory resource algorithm, and the OS-TDL algorithm proposed herein—is systematically conducted. A controlled variable approach is employed to evaluate storage resource utilization efficiency across three key dimensions: signal sampling rate, number of emulated multipath components, and multipath delay spread range. Theoretical analysis and simulation results show that the proposed OS-TDL algorithm significantly reduces memory requirements compared with conventional methods, while maintaining emulation fidelity. Its effectiveness is further verified through experimental implementation on AMD’s Virtex UltraScale+ series high-performance Field-Programmable Gate Array (FPGA), using the XCVU13P verification platform. Comparative FPGA resource measurements under identical system specifications confirm the superiority of the proposed algorithm, demonstrating its ability to improve memory efficiency while accurately reproducing wide-range multipath delay spread channels.  Conclusions  This study addresses the challenge of storage resource utilization efficiency in real-time channel emulation for wide-range multipath delay spread by analyzing the inter-tap delay dependency inherent in conventional S-TDL algorithms. An OS-TDL algorithm is proposed to emulate wide-range multipath delay spread channels. 
Both simulation and hardware verification results demonstrate that the proposed algorithm substantially improves storage efficiency while accurately reproducing wide-range multipath delay spread characteristics. These findings confirm that the algorithm meets the design requirements of real-time channel emulators for increasingly complex verification scenarios.
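The multipath delay spread behaviour that the emulator's cascaded memory units must reproduce can be sketched as a tapped delay line: the output is a sum of delayed, scaled copies of the input. The function name, delays, and gains below are illustrative only; this is a software sketch of the channel effect, not the OS-TDL hardware architecture.

```python
import numpy as np

def tdl_emulate(signal, delays, gains):
    """Tapped delay line channel: sum of delayed, scaled copies of the
    input. Delays are in samples; gains are complex path coefficients.
    Illustrative sketch of the multipath delay spread that each cascaded
    memory unit of a real-time emulator must reproduce."""
    signal = np.asarray(signal, dtype=complex)
    out = np.zeros(len(signal) + max(delays), dtype=complex)
    for d, g in zip(delays, gains):
        out[d:d + len(signal)] += g * signal
    return out
```

In hardware, each delay `d` corresponds to a read offset into a memory unit; the inter-tap coupling discussed above arises when one tap's offset is defined relative to another tap's memory rather than to the input.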
UAV-Assisted Intelligent Data Collection and Computation Offloading for Railway Wireless Sensor Networks
YAN Li, WANG Junkai, FANG Xuming, LIN Wei, LIANG Yiqun
2025, 47(11): 4153-4165.   doi: 10.11999/JEIT250340
[Abstract](225) [FullText HTML](127) [PDF 4205KB](17)
Abstract:
  Objective  Ensuring the safety and stability of train operations is essential in the advancement of railway intelligence. The growing maturity of Wireless Sensor Network (WSN) technology offers an efficient, reliable, low-cost, and easily deployable approach to monitoring railway operating conditions. However, in complex and dynamic maintenance environments, WSNs encounter several challenges, including weak signal coverage at monitoring sites, limited accessibility for tasks such as sensor node battery replacement, and the generation of large volumes of monitoring data. To address these issues, this study proposes a multi-Unmanned Aerial Vehicle (UAV)-assisted method for data collection and computation offloading in railway WSNs. This approach enhances overall system energy efficiency and data freshness, offering a more effective and robust solution for railway safety monitoring.  Methods  An intelligent data collection and computation offloading system is constructed for multi-UAV-assisted railway WSNs. UAV flight constraints within railway safety protection zones are considered, and wireless sensing services are prioritized to ensure preferential transmission for safety-critical tasks. To balance energy consumption and data freshness, the system optimization objective is defined as the weighted sum of UAV energy consumption, WSN energy consumption, and the Age of Information (AoI). A joint optimization algorithm based on Multi-Agent Soft Actor-Critic (MASAC) is proposed, which balances exploration and exploitation through entropy regularization and adaptive temperature parameters. This approach enables efficient joint optimization of UAV trajectories and computation offloading strategies.  Results and Discussions  (1)Compared with the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), MASAC-Greedy, and MASAC-AOU algorithms, the MASAC-based scheme converges more rapidly and demonstrates greater stability (Fig. 4), ultimately achieving the highest reward. 
In contrast, MADDPG exhibits slower learning and less stable performance. (2)The comparison of multi-UAV flight trajectories under different algorithms shows that the proposed MASAC algorithm enables effective collaboration among UAVs, with each responsible for monitoring distinct regions while strictly adhering to railway safety protection zone constraints (Fig. 5). (3)The MASAC algorithm yields the best objective function value across all evaluated algorithms (Fig. 6). (4)As the number of sensors and the AoI weight increase, UAV energy consumption rises for all algorithms; however, the MASAC algorithm consistently maintains the lowest energy consumption (Fig. 7). (5)In terms of sensor node energy consumption, MADDPG achieves the lowest value, but at the expense of information freshness (Fig. 8). (6)Regarding average AoI performance, the MASAC algorithm performs best across a range of sensor densities and AoI weight settings, with the greatest improvements observed under higher AoI weight conditions (Fig. 9). (7)The AoI performance comparison by sensor type (Table 2) confirms that the system effectively supports priority-based data collection services.  Conclusions  This study proposes an MASAC-based intelligent data collection and computation offloading scheme for railway WSNs supported by multiple UAVs, addressing critical challenges such as limited WSN battery life and the high real-time computational demands of complex railway environments. The proposed algorithm jointly optimizes UAV flight trajectories and computation offloading strategies by integrating considerations of UAV and WSN energy consumption, data freshness, sensing service priorities, and railway safety protection zone constraints. The optimization objective is to minimize the weighted sum of average UAV energy consumption, average WSN energy consumption, and average WSN AoI.
Simulation results demonstrate that the proposed scheme outperforms baseline algorithms across multiple performance metrics. Specifically, it achieves faster convergence, efficient multi-UAV collaboration that avoids resource redundancy and spatial overlap, and superior results in UAV energy consumption, sensor node energy consumption, and average AoI.
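The weighted-sum objective described above can be sketched as a per-step reward for the reinforcement learner, together with the standard Age of Information update. The weights and the unit time step are illustrative placeholders, not the paper's values.

```python
def step_aoi(aoi, collected, dt=1.0):
    """Age of Information update: reset when the UAV collects fresh data
    from a sensor, otherwise the age grows with elapsed time."""
    return dt if collected else aoi + dt

def reward(uav_energy, wsn_energy, aoi, w=(0.4, 0.3, 0.3)):
    """Negative weighted cost mirroring the stated objective (minimize the
    weighted sum of UAV energy, WSN energy, and AoI). Weight values here
    are illustrative, not taken from the paper."""
    w1, w2, w3 = w
    return -(w1 * uav_energy + w2 * wsn_energy + w3 * aoi)
```

Maximizing this reward in the MDP is then equivalent to minimizing the weighted-sum cost, which is how the trade-off between energy and freshness enters the MASAC training loop.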
Secure Beamforming Design for Multi-User Near-Field ISAC Systems
DENG Zhixiang, ZHANG Zhiwei
2025, 47(11): 4166-4175.   doi: 10.11999/JEIT250462
[Abstract](192) [FullText HTML](72) [PDF 3101KB](35)
Abstract:
  Objective  Integrated Sensing and Communication (ISAC) systems, a key enabling technology for 6G, achieve the joint realization of communication and sensing by sharing spectrum and hardware. However, radar targets may threaten the confidentiality of user communications, necessitating secure transmission against potential eavesdropping. At the same time, large-scale antenna arrays and high-frequency bands are expected to be widely deployed to meet future performance requirements, making near-field wireless transmission increasingly common. This trend creates a mismatch between existing ISAC designs that rely on the far-field assumption and the characteristics of real propagation environments. In this study, we design optimal secrecy beamforming for a multi-user near-field ISAC system to improve the confidentiality of user communications while ensuring radar sensing performance. The results show that distance degrees of freedom inherent in the near-field model, together with radar sensing signals serving as Artificial Noise (AN), provide significant gains in communication secrecy.  Methods  A near-field ISAC system model is established, in which multiple communication users and a single target, regarded as a potential eavesdropper, are located within the near-field region of a transmitter equipped with a Uniform Linear Array (ULA). Based on near-field channel theory, channel models are derived for all links, including the communication channels from the transmitter to the users, the transmitter to the target, and the radar echo-based sensing channel. The secrecy performance of each user is quantified as the difference between the achievable communication rate and the eavesdropping rate at the target, and the sum secrecy rate across all users is adopted as the metric for system-wide confidentiality. The sensing performance of the ISAC system is evaluated using the Cramér-Rao bound (CRB), obtained from the Fisher Information Matrix (FIM) for parameter estimation.
To enhance secrecy, a joint optimization problem is formulated for the beamforming vectors of communication and radar sensing signals, with the objective of maximizing the sum secrecy rate under base station transmit power and sensing performance constraints. As the joint optimization problem is inherently non-convex, an algorithm combining Semi-Definite Relaxation (SDR) and Weighted Minimum Mean Square Error (WMMSE) is developed. The equivalence between the MMSE-transformed problem and the original secrecy rate maximization problem is first established to handle non-convexity. The CRB constraint is then expressed in convex form using the Schur complement. Finally, SDR is applied to recast the problem into a convex optimization framework, which allows a globally optimal solution to be derived.  Results and Discussions  Numerical evaluations show that the proposed near-field ISAC secrecy beamforming design achieves clear advantages in communication confidentiality compared with far-field and non-AN schemes. Under the near-field channel model, the designed beams effectively concentrate energy on legitimate users while suppressing information leakage through radar sensing signals (Fig. 3b). Even when communication users and radar targets are angularly aligned, the secure beamforming scheme attains spatial isolation through distance-domain degrees of freedom, thereby maintaining positive secrecy rates (Fig. 3a). Joint optimization of communication beams and radar sensing signals significantly improves multi-user secrecy rates while satisfying the CRB constraint. Compared with conventional AN-assisted methods, the proposed solution exhibits superior trade-off performance between sensing and communication (Fig. 4). The number of antennas is directly correlated with beam focusing performance: increasing the antenna count produces more concentrated beam patterns.
In the near-field model, however, the incorporation of the distance dimension amplifies this effect, yielding larger performance gains than those observed in conventional far-field systems (Fig. 5). Raising the transmit power further improves the received signal quality at the users, which proportionally enhances system secrecy. The near-field scheme achieves more substantial gains than far-field baselines under higher transmit power conditions (Fig. 6). This paper also examines the effect of user population on secrecy performance. A larger number of users increases inter-user interference, which degrades overall secrecy (Fig. 7). Nevertheless, owing to the intrinsic interference suppression capability of the near-field scheme and the ability of AN to impair eavesdroppers’ decoding, the proposed method maintains stronger robustness against multi-user interference compared with conventional approaches.  Conclusions  This study investigates multi-user secure communication design in near-field ISAC systems and proposes a beamforming optimization scheme that jointly enhances sensing accuracy and communication secrecy. A non-convex optimization model is established to maximize the multi-user secrecy sum rate under base station transmit power and CRB constraints, where radar sensing signals are exploited as AN to impair potential eavesdroppers. To address the complexity of the problem, a joint optimization algorithm combining SDR and WMMSE is developed, which reformulates the original non-convex problem into a convex form solvable with standard optimization tools.
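The distance degree of freedom exploited above comes from the spherical-wave (near-field) array response: unlike the far-field plane-wave model, each ULA element's phase depends on its exact distance to the source, so the response varies with range as well as angle. The sketch below is a common textbook parameterization, not necessarily the paper's exact notation.

```python
import numpy as np

def near_field_steering(N, d, r, theta, lam):
    """Spherical-wave response of an N-element ULA with spacing d, for a
    source at range r and angle theta (wavelength lam). Element phases
    follow the exact element-to-source distances, retaining the
    range-dependent (Fresnel) terms dropped in far-field models.
    Illustrative sketch only."""
    n = np.arange(N) - (N - 1) / 2                  # element positions about the array centre
    dist = np.sqrt(r**2 + (n * d)**2 - 2 * r * n * d * np.sin(theta))
    return np.exp(-1j * 2 * np.pi * (dist - r) / lam)
```

Because `dist` depends on `r`, two sources at the same angle but different ranges have distinct steering vectors, which is what allows the secrecy beamformer to separate an angularly aligned user and eavesdropper.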
Parametric Holographic MIMO Channel Modeling and Its Bayesian Estimation
YUAN Zhengdao, GUO Yabo, GAO Dawei, GUO Qinghua, HUANG Chongwen, LIAO Guisheng
2025, 47(11): 4176-4187.   doi: 10.11999/JEIT250436
[Abstract](190) [FullText HTML](115) [PDF 2374KB](17)
Abstract:
  Objective  Holographic Multiple-Input Multiple-Output (HMIMO), based on continuous-aperture antennas and programmable metasurfaces, is regarded as a cornerstone of 6G wireless communication. Its potential to overcome the limitations of conventional massive MIMO is critically dependent on accurate channel modeling and estimation. Three major challenges remain: (1) oversimplified electromagnetic propagation models, such as far-field approximations, cause severe mismatches in near-field scenarios; (2) statistical models fail to characterize the coupling between channel coefficients, user positions, and random orientations; and (3) the high dimensionality of parameter spaces results in prohibitive computational complexity. To address these challenges, a hybrid parametric-Bayesian framework is proposed in which neural networks, factor graphs, and convex optimization are integrated. Precise channel estimation, user position sensing, and angle decoupling in near-field HMIMO systems are thereby achieved. The methodology provides a pathway toward high-capacity 6G applications, including Integrated Sensing And Communication (ISAC).  Methods  A hybrid channel estimation method is proposed to decouple the “channel-coordinate-angle” parameters and to enable joint estimation of channel coefficients, coordinates, and angles under random user orientations. A neural network is first employed to capture the nonlinear relationship between holographic channel characteristics and the relative coordinates of the base station and user. The trained network is then embedded into a factor graph, where global optimization is performed. The neural network is dynamically approximated through Taylor expansion, allowing bidirectional message propagation and iterative refinement of parameter estimates. To address random user orientations, Euler angle rotation theory is introduced. 
Finally, convex optimization is applied to estimate the rotation mapping matrix, resulting in the decoupling of coordinate and angle parameters and accurate channel estimation.  Results and Discussions  The simulations evaluate the performance of different algorithms under varying key parameters, including Signal-to-Noise Ratio (SNR), pilot length L, and base station antenna number M. Two performance metrics are considered: Normalized Mean Square Error (NMSE) of channel estimation and user positioning accuracy, with the Cramér-Rao Lower Bound (CRLB) serving as the theoretical benchmark. At an SNR of 10 dB, the proposed method achieves a channel NMSE below –40 dB, outperforming Least Squares (LS) estimation and approximate model-based approaches. Under high SNR conditions, the NMSE converges toward the CRLB, confirming near-optimal performance (Fig. 5a). The proposed channel model demonstrates superior performance over “approximate methods” due to its enhanced characterization of real-world channels. Moreover, the positioning error gap between the proposed method and the “parallel bound” narrows to nearly 3 dB at high SNR, confirming the accuracy of angle estimation and the effectiveness of parameter decoupling (Fig. 5b). In addition, the proposed method maintains performance close to the theoretical bounds when system parameters, such as user antenna number N, base station antenna number M, and pilot length L, are varied, demonstrating strong robustness (Figs. 6-8). These results also show that the Euler angle rotation-based estimation effectively compensates for coordinate offsets induced by random user orientations.  Conclusions  This study proposes a framework for HMIMO channel estimation by integrating neural networks, factor graphs, and convex optimization. The main contributions are threefold.
First, Euler angles and coordinate mapping are incorporated into the parameterized channel model through factorization and factor graphs, enabling channel modeling under arbitrary user antenna orientations. Second, neural networks and convex optimization are embedded as factor nodes in the graph, allowing nonlinear function approximation and global optimization. Third, bidirectional message passing between neural network and convex optimization nodes is realized through Taylor expansion, thereby achieving joint decoupling and estimation of channel parameters, coordinates, and angles. Simulation results confirm that the proposed framework achieves higher accuracy, exceeding benchmarks by more than 3 dB, and demonstrates strong robustness across a range of scenarios. Future work will extend the method to multi-user environments, incorporate polarization diversity, and address hardware impairments such as phase noise, with the aim of supporting practical deployment in 6G systems.
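The Taylor-expansion step above can be illustrated by locally linearizing a nonlinear node so that Gaussian messages can pass through it: around an operating point x0, f(x) is approximated by f(x0) + J (x - x0). The helper below uses numeric differentiation and is a generic sketch of this idea, not the paper's implementation.

```python
import numpy as np

def linearize(f, x0, eps=1e-6):
    """First-order Taylor expansion of a vector function f at x0,
    f(x) ≈ f(x0) + J (x - x0), with the Jacobian J estimated by forward
    finite differences. A trained network node linearized this way admits
    closed-form Gaussian message updates on a factor graph (sketch)."""
    x0 = np.asarray(x0, dtype=float)
    f0 = np.asarray(f(x0), dtype=float)
    J = np.zeros((f0.size, x0.size))
    for i in range(x0.size):
        dx = np.zeros_like(x0)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x0 + dx), dtype=float) - f0) / eps
    return f0, J
```

Re-linearizing at each iteration's current estimate is what makes the bidirectional message propagation between the neural-network and convex-optimization nodes tractable.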
A Multi-class Local Distribution-based Weighted Oversampling Algorithm for Multi-class Imbalanced Datasets
TAO Xinmin, XU Annan, SHI Lihang, LI Junxuan, GUO Xinyue, ZHANG Yanping
2025, 47(11): 4188-4199.   doi: 10.11999/JEIT250381
[Abstract](125) [FullText HTML](80) [PDF 6009KB](8)
Abstract:
  Objective  Classification with imbalanced datasets remains one of the most challenging problems in machine learning. In addition to class imbalance, such datasets often contain complex factors including class overlap, small disjuncts, outliers, and low-density regions, all of which can substantially degrade classifier performance, particularly in multi-class settings. To address these challenges simultaneously, this study proposes the Multi-class Local Distribution-based Weighted Oversampling Algorithm (MC-LDWO).  Methods  The MC-LDWO algorithm first constructs hyperspheres centered on dynamically determined minority classes, with radii estimated from the distribution of each class. Within these hyperspheres, minority class samples are selected for oversampling according to their local distribution, and an adaptive weight allocation strategy is designed using local density metrics. This ensures that samples in low-density regions and near class boundaries are assigned higher probabilities of being oversampled. Next, a low-density vector is computed from the local distribution of both majority and minority classes. A random vector is then introduced and integrated with the low-density vector, and a cutoff threshold is applied to determine the generation sites of synthetic samples, thereby reducing class overlap during boundary oversampling. Finally, an improved decomposition strategy tailored for multi-class imbalance is employed to further enhance classification performance in multi-class imbalanced scenarios.  Results and Discussions  The MC-LDWO algorithm dynamically identifies the minority and combined majority class sample sets and constructs hyperspheres centered on each minority class sample, with radii determined by the distribution of the corresponding minority class. These hyperspheres guide the subsequent oversampling process. 
A trade-off parameter (β) is introduced to balance the influence of local densities between the combined majority and minority classes. Experimental results on KEEL datasets show that this approach effectively prevents class overlap during boundary oversampling while assigning higher oversampling weights to critical minority samples located near boundaries and in low-density regions. This improves boundary distribution and simultaneously addresses within-class imbalance. When the trade-off parameter is set to 0.5, MC-LDWO achieves a balanced consideration of both boundary distribution and the diverse densities present in minority classes due to data difficulty factors, thereby supporting improved performance in downstream classification tasks (Fig. 10).  Conclusions  Comparative results with other state-of-the-art oversampling algorithms demonstrate that: (1) The MC-LDWO algorithm effectively prevents overlap when strengthening decision boundaries by setting the cutoff threshold (T) and adaptively assigns oversampling weights according to two local density indicators for the minority and combined majority classes within the hypersphere. This approach addresses within-class imbalance caused by data difficulty factors and enhances boundary distribution. (2) By jointly considering density and boundary distribution, and setting the trade-off parameter to 0.5, the proposed algorithm can simultaneously mitigate within-class imbalance and reinforce the boundary information of minority classes. (3) When applied to highly imbalanced datasets characterized by complex decision boundaries and data difficulty factors such as outliers and small disjuncts, MC-LDWO significantly improves the boundary distribution of each minority class while effectively managing within-class imbalance, thereby enhancing the performance of subsequent classifiers.
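The β-weighted combination of the two local density indicators can be sketched as follows. The specific density and boundary measures below are simplified stand-ins for the paper's indicators: minority sparsity inside the hypersphere raises the weight, as does proximity to majority samples.

```python
import numpy as np

def oversample_weights(X_min, X_maj, radius, beta=0.5):
    """Per-sample oversampling probabilities for one minority class:
    combine a minority-sparsity term with a majority-proximity (boundary)
    term inside a hypersphere of the given radius, balanced by beta.
    The two density measures here are illustrative simplifications."""
    X_min, X_maj = np.asarray(X_min, float), np.asarray(X_maj, float)
    w = []
    for x in X_min:
        n_min = np.sum(np.linalg.norm(X_min - x, axis=1) < radius) - 1  # exclude x itself
        n_maj = np.sum(np.linalg.norm(X_maj - x, axis=1) < radius)
        sparsity = 1.0 / (1 + n_min)        # low minority density -> higher weight
        boundary = n_maj / (1 + n_maj)      # nearby majority samples -> higher weight
        w.append(beta * boundary + (1 - beta) * sparsity)
    w = np.array(w)
    return w / w.sum()                      # normalize to a sampling distribution
```

With beta = 0.5 both effects contribute equally, matching the balanced setting discussed above.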
Multi-Mode Anti-Jamming for UAV Communications: A Cooperative Mode-Based Decision-Making Approach via Two-Dimensional Transfer Reinforcement Learning
WANG Shiyu, WANG Ximing, KE Zhenyi, LIU Dianxiong, LIU Jize, DU Zhiyong
2025, 47(11): 4200-4210.   doi: 10.11999/JEIT250566
[Abstract](312) [FullText HTML](227) [PDF 3743KB](25)
Abstract:
  Objective  With the widespread application of Unmanned Aerial Vehicles (UAVs) in military reconnaissance, logistics, and emergency communications, ensuring the security and reliability of UAV communication systems has become a critical challenge. Wireless channels are highly vulnerable to diverse jamming attacks. Traditional anti-jamming techniques, such as Frequency-Hopping Spread Spectrum (FHSS), are limited in dynamic spectrum environments and may be compromised by advanced machine learning algorithms. Furthermore, UAVs operate under strict constraints on onboard computational power and energy, which hinders the real-time use of complex anti-jamming algorithms. To address these challenges, this study proposes a multi-mode anti-jamming framework that integrates Intelligent Frequency Hopping (IFH), Jamming-based Backscatter Communication (JBC), and Energy Harvesting (EH) to strengthen communication resilience in complex electromagnetic environments. A Multi-mode Transfer Deep Q-Learning (MT-DQN) method is further proposed, enabling two-dimensional transfer to improve learning efficiency and adaptability under resource constraints. By leveraging transfer learning, the framework reduces computational load and accelerates decision-making, thereby allowing UAVs to counter jamming threats effectively even with limited resources.  Methods  The proposed framework adopts a multi-mode anti-jamming architecture that integrates IFH, JBC, and EH to establish a comprehensive defense strategy of “avoiding, utilizing, and converting” interference. The system is formulated as a Markov Decision Process (MDP) to dynamically optimize the selection of anti-jamming modes and communication channels. To address the challenges of high-dimensional state-action spaces and restricted onboard computational resources, a two-dimensional transfer reinforcement learning framework is developed. 
This framework comprises a cross-mode strategy-sharing network for extracting common features across different anti-jamming modes (Fig. 3) and a parallel network for cross-task transfer learning to adapt to variable task requirements (Fig. 4). The cross-mode strategy-sharing network accelerates convergence by reusing experiences, whereas the cross-task transfer learning network enables knowledge transfer under different task weightings. The reward function is designed to balance communication throughput and energy consumption. It guides the UAV to select the optimal anti-jamming strategy in real time based on spectrum sensing outcomes and task priorities.  Results and Discussions  The simulation results validate the effectiveness of the proposed MT-DQN. The dynamic weight allocation mechanism exhibits strong cross-task transfer capability (Fig. 6), as weight adjustments enable rapid convergence toward the corresponding optimal reward values. Compared with conventional Deep Reinforcement Learning (DRL) algorithms, the proposed method achieves a 64% faster convergence rate while maintaining the probability of communication interruption below 20% in dynamic jamming environments (Fig. 7). The framework shows robust performance in terms of throughput, convergence rate, and adaptability to variations in jamming patterns. In scenarios with comb-shaped and sweep-frequency jamming, the proposed method yields higher normalized throughput and faster convergence, exceeding baseline DQN and other transfer learning-based approaches. The results also indicate that MT-DQN improves stability and accelerates policy optimization during jamming pattern switching (Fig. 7), highlighting its adaptability to abrupt changes in jamming patterns through transfer learning.  Conclusions  This study proposes a multi-modal anti-jamming framework that integrates IFH, JBC, and EH, thereby enhancing the communication capability of UAVs. 
The proposed solution shifts the paradigm from traditional jamming avoidance toward active jamming exploitation, repurposing jamming signals as covert carriers to overcome the limitations of conventional frequency-hopping systems. Simulation results confirm the advantages of the proposed method in throughput performance, convergence rate, and environmental adaptability, demonstrating stable communication quality even under complex electromagnetic conditions. Although DRL approaches are inherently constrained in handling completely random jamming without intrinsic patterns, this work improves adaptability to dynamic jamming through transfer learning and cross-modal strategy sharing. These findings provide a promising approach for countering complex jamming threats in UAV networks. Future work will focus on validating the proposed algorithm in hardware implementations and enhancing the robustness of DRL methods under highly non-stationary, though not entirely unpredictable, jamming conditions such as pseudo-random or adaptive interference.
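The joint choice of anti-jamming mode and channel described above forms the MDP's action space. The sketch below shows the decision step as an epsilon-greedy selection over (mode, channel) pairs; the mode names follow the paper (IFH, JBC, EH), while the channel count, tabular Q-values, and epsilon schedule are illustrative stand-ins for the deep Q-network with transferred weights.

```python
import itertools
import random

MODES = ["IFH", "JBC", "EH"]            # avoid, utilize, convert interference
CHANNELS = range(4)                     # illustrative channel set
ACTIONS = list(itertools.product(MODES, CHANNELS))

def epsilon_greedy(q, state, eps=0.1):
    """Pick a (mode, channel) action for the current spectrum state:
    explore uniformly with probability eps, otherwise exploit the action
    with the highest Q-value (tabular sketch of the MT-DQN decision step)."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

In the paper's framework the Q-values come from the cross-mode strategy-sharing network, so experience gathered in one mode shapes the estimates for the others.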
Dual Mode Index Modulation-aided Orthogonal Chirp Division Multiplexing System in High-dynamic Scenes
NING Xiaoyan, TANG Zihan, YIN Qiaoling, WANG Shihan
2025, 47(11): 4211-4219.   doi: 10.11999/JEIT250475
[Abstract](123) [FullText HTML](48) [PDF 3429KB](13)
Abstract:
  Objective  In high-dynamic environments, the Orthogonal Chirp Division Multiplexing (OCDM) system has attracted significant attention due to its inherent advantage of time-frequency two-dimensional expansion gain. The OCDM with Index Modulation (OCDM-IM) system extends the index domain of the traditional OCDM system, selectively activating subcarriers through index modulation. This reduces inter-carrier interference to some extent. However, the OCDM-IM system necessitates that certain subcarriers remain inactive, which, on one hand, diminishes the time-frequency expansion gain of the OCDM system and, on the other hand, leads to more pronounced Doppler interference in high-dynamic environments. Additionally, the inactive subcarriers do not contribute to data transmission, resulting in throughput loss. To overcome these challenges, this study proposes a novel communication system architecture, the Dual Mode Index Modulation-aided OCDM (DM-OCDM-IM). This architecture incorporates a dual-mode index mapping scheme and introduces new modulation dimensions within the OCDM system. The DM-OCDM-IM system preserves the interference immunity associated with the time-frequency two-dimensional expansion of the OCDM system while achieving higher spectral efficiency with low-order constellation modulation, offering enhanced communication performance in high-dynamic scenarios.  Methods  In this study, a DM-OCDM-IM communication system architecture is proposed, consisting of two main components: the dual mode index modulation module and the receiving algorithm. In the dual mode index modulation module, the DM-OCDM-IM system partitions the subcarriers in each subblock into two groups, each transmitting constant-amplitude and mutually distinguishable constellation symbols. This design expands the modulation dimensions and improves spectral efficiency. 
At the same time, low-order constellation modulation can be applied in a single dimension, thereby strengthening the system’s anti-jamming capability in high-dynamic environments. The constant-amplitude dual mode index mapping scheme also reduces performance fluctuations caused by channel gain variations and offers ease of hardware implementation. For signal reception, the system must contend with substantial Doppler frequency shifts and the computational complexity of demodulation in high-dynamic conditions. To address this, the DM-OCDM-IM employs a receiving algorithm based on feature decomposition of the Discrete Fresnel Transform (DFnT), which reduces complexity. The discrete time-domain transmit signal is reconstructed by applying the Discrete Fourier Transform (DFT) and feature decomposition to the received frequency-domain signal. Finally, the original transmitted bits are recovered through index demodulation and constellation demodulation of the reconstructed time-domain signal using a maximum-likelihood receiver.  Results and Discussions  The performance of the proposed DM-OCDM-IM system is simulated and compared with that of the existing Dual Mode Index Modulation-aided OFDM (DM-OFDM-IM) system and the OCDM-IM system under three channel conditions: AWGN, multipath, and Doppler frequency shift. The results show that, relative to the DM-OFDM-IM system, the proposed DM-OCDM-IM system exploits multipath diversity more effectively and exhibits stronger resistance to fading in all three channels (Fig. 5, Fig. 6). When compared with the OCDM-IM system, the Bit Error Rate (BER) performance of the proposed DM-OCDM-IM system is significantly improved across all three channel conditions, particularly at high spectral efficiency (Fig. 7(b), Fig. 8(b)). These results confirm that the introduction of the dual mode index modulation technique extends the modulation dimensions within the OCDM framework.
Information is transmitted not only through index modulation but also through dual mode modulation, enabling higher spectral efficiency without increasing the modulation order. At the same time, the time-frequency expansion gain characteristic of OCDM is preserved, while receiver complexity is effectively controlled. These combined features make the proposed DM-OCDM-IM system well suited for communication in high-dynamic channel environments.  Conclusions  This paper establishes a novel DM-OCDM-IM system framework. First, by integrating a constant-amplitude dual mode index mapping scheme into the traditional OCDM system, the proposed design expands the modulation dimensions and allows the use of low-order constellation modulation in a single dimension. This improves spectral efficiency while enhancing system reliability in high-dynamic environments. Second, to reduce receiver-side complexity, a receiving algorithm based on feature decomposition of the DFnT is proposed, simplifying the digital signal processing of the DM-OCDM-IM system. Finally, the performance of the system is evaluated under AWGN, multipath, and Doppler frequency shift channels. The results demonstrate that, compared with the existing DM-OFDM-IM system, the proposed DM-OCDM-IM system exhibits stronger resistance to multipath fading and Doppler frequency shifts. In comparison with the OCDM-IM system, the proposed DM-OCDM-IM design preserves the time-frequency expansion gain of OCDM and provides stronger fading resistance at high spectral efficiency. Therefore, the proposed DM-OCDM-IM system offers superior adaptability in high-dynamic scenarios and has the potential to serve as a next-generation physical-layer waveform for mobile communications.
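The low-complexity reception described above rests on a standard property of the DFnT: for an even transform size, the DFnT matrix is circulant and therefore reduces to circular convolution with a discrete chirp, computable with FFTs. The following NumPy sketch illustrates this textbook identity only; it is not the authors' receiver, and the function names are illustrative:

```python
import numpy as np

def dfnt(x):
    """Discrete Fresnel Transform (even length N) via circular convolution with a chirp."""
    N = len(x)
    k = np.arange(N)
    # For even N the DFnT matrix is circulant; this chirp is its first column
    h = np.exp(-1j * np.pi / 4) * np.exp(1j * np.pi * k**2 / N) / np.sqrt(N)
    # Circular convolution implemented in the frequency domain
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(h))

def idfnt(y):
    """Inverse DFnT: circular convolution with the conjugate chirp."""
    N = len(y)
    k = np.arange(N)
    h = np.exp(1j * np.pi / 4) * np.exp(-1j * np.pi * k**2 / N) / np.sqrt(N)
    return np.fft.ifft(np.fft.fft(y) * np.fft.fft(h))
```

Because the transform is unitary, `idfnt(dfnt(x))` recovers `x` and signal energy is preserved, which is the property that DFT-plus-decomposition receivers exploit.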
HRIS-Aided Layered Sparse Reconstruction Hybrid Near- and Far-Field Source Localization Algorithm
YANG Qingqing, PU Xuelai, PENG Yi, LI Hui, YANG Qiuping
2025, 47(11): 4220-4230.   doi: 10.11999/JEIT250429
[Abstract](73) [FullText HTML](40) [PDF 2514KB](15)
Abstract:
  Objective  Advances in Reconfigurable Intelligent Surface (RIS) technology have enabled larger arrays and higher frequencies, which expand the near-field region and improve positioning accuracy. The fundamental differences between near- and far-field propagation necessitate hybrid localization algorithms capable of seamlessly integrating both regimes.  Methods  A localization framework for mixed near- and far-field sources is proposed by integrating Fourth-Order Cumulant (FOC) matrices with hierarchical sparse reconstruction. A hybrid RIS architecture incorporating active elements is employed to directly receive pilot signals, thereby reducing parameter-coupling errors that commonly occur in passive RIS over multi-hop channels and enhancing reliability in Non-Line-of-Sight (NLOS) scenarios. Symmetrically placed active elements are employed to construct three FOC matrices for three-dimensional position estimation. The two-dimensional angle search is decomposed into two sequential one-dimensional searches, where elevation and azimuth are estimated separately to reduce computational complexity. The first FOC matrix ($C_1$), formed from vertically symmetric elements, captures elevation characteristics. The second matrix ($C_2$), constructed from centrally symmetric elements, suppresses nonlinear terms related to distance. The third matrix ($C_3$) applies the previously estimated angles to select active elements, incorporates near-field effects, and enables accurate distance estimation as well as discrimination between near-field and far-field signals. To further improve the efficiency and accuracy of spectral searches, a hierarchical multi-resolution strategy based on sparse reconstruction is introduced. This method partitions the continuous parameter space into discrete intervals, incrementally generates a multi-resolution dictionary, and applies a progressive search procedure for precise position parameter estimation. 
During the search process, a tuning factor constrains the maximum reconstruction error between the sparse matrix and the projection of the original signal subspace. In addition, the algorithm exploits the orthogonality between the signal and noise subspaces to design a weight matrix, which reduces the effects of noise and position errors on the sparse solution. This hierarchical search enables rapid, coarse-to-fine parameter estimation and substantially improves localization accuracy.  Results and Discussions  The performance of the proposed algorithm is evaluated against Two-Stage Multiple Signal Classification (TSMUSIC), hybrid Orthogonal Matching Pursuit (OMP), and Holographic Multiple-Input Multiple-Output (HMIMO)-based methods with respect to noise resistance, convergence speed, and computational efficiency. Under varying SNR conditions (Fig. 5), traditional subspace methods exhibit degraded performance at low SNR because of reliance on signal-noise subspace orthogonality. In contrast, the proposed algorithm employs the FOC matrix to achieve accurate elevation and azimuth estimation while suppressing Gaussian noise. The hierarchical sparse reconstruction strategy further enhances estimation accuracy, resulting in superior far-field localization performance. Unlike the HMIMO-based algorithm, which depends on dynamic codebook switching, the proposed method retains nonlinear distance-dependent phase terms and constructs the distance codebook from initial angle estimates, thereby improving near-field localization accuracy. In Experiment 2, the effect of varying snapshot numbers on parameter estimation is examined. Owing to the angle-decoupling capability of the FOC matrix, the algorithm achieves rapid reduction in Root Mean Square Error (RMSE) even with a small number of snapshots. 
As the number of snapshots increases, estimation accuracy improves steadily and approaches convergence, indicating robustness against noise and fast convergence under low-snapshot conditions. Conventional methods typically require predefined near-field and far-field grids. By contrast, the nonlinear phase retention mechanism enables automatic discrimination between near-field and far-field sources without a predetermined distance threshold. While the nonlinear phase term introduces slightly slower convergence during distance decoupling, the proposed method still outperforms TSMUSIC and hybrid OMP. However, angle estimation errors during the decoupling process provide the HMIMO-based approach with a slight advantage in distance estimation accuracy (Fig. 6). Computational complexity is also compared between the hierarchical multi-resolution framework and traditional global search strategies (Fig. 7). Standard hybrid-field localization algorithms, such as TSMUSIC and hybrid OMP, require simultaneous optimization of angle and distance parameters, leading to exponential growth of computational cost. In contrast, the hierarchical strategy applies a phased search in which elevation and azimuth are estimated sequentially, reducing the two-dimensional angle spectrum search to two one-dimensional searches. The combination of progressive grid contraction, layer-by-layer tuning factors, and step-size decay narrows the search range efficiently, enabling rapid convergence through a three-layer dynamic grid structure. The distance dictionary constructed from angle estimates further removes redundant grids, thereby reducing complexity compared with global search methods.  Conclusions  This study presents a 3D localization framework for mixed near- and far-field sources in RIS-assisted systems by combining FOC decoupling with hierarchical sparse reconstruction. 
The method decouples angle and range estimation and uses a multi-resolution search strategy, achieving reliable performance and rapid convergence even under low SNR conditions and with limited snapshots. Simulation results demonstrate that the proposed approach consistently outperforms TSMUSIC, hybrid OMP, and HMIMO-based techniques, confirming its efficiency and robustness in mixed-field environments.
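The coarse-to-fine behavior of a hierarchical multi-resolution search can be illustrated with a generic one-dimensional sketch: each layer evaluates a coarse grid of the spectrum, then contracts the interval around the current peak. This is a simplified illustration of the layered idea only; the grid sizes, shrink factor, and objective are illustrative and not the paper's:

```python
import numpy as np

def coarse_to_fine_peak(spectrum, lo, hi, n_grid=16, n_layers=3, shrink=4.0):
    """Locate the maximizer of `spectrum` on [lo, hi] with a layered grid search."""
    best = lo
    for _ in range(n_layers):
        grid = np.linspace(lo, hi, n_grid)                    # coarse grid for this layer
        best = grid[np.argmax([spectrum(g) for g in grid])]   # current peak estimate
        half = (hi - lo) / (2.0 * shrink)                     # contract around the peak
        lo, hi = best - half, best + half
    return best
```

With three layers of 16 points each, the search spends 48 spectrum evaluations instead of the thousands a single fine global grid of comparable resolution would require, which is the complexity advantage the hierarchical strategy targets.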
Hybrid Far-Near Field Channel Estimation for XL-RIS Assisted Communication Systems
SHAO Kai, HUA Fanyu, WANG Guangyu
2025, 47(11): 4231-4241.   doi: 10.11999/JEIT250306
[Abstract](181) [FullText HTML](124) [PDF 1943KB](34)
Abstract:
  Objective  With the rapid development of sixth-generation mobile communication, Extra-Large Reconfigurable Intelligent Surfaces (XL-RIS) have attracted significant attention due to their potential to enhance spectral efficiency, expand coverage, and reduce energy consumption. However, conventional channel estimation methods, primarily based on Far-Field (FF) or near-field (NF) models, face limitations in addressing the hybrid far-NF environment that arises from the coexistence of NF spherical waves and FF planar waves in XL-RIS deployments. These limitations restrict the intelligent control capability of RIS technology due to inaccurate channel modeling and reduced estimation accuracy. To address these challenges, this paper constructs a hybrid-field channel model for XL-RIS and proposes a robust channel estimation method to resolve parameter estimation challenges under coupled FF and NF characteristics, thereby improving channel estimation accuracy in complex propagation scenarios.  Methods  For channel estimation in XL-RIS-aided communication systems, several key challenges must be addressed, including the modeling of hybrid far-NF cascaded channels, separation of FF and NF channel components, and individual parameter estimation. To capture the hybrid-field effects of XL-RIS, a hybrid-field cascaded channel model is constructed. The RIS-to-User Equipment (UE) channel is modeled as a hybrid far-NF channel, whereas the Base Station (BS)-to-RIS channel is characterized under the FF assumption. A unified representation of FF and NF models is established by introducing equivalent cascaded angles for the angle of departure and angle of arrival on the RIS side. The XL-RIS hybrid-field cascaded channel is parameterized through BS arrival angles, RIS-UE cascaded angles, and distances. To reduce the computational complexity of joint parameter estimation, a Two-Stage Hybrid-Field (TS-HF) channel estimation scheme is proposed. 
In the first stage, the BS arrival angle is estimated using the MUltiple SIgnal Classification (MUSIC) algorithm. In the second stage, a Hybrid-Field forward spatial smoothing Rank-reduced MUSIC (HF-RM) algorithm is proposed to estimate the parameters of the RIS-UE hybrid-field channel. The received signals are pre-processed using a forward spatial smoothing technique to mitigate multipath coherence effects. Subsequently, the Rank-reduced MUSIC (RM) algorithm is applied to separately estimate the FF and NF angle parameters, as well as the NF distance parameter. During this stage, a power spectrum comparison scheme is designed to distinguish FF and NF angles based on power spectral characteristics, thereby providing high-precision angular information to support NF distance estimation. Finally, channel attenuation is estimated using the least squares method. To validate the effectiveness of the proposed hybrid-field channel estimation scheme, comparative analyses are conducted against FF, NF, and the proposed TS-HF-RM schemes. The FF estimation approximates the hybrid-field channel using an FF channel model and estimates FF angle parameters with the MUSIC algorithm, referred to as the TS-FF-M scheme. The NF estimation applies an NF channel model to characterize the hybrid channel and estimates angle and distance parameters using the RM algorithm, referred to as the TS-NF-RM scheme. To further evaluate the estimation performance, additional benchmark schemes are considered, including the Two-Stage Near-Field Orthogonal Matching Pursuit (TS-NOMP) scheme, the Two-Stage Hybrid Orthogonal Matching Pursuit with Prior (TS-HOMP-P) scheme that requires prior knowledge of FF and NF quantities, and the Two-Stage Hybrid Orthogonal Matching Pursuit with No Prior (TS-HOMP-NP) scheme that operates without requiring such prior information.  
Results and Discussions  Compared with the TS-FF-M and TS-NF-RM schemes, the proposed TS-HF-RM approach achieves effective separation and accurate estimation of both FF and NF components by jointly modeling the hybrid-field channel. The method consistently demonstrates superior estimation accuracy across a wide range of Signal-to-Noise Ratio (SNR) conditions (Fig. 4). These results confirm both the necessity of hybrid-field channel modeling and the effectiveness of the proposed estimation scheme. Experimental findings show that the TS-HF-RM approach significantly improves channel estimation performance in XL-RIS-assisted communication systems. Further comparative analysis reveals that the TS-HF-RM scheme outperforms TS-NOMP and TS-HOMP-P by mitigating power leakage effects and overcoming limitations associated with unknown path numbers through distinct processing of FF and NF components. Without requiring prior knowledge of the propagation environment, the proposed method achieves lower Normalized Mean Square Error (NMSE) while demonstrating improved robustness and estimation precision (Fig. 5). Although TS-HOMP-NP also operates without prior field information, the TS-HF-RM scheme provides superior parameter resolution, attributed to its subspace decomposition principle. Additionally, both the TS-HF-RM and TS-HOMP-P schemes exhibit improved performance as the number of pilot signals increases. However, TS-HF-RM consistently outperforms TS-HOMP-P under low-SNR conditions (0 dB). At high SNR (10 dB) with a limited number of pilot signals (<280), TS-HOMP-P temporarily achieves better performance due to its higher sensitivity to SNR. Nevertheless, the proposed TS-HF-RM approach demonstrates greater stability and adaptability under low-SNR and resource-constrained conditions (Fig. 6).  
Conclusions  This study addresses the challenge of hybrid-field channel estimation for XL-RIS by constructing a hybrid-field cascaded channel model and proposing a two-stage estimation scheme. The HF-RM algorithm is specifically designed for accurate hybrid component estimation in the second stage. Theoretical analysis and simulation results demonstrate the following: (1) The hybrid-field model reduces inaccuracies associated with traditional single-field assumptions, providing a theoretical foundation for reliable parameter estimation in complex propagation environments; (2) The proposed TS-HF-RM algorithm enables high-resolution parameter estimation with effective separation of FF and NF components, achieving lower NMSE compared to hybrid-field OMP-based methods.
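The forward spatial smoothing pre-processing used to mitigate multipath coherence before subspace estimation can be sketched in its textbook form: averaging the covariances of overlapping forward subarrays. This is the standard technique only, not the exact HF-RM pre-processing; the half-wavelength ULA steering model below is an illustrative assumption:

```python
import numpy as np

def forward_spatial_smoothing(R, L):
    """Average the L-by-L covariances of all overlapping forward subarrays of R."""
    N = R.shape[0]
    M = N - L + 1  # number of overlapping subarrays
    return sum(R[m:m + L, m:m + L] for m in range(M)) / M

def ula_steering(N, theta):
    """Half-wavelength ULA steering vector for angle theta in radians (illustrative)."""
    return np.exp(1j * np.pi * np.arange(N) * np.sin(theta))
```

With two fully coherent paths the raw covariance is rank one, so subspace methods fail; after smoothing, the signal subspace recovers rank two, which is what allows the subsequent MUSIC-type stage to resolve both paths.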
A Spatial-semantic Combine Perception for Infrared UAV Target Tracking
YU Guodong, JIANG Yichun, LIU Yunqing, WANG Yijun, ZHAN Weida, WANG Chunyang, FENG Jianghai, HAN Yueyi
2025, 47(11): 4242-4253.   doi: 10.11999/JEIT250613
[Abstract](158) [FullText HTML](55) [PDF 6284KB](42)
Abstract:
  Objective  In recent years, infrared image-based UAV target tracking technology has attracted widespread attention. In real-world scenarios, infrared UAV target tracking still faces significant challenges due to factors such as complex backgrounds, UAV target deformation, and camera movement. Siamese network-based tracking methods have made breakthroughs in balancing tracking accuracy and efficiency. However, existing approaches rely solely on high-level feature outputs from deep networks to predict target positions, neglecting the effective use of low-level features. This leads to the loss of spatial detail features of infrared UAV targets, severely affecting tracking performance. To efficiently utilize low-level features, some methods have incorporated Feature Pyramid Networks (FPN) into the tracking framework, progressively fusing cross-layer feature maps in a top-down manner, thereby effectively enhancing tracking performance for multi-scale targets. Nevertheless, these methods directly adopt traditional FPN channel reduction operations, which result in significant loss of spatial contextual information and channel semantic information. To address the above issues, a novel infrared UAV target tracking method based on spatial-semantic combine perception is proposed. By capturing spatial multi-scale features and channel semantic information, the proposed approach enhances the model’s capability to track infrared UAV targets in complex backgrounds.  Methods  The proposed method comprises four main components: a backbone network, multi-scale feature fusion, template-search feature interaction, and a detection head. Initially, template and search images containing infrared UAV targets are input into a weight-sharing backbone network to extract features. Subsequently, an FPN is constructed, within which a Spatial-semantic Combine Attention Module (SCAM) is integrated to efficiently fuse multi-scale features. 
Finally, a Dual-branch global Feature interaction Module (DFM) is employed to facilitate feature interaction between the template and search branches, and the final tracking results are obtained through the detection head. The proposed SCAM enhances the network’s focus on spatial and semantic information by jointly leveraging spatial and channel attention mechanisms, thereby mitigating the loss of spatial and semantic information in low-level features caused by channel dimensionality reduction in traditional FPN. SCAM primarily consists of two components: the Spatial Multi-scale Attention module (SMA) and the Global-local Channel Semantic Attention module (GCSA). The SMA captures long-range multi-scale dependencies efficiently through axial positional embedding and multi-branch grouped feature extraction, thereby improving the network’s perception of global contextual information. GCSA adopts a dual-branch design to effectively integrate global and local information across feature channels, suppress irrelevant background noise, and enable more rational channel-wise feature weighting. The proposed DFM treats the template branch features as the query source for the search branch and applies global cross-attention to capture more comprehensive features of infrared UAV targets. This enhances the tracking network’s ability to attend to the spatial location and boundary details of infrared UAV targets.  Results and Discussions  The proposed method has been validated on the infrared UAV benchmark dataset (Anti-UAV). Quantitative analysis (Table 1) demonstrates that, compared to 10 state-of-the-art methods, the proposed approach achieves the highest average normalized precision score of 76.9%, surpassing the second-best method, LGTrack, by 4.4%. Qualitative analysis (Figs. 
6–8) further confirms that the proposed method exhibits strong adaptability and robustness when addressing various typical challenges in infrared UAV tracking, such as out-of-view targets, distracting objects, and complex backgrounds. The collaborative design of the individual modules significantly enhances the model’s ability to perceive and represent small targets and dynamic scenes. In addition, qualitative experiments (Fig. 9) conducted on a self-constructed infrared UAV tracking dataset demonstrate the effectiveness and generalization capability of the proposed method in real-world tracking scenarios. Ablation studies (Tables 2–5) reveal that integrating any individual proposed module consistently improves tracking performance.  Conclusions  This paper conducts a systematic theoretical analysis and experimental validation addressing the issue of spatial and semantic information loss in infrared UAV target tracking. Focusing on the limitations of existing FPN-based infrared UAV tracking methods, particularly the drawbacks associated with channel reduction in multi-scale low-level features, a novel infrared UAV target tracking method based on spatial-semantic combine perception is proposed, which fully leverages the complementary advantages of spatial and channel attention mechanisms. This method enhances the network’s focus on spatial context and critical semantic information, thereby improving overall tracking performance. The following main conclusions are obtained: (1) The proposed SCAM combines SMA and GCSA: SMA captures spatial long-range feature dependencies through position coordinate embedding and one-dimensional convolution operations, ensuring the acquisition of multi-scale contextual information, while GCSA achieves more comprehensive semantic feature attention by interacting local and global channel features. 
(2) The designed DFM realizes feature interaction between search-branch and template-branch features through global cross-attention, enabling the dual-branch features to complement each other and enhancing tracking performance. (3) Extensive experimental results demonstrate that the proposed algorithm outperforms existing advanced methods in both quantitative evaluation and qualitative analysis, with an average state accuracy of 0.769, a success rate of 0.743, and a precision of 0.935, achieving more accurate tracking of infrared UAV targets. Although the proposed algorithm has been optimized for efficient use of computing resources, further research is needed on deployment strategies for embedded and mobile devices to improve real-time performance and computational adaptability.
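The global cross-attention underlying the template-search interaction, in which tokens of one branch attend over tokens of the other, can be sketched as single-head scaled dot-product attention in NumPy. The dimensions and projection matrices below are illustrative assumptions, not the DFM's actual configuration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, Wq, Wk, Wv):
    """Single-head global cross-attention: rows of `queries` attend over `context`."""
    Q, K, V = queries @ Wq, context @ Wk, context @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # scaled dot-product similarity
    return softmax(scores, axis=-1) @ V      # attention-weighted sum of context values
```

Each output row is a convex combination of projected context features, so search-region tokens are enriched with globally matched template information (or vice versa), which is the complementary dual-branch effect the DFM exploits.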
Belief Propagation-Ordered Statistics Decoding Algorithm with Parameterized List Structures
LIANG Jifan, WANG Qianfan, SONG Linqi, LI Lvzhou, MA Xiao
2025, 47(11): 4254-4263.   doi: 10.11999/JEIT250552
[Abstract](124) [FullText HTML](67) [PDF 3351KB](14)
Abstract:
  Objective  Traditional Belief Propagation-Ordered Statistics Decoding (BP-OSD) algorithms for quantum error-correcting codes often rely on a single normalization factor ($\alpha$) in the Belief Propagation (BP) stage, which restricts the search space and limits decoding performance. An enhanced BP-OSD algorithm is presented to address this limitation by employing a list of candidate $\alpha$ values. The central idea is to perform BP decoding iteratively for multiple $\alpha$ values, with the resulting posterior probabilities post-processed by Ordered Statistics Decoding (OSD). To balance performance gains with computational tractability, the multi-$\alpha$ BP-OSD process is embedded within a two-stage framework: the more computationally intensive parameter-listed decoding is activated only when an initial BP decoding with a fixed $\alpha_0$ fails. This design broadens the parameter search to improve decoding performance, while conditional activation ensures that computational complexity remains manageable, particularly at low physical error rates.  Methods  The proposed enhanced BP-OSD algorithm (Algorithm 1) introduces a two-stage decoding process. In the first stage, decoding is attempted using standard BP with a single predetermined normalization factor ($\alpha_0$), providing a computationally efficient baseline. If this attempt fails to produce a valid syndrome match, the second stage is activated. In the second stage, parameter listing is employed: BP decoding is executed independently across a predefined list of $L$ distinct normalization factors $\{\alpha_1, \alpha_2, \cdots, \alpha_L\}$. 
Each run generates a set of posterior probabilities corresponding to a different BP operational point. These posterior probabilities are then individually post-processed by an OSD module, forming a pool of candidate error patterns. The final decoded output is selected from this pool according to the maximum likelihood criterion, or the minimum Hamming weight criterion under a depolarizing channel. Complexity analysis shows that this conditional two-stage design ensures that the average computational cost remains comparable to that of standard BP decoding, particularly at low physical error rates where the first stage frequently succeeds.  Results and Discussions  The effectiveness of the proposed algorithm is evaluated through Monte Carlo simulations on both Surface codes ⟦$2d^2-2d+1, 1, d$⟧ and Quantum Low-Density Parity-Check (QLDPC) codes ⟦882, 24⟧ under a depolarizing channel. For Surface codes, the enhanced BP-OSD algorithm achieves a substantially lower logical error rate compared with both the Minimum-Weight Perfect Matching (MWPM) algorithm and the original BP algorithm (Fig. 4(a)). The error threshold is improved from approximately 15.5% (MWPM) to about 18.3% with the proposed method. The average decoding time comparison in Fig. 4(b) demonstrates that, particularly at low physical error rates, the proposed algorithm maintains a decoding speed comparable to the original BP algorithm. This efficiency results from the two-stage design, in which the more computationally intensive parameter-listed search is activated only when required. For QLDPC codes (Fig. 
5(a)), the proposed algorithm outperforms both the original BP and BP-OSD algorithms in terms of logical error rate, even when a smaller OSD candidate list per $\alpha$ value is employed. As shown in Table 3, increasing the parameter list size $L$ (e.g., $L = 4, 8, 16$) improves decoding performance, although the gains diminish as $L$ grows. This observation supports the choice of $L = 16$ as an effective balance between performance and complexity. Furthermore, the activation probability of the second stage (Table 2) decreases rapidly as the physical error rate declines, confirming the efficiency of the two-stage framework.  Conclusions  An enhanced BP-OSD algorithm for quantum error-correcting codes is presented, featuring a parameter-listing strategy for the normalization factor ($\alpha$) in the BP stage. Unlike conventional approaches that rely on a single $\alpha$, the proposed method explores multiple $\alpha$ values, with the resulting posterior probabilities processed by the OSD module to select the most likely output. This systematic expansion of the search space improves decoding performance. To control computational overhead, a two-stage decoding mechanism is employed: the parameter-listed BP-OSD is activated only when an initial BP decoding with a fixed $\alpha_0$ fails. Complexity analysis, supported by numerical simulations, shows that the average computational cost of the proposed algorithm remains comparable to that of standard BP decoding in low physical error rate regimes. Monte Carlo simulations further demonstrate its efficacy. For Surface codes, the enhanced BP-OSD achieves lower logical error rates than the MWPM algorithm and raises the error threshold from approximately 15.5% to 18.3%. 
For QLDPC codes, it exceeds both the original BP and BP-OSD algorithms in logical error rate performance, even with a reduced OSD candidate list size in the second stage. Overall, the proposed algorithm provides a promising pathway toward high-performance, high-threshold quantum error correction by balancing decoding power with operational efficiency, highlighting its potential for practical applications.
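The two-stage control flow and the minimum-Hamming-weight selection rule can be sketched as follows. Here `bp_decode` is a stand-in for a real BP decoder returning a candidate error pattern for a given normalization factor, and the structure is a simplified illustration of the conditional-activation idea, not the paper's Algorithm 1:

```python
import numpy as np

def select_min_weight(H, syndrome, candidates):
    """Among candidate error patterns, keep syndrome matches and return the lightest."""
    valid = [e for e in candidates if np.array_equal(H @ e % 2, syndrome)]
    return min(valid, key=lambda e: int(e.sum())) if valid else None

def two_stage_decode(H, syndrome, bp_decode, alpha0, alpha_list):
    """Stage 1: single-alpha BP; stage 2 (only on failure): pool candidates over alphas."""
    e0 = bp_decode(alpha0)
    if e0 is not None and np.array_equal(H @ e0 % 2, syndrome):
        return e0  # cheap path: the first stage already matched the syndrome
    # Expensive path, activated only on failure: one BP run per listed alpha
    pool = [e for a in alpha_list if (e := bp_decode(a)) is not None]
    return select_min_weight(H, syndrome, pool)
```

Because the second stage runs only when the fixed-$\alpha_0$ attempt fails, the average cost stays close to a single BP run whenever first-stage success is frequent, mirroring the low-error-rate behavior reported above.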
Mutualistic Backscatter NOMA Method for Coordinated Direct and Relay Transmission System
XU Yao, HU Rongfei, JIA Shaobo, LI Bo, WANG Gang, ZHANG Zhizhong
2025, 47(11): 4264-4274.   doi: 10.11999/JEIT250405
[Abstract](197) [FullText HTML](95) [PDF 3855KB](30)
Abstract:
  Objective  The exponential growth in data traffic necessitates that cellular Internet of Things (IoT) systems achieve both ultra-high spectral efficiency and wide-area coverage to meet the stringent service requirements of vertical applications such as industrial automation and smart cities. The Non-Orthogonal Multiple Access-based Coordinated Direct and Relay Transmission (NOMA-CDRT) method can enhance both spectral efficiency and coverage by leveraging power-domain multiplexing and cooperative relaying, making it a promising approach to address these challenges. However, existing NOMA-CDRT frameworks are primarily designed for cellular communications and do not effectively support spectrum sharing or the deep integration of cellular and IoT transmissions. To overcome these limitations, this study proposes a Mutualistic Backscatter NOMA-CDRT (MB-NOMA-CDRT) method. This approach facilitates spectrum sharing and mutualistic coexistence between cellular users and IoT devices, while improving the system’s Ergodic Sum Rate (ESR).  Methods  The proposed MB-NOMA-CDRT method integrates backscatter modulation and power-domain superposition coding to develop a bidirectional communication strategy that unifies information transmission and cooperative assistance, enabling spectrum sharing and mutualistic coexistence between cellular users and IoT devices. Specifically, the base station uses downlink NOMA to serve the cellular center user directly and the cellular edge user via a relaying user. Simultaneously, IoT devices utilize cellular radio frequency signals and backscatter modulation to transmit their data to the base station, thereby achieving spectrum sharing. The backscattered IoT signals act as multipath gains, contributing to improved cellular communication quality. 
To rigorously characterize the system performance, the squared generalized-K distribution and Meijer-G functions are adopted to derive closed-form expressions for the ESR under both perfect and imperfect Successive Interference Cancellation (SIC). Building on this analytical foundation, a power allocation optimization scheme is developed using an enhanced Particle Swarm Optimization (PSO) algorithm to maximize system ESR. Finally, extensive Monte Carlo simulations are conducted to verify the ESR gains of the proposed method, confirm the theoretical analysis, and demonstrate the efficacy of the optimization scheme.  Results and Discussions  The performance advantage of the proposed MB-NOMA-CDRT method is demonstrated through comparisons of ESR with conventional NOMA-CDRT and Orthogonal Multiple Access (OMA) schemes (Fig. 2 and Fig. 3). The theoretical ESR results closely match the simulation data, confirming the validity of the analytical derivations. Under both perfect and imperfect SIC, the proposed method consistently achieves the highest ESR. This improvement arises from spectrum sharing between cellular users and IoT devices, where the IoT link contributes multipath gain to the cellular link, thereby enhancing overall system performance. To investigate the influence of power allocation, simulation results illustrate the three-dimensional relationship between ESR and power allocation coefficients (Fig. 4). A maximum ESR is observed under specific coefficient combinations, indicating that optimized power allocation can significantly improve system throughput. Furthermore, the proposed optimization scheme demonstrates rapid convergence, with ESR values stabilizing within a few iterations (Fig. 5), supporting its computational efficiency. Finally, ESR performance is compared among the proposed optimization scheme, exhaustive search, and fixed power allocation strategies (Fig. 6). 
The proposed scheme consistently yields higher ESR across both perfect and imperfect SIC scenarios, demonstrating its superiority in enhancing spectral efficiency while maintaining low computational complexity.  Conclusions  This study proposes an MB-NOMA-CDRT method that enables spectrum sharing between IoT devices and cellular users while improving cellular communication quality through the backscatter-assisted reflection link. To evaluate system performance, closed-form expressions for the ESR are derived under both perfect and imperfect SIC. Building on this analytical foundation, a power allocation optimization scheme based on PSO is developed to maximize the system ESR. Simulation results demonstrate that the proposed method consistently outperforms conventional NOMA-CDRT and OMA schemes in terms of ESR under both perfect and imperfect SIC conditions. The optimization scheme also exhibits favorable convergence behavior and effectively improves system performance. Given its advantages in spectral efficiency and computational efficiency, the proposed MB-NOMA-CDRT method is well suited to cellular IoT scenarios. Future work will focus on exploring the mathematical conditions necessary to fully characterize and exploit the mutualistic transmission mechanism.
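The PSO-based power-allocation step can be illustrated with a minimal one-dimensional particle swarm maximizer over a power-allocation coefficient in [0, 1]. This is basic textbook PSO under illustrative settings (the inertia and acceleration constants, swarm size, and toy objective are assumptions, not the enhanced variant used in the paper):

```python
import numpy as np

def pso_maximize(f, lo, hi, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm maximization of f over the interval [lo, hi]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, n_particles)   # particle positions
    v = np.zeros(n_particles)              # particle velocities
    pbest, pbest_f = x.copy(), np.array([f(xi) for xi in x])
    gbest = pbest[pbest_f.argmax()]        # best position found by the whole swarm
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        # Inertia + cognitive (personal best) + social (global best) terms
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)         # keep allocation coefficients feasible
        fx = np.array([f(xi) for xi in x])
        improved = fx > pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        gbest = pbest[pbest_f.argmax()]
    return gbest
```

In practice the objective `f` would be the derived closed-form ESR expression evaluated at a candidate power-allocation coefficient; the swarm then converges toward the coefficient combination that maximizes it.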
Low-complexity Ordered Statistic Decoding Algorithm Based on Skipping Mechanisms
WANG Qianfan, GUO Yangeng, SONG Linqi, MA Xiao
2025, 47(11): 4275-4284.   doi: 10.11999/JEIT250447
[Abstract](230) [FullText HTML](110) [PDF 5816KB](29)
Abstract:
  Objective  Ultra-Reliable Low-Latency Communication (URLLC) in 5G and the emerging Hyper-Reliable Low-Latency Communication (HRLLC) in 6G impose exceptionally stringent requirements on both reliability and end-to-end delay. These requirements create opportunities and challenges for short-length channel codes, particularly in scenarios where Maximum-Likelihood (ML) or near-ML decoding is desirable but computational complexity and latency are prohibitive. Ordered Statistic Decoding (OSD) is a universal near-ML decoding technique that can closely approach finite-length performance bounds. However, its re-encoding step suffers from combinatorial explosion, resulting in impractical complexity in high-throughput and low-latency systems. The excessive number of Test-Error-Pattern (TEP) re-encodings fundamentally restricts the deployment of OSD in URLLC and HRLLC contexts. To address this bottleneck, we design multiple efficient skip mechanisms that substantially reduce re-encoding operations while maintaining negligible performance degradation.  Methods  Three complementary skipping mechanisms are developed to prune the OSD re-encoding search: (1) Soft-information-based skipping. Two criteria, Trivial and Dynamic Approximate Ideal (DAI), are introduced to compare the soft metric of each TEP against the minimum soft weight in the current list. Candidates with excessively large soft weights, which are unlikely to be correct, are skipped. Unlike prior work that evaluates only the first TEP at each Hamming weight increment, both criteria are applied to every candidate. The Trivial criterion ensures no performance loss by skipping only when a TEP’s soft metric exceeds the best-so-far. The DAI criterion incorporates an expected residual soft-weight compensation term over non-basis bits, enabling more aggressive skipping with minimal performance degradation. (2) Extra-parity skipping. 
The search dimension is expanded from \begin{document}$ k $\end{document} to \begin{document}$ k+\delta $\end{document} by appending the \begin{document}$ \delta $\end{document} most reliable non-basis bit positions to the test vector. Additional parity checks arising from the extended generator matrix eliminate invalid TEPs. Any candidate failing these extra parity constraints is bypassed. (3) Joint skipping. This approach integrates the two preceding mechanisms. Each partial TEP \begin{document}$ ({\boldsymbol{e}}_{\mathrm{L}},{\boldsymbol{e}}_{\delta })\in {\mathbb{F}}_{2}^{k+\delta } $\end{document} is first tested using the DAI rule and then subjected to the extra-parity check. Only candidates passing both criteria are re-encoded.  Results and Discussions  Extensive simulations on extended BCH \begin{document}$ \left[\mathrm{128,64}\right] $\end{document} and BCH \begin{document}$ \left[\mathrm{127,64}\right] $\end{document} codes over the BPSK-AWGN channel demonstrate the efficacy of the proposed skipping mechanisms. Soft-information skipping: When compared with conventional OSD using maximum flipping order \begin{document}$ t=4 $\end{document}, the Trivial rule is found to reduce average re-encodings by 50%~90% across the SNR range. The DAI rule achieves an additional 60%~99% reduction beyond the Trivial rule. At SNR = 3 dB, the average number of re-encodings decreases from approximately \begin{document}$ 6.7\times {10}^{5} $\end{document} to \begin{document}$ 1.2\times {10}^{3} $\end{document}, with negligible degradation in Frame-Error Rate (FER) (Fig. 1). Extra-parity skipping: For \begin{document}$ \delta =4 $\end{document}, over \begin{document}$ 90 \% $\end{document} of re-encodings are eliminated uniformly across SNR values, thereby reducing dependence on channel conditions. This reduction is achieved without significant FER loss (Fig. 2). Joint skipping: The combined mechanism demonstrates superior performance over individual schemes. 
It reduces average re-encodings by approximately a further 40% compared with the DAI rule alone, and by more than 99.9% compared with extra-parity skipping alone in high-SNR regimes. In this region, re-encodings decrease from \begin{document}$ \sim 6.7\times {10}^{5} $\end{document} to fewer than 100, while FER remains nearly identical to that of baseline OSD (Fig. 3). The joint skipping mechanism is further evaluated on BCH codes with different rates, including \begin{document}$ \left[\mathrm{127,36}\right] $\end{document}, \begin{document}$ \left[\mathrm{127,64}\right] $\end{document}, and \begin{document}$ \left[\mathrm{127,92}\right] $\end{document}. In all cases, substantial reductions in re-encodings are consistently achieved with negligible performance degradation (Fig. 4). A comparative analysis with state-of-the-art schemes, including Probabilistic Sufficient/Necessary Conditions (PSC/PNC), Fast OSD (FOSD), and Order-Skipping OSD (OS-OSD), shows that the proposed joint skipping OSD with \begin{document}$ \delta =4 $\end{document} achieves the lowest re-encoding count. Up to two orders of magnitude fewer re-encodings are observed relative to OS-OSD at low SNR, and superiority over FOSD is maintained at moderate SNR, while error-correction performance is preserved across all tested SNRs (Fig. 5).  Conclusions  To address the stringent reliability and latency requirements of 5G URLLC and future 6G HRLLC, this work presents novel skipping mechanisms for OSD that substantially reduce re-encoding complexity. For offline pre-computed TEPs, the soft-information, extra-parity, and joint skipping rules eliminate more than \begin{document}$ 99 \% $\end{document} of redundant re-encodings in typical operating regimes with negligible degradation in Frame-Error Rate (FER). 
In particular, the proposed joint skipping mechanism lowers the average re-encoding count from approximately \begin{document}$ 6.7\times {10}^{5} $\end{document} to only tens in the high-SNR region, thereby meeting practical latency constraints while preserving near-ML performance. These findings demonstrate the potential of the proposed skipping framework to enable high-performance short-block decoding in next-generation HRLLC.
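The Trivial skipping rule admits a compact sketch: a TEP is re-encoded only when the sum of |LLR| values over its flipped positions does not exceed the soft weight of the best candidate found so far. The code size, flipping order, and reliability values below are illustrative, and a real decoder would re-encode each surviving TEP through the code's generator matrix:

```python
import itertools

import numpy as np

def trivial_skip_count(reliabilities, max_order, soft_best):
    # A TEP's soft weight is the sum of |LLR| over its flipped positions.
    # Trivial rule: skip (do not re-encode) any TEP whose soft weight
    # already exceeds the soft weight of the best candidate so far.
    k = len(reliabilities)
    kept = skipped = 0
    for order in range(max_order + 1):
        for flips in itertools.combinations(range(k), order):
            soft_weight = sum(reliabilities[i] for i in flips)
            if soft_weight > soft_best:
                skipped += 1
            else:
                kept += 1        # this TEP would be re-encoded
    return kept, skipped

rng = np.random.default_rng(1)
rel = np.sort(np.abs(rng.normal(0.0, 2.0, 16)))   # |LLR| of 16 basis bits
kept, skipped = trivial_skip_count(rel, max_order=3,
                                   soft_best=float(rel[:2].sum()))
```

The DAI rule tightens this test by adding an expected residual soft-weight term over the non-basis bits before the comparison, which is why it skips more aggressively than the Trivial rule.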
Radar, Sonar, Navigation and Array Signal Processing
Research on Interference Suppression Algorithms for Millimeter-Wave Radar in Multi-Interference Environments
TAN Haonan, DONG Mei, CHEN Boxiao
2025, 47(11): 4285-4295.   doi: 10.11999/JEIT250617
[Abstract](176) [FullText HTML](99) [PDF 8837KB](29)
Abstract:
  Objective  With the widespread application of millimeter-wave radar in intelligent driving, mutual interference among radars has become increasingly prominent. Interference signals appear as sharp pulses in the time domain and elevated background noise in the frequency domain, severely degrading target information acquisition and threatening road traffic safety. To address this challenge, this paper proposes a joint envelope recovery-based signal reconstruction algorithm that exploits the time-domain characteristics of signals to enhance target detection performance in multi-interference environments.  Methods  The proposed algorithm consists of two core steps. Step 1: Interference region detection. A dual-criterion mechanism, combining interference envelope detection with transition point detection within the envelope, is employed. This approach substantially improves the accuracy of detecting both interference regions and useful signal segments in multi-interference environments. Step 2: Signal reconstruction. The detected useful signal segments and interference-free portions are used to reconstruct the interference regions. To ensure continuity and improve reconstruction accuracy, the Hilbert transform is applied to perform normalized envelope amplitude coordination on the reconstructed signal.  Results and Discussions  The algorithm first detects interference regions and useful signal segments with high precision through the dual-criterion mechanism, and then reconstructs the interference regions using the detected segments. Simulation results show that the algorithm achieves an interference detection accuracy of 93.7% and a useful signal segment detection accuracy of 97.2%, exceeding comparative algorithms (Table 3). The reconstructed signal effectively eliminates sharp interference pulses in the time domain, smooths the signal amplitude, and markedly improves the Signal-to-Interference-plus-Noise Ratio (SINR) in the frequency domain (Fig. 11). 
Compared with other interference suppression algorithms, the proposed method exhibits superior suppression performance (Fig. 12), achieving an SINR improvement of more than 3 dB in the frequency domain and maintaining better suppression effects across different SINR conditions (Fig. 13). In real-road tests, the algorithm successfully detects multiple interference regions and useful signal segments (Fig. 14) and significantly enhances the SINR after reconstruction (Fig. 15).  Conclusions  This paper proposes a joint envelope recovery-based signal reconstruction algorithm to address inaccurate target detection in multi-interference environments for millimeter-wave radar. The algorithm employs a dual-criterion mechanism to accurately detect interference regions and valid signal segments, and reconstructs the interference regions using the detected useful segments. The Hilbert transform is further applied to achieve collaborative normalization of the signal envelope. Experimental results demonstrate that the algorithm effectively identifies interference signals and reconstructs interference regions in multi-interference scenarios, significantly improving the signal-to-noise ratio, suppressing interference, and enabling accurate target information acquisition. These findings provide an effective anti-jamming solution for intelligent driving systems operating in multi-interference environments.
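A minimal sketch of the detect-and-reconstruct idea, with the dual-criterion mechanism reduced to a single Hilbert-envelope threshold and linear interpolation standing in for the paper's segment-based reconstruction; the signal model and thresholds are illustrative:

```python
import numpy as np
from scipy.signal import hilbert

# Hypothetical dechirped beat signal: one target tone plus two strong
# interference bursts (all parameters illustrative).
fs, n = 1.0e6, 2048
t = np.arange(n) / fs
clean = np.cos(2 * np.pi * 50e3 * t)
received = clean.copy()
received[400:450] += 6 * np.cos(2 * np.pi * 180e3 * t[400:450])
received[1200:1260] += 6 * np.cos(2 * np.pi * 220e3 * t[1200:1260])

# Step 1 (simplified): flag interference samples where the Hilbert envelope
# rises far above the typical signal level. The paper adds a second,
# transition-point criterion within the envelope on top of this.
env = np.abs(hilbert(received))
mask = env > 3 * np.median(env)

# Step 2 (simplified): rebuild the flagged regions from surrounding clean
# samples; the paper instead reuses detected useful signal segments and
# coordinates the normalized envelope amplitude via the Hilbert transform.
idx = np.arange(n)
recon = received.copy()
recon[mask] = np.interp(idx[mask], idx[~mask], received[~mask])
```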
Highly Dynamic Doppler Space Target Situation Awareness Algorithm for Spaceborne ISAR
ZHOU Yichen, WANG Yong, DING Wenjun
2025, 47(11): 4296-4306.   doi: 10.11999/JEIT250667
[Abstract](167) [FullText HTML](72) [PDF 5964KB](40)
Abstract:
  Objective  With the growing number of operational satellites in orbit, Space Situation Awareness (SSA) has become a critical capability for ensuring the safety of space operations. Traditional ground-based radar and optical systems face inherent limitations in tracking deep-space objects due to atmospheric interference and orbital obscuration. Therefore, spaceborne Inverse Synthetic Aperture Radar (ISAR) has emerged as a pivotal technology for on-orbit target characterization, offering all-weather, long-duration observation. However, higher-order Three-Dimensional (3D) spatial-variant range migration and phase errors, caused by the complex relative motion between a spaceborne ISAR platform and its target, can seriously degrade imaging quality. Meanwhile, conventional Two-Dimensional (2D) Range-Doppler (RD) imaging provides valuable intensity distributions of scattering points but remains a projection of the target’s 3D structure. The absence of geometric information limits accurate attitude estimation and collision risk assessment. To address these challenges and achieve more comprehensive SSA, this paper proposes a joint space target imaging and attitude estimation algorithm.  Methods  This paper proposes a joint space target imaging and attitude estimation algorithm composed of three main components: space target imaging characterization, high-resolution imaging, and attitude estimation. First, the imaging characteristics of satellite targets are analyzed to establish the mapping relationship between the image domain and the Doppler parameters of individual scattering points. Second, adaptive segmentation in the 2D image domain combined with high-precision regional compensation is applied to obtain high-resolution imaging results. 
Finally, the spatial distribution characteristics of the Doppler parameters are exploited to derive an explicit expression for the second-order Doppler parameters and to estimate the planar component attitude of the target, such as that of the solar wing.  Results and Discussions  The proposed SSA method achieves high-resolution imaging even in the presence of orbital error and complex 3D spatial-variant Doppler error. Moreover, target attitude estimation can be performed without the need for rectangular component extraction. The effectiveness of the algorithm is verified through three simulation experiments. When the target adopts different attitudes, the method successfully produces both high-resolution imaging results and accurate target attitude estimation (Fig. 7, Fig. 8). To further evaluate performance, comparative simulations are conducted (Fig. 9, Fig. 10). In addition, a method for estimating the long- and short-edge pointing of the satellite solar wing is presented in Section 3.3. The effectiveness of the proposed high-precision imaging algorithm for spinning targets is analyzed in Section 3.4, where the third simulation demonstrates the extended SSA capability of the algorithm (Fig. 11, Fig. 12).  Conclusions  This paper proposes a joint high-resolution imaging and attitude estimation algorithm to address the situational awareness requirements of highly dynamic Doppler space targets. First, the imaging characteristics of satellite targets and the mapping relationship between scattering points and higher-order Doppler parameters are derived. Second, an adaptive region segmentation algorithm is developed to compensate for 3D spatial-variant errors, thereby significantly enhancing imaging resolution. Meanwhile, an explicit correlation between Doppler parameters and satellite attitude is established based on the characteristics of planar components. Simulation results under different imaging conditions confirm the validity and reliability of the algorithm. 
Compared with conventional approaches, the proposed method achieves joint compensation of orbital and rotational errors. Furthermore, the attitude estimation process does not require rectangular component segmentation and remains effective even when rectangular components are partially obscured.
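The link between a scatterer's phase history and its Doppler parameters can be illustrated with a simple estimate: for a noiseless scatterer, the first- and second-order Doppler parameters follow from a least-squares quadratic fit to the unwrapped phase. This is a toy stand-in for the paper's image-domain parameter estimation, with illustrative parameters:

```python
import numpy as np

# Hypothetical scatterer phase history with first- and second-order Doppler
# terms (Doppler centroid f_d and Doppler rate f_r are illustrative).
prf, n = 200.0, 256
t = np.arange(n) / prf
f_d, f_r = 23.0, 7.5
sig = np.exp(1j * 2 * np.pi * (f_d * t + 0.5 * f_r * t ** 2))

# Fit phi(t) = 2*pi*(f_d*t + 0.5*f_r*t^2) to the unwrapped phase:
# the linear coefficient gives f_d, the quadratic coefficient gives f_r.
phase = np.unwrap(np.angle(sig))
c2, c1, _ = np.polyfit(t, phase, 2)
f_d_hat = c1 / (2 * np.pi)
f_r_hat = c2 / np.pi
```

In the paper these second-order Doppler parameters are tied to the attitude of planar components such as the solar wing through their spatial distribution, rather than estimated per scatterer in isolation.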
Electromagnetic Signal Feature Matching Characterization for Constant False Alarm Detection
WANG Zixin, XIANG Houhong, TIAN Bo, MA Hongwei, WANG Yuhao, ZENG Xiaolu, WANG Fengyu
2025, 47(11): 4307-4317.   doi: 10.11999/JEIT250589
[Abstract](166) [FullText HTML](80) [PDF 5514KB](35)
Abstract:
  Objective  Small targets such as unmanned aerial vehicles and unmanned vessels, which exhibit small Radar Cross Section (RCS) values and weak echoes, are difficult to detect due to their low observability. Traditional Constant False Alarm Rate (CFAR) detection is typically represented by the Cell-Averaged (CA) CFAR method, in which the detection threshold is determined by the statistical power parameter of the signal. However, its detection performance is constrained by the Signal-to-Noise Ratio (SNR). This study focuses on how to exploit and apply signal features beyond power parameters to achieve CFAR detection under lower SNR conditions.  Methods  After pulse compression, the envelope of a Linear Frequency Modulation (LFM) signal exhibits sinc characteristics, whereas noise retains its random nature. This difference can be used to distinguish target echoes from non-target signals. On this basis, we propose a constant false alarm detection method based on signal feature matching. First, both the ideal echo signal and the actual echo signal are processed with sliding windows of equal length to generate an ideal sample and a set of test samples. A dual-port fully connected neural network is then constructed to extract the deep feature matching degree between the ideal sample and the test samples. Finally, the constant false alarm threshold is obtained by numerically calculating the deep feature matching parameter from a large number of non-target samples compared with the standard sample.  Results and Discussions  Several sets of simulation experiments are carried out, and measured radar data from different frequency bands are applied to verify the effectiveness of the proposed method. The simulations first confirm that the method maintains stable constant false alarm characteristics (Table 1). The detection performance is then compared with traditional CA-CFAR detection, machine learning approaches, and other deep learning methods. 
The results indicate that, relative to CA-CFAR detection, the proposed method achieves a 2~5 dB gain in equivalent SNR across different false alarm probabilities (Fig. 4). Under mismatched SNR conditions, the method continues to demonstrate robust detection performance with strong generalization capability (Fig. 5). In the processing of measured X-band radar data, the proposed method detects targets that CA-CFAR fails to identify, extending the detection range to 740 distance units, compared with 562 distance units for CA-CFAR, corresponding to an improvement of approximately 28.72% in radar detection capability (Fig. 7, 8). In the case of S-band radar data, the proposed method significantly reduces false alarms (Fig. 10, 11).  Conclusions  This study exploits the difference between target and noise signal envelopes by introducing a feature extraction network that effectively enhances target detection performance. Comparative simulation experiments and the processing of measured radar data across different frequency bands demonstrate the following: (1) the proposed method markedly improves detection performance over traditional CA-CFAR detection, yielding a 2~5 dB gain in equivalent SNR; (2) under mismatched SNR conditions, the method shows strong generalization capability, achieving better detection performance than other deep learning and machine learning approaches; (3) in X-band radar data processing, the method increases detection capability by approximately 28.72%; and (4) in S-band radar data processing, it significantly reduces false alarms. Future work will focus on accelerating the detection process to further improve efficiency.
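For reference, the CA-CFAR baseline against which the method is compared can be sketched as follows; the threshold factor uses the standard exponential-noise derivation, and the window sizes, Pfa, and target amplitude are illustrative:

```python
import numpy as np

def ca_cfar(power, n_train=16, n_guard=2, pfa=1e-3):
    # Cell-averaged CFAR over a 1-D power profile. For N training cells of
    # exponential noise, the threshold factor is alpha = N*(Pfa^(-1/N) - 1).
    n = len(power)
    n_side = n_train // 2
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    detections = np.zeros(n, dtype=bool)
    for i in range(n_side + n_guard, n - n_side - n_guard):
        lead = power[i - n_guard - n_side : i - n_guard]
        lag = power[i + n_guard + 1 : i + n_guard + 1 + n_side]
        noise = (lead.sum() + lag.sum()) / n_train   # local noise estimate
        detections[i] = power[i] > alpha * noise
    return detections

rng = np.random.default_rng(0)
power = rng.exponential(1.0, 512)   # unit-mean noise power profile
power[200] += 40.0                  # strong target echo
hits = ca_cfar(power)
```

The proposed method replaces this power-only statistic with a learned feature matching degree between the test window and the ideal sinc-shaped post-compression envelope.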
Three-Dimensional Imaging Method for Concealed Human Targets Based on Array Stitching
QIU Chen, CHEN Jiahui, SHAO Fengzhi, LI Nian, XU Zihan, GUO Shisheng, CUI Guolong
2025, 47(11): 4318-4330.   doi: 10.11999/JEIT250334
[Abstract](122) [FullText HTML](60) [PDF 7752KB](20)
Abstract:
  Objective  Traditional Through-the-Wall Radar (TWR) systems based on planar multiple-input multiple-output arrays often face high hardware complexity, calibration challenges, and increased system cost. To overcome these limitations, we propose a Three-Dimensional (3D) imaging framework based on array stitching. The method uses either time-sequential or simultaneous operation of multiple small-aperture radar sub-arrays to emulate a large aperture. This strategy substantially reduces hardware complexity while maintaining accurate 3D imaging of concealed human targets.  Methods  The proposed framework integrates three core techniques: 3D weighted total variation (3DWTV) reconstruction, Lucy-Richardson (LR) deconvolution, and 3D wavelet transform (3DWT)-based fusion. Radar echoes are first collected from horizontally and vertically distributed sub-arrays that emulate a planar aperture. Each sub-array image is independently reconstructed using 3DWTV, which enforces spatial sparsity to suppress noise while preserving structural details. The horizontal and vertical images are then multiplicatively fused to jointly recover azimuth and elevation information. To reduce diffraction-induced blurring, LR deconvolution models system degradation through the Point Spread Function (PSF) and iteratively refines scene reflectivity, thereby enhancing cross-range resolution. Finally, 3DWT decomposes the images into multi-scale sub-bands (e.g., LLL, LLH, LHL), which are selectively fused using absolute-maximum and fuzzy-logic rules. The inverse wavelet transform is then applied to reconstruct the final 3D image, retaining both global and local features.  Results and Discussions  The proposed method is evaluated through both simulations and real-world experiments using a Stepped-Frequency Continuous-Wave (SFCW) radar operating from 1.6 to 2.2 GHz with a 2Tx-4Rx configuration. 
In simulations, compared with baseline algorithms such as Back-Projection (BP) and the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), the proposed method achieves better performance. Image Entropy (IE) decreases from 9.7125 for BP and 9.7065 for FISTA to 8.0711, which reflects improved image quality. Experimental tests conducted in indoor environments further confirm robustness. For both standing and sitting postures, IE is reduced from 9.9982 to 7.0030 and from 9.9947 to 6.2261, respectively.  Conclusions  This study presents a low-cost, high-resolution 3D imaging method for TWR systems based on array stitching. By integrating 3DWTV reconstruction, LR deconvolution, and 3DWT fusion, the method effectively reconstructs concealed human postures using a limited aperture. The approach simplifies hardware design, reduces system complexity, and preserves imaging quality under sparse sampling, thereby providing a practical solution for portable and scalable TWR systems.
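The LR deconvolution step can be illustrated in one dimension; the Gaussian PSF, point-scatterer scene, and iteration count below are illustrative, and the paper applies the same multiplicative update to 3D images via the system PSF:

```python
import numpy as np

def lucy_richardson(blurred, psf, n_iter=30):
    # Multiplicative LR update: x <- x * ((blurred / (x * psf)) * mirrored psf),
    # iteratively refining the reflectivity estimate toward the data.
    psf = psf / psf.sum()
    psf_mirror = psf[::-1]
    x = np.full_like(blurred, blurred.mean())
    for _ in range(n_iter):
        conv = np.convolve(x, psf, mode="same")
        ratio = blurred / np.maximum(conv, 1e-12)
        x = x * np.convolve(ratio, psf_mirror, mode="same")
    return x

# Two point scatterers blurred by a Gaussian point spread function.
psf = np.exp(-0.5 * (np.arange(-8, 9) / 2.0) ** 2)
scene = np.zeros(64)
scene[20], scene[40] = 1.0, 0.6
blurred = np.convolve(scene, psf / psf.sum(), mode="same")
restored = lucy_richardson(blurred, psf)
```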
A Waveform Design for Integrated Radar and Jamming Based on Intra-Pulse and Inter-Pulse Multiple-Phase Modulation
ZHANG Shiyuan, LU Xingyu, YAN Huabin, YANG Jianchao, TAN Ke, GU Hong
2025, 47(11): 4331-4339.   doi: 10.11999/JEIT250600
[Abstract](154) [FullText HTML](72) [PDF 3428KB](32)
Abstract:
  Objective  An integrated radar-jamming waveform employing multiple-phase modulation both within pulses (intra-pulse) and between pulses (inter-pulse) is proposed. The design increases the degrees of freedom in waveform synthesis compared with existing integrated signals, thereby improving joint performance in detection and jamming. In detection, phase compensation and complementary synthesis of received echoes are used to reconstruct a Linear Frequency Modulation (LFM) waveform, preserving the range resolution and ambiguity characteristics of LFM. In jamming, multi-parameter control of phase both across and within pulses allows flexible adjustment of the jamming energy distribution in the adversary’s range-Doppler map, enabling targeted energy allocation and concealment strategies. Simulation and experimental results show that the proposed waveform enhances overall detection and jamming performance relative to conventional integrated designs.  Methods  An integrated waveform that combines intra-pulse and inter-pulse multi-phase modulation is proposed. Carefully designed inter-pulse phase perturbations are introduced to prevent jamming energy from concentrating at zero Doppler and to allow precise control of the Doppler distribution of the jamming signal. During echo processing, the inter-pulse perturbations are removed by phase compensation so that inter-pulse complementarity reconstructs a continuous LFM waveform, thereby preserving detection performance. Each pulse is encoded with a binary phase-coded sequence, and additional phase modulation is applied between pulses. The resulting waveform has multiple tunable parameters and increased degrees of freedom, achieves low-sidelobe detection comparable to LFM, and permits flexible allocation of jamming energy across the range-Doppler plane.  Results and Discussions  The proposed integrated waveform is evaluated through simulations and practical experiments. 
Detection performance is significantly enhanced, with the Signal-to-Clutter-Noise Ratio (SCNR) for moving-target detection reaching 63.46 dB, representing a 25.25 dB improvement over conventional integrated waveforms and only 3.57 dB lower than that of a reference LFM signal (67.03 dB). These findings demonstrate that phase compensation and inter-pulse complementarity effectively enhance target detectability. Jamming performance is governed by the range of inter-pulse random phase perturbations. When the perturbation range is 0°, jamming energy is concentrated in the zero-Doppler main lobe, resulting in limited target masking. Expanding the range to ±90° flattens the Doppler spectrum and substantially weakens the target signature. Further extending the range to ±180° eliminates the zero-frequency main peak and achieves near-uniform diffusion of jamming energy across the Doppler domain. Therefore, by varying the inter-pulse phase range, continuous adjustment between concentrated and distributed jamming energy allocation is achieved. Overall, the waveform maintains detection performance comparable to that of optimal LFM signals while enabling flexible, parameterized control of jamming energy distribution. This design provides an adaptable solution for integrated radar-jamming systems that achieves a balance between efficient detection and adaptive jamming capability.  Conclusions  This study is based on a previously proposed integrated radar-jamming waveform and focuses on solving the problem of uneven jamming energy distribution in the unoptimized design. An integrated radar-jamming waveform based on combined intra-pulse and inter-pulse multiple-phase modulation is proposed by introducing random phase modulation between pulses. The proposed waveform achieves detection performance comparable to that of LFM signals and provides flexible control of jamming effects through multiple adjustable parameters, offering high design freedom. 
Theoretical analysis shows that intra-pulse modulation alone is insufficiently adaptable. The addition of random inter-pulse phases with variable distribution ranges enables more precise regulation of jamming energy diffusion. Simulation results indicate that increasing the range of inter-pulse phase perturbation leads to progressively wider diffusion of jamming energy, while detection performance remains similar to that of LFM. Therefore, by adjusting the distribution range of inter-pulse phases, the jamming energy pattern can be flexibly shaped, providing greater degrees of freedom in waveform design. Experimental results verify that the proposed waveform exhibits good overall performance in both detection and jamming. However, its practical application remains limited by specific operational conditions, which will be addressed in future studies.
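The core inter-pulse mechanism admits a compact sketch: the transmitter applies random per-pulse phases, the cooperative receiver compensates the known phases before the slow-time (Doppler) FFT, and an uncompensated receiver sees the energy diffused across Doppler. The pulse counts and chirp rate are illustrative, and the intra-pulse binary phase coding of the actual waveform is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)
n_pulse, n_fast = 64, 128
t = np.linspace(0, 1, n_fast, endpoint=False)
lfm = np.exp(1j * np.pi * 40 * t ** 2)          # baseband LFM pulse (illustrative)

# Inter-pulse random phase perturbations drawn from the +/-180 deg range.
phi = rng.uniform(-np.pi, np.pi, n_pulse)
tx = lfm[None, :] * np.exp(1j * phi)[:, None]   # one row per transmitted pulse

# Cooperative receiver: the perturbations are known, so they are removed by
# phase compensation and the pulse train is fully coherent again.
coherent = tx * np.exp(-1j * phi)[:, None]
dopp_coop = np.abs(np.fft.fft(coherent, axis=0))
# Uncompensated receiver: the random phases diffuse energy across Doppler.
dopp_adv = np.abs(np.fft.fft(tx, axis=0))

peak_coop = dopp_coop[:, 0].max() / dopp_coop[:, 0].mean()
peak_adv = dopp_adv[:, 0].max() / dopp_adv[:, 0].mean()
```

Shrinking the draw range of `phi` toward 0° re-concentrates the uncompensated energy at zero Doppler, mirroring the continuous concentrated-to-distributed adjustment described above.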
A Space-Time Joint Waveform for Frequency Diverse Array Radar with Spatial Linear Frequency Modulation Weighting
LAN Yu, ZHOU Jianxiong
2025, 47(11): 4340-4350.   doi: 10.11999/JEIT250561
[Abstract](90) [FullText HTML](39) [PDF 6786KB](10)
Abstract:
  Objective  Frequency Diverse Array (FDA) radar exhibits a fast time-varying beampattern and a space-time coupled steering vector, offering potential advantages for multi-target tracking, wide-area surveillance, and mainlobe interference suppression. However, the beampattern of conventional coherent FDA radar is narrow, resulting in a shorter beam dwell time than that of phased arrays. This limitation prevents the ambiguity function of conventional coherent FDA from achieving both high range resolution and low sidelobe level simultaneously. When the baseband signal is modulated with a Linear Frequency Modulation (LFM) waveform, the ambiguity function presents low range resolution and low sidelobe level. Conversely, when the baseband signal is modulated with a phase-coded waveform, it achieves high range resolution but exhibits high sidelobe levels with strip-like high-gain sidelobes. The degradation in range resolution or sidelobe performance significantly constrains detection capability. To address this problem, this study proposes a novel space-time joint FDA waveform with spatial LFM weighting, which simultaneously achieves high range resolution, low sidelobe level, and reduced Doppler sensitivity.  Methods  The spatial-domain modulation scheme and the time-domain baseband waveform are two interdependent factors that determine the ambiguity function performance of FDA radar. Selecting a time-domain baseband waveform with a thumbtack-shaped ambiguity function enables the range resolution to remain independent of space-time coupling. By modulating the spatial weighting phase, the beampattern shape of the FDA can be adjusted to extend beam dwell time, suppress strip-like high-gain sidelobes, and smooth sidelobe energy distribution. The proposed space-time joint waveform thus achieves both high range resolution and low sidelobe level. Doppler tolerance is another key metric for evaluating ambiguity function performance. 
A space-time joint waveform with spatial phase-coded weighting exhibits high Doppler sensitivity, leading to significantly elevated sidelobe levels and sharp reductions in transmit beamforming gain. In contrast, the spatial LFM weighting method proposed in this study enhances Doppler tolerance while maintaining desirable range and sidelobe characteristics.  Results and Discussions  By combining the spatial LFM weighting method with a time-domain baseband waveform exhibiting a thumbtack-shaped ambiguity function (e.g., a phase-coded waveform), this study addresses the limitation of conventional coherent FDA waveforms, which cannot simultaneously achieve high range resolution and low sidelobe level. The proposed waveform demonstrates robust pulse compression performance, even under target motion. Simulation experiments are conducted to analyze the ambiguity functions under both stationary and motion conditions, and the results are summarized as follows: (1) The average sidelobe levels near the target peak for the space-time joint FDA waveform with spatial LFM weighting and spatial phase-coded weighting are both approximately –30 dB (Fig. 3(a)(b)). In comparison, the average sidelobe level near the target peak for the spatial phase-coded weighting FDA using a time-domain LFM baseband waveform is about –20 dB (Fig. 3(c)), while that of the coherent FDA with a time-domain phase-coded waveform is about –12 dB (Fig. 3(d)). Thus, the two space-time joint FDA waveforms achieve the lowest average sidelobe levels. (2) The imaging results of both space-time joint FDA waveforms show no strip-like high-gain sidelobes (Fig. 4(a)(b)). By contrast, the spatial phase-coded weighting FDA and the coherent FDA with a time-domain phase-coded waveform both display prominent high-gain sidelobes (Fig. 4(c)(d)). These sidelobes from high Signal-to-Noise Ratio (SNR) targets can obscure nearby low-SNR targets. (3) All four FDA waveforms achieve a range resolution of 0.75 m (Fig. 5), corresponding to a bandwidth of 200 MHz. (4) Under motion conditions, the space-time joint FDA waveform with spatial phase-coded weighting exhibits a notable increase in peak sidelobe level compared with stationary conditions (Fig. 6(a)). In contrast, the space-time joint FDA waveform with spatial LFM weighting maintains the lowest peak sidelobe level among all four FDA configurations (Fig. 6(b)).  Conclusions  This study proposes a space-time joint FDA waveform with spatial LFM weighting. The proposed waveform effectively resolves the issue of degraded range resolution in conventional coherent FDA systems, ensuring that range resolution depends solely on bandwidth. It also eliminates the strip-like high-gain sidelobes commonly observed in conventional FDA waveforms. Under simulation conditions, the average sidelobe level near the target peak is reduced by approximately 10 dB and 18 dB compared with those of the spatial phase-coded weighting FDA and the coherent FDA with a time-domain phase-coded waveform, respectively. This reduction substantially mitigates the masking of low-SNR targets by sidelobes from high-SNR targets and demonstrates strong Doppler tolerance. However, under relative motion conditions, the proposed waveform exhibits Doppler-angle coupling, which will be addressed in future research through the development of coupling mitigation strategies.
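The beam-broadening effect of spatial LFM (quadratic-phase) weighting can be illustrated on a plain narrowband array factor, setting aside the FDA frequency offsets and fast-time dependence; the element count and spatial chirp rate are illustrative:

```python
import numpy as np

n_el = 32
d_lambda = 0.5                                  # element spacing in wavelengths
theta = np.linspace(-np.pi / 2, np.pi / 2, 1801)
n_idx = np.arange(n_el)
steer = np.exp(1j * 2 * np.pi * d_lambda
               * n_idx[:, None] * np.sin(theta)[None, :])

w_uniform = np.ones(n_el)
k_lfm = 0.005                                   # hypothetical spatial chirp rate
w_lfm = np.exp(1j * np.pi * k_lfm * n_idx ** 2) # spatial LFM (quadratic) phase

af_uniform = np.abs(w_uniform @ steer) / n_el
af_lfm = np.abs(w_lfm.conj() @ steer) / n_el

def width_3db(af):
    # Number of angle samples within 3 dB of the pattern peak.
    return int(np.count_nonzero(af > af.max() / np.sqrt(2)))

bw_u, bw_l = width_3db(af_uniform), width_3db(af_lfm)
```

The quadratic phase trades peak gain for a wider mainlobe, which is the mechanism behind the extended beam dwell time discussed above; the full waveform additionally couples this weighting with the per-element frequency offsets.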
Detection and Localization of Radio Frequency Interference via Cross-domain Multi-feature from SAR Raw Data
FU Zewen, WEI Tingting, LI Ningning, LI Ning
2025, 47(11): 4351-4362.   doi: 10.11999/JEIT250701
Abstract:
  Objective  The increasing congestion of the electromagnetic spectrum presents major challenges for Synthetic Aperture Radar (SAR) systems, where Radio Frequency Interference (RFI) can severely degrade imaging quality and compromise interpretation accuracy. Existing detection methods have critical limitations: time-domain approaches are insensitive to weak interference, whereas transform-domain methods perform poorly in characterizing broadband interference. This study develops a cross-domain framework that integrates complementary features from multiple domains, enabling robust RFI detection and accurate localization. The proposed approach addresses the deficiencies of single-domain methods and provides a reliable solution for operational SAR systems.  Methods  This study introduces two methodological innovations. First, a weighted feature fusion framework combines the first-order derivatives of time-domain kurtosis and skewness using Principal Component Analysis (PCA)-optimized weights, thereby capturing both global statistical distributions and local dynamic variations. Second, a differential time-frequency analysis technique applies the Short-Time Fourier Transform (STFT) with logarithmic ratio operations and adaptive thresholding to achieve sub-pulse interference localization. The overall workflow integrates K-means clustering for initial detection, STFT-based feature enhancement, binary region identification, and Inverse STFT (ISTFT) reconstruction. The proposed approach is validated against three state-of-the-art methods using both simulated data and Sentinel-1 datasets.  Results and Discussions  Experimental results demonstrate marked improvements across all evaluation metrics. For simulated data, the proposed method achieves a signal accuracy (SA) of 98.56% and a False Alarm (FA) rate of 0.65% (Table 2), representing a 3.13% gain in SA compared with conventional methods. 
The Root Mean Square Error (RMSE) reaches 0.1902 (Table 3), corresponding to a 10.9% improvement over existing techniques. Visual analysis further confirms more complete interference detection (Fig. 2) and cleaner suppression results (Figs. 4 and 7), with target features preserved. For measured data, the method maintains robust performance, achieving a gray entropy of 0.7843 (Table 5), and effectively mitigating the severe FAs observed in traditional approaches (Fig. 8).  Conclusions  In complex and dynamic electromagnetic environments, traditional RFI detection methods often show inaccuracies or even fail when processing NarrowBand Interference (NBI) or WideBand Interference (WBI), limiting their operational applicability. To address this challenge, this study proposes an engineering-oriented interference detection method designed for practical SAR operations. By combining time-domain kurtosis with the first derivative of skewness, the approach significantly enhances detection accuracy and adaptability. Furthermore, a localization technique is introduced that precisely identifies interference positions. Using time-frequency domain analysis, the method calculates differential values between the time-frequency representations of echo signals with and without interference, and determines interference locations through threshold-based judgment. Extensive simulations and Sentinel-1 experiments confirm the universality and effectiveness of the proposed method in both detection and localization.
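The differential time-frequency localization step can be sketched as follows (a minimal Python illustration, not the authors' implementation: the frame sizes, the toy narrowband burst, and the mean-plus-three-sigma adaptive threshold are all assumptions):

```python
import numpy as np

def stft_mag(x, nfft=64, hop=32):
    # Magnitude STFT via framed FFT with a Hann window (minimal sketch).
    win = np.hanning(nfft)
    frames = [x[i:i + nfft] * win for i in range(0, len(x) - nfft + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)).T  # freq x time

rng = np.random.default_rng(1)
n = 1024
clean = rng.standard_normal(n)                 # interference-free reference echo
rfi = np.zeros(n)
rfi[400:600] = 5 * np.sin(2 * np.pi * 0.25 * np.arange(200))  # NBI burst
contaminated = clean + rfi

# Differential (log-ratio) time-frequency map: large values flag
# time-frequency cells dominated by interference.
S_ref = stft_mag(clean)
S_obs = stft_mag(contaminated)
log_ratio = np.log10((S_obs + 1e-12) / (S_ref + 1e-12))

# Adaptive threshold (an assumption here): mean + 3*std of the ratio map.
mask = log_ratio > log_ratio.mean() + 3 * log_ratio.std()
flagged_frames = np.where(mask.any(axis=0))[0]  # sub-pulse time localization
```

With a hop of 32 samples, the flagged frame indices localize the burst to within one analysis frame, which is the sub-pulse granularity the abstract refers to.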
Dynamic Inversion Algorithm for Rainfall Intensity Based on Dual-Mode Microwave Radar Combined with Rain Gauge
ZHANG Qishuo, ZHANG Wenxin, GAO Mengyu, XIONG Fei
2025, 47(11): 4363-4372.   doi: 10.11999/JEIT250535
Abstract:
  Objective  Microwave meteorological radar has broad application potential in rainfall detection due to its non-contact measurement, high spatiotemporal resolution, and multi-parameter retrieval capability. However, in the context of climate change, increasingly complex rainfall events require monitoring systems to deliver high-precision, multi-dimensional, real-time data to support disaster warning and climate research. Conventional single-mode radars, constrained by fixed functionalities, cannot fully meet these requirements, which has led to the development of multi-mode radar technology. The dual-mode radar examined in this study employs Frequency Modulated Continuous Wave (FMCW) and Continuous Wave (CW) modes. These modes adopt different algorithmic principles for raindrop velocity measurement: FMCW enables spatially stratified detection and strong anti-interference performance, whereas CW provides more accurate measurements of raindrop fall speed, yielding integral rainfall information in the vertical column. Despite these advantages, retrieval accuracy remains limited by the reliance of traditional algorithms on fixed empirical parameters, which restrict adaptability to regional climate variations and dynamic microphysical precipitation processes, and hinder real-time response to variations in rain Drop Size Distribution (DSD). Ground rain gauges, by contrast, provide near-true reference data through direct measurement of rainfall intensity. To address the above challenges, this paper proposes a dynamic inversion algorithm that integrates dual-mode (FMCW-CW) radar with rain gauge data, enhancing adaptability and retrieval accuracy for rainfall monitoring.  Methods  Two models are developed for the two radar modes. For the FMCW mode, which can retrieve DSD parameters, a fusion algorithm based on Attention integrated with a double-layer Long Short-Term Memory (LSTM) network (LSTM-Attention-LSTM) is proposed. 
The first LSTM extracts features from DSD data and rain gauge–measured rainfall intensity through its hidden state output, with a dropout layer applied to randomly discard neurons and reduce overfitting. The Attention mechanism calculates feature similarity using dot products and converts it into attention weights. The second LSTM then processes the time series and integrates the hidden-layer features, which are passed through a fully connected layer to generate the retrieval results. For the CW mode, which cannot directly retrieve DSD parameters and is constrained to the reflectivity factor-rainfall rate (Z-R) relationship (Z = aR^b), an algorithm based on the Extended Kalman Filter (EKF) is proposed to optimize this relationship. The method dynamically models the Z-R parameters, computes the residual between predicted rainfall intensity and rain gauge observations, and updates the prior estimates accordingly. Physical constraints are applied to parameters a and b during state updates to ensure consistency with physical laws, thereby enabling accurate fitting of the Z-R relationship.  Results and Discussions  Experimental results show that both models enhance the accuracy of rainfall intensity retrieval. For the FMCW mode, the LSTM-Attention-LSTM model applied to the test dataset outperforms traditional physical models, single-layer LSTM, and double-layer LSTM. It effectively captures the temporal variation of rainfall intensity, with the absolute error relative to observed values remaining below 0.25 mm/h (Fig. 5). Compared with the traditional physical model, the LSTM-Attention-LSTM reduces RMSE and MAE by 46% and 38%, achieving values of 0.1623 mm/h and 0.147 mm/h, respectively, and increases R² by 14.5% to 0.95 (Table 2). For the CW mode, the Z-R relationship optimized by the EKF model provides the best fit for the Z and R distribution in the validation dataset (Fig. 6). 
Rainfall intensity retrieved with this algorithm on the test set exhibits the smallest deviation from actual observations compared with convective cloud empirical formulas, Beijing plain area empirical formulas, and the dynamic Z-R method. The corresponding RMSE, MAE, and R² reach 0.1076 mm/h, 0.094 mm/h, and 0.972, respectively (Fig. 7; Table 4).  Conclusions  This study proposes two multi-source data fusion schemes that integrate dual-mode radar with rain gauges for short-term rainfall monitoring. Experimental results confirm that both methods significantly improve the accuracy of rainfall intensity retrieval and demonstrate strong dynamic adaptability and robustness.
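The EKF-based refinement of the Z-R parameters described above can be sketched roughly as follows (a hypothetical Python illustration; the noise covariances, constraint bounds, and the "true" relation Z = 300·R^1.4 are demo assumptions, not values from the paper):

```python
import numpy as np

def ekf_zr_update(x, P, Z, R_gauge, Q=np.diag([1e-2, 1e-4]), meas_var=0.05):
    # State x = [a, b] of the Z-R relation Z = a * R**b (random-walk model).
    # Measurement model: rain rate predicted from reflectivity,
    # h(x) = (Z / a)**(1 / b); the residual against the gauge reading
    # drives the update.
    a, b = x
    P = P + Q                                  # state prediction covariance
    h = (Z / a) ** (1.0 / b)
    dh_da = -h / (a * b)                       # analytic Jacobian terms
    dh_db = -h * np.log(Z / a) / b**2
    H = np.array([[dh_da, dh_db]])
    S = (H @ P @ H.T).item() + meas_var
    K = (P @ H.T) / S                          # 2x1 Kalman gain
    x_new = x + (K * (R_gauge - h)).ravel()
    P_new = (np.eye(2) - K @ H) @ P
    # Physical constraints on a and b, as the paper applies during updates
    x_new[0] = np.clip(x_new[0], 10.0, 1000.0)
    x_new[1] = np.clip(x_new[1], 1.0, 3.0)
    return x_new, P_new

# Drive the filter toward an assumed "true" relation Z = 300 * R**1.4,
# starting from a generic a = 200, b = 1.6 guess.
x, P = np.array([200.0, 1.6]), np.diag([100.0, 0.25])
for R_true in [1.0, 2.0, 5.0, 3.0, 8.0, 4.0] * 20:
    x, P = ekf_zr_update(x, P, 300.0 * R_true ** 1.4, R_true)
```

The clipping step mirrors the paper's physical-constraint idea: a and b stay inside plausible microphysical ranges no matter how large an individual residual is.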
Quasi-Vortex Electromagnetic Wave Radar Forward-Looking Imaging based on Echo Phase Weighting
SHU Gaofeng, WEI Yixin, LI Ning
2025, 47(11): 4373-4383.   doi: 10.11999/JEIT250542
Abstract:
  Objective  Forward-looking radar imaging plays a critical role in multiple applications. Numerous algorithms have been proposed to enhance azimuth resolution; however, improvement remains difficult due to the limitations imposed by antenna aperture. Existing high-resolution techniques, including synthetic aperture radar and Doppler beam sharpening, rely on Doppler bandwidth and inevitably create blind spots in the forward-looking region. Vortex electromagnetic waves carrying orbital angular momentum offer potential in forward-looking scenarios because of the orthogonality between different orbital angular momentum modes. In conventional vortex electromagnetic wave imaging, a Uniform Circular Array (UCA) is used to generate and transmit multi-mode vortex electromagnetic waves. Yet, the UCA-generated waves suffer from main lobe divergence, which disperses energy and weakens echo signals, while multi-mode transmission increases system complexity. To address these issues, this paper proposes a Quasi-Circular Array (QCA) that reduces system complexity, produces vortex electromagnetic waves with more concentrated main lobes, and preserves phase linearity. In addition, a post-processing method based on echo phase weighting is introduced. By applying phase modulation to the single-mode echo received by each antenna element, a complete equivalent multi-mode echo is synthesized. The proposed method enhances azimuth resolution and exhibits strong anti-noise performance.  Methods  To obtain clear images under low Signal-to-Noise Ratio (SNR) conditions, a phase modulation echo post-processing method combined with a QCA is proposed. The QCA first generates a single-mode vortex electromagnetic wave to illuminate the region of interest. Each element of the array then receives and stores the echo. 
Phase modulation is subsequently applied to the stored echo to generate signals of specific modes, thereby synthesizing an equivalent multi-mode echo with enhanced amplitude that preserves target information. This approach demonstrates strong potential for practical applications in forward-looking radar imaging under low SNR conditions.  Results and Discussions  When noise is added to the echo and imaging is performed (Figure 11), the proposed method achieves superior results under noisy conditions. As noise intensity increases, a clear target can still be reconstructed at an SNR of –10 dB. Even when the SNR is reduced to –15 dB and the target is submerged in noise, the contour features of the reconstructed target remain distinguishable. These results demonstrate that the method has strong anti-noise performance. In addition, when imaging is performed within a smaller mode range, the azimuth resolution achieved by the proposed method improves by an average factor of 2.2 compared with the traditional method (Figure 9). The improvements in resolution and anti-noise performance can be attributed to two factors: (1) The vortex electromagnetic waves generated by the QCA experience reduced destructive interference due to the asymmetric spatial distribution of array elements, producing waves with more concentrated main lobes, lower side lobes, and higher radiation gain. (2) Applying phase modulation in echo processing reduces the pulse repetition frequency of the vortex electromagnetic wave at the transmitting end, thereby lowering system complexity.  Conclusions  This study proposes a method capable of effective imaging under low SNR conditions. The echo expression of the electric field generated by the QCA is derived, and the radiation gain and phase characteristics of the quasi-vortex electromagnetic wave are analyzed. In addition, an echo post-processing method based on phase modulation is introduced. 
Simulation results demonstrate that, compared with the traditional UCA method, the proposed approach generates vortex electromagnetic waves with more concentrated main lobes, lower side lobes, and higher gain, while improving azimuth resolution by a factor of 2.2. Even at an SNR of –15 dB, the reconstructed imaging results remain distinguishable.
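The echo phase-weighting idea, reduced to a plain uniform circle for simplicity, can be illustrated as follows (a toy Python sketch; the element count, mode index, and echo model are assumptions, and the paper's QCA geometry is not reproduced here):

```python
import numpy as np

def synthesize_mode(element_echoes, angles, l):
    # Weight each element's stored single-mode echo by exp(-1j*l*phi_n)
    # and sum, emulating an equivalent mode-l vortex echo at the receiver
    # (the phase-weighting post-processing idea, sketched).
    weights = np.exp(-1j * l * angles)
    return (weights[:, None] * element_echoes).sum(axis=0)

N = 8
angles = 2 * np.pi * np.arange(N) / N    # element azimuths on the circle
t = np.arange(64)
l0 = 2  # assumed azimuthal mode imprinted on the echo by the target
element_echoes = np.exp(1j * 0.2 * t)[None, :] * np.exp(1j * l0 * angles)[:, None]

# Only the matched mode combines coherently (OAM mode orthogonality):
gains = {l: np.abs(synthesize_mode(element_echoes, angles, l)).max()
         for l in range(-3, 4)}
```

Because the weighting for the matched mode adds all N elements coherently while every mismatched mode sums to zero on a uniform circle, a full bank of equivalent multi-mode echoes can be formed from one transmitted mode.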
A 3D Underwater Target Tracking Algorithm with Integrated Grubbs-Information Entropy and Improved Particle Filter
CAI Fanglin, WANG Ji, QIU Haowei
2025, 47(11): 4384-4393.   doi: 10.11999/JEIT250249
Abstract:
  Objective  To address the limited target tracking accuracy of traditional Particle Filter (PF) algorithms in three-dimensional Underwater Wireless Sensor Networks (UWSNs) under abnormal conditions, this study proposes a three-dimensional underwater target tracking algorithm (OGIE-IPF). The algorithm integrates an optimized Grubbs criterion-based information entropy-weighted data fusion with an Improved Particle Filter (IPF). Conventional PF algorithms often suffer from particle degeneracy and impoverishment, which restrict estimation accuracy. Although weight optimization strategies introduced during resampling can enhance particle diversity, existing approaches mainly rely on fixed weighting factors that cannot dynamically adapt to real-time operating conditions. Moreover, current anomaly detection methods for multi-source data fusion fail to effectively address data coupling and heteroscedasticity across dimensions. To overcome these challenges, a dynamic adaptive hierarchical weight optimization strategy is designed for the resampling phase, enabling adaptive particle weighting across hierarchy levels. Additionally, a Mahalanobis distance discrimination mechanism is incorporated into the Grubbs criterion-based anomaly detection method, achieving effective multi-dimensional anomaly detection through covariance-sensitive analysis.  Methods  The proposed OGIE-IPF algorithm enhances target tracking accuracy under underwater abnormal conditions through a distributed data processing framework that integrates multi-source data fusion and adaptive filtering. First, the Unscented Kalman Filter (UKF) is incorporated into the particle filtering framework to construct the importance density function, thereby alleviating particle degeneracy. Simultaneously, a dynamic adaptive hierarchical weight optimization mechanism is proposed during the resampling phase to improve particle diversity. 
Second, the Mahalanobis distance replaces the conventional standardized residual method in the standard Grubbs criterion for anomaly statistic construction. By incorporating the covariance matrix of multidimensional variables, the method achieves effective anomaly detection for multi-dimensional data. Finally, local target tracking is performed using the IPF combined with the optimized Grubbs criterion for anomaly detection and sensor credibility evaluation, whereas global state estimation is realized through an information entropy-weighted multi-source fusion algorithm.  Results and Discussions  The IPF developed in this study is designed to enhance particle set diversity through optimization of the importance density function and refinement of the resampling strategy. To evaluate algorithm performance, a comparative experimental group with a particle population of 100 is established. Simulation results indicate that the weight distribution variances of the IPF at specific time points and over the entire tracking period are reduced by approximately 98.27% and 97.26%, respectively, compared with the traditional PF (Figs. 2 and 3). These findings suggest that the improved strategy effectively regulates particles with varying weights, resulting in a balanced distribution across hierarchical weight levels. Sensor anomalies are simulated by introducing substantial perturbations in observation noise. The experimental data show that the OGIEWF algorithm maintains optimal error metrics throughout the operational period (Figs. 4 and 5), demonstrating superior capability in suppressing abnormal noise interference. To further assess algorithm robustness, two representative scenarios under low-noise and high-noise conditions are constructed for multi-algorithm comparison. 
The results indicate that OGIE-IPF achieves Root Mean Square Error (RMSE) reductions of 79.78%, 66.78%, and 56.41% compared with the PF, Extended Particle Filter (EPF), and Unscented Particle Filter (UPF) under low-noise conditions, and reductions of 83.41%, 70.38%, and 21.68% under high-noise conditions (Figs. 8 and 11).  Conclusions  The OGIE-IPF algorithm proposed in this study enhances target tracking accuracy in three-dimensional underwater environments through two synergistic mechanisms. First, tracking precision is improved by refining the PF framework to optimize the intrinsic accuracy of the filtering process. Second, data fusion reliability is strengthened via an anomaly detection framework that mitigates interference from erroneous sensor measurements. Simulation results confirm that the OGIE-IPF algorithm produces state estimations more consistent with ground truth trajectories than conventional PF, EPF, and UPF algorithms, achieving lower RMSE and maintaining stable tracking performance under limited particle populations and abnormal noise conditions. Future work will extend the model to incorporate dynamic marine environmental factors and address the effects of malicious node interference within underwater network security systems.
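The covariance-sensitive anomaly statistic at the heart of the optimized Grubbs criterion can be sketched as follows (a minimal Python illustration; the mean-plus-k-sigma threshold rule and the simulated sensor fault are assumptions, not the paper's exact test statistic):

```python
import numpy as np

def mahalanobis_flags(X, k=3.0):
    # Squared Mahalanobis distance of each sample to the sample mean:
    # a covariance-sensitive stand-in for the standardized residual of
    # the classical Grubbs criterion, extended to coupled,
    # heteroscedastic multi-dimensional data.
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X - mu
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    return d2, d2 > d2.mean() + k * d2.std()

rng = np.random.default_rng(0)
# 3-D sensor stream with very different per-axis variances
X = rng.multivariate_normal([0.0, 0.0, 0.0], np.diag([1.0, 4.0, 0.25]), size=200)
X[50] = [8.0, -10.0, 4.0]  # injected sensor fault
d2, flags = mahalanobis_flags(X)
```

Because the statistic whitens the data through the inverse covariance, a deviation of 4 units on the low-variance axis counts far more than the same deviation on the high-variance axis, which per-axis standardized residuals cannot capture.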
Electromagnetic Finite-Difference Time-Domain Scattering Analysis of Multilayered/Porous Materials in Specific Geometric Meshing
ZHANG Yuxian, YANG Zijiang, HUANG Zhixiang, FENG Xiaoli, FENG Naixing, YANG Lixia
2025, 47(11): 4394-4404.   doi: 10.11999/JEIT250348
Abstract:
The Finite-Difference Time-Domain (FDTD) method is a widely used tool for analyzing the electromagnetic properties of dielectric media, but its application is often constrained by model complexity and mesh discretization. To enhance the efficiency of electromagnetic scattering simulations in multilayered/porous materials, we propose an accelerated FDTD scheme in this paper. Computational geometry algorithms can be employed with the proposed method to rapidly generate Yee’s grids, utilizing a three-dimensional voxel array to define material distributions and field components. By exploiting the voxel characteristics, parallel algorithms are employed to efficiently compute Radar Cross Sections (RCS) for non-analytical geometries. In contrast to conventional volumetric mesh generation, which relies on analytic formulas, this work integrates ray-intersection techniques with Signed Distance Functions (SDFs). Calculations of tangent planes and intersection points minimize invalid traversals and reduce computational complexity, thus expediting grid-based electromagnetic parameter assignment for porous and irregular structures. The approach is applied to the RCS calculations of multilayered/porous models, demonstrating excellent consistency with results from popular commercial solvers (FEKO, CST, HFSS) while offering substantially higher efficiency. Numerical experiments confirm significant reductions in computation time and computer memory without compromising accuracy. Overall, the proposed acceleration scheme enhances the FDTD method’s ability to handle complex dielectric structures, providing an effective balance between computational speed and accuracy, and offering innovative solutions for rapid mesh generation and processing of complex internal geometries.  
  Objective   The FDTD method, a reliable approach for computing the electromagnetic properties of dielectric media, faces constraints in computational efficiency and accuracy due to model structure and mesh discretization. A major challenge in the field is achieving efficient electromagnetic scattering analysis with minimal computational resources while maintaining sufficient wavelength sampling resolution. To address this difficulty, we propose an FDTD-based electromagnetic analysis acceleration scheme that enhances simulation efficiency by significantly improving mesh generation and optimizing grid partitioning for complex multilayered/porous models.  Methods   In this study, Yee’s grids for complex materials are efficiently generated using computational geometry algorithms and a 3D voxel array to define material distribution and field components. A parallel algorithm leverages voxel data to accelerate RCS calculations for non-analytical geometries. Unlike conventional volumetric meshing methods that rely on analytic formulas, this approach integrates ray-intersection techniques with SDFs. Calculations of tangent planes and intersection points further reduce invalid traversals and geometric complexity, facilitating faster grid-based assignment of electromagnetic parameters. Numerical experiments validate that the method effectively supports porous and multilayered non-analytical structures, demonstrating both high efficiency and accuracy.  Results and Discussions   The accelerated volumetric meshing algorithm is validated using a Boeing 737 model, showing more than a 67.5% reduction in computation time across different resolutions. Efficiency decreases at very fine meshes because of heavier computational loads and suboptimal valid-grid ratios. The method is further evaluated on three multilayered/porous structures, achieving 85.55% faster computation and 9.8% lower memory usage compared with conventional FDTD. 
In comparison with commercial solvers (FEKO, CST, HFSS), equivalent accuracy is maintained while runtimes are reduced by 87.58% and memory consumption by 81.6%. In all tested cases, errors remain below 6% relative to high-resolution FDTD, confirming that the proposed acceleration scheme provides both high efficiency and reliable accuracy.  Conclusions   In this study, we optimize volumetric mesh generation in FDTD through computational geometry algorithms. By combining ray-intersection techniques with reliable SDFs, the proposed approach efficiently manages internal cavities, while tangent-plane calculations minimize traversal operations and complexity, thereby accelerating scattering analysis. The scheme extends the applicability of FDTD to a broader range of dielectric structures and materials, delivering substantial savings in computation time and memory without compromising accuracy. Designed to support universal geometric model files, the framework shows strong potential for stealth optimization of multi-material structures and the development of electromagnetic scattering systems. It represents an important step toward integrating computational geometry with computational electromagnetics.
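The SDF-driven voxel assignment underlying the meshing scheme can be illustrated on a layered sphere (a minimal Python sketch; the geometry, resolution, and centre-inside test are illustrative assumptions, and the paper's ray-intersection and tangent-plane accelerations are not shown):

```python
import numpy as np

def sphere_sdf(points, radius):
    # Signed distance to an origin-centred sphere: negative inside,
    # positive outside, zero on the surface.
    return np.linalg.norm(points, axis=-1) - radius

def voxelize(sdf, lo, hi, n):
    # Mark voxels whose centres satisfy sdf <= 0: a minimal sketch of
    # SDF-driven material assignment on a uniform Yee-style grid.
    centers = np.linspace(lo, hi, n, endpoint=False) + (hi - lo) / (2 * n)
    X, Y, Z = np.meshgrid(centers, centers, centers, indexing='ij')
    return sdf(np.stack([X, Y, Z], axis=-1)) <= 0.0

# A layered (hollow) sphere built by CSG subtraction of two SDFs,
# standing in for a shell of one material around an internal cavity.
shell = lambda p: np.maximum(sphere_sdf(p, 0.8), -sphere_sdf(p, 0.5))
inside = voxelize(shell, -1.0, 1.0, 40)
fill_fraction = inside.mean()  # analytic value: (4/3)*pi*(0.8**3 - 0.5**3)/8
```

The `np.maximum(outer, -inner)` composition is the standard SDF Boolean subtraction, which is what lets a single inside test handle internal cavities without any per-face bookkeeping.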
Cryptography and Network Information Security
Edge Network Data Scheduling Optimization Method Integrating Improved Jaya and Cluster Center Selection Algorithm
YANG Wensheng, PAN Chengsheng
2025, 47(11): 4405-4418.   doi: 10.11999/JEIT250317
Abstract:
  Objective  The rapid advancement of technologies such as artificial intelligence and the Internet of Things has placed increasing strain on traditional centralized cloud computing architectures, which struggle to meet the communication and computational demands of large-scale data processing. Due to the physical separation between cloud servers and end-users, data transmission typically incurs considerable latency and energy consumption. Therefore, edge computing, which deploys computing and storage resources closer to users, has emerged as a viable paradigm for supporting data-intensive and latency-sensitive applications. However, effectively addressing the challenges of data-intensive services in edge computing environments, such as efficient edge node clustering and resource scheduling, remains a key issue. This study proposes a data scheduling optimization method for edge networks that integrates an improved Jaya algorithm with a cluster center selection strategy. Specifically, for data-intensive services, the method partitions edge nodes into clusters and identifies optimal cluster centers. Data are first aggregated at these centers before being transmitted to the cloud. By leveraging cluster-based aggregation, the method facilitates more efficient data scheduling and improved resource management in edge environments.  Methods  The proposed edge network data scheduling optimization method comprises two core components: a shortest-path selection algorithm based on an improved Jaya algorithm and an optimal cluster center selection algorithm. The scheduling framework accounts for both the shortest communication paths among edge nodes and the availability of network resources. The improved Jaya algorithm incorporates a cosine-based nonlinear decay function and a multi-stage search strategy to dynamically optimize inter-node paths. 
The nonlinear decay function modulates the variation of random factors across iterations, allowing adaptive adjustment of the algorithm’s exploration capacity. This mechanism helps prevent premature convergence and reduces the likelihood of becoming trapped in local optima during the later optimization stages. To further enhance performance, a multi-stage search strategy divides the optimization process into two phases: an exploration phase during early iterations, which prioritizes global search across the solution space, and an exploitation phase during later iterations, which refines solutions locally. This staged approach improves the trade-off between convergence speed and solution accuracy, increasing the algorithm’s robustness in complex edge network environments. Based on the optimized paths and available bandwidth, a criterion is established for selecting the initial cluster center. Subsequently, a selection scheme for additional cluster centers is formulated by evaluating inter-cluster center distances. Finally, a partitioning method assigns edge nodes to their respective clusters based on the optimized topology.  Results and Discussions  The simulation experiments comprise two parts: performance evaluation of the improved Jaya algorithm (Jaya*) and analysis of the cluster partitioning scheme. To assess convergence speed and optimization accuracy, three benchmark test functions are used to compare Jaya* with four existing algorithms: Simulated Annealing (SA), Genetic Algorithm (GA), Ant Colony Optimization (ACO), and the standard Jaya algorithm. Building on these results, two additional experiments—cluster center selection and cluster partitioning—are conducted to evaluate the feasibility and effectiveness of the proposed optimal cluster center selection algorithm for resource scheduling. 
A parameter sensitivity analysis using the multi-modal Rastrigin function is performed to investigate the effects of different population sizes and maximum iteration counts on optimization accuracy and stability (Table 2 and Table 3). The optimal configuration is determined to be pop_size = 50 and t_max = 500, which achieves a favorable balance between accuracy and computational efficiency. Subsequently, a multi-algorithm comparison experiment is carried out under consistent conditions. The improved Jaya algorithm outperforms the four alternatives in convergence speed and optimization accuracy across three standard functions: Sphere (Fig. 4), Rastrigin (Fig. 5), and Griewank (Fig. 6). The algorithm also demonstrates superior stability. Its convergence trajectory is characterized by a rapid initial decline followed by gradual stabilization in later stages. Based on these findings, the cluster center selection algorithm is applied to tactical edge networks of 25, 38, and 50 nodes (Fig. 7). The parameter mi is calculated (Fig. 8), and various numbers of cluster centers are set to complete center selection and cluster member assignment (Table 5). Evaluation using the Average Sum of Squared Errors (AvgSSE) under different cluster center counts reveals that the minimum AvgSSE for all three network sizes occurs when the number of cluster centers is 4 (Table 6), indicating that this configuration yields the optimal clustering outcome. Therefore, the proposed method effectively selects cluster centers and derives the optimal clustering configuration (Fig. 9), while maintaining low clustering error and enhancing the efficiency and accuracy of resource scheduling. Finally, in a 38-node edge network scenario with four cluster centers, a multi-algorithm cluster partitioning comparison is conducted (Table 7). 
The improved Jaya algorithm achieves the best AvgSSE result of 16.22, significantly outperforming the four baseline algorithms. These results demonstrate its superiority in convergence precision and global search capability.  Conclusions  To address data resource scheduling challenges in edge computing environments, this study proposes an edge network data scheduling optimization method that integrates an improved Jaya algorithm with a cluster center selection strategy. The combined approach achieves high clustering accuracy, robustness, and generalization performance. It effectively enhances path planning precision and central node selection, leading to improved data transmission performance and resource utilization in edge networks.
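The cosine-decayed Jaya update described above can be sketched as follows (a hypothetical Python illustration on the Sphere benchmark; the exact decay schedule, population size, and bounds are assumptions, and the paper's two-phase exploration/exploitation switch is simplified into the single decay factor):

```python
import numpy as np

def jaya_cosine(f, dim=5, pop=20, iters=300, lo=-5.0, hi=5.0, seed=0):
    # Jaya with a cosine-decayed random factor: the exploration weight
    # shrinks smoothly from 1 toward 0 over the run, favoring global
    # search early and local refinement late (sketch of the idea).
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (pop, dim))
    fit = np.apply_along_axis(f, 1, X)
    for t in range(iters):
        decay = 0.5 * (1.0 + np.cos(np.pi * t / iters))  # 1 -> 0
        best, worst = X[fit.argmin()], X[fit.argmax()]
        r1 = decay * rng.random((pop, dim))
        r2 = decay * rng.random((pop, dim))
        # Canonical Jaya move: toward the best, away from the worst.
        Xn = np.clip(X + r1 * (best - np.abs(X)) - r2 * (worst - np.abs(X)),
                     lo, hi)
        fn = np.apply_along_axis(f, 1, Xn)
        better = fn < fit                    # greedy acceptance, as in Jaya
        X[better], fit[better] = Xn[better], fn[better]
    return X[fit.argmin()], fit.min()

best_x, best_f = jaya_cosine(lambda x: float(np.sum(x**2)))  # Sphere function
```

Like the base Jaya algorithm, this sketch keeps Jaya's appeal of having no algorithm-specific control parameters beyond population size and iteration count; the decay factor only modulates the two random multipliers.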
A Hybrid Beamforming Algorithm Based on Riemannian Manifold Optimization with Non-Monotonic Line Search
YAN Junrong, SHI Weitao, LI Pei
2025, 47(11): 4419-4428.   doi: 10.11999/JEIT250396
Abstract:
  Objective  Fully digital beamforming architectures provide high spectral efficiency but demand one Radio-Frequency (RF) chain per antenna element, resulting in substantial cost, power consumption, and hardware complexity. These limitations hinder their practical deployment in large-scale antenna systems. Hybrid beamforming offers a feasible alternative by reducing hardware requirements while retaining much of the performance. In such systems, analog beamforming modules follow a reduced number of RF chains to control massive antenna arrays. Analog phase shifters are energy-efficient and cost-effective but are subject to constant modulus constraints imposed by hardware implementation. In contrast, digital phase shifters offer flexible control over amplitude and phase. The central challenge is to approximate the spectral efficiency of fully digital systems while adhering to analog-domain constraints and minimizing energy and hardware demands. To overcome this challenge, this study proposes a novel hybrid beamforming algorithm that integrates Riemannian manifold optimization with a non-monotonic line search strategy (MO-NMLS). This approach achieves improved trade-offs among spectral efficiency, energy consumption, and hardware complexity.  Methods  The proposed methodology proceeds as follows. First, the joint matrix optimization problem for maximizing spectral efficiency in hybrid beamforming is decomposed into separate transmitter and receiver subproblems by formulating an appropriate objective function. This objective is then reformulated using a least squares approach, reducing the dimensionality of the search space from two to one. To accommodate the constant modulus constraints of analog beamforming, the problem is transformed into an unconstrained optimization on Riemannian manifolds. Both the Euclidean and Riemannian gradients of the modified objective function are derived analytically. 
Step sizes are adaptively determined by the non-monotonic line search, which incorporates historical gradient information to compute dynamic step factors. This mechanism guides the search direction while avoiding the convergence to suboptimal local minima that fixed step sizes can cause. Distinct update rules for the step factor are applied depending on whether the iteration count is odd or even. In each iteration, the current objective function value is compared with those from the preceding L iterations to decide whether to accept the new step and iteration point. After updating the step size, tangent vectors are retracted onto the manifold to generate new iterates until convergence criteria are satisfied. Once the analog precoder is fixed based on the optimized search direction, the corresponding digital precoder is derived in closed form. The dynamic step factor is computed using gradient data from the current and preceding L iterations, allowing the objective function to exhibit non-strict monotonicity within bounded ranges. This adaptive strategy results in faster convergence compared with conventional fixed-step methods.  Results and Discussions  The relationship between internal iteration count and Signal-to-Noise Ratio (SNR) for different beamforming algorithms is shown in Fig. 4. The MO-NMLS algorithm requires significantly fewer iterations than the conventional Conjugate Gradient (CG) method under both fully connected and overlapping subarray architectures. This improved efficiency arises from the use of Riemannian manifold optimization, which inherently satisfies the constant modulus constraints without necessitating computationally intensive Hessian matrix evaluations. Runtime performance is benchmarked in Fig. 5. The MO-NMLS algorithm reduces runtime by 75.3% relative to CG in the fully connected structure and by 79.2% in the overlapping subarray structure. 
Additionally, MO-NMLS achieves a further 21.1% reduction in runtime under the overlapping subarray architecture compared with the fully connected one, owing to simplified hardware requirements. Spectral efficiency as a function of SNR is presented in Fig. 6. In fully connected systems, MO-NMLS achieves a 0.64% improvement in spectral efficiency over CG while maintaining comparable stability in overlapping subarray architectures. This performance gain stems from the algorithm’s ability to avoid local optima, a key limitation of Orthogonal Matching Pursuit (OMP), which selects paths based solely on residual correlation. The scalability of MO-NMLS with respect to the number of antennas and data streams is demonstrated in Fig. 7. In fully connected systems, MO-NMLS outperforms CG by 0.73%, 1.43%, and 2.48% in spectral efficiency at antenna and data stream configurations of (32, 2), (64, 4), and (128, 8), respectively. While spectral efficiency increases across all algorithms as system scale grows, MO-NMLS exhibits the most substantial gains at higher scales. Energy efficiency improvements under the overlapping subarray architecture are shown in Fig. 8. Compared with the fully connected configuration, MO-NMLS yields energy efficiency gains of 1.2%, 10.9%, and 25.9% at subarray offsets of 1, 8, and 16, respectively. These improvements are attributed to the reduced number of required phase shifters and power amplifiers, which decreases total system power consumption as the subarray offset increases.  Conclusions  The proposed MO-NMLS algorithm achieves an effective balance among spectral efficiency, hardware complexity, and energy consumption in hybrid beamforming systems, while substantially reducing computational runtime. Moreover, the overlapping subarray architecture attains spectral efficiency comparable to that of fully connected systems, with significantly lower execution times. 
These results highlight the practical advantages of the proposed approach for large-scale antenna systems operating under resource constraints.
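The core of the Methods above, elementwise constant-modulus manifold updates with a non-monotone Armijo-style acceptance test, can be sketched in miniature. The sketch below is illustrative only: it optimizes a single analog beamforming vector toward a hypothetical steering vector `a` rather than the paper's full precoder problem, and the retraction, tangent-space projection, and acceptance rule are generic textbook forms, not the authors' exact MO-NMLS update.

```python
import cmath

def retract(w):
    # Retraction: pull each entry back onto the unit-modulus circle
    return [z / abs(z) if abs(z) > 1e-15 else complex(1.0) for z in w]

def objective(w, a):
    # Negative beamforming gain -|a^H w|^2 (we minimize)
    s = sum(ai.conjugate() * wi for ai, wi in zip(a, w))
    return -abs(s) ** 2

def egrad(w, a):
    # Euclidean (Wirtinger) gradient of -|a^H w|^2: entries -(a^H w) * a_i
    s = sum(ai.conjugate() * wi for ai, wi in zip(a, w))
    return [-s * ai for ai in a]

def rgrad(w, g):
    # Riemannian gradient: project out the radial component at each entry
    return [gi - (gi.conjugate() * wi).real * wi for gi, wi in zip(g, w)]

def nmls_minimize(a, n_iter=200, L=5, c=1e-4):
    w = retract([complex(1.0, 0.0)] * len(a))
    hist = [objective(w, a)]
    step = 1.0
    for _ in range(n_iter):
        g = rgrad(w, egrad(w, a))
        gnorm2 = sum(abs(gi) ** 2 for gi in g)
        if gnorm2 < 1e-18:
            break
        ref = max(hist[-L:])  # non-monotone reference: worst of the last L values
        t = step
        while t > 1e-12:
            w_new = retract([wi - t * gi for wi, gi in zip(w, g)])
            if objective(w_new, a) <= ref - c * t * gnorm2:
                break
            t *= 0.5
        w = w_new
        step = max(min(2.0 * t, 4.0), 1e-6)  # let the trial step grow after success
        hist.append(objective(w, a))
    return w, -hist[-1]

# Example: phase-align to a 4-element steering vector (global gain N^2 = 16)
w, gain = nmls_minimize([cmath.exp(2j * k) for k in range(4)])
```

Comparing against the maximum of the last L objective values, rather than the most recent one, is what allows the non-strict monotonicity described in the abstract.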
Research on Skill-Aware Task Assignment Algorithm under Local Differential Privacy
FANG Xianjin, ZHEN Yaru, ZHANG Pengfei, HUANG Shanshan
2025, 47(11): 4429-4439.   doi: 10.11999/JEIT250425
Abstract:
  Objective  With the proliferation of mobile smart devices and wireless networks, Spatial Crowdsourcing (SC) has emerged as a new paradigm for collaborative task execution. By leveraging workers’ real-time locations, SC platforms dynamically assign tasks to distributed participants. However, continuous exposure of location data creates privacy risks, including trajectory inference and identity disclosure, which reduce worker participation and threaten system sustainability. Existing privacy-preserving methods either rely on trusted third parties or apply traditional differential privacy mechanisms. The former incurs high costs and security vulnerabilities, whereas the latter struggles to balance the trade-off between noise magnitude and data utility, often reducing task matching accuracy. To address these challenges, this study proposes a skill-aware task assignment algorithm under Local Differential Privacy (LDP) that simultaneously enhances location privacy protection and task assignment performance. The algorithm is particularly effective in settings characterized by uneven skill distributions and complex task requirements.  Methods  To protect workers’ location privacy, a Clip-Laplace (CLP) mechanism is applied to perturb real-time location data under Local Differential Privacy (LDP), ensuring bounded noise while maintaining data utility. To mitigate mismatches between heterogeneous task requirements and imbalanced worker skills, an entropy-based metric is used to evaluate skill diversity. When entropy falls below a predefined threshold, a secondary screening strategy rebalances the distribution by suppressing common skills and prioritizing rare ones. A skill-aware Pruning Greedy task assignment algorithm (PUGR) is then developed. PUGR iteratively selects the worker-task pair with the highest marginal contribution to maximize skill coverage under spatiotemporal and budget constraints. 
To improve computational efficiency, three pruning strategies are integrated: time-distance pruning, high-reward pruning, and budget-infeasibility pruning. Finally, comparative and ablation experiments on three real-world datasets assess the method using multiple metrics, including Loss of Quality of Service (LQS), Average Remaining Budget Rate (ARBR), and Task Completion Rate (TCR).  Results and Discussions  Experimental results show that the CLP mechanism consistently achieves lower LQS than the traditional Laplace mechanism (LP) across different privacy budgets, effectively reducing errors introduced by noise (Fig. 2). For skill diversity, the entropy-based metric combined with secondary screening nearly doubles the average entropy of candidate workers on the TKY and NYC datasets, demonstrating its effectiveness in balancing skill distribution. In task assignment, the proposed PUGR algorithm completes most worker-task matchings within four iterations, thereby reducing redundant computation and accelerating convergence (Fig. 3). Regarding budget utilization, the ARBR under CLP remains close to the No Privacy (NoPriv) baseline, indicating efficient resource allocation (Fig. 4, Table 2). For task completion, the method achieves a TCR of up to 90% in noise-free settings and consistently outperforms Greedy, OE-ELA, and TsPY under CLP (Fig. 5). Ablation studies further validate the contributions of secondary screening and pruning strategies to overall performance improvement.  Conclusions  This study addresses two central challenges in spatial crowdsourcing: protecting workers’ location privacy and improving skill-aware task assignment. A task assignment framework is proposed that integrates the CLP mechanism with a skill-aware strategy under the LDP model. The CLP mechanism provides strong privacy guarantees while preserving data utility by limiting noise magnitude. 
An entropy-based metric combined with secondary screening ensures balanced skill distribution, substantially enhancing skill coverage and task execution success in multi-skill scenarios. The PUGR algorithm incorporates skill contribution evaluation with multiple pruning constraints, thereby reducing the search space and improving computational efficiency. Experiments on real-world datasets demonstrate the method’s superiority in terms of LQS, ARBR, and TCR, confirming its robustness, scalability, and effectiveness in balancing privacy protection with assignment performance. Future work will explore dynamic pricing mechanisms based on skill scarcity and personalized, adaptive incentives to foster fairness, long-term worker engagement, and the sustainable development of spatial crowdsourcing platforms.
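One plausible reading of the Clip-Laplace idea above, bounded Laplace perturbation of location coordinates, can be sketched as follows. This is an assumption-laden illustration: the budget split across coordinates, the clipping rule, and the resulting privacy accounting here are generic stand-ins, not the paper's exact CLP mechanism.

```python
import math
import random

def laplace_noise(scale, rng):
    # Zero-mean Laplace variate via inverse-CDF sampling
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def clip_laplace_perturb(loc, epsilon, bound, rng=random):
    # Perturb each coordinate with Laplace noise (scale 2*bound/epsilon,
    # splitting the budget over the two coordinates), then clip the noise
    # so the reported location stays within `bound` of the true one.
    scale = 2.0 * bound / epsilon
    return tuple(x + max(-bound, min(bound, laplace_noise(scale, rng)))
                 for x in loc)
```

Bounding the noise is what preserves task-matching utility: a reported location can never drift further than `bound` from the worker's true position.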
Source Code Vulnerability Detection Method Integrating Code Sequences and Property Graphs
YANG Hongyu, LUO Jingchuan, CHENG Xiang, HU Juncheng
2025, 47(11): 4440-4450.   doi: 10.11999/JEIT250470
Abstract:
  Objective  Code vulnerabilities create opportunities for hacker intrusions, and if they are not promptly identified and remedied, they pose serious threats to cybersecurity. Deep learning-based vulnerability detection methods leverage large collections of source code to learn secure programming patterns and vulnerability characteristics, enabling the automated identification of potential security risks and enhancing code security. However, most existing deep learning approaches rely on a single network architecture, extracting features from only one perspective, which constrains their ability to comprehensively capture multi-dimensional code characteristics. Some studies have attempted to address this by extracting features from multiple dimensions, yet the adopted feature fusion strategies are relatively simplistic, typically limited to feature concatenation or weighted combination. Such strategies fail to capture interdependencies among feature dimensions, thereby reducing the effectiveness of feature fusion. To address these challenges, this study proposes a source code vulnerability detection method integrating code sequences and property graphs. By optimizing both feature fusion and vulnerability detection processes, the proposed method effectively enhances the accuracy and robustness of vulnerability detection.  Methods  The proposed method consists of four components: feature representation, feature extraction, feature fusion, and vulnerability detection (Fig. 1). First, vector representations of the code sequence and the Code Property Graph (CPG) are obtained. Using word embedding and node embedding techniques, the code sequence and graph nodes are mapped into fixed-dimensional vectors, which serve as inputs for subsequent feature extraction. Next, the pre-trained UniXcoder model is employed to capture contextual information and extract semantic features from the code. 
In parallel, a Residual Gated Graph Convolution Network (RGGCN) is applied to the CPG to capture complex structural information, thereby extracting graph structural features. To integrate these complementary representations, a Multimodal Attention Fusion Network (MAFN) is designed to model the interactions between semantic and structural features. This network generates informative fused features for the vulnerability detection task. Finally, a MultiLayer Perceptron (MLP) performs classification on the semantic features, structural features, and fused features. An interpolated prediction classifier is then applied to optimize the detection process by balancing multiple prediction outcomes. By adaptively adjusting the model’s focus according to the characteristics of different code samples, the classifier enables the detection model to concentrate on the most critical features, thereby improving overall detection accuracy.  Results and Discussions  To validate the effectiveness of the proposed method, comparative experiments were conducted against baseline approaches on the Devign, Reveal, and SVulD datasets. The experimental results are summarized in Tables 1–3. On the Devign dataset, the proposed method achieved an accuracy improvement of 1.38% over SCALE and a precision improvement of 5.19% over CodeBERT. On the Reveal dataset, it improved accuracy by 0.08% compared to SCALE, with precision being closest to that of SCALE. On the SVulD dataset, the method achieved an accuracy improvement of 0.13% over SCALE and a precision gain of 8.15% over Vul-LMGNNs. Collectively, these results demonstrate that the proposed method consistently yields higher accuracy and precision. This improvement can be attributed to its effective integration of semantic information extracted by UniXcoder and structural information captured by RGGCN. 
By contrast, CodeBERT and LineVul effectively learn code semantics but exhibit insufficient understanding of complex structural patterns, resulting in weaker detection performance. Devign and Reveal employ gated graph neural networks to capture structural information from code graphs but lack the ability to model semantic information contained in code sequences, which constrains their performance. Vul-LMGNNs attempt to improve detection performance by jointly learning semantic and structural features; however, their feature fusion strategy relies on simple concatenation. This approach fails to account for correlations between features, severely limiting the expressive power of the fused representation and reducing detection performance. In contrast, the proposed method fully leverages and integrates semantic and structural features through multimodal attention fusion. By modeling feature interactions rather than treating them independently, it achieves superior accuracy and precision, enabling more effective vulnerability detection.  Conclusions  Fully integrating code features across multiple dimensions can significantly enhance vulnerability detection performance. Compared with baseline methods, the proposed approach enables deeper modeling of interactions among code features, allowing the detection model to develop a more comprehensive understanding of code characteristics and thereby achieve superior detection accuracy and precision.
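The fusion step described above, attention that models interactions between semantic and structural features rather than concatenating them, can be illustrated with a minimal scaled dot-product cross-attention. This is a generic sketch, not the authors' MAFN: semantic queries attend over hypothetical structural key/value tokens.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: each semantic query attends over the
    # structural key/value tokens, yielding one fused vector per query.
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused
```

Because the attention weights depend on both modalities, the fused representation captures cross-feature correlations that plain concatenation or fixed weighting cannot.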
An Optimization Design Method for Zero-Correlation Zone Sequences Based on Newton’s Method
HU Enbo, LIU Tao, LI Yubo
2025, 47(11): 4451-4458.   doi: 10.11999/JEIT250394
Abstract:
  Objective  Sequences with favorable correlation properties are widely applied in radar and communication systems. Sequence sets with zero or low correlation characteristics enhance radar resolution, target detection, imaging quality, and information acquisition, while also improving the omnidirectional transmission capability of massive multiple-input multiple-output (MIMO) systems. Designing aperiodic Zero Correlation Zone (ZCZ) sequence sets with excellent correlation performance is therefore critical for both wireless communication and radar applications. For example, aperiodic Z-Complementary Set (ZCS) sequence sets are often used in omnidirectional precoding for MIMO systems, whereas aperiodic ZCZ sequence sets are employed in integrated MIMO radar-communication systems. These ZCZ sequence sets are thus valuable across a range of system applications. However, most prior studies rely on analytical construction methods, which impose constraints on parameters such as sequence length and the number of sequences, thereby limiting design flexibility and practical applicability. This study proposes a numerical optimization approach for designing ZCS and aperiodic ZCZ sequence sets with improved correlation properties and greater parametric flexibility. The method minimizes the Complementary Peak Sidelobe Level (CPSL) and Weighted Peak Sidelobe Level (WPSL) using Newton’s method to achieve superior sequence performance.  Methods  This study proposes an optimization-based design method using Newton’s method to construct both aperiodic ZCS sequence sets and aperiodic ZCZ sequence sets with low sidelobe levels and flexible parameters. The optimization objective is first formulated using the CPSL and WPSL. The problem is then reformulated as an equivalent system of nonlinear equations, which is solved using Newton’s method. To reduce computation time, partial derivatives are approximated using numerical differentiation techniques. 
A loop iteration strategy is employed to address multiple constraints during the optimization process. To ensure algorithmic convergence, Armijo’s rule is used for step size selection, promoting stable descent of the objective function along the defined search direction.  Results and Discussions  The aperiodic ZCS sequence set is constructed using Newton’s method. As the number of sequences increases, the CPSL progressively decreases, falling below –300 dB when $M \geqslant 2$. The proposed method yields better sidelobe performance than the improved Iterative Twisted Approximation (ITROX) algorithm (Fig. 1). The performance of ZCS sequences generated by both methods is evaluated under different ZCZ conditions. While both approaches achieve low CPSL, Newton’s method yields sidelobe levels closer to the ideal value (Fig. 2). Convergence behavior is assessed using CPSL and the number of iterations. The improved ITROX algorithm typically requires around 20,000 iterations to converge, with the iteration count increasing as the ZCZ size grows. In contrast, Newton’s method achieves rapid convergence within approximately 10 iterations (Figs. 3 and 4). The aperiodic ZCZ sequence set constructed using Newton’s method exhibits autocorrelation and cross-correlation peak sidelobe levels below –300 dB within the ZCZ. Moreover, Newton’s method achieves the lowest WPSL, offering the best overall performance among all tested methods (Fig. 5). The smooth convergence curves further confirm the algorithm’s stability when applied to aperiodic ZCZ sequence construction (Fig. 6).  Conclusions  This study proposes an optimization-based algorithm for designing aperiodic ZCS and aperiodic ZCZ sequence sets using Newton’s method, aiming to address the limitations of fixed parameters and high peak sidelobe levels found in existing approaches. Two optimization problems are formulated by minimizing the WPSL and CPSL, respectively. 
To simplify computation, the optimization tasks are converted into systems of nonlinear equations, which are solved using Newton’s method. The Jacobian matrix is computed via numerical differentiation to reduce computational cost. A loop iteration strategy is introduced to meet multiple constraints in the construction of aperiodic ZCZ sequences. Simulation results confirm that the proposed method yields sequence sets with excellent correlation properties and flexible parameter configurations. By tuning the weighting coefficients, low sidelobe levels can be achieved in specific regions of interest, accommodating different application requirements. The combination of flexible design parameters and favorable correlation performance makes the proposed sequences suitable for a wider range of practical scenarios.
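The numerical machinery described in the Methods above (Newton iteration on a system of nonlinear equations, a finite-difference Jacobian in place of analytic derivatives, and Armijo step-size control) can be sketched generically. The residual function in the test is a placeholder, not the CPSL/WPSL system from the paper.

```python
def solve_linear(A, b):
    # Gaussian elimination with partial pivoting (small dense systems)
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def newton_armijo(F, x, tol=1e-10, h=1e-6, max_iter=50):
    # Newton iteration with a forward-difference Jacobian and Armijo
    # backtracking on the squared residual norm.
    for _ in range(max_iter):
        fx = F(x)
        phi0 = sum(v * v for v in fx)
        if phi0 < tol ** 2:
            break
        n = len(x)
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):                  # numerical Jacobian, column j
            xp = x[:]
            xp[j] += h
            fp = F(xp)
            for i in range(n):
                J[i][j] = (fp[i] - fx[i]) / h
        d = solve_linear(J, [-v for v in fx])
        t = 1.0
        while t > 1e-12:                    # Armijo rule: sufficient decrease
            xn = [xi + t * di for xi, di in zip(x, d)]
            if sum(v * v for v in F(xn)) <= (1.0 - 1e-4 * t) * phi0:
                break
            t *= 0.5
        x = xn
    return x
```

Approximating the Jacobian by forward differences mirrors the computational shortcut described in the abstract, trading a little accuracy for a much cheaper per-iteration cost.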
Multi-modal Joint Distillation Optimization for Source Code Vulnerability Detection
ZHANG Xuejun, ZHANG Yifan, LIU Cancan, JIA Xiaohong, CHEN Zhuo, ZHANG Lei
2025, 47(11): 4459-4469.   doi: 10.11999/JEIT250453
Abstract:
  Objective  As software systems increase in scale and complexity, the frequency of security vulnerabilities in source code rises accordingly, threatening system reliability, data integrity, and user privacy. Conventional automated vulnerability detection methods typically depend on a narrow set of shallow features, such as API call sequences, opcode patterns, or syntactic heuristics, rendering them susceptible to learning spurious correlations and unable to capture the rich semantic and structural information essential for accurate detection. Moreover, most existing approaches either rely on single-modal representations or weakly integrate multiple modalities without explicitly addressing distribution mismatches across them. This often results in overfitting to seen datasets and limited generalization to unseen codebases, particularly across different projects or programming languages. Although recent advances in machine learning and deep learning have improved source code analysis, effectively modeling the complex interactions between code semantics and execution structures remains a major challenge. To overcome these limitations, this paper proposes multi-modal joint Distillation Optimization for Vulnerability Detection (mVulD-DO), a framework that combines deep feature distillation with dynamic global feature alignment. The proposed method aims to enhance semantic comprehension, structural reasoning, and cross-modal consistency, which are critical for robust vulnerability detection. By enforcing both intra-modality refinement and inter-modality alignment, mVulD-DO addresses the semantic-structural gap that constrains traditional methods.  Methods  The mVulD-DO framework begins by extracting multiple semantic modalities from raw source code (function names, variable names, token_type attributes, and local code slices) using program slicing and syntactic parsing techniques. 
In parallel, a Program Dependency Graph (PDG) is constructed to capture both control-flow and data-flow relationships, generating a heterogeneous graph that represents explicit and implicit program behaviors. Each semantic modality is embedded using pretrained encoders and subsequently refined via a deep feature distillation module, which integrates multi-head self-attention and multi-scale convolutional layers to emphasize salient patterns and suppress redundant noise. To model the sequential dependencies intrinsic to program execution, a Bidirectional Long Short-Term Memory (BLSTM) network captures long-range contextual interactions. For structural representation, a Graph Attention Network (GAT) processes the PDG-C to produce topology-aware embeddings. To bridge the gap between modalities, adaptive dynamic Sinkhorn regularization is applied to globally align the distributions of semantic and structural embeddings. This approach mitigates modality mismatches while preserving flexibility by avoiding rigid one-to-one correspondences. Finally, the distilled and aligned multimodal features are fused and passed through a lightweight fully connected classifier for binary vulnerability prediction. The model is jointly optimized using both classification and alignment objectives, improving its ability to generalize across unseen codebases.  Results and Discussions  Comprehensive comparisons conducted on the mixed CVE-fixes+SARD dataset—covering 25 common CWE vulnerability types with an 8:1:1 train:validation:test split—demonstrate that traditional source code detectors, which directly map code to labels, often rely on superficial patterns and show limited generalization, particularly for out-of-distribution samples. These methods achieve accuracies ranging from 55.41% to 85.84%, with recall typically below 80% (Table 1). 
In contrast, mVulD-DO leverages multi-layer feature distillation to purify and enhance core representations, while dynamic Sinkhorn alignment mitigates cross-modal inconsistencies. This results in an accuracy of 87.11% and a recall of 83.59%, representing absolute improvements of 1.27% and 6.26%, respectively, over the strongest baseline method (ReGVD). Although mVulD-DO reports a slightly higher False Positive Rate (FPR) of 10.98%, 2.92 percentage points above that of ReGVD, it remains lower than that of most traditional detectors. This modest increase is considered acceptable in practice, given that failing to detect a critical vulnerability typically incurs greater cost than issuing additional alerts. Compared with instruction-tuned large language models (e.g., VulLLM), which maintain low FPRs below 10% but suffer from recall below 75%, mVulD-DO offers a more favorable trade-off between false alarms and coverage of true vulnerabilities. Ablation studies (Table 2) further validate the contribution of each component. Removing function name embeddings (unfunc) results in a 1.3% decrease in F1 score; removing variable name embeddings (unvar) causes a 1.3% drop; and omitting token_type attributes (untype) leads to a 3.35% reduction. The most substantial performance degradation, 9.11% in F1, occurs when the deep feature distillation module is disabled (undis), highlighting the critical role of multi-scale semantic refinement and noise suppression. Additional evaluations on vulnerability-sensitive subsets (Call, OPS, Array, and Ptr) demonstrate consistent benefits from Sinkhorn alignment. F1 score improvements over unaligned variants are observed as follows: 1.45% for Call, 4.22% for OPS, 1.38% for Array, and 0.36% for Ptr (Table 3), confirming the generalization advantage across a broad spectrum of vulnerability types.  
Conclusions  Experimental results demonstrate that the proposed mVulD-DO framework consistently outperforms existing vulnerability detection methods in recall, F1-score, and accuracy, while maintaining a low FPR. The effectiveness of deep feature distillation, multi-scale semantic extraction, and dynamic Sinkhorn alignment is validated through extensive ablation and visualization analyses. Despite these improvements, the model incurs additional computational overhead due to multimodal distillation and Sinkhorn alignment, and shows sensitivity to hyperparameter settings, which may limit its suitability for real-time applications. Moreover, while strong performance is achieved on the mixed dataset, the model’s generalization across unseen projects and programming languages remains an open challenge. Future work will focus on developing lightweight training strategies, such as knowledge distillation and model pruning, to reduce computational costs. Additionally, incorporating unsupervised domain adaptation and incremental alignment mechanisms will be critical to support dynamic code evolution and enhance cross-domain robustness. These directions aim to improve the scalability, adaptability, and practical deployment of multimodal vulnerability detection systems in diverse software environments.
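The dynamic Sinkhorn alignment above builds on entropic-regularized optimal transport. A minimal fixed-marginal Sinkhorn iteration over a cost matrix between semantic and structural embedding tokens is sketched below; the paper's adaptive regularization and dynamic weighting are omitted, and the uniform marginals are an assumption of this sketch.

```python
import math

def sinkhorn(cost, reg=0.1, n_iter=200):
    # Entropic-regularized optimal transport between two uniform marginals;
    # alternately rescales rows and columns of the Gibbs kernel
    # K = exp(-cost/reg) and returns the soft alignment (transport plan).
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

Because the plan is soft rather than a permutation, distribution-level alignment is achieved without forcing rigid one-to-one correspondences between modalities, matching the motivation given in the Methods.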
Construction of Multi-Scroll Conservative Chaotic System and Its Application in Image Encryption
AN Xinlei, LI Zhifu, XUE Rui, XIONG Li, ZHANG Li
2025, 47(11): 4470-4481.   doi: 10.11999/JEIT250432
Abstract:
  Objective  Existing conservative chaotic systems often suffer from structural simplicity and weak nonlinear characteristics, and research on complex dynamical behaviors such as multi-scroll structures remains limited, constraining their potential in engineering applications. To address security risks in face image transmission and the inefficiency of traditional global encryption methods, this study constructs a conservative chaotic system with multi-scroll characteristics, investigates its complex dynamical behavior, and designs a face-detection-based selective image encryption algorithm targeting sensitive regions. The work explores the practical application of conservative chaotic systems in image encryption.  Methods  A five-dimensional conservative hyperchaotic system is constructed on the basis of the generalized Hamiltonian system, and the controlled generation of multi-scroll chaotic flows is achieved through modulation of the Hamiltonian energy function. The Hessian matrix is used to analyze the stationary points of the Hamiltonian energy function, thereby revealing the relationship between scroll structures and stationary points. The spatial distribution of multi-scroll chaotic flows is further characterized by energy isosurfaces. The complex dynamical behaviors of the proposed system are investigated using Lyapunov exponent spectra and phase diagrams, while the sequence complexity is evaluated with the SE complexity algorithm. On this basis, an image encryption algorithm integrated with face detection technology is designed. The algorithm applies a diffusion-scrambling strategy to selectively encrypt facial regions. The security performance is evaluated through multiple indicators, including key space, pixel correlation, and information entropy.  Results and Discussions  Analysis of stationary points in the Hamiltonian energy function revealed a positive correlation between their number and scroll generation. 
Extreme points primarily drive scroll formation, whereas saddle points define transition zones, indicating that the scroll structure can be effectively regulated through the Hamiltonian energy function. The Lyapunov exponent spectrum of the multi-scroll conservative chaotic system is distributed symmetrically about the x-axis and exhibits an integer Lyapunov dimension, fully confirming the system’s volume-conserving property. Under different initial conditions, the system demonstrates diverse coexistence behaviors, including phase trajectories of varying types and scales. Complexity evaluation further shows that the multi-scroll conservative chaotic system achieves markedly higher spectral entropy complexity, supporting its potential for image encryption applications. Experimental validation demonstrates that the proposed algorithm can accurately detect faces and selectively encrypt sensitive regions, thereby avoiding the computational inefficiency of indiscriminate global encryption. Moreover, the algorithm exhibits strong performance across multiple security metrics.  Conclusions  A conservative chaotic system is constructed on the basis of the generalized Hamiltonian system, and its complex dynamical behavior and application in image encryption are investigated. The study provides theoretical references for the generation of multi-scroll conservative chaotic flows and offers practical guidance for the application of image encryption technology.
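The selective-encryption idea above can be illustrated with a deliberately simplified stand-in: the paper's five-dimensional conservative hyperchaotic system is replaced here by a logistic-map keystream (purely for illustration; the logistic map is dissipative and far weaker cryptographically), XOR-diffused over a hypothetical detected face region while pixels outside the region are left untouched.

```python
def logistic_stream(x0, r, n, burn=100):
    # Chaotic keystream: iterate the logistic map, discard the transient,
    # and quantize each state to one byte.
    x = x0
    for _ in range(burn):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) & 0xFF)
    return out

def encrypt_region(image, box, x0=0.3141, r=3.99):
    # Selectively encrypt the pixels inside box = (r0, c0, r1, c1) by XOR
    # diffusion with the chaotic keystream; XOR is its own inverse, so
    # applying the function twice with the same key restores the image.
    r0, c0, r1, c1 = box
    ks = iter(logistic_stream(x0, r, (r1 - r0) * (c1 - c0)))
    out = [row[:] for row in image]
    for i in range(r0, r1):
        for j in range(c0, c1):
            out[i][j] ^= next(ks)
    return out
```

Restricting the keystream to the detected region is what avoids the cost of the indiscriminate global encryption criticized in the Objective.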
Bayesian Optimization-driven Design Space Exploration Method for Coarse-Grained Reconfigurable Cipher Logic Array
JIANG Danping, DAI Zibin, LIU Yanjiang, ZHOU Zhaoxu, SONG Xiaoyu
2025, 47(11): 4482-4492.   doi: 10.11999/JEIT250624
Abstract:
  Objective  Coarse-Grained Reconfigurable Cipher logic Arrays (CGRCAs) are widely employed in information security systems owing to their high flexibility, strong performance, and inherent security. Design Space Exploration (DSE) plays a critical role in evaluating and optimizing the performance of cryptographic algorithms deployed on CGRCAs. However, conventional DSE approaches require extensive computation time to locate optimal solutions in multi-objective optimization problems and often yield suboptimal performance. To overcome these limitations, this study proposes a Bayesian optimization-based DSE framework, termed Multi-Objective Bayesian optimization-based Exploration (MOBE), which enhances search efficiency and solution quality while effectively satisfying the complex design requirements of CGRCA architectures.  Methods  The high-dimensional characteristics and multi-objective optimization features of the CGRCA are analyzed, and its design space is systematically modeled. A DSE method based on Bayesian optimization is then proposed, comprising initial sampling design, rapid evaluation model construction, surrogate model development, and acquisition function optimization. A knowledge-aware unsupervised learning sampling strategy is introduced to integrate domain-specific knowledge with clustering algorithms, thereby improving the representativeness and diversity of the initial samples. A rapid evaluation model is established to estimate throughput, area overhead, and Function Unit (FU) utilization for each sample, effectively reducing the computational cost of performance evaluation. To enhance both search efficiency and generalizability, a greedy-based hybrid surrogate model is constructed by combining Gaussian Process with Deep Kernel Learning (DKL-GP), random forest, and neural network models. 
Moreover, an adaptive multi-acquisition function is designed by integrating Expected Hyper Volume Improvement (EHVI) and quasi-Monte Carlo Upper Confidence Bound (qUCB) to identify the most promising samples and maintain a balanced trade-off between exploration and exploitation. The weighting ratio between EHVI and qUCB is dynamically adjusted to accommodate the varying optimization requirements across different search phases.  Results and Discussions  The DSE method based on Bayesian optimization (Algorithm 2) includes initial sampling design, rapid evaluation model construction, surrogate model development, and acquisition function optimization to enhance solution quality and search efficiency. Simulation results show that the knowledge-aware unsupervised learning sampling strategy reduces the Average Distance from Reference Set (ADRS) by up to 28.2% and increases hypervolume by 15.1% compared with existing sampling approaches (Table 3). This improvement primarily arises from the integration of domain knowledge with clustering algorithms. Compared with single surrogate model-based DSE methods, the greedy-based hybrid surrogate model leverages the complementary advantages of multiple surrogate models across different optimization stages, prioritizing samples that contribute most to hypervolume expansion. The hybrid surrogate model achieves a reduction in ADRS of up to 31.7% and an improvement in hypervolume of 20.0% (Table 4). Furthermore, the proposed MOBE framework achieves a 34.9% reduction in ADRS and increases hypervolume by 28.7% relative to state-of-the-art DSE methods (Table 5). Regarding the average performance metrics of Pareto-front samples, MOBE enhances throughput by up to 29.9%, reduces area overhead by 6.0%, and improves FU utilization by 11.6% (Fig. 6), confirming its superiority in overall solution quality. 
Moreover, the MOBE method exhibits excellent cross-algorithm stability in both hypervolume and Normalized Overall Execution Time (NOET) (Table 6 and Fig. 7).  Conclusions  This study presents a multi-objective DSE method based on Bayesian optimization that enhances both solution quality and search efficiency for CGRCA. The proposed approach employs a knowledge-aware unsupervised learning sampling strategy to generate an initial sample set with high representativeness and diversity. A rapid evaluation model is subsequently developed to reduce the computational cost of performance assessments. Additionally, the integration of adaptive multi-acquisition functions with a greedy-based hybrid surrogate model further improves the efficiency and generalization capability of the DSE framework. Comparative experiments demonstrate the effectiveness of the proposed MOBE method: (1) the sampling strategy reduces the ADRS by up to 28.2% and increases hypervolume by 15.1% compared with existing methods; (2) the greedy-based hybrid surrogate model achieves up to a 31.7% reduction in ADRS and a 20.0% improvement in hypervolume relative to single surrogate model-based approaches; (3) the overall MOBE framework achieves a 34.9% reduction in ADRS and a 28.7% increase in hypervolume compared with state-of-the-art DSE techniques; (4) MOBE improves throughput by up to 29.9%, reduces area overhead by 6.0%, and increases FU utilization by 11.6% relative to existing methods; and (5) MOBE exhibits excellent cross-algorithm stability in hypervolume and NOET. MOBE is applicable to medium- and high-performance cryptographic application scenarios, including cloud platforms and desktop terminals. Nevertheless, two limitations remain. First, MOBE currently employs only traditional surrogate models, which may constrain feature learning efficiency and modeling accuracy.
Second, its validation is confined to a CGRCA architecture previously developed by the research group, lacking verification across existing CGRCA architectures. Future work will address these limitations by incorporating emerging artificial intelligence techniques, such as large models, and conducting extensive experiments on diverse CGRCA architectures to further enhance the generalization and effectiveness of MOBE.
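The adaptive multi-acquisition step described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the candidate scores `ehvi_scores` and `qucb_scores` are assumed to be precomputed by the surrogate models, and the linear weight schedule is an assumption standing in for the paper's dynamic adjustment across search phases.

```python
def adaptive_weight(iteration, total_iters):
    """Shift emphasis from exploration (qUCB) early to exploitation (EHVI) late.
    The linear schedule is an illustrative assumption."""
    return iteration / max(total_iters - 1, 1)  # w in [0, 1]

def adaptive_acquisition(ehvi_scores, qucb_scores, iteration, total_iters):
    """Blend the two acquisition scores and return the index of the most
    promising candidate design point."""
    w = adaptive_weight(iteration, total_iters)
    blended = [w * e + (1.0 - w) * q for e, q in zip(ehvi_scores, qucb_scores)]
    return max(range(len(blended)), key=blended.__getitem__)

# Early in the search the qUCB term dominates; late, the EHVI term does.
early = adaptive_acquisition([0.9, 0.1], [0.0, 1.0], iteration=0, total_iters=10)
late = adaptive_acquisition([0.9, 0.1], [0.0, 1.0], iteration=9, total_iters=10)
```

With the toy scores above, the same candidate pool yields a different choice at the two ends of the schedule, which is the intended exploration-exploitation trade-off.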
A Cross-Dimensional Collaborative Framework for Header-Metadata-Driven Encrypted Traffic Identification
WANG Menghan, ZHOU Zhengchun, JI Qingbing
2025, 47(11): 4493-4503.   doi: 10.11999/JEIT250434
[Abstract](102) [FullText HTML](68) [PDF 5166KB](14)
Abstract:
  Objective  With the widespread adoption of network communication encryption technologies, encrypted traffic identification has become a critical problem in network security. Traditional identification methods based on payload content face the risk of feature invalidation due to the continuous evolution of encryption algorithms, leading to detection blind spots in dynamic network environments. Meanwhile, the structured information embedded in packet headers, an essential carrier for protocol interaction, remains underutilized. Furthermore, as encryption protocols evolve, existing encrypted traffic identification approaches encounter limitations such as poor feature interpretability and weak model robustness against adversarial attacks. To address these challenges, this paper proposes a cross-dimensional collaborative identification framework for encrypted traffic, driven by header metadata features. The framework systematically reveals and demonstrates the dominant role of header features in encrypted traffic identification, overcoming the constraints of single-perspective analyses and reducing dependence on payload data. It further enables the assessment of deep model performance boundaries and decision credibility. Through effective feature screening and pruning, redundant attributes are eliminated, enhancing the framework’s anti-interference capability in encrypted scenarios. This approach reduces model complexity while improving interpretability and robustness, facilitating the design of lighter and more reliable encrypted traffic identification models.  Methods  This study performs a three-dimensional analysis including (1) network traffic feature selection and identification performance, (2) quantitative evaluation of feature importance in classification, and (3) assessment of model robustness under adversarial perturbations. 
First, the characteristics, differences, and effects on identification performance are compared among three forms of encrypted traffic packets using a One-Dimensional Convolutional Neural Network (1D-CNN). This comparison verifies the dominant role of header features in encrypted traffic identification. Second, two explainable algorithms, Layer-wise Relevance Propagation (LRP) and Deep Taylor Decomposition (DTD), are employed to further confirm the essential contribution of header features to network traffic classification. The relative importance of header and payload features is quantified from two perspectives: (1) the relevance scores propagated backward through the network and (2) the contribution coefficients derived from Taylor series expansion, thereby enhancing feature interpretability. Finally, adversarial attack experiments are conducted using Projected Gradient Descent (PGD) and random perturbations. By injecting carefully constructed adversarial perturbation data into the initial and terminal parts of the payload, or by adding randomly generated noise to produce adversarial traffic, the study examines the effect of these perturbations on model decision-making. This analysis evaluates the stability and anti-interference capabilities of the encrypted traffic identification model under adversarial conditions.  Results and Discussions  Comparative experiments conducted on the ISCXVPN2016 and ISCXTor2016 datasets yield three key findings. (1) Recognition performance. The model based solely on header features achieves an F1 score up to 6% higher than that of the model using complete traffic, and up to 61% higher than that of the model using only payload features. These results verify that header features possess irreplaceable significance in encrypted traffic identification. The structural information embedded in headers plays a dominant role in enabling the model to accurately classify traffic types.
Even without payload data, high identification accuracy can be achieved using header information alone (Figure 2, Table 4). (2) Interpretability evaluation. The LRP and DTD methods are used to quantify the contribution of header features to model classification. The correlation between header features and classification performance is markedly higher than that of payload features, with the correlation score accounting for up to 89.8% on average (Figures 3 and 4, Table 5). This result is highly consistent with the classification behavior of the 1D-CNN, further confirming the critical importance and dominant influence of header features in encrypted traffic identification. (3) Anti-interference robustness. The combined Header-Payload model exhibits strong robustness under adversarial attacks. Particularly under low-bandwidth conditions, the model incorporating header features shows a markedly higher maximum performance retention rate under equivalent bandwidth perturbation than the pure payload model, with the maximum difference reaching 98.46%. This finding confirms the essential role of header features in enhancing model robustness (Figures 5 and 6). Header-based models maintain stable recognition performance, whereas payload information is more susceptible to interference, leading to sharp performance degradation. In addition, the identification performance, contribution quantification, and anti-attack effectiveness of header features are influenced by data type and distribution characteristics. In certain cases, payload features provide auxiliary support, suggesting a complementary relationship between the two feature domains.  Conclusions  This study addresses core challenges in encrypted traffic identification, including feature degradation, limited interpretability, and weak adversarial robustness in traditional payload-dependent methods.
A cross-dimensional collaborative identification framework driven by header features is proposed. Through systematic theoretical analysis and experimental validation from three perspectives, the framework demonstrates the irreplaceable value of header features in network traffic identification and overcomes the limitations of conventional single-perspective approaches. It provides a theoretical foundation for improving the efficiency, interpretability, and robustness of encrypted traffic identification models. Future work will focus on enhancing dynamic adaptability, integrating multi-modal features, implementing lightweight architectures, and strengthening adversarial defense mechanisms. These directions are expected to advance encrypted traffic identification technology toward higher intelligence, adaptability, and resilience.
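The three feature views compared above (header-only, payload-only, and full traffic) amount to slicing raw packet bytes before they reach the 1D-CNN. A minimal sketch of that preprocessing, assuming a fixed header length and a fixed model input size (both values here are illustrative, not taken from the paper):

```python
def to_fixed_length(byte_seq, length, pad=0):
    """Truncate or zero-pad a byte sequence to a fixed model input length."""
    return list(byte_seq[:length]) + [pad] * max(0, length - len(byte_seq))

def split_packet(packet_bytes, header_len):
    """Separate a raw packet into the header and payload segments that form
    the two feature views compared in the experiments."""
    return packet_bytes[:header_len], packet_bytes[header_len:]

packet = list(range(60))                      # a toy 60-byte packet
header, payload = split_packet(packet, header_len=40)
header_input = to_fixed_length(header, 64)    # 64 is an illustrative input size
payload_input = to_fixed_length(payload, 64)
```

Each fixed-length vector would then be fed to the 1D-CNN, so header-only, payload-only, and full-traffic models differ only in which slice they receive.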
Image and Intelligent Information Processing
Weakly Supervised Recognition of Aerial Adversarial Maneuvers via Contrastive Learning
ZHU Longjun, YUAN Weiwei, MEN Xuefeng, TONG Wei, WU Qi
2025, 47(11): 4504-4514.   doi: 10.11999/JEIT250495
[Abstract](199) [FullText HTML](86) [PDF 3012KB](36)
Abstract:
  Objective  Accurate recognition of aerial adversarial maneuvers is essential for situational awareness and tactical decision-making in modern air warfare. Conventional supervised approaches face major challenges: obtaining labeled flight data is costly due to the intensive human effort required for collection and annotation, and these methods are limited in capturing temporal dependencies inherent in sequential flight parameters. Temporal dynamics are crucial for describing the evolution of maneuvers, yet existing models fail to fully exploit this information. To address these challenges, this study proposes a weakly supervised maneuver recognition framework based on contrastive learning. The method leverages a small proportion of labeled data to learn discriminative representations, thereby reducing reliance on extensive manual annotations. The proposed framework enhances recognition accuracy in data-scarce scenarios and provides a robust solution for maneuver analysis in dynamic adversarial aerial environments.  Methods  The proposed framework extends the Simple framework for Contrastive Learning of visual Representations (SimCLR) into the time-series domain by incorporating five temporal-specific data augmentation strategies: time compression, masking, permutation, scaling, and flipping. These augmentations generate multi-view samples that form positive pairs for contrastive learning, thereby ensuring temporal invariance in the feature space. A customized ResNet-18 encoder is employed to extract hierarchical features from the augmented time-series data, and a Multi-Layer Perceptron (MLP) projection head maps these features into a contrastive space. The Normalized Temperature-scaled cross-entropy (NT-Xent) loss is adopted to maximize similarity between positive pairs and minimize it between negative pairs, which effectively mitigates pseudo-label noise. 
To further improve recognition performance, a fine-tuning strategy is introduced in which pre-trained features are combined with a task-specific classification head using a limited amount of labeled data to adapt to downstream recognition tasks. This contrastive learning framework enables efficient analysis of time-series flight data, achieves accurate recognition of fighter aircraft maneuvers, and reduces dependence on large-scale labeled datasets.  Results and Discussions  Experiments are conducted on flight simulation data obtained from DCS World. To address the class imbalance issue, hybrid datasets (Table 1) are constructed, and training data ratios ranging from 2% to 30% are employed to evaluate the effectiveness of the weakly supervised framework. The results demonstrate that contrastive learning effectively captures the temporal patterns within flight data. For example, on the D1 dataset, accuracy with the base method increases from 35.6% with 2% labeled data to 89.5% when the fine-tuning ratio reaches 30% (Tables 3–5, Fig. 2(a)). To improve recognition of long maneuver sequences, a linear classifier and a voting strategy are introduced. The voting strategy markedly enhances few-shot learning performance. On the D1 dataset, accuracy reaches 54.5% with 2% labeled data and rises to 97.9% at a 30% fine-tuning ratio, representing a substantial improvement over the base method. On the D6 dataset, which simulates multi-source data fusion scenarios in air combat, the accuracy of the voting method increases from 0.476 with 2% labeled data to 0.928 with 30% labeled data (Fig. 2(f)), with a growth rate in the low-data phase that is 53% higher than that of the base method. Additionally, on the comprehensive D7 dataset, the accuracy standard deviation of the voting method is only 0.011 (Fig. 2(g), Fig. 3), significantly lower than the 0.015 observed for the base method.
The superiority of the proposed framework can be attributed to two factors: the suppression of noise through integration of multiple prediction results using the voting strategy and the extraction of robust features from unlabeled data via contrastive learning pre-training. Together, these techniques enhance generalization and stability in complex scenarios, confirming the effectiveness of the method in leveraging unlabeled data and managing multi-source information.  Conclusions  This study applies the SimCLR framework to maneuver recognition and proposes a weakly supervised approach based on contrastive learning. By incorporating targeted data augmentation strategies and combining self-supervised learning with fine-tuning, the method exploits the latent information in time-series data, yielding substantial improvements in recognition performance under limited labeled data conditions. Experiments on simulated air combat datasets demonstrate that the framework achieves stable recognition across different data categories, offering practical insights for feature learning and model optimization in time-series classification tasks. Future research will focus on three directions: first, integrating real flight data to evaluate the model’s generalization capability in practical scenarios; second, developing dynamically adaptive data augmentation strategies to enhance performance in complex environments; and third, combining reinforcement learning and related techniques to improve autonomous decision-making in dynamic aerial missions, thereby expanding opportunities for intelligent flight operations.
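The temporal-specific augmentations listed above generate the positive pairs for contrastive pre-training. A pure-Python sketch of three of the five strategies (masking, scaling, flipping); the function names and the toy flight-parameter series are illustrative assumptions, not the paper's code:

```python
def mask_segment(series, start, width, fill=0.0):
    """Temporal masking: zero out a contiguous window of the series."""
    out = list(series)
    for i in range(start, min(start + width, len(out))):
        out[i] = fill
    return out

def scale(series, factor):
    """Amplitude scaling: multiply every sample by a factor."""
    return [x * factor for x in series]

def flip(series):
    """Temporal flipping: reverse the sequence."""
    return list(reversed(series))

# Two augmented views of the same flight-parameter series form a positive
# pair; views of different series serve as negatives for the NT-Xent loss.
series = [1.0, 2.0, 3.0, 4.0]
view_a = mask_segment(series, start=1, width=2)
view_b = scale(flip(series), factor=0.5)
```

Because both views derive from one underlying maneuver, pulling their encoder representations together enforces the temporal invariance described above.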
Research on Federated Unlearning Approach Based on Adaptive Model Pruning
MA Zhenguo, HE Zixuan, SUN Yanjing, WANG Bowen, LIU Jianchun, XU Hongli
2025, 47(11): 4515-4524.   doi: 10.11999/JEIT250503
[Abstract](232) [FullText HTML](105) [PDF 1720KB](25)
Abstract:
  Objective  The rapid proliferation of Internet of Things (IoT) devices and the enforcement of data privacy regulations, including the General Data Protection Regulation (GDPR) and the Personal Information Protection Act, have positioned Federated Unlearning (FU) as a critical mechanism to safeguard the “right to be forgotten” in Edge Computing (EC). Existing class-level unlearning approaches often adopt uniform model pruning strategies. However, because edge nodes vary substantially in computational capacity, storage, and network bandwidth, these methods suffer from efficiency degradation, leading to imbalanced training delays and decreased resource utilization. This study proposes FU with Adaptive Model Pruning (FunAMP), a framework that minimizes training time while reliably eliminating the influence of target-class data. FunAMP dynamically assigns pruning ratios according to node resources and incorporates a parameter correlation metric to guide pruning decisions. In doing so, it addresses the challenge of resource heterogeneity while preserving compliance with privacy regulations.  Methods  The proposed framework establishes a quantitative relationship among model training time, node resources, and pruning ratios, on the basis of which an optimization problem is formulated to minimize overall training time. To address this problem, a greedy algorithm (Algorithm 2) is designed to adaptively assign appropriate pruning ratios to each node. The algorithm discretizes the pruning ratio space and applies a binary search strategy to balance computation and communication delays across nodes. Additionally, a Term Frequency-Inverse Document Frequency (TF-IDF)-based metric is introduced to evaluate the correlation between model parameters and the target-class data. For each parameter, the TF score reflects its activation contribution to the target class, whereas the IDF score measures its specificity across all classes. 
Parameters with high TF-IDF scores are iteratively pruned until the assigned pruning ratio is satisfied, thereby ensuring the effective removal of target-class data.  Results and Discussions  Simulation results confirm the effectiveness of FunAMP in balancing training efficiency and unlearning performance under resource heterogeneity. Pruning granularity affects model accuracy (Fig. 1): fine granularity (e.g., 0.01) preserves model integrity, whereas coarse settings degrade accuracy due to excessive parameter removal. Under fixed training time, FunAMP consistently achieves higher accuracy than FunUMP and Retrain (Fig. 2), as adaptive pruning ratios reduce inter-node waiting delays. For instance, FunAMP attains 76.48% accuracy on LeNet and 83.60% on AlexNet with FMNIST, outperforming baseline methods by 5.91% and 4.44%, respectively. The TF-IDF-driven pruning mechanism fully removes contributions of target-class data, achieving 0.00% accuracy on the target data while maintaining competitive performance on the remaining data (Table 1). Robustness under varying heterogeneity levels is further verified (Fig. 3). Compared with baselines, FunAMP markedly reduces the training time required to reach predefined accuracy and delivers up to 11.8× speedup across four models. These results demonstrate FunAMP’s capability to harmonize resource utilization, preserve model performance, and ensure unlearning efficacy in heterogeneous edge environments.  Conclusions  To mitigate training inefficiency caused by resource heterogeneity in FU, this study proposes FunAMP, a framework that integrates adaptive pruning with parameter relevance analysis. A system model is constructed to formalize the relationship among node resources, pruning ratios, and training time. A greedy algorithm dynamically assigns pruning ratios to edge nodes, thereby minimizing global training time while balancing computational and communication delays.
Furthermore, a TF-IDF-driven metric quantifies the correlation between model parameters and target-class data, enabling the selective removal of critical parameters to erase target-class contributions. Theoretical analysis verifies the stability and reliability of the framework, while empirical results demonstrate that FunAMP achieves complete removal of target-class data and sustains competitive accuracy on the remaining classes. This work is limited to single-class unlearning, and extending the approach to scenarios requiring the simultaneous removal of multiple classes remains an important direction for future research.
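The TF-IDF-style relevance metric and ratio-bounded pruning described above can be sketched as follows. This is a simplified reading of the metric, not the paper's exact formula: `activations[c][j]` is assumed to hold the activation contribution of parameter `j` to class `c`, and the IDF smoothing is an illustrative choice.

```python
import math

def tfidf_scores(activations, target_class):
    """TF: a parameter's activation contribution to the target class.
    IDF: its specificity, i.e., how few classes it is active for.
    High score = strongly and specifically tied to the target class."""
    n_classes = len(activations)
    n_params = len(activations[0])
    scores = []
    for j in range(n_params):
        tf = activations[target_class][j]
        active_in = sum(1 for c in range(n_classes) if activations[c][j] > 0)
        idf = math.log(n_classes / (1 + active_in)) + 1.0
        scores.append(tf * idf)
    return scores

def prune_mask(scores, ratio):
    """Mark the top `ratio` fraction of parameters (by score) for removal,
    up to the pruning ratio assigned to this node by the greedy algorithm."""
    k = int(len(scores) * ratio)
    top = set(sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k])
    return [j in top for j in range(len(scores))]

acts = [[0.9, 0.1, 0.0, 0.2],   # class 0 (target class)
        [0.0, 0.1, 0.8, 0.2],
        [0.0, 0.1, 0.0, 0.2]]
scores = tfidf_scores(acts, target_class=0)
mask = prune_mask(scores, ratio=0.25)  # this node's assigned ratio: prune 1 of 4
```

In this toy example, parameter 0 is active only for the target class, so it receives the highest score and is the one marked for pruning.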
For Electric Power Disaster Early Warning Scenarios: A Large Model and Lightweight Models Joint Deployment Scheme Based on Limited Spectrum Resources
CHEN Lei, HUANG Zaichao, LIU Chuan, ZHANG Weiwei
2025, 47(11): 4525-4534.   doi: 10.11999/JEIT250321
[Abstract](101) [FullText HTML](53) [PDF 3056KB](23)
Abstract:
  Objective  Traditional approaches to electric power disaster early warning rely on dedicated, scenario-specific systems, leading to redundant data collection and high development costs. To enhance accuracy and reduce costs, comprehensive early warning frameworks based on Artificial Intelligence (AI) large models have become an important research direction. However, large models are typically deployed in the cloud, and limited wireless spectrum resources constrain the uploading of complete data streams. Deploying lightweight models at terminal devices through substantial model compression can alleviate spectrum limitations but inevitably compromises model performance.  Methods  To address these limitations, this study proposes a cloud-terminal collaborative joint deployment scheme integrating large and lightweight models. In this framework, a high-precision large model is deployed in the cloud to process complex tasks, whereas lightweight models are deployed at terminal devices to handle simple tasks. Task offloading decisions are governed by a confidence threshold that dynamically determines whether computation occurs locally or in the cloud. A power-domain Non-Orthogonal Multiple Access (NOMA) technique is incorporated to allow multiple terminals to share identical time-frequency resources, thereby improving system detection accuracy by increasing the proportion of tasks processed in the cloud. Additionally, for scenarios considering (1) only uplink shared-channel bandwidth constraints and (2) both terminal access collision constraints and shared-channel bandwidth constraints, corresponding algorithms are developed to determine the maximum number of terminals supported under a given bandwidth and to identify the optimal confidence threshold that maximizes detection accuracy.  
  Results and Discussions  (1) As shown in Figures 3(a) and 3(b), when the uplink shared-channel bandwidth $W$ increases, the number of supported terminals rises for both the proposed scheme and the Orthogonal Multiple Access (OMA)-based scheme. This occurs because a larger $W$ enables more terminals with low-confidence detection results to upload raw data to the cloud for further processing, thereby enhancing detection accuracy and reducing the missed detection rate. (2) In contrast, the number of supported terminals $M$ in the pure on-device processing scheme remains constant with varying $W$, as this scheme relies entirely on the lightweight model deployed at the terminal and is therefore independent of bandwidth. (3) Compared with the OMA-based and pure on-device schemes, the proposed approach markedly increases the number of supported terminals, confirming that non-orthogonal reuse of time-frequency resources and cloud–terminal collaborative deployment of large and lightweight models are key to improving system performance. (4) As shown in Table 3, an increase in the number of preambles reduces the probability of terminal access collisions, allowing more terminals to successfully transmit raw data to the cloud for detection. Therefore, the missed detection rate decreases, and overall detection accuracy improves.  Conclusions  For electric power disaster early warning scenarios, this study integrates power-domain NOMA and proposes a cloud-terminal collaborative deployment scheme combining a large model with lightweight models. By dynamically determining whether tasks are processed locally by a lightweight model or in the cloud by a large model, the system achieves optimized detection accuracy and a reduced missed detection rate.
Numerical results indicate that, under given uplink shared-channel bandwidth, minimum detection accuracy, and maximum missed detection rate, the introduction of power-domain NOMA effectively increases the number of supported terminals. Furthermore, when both terminal access collision constraints and shared-channel bandwidth constraints are considered, optimizing the confidence threshold to regulate the number of terminals transmitting data to the cloud further enhances detection accuracy and reduces the missed detection rate.
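The confidence-threshold offloading rule at the core of the scheme can be sketched in a few lines. This is a minimal illustration with hypothetical function names; the threshold value here is arbitrary, whereas in the paper it is optimized subject to the bandwidth and access-collision constraints.

```python
def route_task(local_confidence, threshold):
    """Decide where a detection task runs: the terminal's lightweight model
    (confidence at or above the threshold) or the cloud's large model."""
    return "local" if local_confidence >= threshold else "cloud"

def cloud_load(confidences, threshold):
    """Fraction of tasks offloaded to the cloud. Raising the threshold sends
    more raw data to the large model (higher accuracy, fewer missed
    detections) but consumes more shared-channel bandwidth."""
    offloaded = sum(1 for c in confidences if c < threshold)
    return offloaded / len(confidences)

confs = [0.95, 0.40, 0.80, 0.55]          # lightweight-model confidences
decisions = [route_task(c, threshold=0.7) for c in confs]
```

The optimization in the paper amounts to picking the largest threshold (most cloud processing) for which `cloud_load` still fits within the NOMA-shared uplink bandwidth and collision limits.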
Global-local Co-embedding and Semantic Mask-driven Aging Approach
LIU Yaohui, LIU Jiaxin, SUN Peng, SHEN Zhe, LANG Yubo
2025, 47(11): 4535-4548.   doi: 10.11999/JEIT250430
[Abstract](188) [FullText HTML](81) [PDF 5485KB](21)
Abstract:
  Objective  Facial age progression has become increasingly important in applications such as criminal investigation and digital identity authentication, making it a key research area in computer vision. However, existing mainstream facial age progression networks face two primary limitations. First, they tend to overemphasize the embedding of age-related features, often at the expense of preserving identity-consistent multi-scale attributes. Second, they fail to effectively eliminate interference from non-age-related elements such as hair and glasses, leading to suboptimal performance in complex scenarios. To address these challenges, this study proposes a global-local co-embedding and semantic mask-driven aging method. The global-local co-embedding strategy improves the accuracy of input portrait reconstruction while reducing computational cost during the embedding phase. In parallel, a semantic mask editing mechanism is introduced to remove non-age-related features, such as hair and eyewear, thereby enabling more accurate embedding of age-related characteristics. This dual strategy markedly enhances the model’s capacity to learn and represent age-specific attributes in facial imagery.  Methods  A Global-Local Collaborative Embedding (GLCE) strategy is proposed to achieve high-quality latent space mapping of facial images. Distinct learning objectives are assigned to separate latent subspaces, which enhances the representation of fine-grained facial features while preserving identity-specific information. Therefore, identity consistency is improved, and both training time and computational cost are reduced, increasing the efficiency of feature extraction. To address interference from non-age-related elements, a semantic mask-driven editing mechanism is employed. Semantic segmentation and image inpainting techniques are integrated to accurately remove regions such as hair and glasses that hinder precise age modeling.
A differentiable generator, DsGAN, is introduced to align the transferred latent codes with the embedded identity-preserving codes. Through this alignment, the expression of age-related features is enhanced, and identity information is better retained during the age progression process.  Results and Discussions  Experimental results on benchmark datasets, including CCAD and CelebA, demonstrate that GLS-Age outperforms existing methods such as IPCGAN, CUSP, HRFAE, FADING, SAM, and LATS in identity confidence assessment. The age distributions of the generated portraits are also more closely aligned with those of the target age groups. Qualitative analysis further shows that, in cases with hair occlusion, GLS-Age produces more realistic wrinkle textures and enables more accurate embedding of age-related features compared with other methods. Simultaneously, it significantly improves the identity consistency of the synthesized facial images.  Conclusions  This study addresses core challenges in facial age progression, including identity preservation, inadequate detail modeling, and interference from non-age-related factors. A novel Global-Local collaborative embedding and Semantic mask-driven Aging method (GLS-Age) is proposed to resolve these limitations. By employing a differentiated latent space learning strategy, the model achieves hierarchical decoupling of structural and textural features. When integrated with semantic-guided portrait editing and a differentiable generator for latent space alignment, GLS-Age markedly enhances both the fidelity of age feature expression and the consistency of identity retention. The method demonstrates superior generalization and synthesis quality across multiple benchmark datasets, effectively reproducing natural wrinkle patterns and age-related facial changes. These results confirm the feasibility and advancement of GLS-Age in facial age synthesis tasks.
Furthermore, this study establishes a compact, high-quality dataset focused on Asian facial portraits, supporting further research in image editing and face generation within this demographic. The proposed method not only contributes technical support to practical applications such as cold case resolution and missing person identification in public security but also offers a robust data and modeling framework for advancing human age-based simulation technologies. Future work will focus on enhancing controllable editing within latent spaces, improving anatomical plausibility in skull structure transformations, and strengthening model performance across extreme age groups, including infants and the elderly. These efforts aim to expand the application of facial age progression in areas such as forensic analysis, humanitarian family search, and social security systems.
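The semantic mask-driven editing step above, removing hair and eyewear regions so they can be inpainted before age-feature embedding, reduces in essence to masking the pixels flagged by a segmentation map. A toy sketch on 2-D lists (the real pipeline operates on segmentation-network outputs and an inpainting model, both omitted here):

```python
def apply_semantic_mask(image, mask, fill=0):
    """Remove pixels flagged by the semantic mask (e.g., hair or glasses
    regions) so they can be inpainted before age-feature embedding.
    `image` and `mask` are same-shaped 2-D lists; mask value 1 marks removal."""
    return [[fill if mask[r][c] else image[r][c]
             for c in range(len(image[r]))]
            for r in range(len(image))]

img = [[5, 5], [5, 5]]       # toy 2x2 grayscale image
msk = [[1, 0], [0, 1]]       # toy mask covering two "hair" pixels
edited = apply_semantic_mask(img, msk)
```

The masked regions would then be filled by the inpainting stage, so that only age-relevant facial structure reaches the embedding network.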
Combine the Pre-trained Model with Bidirectional Gated Recurrent Units and Graph Convolutional Network for Adversarial Word Sense Disambiguation
ZHANG Chunxiang, SUN Ying, GAO Kexin, GAO Xueyao
2025, 47(11): 4549-4559.   doi: 10.11999/JEIT250386
[Abstract](186) [FullText HTML](110) [PDF 2090KB](19)
Abstract:
  Objective  In Word Sense Disambiguation (WSD), the Linguistically-motivated bidirectional Encoder Representation from Transformer (LERT) is employed to capture rich semantic representations from large-scale corpora, enabling improved contextual understanding of word meanings. However, several challenges remain. Current WSD models are not sufficiently sensitive to temporal and spatial dependencies within sequences, and single-dimensional features are inadequate for representing the diversity of linguistic expressions. To address these limitations, a hybrid network is constructed by integrating LERT, Bidirectional Gated Recurrent Units (Bi-GRU), and Graph Convolutional Network (GCN). This network enhances the modeling of structured text and contextual semantics. Nevertheless, generalization and robustness remain problematic. Therefore, an Adversarial Training (AT) algorithm is applied to improve the overall performance and resilience of the WSD model.  Methods  An adversarial WSD method is proposed based on a pre-trained model, combining Bi-GRU and GCN. First, word forms, parts of speech, and semantic categories of the neighboring words of an ambiguous term are input into the LERT model to obtain the CLS sequence and token sequence. Second, cross-attention is applied to fuse the global semantic information extracted by Bi-GRU from the token sequence with the local semantic information derived from the CLS sequence. Sentences, word forms, parts of speech, and semantic categories are then used as nodes to construct a disambiguation feature graph, which is subsequently input into GCN to update the feature information of the nodes. Third, the semantic category of the ambiguous word is determined through the interpolated prediction layer and semantic classification layer. Fourth, subtle continuous perturbations are generated by computing the gradient of the dynamic word vectors in the input.
These perturbations are added to the original word vector matrix to create adversarial samples, which are used to optimize the LERT+Bi-GRU+CA+GCN (LBGCA-GCN) model. A cross-entropy loss function is applied to measure the performance of the LBGCA-GCN model on adversarial samples. Finally, the loss from the network is combined with the loss from Adversarial Training (AT) to optimize the LBGCA-GCN model.  Results and Discussions  When the Free Large-Batch (FreeLB) algorithm is applied, stronger adversarial perturbations are generated, and the FreeLB algorithm achieves the best performance (Table 2). As the number of perturbation steps increases, the strength of AT improves. However, when the number of steps exceeds a certain threshold, the LBGCA-GCN+AT (LBGCA-GCN-AT) model begins to overfit. The FreeLB algorithm demonstrates strong robustness with three perturbation steps (Table 3). The cross-attention mechanism, which fuses the token sequence with the CLS sequence, yields significant performance gains in complex semantic scenarios (Fig. 3). By incorporating AT, the LBGCA-GCN-AT model achieves notable improvements across multiple evaluation metrics (Table 4).  Conclusions  This study presents an adversarial WSD method based on a pre-trained model, integrating Bi-GRU and GCN to address the weak generalization ability and robustness of conventional WSD models. LERT is used to transform discriminative features into dynamic word vectors, while cross-attention fuses the global semantic information extracted by Bi-GRU from the token sequence with the local semantic information derived from the CLS sequence. This fusion generates more complete node representations for the disambiguation feature graph. A GCN is then applied to update the relationships among nodes within the feature graph. The interpolated prediction layer and semantic classification layer are used to determine the semantic category of ambiguous words. 
To further improve robustness, the gradient of the dynamic word vector is computed and perturbed to generate adversarial samples, which are used to optimize the LBGCA-GCN model. The network loss is combined with the AT loss to refine the model. Experiments conducted on the SemEval-2007 Task #05 and HealthWSD datasets examine multiple factors affecting model performance, including adversarial algorithms, perturbation steps, and sequence fusion methods. Results demonstrate that introducing AT improves the model’s ability to handle real-world noise and perturbations. The proposed method not only enhances robustness and generalization but also strengthens the capacity of WSD models to capture subtle semantic distinctions.
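The perturbation step described above (compute the gradient of the dynamic word vectors, then add a small perturbation to form adversarial samples) can be sketched in a few lines. This is an illustrative FGM-style sketch under an assumed L2 norm bound, not the paper's implementation; the `epsilon` value and plain-list vectors are assumptions.

```python
import math

def fgm_perturbation(grad, epsilon=1.0):
    """Scale the word-vector gradient to an L2-norm-bounded perturbation."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]

def adversarial_embedding(embedding, grad, epsilon=1.0):
    """Add the perturbation to the original word vector to obtain the
    adversarial sample that is fed back through the model."""
    r = fgm_perturbation(grad, epsilon)
    return [e + ri for e, ri in zip(embedding, r)]
```

The adversarial loss on such samples would then be combined with the clean-sample loss, as the abstract describes.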
Multi-target Behavior and Intent Prediction on the Ground Under Incomplete Perception Conditions
ZHU Xinyi, PING Peng, HOU Wanying, SHI Quan, WU Qi
2025, 47(11): 4560-4571.   doi: 10.11999/JEIT250322
[Abstract](227) [FullText HTML](114) [PDF 8064KB](29)
Abstract:
  Objective  Modern battlefield environments, characterized by complex and dynamically uncertain target behaviors combined with information asymmetry, present significant challenges for intent prediction. Conventional methods lack robustness in processing incomplete data, rely on oversimplified behavioral models, and fail to capture tactical intent semantics or adapt to rapidly evolving multi-target coordinated scenarios. These limitations restrict their ability to meet the demands of real-time recognition of high-value target intent and comprehensive ground target situational awareness. To address these challenges, this study proposes a Threat Field-integrated Gated Recurrent Unit model (TF-GRU), which improves prediction accuracy and robustness through threat field modeling, dynamic data repair, and multi-target collaboration, thereby providing reliable support for battlefield decision-making.  Methods  The TF-GRU framework integrates static and dynamic threat field modeling with a hybrid Particle Filtering (PF) and Dynamic Time Warping (DTW) strategy. Static threat fields quantify target-specific threats (e.g., tanks, armored vehicles, artillery) using five factors: enemy-friend distance, range, firepower, defense, and mobility. Gaussian and exponential decay models are employed to describe spatial threat diffusion across different target categories. Dynamic threat fields incorporate real-time kinematic variables (velocity, acceleration, orientation) and temporal decay, allowing adaptive updates of threat intensity. To address incomplete sensor data, a PF-DTW switching mechanism dynamically alternates between short-term PF (N = 1 000 particles) and long-term historical trajectory matching (DTW with β = 50). Collaborative PF introduces neighborhood angular constraints to refine multi-target state estimation. 
The GRU architecture is further enhanced with Mish activation, adaptive Xavier initialization, and threat-adaptive gating, ensuring effective fusion of trajectory and threat features.  Results and Discussions  Experiments were conducted on a simulated dataset comprising 40 trajectories and 144,000 timesteps. Under complete data conditions, the TF-GRU model achieved the highest accuracy on both the training and test sets, reaching 94.7% and 92.9%, respectively, indicating strong fitting capability and generalization performance (Fig. 9). After integrating static and dynamic threat fields, model accuracy increased from 72% (trajectory-only input) to 83%, accompanied by substantial improvements in F1 scores and reductions in predictive uncertainty (Fig. 6). In scenarios with 30% missing data, TF-GRU maintained an accuracy of 86.2%, outperforming comparative models and demonstrating superior robustness (Fig. 10). These results confirm that the PF-DTW mechanism effectively reduces the adverse effects of both short-term and long-term data loss, while the collaborative PF strategy strengthens multi-target prediction through neighborhood synergy (η = 0.6). This combination enables robust threat field reconstruction and reliable intent inference (Figs. 7, 8).  Conclusions  The TF-GRU model effectively addresses the challenges of intent prediction in complex battlefield environments with incomplete data through threat field modeling, the PF-DTW dynamic repair mechanism, and multi-target collaboration. It achieves high accuracy and robustness, providing reliable support for situational awareness and command decision-making. Future work will focus on applying the model to real-world datasets and enhancing computational efficiency to facilitate practical deployment.
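The Gaussian and exponential spatial decay models and the temporal decay of the dynamic threat field can be sketched directly. The decay constants (`sigma`, `lam`, `tau`) and the speed normalisation below are illustrative assumptions, not the paper's calibrated parameters.

```python
import math

def gaussian_threat(distance, intensity=1.0, sigma=100.0):
    """Spatial threat diffusion with Gaussian decay (used for some
    target categories, e.g. direct-fire platforms)."""
    return intensity * math.exp(-(distance ** 2) / (2.0 * sigma ** 2))

def exponential_threat(distance, intensity=1.0, lam=0.01):
    """Exponential decay model for other target categories."""
    return intensity * math.exp(-lam * distance)

def dynamic_threat(static_value, speed, v_max, dt, tau=30.0):
    """Dynamic field: scale the static threat by normalised speed and
    apply temporal decay over the time since the last observation dt."""
    return static_value * (1.0 + speed / v_max) * math.exp(-dt / tau)
```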
Collaborative Inference for Large Language Models Against Jamming Attacks
LIN Zhiping, XIAO Liang, CHEN Hongyi, XU Xiaoyu, LI Jieling
2025, 47(11): 4572-4582.   doi: 10.11999/JEIT250675
[Abstract](209) [FullText HTML](95) [PDF 5044KB](42)
Abstract:
  Objective  Collaborative inference with Large Language Models (LLMs) is employed to enable mobile devices to offload multi-modal data, including images, text, video, and environmental information such as temperature and humidity, to edge servers. This offloading improves the performance of inference tasks such as human-computer question answering, logical reasoning, and decision support. Jamming attacks, however, increase transmission latency and packet loss, which reduces task completion rates and slows inference. A reinforcement learning-based collaborative inference scheme is proposed to enhance inference speed, accuracy, and task completion under jamming conditions. LLMs with different sparsity levels and quantization precisions are deployed on edge servers to meet heterogeneous inference requirements across tasks.  Methods  A reinforcement learning-based collaborative inference scheme is proposed to enhance inference accuracy, speed, and task completion under jamming attacks. The scheme jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and channels for data offloading, based on task type, data volume, channel gains, and received jamming power. A policy risk function is formulated to quantify the probability of inference task failure given offloading latency and packet loss rate, thereby reducing the likelihood of unsafe policy exploration. Each edge server deploys LLMs with varying sparsity rates and quantization precisions, derived from layer-wise unstructured pruning and model parameter quantization, to process token vectors of multi-modal data including images, text, video, and environmental information such as temperature and humidity. This configuration is designed to meet diverse requirements for inference accuracy and speed across different tasks. 
The LLM inference system is implemented with mobile devices offloading images and text to edge servers for human-computer question answering and driving decision support. The edge servers employ a vision encoder and tokenizer to transform the received sensing data into token vectors, which serve as inputs to the LLMs. Pruning and parameter quantization are applied to the foundation model LLaVA-1.5-7B, generating nine LLM variants with different sparsity rates and quantization precisions to accommodate heterogeneous inference demands.  Results and Discussions  Experiments are conducted with three vehicles offloading images (i.e., captured traffic scenes) and texts (i.e., user prompts) using a maximum transmit power of 100 mW on 5.735~5.835 GHz frequency channels. The system is evaluated against a smart jammer that applies Q-learning to block one of the 20 MHz channels within this band. The results show consistent performance gains over benchmark schemes. Faster responses and more accurate driving advice are achieved, enabled by reduced offloading latency and lower packet loss in image transmission, which allow the construction of more complete traffic scenes. Over 20 repeated runs, inference speed is improved by 20.3%, task completion rate by 14.1%, and inference accuracy by 12.2%. These improvements are attributed to the safe exploration strategy, which prevents performance degradation and satisfies diverse inference requirements across tasks.  Conclusions  This paper proposed a reinforcement learning-based collaborative inference scheme that jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and offloading channels, to counter jamming attacks. The inference system deploys nine LLM variants with different sparsity rates and quantization precisions for human-computer question answering and driving decision support, thereby meeting heterogeneous requirements for accuracy and speed. 
Experimental results demonstrate that the proposed scheme provides faster responses and more reliable driving advice. Specifically, it improves inference speed by 20.3%, task completion rate by 14.1%, and accuracy by 12.2%, achieved through reduced offloading latency and packet loss compared with benchmark approaches.
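The abstract describes the policy risk function only as the probability of task failure given offloading latency and packet loss rate, and the safe exploration strategy as avoiding risky policies. One minimal way to realise such a score and filter, with an entirely hypothetical budget normalisation (not the paper's formulation), is:

```python
def policy_risk(latency, latency_budget, loss_rate, loss_budget):
    """Hypothetical risk score in [0, 1]: 0 when both the offloading latency
    and the packet-loss rate stay within the task budgets, rising as either
    budget is exceeded. The normalisation is an illustrative assumption."""
    r_lat = max(0.0, (latency - latency_budget) / latency_budget)
    r_loss = max(0.0, (loss_rate - loss_budget) / loss_budget)
    return min(1.0, r_lat + r_loss)

def safe_actions(actions, risk_of, threshold=0.5):
    """Safe exploration: drop candidate offloading policies whose estimated
    risk exceeds the threshold before the RL agent samples among them."""
    return [a for a in actions if risk_of(a) <= threshold]
```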
Joint Focus Measure and Context-Guided Filtering for Depth From Focus
JIANG Ying, DENG Huiping, XIANG Sen, WU Jin
2025, 47(11): 4583-4593.   doi: 10.11999/JEIT250540
[Abstract](83) [FullText HTML](53) [PDF 9449KB](3)
Abstract:
  Objective  Depth from Focus (DFF) seeks to determine scene depth by analyzing the focus variation of each pixel in an image. A key challenge in DFF is identifying the best-focused slice within the focal stack. However, focus variation in weakly textured regions is often subtle, making it difficult to detect focused areas, which adversely affects the accuracy of depth maps. To address this issue, this study proposes a depth estimation network that integrates focus measures and contextual information from the focal stack. The network accurately locates the best-focused pixels and generates a reliable depth map. By explicitly incorporating focus cues into a Convolutional Neural Network (CNN) and thoroughly considering spatial correlations within the scene, the approach facilitates comprehensive inference of focus states in weakly textured regions. This enables the network to capture both local focus-related details and global contextual information, thereby enhancing the accuracy and efficiency of depth estimation in challenging regions.  Methods  The proposed network consists of two main components. The first is focus region detection, which extracts focus-related features from the focal stack. A focus measure operator is introduced into the network during learning, yielding the maximum response when an image region is in sharp focus. After identifying the best-focused slices within the stack, the detected focus features are fused with those extracted by a 2D CNN. Because focus variations in weakly textured regions are often subtle, the representation of focus regions is enhanced to improve sensitivity to such changes. The second component comprises a semantic network and a semantic context module. A semantic context network is used to extract semantic cues, and semantic-guided filtering is then applied to the focus volume, integrating target features (focus volume) with guiding features (semantic context features). 
When local focus cues are indistinguishable, the global semantic context allows reliable inference of the focus state. This framework combines the strengths of deep learning and traditional methods while accounting for the specific characteristics of DFF and CNN architectures. Therefore, it produces robust and accurate depth maps, even in challenging regions.  Results and Discussions  The proposed architecture is evaluated through quantitative and qualitative comparisons on two public datasets. Prediction reliability is assessed using multiple evaluation metrics, including Mean Squared Error (MSE) and squared relative error (Sqr.rel.). Quantitative results (Tables 1 and 2) show that the proposed method consistently outperforms existing approaches on both datasets. The small discrepancy between predicted and ground-truth depths indicates precise depth estimation with reduced prediction errors. In addition, higher accuracy is achieved while computational cost remains within a practical range. Qualitative analysis (Figures 8 and 9) further demonstrates superior depth reconstruction and detail preservation, even when a limited number of focal stack slices is used. The generalization ability of the network is further examined on the unlabeled Mobile Depth dataset (Figure 10). The results confirm that depth can be reliably recovered in diverse unseen scenes, indicating effectiveness for real-world applications. Ablation studies (Table 3) validate the contribution of each proposed module. Optimal performance is obtained when both the Focus Measure (FM) and the Semantic Context-Guided Module (SCGM) are applied. Parameter count comparisons further indicate that the proposed approach achieves a balance between performance and complexity, delivering robust accuracy without excessive computational burden.  Conclusions  This study proposes a CNN-based DFF framework to address the challenge of depth estimation in weakly textured regions. 
By embedding focus measure operators into the deep learning architecture, the representation of focused regions is enhanced, improving focus detection sensitivity and enabling precise capture of focus variations. In addition, the introduction of semantic context information enables effective integration of local and global focus cues, further increasing estimation accuracy. Experimental results across multiple datasets show that the proposed model achieves competitive performance compared with existing methods. Visual results on the Mobile Depth dataset further demonstrate its generalization ability. Nonetheless, the model shows limitations in extremely distant regions. Future work could incorporate multimodal information or frequency-domain features to further improve depth accuracy in weakly textured scenes.
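The focus measure operator embedded in the network yields a maximum response when a region is in sharp focus. A classical hand-crafted instance of such an operator is the modified Laplacian, sketched here in pure Python together with the per-pixel argmax over the focal stack; this is an illustrative stand-in, not the operator learned in the paper.

```python
def modified_laplacian(img, x, y):
    """Modified-Laplacian focus response at one interior pixel: large when
    local contrast (sharp focus) is high, small in defocused regions."""
    return (abs(2 * img[y][x] - img[y][x - 1] - img[y][x + 1])
            + abs(2 * img[y][x] - img[y - 1][x] - img[y + 1][x]))

def best_focused_slice(stack, x, y):
    """Index of the focal-stack slice with the highest focus response at
    (x, y); the depth map follows from the slice-to-depth mapping."""
    scores = [modified_laplacian(s, x, y) for s in stack]
    return max(range(len(scores)), key=scores.__getitem__)
```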
Multi-granularity Text Perception and Hierarchical Feature Interaction Method for Visual Grounding
CAI Hua, RAN Yue, FU Qiang, LI Junyan, ZHANG Chenjie, SUN Junxi
2025, 47(11): 4594-4605.   doi: 10.11999/JEIT250387
[Abstract](189) [FullText HTML](104) [PDF 8514KB](11)
Abstract:
  Objective  Visual grounding requires effective use of textual information for accurate target localization. Traditional methods primarily emphasize feature fusion but often neglect the guiding role of text, which limits localization accuracy. To address this limitation, a Multi-granularity Text Perception and Hierarchical Feature Interaction method for Visual Grounding (ThiVG) is proposed. In this method, the hierarchical feature interaction module is progressively incorporated into the image encoder to enhance the semantic representation of image features. The multi-granularity text-aware module is designed to generate weighted text with spatial and semantic enhancement, and a preliminary Hadamard product-based fusion strategy is applied to refine image features for cross-modal fusion. Experimental results show that the proposed method substantially improves localization accuracy and effectively alleviates the performance bottleneck arising from over-reliance on feature fusion modules in conventional approaches.  Methods  The proposed method comprises an image-text feature extraction network, a hierarchical feature interaction module, a multi-granularity text perception module, and a graphic-text cross-modal fusion and target localization network (Fig. 1). The image-text feature extraction network includes image and text branches for extracting their respective features (Fig. 2). In the image branch, text features are incorporated into the image encoder through the hierarchical feature interaction module (Fig. 3). This enables text information to filter and update image features, thereby strengthening their semantic expressiveness. The multi-granularity text perception module employs three perception mechanisms to fully extract spatial and semantic information from the text (Fig. 4). 
It generates weighted text, which is preliminarily fused with image features through a Hadamard product-based strategy, providing fine-grained image features for subsequent cross-modal fusion. The graphic-text cross-modal fusion module then deeply integrates image and text features using a Transformer encoder (Fig. 5), capturing their complex relationships. Finally, a Multilayer Perceptron (MLP) performs regression to predict the bounding box coordinates of the target location. This method not only achieves effective integration of image and text information but also improves accuracy and robustness in visual grounding tasks through hierarchical feature interaction and deep cross-modal fusion, offering a novel approach to complex localization challenges.  Results and Discussions  Comparison experiments demonstrate that the proposed method achieves substantial accuracy gains across five benchmark visual localization datasets (Tables 1 and 2), with particularly strong performance on the long-text RefCOCOg dataset. Although the model has a larger parameter size, comparisons of parameter counts and training-inference times indicate that its overall performance still exceeds that of traditional methods (Table 3). Ablation studies further verify the contribution of each key module (Table 4). The hierarchical feature interaction module improves the semantic representation of image features by incorporating textual information into the image encoder (Table 5). The multi-granularity text perception module enhances attention to key textual components through perception mechanisms and adaptive weighting (Table 6). By avoiding excessive modification of the text structure, it markedly strengthens the model’s capacity to process long text and complex sentences. 
Experiments on the number of encoder layers in the cross-modal fusion module show that a 6-layer deep fusion encoder effectively filters irrelevant background information (Table 7), yielding a more precise feature representation for the localization regression MLP. Generalization tests and visualization analyses further demonstrate that the proposed method maintains high adaptability and accuracy across diverse and challenging localization scenarios (Figs. 6 and 7).  Conclusions  This study proposes a visual grounding algorithm that integrates multi-granularity text perception with hierarchical feature interaction, effectively addressing the under-utilization of textual information and the reliance on single-feature fusion in existing approaches. Key innovations include the hierarchical feature interaction module in the image branch, which markedly enhances the semantic representation of image features; the multi-granularity text perception module, which fully exploits textual information to generate weighted text with spatial and semantic enhancement; and a preliminary Hadamard product-based fusion strategy, which provides fine-grained image representations for cross-modal fusion. Experimental results show that the proposed method achieves substantial accuracy improvements on classical vision datasets and demonstrates strong adaptability and robustness across diverse and complex localization scenarios. Future work will focus on extending this method to accommodate more diverse text inputs and further improving localization performance in challenging visual environments.
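The preliminary Hadamard product-based fusion amounts to element-wise multiplication of the weighted text vector with each image-feature vector before the Transformer fusion stage. A minimal sketch, with the list representation and matching channel dimensions as assumptions:

```python
def hadamard_fuse(image_feats, text_weights):
    """Element-wise (Hadamard) product: the weighted text vector gates each
    image-feature vector channel by channel, yielding text-refined image
    features for the subsequent cross-modal fusion."""
    assert all(len(f) == len(text_weights) for f in image_feats)
    return [[f * w for f, w in zip(feat, text_weights)] for feat in image_feats]
```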
Circuit and System Design
VCodePPA: A Large-scale Verilog Dataset with PPA Annotations
CHEN Xiyuan, JIANG Yuxuan, XIA Yingjie, HU Ji, ZHOU Yizhao
2025, 47(11): 4606-4619.   doi: 10.11999/JEIT250449
[Abstract](207) [FullText HTML](94) [PDF 4130KB](24)
Abstract:
  Objective  As a predominant hardware description language, the quality of Verilog code directly affects the Power, Performance, and Area (PPA) metrics of the resulting circuits. Current Large Language Model (LLM)-based approaches for generating hardware description languages face a central challenge: incorporating a design feedback mechanism informed by PPA metrics to guide model optimization, rather than relying solely on syntactic and functional correctness. The field faces three major limitations: the absence of PPA metric annotations in training data, which prevents models from learning the effects of code modifications on physical characteristics; evaluation frameworks that remain disconnected from downstream engineering needs; and the lack of systematic data augmentation methods to generate functionally equivalent code with differentiated PPA characteristics. To address these gaps, we present VCodePPA, a large-scale dataset that establishes precise correlations between Verilog code structures and PPA metrics. The dataset comprises 17 342 entries and provides a foundation for data-driven optimization paradigms in hardware design.  Methods  The dataset construction is initiated by collecting representative Verilog code samples from GitHub repositories, OpenCores projects, and standard textbooks. After careful selection, a seed dataset of 3 500 samples covering 20 functional categories is established. These samples are preprocessed through functional coverage optimization, syntax verification with Yosys, format standardization, deduplication, and complexity filtering. An automated PPA extraction pipeline is implemented in Vivado to evaluate performance characteristics, with metrics including LookUp Table (LUT) count, register usage, maximum operating frequency, and power consumption. 
To enhance dataset diversity while preserving functional equivalence, a multi-dimensional code transformation framework is applied, consisting of nine methods across three dimensions: architecture layer (finite state machine encoding, interface protocol reconstruction, arithmetic unit replacement), logic layer (control flow reorganization, operator rewriting, logic hierarchy restructuring), and timing layer (critical path cutting, register retiming, pipeline insertion or deletion). Efficient exploration of the transformation space is achieved through a Homogeneous Verilog Mutation Search (HVMS) algorithm based on Monte Carlo Tree Search, which generates 5~10 PPA-differentiated variants for each seed code. A dual-task LLM training strategy with PPA-guided adaptive loss functions is subsequently employed, incorporating contrastive learning mechanisms to capture the relationship between code structure and physical implementation.  Results and Discussions  The VCodePPA dataset achieves broad coverage of digital hardware design scenarios, representing approximately 85%~90% of common design contexts. The multi-dimensional transformation framework generates functionally equivalent yet structurally diverse code variants, with PPA differences exceeding 20%, thereby exposing optimization trade-offs inherent in hardware design. Experimental evaluation demonstrates that models trained with VCodePPA show marked improvements in PPA optimization across multiple Verilog functional categories, including arithmetic, memory, control, and hybrid modules. In testing scenarios, VCodePPA-trained models produced implementations with superior PPA metrics compared with baseline models. The PPA-oriented adaptive loss function effectively overcame the traditional limitation of language model training, which typically lacks sensitivity to hardware implementation efficiency. 
By integrating contrastive learning and variant comparison loss mechanisms, the model achieved an average improvement of 17.7% across PPA metrics on the test set, influencing 32.4% of token-level predictions in code generation tasks. Notably, VCodePPA-trained models reduced on-chip resource usage by 10%~15%, decreased power consumption by 8%~12%, and shortened critical path delay by 5%~8% relative to baseline models.  Conclusions  This paper introduces VCodePPA, a large-scale Verilog dataset with precise PPA annotations, addressing the gap between code generation and physical implementation optimization. The main contributions are as follows: (1) construction of a seed dataset spanning 20 functional categories with 3 500 samples, expanded through systematic multi-dimensional code transformation to 17 000 entries with comprehensive PPA metrics; (2) development of a Monte Carlo Tree Search (MCTS)-based homogeneous code augmentation scheme employing nine transformation methods across architectural, logical, and timing layers to generate functionally equivalent code variants with significant PPA differences; and (3) design of a dual-task training framework with PPA-oriented adaptive loss functions, enabling models to learn PPA trade-off principles directly from data rather than relying on manual heuristics or single-objective constraints. Experimental results demonstrate that models trained on VCodePPA effectively capture PPA balancing principles and generate optimized hardware description code. Future work will extend the dataset to more complex design scenarios and explore advanced optimization strategies for specialized application domains.
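The PPA-oriented adaptive loss with a variant-comparison (contrastive) term can be outlined as follows. This is a schematic sketch: the scalar PPA cost, equal weights, and margin formulation are assumptions for illustration, not the paper's loss design.

```python
def ppa_cost(lut, regs, power, delay, weights=(0.25, 0.25, 0.25, 0.25)):
    """Hypothetical scalar PPA cost: a weighted sum of normalised LUT count,
    register usage, power, and critical-path delay (each pre-normalised to
    [0, 1]). Equal weights are an illustrative assumption."""
    w_l, w_r, w_p, w_d = weights
    return w_l * lut + w_r * regs + w_p * power + w_d * delay

def variant_comparison_loss(lm_loss, cost_better, cost_worse, margin=0.1):
    """Contrastive-style term added to the language-model loss: penalise the
    model unless the preferred (lower-cost) variant beats the worse variant
    by at least `margin`."""
    return lm_loss + max(0.0, margin - (cost_worse - cost_better))
```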
Ultra-wideband Bonding Wire RF Characteristics Compensation IC and Circuit Design for Microwave Components
KONG Weidong, YAN Pengyi, LU Shaopeng, WANG Qiaonan, DENG Shixiong, LIN Peng, WANG Cong, YANG Guohui, ZHANG Kuang
2025, 47(11): 4620-4627.   doi: 10.11999/JEIT250502
[Abstract](196) [FullText HTML](110) [PDF 3705KB](19)
Abstract:
  Objective  In microwave modules, assembly gaps often occur between power amplifier chips and multilayer hybrid circuit boards or among different circuit units. These gaps form deep transition trenches that significantly degrade RF signal transmission quality, particularly at millimeter-wave frequencies. Bonding wires remain a critical solution for establishing electrical interconnections between RF chips and other structures. However, the inherent parasitic inductance of gold bonding wires adversely affects system performance. As RF modules increasingly operate in the Ka-band and W-band, the degradation caused by this parasitic inductance has become more pronounced. The problem is especially severe when the ground-signal return path is excessively long or when the bonding wires themselves are too long.  Methods  The impedance transformation paths of T-type and π-type matching networks are compared on the Smith chart. The analysis indicates that for a given parasitic inductance of bonding wires, the Q-circle value of the π-type matching network is lower than that of the T-type, thereby enabling a broader matching bandwidth. A π-type matching network for chip-to-chip interconnection is realized by optimizing the bonding pad dimensions on the GaAs chip to provide capacitive loading. As the bonding pad size increases, more gold wires can be bonded to the chip, which simultaneously reduces the parasitic inductance of the wires. Additionally, a symmetric “Ground-Signal-Ground (GSG)” bonding pad structure is designed on the GaAs chip, which shortens the ground return path and further reduces the parasitic inductance of the bonding wires. By integrating these three design strategies, the proposed chip and transition structure are shown to substantially improve the performance of cross-deep-gap transitions between different circuit units in microwave modules.  
Results and Discussions  The proposed chip and transition structure substantially improve the performance of cross-trench transitions between different circuit units in microwave modules (Fig. 7). Simulation results show that the interconnection architecture effectively mitigates the adverse effects of trench depth on RF characteristics. Experimental validation further confirms that the π-type matching network implemented with the designed chip achieves an ultra-wideband, high-performance cross-trench transition, with a return loss of ≥ 17 dB and an insertion loss of ≤ 0.7 dB over the DC~40 GHz frequency range.  Conclusions  Comparative analysis of impedance transformation paths between T-type and π-type matching networks demonstrates that in gold-wire bonding interconnections, the π-type configuration is more effective in mitigating the effect of bonding wire parasitic inductance on matching bandwidth, making it suitable for ultra-wideband bonded interconnection circuits. To implement the π-type matching network using GaAs technology, the bonding pad area on the chip is enlarged to provide capacitive loading and to allow additional bonding wires, thereby further reducing parasitic inductance. A GSG structure is also designed on the GaAs chip surface to modify the reference ground return path of the bonded interconnections, leading to additional reduction in parasitic inductance. By integrating these features, an ultra-wideband compensation chip is developed and applied to cross-trench transition structures in microwave modules. Experimental results indicate that for a transition structure with a trench depth of 2 mm and a width of 0.2 mm, the proposed design achieves high-performance characteristics from DC to 40 GHz, with return loss ≥ 17 dB and insertion loss ≤ 0.7 dB. When applied to interconnections between RF chips and circuit boards in microwave modules, the chip also significantly enhances the RF matching performance of bonded interconnections.
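The π-type compensation described above (shunt pad capacitance, series bond-wire inductance, shunt pad capacitance) can be checked with elementary complex impedance arithmetic. The element values below are illustrative assumptions, not the fabricated chip's: with a series inductance L and pad capacitances near C = L/Z0², the C-L-C section approximates a short piece of 50 Ω artificial line, lowering the input reflection relative to the bare bond wire.

```python
import math

def parallel(z1, z2):
    """Impedance of two branches in parallel."""
    return z1 * z2 / (z1 + z2)

def pi_input_impedance(freq, l_wire, c_pad, z_load=50.0):
    """Input impedance of the C-L-C pi network: shunt pad capacitance,
    series bond-wire inductance, shunt pad capacitance."""
    w = 2.0 * math.pi * freq
    zc = 1.0 / (1j * w * c_pad)          # enlarged bonding pad as shunt C
    zl = 1j * w * l_wire                 # bond-wire parasitic inductance
    return parallel(zc, zl + parallel(zc, z_load))

def reflection_magnitude(z_in, z0=50.0):
    """|Gamma| at the input; smaller magnitude means better return loss."""
    return abs((z_in - z0) / (z_in + z0))
```

For example, at 20 GHz with an assumed L = 0.3 nH and C ≈ 0.12 pF per pad, the compensated reflection is well below that of the uncompensated series inductance.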
A Ku-band Circularly Polarized Leaky-wave Antenna Loaded with Parasitic Slots
HUANG Zhiyuan, ZHANG Yunhua, ZHAO Xiaowen
2025, 47(11): 4628-4636.   doi: 10.11999/JEIT250347
[Abstract](158) [FullText HTML](85) [PDF 6848KB](16)
Abstract:
This paper proposes a Ku-band circularly polarized Leaky-Wave Antenna (LWA) based on a Substrate Integrated Waveguide (SIW). A parasitic slot, with the same configuration as the main radiation slot but reduced in size, is employed to address the open-stopband problem and enhance impedance matching. The radiation slot excites Circularly Polarized (CP) waves, while the parasitic slot simultaneously broadens the Axial Ratio (AR) bandwidth and suppresses the open-stopband effect. A prototype antenna is designed, fabricated, and measured. The results demonstrate that the antenna achieves a 32% 3-dB AR bandwidth from 12.6 GHz to 17.4 GHz, with CP beam scanning from –49° to +14°. The simulated and measured results are in good agreement. In addition, the realized gain remains stable across the operating band. Compared with existing works, the proposed design achieves the widest scanning range.  Objective  Compared with traditional phased array antennas, frequency-scanning antennas have extensive applications in both military and civilian fields owing to their advantages of low profile, low cost, and lightweight design. CP waves offer superior anti-interference performance compared with linearly polarized waves. As a representative frequency-scanning antenna, the LWA has attracted sustained global research interest. This study focuses on the investigation of a Ku-band Circularly Polarized Leaky-Wave Antenna (CP-LWA), with emphasis on wide-bandwidth and wide-scanning techniques, as well as methods for achieving circular polarization. The aim is to provide potential design concepts for next-generation mobile communication and radar system antennas.  Methods   The fan-shaped slot is modified based on previous work, and an additional size-reduced parasitic slot of the same shape as the main slot is introduced. 
The parasitic slots cancel the reflected waves generated by the main radiating slot, thereby suppressing the Open-Stop-Band (OSB) effect, and they also enlarge the effective radiating aperture, which improves radiation efficiency and impedance matching. By exploiting the metallic boundary of the conductors, the parasitic slots enhance CP performance and broaden the AR bandwidth. To validate the proposed design, an antenna consisting of 12 main slots and 11 parasitic slots is designed, simulated, and measured.  Results and Discussions  A prototype is designed, fabricated, and measured in a microwave anechoic chamber to validate the proposed antenna. Both simulated and measured S11 values remain below –10 dB across the entire Ku-band. The measured S11 is slightly higher in the low-frequency range (12~13 GHz) and slightly lower in the high-frequency range (16~18 GHz), while maintaining an overall consistent trend with the simulations, except for a frequency shift of approximately 0.2 GHz toward lower frequencies. For the AR bandwidth, the simulated and measured 3-dB AR bandwidths are 32.7% (12.8~17.8 GHz) and 32.0% (12.6~17.4 GHz), respectively. The realized gains are on average 0.6 dB lower than the simulated values across the AR bandwidth, likely due to measurement system errors and fabrication tolerances. The simulated and measured peak gains reach 14.26 dB and 13.65 dB, respectively, with maximum gain variations of 2.91 dB and 2.85 dB. The measured AR and gain results therefore show strong agreement with the simulations. The measured sidelobe level increases on average by approximately 0.65 dB. The simulated CP scanning range extends from –47° to +17°, while the measured range narrows slightly to –49° to +14°. 
The frequency shift of the LWA is analyzed, and based on the simulated effect of variations in εr on the scanning patterns, the shift toward lower frequencies is attributed to the actual dielectric constant of the substrate being smaller than the nominal value of 2.2 specified by the manufacturer.  Conclusions  This paper proposes a Ku-band CP-LWA based on a SIW. The antenna employs etched slots consisting of fan-shaped radiation slots and size-reduced parasitic slots. The radiation slots excite circular polarization due to their inherent geometric properties, while the parasitic slots suppress the OSB effect and broaden the CP bandwidth. Measurements confirm that the proposed LWA achieves a wide 3-dB AR bandwidth of 12.6~17.4 GHz (32%) with a CP beam scanning range from –49° to +14°. Meanwhile, the antenna demonstrates stable gain performance across the entire AR bandwidth.
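The frequency-scanning behavior discussed above, including its sensitivity to the substrate permittivity, follows from the standard beam-steering relation for a periodic SIW leaky-wave antenna radiating through the n = −1 space harmonic. The relation below is general background rather than a formula from the paper; the effective waveguide width $a_e$ and slot period $p$ are assumed design parameters:
\[
\sin\theta_m(f) \approx \frac{\beta_{-1}(f)}{k_0}
= \frac{\beta_0(f) - 2\pi/p}{2\pi f/c},
\qquad
\beta_0(f) = \frac{2\pi f}{c}\sqrt{\varepsilon_r - \left(\frac{c}{2 a_e f}\right)^2}.
\]
Because $\beta_0$ depends on $\varepsilon_r$, a deviation of the actual permittivity from the nominal 2.2 changes the frequency at which a given beam angle $\theta_m$ occurs, which is consistent with the reported shift of approximately 0.2 GHz between simulated and measured patterns.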
The effects of ELF-MF on Aβ42 deposition in AD mice and SWM-related neural oscillations
GENG Duyan, LIU Aoge, YAN Yuxin, ZHENG Weiran
2025, 47(11): 4637-4647.   doi: 10.11999/JEIT241106
[Abstract](64) [FullText HTML](28) [PDF 3195KB](11)
Abstract:
  Objective  Extremely Low-Frequency Magnetic Fields (ELF-MF) have shown beneficial effects in various diseases; however, their influence on Alzheimer’s Disease (AD) remains insufficiently understood. With global population aging, AD has become one of the most prevalent neurodegenerative disorders. Its complex pathogenesis is characterized by neuronal loss, extracellular Amyloid-β (Aβ) deposition, and intracellular neurofibrillary tangles. Cognitive decline, particularly Spatial Working Memory (SWM) impairment, is among its main clinical manifestations. As a crucial cognitive function for encoding and retaining spatial location information, SWM underpins the execution of complex cognitive tasks. Impairment of SWM not only affects daily functioning but also serves as a key indicator of AD progression. Although previous studies have suggested potential cognitive benefits of ELF-MF exposure, systematic investigations integrating pathological, behavioral, and electrophysiological analyses remain limited. This study aims to investigate whether 40 Hz ELF-MF exposure mitigates AD pathology by assessing Aβ42 deposition, SWM performance, and neural oscillatory activity in the hippocampal CA1 region, and to elucidate the relationships between electrophysiological modulation and behavioral improvement.  Methods  An integrated multidisciplinary approach combining immunofluorescence detection, behavioral assessment, and electrophysiological recording is employed. Transgenic AD model mice and Wild-Type (WT) controls are used and assigned to three groups: WT control (Con), AD model group (AD), and AD model group exposed to ELF-MF stimulation (ES). The ES group receives 40 Hz, 10 mT continuous pulse stimulation twice daily for 0.5 h per session over 14 consecutive days, whereas the AD and Con groups undergo sham stimulation during identical time periods. SWM is evaluated using the Object Location Task (OLT). 
Behavioral performance is quantitatively determined by calculating the Cognitive Index (CI), which reflects the animal’s capacity to recognize spatial novelty. During behavioral testing, Local Field Potential (LFP) signals are synchronously recorded from the hippocampal CA1 region via chronically implanted microelectrodes. Advanced signal processing techniques, including time-frequency distribution analysis and phase-amplitude coupling computation, are applied to characterize neural oscillations within the theta (4~13 Hz) and gamma (30~80 Hz) frequency bands. After completion of the experiments, brain tissues are collected for quantitative measurement of Aβ42 plaque deposition in hippocampal sections through immunofluorescence staining, using standardized imaging and quantification protocols. Statistical analyses are performed to evaluate correlations between behavioral indices and electrophysiological parameters, with the objective of identifying mechanistic relationships underlying the effects of ELF-MF exposure.  Results and Discussions  Exposure to 40 Hz ELF-MF produced significant therapeutic effects across all examined parameters. Pathological analysis revealed markedly reduced Aβ42 deposition in the hippocampal region of treated AD mice compared with untreated controls, supporting the amyloid cascade hypothesis, which identifies Aβ42 oligomers as critical triggers of neurodegeneration. This reduction suggests that ELF-MF may influence Aβ42 metabolic pathways, potentially through the regulation of mitochondrial dynamics, as reported in previous studies. Behavioral assessment indicated a pronounced improvement in SWM following ELF-MF exposure, reflected by significantly elevated CI scores in the OLT. Electrophysiological recordings revealed notable alterations in neural oscillatory activity, with treated animals exhibiting increased power spectral density in both theta (4~13 Hz) and gamma (30~80 Hz) bands during memory task performance. 
The temporal dynamics of theta oscillations also differed among groups: in Con and ES mice, peak theta power occurred approximately 0.5~1 seconds before the behavioral reference point, indicating anticipatory processing, whereas in AD mice, peaks appeared after the reference point, reflecting delayed cognitive responses. Cross-frequency coupling analysis further demonstrated enhanced theta-gamma phase-amplitude coupling strength in the hippocampal CA1 region of ELF-MF-exposed mice, with coupling peaks primarily observed in the lower theta and higher gamma frequencies. Correlation analyses revealed statistically significant positive relationships between behavioral cognitive indices and electrophysiological measures, particularly for theta power and theta-gamma coupling strength. These convergent findings across pathological, behavioral, and electrophysiological domains indicate that ELF-MF exposure may restore impaired neural synchronization mechanisms. Enhanced theta-gamma coupling is particularly relevant, as this neurophysiological mechanism is known to facilitate temporal coordination among neuronal assemblies during memory processing. Although the present study demonstrates clear benefits of ELF-MF stimulation, heterogeneity in previously reported results warrants consideration. The efficacy of ELF-MF appears highly dependent on key stimulation parameters such as frequency, intensity, duration, and exposure intervals. Previous studies have reported divergent effects, ranging from negligible or adverse outcomes to substantial cognitive enhancement under different experimental conditions. This parameter dependency presents challenges for clinical translation and highlights the need for systematic optimization in higher-order animal models.  
Conclusions  This study demonstrates that exposure to a 40 Hz ELF-MF effectively reduces Aβ42 deposition in the hippocampal region of AD mice, alleviates SWM deficits, and normalizes neural oscillatory activity in the hippocampal CA1 region. The observed cognitive improvements are closely linked to enhanced oscillations in the theta and gamma frequency bands and to strengthened theta-gamma cross-frequency coupling, indicating that neuromodulatory regulation of neural synchronization underlies behavioral recovery. These findings provide strong evidence supporting the potential of ELF-MF as a noninvasive therapeutic approach for AD, targeting both pathological markers and functional impairments. The study establishes a foundation for future work aimed at optimizing stimulation parameters and advancing translational applications, while highlighting the central role of neural oscillatory restoration as a therapeutic mechanism in neurodegenerative disorders. Further investigations should focus on refining exposure protocols and developing personalized stimulation strategies to accommodate individual variability in treatment responsiveness.
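As an illustration of the theta-gamma phase-amplitude coupling analysis described above, the following is a minimal pure-Python sketch of one common PAC measure, the mean vector length. It is a general-purpose illustration, not the authors' pipeline: the DFT-based bandpass/Hilbert step is a simplification, and the sampling rate of 128 Hz caps the gamma band here at 60 Hz rather than the study's 30~80 Hz.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(N^2); fine for short illustrative signals)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def analytic_band(x, fs, lo, hi):
    """Band-limited analytic signal: keep doubled positive-frequency bins in [lo, hi] Hz."""
    N = len(x)
    X = dft(x)
    kept = {k: 2.0 * X[k] for k in range(1, N // 2) if lo <= k * fs / N <= hi}
    return [sum(v * cmath.exp(2j * math.pi * k * n / N) for k, v in kept.items()) / N
            for n in range(N)]

def pac_mvl(x, fs, theta=(4.0, 13.0), gamma=(30.0, 60.0)):
    """Mean vector length: |mean(A_gamma * e^{i*phi_theta})| / mean(A_gamma)."""
    phase = [cmath.phase(z) for z in analytic_band(x, fs, *theta)]
    amp = [abs(z) for z in analytic_band(x, fs, *gamma)]
    num = abs(sum(a * cmath.exp(1j * p) for a, p in zip(amp, phase)))
    return num / sum(amp)

# Synthetic LFP-like traces: 6 Hz theta carrier; gamma amplitude either
# locked to the theta phase (coupled) or constant (uncoupled).
fs, N = 128, 256
t = [n / fs for n in range(N)]
coupled = [math.cos(2 * math.pi * 6 * ti)
           + (0.5 + 0.5 * math.cos(2 * math.pi * 6 * ti)) * math.cos(2 * math.pi * 40 * ti)
           for ti in t]
uncoupled = [math.cos(2 * math.pi * 6 * ti) + 0.5 * math.cos(2 * math.pi * 40 * ti)
             for ti in t]
```

On these synthetic traces the coupled signal yields a markedly higher coupling value than the uncoupled one, mirroring the group difference reported between ELF-MF-exposed and untreated AD mice.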
Microfabrication Method for Amorphous Wires GMI Magnetic Sensors
ZHANG Bo, WEN Xiaolong, WAN Yadong, ZHANG Chao, LI Jianhua
2025, 47(11): 4648-4654.   doi: 10.11999/JEIT250338
[Abstract](79) [FullText HTML](49) [PDF 5408KB](10)
Abstract:
  Objective  Compared with amorphous ribbons and thin films, amorphous wires exhibit superior Giant MagnetoImpedance (GMI) performance, making them promising materials for GMI magnetic sensors. Their flexible and heterogeneous morphology, however, complicates precise positioning during device fabrication. Additionally, the poor wettability of amorphous wires hinders control of contact resistance during soldering, often resulting in inconsistent device performance. This study proposes a microfabrication method for GMI magnetic sensors based on amorphous wires. Through-glass vias are employed as alignment markers, and auxiliary fixtures are used to accurately position and secure the wires on a glass wafer. Using photolithography and electroplating, bonding pads are fabricated to establish reliable electrical interconnections between the wires and the pads, enabling device-level processing and integration. A winding machine is then applied to wind the signal pickup coil on the device surface, completing fabrication of the GMI magnetic sensor. This approach avoids deformation and stress accumulation caused by direct coil winding on the amorphous wires, thereby improving manufacturability and ensuring stable performance of amorphous wire-based GMI magnetic sensors.  Methods  A glass wafer is employed as the substrate, owing to its high surface flatness and mechanical rigidity, which provide stable support for the flexible amorphous wire structure. To mitigate deformation caused by wire flexibility during winding, a microelectronics process integration scheme based on the glass wafer is implemented. A metal seed layer is first deposited by magnetron sputtering. Ultraviolet lithography and electroplating are then applied to form a high-precision array of electrical interconnection pads on the wafer surface. 
The ends of the amorphous wire are threaded through through-glass vias fabricated along the wafer edge by laser ablation and subsequently secured, ensuring accurate positioning over the bonding pad area while maintaining the natural straight form of the wire (Fig. 4). The amorphous wire is interconnected with the pads using electroplating. Standardized devices with an amorphous wire-glass substrate-interconnection structure are obtained by wafer dicing. After the microstructure of the amorphous wire and substrate is established, a winding machine is used to wind enameled wire onto the structure to form the signal pickup coil. The number of turns and spacing are precisely controlled according to the design. The sensor structure with the wound pickup coil is mounted on a Printed Circuit Board (PCB) with bonding pads. Finally, flip-chip bonding is performed to achieve secondary interconnection between the sensor structure and the PCB, completing fabrication of the sensor device.  Results and Discussions  The fabricated sensor device based on microelectronics processes is shown in Fig. 6(a). A 40 μm diameter enameled wire is uniformly wound on the substrate surface to form the signal pickup coil, with the number of turns and spacing precisely controlled by programmed parameters of the winding machine. As shown in the magnified view in Fig. 6(b), the bonding pad areas at both ends of the amorphous wire are completely covered by a copper layer. The copper plating defines the electrical connection area of the amorphous wire, while polyimide provides reliable fixation and surface protection on the substrate. The performance of five fabricated amorphous wire GMI magnetic sensors is presented in Fig. 13 and Table 1. The standard deviation of sensor output ranges from 0.0163 to 0.0272, and the sensors exhibit similar sensitivity, indicating good consistency. The output characteristic curves are shown in Fig. 14.
Fitting analysis shows that both the Pearson correlation coefficient and the coefficient of determination are close to 1, demonstrating excellent linearity. When a 1 MHz excitation signal is applied to the amorphous wire, the output voltage exhibits a linear relationship with the external magnetic field within the range of –1 Oe to +1 Oe, with a sensitivity of 5.7 V/Oe. The magnetic noise spectrum, measured inside a magnetic shielding barrel, is shown in Fig. 15. The results indicate that the magnetic noise level of the sensor is approximately 55 pT/√Hz.  Conclusions  A fabrication method for amorphous wire-based GMI magnetic sensors is proposed using a glass substrate integration process. The sensor is constructed through microfabrication of a glass substrate-amorphous wire microstructure. The method is characterized by three features: (1) highly reliable interconnections between the amorphous wire and bonding pads are established by electroplating, yielding a 10 mm × 0.6 mm × 0.5 mm microstructure with fixed amorphous wires; (2) a signal pickup coil is precisely wound on the microstructure surface with a winding machine, ensuring accurate control of coil turns and spacing; and (3) electrical connection and circuit integration with a PCB are completed by flip-chip bonding. Compared with conventional amorphous wire GMI sensors, this approach provides two technical advantages. The microfabrication interconnection process reduces contact resistance fluctuations, addressing sensor performance dispersion. In addition, the combination of conventional winding and microelectronics techniques ensures device consistency while avoiding the high cost of full-process microfabrication. This method improves process compatibility and manufacturing repeatability, offering a practical route for engineering applications of GMI magnetic sensors.
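The linearity assessment described above (least-squares fit of output voltage versus applied field, with Pearson correlation and coefficient of determination) can be sketched as follows. The –1 Oe to +1 Oe range and the 5.7 V/Oe slope are taken from the reported results, but the data points here are synthetic, noise-free values for illustration only.

```python
import math

def linear_fit(field_oe, volts):
    """Least-squares fit volts ~ a*field + b; returns slope (V/Oe), intercept, Pearson r, R^2."""
    n = len(field_oe)
    mx = sum(field_oe) / n
    my = sum(volts) / n
    sxx = sum((x - mx) ** 2 for x in field_oe)
    sxy = sum((x - mx) * (y - my) for x, y in zip(field_oe, volts))
    syy = sum((y - my) ** 2 for y in volts)
    a = sxy / sxx
    b = my - a * mx
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r, r * r  # for simple linear regression, R^2 equals r^2

# Synthetic sweep over the reported linear range (-1 Oe to +1 Oe),
# with a hypothetical small offset voltage of 0.02 V.
field = [-1.0 + 0.1 * i for i in range(21)]
volts = [5.7 * h + 0.02 for h in field]
slope, intercept, r, r2 = linear_fit(field, volts)
```

For simple linear regression the coefficient of determination is exactly the square of the Pearson coefficient, which is why both approach 1 together in the reported fits.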
Research and Design of a Ballistocardiogram-Based Heart Rate Variability (HRV) Monitoring Device Integrated into Pilot Helmets
ZHAO Yanpeng, LI Falin, LI Xuan, YU Haibo, CAO Zhengtao, ZHANG Yi
2025, 47(11): 4655-4664.   doi: 10.11999/JEIT250342
[Abstract](324) [FullText HTML](250) [PDF 2666KB](22)
Abstract:
  Objective  Conventional Heart Rate Variability (HRV) monitoring in aviation is limited by bulky wearable devices that require direct skin contact, are prone to electromagnetic interference during flight, and suffer from electrode displacement during high-G maneuvers. These constraints hinder continuous physiological monitoring, which is critical for flight safety. This study presents a non-contact monitoring approach integrated into pilot helmets, utilizing BallistoCardioGram (BCG) technology to detect cardiac mechanical activity via helmet-mounted inertial sensors. The objective is to establish a novel physiological monitoring paradigm that eliminates the need for skin-electrode interfaces while achieving measurement accuracy suitable for aviation operational standards.  Methods  Hardware Configuration: A patented BCG sensing module is embedded within the occipital stabilization system of flight protective helmets. Miniaturized, high-sensitivity inertial sensors interface with proprietary signal conditioning circuits that execute a three-stage physiological signal refinement process. First, primary analog amplification scales microvolt-level inputs to measurable voltage ranges. Second, a fourth-order Butterworth bandpass filter (0.5~20 Hz) isolates cardiac mechanical signatures. Third, analog-to-digital conversion quantizes the signals at a 250 Hz sampling rate. Physical integration complies with military equipment standards for helmet structural integrity and ergonomic performance, ensuring full compatibility with existing flight gear without compromising protection or pilot comfort during extended missions. Computational Framework: A multi-layer signal processing architecture is implemented to extract physiological features. Raw BCG signals undergo five-level discrete wavelet transformation using Daubechies-4 basis functions, effectively separating cardiac components from respiratory modulation and motion-induced artifacts.
J-wave identification is achieved through dual-threshold detection: morphological amplitudes exceeding three times the local baseline standard deviation and temporal positioning within 200 ms sliding analysis windows. Extracted J-J intervals are treated as functional analogs of ElectroCardioGram (ECG)-derived R-R intervals. Time-domain HRV metrics are computed as follows: (1) Standard Deviation of NN intervals (SDNN), representing overall autonomic modulation; (2) Root Mean Square of Successive Differences (RMSSD), indicating parasympathetic activity; (3) Percentage of adjacent intervals differing by more than 50 ms (pNN50). Frequency-domain analysis applies Fourier transformation to quantify Low-Frequency (LF: 0.04~0.15 Hz) and High-Frequency (HF: 0.15~0.4 Hz) spectral powers. The LF/HF ratio is used to assess sympathetic-parasympathetic balance. The entire processing pipeline is optimized for real-time execution under in-flight operational conditions.  Results and Discussions  System validation is conducted under simulated flight conditions to evaluate physiological monitoring performance. Signal acquisition is found to be reliable across static, turbulent, and high-G scenarios, with consistent capture of BCG waveforms. Quantitative comparisons with synchronized ECG recordings show strong agreement between measurement modalities: (1) SDNN: 95.80%; (2) RMSSD: 94.08%; (3) LF/HF ratio: 92.86%. These results demonstrate that the system achieves physiological measurement equivalence to established clinical standards. Artifact suppression is effectively performed by the wavelet-based signal processing framework, which maintains waveform integrity under conditions of aircraft vibration and rapid gravitational transition—conditions where conventional ECG monitoring often fails. Among tested sensor placements, the occipital position exhibits the highest signal-to-noise ratio. 
Operational stability is maintained during continuous 6-hour monitoring sessions, with no observed signal degradation. This long-duration robustness indicates suitability for extended flight operations. Validation results indicate that the BCG-based approach addresses three primary limitations associated with ECG systems in aviation. The removal of electrode-skin contact mitigates the risk of contact dermatitis during prolonged wear. Non-contact sensing eliminates susceptibility to electromagnetic interference generated by radar and communication systems. Furthermore, mechanical coupling ensures signal continuity during abrupt gravitational changes, which typically displace ECG electrodes and cause signal dropout. The wavelet decomposition method is particularly effective in attenuating rotorcraft harmonic vibrations and turbulence-induced high-frequency noise. Autonomic nervous system modulation is reliably captured through pulse transit time variability, which aligns with neurocardiac regulation indices derived from ECG. Two operational considerations are identified. First, respiratory coupling under hyperventilation may introduce artifacts that require additional filtering. Second, extreme cervical flexion exceeding 45 degrees may degrade signal quality, indicating the potential benefit of redundant sensor configurations under such conditions.  Conclusions  This study establishes a functional, helmet-integrated BCG monitoring system capable of delivering medical-grade HRV metrics without compromising flight safety protocols. The technology represents a shift from contact-based to non-contact physiological monitoring in aviation settings. 
Future system development will incorporate: (1) Infrared eye-tracking modules to assess blink interval variability for objective fatigue evaluation; (2) Dry-contact electroencephalography sensors to quantify prefrontal cortex activity and assess cognitive workload; (3) Multimodal data fusion algorithms to generate unified indices of physiological strain. The integrated framework aims to enable real-time pilot state awareness during critical operations such as aerial combat maneuvers, hypoxia exposure, and emergency responses. Further technology maturation will prioritize operational validation across diverse aircraft platforms and environmental conditions. System implementation remains fully compliant with military equipment specifications and is positioned for future translation to commercial aviation and human factors research. Broader applications include astronaut physiological monitoring during spaceflight missions and enhanced safety systems in high-performance motorsports.
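The time-domain HRV metrics defined in the Methods (SDNN, RMSSD, pNN50 over J-J intervals) can be sketched as below. The interval values are made up for illustration, and SDNN is computed here as the sample standard deviation; some definitions use the population form.

```python
import math

def hrv_time_domain(jj_ms):
    """Time-domain HRV from J-J intervals in milliseconds (analogs of ECG R-R intervals).

    Returns (SDNN, RMSSD, pNN50): sample standard deviation of the intervals,
    root mean square of successive differences, and the percentage of
    successive differences exceeding 50 ms.
    """
    n = len(jj_ms)
    mean = sum(jj_ms) / n
    sdnn = math.sqrt(sum((x - mean) ** 2 for x in jj_ms) / (n - 1))
    diffs = [b - a for a, b in zip(jj_ms, jj_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return sdnn, rmssd, pnn50

# Hypothetical J-J series (ms): successive differences are 10, -20, 60, -55,
# so two of the four differences exceed 50 ms (pNN50 = 50%).
sdnn, rmssd, pnn50 = hrv_time_domain([800, 810, 790, 850, 795])
```

The frequency-domain step described in the Methods then applies a Fourier transform to the (resampled) interval series and integrates power over the LF (0.04~0.15 Hz) and HF (0.15~0.4 Hz) bands to form the LF/HF ratio.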