Articles in press have been peer-reviewed and accepted; they are not yet assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Gate-level Side-Channel Protection Method Based on Hybrid-Order Masking
ZHAO Yiqiang, LI Zhengyang, ZHANG Qizhi, YE Mao, XIA Xianzhao, LI Yao, HE Jiaji
 doi: 10.11999/JEIT250198
[Abstract](0) [FullText HTML](0) [PDF 2344KB](0)
Abstract:
  Objective  Side-Channel Analysis (SCA) presents a significant threat to the hardware implementation of cryptographic algorithms. Among various sources of side-channel leakage, power consumption is particularly vulnerable due to its ease of extraction and interpretation, making power analysis one of the most prevalent SCA techniques. To address this threat, masking has been widely adopted as a countermeasure in hardware security. Masking introduces randomness to disrupt the correlation between sensitive intermediate data and observable side-channel information, thereby enhancing resistance to SCA. However, existing masking approaches face notable limitations. Algorithm-level masking requires comprehensive knowledge of algorithmic structure and does not reliably strengthen hardware-level security. Masking applied at the Register Transfer Level (RTL) is prone to structural alterations during hardware synthesis and is constrained by the need for logic optimization, limiting scalability. Gate-level masking offers certain advantages, yet such approaches depend on precise localization of leakage and often incur unpredictable overhead after deployment. Furthermore, many masking schemes remain susceptible to higher-order SCA techniques. To overcome these limitations, there is an urgent need for gate-level masking strategies that provide robust security, maintain acceptable overhead, and support scalable deployment in practical hardware systems.  Methods  To address advances in SCA techniques and the limitations of existing masking schemes, this paper proposes a hybrid-order masking method. The approach is specifically designed for gate-level netlist circuits to provide fine-grained and precise protection. 
By considering the structural characteristics of encryption algorithm circuits, the method integrates masking structures of different orders according to circuit requirements, introduces randomness to sensitive variables, and substantially improves resistance to side-channel attacks. In parallel, the approach accounts for potential hardware overhead to maintain practical feasibility. Theoretical security is verified through statistical evaluation combined with established SCA techniques. An automated deployment framework is developed to facilitate rapid and efficient application of the masking scheme. The framework incorporates functional modules for circuit topology analysis, leakage identification, and masking deployment, supporting a complete workflow from circuit analysis to masking implementation. The security performance of the masked design is further assessed through correlation-based evaluation methods and simulation.  Results and Discussions  The automated masking deployment tool is applied to implement gate-level masking for Advanced Encryption Standard (AES) circuits. The security of the masked design is evaluated through first-order and higher-order power analysis in simulation. The correlation coefficient and Minimum Traces to Disclosure (MTD) parameter serve as the primary evaluation metrics, both widely used in side-channel security assessment. The MTD reflects the number of power traces required to extract the encryption key from the circuit. In first-order power analysis, the unmasked design exhibits a maximum correlation value of 0.49 for the correct key (Fig. 6(a)), and the correlation curve for the correct key is clearly separated from those of incorrect keys. By contrast, the masked design reduces the correlation to approximately 0.02 (Fig. 6(b)), with no evidence of successful key extraction. 
Based on the MTD parameter, the unmasked design requires 116 traces for key disclosure, whereas the masked design requires more than 200,000 traces, an improvement exceeding 1724-fold (Fig. 7). Higher-order power analysis yields consistent results. The unmasked design demonstrates an MTD of 120 traces, indicating clear vulnerability, whereas the masked design maintains a maximum correlation near 0.02 (Fig. 8) and an MTD greater than 200,000 traces (Fig. 9), corresponding to a 1667-fold improvement. In terms of hardware overhead, the masked design shows a 1.2% increase in area and a 41.1% reduction in maximum operating frequency relative to the unmasked circuit.  Conclusions  This study addresses the limitations of existing masking schemes by proposing a hybrid-order masking method that disrupts the conventional definition of protection order. The approach safeguards sensitive data during cryptographic algorithm operations and enhances resistance to SCA in gate-level hardware designs. An automated deployment tool is developed to efficiently integrate vulnerability identification and masking protection, supporting practical application by hardware designers. The proposed methodology is validated through correlation analysis across different orders. The results demonstrate that the method improves resistance to power analysis attacks by more than 1600-fold and achieves significant security enhancement with minimal hardware overhead compared to existing masking techniques. This work advances the current knowledge of masking strategies and provides an effective approach for improving hardware-level security. Future research will focus on extending the method to broader application scenarios and enhancing performance through algorithmic improvements.
Multi-objective Remote Sensing Product Production Task Scheduling Algorithm Based on Double Deep Q-Network
ZHOU Liming, YU Xi, FAN Minghu, ZUO Xianyu, QIAO Baojun
 doi: 10.11999/JEIT250089
[Abstract](124) [FullText HTML](33) [PDF 3089KB](27)
Abstract:
  Objective  Remote sensing product generation is a multi-task scheduling problem influenced by dynamic factors, including resource contention and real-time environmental changes. Achieving adaptive, multi-objective, and efficient scheduling remains a central challenge. To address this, a Multi-Objective Remote Sensing scheduling algorithm (MORS) based on a Double Deep Q-Network (DDQN) is proposed. A subset of candidate algorithms is first identified using a value-driven, parallel-executable screening strategy. A deep neural network is then designed to perceive the characteristics of both remote sensing algorithms and computational nodes. A reward function is constructed by integrating algorithm execution time and node resource status. The DDQN is employed to train the model to select optimal execution nodes for each algorithm in the processing subset. This approach reduces production time and enables load balancing across computational nodes.  Methods  The MORS scheduling process comprises two stages: remote sensing product processing and screening, followed by scheduling model training and execution. A time-triggered strategy is adopted, whereby all newly arrived remote sensing products within a predefined time window are collected and placed in a task queue. For efficient scheduling, each product is parsed into a set of executable remote sensing algorithms. Based on the model illustrated in Figure 2, the processing unit extracts all constituent algorithms to form an algorithm set. An optimal subset is then selected using a value-driven parallel-executable screening strategy. The scheduling process is modeled as a Markov decision process, and the DDQN is applied to assign each algorithm in the selected subset to the optimal virtual node.  Results and Discussions  Simulation experiments use varying numbers of tasks and nodes to evaluate the performance of MORS. 
Comparative analyses are conducted against several baseline scheduling algorithms, including First-Come, First-Served (FCFS), Round Robin (RR), Genetic Algorithm (GA), Deep Q-Network (DQN), and Dueling Deep Q-Network (Dueling DQN). The results demonstrate that MORS outperforms all other algorithms in terms of scheduling efficiency and adaptability in remote sensing task scheduling. The learning rate, a critical hyperparameter in DDQN, influences the step size for parameter updates during training. When the learning rate is set to 0.00001, the model fails to converge even after 5,000 iterations due to extremely slow optimization. A learning rate of 0.0001 achieves a balance between convergence speed and training stability, avoiding oscillations associated with overly large learning rates (Figure 3 and Figure 4). The corresponding DDQN loss values show a steady decline, reflecting effective optimization and gradual convergence. In contrast, the unpruned DDQN initially declines sharply but plateaus prematurely, failing to reach optimal convergence. DDQN without soft updates shows large fluctuations in loss and remains unstable during later training stages, indicating that the absence of soft updates impairs convergence (Figure 5). Regarding decision quality, the reward values of DDQN gradually approach 25 in the later training stages, reflecting stable convergence and strong decision-making performance. Conversely, DDQN models without pruning or soft updates display unstable reward trajectories, particularly the latter, which exhibits pronounced reward fluctuations and slower convergence (Figure 6). A comparison of DQN, Dueling DQN, and DDQN reveals that all three show decreasing training loss, suggesting continuous optimization (Figure 7). However, the reward curve of Dueling DQN shows higher volatility and reduced stability (Figure 8). 
To further assess scalability, four sets of simulation experiments use 30, 60, 90, and 120 remote sensing tasks, with the number of virtual machine nodes fixed at 15. Each experimental configuration is evaluated using 100 Monte Carlo iterations to ensure statistical robustness. DDQN consistently shows superior performance under high-concurrency conditions, effectively managing increased scheduling pressure (Table 7). In addition, DDQN exhibits lower standard deviations in node load across all task volumes, reflecting more balanced resource allocation and reduced fluctuation in system utilization (Table 8 and Table 9).  Conclusions  The proposed MORS algorithm addresses the variability and complexity inherent in remote sensing task scheduling. Experimental results demonstrate that MORS not only improves scheduling efficiency but also significantly reduces production time and achieves balanced allocation of node resources.
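The DDQN update at the core of the training described above decouples action selection (online network) from action evaluation (target network). The dict-based Q stand-ins and the discount factor below are illustrative, not the paper's neural architecture or hyperparameters.

```python
GAMMA = 0.9  # discount factor (illustrative)

def ddqn_target(reward, next_state, online_q, target_q, done):
    # Double-DQN target: the online net selects the greedy action,
    # the target net evaluates it, which curbs Q-value overestimation.
    if done:
        return reward
    best_action = max(online_q[next_state], key=online_q[next_state].get)
    return reward + GAMMA * target_q[next_state][best_action]

def soft_update(target_params, online_params, tau=0.01):
    # Polyak soft update: target <- tau * online + (1 - tau) * target,
    # the stabilizer whose removal is shown to impair convergence.
    return {k: tau * online_params[k] + (1 - tau) * target_params[k]
            for k in target_params}

online_q = {"s1": {"a0": 1.0, "a1": 2.0}}
target_q = {"s1": {"a0": 0.5, "a1": 0.8}}
y = ddqn_target(1.0, "s1", online_q, target_q, done=False)  # 1.0 + 0.9 * 0.8
```

In the scheduling setting, a "state" would encode node resource status and algorithm features, and an "action" would assign an algorithm to a virtual node; the reward mixes execution time and load balance as described above.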
Estimation Method of Target Propeller Parameters under Low Signal-to-noise Ratio
HAN Chuang, LENG Bing, LAN Chaofeng, XING Bowen
 doi: 10.11999/JEIT240790
[Abstract](60) [FullText HTML](29) [PDF 5791KB](12)
Abstract:
  Objective  Accurate estimation of underwater propeller parameters—such as blade number, blade length, and rotational speed—is critical for target identification in marine environments. However, low Signal-to-Noise Ratio (SNR) conditions, caused by complex underwater clutter and ambient noise, substantially degrade the performance of conventional micro-Doppler feature extraction methods. Existing approaches, including Fourier Transform (FT), wavelet analysis, and Hilbert-Huang Transform (HHT), are limited in handling non-stationary signals and are highly susceptible to noise, leading to unreliable parameter estimation. To address these limitations, this study proposes a method that integrates Complex Variational Mode Decomposition (CVMD) for signal denoising with Orthogonal Matching Pursuit (OMP) for sparse parameter estimation. The combined approach improves robustness against noise while maintaining computational efficiency. This method contributes to advancing underwater acoustic target recognition in interference-rich environments and offers a theoretical basis for improving the reliability of marine detection systems.  Methods  The proposed method integrates CVMD and OMP to improve the estimation of propeller parameters in low-SNR environments. The approach consists of three sequential phases: signal decomposition and denoising, time-frequency feature extraction, and sparse parameter estimation. This structure enhances robustness to noise while maintaining computational efficiency. CVMD extends conventional Variational Mode Decomposition (VMD) to the complex domain, enabling adaptive decomposition of propeller echo signals into Intrinsic Mode Functions (IMFs) with preserved spectral symmetry. Unlike standard VMD, which cannot process complex-valued signals directly, CVMD treats the real and imaginary parts of the noisy signal separately. 
The decomposition is formulated as a constrained optimization problem, where IMFs are iteratively extracted by minimizing the total bandwidth of all modes. A correlation-based thresholding scheme is then used to identify and discard noise-dominated IMFs. The remaining signal-related IMFs are reconstructed to obtain a denoised signal, effectively isolating micro-Doppler features from background clutter. Time-frequency analysis is subsequently applied to the denoised signal to extract key scintillation parameters, including blade parity, scintillation intervals, and the maximum instantaneous micro-Doppler frequency. These parameters are used as prior information to constrain the parameter search space and reduce computational burden. Blade parity, inferred from the symmetry of the time-frequency distribution, narrows the candidate blade number range by half. Scintillation intervals and frequency bounds are also used to define physical constraints for sparse estimation. A sparse dictionary is constructed using Sinusoidal Frequency-Modulated (SFM) atoms, each corresponding to a candidate blade number. The OMP algorithm iteratively selects the atom most correlated with the residual signal, updates the sparse coefficient vector, and refines the residual until convergence. Incorporating prior information into dictionary design significantly reduces its dimensionality, transforming a multi-parameter estimation problem into an efficient single-parameter search. This step allows precise estimation of the blade number with minimal computational cost. Once the blade number is determined, the blade length and rotational speed are derived analytically using the relationships between the micro-Doppler frequency, scintillation period, and geometric parameters of the propeller.  Results and Discussions  The proposed CVMD-OMP framework demonstrates robust performance in propeller parameter estimation under low-SNR conditions, as verified through comprehensive simulations. 
The denoising efficacy of CVMD is illustrated by the reconstruction of distinct time-frequency features from heavily noise-corrupted propeller echoes (Fig. 10). By decomposing the complex-valued signal into IMFs and retaining only signal-dominant components, CVMD achieves a 12.4 dB improvement in SNR and reduces the Mean Square Error (MSE) to 0.009 at SNR = −10 dB, outperforming conventional methods such as EMD-WT and CEEMDAN-WT (Table 3). Time-frequency analysis of the denoised signal reveals clear periodic scintillation patterns (Fig. 11), which enable accurate extraction of blade parity and scintillation intervals. Guided by these prior features, the OMP algorithm achieves 91.9% accuracy in blade number estimation at SNR = −10 dB (Table 4). Accuracy improves progressively with increasing SNR, reaching 98% at SNR = 10 dB, highlighting the method’s adaptability to varying noise levels. The sparse dictionary, refined through prior-informed dimensionality reduction, maintains high precision while minimizing computational complexity. Comparative evaluations confirm that OMP outperforms CoSaMP and Subspace Pursuit (SP) in both estimation accuracy and computational efficiency. The execution time is reduced to 1.73 ms for single-parameter estimation (Fig. 15, Table 5). Parameter estimation consistency is further validated through the calculation of blade length and rotational speed. At SNR = −10 dB, the Mean Absolute Error (MAE) for blade length is 0.021 m, and 0.31 rad/s for rotational speed (Table 6). Both errors decrease significantly with improved SNR, demonstrating the method’s robustness across diverse noise conditions. The framework remains stable in multi-blade configurations, with extracted time-frequency characteristics closely matching theoretical expectations (Figs. 2 and 3). The integration of CVMD and OMP effectively balances accuracy and computational efficiency under low-SNR conditions. 
By leveraging prior-informed dimensionality reduction, the framework achieves a 90% reduction in computational load relative to conventional techniques. Future research will extend this framework to multi-target environments and validate its performance using real-world underwater acoustic datasets.  Conclusions  This study addresses the challenge of estimating underwater propeller parameters under low SNR conditions by proposing a novel framework that integrates CVMD and OMP. CVMD demonstrates strong capability in suppressing noise while preserving key micro-Doppler features, allowing reliable extraction of target signatures from severely corrupted signals. By incorporating time-frequency characteristics as prior knowledge, OMP enables accurate and efficient blade number estimation, substantially reducing computational complexity. The proposed framework shows high adaptability to varying noise levels and propeller configurations, ensuring robust performance in complex underwater environments. Its balance between estimation accuracy and computational efficiency supports real-time application in acoustic target recognition. The consistency of results with theoretical models further supports the method’s physical interpretability and practical relevance. Future work will extend this approach to multi-target scenarios and validate its effectiveness using experimental acoustic datasets, advancing the deployment of model-driven methods in real-world marine detection systems.
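The prior-constrained single-parameter search that OMP reduces to can be sketched as follows. The sinusoidal-FM atom, modulation index, rotation rate, and noise level are simplified assumptions for illustration; the paper's SFM dictionary and underwater echo model differ in detail.

```python
import math
import random

def sfm_atom(n_samples, blades, omega, beta=5.0):
    # Illustrative real-valued sinusoidal-FM atom for a candidate blade count.
    return [math.cos(beta * math.sin(blades * omega * t)) for t in range(n_samples)]

def match_score(signal, atom):
    # OMP atom-selection rule: magnitude of the inner product with the residual
    # (here the raw signal, since a single atom is selected).
    return abs(sum(s * a for s, a in zip(signal, atom)))

def estimate_blades(signal, candidates, omega):
    # Prior knowledge (blade parity, frequency bounds) would shrink `candidates`;
    # the search then reduces to picking the best-matching atom.
    n = len(signal)
    return max(candidates, key=lambda b: match_score(signal, sfm_atom(n, b, omega)))

rng = random.Random(1)
omega = 0.01  # normalized rotation rate (illustrative)
echo = [s + rng.gauss(0.0, 0.1) for s in sfm_atom(1000, 4, omega)]
blades = estimate_blades(echo, [2, 3, 4, 5, 6, 7], omega)
```

Once the blade number is fixed, blade length and rotational speed follow analytically from the maximum micro-Doppler frequency and the scintillation period, as the abstract describes.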
Dynamic Analysis and Synchronization Control of Extremely Simple Cyclic Memristive Chaotic Neural Network
LAI Qiang, QIN Minghong
 doi: 10.11999/JEIT250212
[Abstract](31) [FullText HTML](7) [PDF 5832KB](19)
Abstract:
  Objective  Memristors are considered promising devices for the construction of artificial synapses because their unique nonlinear and non-volatile properties effectively mimic the functions and mechanisms of biological synapses. These features have made memristors a research focus in brain-inspired science. Memristive neural networks, composed of memristive neurons or memristive synapses, constitute a class of biomimetic artificial neural networks that exhibit dynamic behaviors more closely aligned with those of biological neural systems and provide more plausible biological interpretations. Since the concept of the memristive neural network was proposed, extensive pioneering research has been conducted, revealing several critical issues that require further exploration. Although current memristive neural networks can generate complex dynamic behaviors such as chaos and multistability, these effects are often achieved at the cost of increased network complexity or the requirement for specialized memristive characteristics. Therefore, the systematic exploration of simple memristive neural networks that can produce diverse dynamic behaviors, the proposal of practical design strategies, and the development of efficient, precise control schemes remain of considerable research value.  Methods  This paper proposes a chaotification method for an Extremely Simple Cyclic Memristive Chaotic Neural Network (ESCMCNN) that contains only unidirectional synaptic connections based on memristors. Using a three-node neural network as an example, a class of memristive cyclic neural networks with simple structures and rich dynamic behaviors is constructed. Numerical analysis tools, including bifurcation diagrams, basins of attraction, phase plane diagrams, and Lyapunov exponents, are employed to investigate the networks’ diverse bifurcation processes, multiple types of multistability, and multi-variable signal amplitude control.
Electronic circuit experiments are used to validate the feasibility of the proposed networks. Finally, a novel multi-power reaching law is developed to achieve chaotic synchronization within fixed time.  Results and Discussions  For a three-node cyclic neural network initially in a periodic state, two network chaotification methods—full-synaptic memristivation and multi-node extension—are proposed using magnetically controlled memristors (Fig. 1). Phase plane diagrams illustrate the chaotic attractors generated by these networks (Fig. 2), confirming the feasibility of the proposed methods. Using network (B) as an example, numerical analysis tools are utilized to study its diverse dynamic evolution processes (Fig. 5, Fig. 6, Fig. 7), various forms of multistability (Fig. 8, Fig. 9), and multi-variable amplitude control (Fig. 10). The physical realization of network (B) is further demonstrated through circuit experiments (Fig. 11, Fig. 12). Additionally, the effectiveness of the fixed-time synchronization control strategy for network (B) is verified through numerical simulations (Fig. 13, Fig. 14).  Conclusions  This paper proposes a construction method for the ESCMCNN capable of generating rich dynamic behaviors. A series of ESCMCNNs is successfully designed based on a three-node neural network in a periodic state. The dynamic evolution of the ESCMCNN as a function of memristive parameters is investigated using numerical tools, including single- and dual-parameter bifurcation diagrams and Lyapunov exponents. Under different initial conditions, the ESCMCNN exhibits various forms of multistability, including the coexistence of point attractors with periodic attractors, and point attractors with chaotic attractors. The study further demonstrates that the oscillation amplitudes of multiple variables in the ESCMCNN are strongly dependent on the memristive coupling strength. The reliability of these numerical results is confirmed through electronic circuit experiments. 
In addition, a novel multi-power reaching law is proposed to achieve fixed-time synchronization of the network, and its feasibility and effectiveness are validated through simulation tests.
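A memristive cyclic network of the kind studied here can be prototyped in a few lines. The cubic memductance, tanh activations, gain, and step size below are common textbook choices assumed for illustration, not the paper's exact ESCMCNN model.

```python
import math

def memductance(phi, a=1.0, b=0.1):
    # Magnetically controlled memristor: W(phi) = a + 3*b*phi^2 (a common polynomial model).
    return a + 3.0 * b * phi * phi

def step(state, dt=0.001, k=1.8):
    # One Euler step of a three-node cyclic network in which each node drives
    # the next; one synapse (3 -> 1) is memristive, and phi integrates its input.
    x1, x2, x3, phi = state
    dx1 = -x1 + k * memductance(phi) * math.tanh(x3)  # memristive synapse 3 -> 1
    dx2 = -x2 + k * math.tanh(x1)                     # ordinary synapse 1 -> 2
    dx3 = -x3 + k * math.tanh(x2)                     # ordinary synapse 2 -> 3
    dphi = math.tanh(x3)                              # memristor flux update
    return (x1 + dt * dx1, x2 + dt * dx2, x3 + dt * dx3, phi + dt * dphi)

state = (0.1, 0.0, 0.0, 0.0)
for _ in range(20000):
    state = step(state)
```

Bifurcation diagrams and Lyapunov exponents would be computed by sweeping the memristive parameters (here `k`, `a`, `b`) over many such trajectories from varied initial conditions, which is how the coexisting attractors are exposed.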
Sparse Channel Estimation and Array Blockage Diagnosis for Non-Ideal RIS-Assisted MIMO Systems
LI Shuangzhi, LEI Haojie, GUO Xin
 doi: 10.11999/JEIT241108
[Abstract](71) [FullText HTML](18) [PDF 2088KB](5)
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RISs) offer a promising approach to enhance Millimeter-Wave (mmWave) Multiple-Input Multiple-Output (MIMO) systems by dynamically manipulating wireless propagation. However, practical deployments are challenged by hardware faults and environmental blockages (e.g., dust or rain), which impair Channel State Information (CSI) accuracy and reduce Spectral Efficiency (SE). Most existing studies either overlook the interdependence between the CSI and blockage vector or fail to leverage the dual sparsity of multipath channels and blockage patterns. This study proposes a joint sparse channel estimation and blockage diagnosis scheme to overcome these limitations, thereby enabling reliable beamforming and enhancing system robustness in non-ideal RIS-assisted mmWave MIMO environments.  Methods  A third-order parallel factor (PARAFAC) decomposition model is constructed for the received signals using a tensor-based signal representation. The intrinsic relationship between mmWave channel parameters and the blockage vector is exploited to estimate spatial angular frequencies at the User Equipment (UE) and Base Station (BS) using Orthogonal Matching Pursuit (OMP). Based on these frequencies, a coupled observation matrix is formed to jointly capture residual channel parameters and blockage vector information. This matrix is reformulated as a Least Absolute Shrinkage and Selection Operator (LASSO) problem, which is solved using the Alternating Direction Method of Multipliers (ADMM) to estimate the blockage vector. The remaining channel parameters are then recovered using sparse reconstruction techniques by leveraging their inherent sparsity. Iterative refinement updates both the blockage vector and channel parameters, ensuring convergence under limited pilot overhead conditions.  Results and Discussions  For a non-ideal RIS-assisted mmWave MIMO system (Fig. 1), a signal transmission framework is designed (Fig. 
2), in which the received signals are represented as a third-order tensor. Leveraging the dual-sparsity of multipath channels and the blockage vector, a joint estimation scheme is developed (Table 2), enabling effective parameter decoupling through tensor-based parallel factor decomposition and iterative optimization. Simulation results show that the proposed scheme achieves superior performance in both channel estimation and blockage diagnosis compared with baseline methods by fully exploiting dual-sparsity characteristics (Fig. 3). SE analysis confirms the detrimental effect of blockages on system throughput and highlights that the proposed scheme improves SE by compensating for blockage-induced impairments (Fig. 4). The method also demonstrates strong estimation accuracy under reduced pilot overhead (Fig. 5) and improved robustness as the number of blocked RIS elements increases (Fig. 6). A decline in spatial angular frequency estimation is observed with fewer UE antennas, which negatively affects overall performance; however, estimation stabilizes as antenna count increases (Fig. 7). Moreover, when Non-Line-of-Sight (NLoS) path contributions decrease, the scheme exhibits enhanced performance due to improved resolution between Line-of-Sight (LoS) and NLoS components (Fig. 8).  Conclusions  This study proposes a joint channel estimation and blockage diagnosis scheme for non-ideal RIS-assisted mmWave MIMO systems, based on the dual sparsity of multipath channels and blockage vectors. Analysis of the tensor-based parallel factor decomposition model reveals that the estimation of spatial angular frequencies at the UE and BS is unaffected by blockage conditions. The proposed scheme accounts for the contributions of NLoS paths, enabling accurate decoupling of residual channel parameters and blockage vector across different propagation paths. 
Simulation results confirm that incorporating NLoS path information improves both channel estimation accuracy and blockage detection. Compared with existing methods, the proposed approach achieves superior performance in both aspects. In practical scenarios, real-time adaptability may be challenged if blockage states vary more rapidly than channel characteristics. Future work will focus on enhancing the scheme’s responsiveness to dynamic blockage conditions.
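The LASSO step solved with ADMM alternates a quadratic update, a soft-thresholding (the l1 proximal operator), and a dual ascent. The sketch below uses the identity sensing matrix for brevity, so each coordinate decouples; with the paper's coupled observation matrix the x-update becomes a linear solve, but the structure is the same.

```python
def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t.
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1) for x in v]

def lasso_admm_denoise(b, lam=0.5, rho=1.0, iters=100):
    """ADMM for min 1/2||x - b||^2 + lam*||z||_1 s.t. x = z
    (A = I for brevity; a general A only changes the x-update)."""
    n = len(b)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(iters):
        x = [(bi + rho * (zi - ui)) / (1.0 + rho) for bi, zi, ui in zip(b, z, u)]
        z = soft_threshold([xi + ui for xi, ui in zip(x, u)], lam / rho)
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z

b = [3.0, 0.1, -2.0, 0.05, 1.0]
z = lasso_admm_denoise(b)  # small entries are driven exactly to zero
```

The exact zeros produced by the thresholding are what make the recovered blockage vector sparse, mirroring the assumption that only a few RIS elements are blocked.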
LFTA: Lightweight Feature Extraction and Additive Attention-based Feature Matching Method
GUO Zhiqiang, WANG Zihan, WANG Yongsheng, CHEN Pengyu
 doi: 10.11999/JEIT250124
[Abstract](36) [FullText HTML](18) [PDF 4504KB](5)
Abstract:
  Objective  With the rapid development of deep learning, feature matching has advanced considerably, particularly in computer vision. This progress has led to improved performance in tasks such as 3D reconstruction, motion tracking, and image registration, all of which depend heavily on accurate feature matching. Nevertheless, current techniques often face a trade-off between accuracy and computational efficiency. Some methods achieve high matching accuracy and robustness but suffer from slow processing due to algorithmic complexity. Others offer faster processing but compromise matching accuracy, especially under challenging conditions such as dynamic scenes, low-texture environments, or large view-angle variations. The key challenge is to provide a balanced solution that ensures both accuracy and efficiency. To address this, this paper proposes a Lightweight Feature exTraction and matching Algorithm (LFTA), which integrates an additive attention mechanism within a lightweight architecture. LFTA enhances the robustness and accuracy of feature matching while maintaining the computational efficiency required for real-time applications.  Methods  LFTA utilizes a multi-scale feature extraction network designed to capture information from images at different levels of detail. A triple-exchange fusion attention mechanism merges information across multiple dimensions, including spatial and channel features, allowing the network to learn more robust feature representations. This mechanism improves matching accuracy, particularly in scenarios with sparse textures or large viewpoint variations. LFTA further integrates an adaptive Gaussian kernel to dynamically generate keypoint heatmaps. The kernel adjusts according to local feature strength, enabling accurate keypoint extraction in both high-response and low-response regions. 
To improve keypoint precision, a dynamic Non-Maximum Suppression (NMS) strategy is applied, which adapts to varying keypoint densities across different image regions. This approach reduces redundancy and improves detection accuracy. In the final stage, LFTA employs a lightweight module with an additive Transformer attention mechanism to refine feature matching. This module strengthens feature fusion while reducing computational complexity through depthwise separable convolutions. These operations substantially lower parameter count and computational cost without affecting performance. Through this combination of techniques, LFTA achieves accurate pixel-level matching with fast inference times, making it suitable for real-time applications.  Results and Discussions  The performance of LFTA is assessed through extensive experiments conducted on two widely used and challenging datasets: MegaDepth and ScanNet. These datasets offer diverse scenarios for evaluating the robustness and efficiency of feature matching methods, including variations in texture, environmental complexity, and viewpoint changes. The results indicate that LFTA achieves higher accuracy and computational efficiency than conventional feature matching approaches. On the MegaDepth dataset, an AUC@20° of 79.77% is attained, which is comparable to or exceeds state-of-the-art methods such as LoFTR. Notably, this level of performance is achieved while reducing inference time by approximately 70%, supporting the suitability of LFTA for practical, time-sensitive applications. When compared with other efficient methods, including Xfeat and Alike, LFTA demonstrates superior matching accuracy with only a marginal increase in inference time, proving its competitive performance in both accuracy and speed. The improvement in accuracy is particularly apparent in scenarios characterized by sparse textures or large viewpoint variations, where traditional methods often fail to maintain robustness. 
Ablation studies confirm the contribution of each LFTA component. Exclusion of the triple-exchange fusion attention mechanism results in a significant reduction in accuracy, indicating its function in managing complex feature interactions. Similarly, both the adaptive Gaussian kernel and dynamic NMS are found to improve keypoint extraction, emphasizing their roles in enhancing overall matching precision.  Conclusions  The LFTA algorithm addresses the long-standing trade-off between feature extraction accuracy and computational efficiency in feature matching. By integrating the triple-exchange fusion attention mechanism, adaptive Gaussian kernels, and lightweight fine-tuning strategies, LFTA achieves high matching accuracy in dynamic and complex environments while maintaining low computational requirements. Experimental results on the MegaDepth and ScanNet datasets demonstrate that LFTA performs well under typical feature matching conditions and shows clear advantages in more challenging scenarios, including low-texture regions and large viewpoint variations. Given its efficiency and robustness, LFTA is well suited for real-time applications such as Augmented Reality (AR), autonomous driving, and robotic vision, where fast and accurate feature matching is essential. Future work will focus on further optimizing the algorithm for high-resolution images and more complex scenes, with the potential integration of hardware acceleration to reduce computational overhead. The method could also be extended to other computer vision tasks, including image segmentation and object detection, where reliable feature matching is required.
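The parameter savings from the depthwise separable convolutions mentioned above can be checked with a quick count; the channel and kernel sizes are illustrative, not LFTA's actual layer shapes.

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k kernel per (input, output) channel pair.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise k x k kernel per input channel, then a 1x1 pointwise
    # convolution to mix information across channels.
    return c_in * k * k + c_in * c_out

standard = conv_params(64, 128, 3)                  # 64 * 128 * 9 = 73728
separable = depthwise_separable_params(64, 128, 3)  # 576 + 8192 = 8768
reduction = 1 - separable / standard                # roughly 88% fewer parameters
```

This order-of-magnitude reduction in parameters and multiply-accumulates is what lets the additive-attention matching module stay lightweight without sacrificing feature fusion.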
Bootstrapping Optimization Techniques for the FINAL Fully Homomorphic Encryption Scheme
ZHAO Xiufeng, WU Meng, SONG Weitao
 doi: 10.11999/JEIT241036
[Abstract](60) [FullText HTML](23) [PDF 4247KB](9)
Abstract:
  Objective  Bootstrapping is a fundamental process in Fully Homomorphic Encryption (FHE) that directly affects its practical efficiency. The FINAL scheme, presented at ASIACRYPT 2022, achieves a 28% improvement in bootstrapping speed compared with TFHE, demonstrating high suitability for homomorphic Boolean operations. Nevertheless, further improvements are required to reduce its computational overhead and storage demands. This study aims to optimize the bootstrapping phase of FINAL by lowering its computational complexity and key size while preserving the original security level.  Methods  This study proposes two key optimizations. Accumulator compression for blind rotation: A blockwise binary distribution is incorporated into the Learning With Errors (LWE) key generation process. By organizing the key into blocks, each requiring only a single external product, the number of external product operations during blind rotation is reduced. Key reuse strategy for key switching: The LWE key is partially reused during the generation of the Number-theoretic Gadget Switching (NGS) key. The reused portion is excluded from the key switching key, thereby reducing both the key size and the number of associated operations.  Results and Discussions  Under equivalent security assumptions, the optimized FINAL scheme yields substantial efficiency gains. For blind rotation, the number of external product operations is reduced by 50% (from 610 to 305), and the number of Fast Fourier Transform (FFT) operations is halved (from 3,940 to 1,970) (Table 5). For key switching, the key size is reduced by 60% (from 11,264 to 4,554), and the computational complexity decreases from 13.8 × 10⁶ to 5.6 × 10⁶ scalar operations (Table 6).  Conclusions  The proposed optimizations substantially improve the efficiency of the FINAL scheme’s bootstrapping phase. Blind rotation benefits from structured key partitioning, reducing the number of core operations by half. 
Key switching achieves comparable reductions in both storage requirements and computational cost through partial key reuse. These enhancements improve the practicality of FHE for real-world applications that demand efficient evaluation of Boolean circuits. Future directions include hardware acceleration and adaptive parameter tuning.
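The accumulator-compression counting above can be illustrated with toy arithmetic. A block size of 2 (inferred from the reported halving, 610 to 305) and the reading of the blockwise binary distribution as "at most one nonzero bit per block" are assumptions, not details taken from the abstract:

```python
def blind_rotation_external_products(n, block_size):
    """External products needed by blind rotation over an n-coefficient key.

    Plain binary key (block_size = 1): one external product per coefficient.
    Blockwise binary key: at most one coefficient per block is nonzero, so a
    block's whole contribution X^(sum a_i * s_i) collapses to one monomial
    selection, i.e. a single external product per block (an assumption
    consistent with the reported counts).
    """
    assert n % block_size == 0
    return n // block_size

def block_exponent(a_block, s_block):
    # With at most one 1 in s_block, the sum picks out a single a_i (or 0).
    assert sum(s_block) <= 1, "block-binary secret: at most one nonzero bit"
    return sum(a * s for a, s in zip(a_block, s_block))
```

With n = 610, block size 1 gives 610 external products and block size 2 gives 305, matching the 50% reduction reported in Table 5; the FFT count halves with it since FFTs are performed per external product.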
Overview of the Research on Key Technologies for AI-powered Integrated Sensing, Communication and Computing
ZHU Zhengyu, YIN Menglin, YAO Xinwei, XU Yongjun, SUN Gangcan, XU Mingliang
 doi: 10.11999/JEIT250242
[Abstract](375) [FullText HTML](118) [PDF 4073KB](65)
Abstract:
The Integration of Sensing, Communication and Computing (ISCC) combined with Artificial Intelligence (AI) algorithms has emerged as a critical enabler of Sixth-Generation (6G) networks due to its high spectral efficiency and low hardware cost. AI-powered ISCC systems, which combine sensing, communication, computing, and intelligent algorithms, support fast data processing, real-time resource allocation, and adaptive decision-making in complex and dynamic environments. These systems are increasingly applied in intelligent vehicular networks—including Unmanned Aerial Vehicles (UAVs) and autonomous driving—as well as in radar, positioning, tracking, and beamforming. This overview outlines the development and advantages of AI-enabled ISCC systems, focusing on performance benefits, application potential, evaluation metrics, and enabling technologies. It concludes by discussing future research directions. Future 6G networks are expected to evolve beyond data transmission to form an integrated platform that unifies sensing, communication, computing, and intelligence, enabling pervasive AI services.  Significance   AI-powered ISCC marks a transformative shift in wireless communication, enabling more efficient spectrum utilization, reduced hardware cost, and improved adaptability in complex environments. This integration is central to the development of 6G networks, which aim to deliver intelligent and efficient services across applications such as autonomous vehicles, UAVs, and smart cities. The significance of this research lies in its potential to reshape the management and optimization of communication, sensing, and computing resources, advancing the realization of a ubiquitously connected and intelligent infrastructure.  Progress   Recent advances in AI—particularly in machine learning, deep learning, and reinforcement learning—have substantially improved the performance of ISCC systems. 
These methods enable real-time data processing, intelligent resource management, and adaptive decision-making, which are critical for future 6G requirements. Notable progress includes AI-driven waveform design, beamforming, channel estimation, and dynamic spectrum allocation, all of which enhance ISCC efficiency and reliability. Additionally, the integration of edge computing and federated learning has mitigated challenges related to latency, data privacy, and scalability, facilitating broader deployment of AI-enabled ISCC systems.  Conclusions  Research on AI-powered ISCC systems highlights the benefits of integrating AI with sensing, communication, and computing. AI algorithms improve resource efficiency, sensing precision, and real-time adaptability, making ISCC systems well suited for dynamic and complex environments. The adoption of lightweight models and distributed learning has broadened applicability to resource-limited platforms such as drones and IoT sensors. Overall, AI-enabled ISCC systems advance the realization of 6G networks, where sensing, communication, and computing are unified to support intelligent and ubiquitous services.  Prospects   The advancement of AI-powered ISCC systems depends on addressing key challenges, including data quality, model complexity, security, and real-time performance. Future research should focus on developing robust AI models capable of generalizing across diverse wireless environments. Progress in lightweight AI and edge computing will be critical for deployment in resource-constrained devices. The integration of multi-modal data and the design of secure, privacy-preserving algorithms will be essential to ensure system reliability and safety. As 6G networks evolve, AI-powered ISCC systems are expected to underpin intelligent, efficient, and secure communication infrastructures, reshaping human-technology interaction in the digital era.
Resource Allocation in Reconfigurable Intelligent Surfaces Assisted NOMA Based Space-Air-Ground Integrated Network
LIANG Wei, LI Aoying, LUO Wei, LI Lixin, LIN Wensheng, LI Xu, WEI Baoguo
 doi: 10.11999/JEIT250078
[Abstract](116) [FullText HTML](44) [PDF 3486KB](22)
Abstract:
  Objective  The exponential growth of 6G wireless communication demands has positioned the Space-Air-Ground Integrated Network (SAGIN) as a promising architecture, aiming to achieve broad coverage and adaptive networking. However, complex geographic environments, including building obstructions, frequently hinder direct communication between ground users and base stations, thereby requiring effective relay strategies to maintain reliability. Reconfigurable Intelligent Surfaces (RIS) have attracted considerable attention for their capacity to improve signal coverage through passive beamforming. This study develops an RIS-assisted SAGIN architecture incorporating aerial RIS clusters and High-Altitude Platforms (HAPS) to enable communication between ground users and a Low Earth Orbit (LEO) satellite. To enhance energy efficiency, the system further optimizes user relay selection, power allocation, and beamforming for both LEO and RIS components.  Methods  The proposed system integrates LEO satellites, HAPS, Unmanned Aerial Vehicles (UAVs) equipped with RIS, and ground users within a three-dimensional communication space. Due to environmental obstructions, ground users are unable to maintain direct links with base stations; instead, RIS functions as a passive relay node. To improve relay efficiency, users are grouped and associated with specific RIS units. The total system bandwidth is partitioned into sub-channels assigned to different user groups. A matching algorithm is designed for user selection, followed by user group association with each RIS. For LEO communications, HAPS serve as active relay nodes that decode and forward signals to ground base stations. The system considers both direct and RIS-assisted communication links. An optimization problem is formulated to maximize energy efficiency under constraints related to user Quality of Service (QoS), power allocation, and beamforming for both LEO and RIS. 
To solve this, the proposed Alternating Pragmatic Iterative Algorithm in SAGIN (APIA-SAGIN) decomposes the problem into three sub-tasks: user relay selection, LEO beamforming, and RIS beamforming. The Successive Convex Approximation (SCA) and SemiDefinite Relaxation (SDR) methods are employed to transform the original non-convex problem into tractable convex forms for efficient solution.  Results and Discussions  Simulation results confirm the effectiveness of the proposed APIA-SAGIN algorithm in optimizing the energy efficiency of the RIS-assisted SAGIN. As shown in (Fig. 5), increasing the number of RIS elements and LEO antennas markedly improves energy efficiency compared with the random phase shift algorithm. This demonstrates that the proposed algorithm enables channel-aware control by aligning RIS beamforming with the ground transmission channel and jointly optimizing LEO beamforming, RIS beamforming, and LEO-channel alignment. As illustrated in (Fig. 6), both energy efficiency and the achievable rate of the LEO link increase with transmission power. However, beyond a certain power threshold, energy consumption rises faster than the achievable rate, leading to diminishing or even negative returns in energy efficiency. (Fig. 7) shows that higher power in ground user groups leads to increased achievable rates. Nonetheless, expanding the number of RIS elements proves more effective than increasing transmission power for enhancing user throughput. As shown in (Fig. 8), a higher number of RIS elements leads to simultaneous improvements in energy efficiency and achievable rate in the ground segment. Moreover, increasing the number of ground users does not degrade energy efficiency; instead, it results in a gradual increase, suggesting efficient resource allocation. Compared with the random phase shift algorithm, the proposed approach achieves superior performance in both energy efficiency and achievable rate. 
These findings support its potential for practical deployment in SAGIN systems.  Conclusions  This study proposes an RIS-assisted SAGIN architecture that utilizes aerial RIS clusters and HAPS to support communication between ground users and LEO satellites. The APIA-SAGIN algorithm is developed to jointly optimize user relay selection, LEO beamforming, and RIS beamforming with the objective of maximizing system energy efficiency. Simulation results demonstrate the effectiveness and robustness of the algorithm under complex conditions. The proposed approach offers a promising direction for improving the energy efficiency and overall performance of SAGIN, providing a foundation for future research and practical implementation.
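The alternating-optimization pattern behind APIA-SAGIN (fix all variable blocks but one, solve that block, and cycle) can be sketched on a toy objective. The objective and its closed-form updates below are illustrative only; the paper's subproblems are handled with SCA and SDR, which are not reproduced here:

```python
def alternating_minimize(steps=50):
    """Alternating (block-coordinate) optimization on a toy objective,
    g(x, y) = (x - 2y)^2 + (y - 1)^2.

    Mirrors the AO pattern: with y fixed, the minimizer over x is x = 2y;
    with x fixed, setting dg/dy = 0 gives y = (2x + 1) / 5. Cycling the two
    updates converges to the joint minimizer (x, y) = (2, 1). Here each
    subproblem has an exact solution; in the paper each subproblem is first
    convexified (SCA/SDR) before being solved.
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        x = 2.0 * y                  # argmin over x with y fixed
        y = (2.0 * x + 1.0) / 5.0    # argmin over y with x fixed
    return x, y
```

The iterates contract geometrically (y follows y ← (4y + 1)/5), so 50 sweeps land well within 1e-3 of the optimum.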
Aggregation of Combat Units and Mission Planning under the Influence of War Damage
WANG Chen, ZHU Cheng, LEI Hongtao
 doi: 10.11999/JEIT250079
[Abstract](26) [FullText HTML](13) [PDF 2606KB](17)
Abstract:
  Objective  In dynamic and volatile battlefield environments, where the command structure of combat units may be disrupted, combat units must autonomously form appropriate tactical groups in edge operational settings, determine group affiliation, and rapidly allocate tasks. This study proposes a combat unit aggregation and planning method based on an adaptive clustering contract network, addressing the real-time limitations of traditional centralized optimization algorithms. The proposed method enables collaborative decision-making for autonomous group formation and supports multi-task optimization and allocation under dynamic battlefield conditions.  Methods  (1) An adaptive combat group division algorithm based on the second-order relative change rate is proposed. The optimal number of groups is determined using the Sum of Squared Errors (SSE) indicator, and spatial clustering of combat units is performed via an improved K-means algorithm. (2) A dual-layer contract network architecture is designed. In the first layer, combat groups participate in bidding by computing the net effectiveness of tasks, incorporating attributes such as attack, defense, and value. In the second layer, individual combat units conduct bidding with a load balancing factor to optimize task selection. (3) Mechanisms for task redistribution and exchange are introduced, improving global utility through a secondary bidding process that reallocates unassigned tasks and replaces those with negative effectiveness.  Results and Discussions  (1) The adaptive combat group division algorithm demonstrates enhanced situational awareness (Algorithm 1). Through dynamic clustering analysis, it accurately captures the spatial aggregation of combat units (Fig. 6 and Fig. 9), showing greater adaptability to environmental variability than conventional fixed-group models. (2) The multi-layer contract network architecture exhibits marked advantages in complex task allocation. 
The group-level pre-screening mechanism significantly reduces computational overhead (Fig. 2), while the unit-level negotiation process improves resource utilization by incorporating load balancing. (3) The dynamic task optimization mechanism enables continuous refinement of the allocation scheme. It resolves unassigned tasks and enhances overall system effectiveness through intelligent task exchanges. Comparative experiments confirm that the proposed framework outperforms traditional approaches in task coverage and resource utilization efficiency (Table 4 and Table 5), supporting its robustness in dynamic battlefield conditions.  Conclusions  This study integrates clustering analysis with contract network protocols to establish an intelligent task allocation framework suited to dynamic battlefield conditions. By implementing dual-layer optimization in combat group division and task assignment, the approach improves combat resource utilization and shortens the kill chain. Future research will focus on validating the framework in multi-domain collaborative combat scenarios, refining bidding strategies informed by combat knowledge, and advancing command and control technologies toward autonomous coordination.
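The SSE-based choice of the group count can be sketched as follows. The exact definition of the "second-order relative change rate" below is one plausible formalization, not necessarily the paper's formula:

```python
def pick_k_by_second_order_rate(sse):
    """Choose the elbow of an SSE curve via a second-order relative change
    rate (assumed form): rho(k) = (drop(k) - drop(k + 1)) / drop(k), where
    drop(k) = SSE[k - 1] - SSE[k]. The k maximising rho is where the
    marginal gain from one more group collapses.

    sse: dict {k: SSE value} for consecutive candidate group counts.
    """
    ks = sorted(sse)
    best_k, best_rho = None, float("-inf")
    for k in ks[1:-1]:
        drop_here = sse[k - 1] - sse[k]
        drop_next = sse[k] - sse[k + 1]
        if drop_here <= 0:
            continue
        rho = (drop_here - drop_next) / drop_here
        if rho > best_rho:
            best_k, best_rho = k, rho
    return best_k
```

On a typical elbow-shaped curve (large SSE drops up to the true group count, then near-flat), the maximiser of rho sits at the elbow, which would then seed the improved K-means step.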
Joint Resource Optimization Algorithm for Intelligent Reflecting Surface Assisted Wireless Soft Video Transmission
WU Junjie, LUO Lei, ZHU Ce, JIANG Pei
 doi: 10.11999/JEIT250019
[Abstract](21) [FullText HTML](12) [PDF 0KB](0)
Abstract:
  Objective  Intelligent Reflecting Surface (IRS) technology is a key enabler for next-generation mobile communication systems, addressing the growing demands for massive device connectivity and increasing data traffic. Video data accounts for over 80% of global mobile traffic, and this proportion continues to rise. Although video SoftCast offers a simpler structure and more graceful degradation compared to conventional separate source-channel coding schemes, its transmission efficiency is restricted by the limited availability of wireless transmission resources. Moreover, existing SoftCast frameworks are not inherently compatible with IRS-assisted wireless channels. To address these limitations, this paper proposes an IRS-assisted wireless soft video transmission scheme.  Methods  Video soft transmission distortion is jointly determined by three critical wireless resources: transmit power, active beamforming at the primary transmitter, and passive beamforming at the IRS. Minimizing video soft transmission distortion is therefore formulated as a joint optimization problem over these resources. To solve this multivariable problem, an Alternating Optimization (AO) framework is employed to decouple the original problem into single-variable subproblems. For the fractional nonhomogeneous quadratic optimization and unit-modulus constraints arising in this process, the Semi-Definite Relaxation (SDR) method is applied to obtain near-optimal solutions for both active and passive beamforming vectors. Based on the derived beamforming vectors, the optimal power allocation factor for soft transmission is then computed using the Lagrange multiplier method.  Results and Discussions  Simulation results indicate that the proposed method yields an improvement of at least 1.82 dB in Peak Signal-to-Noise Ratio (PSNR) compared to existing video soft transmission approaches (Fig. 3 and Fig. 4). 
Moreover, evaluation across extensive HEVC test sequences shows that the proposed method achieves an average received quality gain of no less than 1.51 dB (Table 1). Further simulations reveal that when the secondary link channel quality falls below a critical threshold, it no longer contributes to improving received video quality (Fig. 5). Rapid variations in the secondary signal c degrade the reception quality of the primary signal, with a reduction of approximately 0.52 dB observed (Fig. 6). Increasing the number of IRS elements significantly enhances both video reception quality and achievable rates for the primary and secondary links (Fig. 7); however, this improvement comes with a power-law scaling increase in computational complexity. Additional simulations confirm that the proposed method maintains per-frame quality fluctuations within an acceptable range across each Group of Pictures (GOP) (Fig. 8). As GOP size increases, temporal redundancy within the source is more effectively removed, leading to further improvements in received quality, although this is accompanied by higher computational complexity (Fig. 9).  Conclusions  This paper proposes an IRS-assisted soft video transmission scheme that leverages IRS-aided secondary links to improve received video quality. To minimize video signal distortion, a multivariable optimization problem is formulated for joint resource allocation. An AO framework is adopted to decouple the problem into single-variable subproblems, which are solved iteratively. Simulation results show that the proposed method achieves significant improvements in both objective and subjective visual quality compared to existing video transmission algorithms. In addition, the effects of secondary link channel gain, secondary signal characteristics, the number of IRS elements, and GOP parameters on transmission performance are systematically examined. 
This study demonstrates, for the first time, the performance enhancement of video soft transmission using IRS and provides a technical basis for the development of video soft transmission in IRS-assisted communication environments.
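Once the beamforming vectors are fixed, a Lagrange-multiplier power allocation of the SoftCast type remains. The sketch below shows the classical SoftCast solution (each chunk scaled by its variance to the power -1/4 under a total power budget); the paper's exact allocation factor for the IRS-assisted channel may differ:

```python
import math

def softcast_power_allocation(chunk_vars, total_power):
    """Classical SoftCast power allocation (Lagrange-multiplier solution).

    For chunk i with variance lambda_i, the distortion-minimising scaling is
        g_i = lambda_i^(-1/4) * sqrt(P / sum_j sqrt(lambda_j)),
    which spends exactly the total power P, since
        sum_i g_i^2 * lambda_i = P.
    High-variance chunks get smaller gains, equalising per-chunk cost.
    """
    s = sum(math.sqrt(v) for v in chunk_vars)
    scale = math.sqrt(total_power / s)
    return [scale * v ** -0.25 for v in chunk_vars]
```

For example, chunks with variances 4 and 1 under a power budget of 3 receive gains 4^(-1/4) and 1, and the resulting transmit power 0.5·4 + 1·1 meets the budget exactly.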
Design of Private Set Intersection Protocol Based on National Cryptographic Algorithms
HUANG Hai, GUAN Zhibo, YU Bin, MA Chao, YANG Jinbo, MA Xiangyu
 doi: 10.11999/JEIT250050
[Abstract](16) [FullText HTML](14) [PDF 1122KB](0)
Abstract:
  Objective  The rapid development of global digital transformation has exposed Private Set Intersection (PSI) as a key bottleneck constraining the digital economy. Although technical innovations and architectural advances in PSI protocols continue to emerge, current protocols face persistent challenges, including algorithmic vulnerabilities in international cryptographic primitives and limited computational efficiency when applied to large-scale datasets. To address these limitations, this study integrates domestic SM2 elliptic curve cryptography and the SM3 cryptographic hash function to enhance PSI protocol performance and protect sensitive data, providing technical support for China’s cyberspace security. A PSI protocol based on national cryptographic standards (SM-PSI) is proposed, with hardware acceleration of core cryptographic operations implemented using domestic security chips. This approach achieves simultaneous improvements in both security and computational efficiency.  Methods  SM-PSI integrates the domestic SM2 and SM3 cryptographic algorithms to reveal only the intersection results without disclosing additional information, while preserving the privacy of each participant’s input set. By combining SM2 elliptic curve public-key encryption with the SM3 hash algorithm, the protocol reconstructs encryption parameter negotiation, data obfuscation, and ciphertext mapping processes, thereby eliminating dependence on international algorithms such as RSA and SHA-256. An SM2-based non-interactive zero-knowledge proof mechanism is designed to verify the validity of public–private key pairs using a single communication round. This reduces communication overhead, mitigates man-in-the-middle attack risks, and prevents private key exposure. The domestic reconfigurable cryptographic chip RSP S20G is integrated to offload core computations, including SM2 modular exponentiation and SM3 hash iteration, to dedicated hardware. 
This software–hardware co-acceleration approach significantly improves protocol performance.  Results and Discussions  Experimental results on simulated datasets demonstrate that SM-PSI, through hardware–software co-optimization, significantly outperforms existing protocols at comparable security levels. The protocol achieves an average speedup of 4.2× over the CPU-based SpOT-Light PSI scheme and 6.3× over DH-IPP (Table 4), primarily due to offloading computationally intensive operations, including SM2 modular exponentiation and SM3 hash iteration, to dedicated hardware. Under the semi-honest model, SM-PSI reduces both the number of dataset encryption operations and communication rounds, thereby lowering data transmission volume and computational overhead. Its computational and communication complexities are substantially lower than those of SpOT-Light, DH-IPP, and FLASH-RSA, making it suitable for large-scale data processing and low-bandwidth environments (Table 2). Simulation experiments further show that the hardware-accelerated framework consistently outperforms CPU-only implementations, achieving a peak speedup of 9.0×. The speedup ratio exhibits a near-linear relationship with dataset size, indicating stable performance as the ID data volume increases with minimal efficiency loss (Fig. 3). These results demonstrate SM-PSI’s ability to balance security, efficiency, and scalability for practical privacy-preserving data intersection applications.  Conclusions  This study proposes SM-PSI, a PSI protocol that integrates national cryptographic algorithms SM2 and SM3 with hardware–software co-optimization. By leveraging domestic security chip acceleration for core operations, including non-interactive zero-knowledge proofs and cryptographic computations, the protocol addresses security vulnerabilities present in international algorithms and overcomes computational inefficiencies in large-scale applications. 
Theoretical analysis confirms its security under the semi-honest adversary model, and experimental results demonstrate substantial performance improvements, with an average speedup of 4.2× over CPU-based SpOT-Light and 6.3× over DH-IPP. These results establish SM-PSI as an efficient and autonomous solution for privacy-preserving set intersection, supporting China’s strategic objective of achieving technical independence and high-performance computation in privacy-sensitive environments.  Prospects   Future work will extend this research by exploring more efficient PSI protocols based on national cryptographic standards, aiming to improve chip–algorithm compatibility, reduce power consumption, and enhance large-scale data processing efficiency. Further efforts will target optimizing protocol scalability in multi-party scenarios and developing privacy-preserving set intersection mechanisms suitable for multiple participants to meet complex practical application demands. In addition, this research will promote integration with other privacy-enhancing technologies, such as federated learning and differential privacy, to support the development of a more comprehensive privacy protection framework.
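The round structure of a commutative-blinding PSI can be sketched with standard-library stand-ins. To be explicit about the stand-ins: modular exponentiation in GF(p) replaces SM2 point multiplication and SHA-256 replaces SM3, so this illustrates only the generic protocol pattern, not SM-PSI, its zero-knowledge proof, or the security-chip offloading:

```python
import hashlib

# Stand-in "group": Z_p* for the Mersenne prime p = 2^127 - 1.
p = 2 ** 127 - 1

def h(item):
    # Hash an item into Z_p* (SHA-256 as a stand-in for SM3).
    return int(hashlib.sha256(item.encode()).hexdigest(), 16) % (p - 1) + 1

def psi(set_a, key_a, set_b, key_b):
    """Toy DH-style PSI: each side blinds its hashed items with a private
    exponent; commutativity, (h^a)^b == (h^b)^a (mod p), lets the
    double-blinded values be compared without revealing non-intersecting
    items. Keys should be coprime to p - 1 (e.g. 65537 and 257 here) so
    blinding is injective. In the real protocol the single-blinded sets
    are what is actually exchanged between the parties.
    """
    a_twice = {pow(pow(h(x), key_a, p), key_b, p) for x in set_a}
    b_twice = {pow(pow(h(y), key_b, p), key_a, p): y for y in set_b}
    return {y for v, y in b_twice.items() if v in a_twice}
```

Only the intersection ("bob" below) becomes identifiable; all other double-blinded values are indistinguishable random group elements to each party.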
SINR Adaptive Symbol Level Precoding and Position Joint Optimization Strategy for Multiple Unmanned Aerial Vehicles Anti-Jamming Communication
WEI Haoran, YAO Rugui, FAN Ye, MA Weixin, ZUO Xiaoya
 doi: 10.11999/JEIT250221
[Abstract](24) [FullText HTML](11) [PDF 2842KB](2)
Abstract:
  Objective  Unmanned Aerial Vehicles (UAVs) are widely applied in areas such as traffic monitoring, wireless coverage, and precision agriculture due to their high mobility and deployment flexibility. In air–ground communication scenarios requiring flexible deployment, UAV mobility can be leveraged to counteract external malicious jamming. Further, the collaborative operation of multiple UAVs enables improved system performance. However, the broadcast nature of wireless communication renders multiple UAV communication systems vulnerable to jamming attacks that disrupt legitimate communication services. Addressing this challenge, this study proposes a Signal-to-Interference-plus-Noise Ratio (SINR) adaptive Symbol-Level Precoding (SLP) and position joint optimization strategy for anti-jamming communication in multi-UAV systems. The approach fully exploits UAV mobility to enhance communication robustness under different user requirements. By integrating Coordinated Multi-Point (CoMP) transmission with SLP in a Multi-User Multiple-Input Single-Output (MU-MISO) system, the strategy improves interference utilization, enhances system energy efficiency, and reduces computational complexity.  Methods  An SINR adaptive SLP and position joint anti-jamming optimization strategy for multiple UAVs is proposed by integrating CoMP and SLP technologies. To address the challenges of three-dimensional operational space and the overlap of nodes assigned to multiple sets, a ground-to-air multi-node matching mechanism based on three-dimensional K-means++ collaborative set partitioning is designed. To reduce the computational complexity of the joint optimization process, an iterative optimization algorithm based on particle reconstruction is developed. This algorithm simultaneously solves for both the UAV precoding matrix and spatial positions with low computational overhead. 
Additionally, an SINR adaptive SLP approach is introduced to enable optimized power allocation for multiple UAVs, considering the varying power characteristics of jamming and noise experienced by users. Simulation results demonstrate that the integration of CoMP and SLP technologies effectively enhances the communication performance of jammed users, while stable communication performance is maintained for ordinary users.  Results and Discussions  In the proposed UAV anti-jamming communication strategy, the SINR of jammed users is improved without compromising the normal communication performance of ordinary users. In the simulation results, UAV positions are marked with five-pointed stars (Figs. 6 and 7), UAV coverage areas are represented by circles, and users are indicated by dots. A comparison of SINR variations under four schemes (Fig. 8) shows that the received SINR of jammed users increases by approximately 12–13 dB, while the SINR of ordinary users remains above the required threshold. When different SINR thresholds are applied, the received SINR of each user type varies accordingly (Fig. 9). By setting appropriate thresholds based on actual scenario requirements, different energy allocation effects can be achieved. Following optimization, the Bit Error Rate (BER) of jammed users is significantly reduced (Fig. 10). The constellation diagrams comparing the received signals under two precoding schemes (Fig. 11) indicate that the proposed SINR adaptive SLP strategy for multiple UAVs effectively improves the SINR of jammed users, while maintaining the communication quality of ordinary users. Moreover, the fitness evolution curve of the iterative optimization algorithm based on particle reconstruction (Fig. 12) shows that the algorithm approaches the global optimal solution at an early stage of iteration.  
Conclusions  To address the challenge of anti-jamming communication for multi-user services supported by multiple UAVs, this study integrates CoMP transmission with SLP technology and proposes an SINR adaptive SLP and position joint optimization strategy for multi-UAV anti-jamming communication. The strategy is implemented in two stages. First, to solve the clustering problem in three-dimensional space and allow nodes to belong to multiple groups, a ground-to-air multi-node matching mechanism based on three-dimensional K-means++ collaborative set partitioning is designed. Second, an SINR adaptive SLP method is proposed to optimize UAV power allocation based on the SINR requirements of different user types. To reduce the computational complexity of jointly optimizing the precoding matrix and UAV positions, an iterative optimization algorithm based on particle reconstruction is developed. Simulation results demonstrate that the proposed strategy, by combining CoMP and SLP, effectively improves the communication performance of jammed users while maintaining reliable communication for ordinary users.
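The three-dimensional K-means++ seeding that the ground-to-air matching mechanism builds on can be sketched as follows; the paper's collaborative set partitioning (nodes belonging to several sets) is not reproduced here:

```python
import random

def kmeanspp_init(points, k, seed=0):
    """Standard K-means++ seeding in 3-D: the first centre is uniform at
    random, and each later centre is drawn with probability proportional
    to the squared distance from the nearest centre chosen so far, which
    spreads centres across spatially separated UAV/user clusters.

    points: list of (x, y, z) tuples.
    """
    rng = random.Random(seed)
    centres = [points[rng.randrange(len(points))]]
    while len(centres) < k:
        # Squared distance from each point to its nearest chosen centre.
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 + (pz - cz) ** 2
                  for cx, cy, cz in centres)
              for px, py, pz in points]
        r = rng.random() * sum(d2)
        acc = 0.0
        for pt, w in zip(points, d2):
            if w == 0.0:          # already a centre; never re-selected
                continue
            acc += w
            if acc >= r:
                centres.append(pt)
                break
    return centres
```

Because already-chosen centres carry zero weight, the k seeds are always distinct, after which the improved K-means iterations (and the SSE-based choice of k) proceed as usual.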
A Collaborative Detection Method for Bauxite Quality Parameters Based on the Fusion of G-DPN and Near-Infrared Spectroscopy
ZOU Liang, REN Kelong, WU Hao, XU Zhibin, TAN Zhiyi, LEI Meng
 doi: 10.11999/JEIT250240
[Abstract](20) [FullText HTML](12) [PDF 2923KB](1)
Abstract:
  Objective  Bauxite is a critical non-metallic mineral resource used in aluminum production, ceramic manufacturing, and refractory material processing. As global demand for aluminum and its derivatives continues to rise, improving the efficiency of bauxite resource utilization is essential. Accurate determination of quality parameters supports the reduction of waste from low-grade ores during smelting and improves overall process optimization. However, traditional chemical analyses are time-consuming, costly, complex, and subject to human error. Existing rapid testing methods, often based on machine learning, typically predict individual quality indicators and overlook correlations among multiple parameters. Deep learning, particularly multi-task learning, offers a solution to this limitation. Near-InfraRed (NIR) spectroscopy, a real-time, non-destructive analytical technique, is especially suited for assessing mineral quality. This study proposes a multi-indicator collaborative detection model—Gate-Depthwise Pointwise Network (G-DPN)—based on NIR spectroscopy to enable the simultaneous prediction of multiple bauxite quality parameters. The proposed approach addresses the limitations of conventional methods and supports efficient, accurate, and cost-effective real-time quality monitoring in industrial settings.  Methods  To accurately model the nonlinear relationships between NIR spectral features and bauxite quality parameters while leveraging inter-parameter correlations, this study proposes a dedicated representation model, G-DPN. The model incorporates large-kernel DepthWise Convolution (DWConv) to extract long-range dependencies within individual spectral channels, and PointWise Convolution (PWConv) to enable inter-channel feature fusion. A Spatial Squeeze-and-Excitation (sSE) mechanism is introduced to enhance spatial feature weighting, and residual connections support the integration of deep features. 
To further improve task differentiation, a Custom Gate Control (CGC) module is added to separate shared and task-specific features. Orthogonal constraints are applied within this module to reduce feature redundancy. Gate-controlled fusion enables each branch to focus on extracting task-relevant information while preserving shared representations. Additionally, quality parameter labels are normalized to address scale heterogeneity, allowing the model to establish a stable nonlinear mapping between spectral inputs and multiple output parameters.  Results and Discussions  This study applies large convolution kernels in DWConv to capture long-range dependencies within individual spectral channels (Fig. 3). Compared with conventional small-sized kernels (e.g., 3×3), which increase the receptive field but exhibit limited focus on critical spectral regions, large kernels enable more concentrated activation in key bands, thereby enhancing model sensitivity (Fig. 4). Empirical results confirm that the use of large kernels improves prediction accuracy (Table 6). Furthermore, compared to Transformer-based models, DWConv with large kernels achieves comparable accuracy with fewer parameters, offering computational efficiency. The CGC module effectively disentangles shared and task-specific features while applying orthogonal constraints to reduce redundancy. Its dynamic fusion mechanism enables adaptive feature sharing across tasks without compromising task-specific learning, thereby mitigating task interference and accounting for sample correlations (Fig. 6). Relative to conventional multi-task learning frameworks, the CGC-based architecture demonstrates superior performance in multi-parameter prediction (Table 6).  Conclusions  This study proposes a deep learning approach that integrates large-kernel DWConv and a CGC module for multi-parameter prediction of bauxite quality using NIR spectroscopy. 
DWConv captures long-range dependencies within spectral channels, while the CGC module leverages inter-parameter correlations to enhance feature sharing and reduce task interference. This design mitigates the effects of spectral peak overlap and establishes a robust nonlinear mapping between spectral features and quality parameters. Experiments on 424 bauxite samples show that the proposed G-DPN model achieves $R^2$ values of 0.9226, 0.9377, and 0.9683 for aluminum, silicon, and iron content, respectively, outperforming conventional machine learning and existing deep learning methods. These results highlight the potential of combining NIR spectroscopy with G-DPN for accurate, efficient, and scalable mineral quality analysis, contributing to the sustainable utilization of bauxite resources.
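The depthwise-plus-pointwise factorization at the core of G-DPN can be sketched in a few lines of NumPy. The shapes, the kernel size, and the function names below are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def depthwise_conv1d(x, kernels):
    """Depthwise 1-D convolution: each spectral channel is convolved with
    its own (large) kernel, modeling long-range dependencies within a
    channel without mixing channels.  x: (C, L), kernels: (C, K)."""
    C, L = x.shape
    K = kernels.shape[1]
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad)), mode="edge")
    out = np.empty((C, L))
    for c in range(C):
        for i in range(L):
            out[c, i] = np.dot(xp[c, i:i + K], kernels[c])
    return out

def pointwise_conv(x, w):
    """Pointwise (1x1) convolution: fuses information across channels at
    each spectral position.  w: (C_out, C_in)."""
    return w @ x

# Toy spectrum: 4 channels of 32 bands, large kernel K = 9 per channel.
rng = np.random.default_rng(0)
spectrum = rng.normal(size=(4, 32))
dw = depthwise_conv1d(spectrum, rng.normal(size=(4, 9)))
pw = pointwise_conv(dw, rng.normal(size=(8, 4)))
print(dw.shape, pw.shape)  # (4, 32) (8, 32)
```

Stacking the two steps, with a gating or sSE reweighting in between, gives the kind of block the abstract describes; enlarging K widens the per-channel receptive field at a cost only linear in K.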
YOMANet-Accel: A Lightweight Algorithm Accelerator for Pedestrians and Vehicles Detection at the Edge
CHEN Ningjiang, LU Yaozong
 doi: 10.11999/JEIT250059
Abstract:
  Objective  Accurate and real-time detection of pedestrians and vehicles is essential for autonomous driving at the edge. However, deep learning-based object detection algorithms are often challenging to deploy in edge environments due to their high computational demands and complex parameter structures. To address these limitations, this study proposes a software-hardware co-design strategy. A lightweight neural network model, Yolo Model Adaptation Network (YOMANet), is designed, and a corresponding neural network accelerator, YOMANet Accelerator (YOMANet-Accel), is implemented on a heterogeneous Field-Programmable Gate Array (FPGA) platform. This system enables efficient algorithm acceleration for pedestrian and vehicle detection in edge-based autonomous driving scenarios.  Methods  The lightweight backbone of YOMANet adopts MobileNetv2 to reduce the number of network parameters. The neck network incorporates the Spatial Pyramid Pooling (SPP) and Path Aggregation Network (PANet) structures from YOLOv4 to expand the receptive field and accommodate targets of varying sizes. Depthwise separable convolution replaces standard convolution, thereby reducing training complexity and improving convergence speed. To enhance detail extraction, the Normalization-based Attention Module (NAM) is integrated into the head network, allowing suppression of irrelevant feature weights. For deployment on an FPGA platform, parallel computing and data storage schemes are designed. The parallel computing strategy adopts a loop blocking method to reorder inner and outer loops, enabling access to different output array elements through adjacent loop layers and facilitating parallel processing of output feature map pixels. Multiply-add trees are implemented in the Processing Engine (PE) to support efficient task allocation and operation scheduling. 
A double-buffer mechanism is introduced in the data storage scheme to increase data reuse, minimize transmission latency, and enhance system throughput. In addition, int8 quantization is applied to both weight parameters and activation functions, reducing the overall parameter size and accelerating parallel computation.  Results and Discussions  Experimental results on the training platform indicate that YOMANet achieves the inference speed characteristic of lightweight models while maintaining the detection accuracy of large-scale models, thereby improving overall detection performance (Fig. 12, Table 2). The ablation study demonstrates that the integration of MobileNetv2 and depthwise separable convolution significantly reduces the number of model parameters. Embedding the NAM attention mechanism does not noticeably increase model size but enhances detail extraction and improves detection of small targets (Table 3). Compared with other lightweight algorithms, the enhanced YOMANet shows improved detail extraction and superior detection of small and occluded targets, with substantially lower false and missed detection rates (Fig. 13). Results on the accelerator platform reveal that quantization has minimal effect on accuracy while substantially reducing model size, supporting deployment on resource-constrained edge devices (Table 4). When deployed on the FPGA platform, YOMANet retains detection accuracy comparable to GPU/CPU platforms, while power consumption is reduced by an order of magnitude, meeting the efficiency requirements for edge deployment (Fig. 14). Compared with related accelerator designs, YOMANet-Accel achieves competitive throughput and the highest Digital Signal Processing (DSP) efficiency, demonstrating the effectiveness of the proposed parallel computing and storage schemes in utilizing FPGA resources (Table 5).  
Conclusions  Experimental results demonstrate that YOMANet achieves high detection accuracy and fast inference speed on the training platform, with enhanced performance for small and occluded targets, leading to a reduced missed detection rate. When deployed on the FPGA platform, YOMANet-Accel achieves an effective balance between detection performance and resource efficiency, supporting real-time pedestrian and vehicle detection in edge computing scenarios.
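The int8 step described above can be illustrated with a symmetric per-tensor scheme, a common choice for FPGA inference; the abstract does not specify the exact quantizer, so treat this sketch as an assumption.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: one scale maps the float
    range onto [-127, 127].  Assumes w is not all zeros."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.2], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(q.dtype, np.max(np.abs(w - w_hat)) <= s / 2)  # int8 True
```

Because round-to-nearest keeps the per-weight error within half a quantization step, weights shrink 4x (float32 to int8) while the multiply-add trees can operate on narrow integer operands.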
Cover
2025, 47(6).
2025, 47(6): 1-4.
Special Topic on Satellite Information Intelligent Processing and Application Research
Overview of Security Issues and Defense Technologies for Low Earth Orbit Satellite Network
DU Xingkui, SHU Nina, LIU Chunsheng, YANG Fang, MA Tao, LIU Yang
2025, 47(6): 1609-1622.   doi: 10.11999/JEIT240957
Abstract:
  Significance   In recent years, Low-Earth Orbit (LEO) satellite networks have experienced rapid development, demonstrating broad application prospects in mobile communications, the Internet of Things (IoT), maritime operations, and other domains. These networks are poised to become a critical component of next-generation network architectures. Currently, leading global and domestic commercial entities are actively deploying mega-constellations to enable worldwide mobile communication and broadband internet services. However, as the scale of LEO constellations expands, the satellite networks are increasingly exposed to both anthropogenic threats (e.g., cyberattacks) and environmental hazards (e.g., space debris). Existing review studies have systematically summarized research on security threats and defense mechanisms across the physical, network, and application layers of LEO satellite networks. Nevertheless, three gaps remain in the prior literature. First, a lack of technical granularity: many studies provide taxonomies of security issues but fail to focus sufficiently on domain-specific cybersecurity challenges or delve into technical details. Second, an overemphasis on integrated space-terrestrial networks: existing reviews often prioritize the broader context of space-air-ground-sea integrated networks, obscuring the unique vulnerabilities inherent to LEO satellite architectures. Third, imbalanced layer-specific analysis: current works predominantly address physical and link-layer security, while insufficiently highlighting the distinct characteristics of network-layer threats. Building upon prior research, this paper presents a comprehensive review of security challenges and defense technologies in LEO satellite networks. By analyzing the inherent vulnerabilities of these systems, we provide an in-depth exploration of security threats, particularly those targeting network-layer integrity. 
Furthermore, we critically evaluate cutting-edge defense mechanisms developed to mitigate realistic threats, offering insights into their technical principles and implementation challenges.  Progress   This paper first elaborates on the architecture of LEO satellite networks, systematically analyzing the composition and functional roles of three core components: the space segment, ground segment, and user segment. It then summarizes the operational characteristics of LEO networks, including their dynamic multi-layer topology, globally ubiquitous coverage, low-latency data transmission, and resilient resource allocation mechanisms. These intrinsic characteristics fundamentally enable LEO networks to deliver high-quality communication services. Subsequently, this study identifies potential vulnerabilities across four dimensions: nodes, links, protocols, and infrastructure. Due to the open nature of satellite links, transmitted data are susceptible to eavesdropping, where adversaries may intercept satellite signals, predict orbital dynamics, and deploy surveillance systems preemptively. Prior research has addressed satellite communication security through physical-layer security designs and scenario-specific eavesdropping analyses. Through theoretical modeling and case studies, this work categorizes multiple Denial-of-Service (DoS) attack variants and explores routing attack risks inherent to the open architecture of LEO networks. Furthermore, it classifies electronic countermeasure interference types based on target scenarios and adversarial objectives. To counter these threats, the paper evaluates emerging defense technologies, including encryption-based security frameworks, resilient routing protocols, and digital twin-enabled virtualization platforms for network simulation and secure design optimization. 
Finally, it highlights cutting-edge AI-driven security solutions, such as machine learning-powered anomaly detection and federated learning for distributed threat intelligence.  Conclusions  This review critically examines the evolution of LEO satellite networks, identifying critical gaps in systematic analysis and comprehensive threat coverage within existing studies. By establishing a four-dimensional vulnerability framework—node vulnerabilities arising from harsh space environmental conditions, link vulnerabilities exacerbated by high orbital dynamics, protocol vulnerabilities stemming from commercial standardization compromises, and infrastructure vulnerabilities due to tight coupling with terrestrial internet systems—the study systematically classifies security threats across physical, network, and application layers. The paper further dissects attack methodologies unique to each threat category and evaluates advanced countermeasures. Notable innovations include quantum cryptography-enhanced encryption systems, fault-tolerant routing algorithms, virtualized network emulation environments, and AI-empowered security paradigms leveraging deep learning and federated learning architectures. These technologies not only significantly enhance the security posture of LEO networks but also demonstrate transformative potential for future adaptive security frameworks. However, challenges persist in balancing computational overhead with real-time operational constraints, necessitating further research into lightweight cryptographic primitives and cross-domain collaborative defense mechanisms. This synthesis provides a foundational reference for advancing next-generation satellite network security while underscoring the imperative for interdisciplinary innovation in space-terrestrial converged systems.  Prospects   Looking ahead, research on the security of LEO satellite networks will constitute a long-term and complex process. 
With the integration of emerging technologies such as quantum communication and artificial intelligence, security defense mechanisms in LEO satellite networks will evolve toward greater intelligence and automation. Emerging technologies are anticipated to play increasingly critical roles in this domain, particularly through advancements in adaptive intelligent networking technologies and intelligent networking protocol architectures. These developments will support the efficient convergence of space-air-ground-sea integrated networks. The application of deep learning methodologies to analyze network characteristics and construct corresponding neural network models will further enhance network adaptability and coordination. Concurrently, as commercial deployment of LEO satellite networks accelerates, the critical challenge of balancing security requirements with economic efficiency warrants in-depth investigation. Future research should prioritize cost-benefit analyses and explore optimal trade-offs between cybersecurity and service efficiency across diverse application scenarios. Furthermore, international collaboration is expected to assume a pivotal role in the security governance of LEO satellite networks, particularly through jointly establishing international standards and regulatory frameworks to address transnational security threats. This multilateral approach will be essential for maintaining the integrity and resilience of next-generation satellite infrastructures in an increasingly interconnected orbital environment.
Patch-based Adversarial Example Generation Method for Multi-spectral Object Tracking
MA Jiayi, XIANG Xinyu, YAN Qinglong, ZHANG Hao, HUANG Jun, MA Yong
2025, 47(6): 1623-1632.   doi: 10.11999/JEIT240891
Abstract:
  Objective   Current research on tracker-oriented adversarial sample generation primarily focuses on the visible spectral band, leaving a gap in addressing multi-spectral conditions, particularly the infrared spectrum. To address this, this study proposes a novel patch-based adversarial sample generation framework for multi-spectral object tracking. By integrating adversarial texture generation modules and adversarial shape optimization strategies, the framework disrupts the tracking model’s interpretation of target textures in the visible spectrum and impairs the extraction of thermal salient features in the infrared spectrum, respectively. Additionally, tailored loss functions, including mis-regression loss, mask interference loss, and maximum feature discrepancy loss, guide the generation of adversarial patches, leading to the expansion or deviation of tracking prediction boxes and weakening the correlation between template and search frames in the feature space. Research on adversarial sample generation contributes to the development of robust object tracking models resistant to interference in practical scenarios.  Methods   The proposed framework integrates two key components. A Generative Adversarial Network (GAN) synthesizes texture-rich patches to interfere with the tracker’s semantic understanding of target appearance. This module employs upsampling layers to generate adversarial textures that disrupt the tracker’s ability to recognize and localize targets in the visible spectrum. A deformable patch algorithm dynamically adjusts geometric shapes to disrupt thermal saliency features. By optimizing the length of radial vectors, the algorithm generates adversarial shapes that interfere with the tracker’s extraction of thermal salient features, which are critical for infrared object tracking. Tailored loss functions are designed for different trackers. 
Mis-regression loss and mask interference loss guide attacks on region-proposal-based trackers (e.g., SiamRPN) and mask-guided trackers (e.g., SiamMask), respectively. These losses mislead the regression branches of region-proposal-based trackers and degrade the mask prediction accuracy of mask-guided trackers. Maximum feature discrepancy loss reduces the correlation between template and search features in deep representation space, further weakening the tracker’s ability to match and track targets. The adversarial patches are generated through iterative optimization of these losses, ensuring cross-spectral attack effectiveness.  Results and Discussions   Experimental results validate the method’s effectiveness. In the visible spectrum, the proposed framework achieves attack success rates of 81.57% (daytime) and 81.48% (night) against SiamRPN, significantly outperforming state-of-the-art methods PAT and MTD (Table 1). For SiamMask, success rates reach 53.65% (day) and 52.77% (night), demonstrating robust performance across different tracking architectures (Fig. 3). In the infrared spectrum, the method attains attack success rates of 71.43% (day) and 81.08% (night) against SiamRPN, exceeding the HOTCOLD method by more than 30% (Table 2). For SiamMask, the success rates reach 65.95% (day) and 65.85% (night), highlighting the effectiveness of the adversarial shape optimization strategy in disrupting thermal salient features. Multi-scene robustness is further demonstrated through qualitative results (Fig. 4), which show consistent attack performance across diverse environments, including roads, grasslands, and playgrounds under varying illumination conditions. Ablation studies confirm the necessity of each loss component. The combination of mis-regression and feature discrepancy losses improves the SiamRPN attack success rate to 75.95%, while the mask and feature discrepancy losses enhance SiamMask attack success to 65.91% (Table 3). 
Qualitative and quantitative experiments demonstrate that the adversarial samples proposed in this study effectively increase attack success rates against trackers in multi-spectral environments. These results highlight the framework’s ability to generate highly effective adversarial patches across both visible and infrared spectra, offering a comprehensive solution for multi-spectral object tracking security.   Conclusions   This study addresses the gap in multi-spectral adversarial attacks on object trackers by proposing a novel patch-based adversarial example generation framework. The method integrates a texture generation module for visible-spectrum attacks and a shape optimization strategy for thermal infrared interference, effectively disrupting trackers’ reliance on texture semantics and thermal salient features. By designing task-specific loss functions, including mis-regression loss, mask interference loss, and maximum feature discrepancy loss, the framework enables precise attacks on both region-proposal and mask-guided trackers. Experimental results demonstrate the adversarial patches’ strong cross-spectral transferability and environmental robustness, causing trackers to deviate from targets or produce excessively enlarged bounding boxes. This work not only advances multi-spectral adversarial attacks in object tracking but also provides insights into improving model robustness against real-world perturbations. Future research will explore dynamic patch generation and extend the framework to emerging transformer-based trackers.
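The maximum feature discrepancy idea can be made concrete with a small sketch: the patch is optimized so that the similarity between template and search features collapses. The cosine formulation below is an illustrative assumption; the actual feature space and optimizer are those of the trackers under attack.

```python
import numpy as np

def feature_discrepancy(f_template, f_search, eps=1e-8):
    """Cosine similarity between flattened template and search features.
    An attacker minimizes this value (i.e., maximizes discrepancy) so the
    tracker can no longer match the target across frames."""
    t, s = f_template.ravel(), f_search.ravel()
    return float(np.dot(t, s) / (np.linalg.norm(t) * np.linalg.norm(s) + eps))

rng = np.random.default_rng(1)
f_t = rng.normal(size=64)
print(round(feature_discrepancy(f_t, f_t), 3))   # 1.0: identical features
print(round(feature_discrepancy(f_t, -f_t), 3))  # -1.0: maximally discrepant
```

In the full attack this term is combined with the mis-regression or mask interference loss, so the patch both decorrelates features and steers the prediction box.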
An Uncertainty-driven Pixel-level Adversarial Noise Detection Method for Remote Sensing Images
YAO Xudong, GUO Yaping, LIU Mengyang, MENG Gang, LI Yang, ZHANG Haopeng
2025, 47(6): 1633-1644.   doi: 10.11999/JEIT241157
Abstract:
  Objective  The development of remote sensing technology has expanded its range of applications. However, during image acquisition and transmission, various factors can introduce noise that reduces image quality and clarity, affecting the extraction of ground object information. In particular, adversarial noise poses serious security risks, as it compromises the robustness of intelligent algorithms and may lead to decision failures. Evaluating the accuracy and reliability of remote sensing image data is therefore essential, highlighting the need for dedicated adversarial noise detection methods. Existing adversarial defense strategies primarily detect adversarial samples generated by specific attack methods, but these approaches often exhibit high computational cost, limited transferability, and lack pixel-level detection capabilities. In large-scale remote sensing images, adversarial noise is typically concentrated in key local regions containing ground objects. To address these limitations, this study proposes an uncertainty-driven, pixel-level adversarial noise detection method for remote sensing images. The method integrates adversarial noise characteristic analysis with uncertainty modeling, enabling precise localization of adversarial noise and improving the reliability of remote sensing applications.  Methods  To address the limitations of existing adversarial sample detection algorithms, an uncertainty-driven pixel-level adversarial noise detection method is proposed. The approach uses Monte Carlo Batch Normalization (MCBN) for uncertainty modeling and exploits the typically high uncertainty of adversarial noise to enable pixel-level detection. In deep neural networks, inference based on the stochasticity of the batch mean and variance in Batch Normalization (BN) layers is theoretically equivalent to variational inference in Bayesian models. This enables pixel-wise uncertainty estimation without modifying the network architecture or training process. 
In general, high-frequency regions such as edges exhibit greater uncertainty. In adversarial samples, however, artificially altered texture details introduce abnormal uncertainty. The uncertainty in these regions increases with the intensity of the adversarial noise. The proposed method comprises three main components: a feature extraction network, adversarial sample identification, and pixel-level adversarial noise detection. The input image is processed by a feature extraction network with BN layers to generate multiple Monte Carlo samples. The mean of these samples is treated as the reconstructed image, and the standard deviation is used to generate the uncertainty map. To identify adversarial samples, the algorithm calculates the Mean Squared Error (MSE) between the reconstructed image and the input image. If the image is classified as adversarial, the corresponding uncertainty map is further used to localize adversarial noise at the pixel level.  Results and Discussions  The experimental evaluation first quantifies the performance of the proposed method in adversarial sample detection and performs a comparative analysis with existing approaches. It also examines the effectiveness of pixel-level adversarial noise detection from both quantitative and qualitative perspectives. Experimental results show that the proposed algorithm achieves high detection performance and strong adaptability to various adversarial attacks, with robust generalization capability. Specifically, the method maintains detection accuracy above 0.87 against adversarial samples generated by four attack algorithms—FGSM, BIM, DeepFool, and AdvGAN—indicating consistent generalization across different adversarial methods. Although adversarial samples generated by DeepFool exhibit higher visual imperceptibility, the proposed method sustains stable performance across all evaluation metrics. This robustness highlights its adaptability even to potential unknown adversarial attacks. 
To further evaluate its effectiveness, the method is compared with existing adversarial sample detection algorithms, including MAD, PACA, E2E-Binary, and DSADF. The results indicate that the proposed method achieves competitive results in accuracy, precision, recall, and F1-score, reflecting strong overall performance in adversarial sample detection. For adversarial samples, the method also performs pixel-level adversarial noise detection. Results confirm its effectiveness in identifying various types of adversarial noise, with high accuracy in localizing noise within specific regions, such as baseball fields and storage tanks. It successfully detects most noise-affected areas in remote sensing images. However, complex textures and high-frequency details in some background regions cause increased uncertainty, which may lead to false positives, with non-adversarial regions misclassified as adversarial noise. Despite this limitation, the method maintains high overall detection accuracy and a low false negative rate, supporting its practical value in high-security applications.  Conclusions  To address the limitations of existing adversarial noise detection algorithms, this study proposes an uncertainty-driven pixel-level detection method for remote sensing images. The approach integrates MCBN into the feature extraction network to generate multiple Monte Carlo samples. The sample mean is used as the reconstructed image, while the sample standard deviation provides uncertainty modeling. The method determines whether an image is adversarial based on the difference in MSE between clean and adversarial samples, and the uncertainty map is utilized to localize adversarial noise at the pixel level across various attack scenarios. Experiments are conducted using the publicly available DIOR dataset, with adversarial samples generated by four representative attack algorithms: FGSM, BIM, DeepFool, and AdvGAN. 
Quantitative and qualitative evaluations confirm the method’s effectiveness in detecting adversarial noise at the pixel level and demonstrate strong generalization across attack types. The ability to localize noise improves the transparency and interpretability of adversarial sample identification, supporting more informed and targeted mitigation strategies. Despite its strong performance, the method currently relies solely on uncertainty estimation and thresholding for segmentation, which may result in misclassification in regions with complex textures or high-frequency details. Future research will explore the integration of uncertainty modeling with additional features to improve detection accuracy in such regions.
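At inference time the MCBN pipeline reduces to repeated stochastic forward passes. In the sketch below a toy `forward` injects Gaussian noise in place of BN-batch randomness, so the numbers are only illustrative; the structure (sample mean as reconstruction, per-pixel standard deviation as the uncertainty map, an MSE threshold for image-level flagging) follows the abstract.

```python
import numpy as np

def mc_uncertainty(forward, x, n_samples=16):
    """Run n stochastic reconstruction passes; return the sample mean
    (reconstructed image), per-pixel std (uncertainty map), and the MSE
    between reconstruction and input used for adversarial flagging."""
    samples = np.stack([forward(x) for _ in range(n_samples)])
    recon = samples.mean(axis=0)
    uncertainty = samples.std(axis=0)
    mse = float(np.mean((recon - x) ** 2))
    return recon, uncertainty, mse

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
# Stand-in for a BN-stochastic network: identity plus small noise.
forward = lambda img: img + rng.normal(scale=0.1, size=img.shape)
recon, unc, mse = mc_uncertainty(forward, image, n_samples=64)
is_adversarial = mse > 0.05  # threshold value is illustrative only
print(unc.shape, is_adversarial)  # (8, 8) False
```

When an image is flagged as adversarial, thresholding the `uncertainty` map per pixel localizes the noisy regions, which is what gives the method its pixel-level granularity.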
Evolutionary Optimization for Satellite Constellation Task Scheduling Based on Intelligent Optimization Engine
DU Yonghao, LI Lei, XU Shilong, CHEN Ming, CHEN Yingguo
2025, 47(6): 1645-1657.   doi: 10.11999/JEIT240974
Abstract:
  Objective  The expansion of China’s aerospace capabilities has led to the widespread deployment of remote sensing satellites for applications such as land resource surveys and disaster monitoring. However, current methods face substantial challenges in the integrated scheduling of complex targets, including multi-frequency observations, dense point clusters, and wide-area imaging. This study develops an intelligent task planning engine architecture tailored for heterogeneous satellite constellations. By applying advanced modeling and evolutionary optimization techniques, the proposed framework addresses the collaborative scheduling of multi-dimensional targets, aiming to overcome key limitations in traditional satellite mission planning.  Methods  Through systematic analysis of models and algorithms, this study decouples the “Constraint-Decision-Reward” framework and develops an optimization algorithm module featuring “global evolution + local search + data-driven” strategies. At the modeling level, standard tasks are derived via target decomposition, and a multi-dimensional scheduling model for complex targets is established. At the algorithmic level, a Learning Memetic Algorithm (LMA) based on dual-model evolution is proposed. This approach incorporates strategies for initial solution generation, global optimization, and a generalized neighborhood search operator template to improve solution diversity and enhance global exploration capabilities. Additionally, data-driven optimization and dynamic multi-stage rapid insertion strategies are introduced to address real-time scheduling requirements.  Results and Discussions   Comprehensive experimental comparisons are conducted across three scenario scales—low, medium, and high difficulty—and three task planning scenarios (static scheduling, dynamic three-stage scheduling, and dynamic twelve-stage scheduling). Both classical and advanced algorithms are evaluated. 
Ablation experiments (Tables 4 and 5) assess the contribution of each component within the LMA. In all task scenarios, the proposed method consistently outperforms advanced algorithms, including adaptive large neighborhood search and the reinforcement learning genetic algorithm, as shown in Figure 11 and Table 3. The algorithm reliably completes iterations within 20 seconds, demonstrating high computational efficiency.  Conclusions  By standardizing complex targets and generating tasks, this research effectively addresses the integrated scheduling challenge of multi-dimensional objectives across heterogeneous resources. Experimental results show that the LMA outperforms traditional algorithms in terms of both solution quality and computational efficiency. The dual-model evolution mechanism enhances the algorithm’s global search capabilities, while the dynamic insertion strategy effectively handles scenarios with dynamically arriving tasks. These innovations highlight the algorithm’s significant advantages in aerospace mission scheduling.
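The "global evolution + local search" structure of a memetic algorithm can be sketched on a toy stand-in for task scheduling: select a subset of tasks to maximize total reward under a resource budget. The population size, operators, and toy objective are assumptions for illustration, not the LMA itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scheduling surrogate: 20 candidate tasks with random rewards/costs.
reward = rng.uniform(1, 10, size=20)
cost = rng.uniform(1, 5, size=20)
budget = 30.0

def fitness(sol):
    """Total reward of selected tasks; infeasible plans score -1."""
    return float(np.dot(sol, reward)) if np.dot(sol, cost) <= budget else -1.0

def local_search(sol):
    """Memetic ingredient: greedy one-flip neighborhood improvement."""
    best = sol.copy()
    for i in range(len(sol)):
        cand = best.copy()
        cand[i] ^= 1
        if fitness(cand) > fitness(best):
            best = cand
    return best

def memetic(pop_size=20, generations=30):
    pop = rng.integers(0, 2, size=(pop_size, 20))
    for _ in range(generations):
        # Global evolution: keep the fitter half, recombine, mutate.
        fits = np.array([fitness(p) for p in pop])
        parents = pop[np.argsort(fits)[-pop_size // 2:]]
        children = []
        for _ in range(pop_size):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.integers(0, 2, size=20).astype(bool)
            child = np.where(mask, a, b)       # uniform crossover
            child[rng.integers(20)] ^= 1       # one-bit mutation
            children.append(local_search(child))  # local refinement
        pop = np.array(children)
    fits = np.array([fitness(p) for p in pop])
    return pop[fits.argmax()], float(fits.max())

best, best_fit = memetic()
print(best_fit > 0)  # True: a feasible, positive-reward plan was found
```

The defining memetic ingredient is the `local_search` call applied to every offspring: local refinement sharpens each globally generated candidate, mirroring the paper's pairing of global optimization with a generalized neighborhood search operator template.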
A Model Pre-training Method with Self-Supervised Strategies for Multimodal Remote Sensing Data
DIAO Wenhui, GONG Shuo, XIN Linlin, SHEN Zhiping, SUN Chao
2025, 47(6): 1658-1668.   doi: 10.11999/JEIT241016
Abstract:
  Objective  With the advancement of the remote sensing field and large model technologies, self-supervised learning enables model training on unlabeled remote sensing data through a mask-and-reconstruction approach. However, existing masking strategies primarily focus on spatial feature modeling while overlooking spectral feature modeling, resulting in an insufficient exploitation of spectral dimension information in spectral data. To address these challenges, this paper explores the imaging mechanisms and data characteristics of remote sensing and constructs a foundational pretraining model for self-supervised learning that supports multimodal remote sensing image data input, thereby providing a new approach for pretraining on multimodal remote sensing image data.  Methods  By exploring the imaging mechanisms and data characteristics of remote sensing, this paper constructs a foundational pretraining model for self-supervised learning based on Masked AutoEncoders (MAE) that supports the input of Synthetic Aperture Radar (SAR), Light Detection And Ranging (LiDAR), and HyperSpectral Imaging (HSI) data. The model employs a spatial branch that randomly masks pixel blocks to reconstruct missing pixels, and a spectral branch that randomly masks spectral channels to reconstruct the missing frequency information. This dual-branch design enables the model to effectively capture both spatial and spectral features of multimodal remote sensing image data, thereby improving the accuracy of pixel-level land cover classification.  Results and Discussions  The model was evaluated on land cover classification tasks using two publicly available datasets: the Berlin dataset and the Houston dataset. The experimental results demonstrate that the dual-channel attention mechanism more effectively extracts features from multimodal remote sensing image data. Through iterative parameter tuning, the model determined optimal hyperparameters tailored to each dataset. 
Compared to mainstream self-supervised learning methods such as BYOL, SimCLR, and SimCLRv2, our model achieved improvements in land cover classification accuracy of 1.98% on the Berlin dataset (Table 3, Fig. 7) and 2.49% on the Houston dataset (Table 4, Fig. 8), respectively.  Conclusions  This paper proposes a model for multimodal remote sensing image data classification, which comprises two main components: a spatial branch and a spectral branch. The spatial branch is designed to process the spatial information of images by applying masking to randomly selected image patches and reconstructing the missing pixels, thereby enhancing the model’s understanding of spatial structures. The spectral branch performs masking on randomly selected spectral channels with the goal of reconstructing the missing spectral responses, effectively leveraging the spectral dimension of hyperspectral data. Experimental results indicate that the proposed model can efficiently extract and utilize both spatial and spectral information, leading to a significant improvement in classification accuracy.
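The dual-branch mask-and-reconstruct setup can be sketched as two masking operators on a hyperspectral cube; the patch size and mask ratios below are illustrative assumptions, and the reconstruction networks are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_spatial(cube, patch=4, ratio=0.5):
    """Spatial branch: zero out randomly chosen patch x patch pixel blocks
    across all channels; the model learns to reconstruct the pixels."""
    C, H, W = cube.shape
    out = cube.copy()
    n_h, n_w = H // patch, W // patch
    idx = rng.choice(n_h * n_w, size=int(n_h * n_w * ratio), replace=False)
    for k in idx:
        r, c = divmod(k, n_w)
        out[:, r*patch:(r+1)*patch, c*patch:(c+1)*patch] = 0.0
    return out, idx

def mask_spectral(cube, ratio=0.5):
    """Spectral branch: zero out randomly chosen whole channels; the
    model learns to reconstruct the missing spectral responses."""
    C = cube.shape[0]
    idx = rng.choice(C, size=int(C * ratio), replace=False)
    out = cube.copy()
    out[idx] = 0.0
    return out, idx

cube = rng.normal(size=(16, 8, 8))  # (channels, H, W) toy HSI cube
spat, _ = mask_spatial(cube)
spec, ch_idx = mask_spectral(cube)
print(spat.shape, spec.shape)  # (16, 8, 8) (16, 8, 8)
```

Training each branch to invert its own masking operator is what forces the encoder to capture spatial structure and spectral structure separately before the features are fused for classification.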
A Novel Earth Surface Anomaly Detection Method Based on Collaborative Reasoning of Deep Learning and Remote Sensing Indexes
WANG Libo, GAO Zhi, WANG Qiao
2025, 47(6): 1669-1678.   doi: 10.11999/JEIT240882
Abstract:
  Objective  Earth Surface Anomalies (ESAs) refer to geographical phenomena that deviate from the normal state. They are characterized by wide distribution, high occurrence frequency, rapid evolution, and a large impact range. In recent years, sudden surface anomalies have occurred frequently, making remote sensing surface anomaly detection a prominent research topic. Although deep learning-based anomaly detection methods have made substantial progress, they still face two challenges: (1) limited learning ability under conditions of few samples, and (2) unreliable reasoning when identifying surface anomaly scenes with high inter-class similarity. To address these challenges, a novel surface anomaly detection method, DeepIndex, is proposed. This method leverages prior knowledge from large vision-language models to enhance few-sample learning and integrates remote sensing indexes to improve the reliability of identifying complex and similar surface anomaly scenes.  Methods  A novel scheme of “large-scale pre-trained foundational model + efficient fine-tuning” is employed to construct the entire network and implement training, thereby enabling efficient learning of surface anomaly features under conditions with few samples. Specifically, the foundational vision-language model, Contrastive Language-Image Pretraining (CLIP), is selected as the backbone of DeepIndex, with an efficient fine-tuning module developed to enhance few-sample learning. Leveraging the vision-language structure, DeepIndex can simultaneously encode image and text features, with the output category determined by text input, granting it open-set classification capability. Furthermore, DeepIndex innovatively integrates remote sensing indexes and physical mechanisms into the reasoning process, improving both interpretability and generalization performance. 
Specifically, DeepIndex first computes remote sensing indexes and applies an adaptive threshold segmentation method to generate binary segmentation maps. These maps are then processed to output the area ratio of the anomalous region. Based on the area ratio (with a default threshold of 0.1), potential surface anomaly categories are identified. The classification weights of these potential categories are then increased by 20%. Finally, DeepIndex uses the increased weights for classification, improving the identification of surface anomaly scenes with high inter-class similarity and enhancing reasoning reliability. Notably, DeepIndex increases weights only for categories with lower original confidence (<0.5), achieving a balance between regular and confused samples for stable classification. In summary, DeepIndex utilizes vision-language representation learning to develop a collaborative reasoning framework that integrates remote sensing indexes for surface anomaly detection. This framework improves the deep network’s reasoning capabilities and realizes the complementary advantages of deep learning and remote sensing indexes.  Results and Discussions  The effectiveness and superiority of the proposed DeepIndex are demonstrated using a self-constructed dataset, MultiSpectral Earth Surface Anomaly Detection (MS-ESAD), and the public dataset, NWPU45. The MS-ESAD dataset is challenging, containing 2,768 multispectral remote sensing images across six bands (red, green, blue, infrared, and two short infrared bands) and three types of surface anomalies (wildfire, green tide, and blue algae). This dataset provides a foundation for surface anomaly detection research. For evaluation, class Average Accuracy (AA) and Overall Accuracy (OA) metrics are used for both datasets. 
The ablation study (Tables 2 and 3) shows that the proposed DeepIndex collaborative reasoning framework significantly enhances zero-shot classification performance (9.84%) and improves the identification of confusing samples (7.39%). Quantitative and qualitative comparisons (Fig. 4, Table 4) further illustrate that DeepIndex achieves the best class AA (92.36%), which is 3.38% higher than the classic convolutional neural network ResNet and 0.42% higher than ViT. Additionally, compared to recent remote sensing scene classification networks, DeepIndex demonstrates more stable performance, owing to the integration of remote sensing index priors. For the NWPU45 dataset, experimental results (Fig. 5, Table 5) further highlight the advantages of DeepIndex under conditions with few samples (10% and 20% for training). Compared with advanced remote sensing image scene classification methods (e.g., EMSCNet) from the past two years, DeepIndex shows a slight accuracy advantage of 0.17% and 0.31%, respectively. These results demonstrate the strong application potential of DeepIndex for remote sensing image scene classification tasks, especially with limited training samples.  Conclusions  This paper combines physically constrained remote sensing indexes with deep networks and proposes a collaborative reasoning deep framework for Earth surface anomaly detection, named DeepIndex. Through large-scale pre-training and adaptive fine-tuning strategies, DeepIndex effectively learns highly generalized features from scarce samples. Additionally, DeepIndex adopts a unique reasoning pattern that utilizes remote sensing index priors to assist network discrimination, enhancing its ability to recognize complex and ambiguous surface anomaly scenes. Furthermore, this paper constructs a multispectral surface anomaly dataset that provides valuable data support for related research. 
The experimental results demonstrate that the integration of remote sensing indexes significantly improves classification performance under conditions with limited training samples. Compared with other advanced remote sensing scene classification methods, DeepIndex shows notable advantages in both accuracy and stability.
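The collaborative reasoning step described in this abstract (an area-ratio test with a default threshold of 0.1, and a 20% weight boost applied only to candidate categories whose original confidence is below 0.5) can be sketched as follows; the function and argument names are illustrative, not from the paper:

```python
def index_guided_weights(clip_probs, anomaly_area_ratio,
                         area_threshold=0.1, boost=0.20, confidence_cap=0.5,
                         index_to_classes=None):
    """Sketch of the DeepIndex collaborative reasoning step as the abstract
    describes it: if the remote-sensing-index segmentation covers more than
    `area_threshold` of the scene, the anomaly classes linked to that index
    have their classification weight raised by `boost`, but only when their
    original confidence is below `confidence_cap`."""
    adjusted = dict(clip_probs)
    if anomaly_area_ratio > area_threshold and index_to_classes:
        for cls in index_to_classes:
            if adjusted.get(cls, 0.0) < confidence_cap:
                adjusted[cls] = adjusted[cls] * (1.0 + boost)
    return adjusted

probs = {"wildfire": 0.30, "green_tide": 0.28, "forest": 0.42}
# A burn-related index segmented 18% of the scene, pointing at the wildfire class.
out = index_guided_weights(probs, 0.18, index_to_classes=["wildfire"])
```

The capped boost is what balances regular and confused samples: classes the network already rates highly are left untouched.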
Geometric Consistency-Based Neural Radiance Field for Satellite City Scene Rendering and Digital Surface Model Generation under Sparse Viewpoints
SUN Wenbo, GAO Zhi, ZHANG Yichen, ZHU Jun, LI Yanzhang, LU Yao
2025, 47(6): 1679-1689.   doi: 10.11999/JEIT240898
Abstract:
  Objective   Satellite-based Earth observation enables global, continuous, multi-scale, and multi-dimensional surface monitoring through diverse remote sensing techniques. Recent progress in 3D modelling and rendering has seen widespread adoption of Neural Radiance Fields (NeRF), owing to their continuous-view synthesis and implicit geometry representation. Although NeRF performs robustly in areas such as autonomous driving and large-scale scene reconstruction, its direct application to satellite observation scenarios remains limited. This limitation arises primarily from the nature of satellite imaging, which often lacks the tens or hundreds of viewpoints typically required for NeRF training. Under sparse-view conditions, NeRF tends to overfit the available training perspectives, leading to poor generalization to novel viewpoints.   Methods   To address the performance limitations of NeRF under sparse-view conditions, this study proposes an approach that introduces geometric constraints on scene depth and surface normals during model training. These constraints are designed to compensate for the lack of prior knowledge inherent in sparse-view satellite imagery and to improve rendering and DSM generation. The approach leverages the importance of scene geometry in both novel view synthesis and DSM generation, particularly in accurately representing spatial structures through DSMs. To mitigate the degradation in NeRF performance under limited viewpoint conditions, the geometric relationships between scene depth and surface normals are formulated as loss functions. These functions enforce consistency between estimated depth and surface orientation, enabling the model to learn more reliable geometric features despite limited input data. The proposed constraints guide the model toward generating geometrically coherent and realistic scene reconstructions.   
Results and Discussions   The proposed method is evaluated on the DFC2019 dataset to assess its effectiveness in novel view synthesis and DSM generation under sparse-view conditions. Experimental results demonstrate that the NeRF model with geometric constraints achieves superior performance across both tasks, confirming its applicability to satellite observation scenarios with limited viewpoints. For novel view synthesis, model performance is assessed using 2, 3, and 5 input images. The proposed method consistently outperforms existing approaches across all configurations. In the JAX 004 scene, Peak Signal-to-Noise Ratio (PSNR) values of 21.365 dB, 21.619 dB, and 23.681 dB are achieved under the 2-view, 3-view, and 5-view settings, respectively. Moreover, the method exhibits the smallest degradation in PSNR and Structural Similarity Index (SSIM) as the number of training views decreases, indicating greater robustness under sparse input conditions. Qualitative results further confirm that the method yields sharper and more detailed renderings across all view configurations. For DSM generation, the proposed method achieves comparable or better performance relative to other NeRF-based approaches in most test scenarios. In the JAX 004 scene, Mean Absolute Error (MAE) values of 2.414 m, 2.198 m, and 1.602 m are obtained under the 2-view, 3-view, and 5-view settings, respectively. Qualitative assessments show that the generated DSMs exhibit clearer structural boundaries and finer geometric details compared to those produced by baseline methods.   Conclusions   Incorporating geometric consistency constraints between scene depth and surface normals enhances the model’s ability to capture the spatial structure of objects in satellite imagery. The proposed method achieves state-of-the-art performance in both novel view synthesis and DSM generation tasks under sparse-view conditions, outperforming both NeRF-based and traditional Multi-View Stereo (MVS) approaches.
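The geometric link that the depth-normal consistency constraint exploits can be illustrated with a simplified finite-difference example: normals derived from a rendered depth map should agree with separately predicted normals, and the disagreement can be penalized as a loss. This is a toy orthographic sketch with illustrative names, not the paper's volume-rendering formulation:

```python
import math

def normals_from_depth(depth):
    """Finite-difference surface normals from a depth map (row-major 2D list),
    using a simplified orthographic approximation of the depth-to-normal
    geometric relationship."""
    h, w = len(depth), len(depth[0])
    normals = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            dzdx = depth[i][min(j + 1, w - 1)] - depth[i][max(j - 1, 0)]
            dzdy = depth[min(i + 1, h - 1)][j] - depth[max(i - 1, 0)][j]
            n = (-dzdx, -dzdy, 1.0)
            norm = math.sqrt(sum(c * c for c in n))
            normals[i][j] = tuple(c / norm for c in n)
    return normals

def consistency_loss(normals_a, normals_b):
    """Mean (1 - cosine similarity) between two unit-normal fields:
    zero when depth-derived and predicted normals agree everywhere."""
    total, count = 0.0, 0
    for row_a, row_b in zip(normals_a, normals_b):
        for na, nb in zip(row_a, row_b):
            total += 1.0 - sum(x * y for x, y in zip(na, nb))
            count += 1
    return total / count

flat = [[5.0] * 4 for _ in range(4)]   # planar scene: all normals point "up"
n = normals_from_depth(flat)
loss = consistency_loss(n, n)          # identical fields give zero loss
```

Enforcing a loss of this kind during training is what pushes the radiance field toward geometrically coherent reconstructions even with few views.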
Earth Surface Anomaly Detection Using Graph Neural Network-based Representation and Reasoning of Remote Sensing Geographic Object Relationships
LIU Siqi, GAO Zhi, CHEN Boan, LU Yao, ZHU Jun, LI Yanzhang, WANG Qiao
2025, 47(6): 1690-1703.   doi: 10.11999/JEIT240883
Abstract:
  Objective  The increasing frequency and severity of surface anomalies induced by natural processes and human activities have raised the demand for real-time, intelligent remote sensing systems for disaster monitoring and emergency response. Existing approaches to extracting geographic object relationships in remote sensing images primarily rely on object detection models. These approaches often lack sufficient localization precision and fail to capture topological dependencies between objects. Moreover, the absence of standardized, high-quality datasets restricts progress in model development. To address these limitations, this study proposes a framework that integrates graph-based representation with a Graph Neural Network (GNN) architecture to reason over geographic object relationships. The main objectives are to: (1) construct a semantically annotated dataset of geographic object relationships in remote sensing imagery; (2) develop a GNN-based model to improve relationship prediction accuracy; and (3) evaluate the model’s effectiveness in detecting and interpreting surface anomalies by analyzing pre- and post-disaster relationship patterns across a range of scenarios.  Methods  The methodology comprises three primary components: dataset construction, model development, and performance evaluation. To address the scarcity of labeled data, a semantic relationship dataset is constructed. Thirty high-resolution remote sensing images from the OpenEarthMap dataset are manually annotated using EISeg software, resulting in 17 object categories (Table 1) and five semantic relationships (contain, connect, on, along, and beside) defined through analysis of topological interactions (Table 2). Instance-level annotations are generated using connected component labeling, and relationship labels are assigned based on both topological configuration and object category. The resulting dataset includes 7,063 annotated entities and 13,273 relationship triplets. 
A GNN-based model is developed to predict semantic relationships, incorporating subgraph sampling and hyperparameter optimization. The model employs the Personalized PageRank (PPR) algorithm to extract query-relevant subgraphs, thereby reducing computational complexity while preserving essential topological structure. Message passing mechanisms from RED-GNN are used to propagate node features, and Bayesian optimization is applied to tune hyperparameters. Model performance is assessed using standard metrics: Mean Reciprocal Rank (MRR), HITS@1, and HITS@10.  Results and Discussions  Extensive experiments demonstrate the high performance of the proposed framework. On the constructed dataset, the model achieves an MRR of 0.9879 on the test set, with HITS@1 and HITS@10 scores of 97.03% and 99.96%, respectively, outperforming baseline methods such as RED-GNN and Grail (Table 5). Ablation studies confirm the effectiveness of the PPR sampling strategy, which outperforms random walk, breadth-first search, and standard PageRank in terms of both accuracy and efficiency (Table 6). Model generalizability is further assessed using pre- and post-disaster images from the xBD dataset. In hurricane-affected regions (Fig. 6, Fig. 7), abnormal relationships, such as “sea lake pond contain residential area”, emerge, reflecting the submergence of buildings and roads due to flooding. Frequency histograms (Fig. 8, Fig. 9) indicate a post-disaster decrease in relationship diversity and a shift toward water-related spatial associations. In wildfire scenarios (Fig. 10 to Fig. 13), relationships such as “bareland contain rangeland” replace “tree beside rangeland”, suggesting vegetation loss and soil exposure. These findings demonstrate the model’s capacity to detect spatial and semantic shifts in geographic object relationships caused by disasters. Coarse anomaly localization is achieved through centroid-based node mapping, enabling interpretation of surface anomaly dynamics over time.  
Conclusions  This study contributes to remote sensing-based surface anomaly detection through three main innovations. First, a high-quality semantic relationship dataset is constructed with pixel-level annotations and standardized relationship definitions, addressing the lack of labeled data in this area. The dataset includes 17 object categories and five topologically defined relationship types, offering a valuable benchmark for future research. Second, a novel GNN-based model is developed that advances relationship prediction by integrating PPR-based subgraph sampling with optimized message passing mechanisms. Third, the framework is extensively validated using real-world disaster scenarios, demonstrating its practical utility in detecting and interpreting surface anomalies through changes in object relationships. The model’s ability to produce interpretable relationship graphs while maintaining computational efficiency supports its application in time-sensitive emergency response contexts. Future work will focus on expanding image diversity, refining relationship definitions, and incorporating real-world noise to improve robustness.
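The PPR-based subgraph sampling that keeps the model's reasoning tractable can be sketched with a plain power iteration: score every node by its relevance to the query's seed entity, then keep only the top-scoring nodes and the edges among them. The toy graph and all names below are illustrative, not from the paper:

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=50):
    """Power iteration for Personalized PageRank with restart probability
    `alpha` at the query's seed node. `adj` maps each node to its
    neighbor list."""
    nodes = list(adj)
    scores = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: alpha if n == seed else 0.0 for n in nodes}
        for n in nodes:
            out = adj[n]
            if not out:
                continue
            share = (1.0 - alpha) * scores[n] / len(out)
            for m in out:
                nxt[m] += share
        scores = nxt
    return scores

def ppr_subgraph(adj, seed, k=3):
    """Keep the top-k most query-relevant nodes (always including the seed)
    and the edges among them, discarding the rest of the graph."""
    scores = personalized_pagerank(adj, seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = set(ranked[:k]) | {seed}
    return {n: [m for m in adj[n] if m in keep] for n in keep}

# Toy geographic-object graph: a road connects two buildings; a lake sits apart.
graph = {"road": ["house", "school"], "house": ["road"],
         "school": ["road"], "lake": ["road"]}
sub = ppr_subgraph(graph, seed="road", k=3)
```

Message passing then runs only on the extracted subgraph, which is how the sampling step trades a small amount of context for a large reduction in cost.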
Non-zero Frequency Clutter Cancellation Method for Passive Bistatic Radar
CHEN Gang, SU Siyuan, WANG Jun, JIN Yi, XU Changzhi, ZHANG Meng, FU Shiwei
2025, 47(6): 1704-1711.   doi: 10.11999/JEIT241018
Abstract:
  Objective and Methods   In passive bistatic radar systems, in addition to strong direct-path signals and zero-frequency multipath signals, non-zero frequency clutter echoes are also present. The conventional method is ineffective at removing these non-zero frequency clutter signals because of their strong randomness. To address this issue, several algorithms, such as the Extensive Cancellation Algorithm (ECA) and the Extensive Cancellation Algorithm by subCarrier (ECA-C), have been proposed. However, these methods have limitations in terms of computational cost and signal applicability. To overcome these challenges, this paper proposes a novel clutter cancellation method for passive bistatic radar. First, two types of clutter subspaces are constructed: the conventional clutter subspace and the extended clutter subspace. By designing and solving a new cost function, the optimal clutter cancellation weight factor is derived. The clutter signals, including non-zero frequency components, are then removed. Residual clutter signals are further suppressed through range-Doppler processing. Simulation analysis and real-data applications demonstrate that the proposed method reduces computational complexity while maintaining effective clutter cancellation performance.   Results and Discussions  As shown in Fig. 2(a), the main lobe of the weak target echo is obscured by the sidelobes of strong clutter signals, preventing target detection. The noise floor level is 0.98 dB. In Fig. 2(b), although the direct-path and zero-frequency clutter are suppressed using the conventional method, the target echo remains undetectable due to non-zero frequency clutter. The noise floor level is –28.85 dB, lowering the detection floor by approximately 28 dB. In contrast, Fig. 2(c) and Fig. 2(d) show that the target is detected when applying the extended ECA method and the proposed method, as the direct-path signal, zero-frequency clutter, and non-zero frequency clutter are effectively removed. The noise floor level is –43.6 dB, indicating a further reduction of approximately 15 dB compared with the conventional method. The clutter cancellation time for both methods increases with the clutter cancellation order and data length. However, the processing time of the extended ECA method grows faster than that of the proposed method in both cases (Fig. 4). Validation using real data confirms that both targets are detected using the extended ECA method and the proposed method, as both effectively mitigate the effects of non-zero frequency clutter compared with the conventional method (Fig. 6). The processing time of the proposed method (13.56 s) is shorter than that of the extended ECA method (21.73 s). The results from real data further confirm the effectiveness of the proposed method.  Conclusions  This study proposes a new method for addressing the non-zero frequency clutter cancellation problem. In this approach, both the conventional clutter subspace and the extended clutter subspace are constructed. A new cost function is then designed and solved to achieve cancellation of both zero and non-zero frequency clutter. Residual clutter signals are further suppressed through range-Doppler processing. The performance of the proposed method is validated and compared with the extended ECA method through simulation results. Additionally, real-data applications confirm its effectiveness. This method effectively transforms high-order matrix operations into two low-order matrix operations, thereby reducing computational complexity. In practical applications, as the order of the clutter cancellation step increases, the computational advantage of the proposed method over the extended ECA method becomes more pronounced.
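The subspace idea behind ECA-style cancellation, projecting the surveillance snapshot out of a subspace spanned by clutter replicas (delayed and Doppler-shifted copies of the reference signal), can be sketched in a few lines. This real-valued toy illustrates the least-squares projection only; it does not reproduce the paper's extended subspace or its cheaper two-step cost function:

```python
def project_out(clutter_cols, x):
    """Remove the clutter subspace from surveillance snapshot `x` by
    least-squares projection: orthonormalize the clutter columns
    (Gram-Schmidt), then subtract each component from `x`."""
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))
    residual = list(x)
    basis = []
    for c in clutter_cols:
        q = list(c)
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:   # skip linearly dependent columns
            basis.append(q)
            coef = dot(residual, q) / dot(q, q)
            residual = [ri - coef * qi for ri, qi in zip(residual, q)]
    return residual

direct_path = [1.0, 0.0, 1.0, 0.0]   # toy direct-path replica
target_echo = [0.0, 0.5, 0.0, -0.5]  # weak echo, orthogonal here by construction
snapshot = [d + t for d, t in zip(direct_path, target_echo)]
cleaned = project_out([direct_path], snapshot)   # recovers the weak echo
```

In the actual methods the columns would be complex delayed/Doppler-shifted replicas, and the paper's contribution is making this solve cheaper by splitting one high-order matrix operation into two low-order ones.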
Fractional-Order Sliding Mode Fault-Tolerant Attitude Controller for Spacecraft
ZHANG Meng, WANG Linan, ZHENG Dezhi, YI Xiaojian
2025, 47(6): 1712-1722.   doi: 10.11999/JEIT250025
Abstract:
  Objective  Spacecraft attitude control under complex operational conditions remains limited by inadequate controller adaptability, insufficient precision, and rapid chattering near the sliding surface. Fractional-Order Sliding Mode Control (FOSMC), which integrates fractional calculus into control algorithms, offers improved modeling flexibility and robustness. Compared with conventional integer-order controllers, fractional-order controllers yield smoother responses and enhanced dynamic behavior. This study proposes a novel fault-tolerant control strategy that combines FOSMC with fault accommodation mechanisms to achieve accurate spacecraft attitude tracking in the presence of system faults and environmental disturbances.  Methods  A finite-time disturbance observer is proposed to unify actuator faults, inertia uncertainties, and external disturbances into a single lumped term. This formulation allows for accurate estimation of both the system state and disturbances, supporting effective compensation. The observer’s fast convergence and robustness are analytically demonstrated using finite-time stability theory. To further accelerate convergence and mitigate the chattering typically observed in conventional sliding mode control, a finite-time fault-tolerant controller based on fractional-order sliding mode is developed. This controller ensures finite-time stabilization of spacecraft attitude and angular velocity.  Results and Discussions  To evaluate the effectiveness and performance advantages of the proposed method, a comparative analysis is conducted against an Integer-Order Sliding Mode Controller (IOSMC) using MATLAB simulations. Figure 1 shows the estimation error of the proposed finite-time observer, demonstrating its ability to rapidly and accurately estimate the lumped disturbance term. 
Figure 2 presents the attitude response trajectories under both control strategies, with blue and red lines corresponding to the fractional-order and integer-order controllers, respectively. Although both methods successfully track the desired attitude, the FOSMC exhibits significantly faster convergence. Figure 3 displays the angular velocity error curves, indicating that the FOSMC achieves finite-time stabilization. In comparison with the IOSMC, the proposed controller yields quicker convergence and reduced steady-state error. Figure 4 illustrates the control torque profiles, revealing that the FOSMC produces smoother torque outputs.  Conclusions  This study proposes a spacecraft attitude controller based on fractional-order sliding mode theory to reduce the high-frequency chattering observed near the sliding surface in conventional terminal sliding mode control. By incorporating fractional-order calculus into the control framework, a fault-tolerant strategy is developed to enhance the performance of spacecraft attitude systems under uncertain and faulty conditions. Simulation results validate the following advantages: (1) Compared with the IOSMC algorithm, the proposed controller achieves faster convergence and improved chattering suppression. (2) Actuator faults, inertia uncertainties, and external disturbances are consolidated into a unified disturbance term, which is accurately estimated by a finite-time disturbance observer for effective compensation. (3) The use of a fractional-order non-singular terminal sliding surface, together with Lyapunov-based analysis, provides a rigorous guarantee of finite-time stability. Moreover, the fractional-order sliding surface increases the design flexibility, allowing broader optimization of controller parameters. This work addresses finite-time fault-tolerant control for spacecraft attitude systems. 
Future research may investigate fixed-time fault-tolerant control approaches to further improve robustness and ensure consistent response times.
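As background for the fractional-order sliding surface discussed above, the fractional derivative of a sampled signal is commonly approximated with Grünwald-Letnikov weights. The sketch below is generic textbook background rather than the paper's controller; the sample history and step size are illustrative:

```python
def gl_weights(alpha, n):
    """Grunwald-Letnikov binomial weights w_k = (-1)^k * C(alpha, k),
    via the standard recurrence w_k = w_{k-1} * (1 - (alpha + 1) / k)."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_fractional_derivative(samples, alpha, h):
    """Discrete approximation of the order-`alpha` fractional derivative at
    the latest sample -- the kind of operator a fractional-order sliding
    surface evaluates at each control step."""
    w = gl_weights(alpha, len(samples) - 1)
    acc = sum(wk * samples[-1 - k] for k, wk in enumerate(w))
    return acc / (h ** alpha)

# Sanity check: for alpha = 1 the weights reduce to [1, -1, 0, 0, ...], so the
# operator collapses to the ordinary backward difference.
samples = [0.0, 0.1, 0.4, 0.9]   # hypothetical attitude-error history
d1 = gl_fractional_derivative(samples, alpha=1.0, h=0.1)
```

Non-integer values of `alpha` give the operator memory over the whole error history, which is the extra design freedom the fractional-order sliding surface exploits.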
A Moving Target Detection Method for GEO SAR Image in Maritime Areas
WU Yifan, HUANG Lijia, YAN Chaobao, ZHANG Bingchen
2025, 47(6): 1723-1733.   doi: 10.11999/JEIT240906
Abstract:
  Objective  Geosynchronous Synthetic Aperture Radar (GEO SAR) provides wide-area coverage and rapid revisit, supporting near real-time observation of large maritime regions. However, detecting moving targets in GEO SAR images remains challenging due to severe geometric shifts and defocusing effects induced by target motion. These issues are further compounded when deep learning-based detection algorithms are applied. Specifically, the extended synthetic aperture time of GEO SAR leads to significant motion-induced defocusing, degrading the structural clarity of moving targets. Moreover, the wide swath of GEO SAR results in sparsely distributed moving targets, substantially increasing computational demands. The long integration time also renders GEO SAR imagery more sensitive to sea clutter, which elevates false alarm rates and obscures target backscattering signatures, thereby compromising target contour visibility. To address these challenges, this study analyzes the displacement and phase errors introduced by moving targets under the non-linear squint-angle imaging geometry of GEO SAR. Based on this analysis, a diffusion model-based method is proposed to detect maritime moving targets in GEO SAR imagery.  Methods  To address the aforementioned challenges, this study analyzes azimuth displacement and phase errors induced by target motion under the non-planar squint-angle imaging geometry of GEO SAR. Based on this analysis, a diffusion-based approach is proposed for detecting maritime moving targets in GEO SAR imagery. The method comprises two primary components: a preprocessing stage and a conditional diffusion detection network. In the preprocessing stage, the full-scene image is divided into smaller sub-scenes to mitigate the difficulty of directly processing ultra-wide-swath GEO SAR data. 
These sub-scenes are subsequently downsampled to dimensions compatible with network input, which enhances the signal-to-noise ratio and strengthens the representation of motion-related features. In the detection stage, a conditional diffusion detection network tailored for GEO SAR is developed. This network accepts the preprocessed image as conditional input to guide the generation of detection results, thereby improving accuracy. To further refine performance, a dense interaction module is introduced to facilitate multi-scale feature fusion between the segmentation mask and original data in the latent space, enabling precise segmentation of moving targets.  Results and Discussions  To evaluate the effectiveness of the proposed detection method, the imaging characteristics of sea clutter and moving targets under long synthetic aperture time are simulated. The sea surface is modeled using the Jonswap spectrum, and clutter is generated via a backscattering coefficient model, resulting in simulated imagery of sea clutter and moving targets under extended integration time (Fig. 5). A quantitative analysis is performed to examine the effect of downsampling on detection performance by calculating the Signal-to-Clutter Ratio (SCR), as summarized in Table 4. The results show that 8× downsampling improves the SCR by 11.3 dB. After preprocessing, the downsampled sub-scenes are fed into the detection network to produce moving target detection results (Fig. 6). Compared to other methods, the proposed approach yields improvements of 1.1%, 2.2%, and 0.9% in Intersection over Union (IoU), Accuracy, and F1-score, respectively.  Conclusions   To address the challenges of detecting moving maritime targets in GEO SAR images—such as ultra-long synthetic aperture duration, extremely wide swath, and interference from sea clutter—this study first analyzes azimuth displacement and phase errors introduced by target motion under the non-linear squint-angle imaging geometry of GEO SAR. 
A detection method is then proposed, comprising two key components: preprocessing and a conditional diffusion detection network. During preprocessing, the full-scene image is divided into multiple sub-scenes to facilitate processing of ultra-wide-swath GEO SAR data. These sub-scenes are downsampled to dimensions compatible with the detection network, improving the SCR, enhancing motion features, and suppressing clutter. In the detection stage, a conditional diffusion detection network customized for GEO SAR imagery is employed. This network uses the preprocessed sub-scenes as conditional input to guide the generation of detection results, enhancing accuracy. A dense interaction module is further incorporated to enable multi-scale feature coupling between the segmentation mask and the original data in the latent space, achieving pixel-level segmentation of moving targets. Simulation experiments confirm the method’s effectiveness in detecting moving maritime targets under complex oceanic conditions. Although initial progress has been achieved, further work is required to improve parameter extraction and optimize network performance. Future efforts will focus on refining detection models for more comprehensive analysis of moving targets in GEO SAR imagery.
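The preprocessing claim, that downsampling raises the SCR because block averaging cancels zero-mean clutter while an extended coherent target survives, can be checked on a one-dimensional toy profile. All numbers below are illustrative, not the paper's 11.3 dB figure:

```python
import math

def block_average(profile, f):
    """f-fold block averaging along one dimension -- a 1-D stand-in for the
    downsampling applied to GEO SAR sub-scenes during preprocessing."""
    return [sum(profile[i * f:(i + 1) * f]) / f for i in range(len(profile) // f)]

def scr_db(profile, target_idx):
    """Signal-to-Clutter Ratio in dB: peak target power over mean clutter power."""
    target = max(profile[i] ** 2 for i in target_idx)
    clutter = [profile[i] ** 2 for i in range(len(profile)) if i not in target_idx]
    return 10.0 * math.log10(target / (sum(clutter) / len(clutter)))

# Toy range profile: an extended target over zero-mean clutter. Averaging
# adjacent cells cancels most clutter, so the SCR rises.
clutter_cycle = [0.6, -0.4, -0.6, 0.4]
profile = [1.0 if 4 <= i <= 7 else clutter_cycle[i % 4] for i in range(16)]
scr_before = scr_db(profile, set(range(4, 8)))
down = block_average(profile, 2)            # target now occupies cells 2 and 3
scr_after = scr_db(down, {2, 3})
```

The same mechanism also shrinks the image to a size the detection network can ingest, which is why the preprocessing stage combines sub-scene tiling with downsampling.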
Remote Sensing Image Text Retrieval Method Based on Object Semantic Prompt and Dual-Attention Perception
TIAN Shu, ZHANG Bingxi, CAO Lin, XING Xiangwei, TIAN Jing, SHEN Bo, DU Kangning, ZHANG Ye
2025, 47(6): 1734-1746.   doi: 10.11999/JEIT240946
Abstract:
  Objective  High-resolution remote sensing imagery presents complex scene configurations, diverse semantic associations, and significant object scale variations, often resulting in overlapping feature distributions across categories in the latent space. These ambiguities hinder the model’s ability to capture intrinsic associations between textual semantics and visual representations, reducing retrieval accuracy in image-text retrieval tasks. This study aims to address these challenges by investigating object-level attention mechanisms and cross-modal feature alignment strategies. By dynamically allocating attention weights to salient object features and optimizing image-text feature alignment, the proposed approach enables more precise extraction of semantic information and achieves high-quality cross-modal alignment, thereby improving retrieval accuracy in remote sensing image-text retrieval.  Methods  Building on the theoretical foundations above, this study proposes an Object Semantic and Dual-attention Perception Model (OSDPM) for remote sensing image-text retrieval. OSDPM first utilizes a pretrained CLIP model to extract features from remote sensing images and their associated textual descriptions. A Dual-Attention Perception Network (DAPN) is then developed to characterize both global contextual information and salient object regions in the imagery. DAPN adaptively enhances the representation of salient objects with large scale variations by dynamically attending to significant local regions and integrating attention across spatial and channel dimensions. To address cross-modal heterogeneity between image and text features, an Object Semantic-aware Feature Clustering Module (OSFCM) is introduced. OSFCM conducts statistical analysis of the frequency of semantic nouns associated with object categories in image-text pairs, extracting high-probability semantic priors for the corresponding images. 
These semantic cues are used to guide the clustering of image features that exhibit ambiguity in the cross-modal feature space, thereby reducing distribution overlap across object categories. This targeted clustering enables precise alignment between image and text features and improves retrieval performance in remote sensing image-text tasks.  Results and Discussions  The proposed OSDPM integrates spatial-channel attention and adaptive saliency mechanisms to capture multiscale object information within image features. It then leverages semantic priors from textual descriptions to guide cross-modal feature alignment, improving retrieval performance in remote sensing image-text tasks. Experiments on the RSICD and RSITMD benchmark datasets show that OSDPM outperforms state-of-the-art methods by 9.01% and 8.83%, respectively (Table 1, Table 2). Comparative results for image-to-text and text-to-image retrieval (Fig. 6, Fig. 7) further confirm the superior retrieval accuracy achieved by the proposed approach. Feature heatmap visualizations (Fig. 5) indicate that the DAPN effectively captures both global contextual features and local salient object regions, maintaining spatial semantic consistency between visual and textual representations. In addition, t-SNE visualizations across training stages demonstrate that OSFCM mitigates feature distribution overlap among object categories, thereby improving feature alignment accuracy. Ablation studies (Table 3) confirm that each module in the proposed network contributes to retrieval performance gains.  Conclusions  This study proposes a remote sensing image-text retrieval method, OSDPM, to address challenges in object representation and cross-modal semantic alignment caused by complex scenes, diverse semantics, and scale variation in high-resolution remote sensing images. OSDPM first employs a pretrained CLIP model to extract global contextual features from both images and corresponding text descriptions. 
It then introduces the DAPN to capture salient object features by dynamically attending to significant local regions and adjusting attention across spatial and channel dimensions. Furthermore, the model incorporates an OSFCM, which extracts prior semantic information through frequency analysis of object category terms and uses these priors to guide the clustering of ambiguous image features in the embedding space. This strategy reduces semantic misalignment and facilitates accurate cross-modal mapping between image and text features. Experiments on the RSICD and RSITMD benchmark datasets confirm that OSDPM outperforms existing methods, demonstrating improved accuracy and robustness in remote sensing image-text retrieval.
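The spatial-channel attention integration described for the DAPN can be illustrated with a minimal sketch. This is illustrative plain Python on nested lists, not the authors' implementation: the sigmoid gating and average-pooling choices below are assumptions standing in for learned layers.

```python
import math

def channel_attention(fmap):
    """Channel attention: global-average-pool each channel of a
    C x H x W feature map, pass the pooled value through a sigmoid
    gate, and rescale that channel."""
    weights = []
    for ch in fmap:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weights.append(1.0 / (1.0 + math.exp(-mean)))  # sigmoid gate
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(fmap, weights)]

def spatial_attention(fmap):
    """Spatial attention: average across channels at each location,
    gate with a sigmoid, and rescale every channel at that location."""
    h, w = len(fmap[0]), len(fmap[0][0])
    gate = [[1.0 / (1.0 + math.exp(-sum(ch[i][j] for ch in fmap) / len(fmap)))
             for j in range(w)] for i in range(h)]
    return [[[ch[i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for ch in fmap]

def dual_attention(fmap):
    """Chain the two gates, mirroring the spatial-channel integration
    described in the abstract."""
    return spatial_attention(channel_attention(fmap))
```

In the model itself these gates would be learned convolutional layers; the sketch only shows how channel weighting and spatial weighting compose to emphasize salient regions.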
A Few-Shot Land Cover Classification Model for Remote Sensing Images Based on Multimodality
ZHOU Wei, WEI Mingan, XU Haixia, WU Zhiming
2025, 47(6): 1747-1761.   doi: 10.11999/JEIT241057
[Abstract](123) [FullText HTML](100) [PDF 10316KB](23)
Abstract:
  Objective   To address the challenges of broad coverage, limited sample annotation, and poor adaptability in category fusion for remote sensing images, this paper proposes a few-shot semantic segmentation model based on image-text multimodal fusion, termed the Few-shot Semantic Segmentation Network (FSSNet). FSSNet is designed to effectively utilize multimodal information to improve generalization and segmentation accuracy under data-scarce conditions.   Methods   The proposed model, FSSNet, adopts a classic encoder-decoder architecture. The encoder serves as the central component, extracting features from both remote sensing images and associated text. An interaction mechanism is introduced to semantically align and fuse these multimodal features, generating enriched semantic representations. Within the encoder, two modules are incorporated: a class information fusion module and an instance information extraction module. The class information fusion module is developed based on the CLIP model and leverages correlation principles to enhance the adaptation between support and query image-text pairs. Simultaneously, inspired by the pyramid feature structure, an improved feature pyramid network, referred to as IFPN, is constructed. The instance information extraction module, built upon IFPN, captures detailed regional features of target instances from support images. These instance areas serve as prior prompts to guide the recognition and segmentation of corresponding regions in query images. The IFPN further provides semantic context and fine-grained spatial details, enhancing the completeness and boundary precision of object detection and segmentation in query images. The decoder integrates class-level information, multi-scale instance features, and query image features through a semantic aggregation module operating at multiple scales. This module outputs four levels of aggregated features by concatenating inputs at different resolutions. 
Large-scale features, with higher resolution, improve the detection of small target regions, whereas small-scale features, with lower resolution and broader receptive fields, are better suited for identifying large targets. The integration of multi-scale features improves segmentation accuracy across varying object sizes. This framework enables few-shot classification and segmentation of land cover in remote sensing images by leveraging image–text multimodality.   Results and Discussions   To evaluate the performance of the proposed FSSNet model, extensive experiments are conducted on multiple representative datasets. On the standard few-shot semantic segmentation benchmark PASCAL-5i, FSSNet is compared with several mainstream models, including the Multi-Information Aggregation Network (MIANet). Under both 1-shot and 5-shot settings, FSSNet achieves higher mean Intersection over Union (mIoU) scores, exceeding State-Of-The-Art (SOTA) models by 2.29% and 1.96%, respectively. Further evaluation on three public remote sensing datasets—LoveDA, Potsdam, and Vaihingen—demonstrates model generalization across domains. FSSNet outperforms existing methods, with mIoU improvements of 2.1%, 1.4%, and 1.9%, respectively. For practical applicability, a custom dataset (HERSD) is constructed for hydraulic engineering, comprising various types of hydraulic infrastructure and land cover. On HERSD, FSSNet maintains robust performance, exceeding SOTA models by 1.89% in mIoU accuracy. Overall, the results indicate that FSSNet provides effective and robust performance in both standard benchmarks and real-world remote sensing tasks under few-shot learning conditions.  Conclusions   This paper presents a novel few-shot semantic segmentation network for remote sensing images, FSSNet, which demonstrates strong performance in data-constrained scenarios through the integration of image–text multimodal information and three specifically designed modules. 
Experimental results on multiple public and custom datasets confirm the effectiveness and robustness of the proposed approach, particularly in few-shot and small-sample object classification tasks, as well as in practical land cover classification applications. The proposed framework offers new perspectives and practical solutions for few-shot learning and cross-modal information fusion in remote sensing, facilitating broader adoption of remote sensing image analysis in real-world settings. Future work will focus on extending the model to zero-shot land cover classification by exploring additional multimodal data sources and more efficient feature fusion strategies.
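The use of support-image instance regions as prior prompts for the query image can be sketched with a masked-average prototype, a common baseline in few-shot segmentation. This is an illustrative Python sketch; the toy feature vectors, the cosine matching rule, and the 0.5 threshold are assumptions, not FSSNet's actual decoder.

```python
import math

def prototype(support_feats, support_mask):
    """Masked average pooling: average the support feature vectors
    that fall inside the annotated object region."""
    picked = [f for f, m in zip(support_feats, support_mask) if m]
    dim = len(picked[0])
    return [sum(f[i] for f in picked) / len(picked) for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def segment_query(query_feats, proto, thr=0.5):
    """Label each query location foreground if its feature is close
    enough (cosine similarity) to the support prototype."""
    return [1 if cosine(f, proto) > thr else 0 for f in query_feats]
```

In FSSNet the matching operates on multi-scale IFPN features rather than a single vector per location; the sketch shows only the prompt-then-match principle.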
Radar Emitter Individual Identification Based on Information Sidebands of Unintentional Phase Modulation on Pulses
HUANG Xiangsong, WANG Zhen, PAN Dapeng, ZHAO Yiyang
2025, 47(6): 1762-1771.   doi: 10.11999/JEIT240774
[Abstract](72) [FullText HTML](32) [PDF 4420KB](15)
Abstract:
  Objective  Radar Emitter Identification (REI) plays a critical role in complex battlefield environments and electronic warfare. Conventional methods for extracting Unintentional Phase Modulation On Pulse (UPMOP) features are often ineffective at distinguishing emitters of the same model and manufacturer, due to hardware similarities. This limitation hinders the ability to accurately identify individual emitters. To overcome this challenge, a new method is proposed that integrates Information Sidebands Of Unintentional Phase Modulation On Pulses (ISOUPMOP) with deep learning techniques. This approach mitigates phase ambiguity and noise, thereby enhancing the ability to discriminate between emitters of the same type. A Dual-Loop Circular DILated Convolutional Neural Network (DLCDIL-CNN) is used to expand the receptive field, improving the processing of long-sequence data. This design results in a more accurate identification of radar radiation sources with similar hardware characteristics.  Methods  Unintentional phase modulation refers to subtle fluctuations in the phase of radar signals, primarily caused by hardware instability and nonlinear characteristics of radar transmitters. This study first applies Variational Mode Decomposition (VMD) to denoise the UPMOP features by decomposing them into signal components at different frequency bands. These components are then analyzed using the Wavelet SynchroSqueezed Transform (WSST) to perform high-resolution time-frequency analysis and extract phase information from multiple time-frequency components. The mean of the time-frequency ridge line is used for discriminative analysis, and components containing ISOUPMOP are reconstructed as identification features. For individual identification, a DLCDIL-CNN is employed. The input data are split into two branches, each processed using Circular Dilated Convolution (CDC) layers with ring padding. 
This architecture expands the receptive field without introducing boundary effects, enabling the model to capture long-range dependencies and maintain robustness to data shifts.  Results and Discussions  Visualization experiments reveal that traditional UPMOP features are strongly influenced by the emitter model, resulting in poor differentiation among sources of the same type (Figure 9). HeatMap visualizations of the fully connected layers show higher Activation Values (AV) in the final layer of the DLCDIL-CNN than in those of CDIL-CNN and conventional CNNs (Figure 11). This indicates that the CDC used in DLCDIL-CNN is more effective at capturing global features. Robustness validation is conducted under varying Signal-to-Noise Ratio (SNR) conditions, comparing the proposed method with several other identification approaches. At 5 dB SNR, the ISOUPMOP feature extraction method combined with ResNet1D achieves an average identification accuracy of 66.17%, outperforming other methods. When paired with the DLCDIL-CNN, average accuracy increases to 87.58%, representing a 21.42% improvement over ResNet1D (Figure 12). Although DLCDIL-based identification using noisy UPMOP features remains functional, its accuracy is lower than that of the proposed approach. Moreover, smoothed UPMOP features fail to support accurate recognition, even under high SNR conditions. These results suggest that although noisy UPMOP retains individual information, emitters of the same model exhibit similar circuit behavior, and denoising may remove critical nonlinear characteristics. As the number of individual emitters increases, the smoothed UPMOP feature curve leads to feature aliasing, reducing recognition performance. In contrast, the ISOUPMOP method retains unintentional signal characteristics while eliminating trend components, thereby enhancing model generalizability and mitigating overfitting.  
Conclusions  To improve individual differentiation among radar sources of the same model, this study proposes a method that combines ISOUPMOP feature extraction with a DLCDIL-CNN architecture. The approach enhances feature discriminability by preserving subtle individual variations and improves identification accuracy through expanded receptive fields and reduced boundary effects. Experimental results confirm that the proposed method achieves an average accuracy of 87.58% under 5 dB SNR for 10 emitters of the same model. These findings indicate that the method effectively resolves the challenge of limited individual differentiation in traditional UPMOP-based techniques and provides a reliable framework for radar emitter individual identification.
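The ring-padding behaviour of the CDC layers, where indices wrap around the sequence instead of hitting a zero-padded boundary, can be sketched in one dimension as follows (illustrative Python, not the authors' implementation):

```python
def circular_dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution with ring (circular) padding: taps that
    run past the end of the sequence wrap around to the start, so the
    output has the same length as the input and no zero-padding
    boundary effects are introduced."""
    n, k = len(x), len(kernel)
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            acc += kernel[j] * x[(i + j * dilation) % n]  # ring index
        out.append(acc)
    return out
```

Stacking such layers with growing dilation rates is what lets the network cover long UPMOP sequences without losing samples at the pulse edges.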
Two-stage Constrained Weighted Least Squares method for Multistatic Passive Localization of a Moving Object Under Unknown Transmitter Position and Velocity
ZUO Yan, CHEN Wangrong, PENG Dongliang
2025, 47(6): 1772-1781.   doi: 10.11999/JEIT240664
[Abstract](109) [FullText HTML](45) [PDF 2218KB](23)
Abstract:
  Objective  This study addresses target localization in multistatic passive radar systems under conditions where the transmitter’s position and velocity are unknown. Multistatic passive radar systems utilize covert deployment, exhibit resilience to jamming, and provide wide-area coverage. Conventional localization techniques rely on precise transmitter state information, which is often unavailable due to the mobility of transmitters mounted on dynamic platforms. Environmental disturbances can further introduce inaccuracies in position and velocity measurements. In non-cooperative scenarios, direct acquisition of transmitter state information is typically infeasible. Existing localization methods, such as the Two-Step Weighted Least Squares (TSWLS) approach, exhibit a threshold effect under high noise conditions, while the Semi-Definite Programming (SDP) method achieves Cramér-Rao Lower Bound (CRLB) accuracy but incurs excessive computational costs, limiting real-time applicability. To address these challenges, a localization algorithm is formulated that enables high-precision tracking of moving targets under uncertain transmitter conditions while maintaining relatively low computational complexity.  Methods  Multistatic passive radar localization systems employ two receiving channels: the reference channel, which receives direct signals from the transmitter, and the surveillance channel, which captures signals reflected from the target. Delay-Doppler cross-correlation between the reflected and reference signals enables the measurement of time delay and Doppler shift. The time delay from the reference channel corresponds to the distance between the transmitter and receivers, denoted as the Direct Range (DR), while the associated Doppler shift represents the Direct Range Rate (DRR). 
Similarly, the bistatic time delay from the surveillance channel corresponds to the sum of the distances between the target, transmitter, and receivers, referred to as the Bistatic Range (BR), with the associated Doppler shift representing the Bistatic Range Rate (BRR). A two-stage localization algorithm is proposed for estimating the positions and velocities of both the target and transmitter. In the first stage, a Constrained Weighted Least Squares (CWLS) problem is formulated using DR and DRR measurements to estimate the transmitter’s position and velocity. In the second stage, the estimated transmitter state is incorporated into the DR/DRR and BR/BRR measurements to construct a new CWLS problem. This problem is then solved using the Quasi-Newton method to determine the position and velocity of the moving target.  Results and Discussions  Compared with traditional localization approaches that rely solely on indirect path information (BR/BRR), incorporating direct path information (DR/DRR) for joint estimation improves target localization accuracy when the transmitter’s position and velocity are unknown (Figure 1). The performance of the proposed two-stage localization algorithm is evaluated through Monte Carlo simulations and compared with the TSWLS method, the SDP approach, and the CRLB. Estimation accuracy is assessed using the Mean Squared Error (MSE), while algorithm complexity is evaluated based on runtime. In scenarios with only four receivers, the TSWLS algorithm fails to provide accurate estimates, whereas the proposed algorithm maintains localization performance, with deviations occurring only when noise reaches 25 dB (Figure 2). When five receivers with uncorrelated noise are used, the TSWLS algorithm deviates from the CRLB and exhibits a threshold effect at a measurement noise level of 10 dB. 
At 30 dB, the proposed algorithm reduces the MSE for target position estimation by approximately 7 m² compared to the TSWLS algorithm, slightly outperforming the SDP algorithm, and reduces the MSE for target velocity estimation by approximately 10 (m/s)², approaching the localization accuracy of the SDP algorithm (Figure 3). When five receivers with correlated noise are used, the TSWLS algorithm begins to deviate from the CRLB at a noise variance of 15 dB and exhibits significant performance degradation at 30 dB. Under these conditions, the proposed algorithm reduces the MSE for target position estimation by approximately 5 m² compared to the TSWLS algorithm, slightly outperforming the SDP algorithm, and reduces the MSE for target velocity estimation by approximately 7 (m/s)², achieving localization accuracy comparable to the SDP algorithm (Figure 4). While the SDP algorithm has higher computational complexity and longer runtime, the proposed algorithm achieves a shorter runtime while maintaining localization accuracy, demonstrating good real-time performance (Table 1).  Conclusions  This study investigates the localization of a moving target in a multistatic passive radar system when the transmitter’s position and velocity are unknown. By leveraging time delay and Doppler frequency shift measurements from both direct and indirect paths, a quadratic constraint model is formulated and iteratively solved using the Quasi-Newton method. Simulation results demonstrate that the proposed algorithm can achieve CRLB accuracy even under high-noise conditions. Compared with localization algorithms based on joint transmitter and target estimation using TSWLS and SDP, the proposed algorithm achieves lower computational complexity and enables three-dimensional moving target localization with only four receivers. The proposed method, designed for a single-transmitter multiple-receiver system, can be directly extended to multiple-transmitter multiple-receiver configurations. 
Additionally, this method assumes time synchronization between the transmitter and receivers. Future research will focus on extending multistatic passive radar localization to scenarios where the transmitter is not synchronized with the receivers.
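The DR and BR observables that feed the two CWLS stages reduce to simple geometry. A minimal numeric sketch (Python, with made-up coordinates) shows the quantities being measured:

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def direct_range(tx, rx):
    """DR: transmitter-to-receiver distance, measured via the time
    delay of the direct signal in the reference channel."""
    return dist(tx, rx)

def bistatic_range(tx, target, rx):
    """BR: transmitter-to-target plus target-to-receiver distance,
    measured via the bistatic time delay in the surveillance channel."""
    return dist(tx, target) + dist(target, rx)
```

For example, with tx = (0, 0, 0), rx = (10, 0, 0), and a target at (5, 5, 0), DR is 10 and BR is 2·√50 ≈ 14.14. The DRR and BRR are the time derivatives of these same ranges.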
An Efficient Lightweight Network for Intra-pulse Modulation Identification of Low Probability of Intercept Radar Signals
WANG Xudong, WU Jiaxin, CHEN Binbin
2025, 47(6): 1782-1791.   doi: 10.11999/JEIT240848
[Abstract](264) [FullText HTML](124) [PDF 3771KB](69)
Abstract:
  Objective  Low Probability of Intercept (LPI) radar enhances stealth, survivability, and operational efficiency by reducing the likelihood of detection, making it widely used in military applications. However, accurately analyzing the intra-pulse modulation characteristics of LPI radar signals remains a key challenge for radar countermeasure technologies. Traditional methods for identifying radar signal modulation suffer from poor noise resistance, limited applicability, and high misclassification rates. These limitations necessitate more robust approaches capable of handling LPI radar signals under low Signal-to-Noise Ratios (SNRs). This study proposes an advanced deep learning-based method for LPI radar signal recognition, integrating Hybrid Dilated Convolutions (HDC) and attention mechanisms to improve performance in low SNR environments.  Methods  This study proposes a deep learning-based framework for LPI radar signal modulation recognition. The training dataset comprises 12 types of LPI radar signals: BPSK, Costas, LFM, NLFM, four multi-phase codes, and four multi-time codes. To enhance model robustness, a comprehensive preprocessing pipeline is applied. Initially, raw signals undergo SPWVD and CWD time-frequency analysis to generate two-dimensional time-frequency feature maps. These maps are then processed through grayscale conversion, Wiener filtering for denoising, principal component extraction, and adaptive cropping. A dual time-frequency fusion method is subsequently applied, integrating SPWVD and CWD to enhance feature distinguishability (Fig. 2). Based on this preprocessed data, the model employs a modified GhostNet architecture, Dilated CBAM-GhostNet (DCGNet). This architecture integrates HDC and the Convolutional Block Attention Module (CBAM), optimizing efficiency while enhancing the extraction of spatial and channel-wise information (Fig. 7). 
HDC expands the receptive field, enabling the model to capture long-range dependencies, while CBAM improves feature selection by emphasizing the most relevant spatial and channel-wise features. The combination of HDC and CBAM strengthens feature extraction, improving recognition accuracy and overall model performance.  Results and Discussions  This study analyzes the effects of different preprocessing methods, network architectures, and computational complexities on LPI radar signal modulation recognition. The results demonstrate that the proposed framework significantly improves recognition accuracy, particularly under low SNR conditions. A comparison of four time-frequency analysis methods shows that SPWVD and CWD achieve higher recognition accuracy (Fig. 8). These datasets are then fused to evaluate the effectiveness of image enhancement techniques. Experimental results indicate that, compared to datasets without image enhancement, the fusion of SPWVD and CWD reduces signal confusion and improves feature discriminability, leading to better recognition performance (Fig. 9). Comparative experiments validate the contributions of HDC and CBAM to recognition performance (Fig. 10). The proposed architecture consistently outperforms three alternative network structures under low SNR conditions, demonstrating the effectiveness of HDC and CBAM in capturing spatial and channel-wise information. Further analysis of three attention mechanisms confirms that CBAM enhances feature extraction by focusing more effectively on relevant time-frequency regions (Fig. 11). To comprehensively evaluate the proposed network, its performance is compared with ResNet50, MobileNetV2, and MobileNetV3 using the SPWVD and CWD fusion-based dataset (Fig. 12). The results show that the proposed network outperforms the other three networks under low SNR conditions, confirming its superior recognition capability for low SNR radar signals. 
Finally, computational complexity and storage requirements are assessed using floating-point operations and parameter count (Table 2). The results indicate that the proposed network maintains relatively low computational complexity and parameter count, ensuring high efficiency and low computational cost. Overall, the proposed deep learning framework improves radar signal recognition performance while maintaining efficiency.  Conclusions  This study proposes a deep learning-based method for LPI radar signal modulation recognition using the DCGNet model, which integrates dilated convolutions and attention mechanisms. The framework incorporates an advanced image enhancement preprocessing pipeline, leveraging SPWVD and CWD time-frequency feature fusion to improve feature distinguishability and recognition accuracy, particularly under low SNR conditions. Experimental results confirm that DCGNet outperforms existing methods, demonstrating its practical potential for radar signal recognition. Future research will focus on optimizing the model further and extending its applicability to a wider range of radar signal types and scenarios.
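The receptive-field expansion that HDC provides can be checked with the standard formula for stacked stride-1 dilated convolutions, RF = 1 + Σ (k − 1)·d. The 3-tap kernels and the 1-2-5 dilation pattern below are illustrative, not the paper's exact configuration:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions:
    each layer with kernel size k and dilation d adds (k - 1) * d
    samples of context on top of the single starting sample."""
    return 1 + sum((k - 1) * d for k, d in zip(kernel_sizes, dilations))
```

Three 3-tap layers with dilations 1, 2, 5 cover 17 samples, versus 7 for the same stack without dilation, which is why HDC captures long-range time-frequency structure at no extra parameter cost.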
Study on Satellite Signal Recognition with Multi-scale Feature Attention Network
LI Yun, YANG Songlin, XING Zhitong, WU Guangfu, MA Hao
2025, 47(6): 1792-1802.   doi: 10.11999/JEIT250126
[Abstract](154) [FullText HTML](58) [PDF 1986KB](45)
Abstract:
  Objective  Automatic modulation recognition of satellite communication signals is essential for communication security, signal monitoring, and efficient spectrum management. Traditional methods face limitations in handling non-stationary signals, require substantial prior knowledge, and often incur high computational costs. To address these issues, this study proposes an Enhanced Multi-Scale Feature Attention Network (EMSF) for satellite signal recognition. EMSF is designed to deliver high recognition accuracy, robustness under noisy conditions, and computational efficiency, making it suitable for deployment on resource-constrained platforms. This model contributes to satellite communication, signal processing, and deep learning by improving the reliability and efficiency of automatic signal recognition.  Methods  The EMSF integrates four key components to effectively capture and classify satellite signals: Data Augmentation (DA), a denoising convolution module, a multi-scale global perception module, and an Efficient Channel Attention (ECA) mechanism. DA expands the training dataset via rotational transformations to improve generalization and robustness. A deep residual network with soft thresholding selectively suppresses noise while preserving key signal features, enhancing performance under low Signal-to-Noise Ratio (SNR) conditions. The multi-scale global perception module combines dilated convolutions with Spatial Pyramid Pooling (SPP) to extract both global and local contextual information across frequency and time scales, enabling the model to detect subtle signal variations. The ECA module learns channel-wise dependencies to emphasize informative features and suppress irrelevant ones, improving classification accuracy. The model is trained using the Adam optimizer with an adaptive learning rate and a cross-entropy loss function. Custom callbacks monitor validation loss and dynamically adjust the learning rate during training.  
Results and Discussions  Extensive experiments are conducted using simulated satellite signals across various modulation types and SNR levels. The EMSF model consistently outperforms state-of-the-art models—including MCLDNN, MCNet, CGDNet, IC-AMCNet, PET-CGDNN, and ResNet—in terms of classification accuracy, parameter efficiency, and computational cost. Model accuracy improves with increasing SNR, maintaining strong performance even under low SNR conditions (Fig. 3). Notably, EMSF achieves nearly 90% accuracy for QAM16 and QAM64 at 0 dB SNR, demonstrating its ability to detect subtle signal variations. Compared with other models, EMSF achieves higher accuracy using significantly fewer parameters and shorter training time (Table 1; Fig. 5). Ablation experiments further verify the contribution of each component, with the denoising convolution module, SPP layer, and data augmentation strategy each yielding measurable performance gains (Table 2; Fig. 6).  Conclusions  The proposed EMSF demonstrates high accuracy, robustness under noisy conditions, and computational efficiency in satellite signal recognition. Its suitability for deployment in resource-constrained devices highlights its practical applicability. The EMSF model contributes to the advancement of satellite communication and offers a foundation for further research in signal processing and deep learning.
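The soft thresholding behind the denoising module (as used in deep residual shrinkage networks) has a simple closed form: small-magnitude values are treated as noise and zeroed, larger values are shrunk toward zero. A sketch with a hand-picked threshold (in the network the threshold is learned per channel):

```python
def soft_threshold(x, tau):
    """Soft-thresholding shrinkage operator."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def denoise(signal, tau):
    """Apply the shrinkage element-wise to a feature sequence."""
    return [soft_threshold(v, tau) for v in signal]
```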
Research on Power Allocation Method for Networked Radar Based on Extended Game Theory
YE Fang, QI Changlong, SUN Liuqing, LI Yibing
2025, 47(6): 1803-1815.   doi: 10.11999/JEIT241131
[Abstract](88) [FullText HTML](53) [PDF 2455KB](10)
Abstract:
  Objective  As jamming technology grows increasingly sophisticated, networked radar systems in penetration countermeasure scenarios often operate under partial information, which markedly reduces detection performance. Strategic power allocation can improve spatial and frequency diversity, thereby enhancing target detection. However, most existing methods optimize radar resource distribution in isolation, without accounting for the dynamic interactions between radars and jammers. To address this limitation, this paper proposes a power allocation method for networked radar based on extensive-form game theory. The allocation problem is modeled under partial observability, and the Deep CounterFactual Regret minimization (Deep CFR) algorithm is employed to solve it. This approach increases the probability of successfully detecting penetration targets in adversarial environments.  Methods  A power allocation model for networked radar is developed in parallel with an information-loss model that captures the adversarial dynamics between networked radar and jammer swarms. Drawing on the principles of extensive-form games, the fundamental elements are defined and used to construct an extensive-form game model for radar power allocation. In this framework, networked radar aggregates observable information to mitigate the effects of unobservable jammer signals. To solve the game, the Deep CFR algorithm is employed, integrating deep learning with regret minimization to approximate Nash equilibrium strategies. This approach addresses the storage and computational challenges associated with traditional extensive-form game solutions. Simulation results confirm that the proposed method allocates radar power effectively under partial observation, improving the probability of target detection.  
Results and Discussions  Simulation results show that under partial observation conditions, the proposed method achieves a detection probability of 0.813, exceeding the performance of random strategies, Deep Deterministic Policy Gradient (DDPG), and Double Deep Q-Network (Double DQN). While ensuring stable convergence, the method also reduces training time by 27.8% and 31.5% compared with DDPG and Double DQN, respectively. Sensitivity analysis indicates that detection performance declines with an increasing number of jammers due to stronger interference. Additionally, variations in the number of missing information elements (M) demonstrate that overall radar performance depends on both the extent of information loss and the intensity of coordinated jamming. When jamming degradation outweighs the benefits of reduced information loss, the detection probability decreases accordingly.  Conclusions  In modern electronic warfare, where jammers employ complex and adaptive interference strategies and networked radar systems operate with partial adversary information, this study proposes an effective approach to power resource management. By modeling the dynamic interaction between radar systems and jammer swarms through extensive-form game theory and applying the Deep CFR algorithm, simulation results demonstrate the following: (1) The near-Nash equilibrium strategy aligns with the optimal allocation obtained using the Sparrow Search Algorithm, confirming its validity; (2) The proposed method achieves higher detection probability (0.813) than random strategies, DDPG, and Double DQN; and (3) It reduces training time significantly compared with DDPG and Double DQN. Future work will extend this approach to other resource management dimensions, including waveform selection and beam dwell time optimization.
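At the core of Deep CFR is regret matching, which turns the cumulative counterfactual regrets at an information set into a mixed strategy. A minimal sketch of that update (the neural-network approximation of regrets that gives Deep CFR its scalability is omitted):

```python
def regret_matching(cum_regrets):
    """Play each action with probability proportional to its positive
    cumulative regret; fall back to uniform play when no action has
    positive regret."""
    positives = [max(r, 0.0) for r in cum_regrets]
    total = sum(positives)
    if total <= 0.0:
        return [1.0 / len(cum_regrets)] * len(cum_regrets)
    return [p / total for p in positives]
```

Iterating this update over self-play traversals drives the average strategy toward a Nash equilibrium of the extensive-form power-allocation game.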
Channel Doppler Information-based Sparse Representation Model and Target Detection Method in Passive Radar
ZHAO Zhixin, LIN Yingyun, ZHENG Yiqun, ZHOU Huilin
2025, 47(6): 1816-1825.   doi: 10.11999/JEIT250076
[Abstract](92) [FullText HTML](33) [PDF 2904KB](29)
Abstract:
  Objective  In passive radar systems based on Orthogonal Frequency Division Multiplexing (OFDM) waveforms, conventional target detection applies clutter suppression followed by parameter estimation using a Range-Doppler (RD) map derived from the mutual ambiguity function between surveillance and reference channel signals. However, this method yields low parameter resolution. Recent advances in sparse representation theory—applied to time-domain or subcarrier-domain data—have enabled higher-resolution target detection in OFDM-based passive radar. Despite this progress, several challenges remain. First, constructing a high-resolution sparse dictionary requires longer-coherence reference signal samples, which significantly increases dictionary dimensionality and computational cost in sparse reconstruction. Second, weak target echoes are often masked by clutter, such as direct-path signals and strong multipath components, which are typically not considered in current models. Therefore, reconstruction performance becomes unstable under low Signal-to-Noise Ratio (SNR) conditions.  Methods  This study proposes a novel sparse representation model for OFDM waveform passive radar that achieves clutter suppression and reduced dictionary dimensionality. The dictionary can be generated offline and facilitates target detection using channel Doppler information. Based on this model, RD maps are constructed through a single sparse optimization process, reducing the number of iterations required for sparse reconstruction. The method first estimates the frequency-domain channel response of the detection scene by modeling the surveillance channel signal in both the time domain and the effective subcarrier domain. Given that direct-path and multipath clutter typically exhibit zero Doppler frequency shift—unlike target echoes—clutter suppression is achieved by subtracting the average channel response from the observed channel response. 
Channel Doppler analysis is then applied to obtain a sparse representation model based on the clutter-suppressed channel Doppler information. Finally, target detection is performed by introducing sparse constraints and executing sparse reconstruction.  Results and Discussions  Both simulation and experimental results are demonstrated to evaluate the target detection performance of the proposed method in comparison with time-domain and effective subcarrier-domain sparse models. Simulation results indicate that the proposed sparse model enables detection of targets at lower Signal-to-Noise Ratios (SNRs) than the other two models. Quantitative analysis shows that the Peak SideLobe Ratio (PSLR) and Integrated SideLobe Ratio (ISLR) achieved by the proposed method are approximately 1 dB and 1.5 dB lower, respectively, than those obtained using the time-domain and subcarrier-domain approaches. Furthermore, the computational complexity of the proposed method is significantly reduced—by 98.4% and 97.6% compared to the time-domain and subcarrier-domain models, respectively. This efficiency is attributed to the ability to generate the sparse dictionary matrix once offline, enhancing suitability for real-time applications. The experimental results further validate the superior target detection performance of the proposed method.  Conclusions  To address the challenges of high computational complexity in sparse reconstruction and the masking of weak targets by strong clutter, this study proposes a sparse representation model based on channel Doppler information, leveraging the signal characteristics of OFDM-based passive radar. Sparse constraints are incorporated into the model to enable effective target detection via sparse reconstruction. The dictionary matrix can be generated offline, which substantially reduces its dimensionality. 
This approach not only lowers the computational cost associated with high-resolution processing and extended integration times but also alleviates the masking effect of strong clutter on weak targets. Simulation results demonstrate that the proposed method achieves reliable detection of weak targets in multi-target scenarios while significantly reducing computational complexity. Performance is quantitatively evaluated using PSLR and ISLR, both of which are lower than those of existing time-domain and subcarrier-domain methods. In addition, experimental results using real data in complex clutter environments confirm the practical effectiveness of the proposed approach.
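The clutter-suppression step in this model, subtracting the average channel response across slow-time blocks to remove zero-Doppler components before Doppler analysis, can be illustrated with a minimal sketch. All names, array sizes, and the plain-DFT Doppler step below are illustrative assumptions, not the paper's implementation:

```python
import cmath

def suppress_clutter(h):
    """Remove zero-Doppler clutter by subtracting the per-range-bin mean
    of the channel response across slow-time blocks; h[m][l] is the
    complex response of block m at range bin l."""
    M, L = len(h), len(h[0])
    mean = [sum(h[m][l] for m in range(M)) / M for l in range(L)]
    return [[h[m][l] - mean[l] for l in range(L)] for m in range(M)]

def doppler_spectrum(h, l):
    """Plain DFT across slow time at range bin l, returning the magnitude
    of each Doppler bin (a stand-in for the channel Doppler analysis)."""
    M = len(h)
    col = [h[m][l] for m in range(M)]
    return [abs(sum(col[m] * cmath.exp(-2j * cmath.pi * k * m / M)
                    for m in range(M))) for k in range(M)]
```

In this toy setting, a constant (zero-Doppler) clutter return vanishes after the mean subtraction, while a target with a nonzero Doppler shift is unaffected and peaks at its Doppler bin.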
Modeling and Simulation Analysis of Inter-satellites Pseudo-code Ranging for Space Gravitational Wave Detection
SUN Chenying, YAO Weilai, LIANG Xindong, JIA Jianjun
2025, 47(6): 1826-1836.   doi: 10.11999/JEIT250121
[Abstract](172) [FullText HTML](71) [PDF 4282KB](33)
Abstract:
  Objective  Inter-satellite laser interferometry for space gravitational wave detection is constrained by orbital dynamics and other perturbations, which cause continuous variations in inter-satellite distances. Therefore, laser frequency noise becomes the dominant noise source in the inter-satellite interferometry system. To suppress this noise, the Time Delay Interferometry (TDI) algorithm is applied during data post-processing, where a virtual equal-arm interferometer is synthesized by shifting and combining data streams. However, accurate TDI combinations depend on precise knowledge of absolute inter-satellite distances at the picometer level. Any deviation in these measurements may propagate into errors in the final processed data. To address this issue, an inter-satellite ranging scheme based on Pseudo-Random Noise (PRN) is proposed. This method enables both inter-satellite ranging and data communication, providing theoretical support for autonomous satellite navigation as well as inter-satellite ranging and communication in space-based gravitational wave missions.  Methods  To reduce power consumption and spacecraft mass, the inter-satellite ranging task is implemented using existing laser links for scientific measurement. Only a small fraction of the available power is allocated to the ranging subsystem to avoid degrading the phase stability of science measurements. A low-depth Binary Phase-Shift Keying (BPSK) modulation scheme based on PRN is proposed to enable laser ranging and data communication as auxiliary functions of the high-precision inter-satellite interferometry system. The ranging system architecture incorporates a Digital Phase-Locked Loop (DPLL) for carrier synchronization and a Delay-Locked Loop (DLL) for PRN code synchronization. 
Theoretical limitations of ranging accuracy are systematically analyzed, including contributions from shot noise, integration time, inter-code interference, optical data bit encoding, and the impulse response of the DPLL. These analyses guide improvements in both the DPLL and DLL designs. A Direct Digital Synthesizer (DDS) is used to generate the heterodyne signal. Simulation verification of unidirectional ranging, bidirectional ranging, and inter-satellite data communication is performed on a Field Programmable Gate Array (FPGA) platform.  Results and Discussions  The simulation results (Table 2, Table 3) demonstrate that the optimization methods proposed in (Fig. 9, Fig. 11) effectively reduce the effects of data encoding and inter-code interference on the ranging accuracy of the delay-locked tracking loop, respectively. As shown in (Table 4), in a single delay-locked tracking loop, the dominant factor limiting ranging accuracy is data bit encoding for optical communication when the local PRN code is absent; otherwise, shot noise becomes the primary source of error. (Fig. 16) illustrates the distortion of the PRN code caused by the phasemeter pulse response, and shows that Manchester encoding significantly mitigates this distortion. The final simulation results after applying all optimization techniques are summarized in (Table 5). With a modulation depth of approximately 0.4 rad, corresponding to an equivalent optical power of less than 4%, the Root Mean Square (RMS) errors for both unidirectional and bidirectional ranging are approximately 3 cm at a measurement rate of 3 Hz with an 80 MHz sampling frequency. For unidirectional ranging with data streams encoded at 19 kbps and 39 kbps, the corresponding RMS ranging errors are approximately 5 cm and 20 cm, respectively. Bidirectional ranging supports data transmission only at 19 kbps, yielding an RMS error of approximately 6 cm. 
When the phase modulation depth is reduced to 0.2 rad (corresponding to an equivalent optical power below 1%), the RMS ranging error is approximately 6 cm; if 19 kbps data are transmitted simultaneously, the RMS error increases to approximately 12 cm. These simulation results confirm that sub-meter absolute distance resolution is achievable under all tested conditions.  Conclusions  Based on the Taiji plan, an absolute distance measurement scheme utilizing low-depth phase modulation of PRN codes is proposed. A receiver model based on a DPLL and a DLL is established. The limiting factors affecting inter-satellite ranging accuracy are analyzed, leading to improvements in the ranging model. The simulation results, following comprehensive optimizations, show that the primary limiting factors of ranging accuracy are unavoidable shot noise and the encoding of data bits for optical communication. At a clock sampling rate of 80 MHz, with a PRN code phase modulation depth of 0.4 rad, the bidirectional ranging RMS error is approximately 6 cm when communication data is encoded at 19 kbps. When the modulation depth is reduced to 0.2 rad, the RMS error increases to approximately 12 cm while transmitting 19 kbps data concurrently. These simulation results demonstrate a clear improvement over meter-level accuracy, and the ranging model offers valuable insights for space gravitational wave detection and satellite autonomous navigation. Given the complexity of clock synchronization, it is assumed in this study that the clocks of the transmitter and receiver are fully synchronized. Further research will address clock synchronization issues, and electrical and optical experiments will be conducted to assess the performance of the proposed architecture in future work.
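The core of PRN-based ranging, finding the code phase at which the received sequence best correlates with a local replica (the lock point a DLL then tracks), can be sketched as follows. This is a simplified illustration: the tap choice and brute-force correlation search are assumptions, and a real DLL uses early/late correlators rather than an exhaustive search:

```python
def prn_sequence(n=5, taps=(5, 3)):
    """PRN chips in {+1, -1} from a Fibonacci LFSR; feedback from stages
    5 and 3 is a standard maximal-length configuration (period 2^n - 1)."""
    state = [1] * n
    seq = []
    for _ in range(2 ** n - 1):
        out = state[-1]
        fb = 0
        for t in taps:
            fb ^= state[t - 1]
        state = [fb] + state[:-1]
        seq.append(1 - 2 * out)
    return seq

def estimate_delay(rx, code):
    """Circular correlation against the local replica: the lag that
    maximizes the correlation is the code-phase (delay) estimate."""
    N = len(code)
    best_lag, best_val = 0, float("-inf")
    for lag in range(N):
        val = sum(rx[(i + lag) % N] * code[i] for i in range(N))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag
```

The recovered lag converts to distance via range = lag × c / chip_rate; the centimeter-level accuracy reported in the paper additionally depends on the DPLL, modulation depth, and encoding optimizations it analyzes.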
Siamese Network-assisted Multi-domain Feature Fusion for Radar Active Jamming Recognition Method
LI Ning, WANG Zan, SHU Gaofeng, ZHANG Tingwei, GUO Zhengwei
2025, 47(6): 1837-1849.   doi: 10.11999/JEIT240797
[Abstract](232) [FullText HTML](96) [PDF 6448KB](58)
Abstract:
  Objective  The rapid development of electronic warfare technology has introduced complex scenarios in which active jamming presents considerable challenges to radar systems. On modern battlefields, the electromagnetic environment is highly congested, and various forms of active jamming signals frequently disrupt radar functionality. Although existing recognition algorithms can identify certain types of radar active jamming, their performance declines under low Jamming-to-Noise Ratio (JNR) conditions or when training data are scarce. Low JNR reduces the detectability of jamming signals by conventional methods, and limited sample size further constrains recognition accuracy. To address these challenges, neural network-based methods have emerged as viable alternatives. This study proposes a radar active jamming recognition approach based on multi-domain feature fusion assisted by a Siamese network, which enhances recognition capability under low JNR and small-sample conditions. The proposed method offers an intelligent framework for improving jamming recognition in complex environments and provides theoretical support for battlefield awareness and the design of effective counter-jamming strategies.  Methods  The proposed method comprises a multi-domain feature fusion subnetwork, a Siamese architecture, and a joint loss design. To extract jamming features effectively under low JNR conditions, a multi-domain feature fusion subnetwork is developed. Specifically, a semi-soft thresholding shrinkage module is proposed by integrating a semi-soft threshold function with an attention mechanism. This module efficiently extracts time-domain features and eliminates the limitations of manual threshold selection. To enhance the extraction of time-frequency domain features, a multi-scale convolution module and an additional attention mechanism are incorporated. To reduce the model’s dependence on large training datasets, a weight-sharing Siamese network is constructed. 
By comparing similarity between sample pairs, this network increases the number of training iterations, thereby mitigating the limitations imposed by small sample sizes. Finally, three loss functions are jointly applied: an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss. This joint strategy promotes intra-class compactness and inter-class separability of jamming features.  Results and Discussions  When the number of training samples is limited (Table 6), the proposed method achieves an accuracy of 96.88% at a JNR of –6 dB with only 20 training samples, indicating its effectiveness under data-scarce conditions. With further reduction in sample size—specifically, when only 15 training samples are available per jamming type—the recognition performance of other methods declines substantially. In contrast, the proposed method maintains higher recognition accuracy, demonstrating enhanced stability and robustness under low JNR and limited sample conditions. This performance advantage is attributable to three key factors: (1) Multi-domain feature fusion integrates jamming features from multiple domains, preventing the loss of discriminative information commonly observed under low JNR conditions. (2) The weight-sharing Siamese network increases the number of effective training iterations by evaluating sample similarities, thereby mitigating the limitations associated with small datasets. (3) The combined use of an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss promotes intra-class compactness and inter-class separability of jamming features, enhancing the model’s generalization capability.  Conclusions  This study proposes a radar active jamming recognition method that performs effectively under low JNR and limited training sample conditions. 
A multi-domain feature fusion subnetwork is developed to extract representative features from both the time and time-frequency domains, enabling a more comprehensive and discriminative characterization of jamming signals. A weight-sharing Siamese network is then introduced to reduce reliance on large training datasets by leveraging sample similarity comparisons to expand training iterations. In addition, three loss functions—an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss—are jointly applied to promote intra-class compactness and inter-class separability. Experimental results validate the effectiveness of the proposed method. At a low JNR of –6 dB with only 20 training samples, the method achieves a recognition accuracy of 96.88%, demonstrating its robustness and adaptability in challenging electromagnetic environments. These findings provide technical support for the development of anti-jamming strategies and enhance the operational reliability of radar systems in complex battlefield scenarios.
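The semi-soft thresholding idea, a shrinkage rule between hard and soft thresholding that the proposed module pairs with an attention mechanism to learn the thresholds, can be sketched with one common ("firm") definition; the exact functional form and threshold selection in the paper may differ:

```python
def semi_soft_threshold(x, t1, t2):
    """Semi-soft (firm) shrinkage: zero below t1, identity above t2,
    linear interpolation in between (t1 < t2). Hard thresholding is the
    t1 -> t2 limit; soft thresholding is the t2 -> infinity limit."""
    ax = abs(x)
    if ax <= t1:
        return 0.0
    if ax >= t2:
        return float(x)
    sign = 1.0 if x > 0 else -1.0
    return sign * t2 * (ax - t1) / (t2 - t1)
```

Unlike soft thresholding, large coefficients pass through unbiased, while small (noise-dominated) ones are still zeroed; the paper's module learns t1 and t2 rather than setting them manually.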
Radar, Navigation and Array Signal Processing
Millimeter-wave Radar Point Cloud Gait Recognition Method Under Open-set Conditions Based on Similarity Prediction and Automatic Threshold Estimation
DU Lan, LI Yiming, XUE Shikun, SHI Yu, CHEN Jian, LI Zhenfang
2025, 47(6): 1850-1863.   doi: 10.11999/JEIT241034
[Abstract](200) [FullText HTML](87) [PDF 3652KB](56)
Abstract:
  Objective  Radar-based gait recognition systems are typically developed under closed-set assumptions, limiting their applicability in real-world scenarios where unknown individuals frequently occur. This constraint presents challenges in security-critical settings such as surveillance and access control, where both accurate recognition of known individuals and reliable exclusion of unknown identities are essential. Existing methods often lack effective mechanisms to differentiate between known and unknown classes, leading to elevated false acceptance rates and security risks. To overcome this limitation, this study proposes an open-set recognition framework that integrates a similarity prediction network with an adaptive thresholding method based on Extreme Value Theory (EVT). The framework models the score distributions of known and unknown classes to enable robust identification of unfamiliar identities without requiring samples from unknown classes during training. The proposed method enhances the robustness and applicability of millimeter-wave radar-based gait recognition under open-set conditions, supporting its deployment in operational environments.  Methods  The proposed method comprises four key modules: point cloud feature extraction network training, similarity prediction network training, automatic threshold estimation, and open-set testing. A sequential training strategy is adopted to ensure robust learning. First, the point cloud feature extraction network is trained with a triplet loss function that encourages intra-class compactness and inter-class separability by pulling same-class samples closer and pushing different-class samples apart. This enables the network to learn stable and discriminative representations, even under variations in viewpoint or clothing. The extracted features are then input into a similarity prediction network trained to model the score distributions of known and unknown identities. 
By incorporating score-based constraints, the network learns a decision space in which known and unknown classes are more effectively separated. Following network optimization, an EVT-based thresholding module is employed. This module dynamically models the tail distributions of similarity scores and automatically determines a class-agnostic threshold by minimizing the joint false acceptance and false rejection rates. This adaptive and theoretically grounded strategy enhances the separation between known and unknown classes in the similarity space. Together, these modules improve the stability and accuracy of radar-based gait recognition under open-set conditions, supporting reliable operation in real-world scenarios where unfamiliar individuals may appear.  Results and Discussions  The proposed method improves distributional separation between known and unknown classes in the similarity score space through the similarity prediction network and distinguishes them effectively using adaptive thresholding. Experimental results show that the method consistently yields higher F1 scores across all openness levels compared with baseline approaches, indicating strong robustness to open-set variations (Table 1). Specifically, the method achieves an 87% recognition rate for known classes and a 96% rejection rate for unknown classes, outperforming all comparison methods (Fig. 7). Ablation experiments confirm that incorporating the similarity prediction module enhances recognition performance under high openness. Manually set thresholds, while effective under low openness, show substantial performance degradation under large openness (F1 score: 43.93%). In contrast, the proposed automatic thresholding module demonstrates superior generalization, improving the F1 score by 22.88% under large openness conditions (Table 2). 
Further analysis shows that the method significantly increases the score distribution gap between known and unknown classes, contributing to improved recognition reliability (Fig. 8). Comparative evaluations (Table 3) confirm that the method achieves superior open-set recognition performance. In addition, the employed point cloud feature extraction network captures temporal features at multiple time scales and uses an attention-based mechanism to adaptively aggregate information across frames and temporal resolutions. This contributes to more robust gait representations and further improves open-set recognition performance compared with other feature extraction networks (Table 4).  Conclusions  Building on previous work on robust feature extraction under complex covariate conditions, this study extends millimeter-wave radar point cloud gait recognition to open-set scenarios. The proposed method preserves the recognition strength of the original feature extraction network and enhances class discriminability by integrating a similarity prediction network. To address the limitations of manually defined rejection thresholds, an automatic threshold determination module based on EVT is introduced. Extensive experiments using measured millimeter-wave radar point cloud gait data confirm that the method reliably distinguishes between known and unknown individuals, demonstrating its effectiveness and robustness under open-set conditions.
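The threshold-selection criterion, minimizing the joint false acceptance and false rejection rates over similarity scores, can be sketched directly on empirical scores. The paper additionally fits EVT models to the score tails; this simplified empirical search is an assumption, not the published module:

```python
def auto_threshold(known_scores, unknown_scores):
    """Pick the similarity threshold minimizing the joint false rejection
    rate (known samples scored below t) plus false acceptance rate
    (unknown samples scored at or above t), searching observed scores."""
    candidates = sorted(set(known_scores) | set(unknown_scores))
    best_t, best_err = None, float("inf")
    for t in candidates:
        frr = sum(s < t for s in known_scores) / len(known_scores)
        far = sum(s >= t for s in unknown_scores) / len(unknown_scores)
        if frr + far < best_err:
            best_t, best_err = t, frr + far
    return best_t
```

The EVT fit in the paper serves the same purpose with better generalization: modeling the score tails lets the threshold extrapolate beyond the finite calibration scores seen here.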
Wireless Communication and Internet of Things
Codebook Attack and Camouflage Solution in Intelligent Reflective Surface-aided Wireless Communications
LI Runyu, PENG Wei, ZHOU JianLong
2025, 47(6): 1864-1872.   doi: 10.11999/JEIT240991
[Abstract](161) [FullText HTML](56) [PDF 3335KB](21)
Abstract:
  Objective  Intelligent Reflective Surface (IRS) technology has demonstrated significant potential in enhancing Physical Layer Security (PLS). While the use of IRS to support PLS has been extensively studied, there is limited research addressing the security challenges inherent to the IRS system itself. In particular, when facing an attacker, obtaining the real-time codebook is crucial for mastering the entire IRS cascaded channel. The IRS controller, an IoT device with limited computational resources and security assurances, stores the real-time codebook and serves as the system's Achilles' heel. This paper proposes a new type of attack, the Controller Manipulation Attack (CMA). The CMA can be executed by an attacker who either compromises the IRS controller or infects it with malware, allowing for the malicious manipulation of phase shifts, which can degrade the rate of legitimate communication. Additionally, an attacker can retrieve the codebook information by exploiting the vulnerabilities of the IRS controller. Due to hardware constraints, the controller is a vulnerable, zero-trust device, making it easier for attackers to gain access to the codebook. With knowledge of the IRS geometric structure, operating frequency, codebook, and the location of the Base Station (BS), an attacker can infer the direction of the main lobe beam, thereby enabling more efficient passive eavesdropping. This passive eavesdropping represents a serious threat, especially in high-frequency scenarios with narrow beams, and is more covert than traditional pilot contamination attacks.  Methods  To address the codebook attack, a lightweight camouflage method is proposed at the physical layer. In this approach, the IRS phase shifts—termed the camouflage codebook—comprise both the real codebook and a fabricated one designed to deceive potential attackers. A subset of IRS elements is configured to produce ostensible phase shifts corresponding to the fake codebook. 
These elements do not radiate energy, serving solely to mislead attackers. Therefore, even if an attacker compromises the IRS controller and accesses the codebook, the retrieved information remains ineffective. To quantify the level of security provided, the Codebook-Secrecy-Rate (CSR) is defined as the difference in data rates between the real and camouflage codebooks. The optimization of discrete phase shifts for the IRS is formulated as an inner product maximization problem. Leveraging the structural properties of this formulation, a Divide-and-Sort (DaS) algorithm is proposed. This algorithm achieves global optimality with a computational complexity of $O(2^B N)$. Based on the DaS solution, the CSR is maximized in the following steps: the optimal codebook for signal enhancement is first derived; subsequently, a subset of IRS elements is phase-shifted by π to act as inactive units providing destructive interference. Finally, a Tabu Search (TS) algorithm is employed to determine the optimal topology of the codebook configuration.  Results and Discussions  Simulation results confirm the performance of the proposed solution. Experiments are conducted across four IRS configurations. When the number of IRS elements exceeds 1,000 and each unit operates with 1-bit phase resolution, the average CSR reaches approximately 15–20 bit/(s·Hz), as shown in Fig. 5. Monte Carlo simulations evaluate the relationship between the number of active elements $N_{\mathrm{T}}$ and $N$. A linear correlation is observed, as depicted in Fig. 6. The CSR reaches its maximum when approximately half of the IRS units are active. In practical IRS-assisted communication systems, selecting the number of active units within the interval $[N/2, N]$ offers a trade-off between signal enhancement and security. 
When the size of the real codebook approaches that of the fake codebook, the constructive gain from the real codebook is largely neutralized by the interference from the fake codebook. This configuration corresponds to the maximum achievable CSR for the system.  Conclusions  This study considers the codebook attack in IRS-aided communication systems and proposes a physical-layer camouflage codebook solution. Owing to the limited computational capacity of the IRS controller, which restricts the implementation of conventional security protocols, the controller remains vulnerable to compromise. An attacker with access to the IRS geometric structure, operating frequency, codebook, and BS location can infer the main lobe beam direction, facilitating efficient passive eavesdropping. In the proposed method, IRS elements are divided into two groups: one group operates normally to enhance legitimate signals, while the other is configured to generate deceptive phase shifts without energy radiation. This arrangement produces a camouflage codebook. Even if attackers gain control of the IRS controller, the obtained codebook includes phase information associated with inactive elements, resulting in a misleading beamforming pattern. To quantify the security level, the CSR is introduced. The optimization of the camouflage codebook is formulated as an inner product maximization problem. A DaS algorithm is used to derive the optimal codebook for signal enhancement, followed by TS to determine the phase shift topology that maximizes CSR. Simulation results support the effectiveness of the proposed approach.
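The inner-product-maximization view of discrete phase optimization can be illustrated with its per-element core: each cascaded-channel coefficient is rotated as close to a common reference phase as a B-bit phase set allows. This sketch omits DaS's sorting over candidate global rotations and the TS topology search; the names and the 1-bit demo are assumptions:

```python
import cmath
import math

def quantize_phases(channel, B=1):
    """For each cascaded-channel coefficient, choose the B-bit discrete
    phase shift maximizing the real part of the rotated contribution,
    i.e., aligning it as closely as possible with the reference phase."""
    levels = [2 * math.pi * k / 2 ** B for k in range(2 ** B)]
    phases = []
    for c in channel:
        best = max(levels, key=lambda p: (cmath.exp(1j * p) * c).real)
        phases.append(best)
    return phases
```

With 1-bit resolution the choice per element reduces to "flip by π or not"; DaS reaches the global optimum by also sorting over the reference phase, which this per-element inner loop takes as fixed.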
Identification of Non-Line-Of-Sight Signals Based on Direct Path Signal Residual and Support Vector Data Description
NI Xue, ZENG HaiYu, YANG Wendong
2025, 47(6): 1873-1884.   doi: 10.11999/JEIT240960
[Abstract](98) [FullText HTML](46) [PDF 2593KB](23)
Abstract:
  Objective  Current machine learning-based methods for Non-Line-Of-Sight (NLOS) signal recognition either require the collection of a large amount of data from two different types of signals for various scenarios, or the trained models fail to generalize across different environments. These methods also do not simultaneously address the practical challenges of low training sample acquisition cost and good scene adaptation. This paper proposes a new NLOS recognition method that collects single-class signals from a single environment to train recognition models, which then demonstrate high accuracy when recognizing signals in different scenarios. This approach offers the advantages of low sample acquisition cost and strong environmental adaptability.  Methods  This paper proposes Direct Path (DP) signal residual feature parameters that exhibit significant differences between two types of signals. The effectiveness of these parameters is theoretically analyzed and combined with nine feature parameters identified in typical literature, forming various feature vectors to characterize the signals. This approach effectively enhances the accuracy of the recognition model. A class of signals with high feature similarity across different scenarios is used as training data, and a single recognition model is employed as the machine learning algorithm. The model is trained on signal samples collected in typical Line-Of-Sight (LOS) channels to improve its scene adaptability. Based on the principles of Deep Support Vector Data Description (DSVDD), a reverse-expanded DSVDD model is designed for NLOS signal recognition, further improving the model’s accuracy in recognizing samples across different scenarios.  Results and Discussions  As shown in Table 2, in the signal recognition scenario where the test set and training set originate from the same scene, the Least Squares Support Vector Machine (LSSVM) model demonstrates the best recognition performance. 
This is achieved using hyperplanes trained with two types of signals, resulting in a recognition accuracy of over 95%. In comparison, the standard Support Vector Data Description (SVDD) model, which is trained using only single-class LOS signal samples, exhibits a performance loss relative to LSSVM, with a maximum accuracy decrease exceeding 5%. The recognition accuracy of the SVDD model trained with DP signal residual features improves compared to the standard SVDD model, with the highest accuracy difference remaining within 5% of the LSSVM model. Furthermore, the performance of the DSVDD model, trained with DP signal residuals, shows a further improvement, with the largest accuracy decrease reduced to less than 2% relative to the LSSVM model. In scenarios where the training set and test data come from different scenes, LSSVM requires two types of signals for training. However, the hyperplane trained with two types of signal samples from a single scene exhibits poor performance when recognizing signal samples from other scenarios, with a maximum accuracy below 75%. The SVDD model trained with DP signal residual eigenvalues incorporates features with significant differences between the two signal types, improving recognition accuracy to over 80%. Finally, the DSVDD model, trained with DP signal residual features and replacing the Gaussian kernel function in the SVDD model with a neural network, further enhances recognition accuracy, achieving a maximum accuracy exceeding 85%.  Conclusions  A recognition method based on DP signal residual feature training for DSVDD is proposed to address the challenges of low sample acquisition cost and strong environmental adaptability in typical NLOS signal recognition. 
Compared with the SVDD method, this approach improves upon feature parameters, models, and model structures by introducing features with significant differences between the two types of signals, resulting in a substantial improvement in recognition performance. Additionally, the paper designs a reverse dimensionality expansion for DSVDD and incorporates it into NLOS signal recognition, further enhancing the accuracy of the recognition model across different scene samples. Compared to other typical machine learning algorithms, the proposed method requires the collection of single-class signal data from a single scene and performs effectively in recognizing signal samples from other scenes. Although the proposed method outperforms typical single-recognition approaches, the overall performance still has room for improvement. The theoretical analysis regarding how neural networks can better explore potential relationships between features is insufficient, and the full potential of neural networks in single-recognition models has not been fully realized. Furthermore, due to time constraints, this study only simulated sample data collected from three scenarios, and the recognition performance in other typical scenarios requires further validation.
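The accept/reject geometry of SVDD-style one-class recognition, training only on single-class LOS samples and flagging points outside the learned region as NLOS, can be sketched with a centroid-plus-radius baseline. Real (D)SVDD optimizes the center, radius, and (for DSVDD) a neural feature mapping jointly; this simplified stand-in is an assumption:

```python
def fit_svdd_sphere(train_feats, quantile=0.95):
    """One-class baseline on single-class training features: center is
    the feature mean, radius is the given quantile of training distances
    (tolerating a few outliers, as the SVDD slack terms do)."""
    dim = len(train_feats[0])
    center = [sum(f[d] for f in train_feats) / len(train_feats)
              for d in range(dim)]
    dists = sorted(sum((f[d] - center[d]) ** 2 for d in range(dim)) ** 0.5
                   for f in train_feats)
    radius = dists[min(int(quantile * len(dists)), len(dists) - 1)]
    return center, radius

def is_los(feat, center, radius):
    """Inside the learned region -> LOS (known class); outside -> NLOS."""
    d = sum((feat[i] - center[i]) ** 2 for i in range(len(center))) ** 0.5
    return d <= radius
```

The method's scene adaptability comes from training this boundary on features, such as the DP signal residuals, that stay similar for LOS signals across environments, so only one class from one scene is needed.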
Priority-aware Per-flow Size Measurement in High-speed Networks
GAO Guoju, ZHOU Shaolong, SUN Yu-E, HUANG He
2025, 47(6): 1885-1895.   doi: 10.11999/JEIT240834
[Abstract](88) [FullText HTML](40) [PDF 4121KB](23)
Abstract:
  Objective  Network traffic measurement is essential for supporting applications such as anomaly detection and capacity planning. With growing demand for flow-level analysis, traffic measurement technologies are facing increasing performance requirements. In typical network environments, a flow comprises packets sharing a common five-tuple (including source/destination IP address, source/destination port, and protocol). Measuring per-flow size presents three core challenges: high data volume, fast transmission rates, and limited on-chip memory. Sketch-based data structures offer an effective trade-off among memory efficiency, query speed, and measurement accuracy, and have been widely adopted for tasks such as per-flow size estimation, cardinality estimation, persistent flow detection, and burst detection. However, the rising need for differentiated traffic handling has highlighted the limitations of traditional Sketches, which treat all flows uniformly. Existing priority-aware Sketches often fail to maintain both high accuracy for high-priority flows and overall system throughput. To address this gap, this study proposes EssentialKeeper, a priority-aware algorithm that combines priority-sensitive hashing with Cuckoo Hashing. The proposed method ensures accurate measurement for high-priority flows while maintaining efficient system-wide performance. This approach supports differentiated traffic measurement in high-speed networks and contributes both theoretical insights and practical value.  Methods  In practical networks, different traffic types have distinct requirements for measurement accuracy. For example, suspicious or malicious flows require high-precision measurement for security monitoring, whereas latency-sensitive services such as real-time video streaming demand continuous tracking to maintain service quality. To accommodate these varying demands, several priority-aware Sketch algorithms have been proposed. 
These typically partition memory into high- and low-priority regions, assigning different levels of accuracy according to flow priority. All incoming traffic first passes through the high-priority region, where high-priority flows are retained, whereas others are redirected with degraded measurement accuracy. This architecture, however, presents performance challenges. Because low-priority flows constitute the majority of network traffic, they still traverse the high-priority region, incurring additional hash computations and memory access overhead. This overhead substantially lowers throughput. Algorithms such as MC-Sketch and Cuckoo Sketch are particularly affected. Although PA-Sketch introduces priority-aware hashing to reduce the processing load for low-priority flows, it compromises measurement accuracy for medium-priority flows, limiting its practical utility. To address these limitations, this study proposes EssentialKeeper, a new Sketch algorithm for efficient priority-aware traffic measurement under constrained memory conditions. The algorithm combines priority-aware hashing with Cuckoo Hashing. For high-priority flows, it dynamically allocates more hash functions and candidate buckets, using Cuckoo hashing’s "kick-out and relocate" mechanism to enhance measurement precision. For low-priority flows, it employs an optimized Count-Sketch (CS-Sketch) structure to ensure fast processing. This hybrid design sustains high throughput while ensuring accurate tracking of high-priority traffic, thereby resolving the speed-accuracy trade-off that limits existing approaches.  Results and Discussions  This study evaluates EssentialKeeper using the real-world CAIDA-2019 traffic dataset and a network interaction dataset derived from Stack Overflow. Performance is assessed under different priority allocation strategies—random and size-based—and across a range of memory configurations. Optimal algorithm parameters are determined through systematic tuning (Fig. 3–Fig. 5). 
Compared with existing priority-aware Sketches, EssentialKeeper demonstrates substantial improvements across three key metrics. Under the random priority allocation strategy, the average relative error for high-priority flows decreases by 63.2%, while the F1-score increases by 14.8% (Fig. 6, Fig. 7). With size-based priority allocation, the error is reduced by 53.8%, and the F1-score improves by 11.8% (Fig. 8, Fig. 9). Additionally, EssentialKeeper achieves a 10.8% increase in throughput (Fig. 10), while maintaining lower memory overhead. These results highlight the effectiveness of EssentialKeeper in supporting accurate and efficient priority-aware traffic measurement in high-speed network environments.  Conclusions  This study proposes EssentialKeeper, a novel algorithm for priority-aware traffic measurement in high-speed networks. By enhancing the structure of existing priority-aware Sketches, the algorithm enables accurate, differentiated measurement based on flow priority. It combines the efficient conflict resolution of Cuckoo Hashing with the adaptive precision of priority-aware hashing, thereby improving measurement accuracy for high-priority flows while sustaining high throughput for low-priority traffic. Experimental results demonstrate that EssentialKeeper reduces the average relative error of high-priority flows by 58.5%, increases the F1-score by 13.3%, and improves overall system throughput by 10.8% compared to the best existing approaches, achieving a favorable trade-off between speed and accuracy. Despite these advances, several challenges remain. One is the integration with sampling algorithms. Since high-priority flows often carry more critical information, future work could explore dynamic sampling strategies that retain high-priority packets while selectively discarding lower-priority traffic. This hybrid approach may further reduce system overhead without compromising measurement precision. Another direction is task generalization. 
Beyond per-flow size and cardinality estimation, other core measurement tasks—such as persistent flow detection and burst detection—may benefit from priority-aware techniques. Extending EssentialKeeper to support these applications would broaden its utility. Finally, current experiments are conducted in a CPU-based environment. However, practical deployment in production networks may require adaptation to hardware platforms such as P4 switches or FPGAs, which impose tighter resource constraints. Future research should focus on implementing and optimizing priority-aware Sketch algorithms for hardware deployment to assess feasibility and facilitate real-world adoption.
Encryption and Network Information Security
Mixture Distribution-Based Truth Discovery Algorithm under Local Differential Privacy
ZHANG Pengfei, AN Jianlong, CHENG Xiang, ZHANG Zhikun, SUN Li, ZHANG Ji, ZHU Yibo
2025, 47(6): 1896-1910.   doi: 10.11999/JEIT240936
[Abstract](212) [FullText HTML](96) [PDF 2832KB](20)
Abstract:
  Objective  Mobile crowd sensing is an important means of data collection, wherein a fundamental challenge lies in discovering the “truth” from a multitude of sensing data of varying quality. To address potential privacy leakage issues during the truth discovery process, existing methods often incorporate local differential privacy techniques to protect the data submitted by workers. However, these methods fail to adequately consider the negative impact of Gaussian noise, which reflects worker quality, on the accuracy of the noisy “truth”. Moreover, directly applying the Laplace mechanism for privacy protection introduces excessive noise due to the randomness and unbounded nature of the Laplace distribution, resulting in poor precision and utility of truth discovery. Additionally, existing truth discovery methods are either designed for discrete value scenarios or often fail to strictly satisfy Local Differential Privacy (LDP). Therefore, designing a truth discovery algorithm based on mixed distributions that strictly adheres to LDP poses a significant challenge. This is particularly true in continuous value scenarios, where balancing privacy protection with the accuracy of truth discovery, as well as efficiently optimizing the complexity of mixed distribution models to enhance algorithm precision and efficiency, remains a critical issue to be resolved.  Methods  A novel algorithm, termed Mixture distributiOn-based truth discOvery under local differeNtial privacy (MOON), is proposed. This algorithm primarily considers both the Gaussian noise inherent in the data uploaded by workers, which reflects their quality, and the exogenous Laplace noise injected to protect private data. Based on the mixed noise distribution, new iterative equations for truth discovery are designed. Specifically, each worker first injects Laplace noise into their sensed data and uploads the noisy data to the server. 
Subsequently, a probabilistic model combining Gaussian and Laplace noise is constructed and jointly estimated. Finally, the constrained optimization problem is solved using the Lagrange multiplier method to derive iterative equations for worker quality and the noisy “true value”.  Results and Discussions  Experimental results demonstrate that, across two real-world datasets, as the privacy budget ε increases, the MOON algorithm exhibits the least impact on utility compared to other benchmark algorithms. Furthermore, when compared to the state-of-the-art TESLA algorithm, MOON achieves at least a 20% improvement in precision (Fig. 3). In the context of truth discovery, weight updating is a critical component. Therefore, the experiments also validate the differences between the mixed noise weight distribution derived by the MOON algorithm and the true weight distribution across different datasets (Fig. 4). The results indicate that the weight distribution obtained by MOON is closer to the true distribution, aligning with the utility analysis presented in the algorithm analysis section. This is attributed to the smaller scale of noise added to high-quality data. Additionally, the runtime of the MOON algorithm is generally higher than that of the non-privacy-preserving truth discovery algorithm NoPriv, being approximately twice as long (Fig. 5), which is consistent with the theoretical analysis. This is due to the injection of Laplace noise into the data uploaded by workers in MOON, necessitating more iterations to converge to the final truth. However, since both runtimes are measured in seconds, this discrepancy is considered acceptable in practical applications.  Conclusions  Existing truth discovery algorithms that satisfy LDP fail to adequately account for the negative impact of Gaussian noise, which reflects worker quality, on the accuracy of the noisy “truth”. 
Moreover, while directly applying the Laplace mechanism for noise addition strictly ensures LDP compliance, the randomness and unbounded nature of the Laplace distribution result in excessive noise injection. To address these issues, the MOON algorithm is proposed in this work. Theoretical analysis demonstrates that MOON achieves privacy protection while maintaining low computational and communication complexity. Experimental results on two real-world datasets show that, compared with state-of-the-art methods, MOON improves the precision of the derived “truth” by 20% with minimal additional computational overhead. Future work will leverage potential social relationships among workers, which can make the data submitted by related workers similar, as well as functional dependencies among task attributes, to further enhance the accuracy of truth discovery under LDP.
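The worker-side step of MOON, injecting Laplace noise before upload, corresponds to the standard Laplace mechanism. A minimal sketch follows, with an illustrative privacy budget and sensitivity; the paper’s actual parameters and iterative truth-discovery equations are not reproduced here.

```python
import math
import random

def perturb(value: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Report `value` under epsilon-LDP via the Laplace mechanism.

    Noise scale is sensitivity/epsilon: a smaller privacy budget epsilon
    means more noise. Sampling uses the inverse CDF of Laplace(0, scale).
    """
    scale = sensitivity / epsilon
    u = random.random() - 0.5  # uniform on (-0.5, 0.5)
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return value + noise

# Each worker perturbs locally, so the server only ever sees noisy reports.
random.seed(7)
true_value = 23.7
reports = [perturb(true_value, epsilon=1.0) for _ in range(20_000)]
# Laplace noise is zero-mean, so averaging many reports concentrates near true_value.
mean_report = sum(reports) / len(reports)
```

The unbounded tail of the Laplace distribution is exactly the source of the excess noise the abstract criticizes: occasional reports land far from the truth, which MOON’s mixed-distribution weighting is designed to absorb.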
TTRC-ABE: Traitor Traceable and Revocable CLWE-based ABE Scheme from Lattices
LIU Yuan, WANG Licheng, ZHOU Yongbin
2025, 47(6): 1911-1926.   doi: 10.11999/JEIT240997
[Abstract](155) [FullText HTML](87) [PDF 2996KB](25)
Abstract:
  Objective  With the advancement of quantum computing, lattice-based cryptography has emerged as a key approach for constructing post-quantum secure cryptographic primitives due to its inherent resistance to quantum attacks. Among these primitives, lattice-based Attribute-Based Encryption (ABE) is particularly notable for its ability to provide fine-grained access control and flexible authorization, making it suitable for data-sharing applications, such as cloud computing and the Internet of Things (IoT). However, existing lattice-based ABE schemes, especially those based on Learning With Errors (LWE) or Ring-LWE (RLWE), exhibit limitations that hinder their practical deployment. A significant issue is the absence of traitor tracing and revocation mechanisms, which leaves these schemes vulnerable to key abuse, where malicious users can share decryption keys without detection or prevention. Furthermore, the exposure of attribute values in access policies creates a privacy risk, as sensitive user information may be inferred from these values. These limitations undermine the security and privacy of lattice-based ABE systems, limiting their applicability in real-world scenarios where accountability and privacy are critical. To address these challenges, this paper proposes a novel Traitor Traceable and Revocable CLWE-based ABE (TTRC-ABE) scheme, which employs a new variant of LWE called Cyclic Algebra LWE (CLWE). The proposed scheme aims to achieve three key objectives: (1) to introduce an efficient traitor tracing mechanism to identify malicious users and a revocation mechanism to prevent revoked users from decrypting messages; (2) to enhance attribute privacy by concealing attribute values in access policies; and (3) to improve the efficiency of lattice-based ABE schemes, specifically in terms of public key size, ciphertext size, and ciphertext expansion rate. 
By addressing these critical issues, TTRC-ABE contributes to the advancement of lattice-based cryptography and provides a viable solution for secure, privacy-preserving data sharing in quantum-vulnerable environments.  Methods  In the TTRC-ABE scheme, each user’s Global IDentity (GID) is bound to the leaf nodes of a complete binary tree. This binding enables the tracing of malicious users by identifying their GIDs embedded in decryption keys. To revoke compromised users, their GIDs are added to a revocation list, and the ciphertext is updated accordingly, ensuring that any revoked user cannot decrypt the message, even if they possess a valid decryption key. Additionally, the traditional one-dimensional attribute structure (attribute value only) is replaced with a two-dimensional structure (attribute label, attribute value). The attribute labels act as public identifiers, while the attribute values remain confidential. This separation allows for the concealment of sensitive attribute values while still enabling effective access control. A semi-access policy structure is combined with an extended Shamir’s secret sharing scheme over cyclic algebra to conceal attribute values in access policies, preventing adversaries from inferring sensitive user information. Furthermore, the proposed scheme utilizes CLWE, a new variant of LWE that offers improved efficiency and security properties. A formal security proof for TTRC-ABE is provided in the standard model. The security of the scheme relies on the hardness of the CLWE problem, which is believed to be resistant to quantum computing attacks.  Results and Discussions  The proposed TTRC-ABE scheme demonstrates significant improvements over existing lattice-based ABE schemes in terms of functionality, security, and efficiency. The scheme successfully integrates traitor tracing and revocation features, effectively preventing key abuse by identifying malicious users and revoking their access to encrypted data. 
By adopting a two-dimensional attribute structure and a semi-access policy, the scheme conceals attribute values in access policies, ensuring that sensitive user information remains confidential, even when the access policy is publicly accessible. Performance analysis shows that TTRC-ABE supports traitor tracing and revocation, protects attribute privacy, and is resistant to quantum computing attacks (Table 2). Compared to related lattice-based ABE schemes, TTRC-ABE significantly reduces the public key size, ciphertext size, and average ciphertext expansion rate (Table 3, Figure 7). These improvements enhance the practicality of the scheme for real-world applications, especially in resource-constrained environments.  Conclusions  This paper presents a novel TTRC-ABE scheme that addresses the limitations of existing lattice-based ABE schemes. By integrating traitor tracing and revocation mechanisms, the scheme effectively prevents key abuse and ensures system integrity. The introduction of a two-dimensional attribute structure and a semi-access policy enhances attribute privacy, safeguarding sensitive user information from leakage. Furthermore, the use of CLWE improves the scheme’s efficiency, reducing public key size, ciphertext size, and ciphertext expansion rate. Security analysis confirms that TTRC-ABE is secure in the standard model, making it a robust solution for post-quantum secure ABE. Future work will focus on extending the scheme to support more complex access policies, such as hierarchical and multi-authority structures, and optimizing its performance for large-scale applications. Additionally, the integration of TTRC-ABE with other cryptographic primitives, such as homomorphic encryption and secure multi-party computation, will be explored to enable more advanced data-sharing scenarios.
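The policy-hiding construction builds on an extended Shamir secret sharing scheme over cyclic algebra. The threshold idea it extends can be sketched with standard Shamir sharing over a prime field; the modulus and parameters below are illustrative, and the cyclic-algebra extension itself is not reproduced.

```python
import random

P = 2**61 - 1  # prime modulus for the share field (illustrative choice)

def make_shares(secret: int, threshold: int, n: int):
    """Split `secret` into n shares; any `threshold` of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def f(x):
        # Evaluate the random degree-(threshold-1) polynomial with f(0)=secret.
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # pow(den, P-2, P) is the modular inverse of den (Fermat's little theorem).
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(123456789, threshold=3, n=5)
assert reconstruct(shares[:3]) == 123456789
```

In the ABE setting, satisfying the (semi-)access policy corresponds to holding enough shares to interpolate the secret, while fewer shares reveal nothing about it.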
Image and Intelligent Information Processing
Multi-Resolution Spatio-Temporal Fusion Graph Convolutional Network for Attention Deficit Hyperactivity Disorder Classification
SONG Xiaoying, HAO Chunyu, CHAI Li
2025, 47(6): 1927-1936.   doi: 10.11999/JEIT240872
[Abstract](104) [FullText HTML](57) [PDF 2911KB](30)
Abstract:
  Objective  Predicting neurodevelopmental disorders remains a central challenge in neuroscience and artificial intelligence. Attention Deficit Hyperactivity Disorder (ADHD), a representative complex brain disorder, presents diagnostic difficulties due to its increasing prevalence, clinical heterogeneity, and reliance on subjective criteria, which impede early and accurate detection. Developing objective, data-driven classification models is therefore of significant clinical relevance. Existing graph convolutional network-based approaches for functional brain network analysis are constrained by several limitations. Most adopt single-resolution brain parcellation schemes, reducing their capacity to capture complementary features from multi-resolution functional Magnetic Resonance Imaging (fMRI) data. Moreover, the lack of effective cross-scale feature fusion restricts the integration of essential features across resolutions, hampering the modeling of hierarchical dependencies among brain regions. To address these limitations, this study proposes a Multi-resolution Spatio-Temporal Fusion Graph Convolutional Network (MSTF-GCN), which integrates spatiotemporal features across multiple fMRI resolutions. The proposed method substantially improves the accuracy and robustness of functional brain network classification for ADHD.  Methods  The MSTF-GCN improves learning performance through two main components: (1) construction of multi-resolution, multi-channel networks, and (2) comprehensive fusion of temporal and spatial information. Multiple brain atlases at different resolutions are employed to parcellate the brain and generate functional connectivity networks. Spatial features are extracted from these networks, and optimal nodal features are selected using Support Vector Machine-Recursive Feature Elimination (SVM-RFE). 
To preserve global temporal characteristics and capture hierarchical signal variations, both the original time series and their differential signals are processed using a temporal convolutional network. This structure enables the extraction of complex temporal features and inter-subject temporal correlations. Spatial features from different resolutions are then fused with temporal correlations to form population graphs, which are adaptively integrated via a multi-channel graph convolutional network. Non-imaging data are also integrated to produce effective multi-channel, multi-modal spatiotemporal fusion features. The final classification is performed using a fully connected layer.  Results and Discussions  The proposed MSTF-GCN model is evaluated for ADHD classification using two independent sites from the ADHD-200 dataset: Peking and NI. The model consistently outperforms existing methods, achieving classification accuracies of 75.92% at the Peking site and 82.95% at the NI site (Table 2, Table 3). Ablation studies confirm the contributions of two key components: (1) The multi-atlas, multi-resolution feature extraction strategy significantly enhances classification accuracy (Table 4), supporting the utility of complementary cross-scale topological information; (2) The multimodal fusion strategy, which incorporates non-imaging variables (gender and age), yields notable performance improvements (Table 5). Furthermore, t-SNE visualization and inter-class distance analysis (Fig. 6) show that MSTF-GCN generates a feature space with clearer class separation, reflecting the effectiveness of its multi-channel spatiotemporal fusion design. Overall, the MSTF-GCN model achieves superior performance compared with state-of-the-art methods and demonstrates strong robustness across sites, offering a promising tool for auxiliary diagnosis of brain disorders.  
Conclusions  This study proposes a novel multi-channel graph embedding framework that integrates spatial topological and temporal features derived from multi-resolution fMRI data, leading to marked improvements in classification performance. Experimental results show that the MSTF-GCN method exceeds current state-of-the-art algorithms, with accuracy gains of 3.92% and 8.98% on the Peking and NI sites, respectively. These findings confirm its strong performance and cross-site robustness in ADHD classification. Future work will focus on constructing more expressive hypergraph neural networks to capture higher-order relationships within functional brain networks.
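The SVM-RFE step used to select nodal features follows a simple loop: repeatedly score the remaining features and drop the weakest. The sketch below keeps that elimination loop but, to stay self-contained, substitutes an absolute-correlation criterion for the linear-SVM weight criterion; the toy data and criterion are illustrative only.

```python
import math

def rfe_rank(X, y, score, n_keep):
    """Recursive feature elimination skeleton: repeatedly drop the feature
    with the lowest score until n_keep remain. SVM-RFE uses the squared
    weights of a retrained linear SVM as the score; any criterion plugs in."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        worst = min(remaining, key=lambda f: score(X, y, f))
        remaining.remove(worst)
    return remaining

def abs_corr(X, y, f):
    """Stand-in criterion: |Pearson correlation| between feature f and label."""
    xs = [row[f] for row in X]
    n = len(xs)
    mx, my = sum(xs) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, y))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

# Toy data: feature 0 tracks the label, feature 1 is noise.
X = [[1, 3], [2, 1], [3, 4], [4, 1]]
y = [1, 2, 3, 4]
kept = rfe_rank(X, y, abs_corr, n_keep=1)
```

Because the criterion is recomputed after every elimination, features that are only informative jointly with others can survive longer than a one-shot ranking would allow, which is the main appeal of the recursive scheme.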
3D Model Classification Based on Central Anchor Hard Triplet Loss and Multi-view Feature Fusion
GAO Xueyao, ZHANG Yunkai, ZHANG Chunxiang
2025, 47(6): 1937-1949.   doi: 10.11999/JEIT240633
[Abstract](92) [FullText HTML](44) [PDF 3935KB](18)
Abstract:
  Objective  In view-based 3D model classification, deep learning algorithms extract more representative features from 2D projections to improve classification accuracy. However, several challenges remain. A single view captures information only from a specific perspective, often leading to the omission of critical features. To address this, multiple views are generated by projecting the 3D model from various angles. These multi-view representations provide more comprehensive information through fusion. Nonetheless, the feature content of each view differs, and treating all views equally may obscure discriminative information. Moreover, inter-view complementarity and correlations may be overlooked. Effective utilization of multi-view information is therefore essential to enhance the accuracy of 3D model classification.  Methods  A 3D model classification method based on Central Anchor Hard Triplet Loss (CAH Triplet Loss) and multi-view feature fusion is proposed. Firstly, multi-view sets of 3D models are used as input, and view features are extracted using a Deep Residual Shrinkage Network (DRSN). These features are then fused with the 2D shape distribution features D1, D2, and D3 to obtain fused features of the 2D views. Secondly, Shannon entropy is applied to evaluate the uncertainty of view classification based on the fused features. The multiple views of each 3D model are then ranked in descending order of view saliency. Thirdly, a triplet network based on an Attention-enhanced Long Short-Term Memory (Att-LSTM) architecture is constructed for multi-view feature fusion. The LSTM component captures contextual dependencies among views, while a multi-head attention mechanism is integrated to fully capture inter-view relevance. Fourthly, metric learning is applied by combining CAH Triplet Loss with Cross-Entropy Loss (CE Loss) to optimize the fusion network. 
This combined loss function is designed to reduce the feature-space distance between similar samples while increasing the distance between different samples, thereby enhancing the network’s capacity to learn discriminative features from 3D models.  Results and Discussions  When DRSN is used to extract view features from 2D projections and softmax is applied for classification, the 3D model classification achieves the highest accuracy, as shown in Table 1. The integration of shape distribution features D1, D2, and D3 with view features yields a more comprehensive representation of the 3D model, which significantly improves classification accuracy (Table 2). Incorporating CAH Triplet Loss reduces intra-class distances and increases inter-class distances in the feature space. This guides the network to learn more discriminative feature representations, further improving classification accuracy, as illustrated in Figure 4. The application of Shannon entropy to rank view saliency enables the extraction of complementary and correlated information across multiple views. This ranking strategy enhances the effective use of multi-view data, resulting in improved classification performance, as shown in Table 3.  Conclusions  This study presents a novel multi-view 3D model classification framework that achieves improved performance through three key innovations. Firstly, a hybrid feature extraction strategy is proposed, combining view features extracted by the DRSN with 2D shape distribution features D1, D2, and D3. This fusion captures both high-level semantic and low-level geometric characteristics, enabling a comprehensive representation of 3D objects. Secondly, a view saliency evaluation mechanism based on Shannon entropy is introduced. This approach dynamically assesses and ranks views according to their classification uncertainty, ensuring that the most informative views are prioritized and that the complementarity among views is retained. 
At the core of the architecture lies a feature fusion module that integrates Long Short-Term Memory (LSTM) networks with multi-head attention mechanisms. This dual-path structure captures sequential dependencies across ordered views through LSTM and models global inter-view relationships through attention, thereby effectively leveraging view correlation and complementarity. Thirdly, the proposed CAH Triplet Loss combines center loss and hard triplet loss to simultaneously minimize intra-class variation and maximize inter-class separation. Together with cross-entropy loss, this joint optimization enhances the network’s ability to learn discriminative features for robust 3D model classification.
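The Shannon-entropy view ranking described in the Methods can be sketched directly: each view’s fused features yield a class-probability vector, and views with lower entropy (lower classification uncertainty) are treated as more salient. The probability vectors below are illustrative, not drawn from the paper.

```python
import math

def shannon_entropy(probs):
    """Entropy (bits) of one view's class-probability vector; lower means
    the classifier is more certain about that view."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def rank_by_saliency(view_probs):
    """Return view indices in descending saliency order, i.e. ascending
    entropy: confident views first, ambiguous views last."""
    return sorted(range(len(view_probs)),
                  key=lambda v: shannon_entropy(view_probs[v]))

views = [
    [0.90, 0.05, 0.05],  # confident view: low entropy, high saliency
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain, lowest saliency
    [0.60, 0.30, 0.10],
]
order = rank_by_saliency(views)
```

Feeding views to the sequence model in this order lets the LSTM see the most informative projections first while still retaining the complementary information in later, noisier views.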
Hierarchical Network-Based Multi-Task Learning Method for Fishway Water Level Prediction
SU Xin, QIN Zijian, LÜ Jia, QIN Mingyu
2025, 47(6): 1950-1965.   doi: 10.11999/JEIT241003
[Abstract](141) [FullText HTML](78) [PDF 7200KB](27)
Abstract:
  Objective  The construction of dams and other large-scale water infrastructure projects has significant ecological consequences, particularly affecting fish migration patterns. These environmental changes pose substantial challenges to biodiversity conservation and resource management. One of the key challenges is the accurate and real-time prediction of water levels in fish passages, which is essential for mitigating the negative effects of dams on fish migration, maintaining ecological balance, and ensuring the sustainability of aquatic species. Traditional water level monitoring systems often face limitations, such as insufficient coverage, lack of real-time predictive capabilities, and an inability to capture complex temporal dependencies in water level fluctuations, leading to inaccurate or delayed predictions. Furthermore, the processing of long-term, high-dimensional water level data in dynamic environments remains a critical gap in existing systems. To address these issues, this study proposes a Hierarchical Network-based Fish Passage Monitoring System (HNFMS) and a novel Multi-Task (MT) learning model, Adaptive Sequence Self-Organizing Map Transformation based on Variational Mode Decomposition (AS-SOMVT). The HNFMS aims to enhance both the efficiency and coverage of water level monitoring by providing comprehensive and timely data. The AS-SOMVT model employs auxiliary sequences to improve prediction accuracy and manage dynamic, multi-dimensional water level data in real time. Through these innovations, this study aims to enhance fish passage monitoring, mitigate the ecological impact of dam construction on fish migration, and provide a robust tool for ecological conservation and resource management.  Methods  The HNFMS integrates a hierarchical network structure to improve both the efficiency and coverage of water level monitoring. 
To address the complex temporal dependencies inherent in water level fluctuations, this study introduces the AS-SOMVT MT learning model. This model leverages auxiliary sequences to enhance the ability to capture complex temporal relationships, ensuring accurate water level predictions. The approach enables real-time processing of multi-dimensional water level data, effectively managing the complexity of fluctuating water levels across varying conditions. Additionally, the study incorporates an Auxiliary Sequence Self-Organizing Map (AS-SOM) algorithm to optimize prediction efficiency for long sequences, further enhancing the model’s capacity to process high-dimensional, multi-variate water level data. The model also integrates a Variational Mode Decomposition (VMD) technique, which decomposes complex water level time series into different frequency components. This approach extracts key feature patterns with higher predictive value while filtering out noise and redundant information, improving data quality and enhancing the model’s predictive performance. To increase the robustness of the system, the study incorporates an ensemble of diverse machine learning techniques, including both deep learning models and traditional statistical methods. This ensemble is designed to adapt to varying environmental conditions and ensure robust performance across different situations.  Results and Discussions  The AS-SOMVT model significantly outperforms traditional models in water level prediction accuracy. The integration of auxiliary sequences allows the model to capture complex temporal dependencies more effectively, resulting in more reliable real-time predictions (Fig. 4). Furthermore, the incorporation of VMD improves the model’s ability to remove noise and extract crucial features, enhancing its adaptability to dynamic water level changes in real-world environments. 
Ablation experiments demonstrate that removing key components, such as feature Relationship modeling (Rel), Attention Pooling (AP), or MT Learning, leads to a substantial decline in model performance. This highlights the essential role these components play in improving predictive accuracy and managing complex patterns. Specifically, the removal of any of these components results in a marked decrease in precision and stability, highlighting the collaborative contribution of these elements within the MT learning framework. In multi-dimensional water level prediction tasks, the AS-SOMVT model performs exceptionally well, especially in dynamic environments. Additionally, the hierarchical structure of the HNFMS substantially enhances monitoring efficiency and coverage, providing more accurate and comprehensive water level data through real-time model adjustments (Fig. 8). In comparative experiments, the AS-SOMVT model consistently outperforms traditional models, particularly in forecasting multi-dimensional water levels, establishing it as a powerful tool for large-scale, real-time monitoring applications (Table 4).  Conclusions  The proposed HNFMS, combined with the AS-SOMVT MT learning model, offers an effective solution for real-time, accurate water level prediction in fish passages. This innovative approach not only enhances the efficiency and coverage of water level monitoring systems but also provides a valuable tool for mitigating the ecological impacts of dam constructions on fish migration. The integration of auxiliary sequences into the MT learning model has proven to be a critical factor in improving predictive performance, opening new opportunities for ecological conservation. 
As concerns about the ecological impacts of water infrastructure projects grow, the development of more accurate and efficient water level monitoring systems becomes increasingly vital for informing policy decisions, designing fish-friendly structures, and enhancing aquatic ecosystem management. This study presents a scientifically significant and practically necessary solution for promoting sustainable environmental practices. The integration of advanced machine learning techniques, such as MT learning and VMD, ensures the system can handle both short-term and long-term water level prediction tasks, addressing the complexities of environmental dynamics in real time. This research, therefore, makes a significant contribution to the field of environmental monitoring and provides essential insights for the future development of eco-friendly infrastructure.
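VMD itself solves a variational optimization problem and is not reproduced here; as a simple stand-in, the sketch below splits a water-level series into a slow trend and a fast residual with a centered moving average, illustrating the decompose-then-predict idea behind AS-SOMVT’s preprocessing. The series and window size are illustrative.

```python
def moving_average(series, window):
    """Centered moving average; edges use whatever neighborhood is available."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

def decompose(series, window=5):
    """Split a water-level series into a slow trend and a fast residual.

    Illustrative stand-in for VMD's frequency-band components: the trend
    carries the low-frequency pattern, the residual the high-frequency
    fluctuations (and noise to be filtered downstream).
    """
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual

levels = [2.0, 2.1, 2.3, 2.2, 2.5, 2.4, 2.6, 2.8, 2.7, 2.9]
trend, residual = decompose(levels)
```

The components sum back to the original series by construction, so a predictor can model each band separately and recombine the forecasts, which is the structural benefit VMD provides in the full system.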
Multimodal Intent Recognition Method with View Reliability
YANG Ying, YANG Yanqiu, YU Bengong
2025, 47(6): 1966-1975.   doi: 10.11999/JEIT240778
[Abstract](181) [FullText HTML](86) [PDF 1467KB](42)
Abstract:
  Objective  With the rapid advancement of human-computer interaction technologies, accurately recognizing users’ multimodal intentions in social chat dialogue systems has become essential. These systems must process both semantic and affective content to meet users’ informational and emotional needs. However, current approaches face two major challenges: ineffective cross-modal interaction and difficulty handling uncertainty. First, the heterogeneity of multimodal data limits the ability to leverage intermodal complementarity. Second, noise affects the reliability of each modality differently, and traditional methods often fail to account for these dynamic variations, leading to suboptimal fusion performance. To address these limitations, this study proposes a Trusted Multimodal Intent Recognition (TMIR) method. TMIR adaptively fuses multimodal information by assessing the credibility of each modality, thereby enhancing intent recognition accuracy and model interpretability. This approach supports intelligent and personalized services in open-domain conversational systems.  Methods  The TMIR method is developed to improve the accuracy and reliability of intent recognition in social chat dialogue systems. It consists of three core modules: a multimodal feature representation layer, a multi-view feature extraction layer, and a trusted fusion layer (Fig. 1). In the multimodal feature representation layer, BERT, Wav2Vec 2.0, and Faster R-CNN are used to extract features from text, audio, and video inputs, respectively. The multi-view feature extraction layer comprises a cross-modal interaction module and a modality-specific encoding module. The cross-modal interaction module applies cross-modal Transformers to generate cross-modal feature views, enabling the model to capture complementary information between modalities (e.g., text and audio). This enhances the expressiveness of the overall feature representation. 
The modality-specific encoding module employs Bi-LSTM to extract unimodal feature views, preserving the distinct characteristics of each modality. In the trusted fusion layer, features from each view are converted into evidence. Subjective opinions are formulated according to subjective logic theory and are fused dynamically using Dempster’s combination rules. This process yields the final intent recognition result and provides a measure of credibility. To optimize model training, a combinatorial strategy based on Dirichlet distribution expectation is applied, which reduces uncertainty and enhances recognition reliability.  Results and Discussions  The TMIR method is evaluated on the MIntRec dataset, achieving a 1.73% improvement in accuracy and a 1.1% increase in recall compared with the baseline (Table 2). Ablation studies confirm the contribution of each module: removing the cross-modal interaction and modality-specific encoding components results in a 3.82% drop in accuracy, highlighting their roles in capturing intermodal interactions and preserving unimodal features (Table 3). Excluding the multi-view trusted fusion module reduces accuracy by 1.12% and recall by 1.67%, demonstrating the effectiveness of credibility-based dynamic fusion in enhancing generalization (Table 3). Receiver Operating Characteristic (ROC) curve analysis (Fig. 2) shows that TMIR outperforms the MULT model in detecting both “thanks” and “taunt” intents, with higher Area Under the Curve (AUC) values. In terms of computational efficiency, TMIR maintains comparable FLOPs and parameter counts to existing multimodal models (Table 4), indicating its feasibility for real-world deployment. These results demonstrate that TMIR effectively balances performance and efficiency, offering a promising approach for robust multimodal intent recognition.  Conclusions  This study proposes a TMIR method. 
By addressing the heterogeneity and uncertainty of multimodal data—specifically text, audio, and video—the method incorporates a cross-modal interaction module, a modality-specific encoding module, and a multi-view trusted fusion module. These components collectively enhance the accuracy and interpretability of intent recognition. Experimental results demonstrate that TMIR outperforms the baseline in both accuracy and recall, and exhibits strong generalization in handling multimodal inputs. Future work will address class imbalance and the dynamic identification of emerging intent categories. The method also holds potential for broader application in domains such as healthcare and customer service, supporting its multi-domain scalability.
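The evidence-to-opinion conversion and credibility-based fusion described in this abstract can be sketched with the reduced form of Dempster’s combination rule commonly used with Dirichlet-based subjective opinions; the two-modality, three-class setup, the function names, and all evidence values below are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def opinion_from_evidence(e):
    """Convert non-negative evidence for K classes into a subjective opinion.

    Dirichlet parameters are alpha = e + 1; belief masses are b_k = e_k / S
    and the uncertainty mass is u = K / S, where S = sum(alpha).
    """
    e = np.asarray(e, dtype=float)
    K = e.size
    S = (e + 1.0).sum()
    return e / S, K / S  # (belief vector, uncertainty mass)

def dempster_fuse(b1, u1, b2, u2):
    """Fuse two subjective opinions with the reduced Dempster combination rule."""
    conflict = np.outer(b1, b2).sum() - (b1 * b2).sum()  # mass on disagreeing class pairs
    scale = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    u = (u1 * u2) / scale
    return b, u

# Two modality views voting over three intent classes.
b_text, u_text = opinion_from_evidence([8.0, 1.0, 0.0])    # confident text view
b_audio, u_audio = opinion_from_evidence([2.0, 1.0, 0.0])  # weaker audio view
b, u = dempster_fuse(b_text, u_text, b_audio, u_audio)
print(b, u)  # fused beliefs; uncertainty drops below either input's
```

Because beliefs plus uncertainty sum to one for each opinion, the fused masses also sum to one, and agreement between views reduces the fused uncertainty, which is what makes the output usable as a credibility measure.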
CFS-YOLO: An Early Fire Detection Method via Coarse and Fine Grain Search and Focus Modulation
FANG Xianjin, JIANG Xuefeng, XU Liuquan, FANG Zhongyi
2025, 47(6): 1976-1991.   doi: 10.11999/JEIT240928
[Abstract](392) [FullText HTML](231) [PDF 7104KB](87)
Abstract:
  Objective   Fire is a frequent disaster, and detecting early fire phenomena effectively can significantly reduce casualties. Traditional fire detection methods, which rely on sensor devices, struggle to accurately detect fires in open spaces. With the development of deep learning, fire detection can be automated through image capture devices like cameras, improving detection accuracy. However, early-stage fires are small and often obscured by occlusion or fire-like objects. Fire detection models, such as Faster Region-based Convolutional Neural Networks (R-CNN) and You Only Look Once (YOLO), often fail to meet real-time detection requirements due to their large number of parameters, which slow down inference. Additionally, existing models face challenges in preserving fire edges and color features, leading to reduced detection accuracy. To address these issues, this paper proposes CFS-YOLO, an early-stage fire recognition model that incorporates coarse- and fine-grained search and focus modulation, enhancing both the speed and accuracy of fire detection.  Methods   To enhance the detection efficiency of the model, a coarse- and fine-grained search strategy is introduced to optimize the lightweight structure of the Unit Inference Block (UIB) module, which consists of four possible instantiations (Fig. 2). A coarse-grained search quickly evaluates different network architectures by adapting the network topology, adding optional convolutional modules to the UIB, and modifying the arrangement and combination of the modules. Dimensionality tuning is performed during the search process to select feature map dimensions and convolutional kernel sizes, generating candidate architectures by expanding or compressing the network width. During the filtering process, candidate architectures are evaluated based on multiple performance metrics. 
A multi-objective optimization approach is used to find the Pareto-optimal solution, retaining candidate architectures that balance accuracy and efficiency. Weight sharing is employed to improve parameter reuse. The fine-grained search refines the candidate architectures from the coarse-grained search, dynamically adjusting hyperparameters such as the learning rate, batch size, regularization coefficient, and optimization algorithm according to the model's performance during training. It analyzes and adjusts each module layer by layer to accelerate convergence and better adapt to data complexity. To address the challenges posed by complex scenes and interfering objects, a focus-modulated attention mechanism is introduced, as shown in (Fig. 3). The input fire images are processed through a lightweight linear layer, followed by selective aggregation of contextual information into the modulators of each query token through a hierarchical contextualization module and gating mechanism. These aggregated modulators are injected into each query token via affine transformations to generate outputs. This approach helps tackle the challenges of detecting small targets or objects in complex backgrounds, effectively capturing long-range dependencies and contextual information in the image. Finally, to account for the effects of anchor shape and angle, the model introduces a ShapeIoU loss function (Fig. 4). This function considers the influence of the distance, shape, and angle between the ground-truth and anchor boxes on bounding-box regression, enabling accurate measurement of the similarity between ground-truth and predicted boxes.  Results and Discussions  (Table 1) presents the results of the ablation experiments. The results show that CFS-YOLO achieves optimal performance. Compared with the baseline model, CFS-YOLO improves precision, recall, and F1 score by 13.33%, 4.96%, and 9.36%, respectively, and increases the frame rate by 22 fps. 
The model also shows significant improvements in APflame, APsmoke, and mAP, with increases of 11.1%, 16.2%, and 13.65%, respectively, validating the model’s effectiveness. (Fig. 6) illustrates the detection heat map for the ablation model, demonstrating that the combination of the focus-modulated attention mechanism and the ShapeIoU loss function effectively captures key features, confirming their synergistic effect. (Fig. 7) shows the loss curves for IoU and ShapeIoU. At the 80th epoch, the loss of the baseline model stabilizes and converges to 0.5. In contrast, the bounding box loss and DFL loss with the ShapeIoU loss function converge to 0.3 by the 40th epoch, while the classification loss reaches 0.15 by the 80th epoch, highlighting the effectiveness of the ShapeIoU loss function. (Table 2) compares the performance with several state-of-the-art target detection models, while (Table 3) presents a comparison of different flame detection algorithms. The results show that CFS-YOLO leads in performance and demonstrates higher computational efficiency, indicating its potential application value in the flame detection field. (Fig. 10) and (Fig. 11) provide visualizations of the CFS-YOLO detection results, showing its excellent performance in capturing fire information despite background interference and small fire targets.  Conclusions  CFS-YOLO demonstrates outstanding performance in early fire detection, achieving detection speeds of up to 75 frames per second. It provides high inference speeds, meeting the requirements for real-time detection. Compared with state-of-the-art object detection models, CFS-YOLO outperforms in both detection accuracy and speed.
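The shape-aware regression loss described in this abstract can be sketched as a simplified single-box computation in the spirit of ShapeIoU: an IoU term plus a shape-weighted center distance and a shape-mismatch cost. The constants (`scale`, `theta`, the 0.5 weight) and the exact term weighting are illustrative assumptions, not the paper’s formulation.

```python
import numpy as np

def shape_iou_loss(pred, gt, scale=1.0, theta=4.0):
    """Simplified shape-aware IoU loss for boxes given as (cx, cy, w, h)."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    # Plain intersection-over-union.
    ix = max(0.0, min(px + pw/2, gx + gw/2) - max(px - pw/2, gx - gw/2))
    iy = max(0.0, min(py + ph/2, gy + gh/2) - max(py - ph/2, gy - gh/2))
    inter = ix * iy
    union = pw * ph + gw * gh - inter
    iou = inter / union
    # Shape weights derived from the ground-truth box's own aspect.
    ww = 2.0 * gw**scale / (gw**scale + gh**scale)
    hh = 2.0 * gh**scale / (gw**scale + gh**scale)
    # Center distance normalized by the smallest enclosing box diagonal.
    cw = max(px + pw/2, gx + gw/2) - min(px - pw/2, gx - gw/2)
    ch = max(py + ph/2, gy + gh/2) - min(py - ph/2, gy - gh/2)
    dist = (ww * (px - gx)**2 + hh * (py - gy)**2) / (cw**2 + ch**2)
    # Shape cost penalizing width/height mismatch.
    ow = hh * abs(pw - gw) / max(pw, gw)
    oh = ww * abs(ph - gh) / max(ph, gh)
    shape_cost = (1 - np.exp(-ow))**theta + (1 - np.exp(-oh))**theta
    return 1.0 - iou + dist + 0.5 * shape_cost

print(shape_iou_loss((10, 10, 4, 4), (10, 10, 4, 4)))  # identical boxes -> 0.0
```

The point of the extra terms is that two box pairs with the same IoU but different shape or offset direction receive different penalties, which is what the abstract attributes to ShapeIoU.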
Texture-Enhanced Infrared-Visible Image Fusion Approach Driven by Denoising Diffusion Model
WANG Hongyan, PENG Jun, YANG Kai
2025, 47(6): 1992-2004.   doi: 10.11999/JEIT240975
[Abstract](95) [FullText HTML](73) [PDF 4966KB](7)
Abstract:
  Objective  The growing demand for high-quality fusion of infrared and visible images in various applications has highlighted the limitations of existing methods, which often fail to preserve texture details or introduce artifacts that degrade structural integrity and color fidelity. To address these challenges, this study proposes a fusion method based on a denoising diffusion model. The approach employs a multi-scale spatiotemporal feature extraction and fusion strategy to improve structural consistency, texture sharpness, and color balance in the fused image. The resulting fusion images better align with human visual perception and demonstrate enhanced reliability in practical applications.  Methods  The proposed method integrates a denoising diffusion model to extract multi-scale spatiotemporal features from infrared and visible images, enabling the capture of fine-grained structural and textural information. To improve edge preservation and reduce blurring, a high-frequency texture enhancement module based on convolution operations is employed to strengthen edge representation. A Dual-directional Multi-scale Convolution Module (DMCM) extracts hierarchical features across multiple scales, while a Bidirectional Attention Fusion Module dynamically emphasizes key global information to improve the completeness of feature representation. The fusion process is optimized using a hybrid loss function that combines adaptive structural similarity loss, multi-channel intensity loss, and multi-channel texture loss. This combination improves color consistency, structural fidelity, and the retention of high-frequency details.  Results and Discussions  Experiments conducted on the Multi-Spectral Road Scenarios (MSRS) and TNO datasets demonstrate the effectiveness and generalization capacity of the proposed method. In daytime scenes (Fig. 4, Fig. 
5), the method reduces edge distortion and corrects color saturation imbalance, producing sharper edges and more balanced brightness in high-contrast regions such as vehicles and road obstacles. In nighttime scenes (Fig. 6), it maintains the saliency of thermal targets and smooth color transitions, avoiding spectral artifacts typically introduced by simple feature fusion. Generalization tests on the TNO dataset (Fig. 7) confirm the robustness of the approach. In contrast to the overlapping light source artifacts observed in Dif-Fusion, the proposed method enhances thermal targets while preserving background details. Quantitative evaluation (Table 1, Fig. 8) shows improved contrast, structural fidelity, and edge preservation.  Conclusions  This study presents a texture-enhanced infrared–visible image fusion method driven by a denoising diffusion model. By integrating multi-scale spatiotemporal feature extraction, feature fusion, and hybrid loss optimization, the method demonstrates clear advantages in texture preservation, color consistency, and edge sharpness. Experimental results across multiple datasets confirm the fusion quality and generalization capability of the proposed approach.
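The intensity and texture terms of the hybrid loss described in this abstract can be sketched as follows, using a max-based target for each term; the finite-difference gradient, the max-selection rule, and the weights are illustrative assumptions standing in for the paper’s multi-channel formulation (the adaptive structural similarity term is omitted here).

```python
import numpy as np

def grad_mag(img):
    """Finite-difference gradient magnitude as a cheap texture measure."""
    gx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))
    gy = np.abs(np.diff(img, axis=0, prepend=img[:1, :]))
    return gx + gy

def fusion_loss(fused, ir, vis, w_int=1.0, w_tex=1.0):
    """Intensity term pulls the fused image toward the brighter source pixel;
    texture term pulls its gradients toward the stronger source gradient."""
    loss_int = np.mean(np.abs(fused - np.maximum(ir, vis)))
    loss_tex = np.mean(np.abs(grad_mag(fused) -
                              np.maximum(grad_mag(ir), grad_mag(vis))))
    return w_int * loss_int + w_tex * loss_tex

rng = np.random.default_rng(0)
ir, vis = rng.random((32, 32)), rng.random((32, 32))
print(fusion_loss(np.maximum(ir, vis), ir, vis))  # intensity term is exactly 0
```

The max-based targets encode the usual fusion intuition: keep the salient thermal intensity from the infrared channel and the sharper texture from whichever channel carries it.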
A Multiparameter Spoofing Detection Method Based on Parallel CNN-Transformer Neural Network with Gating Mechanism
ZHUANG Xuebin, NIU Ben, LIN Zijian, ZHANG Linjie
2025, 47(6): 2005-2014.   doi: 10.11999/JEIT240977
[Abstract](220) [FullText HTML](95) [PDF 5796KB](57)
Abstract:
  Objective  Global Navigation Satellite Systems (GNSS) provide location, velocity, and timing services globally and are widely used. However, their signals are highly susceptible to interference from natural environments or human factors, and existing single-parameter and multi-parameter detection methods have limitations. In an increasingly complex electromagnetic environment, satellite navigation systems face a growing risk of deception and interference. Therefore, it is essential to refine deception interference detection techniques to enhance the generality and adaptability of detection algorithms. This study proposes a multi-parameter deception interference detection algorithm that addresses the limitations of existing methods, ensures the secure and reliable operation of GNSS receivers, and contributes to the safety and stability of satellite navigation systems.  Methods  Key information is extracted from the receiver tracking stage, and five observation metrics are selected: code rate, discriminator output, Doppler shift, carrier-to-noise ratio, and the Signal Quality Monitoring (SQM) index. Because the raw values fluctuate considerably, sliding-window processing using the Moving Variance (MV) and Moving Mean (MA) is applied to obtain nine feature parameters, forming multidimensional time-series samples. This approach better captures signal feature trends, reduces the effect of data fluctuations, and provides a stable and reliable data foundation for subsequent detection. A Parallel CNN-Transformer Neural Network (PCTN) based on a gating mechanism is then constructed, consisting of three convolutional neural network modules, eight Transformer encoder modules, and one gating module. The gating mechanism learns the weights of the two branches, fuses their outputs, and detects deception interference signals. The model is evaluated on the TEXBAT dataset and an actual dataset, and its performance is compared with that of five existing algorithms.  
Results and Discussions  The PCTN algorithm performs well on the TEXBAT dataset. As shown in Fig. 6, its classification accuracy for real signals reaches 99.222%, exceeding that of the five comparison algorithms. The ROC curve (Fig. 8) and evaluation metrics (Table 3) indicate that the PCTN algorithm achieves the highest AUC value and outperforms the others in accuracy, precision, recall, and F1 score, demonstrating stable classification performance across various deception scenarios and effectively distinguishing deception signals from real signals. Actual data are collected with a deception interference collection platform, and the model is tested after fine-tuning. The PCTN algorithm maintains significant advantages, achieving the highest AUC value in the ROC curve (Fig. 10). As shown in Table 4, its detection accuracy remains above 94.5%, exceeding the other algorithms. Compared with its performance on the TEXBAT dataset, the PCTN algorithm exhibits only a 5% decrease on the actual dataset, significantly smaller than that of the other algorithms. This demonstrates its robustness, strong generalization capability, and effectiveness in detecting deception interference in new scenarios.  Conclusions  This study proposes a multi-parameter deception interference detection algorithm based on Deep Learning (DL). The method extracts multiple parameter features from the receiver tracking stage, forms multidimensional time-series samples, and employs the PCTN model for detection. Experimental results demonstrate that, compared with five existing algorithms, the proposed method offers significant advantages. On the TEXBAT dataset, it achieves high accuracy across various deception scenarios. On the actual dataset, it exhibits better generalization performance and effectively differentiates deceptive signals from real signals, even with new datasets. 
Future research can focus on deploying the algorithm on hardware platforms to enable real-time and accurate deception interference detection in practical satellite navigation scenarios. This will further enhance the security of satellite navigation systems and support the reliable application of satellite navigation technology in complex electromagnetic environments.
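The sliding-window moving-mean (MA) and moving-variance (MV) feature extraction described in this abstract can be sketched as follows; the window length, the synthetic code-rate drift, and the function name are illustrative assumptions, not the paper’s parameters.

```python
import numpy as np

def window_features(x, win=20):
    """Moving mean (MA) and moving variance (MV) over a sliding window.

    Returns len(x) - win + 1 samples per feature, smoothing raw fluctuations
    while exposing trend changes in the underlying observable.
    """
    x = np.asarray(x, dtype=float)
    view = np.lib.stride_tricks.sliding_window_view(x, win)
    return view.mean(axis=1), view.var(axis=1)

# A spoofed observable often starts drifting; MV rises once the drift begins.
t = np.arange(200, dtype=float)
signal = np.where(t < 100, 0.0, 0.01 * (t - 100))  # drift starts at t = 100
ma, mv = window_features(signal, win=20)
print(mv[:50].max(), mv[-1])  # flat segment has zero variance, drift does not
```

Stacking such MA/MV series for each of the five observables is one way to arrive at the multidimensional time-series samples the network consumes.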
Circuit and System Design
A High-quality Factor Mode-localized MEMS Electric Field Sensor
WANG Guijie, CHU Zhaozhi, YANG Pengfei, RAN Lifang, PENG Chunrong, LI Jianhua, ZHANG Bo, WEN Xiaolong
2025, 47(6): 2015-2022.   doi: 10.11999/JEIT241008
[Abstract](169) [FullText HTML](148) [PDF 3396KB](26)
Abstract:
  Objective  High-performance Micro-Electro-Mechanical Systems (MEMS) Electric Field Sensors (EFS) are essential for measuring atmospheric electric fields and non-contact voltage. The mode localization effect can significantly improve resolution and is a recent focus in EFS research. However, in weakly coupled resonant systems, mode aliasing occurs when the quality factor is low, hindering the extraction of valid amplitude information. This study proposes a novel resonant MEMS EFS based on mode localization. The sensor employs a Double-Ended Tuning Fork (DETF) structure and a T-shaped tether to minimize energy loss, achieving a high quality factor and resolution while effectively mitigating mode aliasing. This study presents theoretical analysis and numerical simulations. A prototype is fabricated and tested at a pressure of ${10}^{-3}\;\mathrm{Pa}$. Experimental results demonstrate that within an electric field range of $0 \sim 90\;\mathrm{kV/m}$, the EFS exhibits a resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$ and a quality factor of 42,423.  Methods  The sensor comprises two coupled resonators based on a tuning fork and T-shaped tether structure. It utilizes the principle of mode localization and an amplitude-ratio output metric to enhance electric field sensing performance and prevent mode aliasing. The primary measurement principle is based on the transmission of induced charge from the electric field sensing electrode to the perturbed electrode of Resonator 1 through an electrical connection. This perturbed electrode generates a negative electrostatic perturbation, inducing mode localization in the coupled resonators. The resulting change in the amplitude ratio enables electric field detection. 
Furthermore, the tuning fork and T-shaped tether structure are designed to minimize clamping and anchor losses, thereby achieving a high quality factor and effectively mitigating mode aliasing.  Results and Discussions  This study presents a mode-localized MEMS EFS that achieves a high quality factor of 42,423 and a high resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$, effectively preventing mode aliasing. Experiments are conducted in a vacuum chamber at a pressure of ${10}^{-3}\;\mathrm{Pa}$. The vacuum environment leads to heat accumulation from the amplifiers on the circuit board, increasing the board’s temperature and causing temperature drift in the sensor. Temperature drift is identified as the primary source of error in sensor testing. Future work will focus on testing the sensor chip with vacuum packaging to mitigate the temperature drift caused by the vacuum chamber, and on further optimizing the chip and circuit structures to minimize the effects of feedthrough and parasitic capacitance. Additionally, a differential structure will be designed to enhance common-mode rejection.  Conclusions  This study addresses mode aliasing in weakly coupled structures by proposing a mode-localized EFS based on a DETF and a T-shaped tether design. The DETF reduces clamping losses, while the T-shaped tether minimizes anchor losses. These structural optimizations reduce energy dissipation, enhance the quality factor, and effectively mitigate mode aliasing. The structural design, working principle, and sensitivity characteristics of the sensor are analyzed through numerical simulations, demonstrating that a lower quality factor under the same coupling strength can induce mode aliasing. The sensor fabrication process is introduced, and a prototype is developed. 
A testing system is established to evaluate the sensor’s performance in both open-loop and closed-loop configurations. Experimental results indicate that under a pressure of ${10}^{-3}\;\mathrm{Pa}$ and within an electric field range of $0 \sim 90\;\mathrm{kV/m}$, the sensor achieves a quality factor of 42,423, a resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$, and a sensitivity of $0.0336\;/(\mathrm{kV/m})$. The sensor demonstrates a high quality factor and excellent electric field resolution while effectively mitigating mode aliasing in mode-localized sensors. This work provides valuable insights for EFS research and the structural design of mode-localized sensors.
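The amplitude-ratio readout behind mode localization can be illustrated with a two-degree-of-freedom eigenvalue sketch: two identical spring-coupled resonators, one of which receives the electrostatic stiffness perturbation induced by the measured field. All stiffness values are illustrative (normalized, unit mass), not the sensor’s parameters.

```python
import numpy as np

def mode_amplitude_ratio(k, kc, dk):
    """Mode shapes of two spring-coupled resonators with unit mass.

    Resonator 1 carries the electrostatic stiffness perturbation dk;
    returns |x1/x2| for each of the two vibration modes.
    """
    K = np.array([[k + kc + dk, -kc],
                  [-kc,          k + kc]])
    _, vecs = np.linalg.eigh(K)      # columns are mode shapes
    return np.abs(vecs[0] / vecs[1])  # per-mode amplitude ratio

k, kc = 1.0, 0.01                    # weak coupling exaggerates localization
print(mode_amplitude_ratio(k, kc, 0.0))    # balanced system: both ratios are 1
print(mode_amplitude_ratio(k, kc, -1e-3))  # small perturbation skews the ratios
```

For a perturbation much smaller than the coupling stiffness, the ratio shift scales roughly as dk/(2·kc), which is why weak coupling makes the amplitude-ratio output far more sensitive than a frequency-shift readout, and why a low quality factor, by blurring the two closely spaced modes, can cause the mode aliasing the paper sets out to avoid.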