Latest Articles

Articles in press have been peer-reviewed and accepted; they have not yet been assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Radar High-speed Target Tracking via Quick Unscented Kalman Filter
SONG Jiazhen, SHI Zhuoyue, ZHANG Xiao-Ping, LIU Zhenyu
Available online, doi: 10.11999/JEIT250010
Abstract:
  Objective  The increasing prevalence of high-speed targets due to advancements in space technology presents new challenges for radar tracking. The pronounced motion of such targets within a single frame induces large variations in range, causing dispersion of echo energy across the range-Doppler plane and invalidating the assumption of concentrated target energy. This results in "range cell migration" and "Doppler cell migration", both of which degrade tracking accuracy. To address these challenges, this study proposes a Quick Unscented Kalman Filter (Q-UKF) algorithm tailored for high-speed radar target tracking. The Q-UKF performs recursive, pulse-by-pulse state estimation directly from radar echo signals, thereby improving tracking precision and eliminating the need for conventional energy correction and migration compensation. Furthermore, the algorithm employs the Woodbury matrix identity to reduce computational burden while preserving the estimation accuracy of the standard Unscented Kalman Filter (UKF).  Methods  The target state vector at each pulse time is modeled as a three-dimensional random vector representing position, velocity, and acceleration. Target motion is governed by a kinematic model that characterizes its temporal dynamics. A measurement model is formulated based on the radar echo signals received at each pulse, defining a nonlinear relationship between the target state and the observed measurements. This formulation supports recursive state estimation. In the classical UKF, the high dimensionality of radar echo data necessitates frequent inversion of large covariance matrices, imposing a substantial computational burden. To mitigate this issue, the Q-UKF is developed. By incorporating the Woodbury matrix identity, the Q-UKF reduces the computational complexity of matrix inversion without compromising estimation accuracy relative to the classical UKF. Within this framework, Q-UKF performs pulse-by-pulse recursive estimation, integrating all measurements up to the current pulse to improve prediction accuracy. In contrast to conventional radar tracking methods that process complete frame data and apply multiple signal correction steps, Q-UKF operates directly on raw measurements and avoids such corrections, thereby simplifying the processing pipeline. This efficiency makes Q-UKF well suited for real-time tracking of high-speed targets.  Results and Discussions  The performance of the proposed Q-UKF method is assessed using Monte Carlo simulations. Estimation errors of the Q-UKF and Extended Kalman Filter (EKF) are compared over time (Fig. 3). During the effective pulse periods within each frame cycle, both methods yield accurate target state estimates. Estimation errors increase during the delay intervals, but rapidly decrease and stabilize once effective pulse signals resume, forming a periodic error pattern. To evaluate robustness, the Root Mean Square Error (RMSE) of state estimation is examined under varied initial conditions, including different positions, velocities, and accelerations. In all scenarios, both Q-UKF and EKF perform reliably, with Q-UKF consistently demonstrating superior accuracy. Under Signal-to-Noise Ratios (SNRs) from –15 dB to 0 dB, the RMSEs in both Gaussian and Rayleigh noise environments (Fig. 5a and Fig. 5b) decrease with increasing SNR. Q-UKF maintains high accuracy even under low SNR conditions. 
In the Gaussian noise setting, Q-UKF improves estimation accuracy by an average of 10.60% relative to EKF; in the Rayleigh environment, the average improvement is 9.55%. In terms of computational efficiency, Q-UKF demonstrates the lowest runtime among the evaluated methods (EKF, UKF, and Particle Filter [PF]). The average computation time per effective pulse is reduced by 8.91% compared to EKF, 72.55% compared to UKF, and over 90% compared to PF (Table 2). This efficiency gain results from applying the Woodbury matrix identity, which alleviates the computational load of matrix inversion in high-dimensional radar echo data processing.  Conclusions  This study presents the Q-UKF method for high-speed target tracking in radar systems. The algorithm performs pulse-by-pulse state estimation directly from radar echo signals, advancing estimation granularity from the frame level to the pulse level. By removing the need for energy accumulation and migration correction, Q-UKF simplifies the conventional signal processing pipeline. The method incorporates the Woodbury matrix identity to efficiently invert covariance matrices, substantially reducing computational load. Simulation results show that Q-UKF consistently outperforms the EKF in estimation accuracy under varied initial target states, achieving an average improvement of approximately 10.60% under Gaussian noise and 9.55% under Rayleigh noise. Additionally, Q-UKF improves computational efficiency by 8.91% compared to EKF. Compared to the classical UKF, Q-UKF delivers equivalent accuracy with significantly reduced runtime. Although the PF may yield slightly better accuracy under certain conditions, its computational demand limits its practicality in real-time applications. Overall, Q-UKF provides a favorable balance between accuracy and efficiency, making it a viable solution for real-time tracking of high-speed targets. Its ability to address high-dimensional, nonlinear measurement problems also highlights its potential for broader application.
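For readers unfamiliar with the identity, the sketch below shows the standard Woodbury low-rank inversion that makes pulse-level filtering tractable when the per-pulse echo dimension is large and the state dimension is small. The dimensions, matrices, and diagonal-noise structure are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative Woodbury inversion of an innovation covariance S = R + H P H^T,
# where the echo dimension m is large, the state dimension n is small, and
# R = diag(r) is the noise covariance. Direct inversion costs O(m^3); the
# identity reduces the dense work to a single n x n inverse. All values here
# are hypothetical.
m, n = 2048, 3                          # echo samples per pulse, state size
rng = np.random.default_rng(0)
H = rng.standard_normal((m, n))         # linearized measurement matrix
P = np.diag([10.0, 1.0, 0.1])           # state covariance
r = np.full(m, 2.0)                     # noise variances, R = diag(r)

# Woodbury: S^{-1} = R^{-1} - R^{-1} H (P^{-1} + H^T R^{-1} H)^{-1} H^T R^{-1}
Rinv_H = H / r[:, None]                 # R^{-1} H, costs only O(mn)
core = np.linalg.inv(np.linalg.inv(P) + H.T @ Rinv_H)   # only an n x n inverse

def apply_S_inv(v):
    """Apply S^{-1} to a vector without ever forming the m x m inverse."""
    return v / r - Rinv_H @ (core @ (Rinv_H.T @ v))

v = rng.standard_normal(m)
direct = np.linalg.solve(np.diag(r) + H @ P @ H.T, v)    # brute-force check
assert np.allclose(apply_S_inv(v), direct)
```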
Priority-aware Per-flow Size Measurement in High-speed Networks
GAO Guoju, ZHOU Shaolong, SUN Yu-E, HUANG He
Available online, doi: 10.11999/JEIT240834
Abstract:
  Objective  Network traffic measurement is essential for supporting applications such as anomaly detection and capacity planning. With growing demand for flow-level analysis, traffic measurement technologies are facing increasing performance requirements. In typical network environments, a flow comprises packets sharing a common five-tuple (including source/destination IP address, source/destination port, and protocol). Measuring per-flow size presents three core challenges: high data volume, fast transmission rates, and limited on-chip memory. Sketch-based data structures offer an effective trade-off among memory efficiency, query speed, and measurement accuracy, and have been widely adopted for tasks such as per-flow size estimation, cardinality estimation, persistent flow detection, and burst detection. However, the rising need for differentiated traffic handling has highlighted the limitations of traditional Sketches, which treat all flows uniformly. Existing priority-aware Sketches often fail to maintain both high accuracy for high-priority flows and overall system throughput. To address this gap, this study proposes EssentialKeeper, a priority-aware algorithm that combines priority-sensitive hashing with Cuckoo Hashing. The proposed method ensures accurate measurement for high-priority flows while maintaining efficient system-wide performance. This approach supports differentiated traffic measurement in high-speed networks and contributes both theoretical insights and practical value.  Methods  In practical networks, different traffic types have distinct requirements for measurement accuracy. For example, suspicious or malicious flows require high-precision measurement for security monitoring, whereas latency-sensitive services such as real-time video streaming demand continuous tracking to maintain service quality. To accommodate these varying demands, several priority-aware Sketch algorithms have been proposed. These typically partition memory into high- and low-priority regions, assigning different levels of accuracy according to flow priority. All incoming traffic first passes through the high-priority region, where high-priority flows are retained, whereas others are redirected with degraded measurement accuracy. This architecture, however, presents performance challenges. Because low-priority flows constitute the majority of network traffic, they still traverse the high-priority region, incurring additional hash computations and memory access overhead. This overhead substantially lowers throughput. Algorithms such as MC-Sketch and Cuckoo Sketch are particularly affected. Although PA-Sketch introduces priority-aware hashing to reduce the processing load for low-priority flows, it compromises measurement accuracy for medium-priority flows, limiting its practical utility. To address these limitations, this study proposes EssentialKeeper, a new Sketch algorithm for efficient priority-aware traffic measurement under constrained memory conditions. The algorithm combines priority-aware hashing with Cuckoo Hashing. For high-priority flows, it dynamically allocates more hash functions and candidate buckets, using Cuckoo hashing’s "kick-out and relocate" mechanism to enhance measurement precision. For low-priority flows, it employs an optimized Count-Sketch (CS-Sketch) structure to ensure fast processing. This hybrid design sustains high throughput while ensuring accurate tracking of high-priority traffic, thereby resolving the speed–accuracy trade-off that limits existing approaches.  
Results and Discussions  This study evaluates EssentialKeeper using the real-world CAIDA-2019 traffic dataset and a network interaction dataset derived from Stack Overflow. Performance is assessed under different priority allocation strategies—random and size-based—and across a range of memory configurations. Optimal algorithm parameters are determined through systematic tuning (Figs. 3–5). Compared with existing priority-aware Sketches, EssentialKeeper demonstrates substantial improvements across three key metrics. Under the random priority allocation strategy, the average relative error for high-priority flows decreases by 63.2%, while the F1-score increases by 14.8% (Fig. 6, Fig. 7). With size-based priority allocation, the error is reduced by 53.8%, and the F1-score improves by 11.8% (Fig. 8, Fig. 9). Additionally, EssentialKeeper achieves a 10.8% increase in throughput (Fig. 10), while maintaining lower memory overhead. These results highlight the effectiveness of EssentialKeeper in supporting accurate and efficient priority-aware traffic measurement in high-speed network environments.  Conclusions  This study proposes EssentialKeeper, a novel algorithm for priority-aware traffic measurement in high-speed networks. By enhancing the structure of existing priority-aware Sketches, the algorithm enables accurate, differentiated measurement based on flow priority. It combines the efficient conflict resolution of Cuckoo Hashing with the adaptive precision of priority-aware hashing, thereby improving measurement accuracy for high-priority flows while sustaining high throughput for low-priority traffic. Experimental results demonstrate that EssentialKeeper reduces the average relative error of high-priority flows by 58.5%, increases the F1-score by 13.3%, and improves overall system throughput by 10.8% compared to the best existing approaches, achieving a favorable trade-off between speed and accuracy. Despite these advances, several challenges remain. One is the integration with sampling algorithms. Since high-priority flows often carry more critical information, future work could explore dynamic sampling strategies that retain high-priority packets while selectively discarding lower-priority traffic. This hybrid approach may further reduce system overhead without compromising measurement precision. Another direction is task generalization. Beyond per-flow size and cardinality estimation, other core measurement tasks—such as persistent flow detection and burst detection—may benefit from priority-aware techniques. Extending EssentialKeeper to support these applications would broaden its utility. Finally, current experiments are conducted in a CPU-based environment. However, practical deployment in production networks may require adaptation to hardware platforms such as P4 switches or FPGAs, which impose tighter resource constraints. Future research should focus on implementing and optimizing priority-aware Sketch algorithms for hardware deployment to assess feasibility and facilitate real-world adoption.
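As a concrete illustration of the "kick-out and relocate" mechanism with priority-dependent candidate buckets, the toy sketch below performs Cuckoo-style insertion in which high-priority flows hash to more buckets. The table size, hash choice, eviction bound, and fallback are all assumptions for illustration, not EssentialKeeper's actual layout.

```python
import random

NUM_BUCKETS = 1024
MAX_KICKS = 32
buckets = [None] * NUM_BUCKETS          # each slot holds (flow_key, count, priority)

def candidate_buckets(key, priority):
    k = 4 if priority == "high" else 2  # high-priority flows get more candidates
    return [hash((key, i)) % NUM_BUCKETS for i in range(k)]

def insert(key, priority, count=1):
    entry = (key, count, priority)
    for _ in range(MAX_KICKS):
        k, c, p = entry
        cands = candidate_buckets(k, p)
        for b in cands:
            if buckets[b] is None:
                buckets[b] = entry                      # empty slot: place entry
                return True
            if buckets[b][0] == k:
                buckets[b] = (k, buckets[b][1] + c, p)  # same flow: add to counter
                return True
        b = random.choice(cands)                        # all full: kick out a victim
        entry, buckets[b] = buckets[b], entry           # ...and try to relocate it
    return False   # chain too long; a real sketch would fall back to the CS part

insert("10.0.0.1:443->10.0.0.2:5001/tcp", "high")
```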
A 2 pJ/bit, 4×112 Gbps PAM4 linear driver for MZM in LPO Application
ZHANG Shuan, ZHU Wenrui, GU Yuandong, LEI Meng, ZHANG Jianling
Available online, doi: 10.11999/JEIT250176
Abstract:
  Objective  The rapid increase in data transmission demands, driven by big data, cloud computing, and Artificial Intelligence (AI), requires advanced optical module technologies capable of supporting higher data rates, such as 800 Gbps. Conventional optical modules depend on power-intensive Digital Signal Processors (DSPs) for signal compensation, which increases cost, complexity, and energy consumption. This study addresses these limitations by proposing a Linear-drive Pluggable Optics (LPO) solution that eliminates the DSP while preserving high performance. The primary objective is to design a low-power, high-efficiency Mach–Zehnder Modulator (MZM) driver using 130 nm SiGe BiCMOS technology for 400 Gbps PAM4 applications. The design integrates Continuous-Time Linear Equalization (CTLE) and gain control to support reliable, cost-effective, and energy-efficient data transmission.  Methods  The proposed quad-channel MZM driver adopts a two-stage architecture: a merged Continuous-Time Linear Equalizer (CTLE) and Variable Gain Amplifier (VGA) stage (Stage 1), and an output driver (OUTDRV) stage (Stage 2). By integrating CTLE and VGA functions (Fig. 3), the design removes the pre-driver stage, improves current reuse, and enhances drive capability. Stage 1 employs a Gilbert cell-based core amplifier (Fig. 5a) with programmable peaking via Re and Ce, enabling a transfer function with adjustable gain (η) and peaking characteristics (Eq. 1). A novel low-frequency gain adjustment branch (Fig. 6) mitigates nonlinearity induced by conductor loss (Fig. 4), resulting in a flattened frequency response (Eq. 2). Stage 2 uses a cascode open-drain output structure to achieve a 3 Vppd swing at 56 Gbaud while reducing power consumption. Simulations and measurements confirm the design’s performance, with key metrics including S-parameters, Total Harmonic Distortion (THD), and Transmitter Dispersion Eye Closure for PAM4 (TDECQ).  Results and Discussions  The driver achieves a maximum gain of 19.49 dB with 9.2 dB peaking and a 12.57 dB gain control range. Measured S-parameters (Fig. 9a–b) confirm the 19.49 dB gain, 47 GHz bandwidth, and a 4.4 dB programmable peaking range. The low-frequency adjustment circuit reduces gain by 1.6 dB below 3 GHz (Fig. 9c), effectively compensating for distortion caused by the skin effect. THD remains below 3.5% across input swings of 300–800 mVppd (Fig. 10). Eye diagrams (Fig. 11) demonstrate 56 Gbaud PAM4 operation, achieving a 3 Vppd output swing with TDECQ below 2.57 dB. The driver achieves a power efficiency of 2 pJ/bit (225.23 mW per channel), outperforming previous designs (Table 1). The use of a single 3.3 V supply eliminates the need for external DC–DC converters, facilitating system integration. Compared with recent drivers [11,14–16], this work demonstrates the highest data rate (112 Gb/s via PAM4) implemented in a mature 130 nm process while maintaining the lowest power consumption per bit.  Conclusions  This study presents a high-performance, energy-efficient MZM driver designed for LPO-based 400 Gbps optical modules. Key contributions include the merged CTLE–VGA architecture for optimized current reuse, a low-frequency gain adjustment technique that mitigates skin effect distortion, and a cascode output stage that achieves high swing and linearity. Measured results are consistent with simulations, confirming 19.49 dB gain, 3 Vppd output swing, and 2 pJ/bit energy efficiency.
The elimination of DSPs, compatibility with cost-effective BiCMOS technology, and improved power performance highlight the driver’s potential for deployment in next-generation data centers and high-speed optical interconnects.
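The headline efficiency figure can be sanity-checked directly from the numbers quoted above (225.23 mW per channel at 56 Gbaud PAM4, i.e. 112 Gb/s); the computation below is just that arithmetic.

```python
# Energy per bit = per-channel power / per-channel data rate
# (values quoted in the abstract: 225.23 mW, 56 Gbaud PAM4 = 112 Gb/s).
print(225.23e-3 / 112e9 * 1e12, "pJ/bit")   # ~2.01 pJ/bit, the quoted 2 pJ/bit
```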
Siamese Network-assisted Multi-domain Feature Fusion for Radar Active Jamming Recognition Method
LI Ning, WANG Zan, SHU Gaofeng, ZHANG Tingwei, GUO Zhengwei
Available online, doi: 10.11999/JEIT240797
Abstract:
  Objective  The rapid development of electronic warfare technology has introduced complex scenarios in which active jamming presents considerable challenges to radar systems. On modern battlefields, the electromagnetic environment is highly congested, and various forms of active jamming signals frequently disrupt radar functionality. Although existing recognition algorithms can identify certain types of radar active jamming, their performance declines under low Jamming-to-Noise Ratio (JNR) conditions or when training data are scarce. Low JNR reduces the detectability of jamming signals by conventional methods, and limited sample size further constrains recognition accuracy. To address these challenges, neural network-based methods have emerged as viable alternatives. This study proposes a radar active jamming recognition approach based on multi-domain feature fusion assisted by a Siamese network, which enhances recognition capability under low JNR and small-sample conditions. The proposed method offers an intelligent framework for improving jamming recognition in complex environments and provides theoretical support for battlefield awareness and the design of effective counter-jamming strategies.  Methods  The proposed method comprises a multi-domain feature fusion subnetwork, a Siamese architecture, and a joint loss design. To extract jamming features effectively under low JNR conditions, a multi-domain feature fusion subnetwork is developed. Specifically, a semi-soft thresholding shrinkage module is proposed by integrating a semi-soft threshold function with an attention mechanism. This module efficiently extracts time-domain features and eliminates the limitations of manual threshold selection. To enhance the extraction of time-frequency domain features, a multi-scale convolution module and an additional attention mechanism are incorporated. To reduce the model’s dependence on large training datasets, a weight-sharing Siamese network is constructed. By comparing similarity between sample pairs, this network increases the number of training iterations, thereby mitigating the limitations imposed by small sample sizes. Finally, three loss functions are jointly applied: an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss. This joint strategy promotes intra-class compactness and inter-class separability of jamming features.  Results and Discussions  When the number of training samples is limited (Table 6), the proposed method achieves an accuracy of 96.88% at a JNR of –6 dB with only 20 training samples, indicating its effectiveness under data-scarce conditions. With further reduction in sample size—specifically, when only 15 training samples are available per jamming type—the recognition performance of other methods declines substantially. In contrast, the proposed method maintains higher recognition accuracy, demonstrating enhanced stability and robustness under low JNR and limited sample conditions. This performance advantage is attributable to three key factors: (1) Multi-domain feature fusion integrates jamming features from multiple domains, preventing the loss of discriminative information commonly observed under low JNR conditions. (2) The weight-sharing Siamese network increases the number of effective training iterations by evaluating sample similarities, thereby mitigating the limitations associated with small datasets. 
(3) The combined use of an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss promotes intra-class compactness and inter-class separability of jamming features, enhancing the model’s generalization capability.  Conclusions  This study proposes a radar active jamming recognition method that performs effectively under low JNR and limited training sample conditions. A multi-domain feature fusion subnetwork is developed to extract representative features from both the time and time-frequency domains, enabling a more comprehensive and discriminative characterization of jamming signals. A weight-sharing Siamese network is then introduced to reduce reliance on large training datasets by leveraging sample similarity comparisons to expand training iterations. In addition, three loss functions—an improved weighted contrastive loss, an adaptive cross-entropy loss, and a triplet loss—are jointly applied to promote intra-class compactness and inter-class separability. Experimental results validate the effectiveness of the proposed method. At a low JNR of –6 dB with only 20 training samples, the method achieves a recognition accuracy of 96.88%, demonstrating its robustness and adaptability in challenging electromagnetic environments. These findings provide technical support for the development of anti-jamming strategies and enhance the operational reliability of radar systems in complex battlefield scenarios.
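A minimal sketch of the three-part objective described above is given below, combining a weighted contrastive term, a cross-entropy term, and a triplet term. The margins, term weights, and the paper's "improved"/"adaptive" refinements are placeholders, not the published formulation.

```python
import torch
import torch.nn.functional as F

# Sketch of a joint loss: contrastive (on Siamese pairs) + cross-entropy
# (on class logits) + triplet (on anchor/positive/negative embeddings).
# Margins m_c, m_t and weights w are illustrative hyperparameters.
def joint_loss(emb_a, emb_b, pair_label, logits, labels,
               anchor, positive, negative,
               m_c=1.0, m_t=0.5, w=(1.0, 1.0, 1.0)):
    d = F.pairwise_distance(emb_a, emb_b)
    # contrastive: pull same-class pairs together, push others past margin m_c
    l_con = (pair_label * d.pow(2)
             + (1 - pair_label) * F.relu(m_c - d).pow(2)).mean()
    l_ce = F.cross_entropy(logits, labels)                 # classification term
    l_tri = F.triplet_margin_loss(anchor, positive, negative, margin=m_t)
    return w[0] * l_con + w[1] * l_ce + w[2] * l_tri
```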
Evolutionary Optimization for Satellite Constellation Task Scheduling Based on Intelligent Optimization Engine
DU Yonghao, LI Lei, XU Shilong, CHEN Ming, CHEN Yingguo
Available online, doi: 10.11999/JEIT240974
Abstract:
  Objective  The expansion of China’s aerospace capabilities has led to the widespread deployment of remote sensing satellites for applications such as land resource surveys and disaster monitoring. However, current methods face substantial challenges in the integrated scheduling of complex targets, including multi-frequency observations, dense point clusters, and wide-area imaging. This study develops an intelligent task planning engine architecture tailored for heterogeneous satellite constellations. By applying advanced modeling and evolutionary optimization techniques, the proposed framework addresses the collaborative scheduling of multi-dimensional targets, aiming to overcome key limitations in traditional satellite mission planning.  Methods  Through systematic analysis of models and algorithms, this study decouples the “Constraint-Decision-Reward” framework and develops an optimization algorithm module featuring “global evolution + local search + data-driven” strategies. At the modeling level, standard tasks are derived via target decomposition, and a multi-dimensional scheduling model for complex targets is established. At the algorithmic level, a Learning Memetic Algorithm (LMA) based on dual-model evolution is proposed. This approach incorporates strategies for initial solution generation, global optimization, and a generalized neighborhood search operator template to improve solution diversity and enhance global exploration capabilities. Additionally, data-driven optimization and dynamic multi-stage rapid insertion strategies are introduced to address real-time scheduling requirements.  Results and Discussions   Comprehensive experimental comparisons are conducted across three scenario scales—low, medium, and high difficulty—and three task planning scenarios (static scheduling, dynamic three-stage scheduling, and dynamic twelve-stage scheduling). Both classical and advanced algorithms are evaluated. Ablation experiments (Tables 4 and 5) assess the contribution of each component within the LMA. In all task scenarios, the proposed method consistently outperforms advanced algorithms, including adaptive large neighborhood search and the reinforcement learning genetic algorithm, as shown in Figure 11 and Table 3. The algorithm reliably completes iterations within 20 seconds, demonstrating high computational efficiency.  Conclusions  By standardizing complex targets and generating tasks, this research effectively addresses the integrated scheduling challenge of multi-dimensional objectives across heterogeneous resources. Experimental results show that the LMA outperforms traditional algorithms in terms of both solution quality and computational efficiency. The dual-model evolution mechanism enhances the algorithm’s global search capabilities, while the dynamic insertion strategy effectively handles scenarios with dynamically arriving tasks. These innovations highlight the algorithm’s significant advantages in aerospace mission scheduling.
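The "global evolution + local search" loop at the heart of a memetic algorithm can be sketched generically as below. The selection scheme, operator signatures, and parameters are illustrative; the paper's LMA additionally layers dual-model evolution and data-driven operator selection on top of this skeleton.

```python
import random

# Generic memetic-search skeleton: evolutionary variation (crossover +
# mutation) for global exploration, plus a local-search refinement of each
# offspring. `fitness`, `crossover`, `mutate`, and `local_search` are
# problem-specific callables supplied by the user (signatures assumed here).
def memetic_search(init_pop, fitness, crossover, mutate, local_search,
                   generations=100, elite=2):
    pop = sorted(init_pop, key=fitness, reverse=True)      # maximize fitness
    for _ in range(generations):
        offspring = []
        while len(offspring) < len(pop) - elite:
            p1, p2 = random.sample(pop[: len(pop) // 2], 2)  # truncation selection
            child = mutate(crossover(p1, p2))                # global evolution
            offspring.append(local_search(child, fitness))   # memetic refinement
        pop = sorted(pop[:elite] + offspring, key=fitness, reverse=True)
    return pop[0]
```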
An Uncertainty-driven Pixel-level Adversarial Noise Detection Method for Remote Sensing Images
YAO Xudong, GUO Yaping, LIU Mengyang, MENG Gang, LI Yang, ZHANG Haopeng
Available online, doi: 10.11999/JEIT241157
Abstract:
  Objective  The development of remote sensing technology has expanded its range of applications. However, during image acquisition and transmission, various factors can introduce noise that reduces image quality and clarity, affecting the extraction of ground object information. In particular, adversarial noise poses serious security risks, as it compromises the robustness of intelligent algorithms and may lead to decision failures. Evaluating the accuracy and reliability of remote sensing image data is therefore essential, highlighting the need for dedicated adversarial noise detection methods. Existing adversarial defense strategies primarily detect adversarial samples generated by specific attack methods, but these approaches often exhibit high computational cost, limited transferability, and lack pixel-level detection capabilities. In large-scale remote sensing images, adversarial noise is typically concentrated in key local regions containing ground objects. To address these limitations, this study proposes an uncertainty-driven, pixel-level adversarial noise detection method for remote sensing images. The method integrates adversarial noise characteristic analysis with uncertainty modeling, enabling precise localization of adversarial noise and improving the reliability of remote sensing applications.  Methods  To address the limitations of existing adversarial sample detection algorithms, an uncertainty-driven pixel-level adversarial noise detection method is proposed. The approach uses Monte Carlo Batch Normalization (MCBN) for uncertainty modeling and exploits the typically high uncertainty of adversarial noise to enable pixel-level detection. In deep neural networks, inference based on the stochasticity of the batch mean and variance in Batch Normalization (BN) layers is theoretically equivalent to variational inference in Bayesian models. This enables pixel-wise uncertainty estimation without modifying the network architecture or training process. In general, high-frequency regions such as edges exhibit greater uncertainty. In adversarial samples, however, artificially altered texture details introduce abnormal uncertainty. The uncertainty in these regions increases with the intensity of the adversarial noise. The proposed method comprises three main components: a feature extraction network, adversarial sample identification, and pixel-level adversarial noise detection. The input image is processed by a feature extraction network with BN layers to generate multiple Monte Carlo samples. The mean of these samples is treated as the reconstructed image, and the standard deviation is used to generate the uncertainty map. To identify adversarial samples, the algorithm calculates the Mean Squared Error (MSE) between the reconstructed image and the input image. If the image is classified as adversarial, the corresponding uncertainty map is further used to localize adversarial noise at the pixel level.  Results and Discussions  The experimental evaluation first quantifies the performance of the proposed method in adversarial sample detection and performs a comparative analysis with existing approaches. It also examines the effectiveness of pixel-level adversarial noise detection from both quantitative and qualitative perspectives. Experimental results show that the proposed algorithm achieves high detection performance and strong adaptability to various adversarial attacks, with robust generalization capability. 
Specifically, the method maintains detection accuracy above 0.87 against adversarial samples generated by four attack algorithms—FGSM, BIM, DeepFool, and AdvGAN—indicating consistent generalization across different adversarial methods. Although adversarial samples generated by DeepFool exhibit higher visual imperceptibility, the proposed method sustains stable performance across all evaluation metrics. This robustness highlights its adaptability even to potential unknown adversarial attacks. To further evaluate its effectiveness, the method is compared with existing adversarial sample detection algorithms, including MAD, PACA, E2E-Binary, and DSADF. The results indicate that the proposed method achieves competitive results in accuracy, precision, recall, and F1-score, reflecting strong overall performance in adversarial sample detection. For adversarial samples, the method also performs pixel-level adversarial noise detection. Results confirm its effectiveness in identifying various types of adversarial noise, with high accuracy in localizing noise within specific regions, such as baseball fields and storage tanks. It successfully detects most noise-affected areas in remote sensing images. However, complex textures and high-frequency details in some background regions cause increased uncertainty, which may lead to false positives, with non-adversarial regions misclassified as adversarial noise. Despite this limitation, the method maintains high overall detection accuracy and a low false negative rate, supporting its practical value in high-security applications.  Conclusions  To address the limitations of existing adversarial noise detection algorithms, this study proposes an uncertainty-driven pixel-level detection method for remote sensing images. The approach integrates MCBN into the feature extraction network to generate multiple Monte Carlo samples. The sample mean is used as the reconstructed image, while the sample standard deviation provides uncertainty modeling. The method determines whether an image is adversarial based on the difference in MSE between clean and adversarial samples, and the uncertainty map is utilized to localize adversarial noise at the pixel level across various attack scenarios. Experiments are conducted using the publicly available DIOR dataset, with adversarial samples generated by four representative attack algorithms: FGSM, BIM, DeepFool, and AdvGAN. Quantitative and qualitative evaluations confirm the method’s effectiveness in detecting adversarial noise at the pixel level and demonstrate strong generalization across attack types. The ability to localize noise improves the transparency and interpretability of adversarial sample identification, supporting more informed and targeted mitigation strategies. Despite its strong performance, the method currently relies solely on uncertainty estimation and thresholding for segmentation, which may result in misclassification in regions with complex textures or high-frequency details. Future research will explore the integration of uncertainty modeling with additional features to improve detection accuracy in such regions.
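A minimal PyTorch-style sketch of the MCBN procedure described above: BatchNorm layers are kept stochastic at inference, each forward pass borrows batch statistics from a fresh reference batch, and the Monte Carlo mean and standard deviation yield the reconstruction and uncertainty map. The helper name, the number of passes T, and the MSE threshold tau are assumptions, and `model` is assumed to output a reconstruction with the input's shape.

```python
import torch

# MCBN-style uncertainty estimation: only BatchNorm layers are put back in
# training mode, so batch statistics vary across stochastic forward passes.
@torch.no_grad()
def mcbn_detect(model, x, loader, T=20, tau=0.01):
    model.eval()
    for m in model.modules():                      # re-enable stochastic BN only
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.train()
    outs = []
    for batch, _ in zip(loader, range(T)):         # one reference batch per pass
        ref = batch[0] if isinstance(batch, (list, tuple)) else batch
        joint = torch.cat([x, ref.to(x.device)], dim=0)  # BN stats from x + ref
        outs.append(model(joint)[: x.shape[0]])
    outs = torch.stack(outs)                       # [T, B, C, H, W]
    recon, uncert = outs.mean(0), outs.std(0)      # reconstruction / uncertainty map
    is_adv = torch.mean((recon - x) ** 2) > tau    # image-level decision via MSE
    return is_adv, uncert                          # uncert localizes noise per pixel
```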
Geometrically Consistent Based Neural Radiance Field for Satellite City Scene Rendering and Digital Surface Model Generation in Sparse Viewpoints
SUN Wenbo, GAO Zhi, ZHANG Yichen, ZHU Jun, LI Yanzhang, LU Yao
Available online, doi: 10.11999/JEIT240898
Abstract:
  Objective   Satellite-based Earth observation enables global, continuous, multi-scale, and multi-dimensional surface monitoring through diverse remote sensing techniques. Recent progress in 3D modeling and rendering has driven the widespread adoption of Neural Radiance Fields (NeRF), owing to their continuous-view synthesis and implicit geometry representation. Although NeRF performs robustly in areas such as autonomous driving and large-scale scene reconstruction, its direct application to satellite observation scenarios remains limited. This limitation arises primarily from the nature of satellite imaging, which often lacks the tens or hundreds of viewpoints typically required for NeRF training. Under sparse-view conditions, NeRF tends to overfit the available training perspectives, leading to poor generalization to novel viewpoints.   Methods   To address the performance limitations of NeRF under sparse-view conditions, this study proposes an approach that introduces geometric constraints on scene depth and surface normals during model training. These constraints are designed to compensate for the lack of prior knowledge inherent in sparse-view satellite imagery and to improve rendering and Digital Surface Model (DSM) generation. The approach leverages the importance of scene geometry in both novel view synthesis and DSM generation, particularly in accurately representing spatial structures through DSMs. To mitigate the degradation in NeRF performance under limited viewpoint conditions, the geometric relationships between scene depth and surface normals are formulated as loss functions. These functions enforce consistency between estimated depth and surface orientation, enabling the model to learn more reliable geometric features despite limited input data. The proposed constraints guide the model toward generating geometrically coherent and realistic scene reconstructions.   Results and Discussions   The proposed method is evaluated on the DFC2019 dataset to assess its effectiveness in novel view synthesis and DSM generation under sparse-view conditions. Experimental results demonstrate that the NeRF model with geometric constraints achieves superior performance across both tasks, confirming its applicability to satellite observation scenarios with limited viewpoints. For novel view synthesis, model performance is assessed using 2, 3, and 5 input images. The proposed method consistently outperforms existing approaches across all configurations. In the JAX 004 scene, Peak Signal-to-Noise Ratio (PSNR) values of 21.365 dB, 21.619 dB, and 23.681 dB are achieved under the 2-view, 3-view, and 5-view settings, respectively. Moreover, the method exhibits the smallest degradation in PSNR and Structural Similarity Index (SSIM) as the number of training views decreases, indicating greater robustness under sparse input conditions. Qualitative results further confirm that the method yields sharper and more detailed renderings across all view configurations. For DSM generation, the proposed method achieves comparable or better performance relative to other NeRF-based approaches in most test scenarios. In the JAX 004 scene, Mean Absolute Error (MAE) values of 2.414 m, 2.198 m, and 1.602 m are obtained under the 2-view, 3-view, and 5-view settings, respectively. Qualitative assessments show that the generated DSMs exhibit clearer structural boundaries and finer geometric details compared to those produced by baseline methods.
Conclusions   Incorporating geometric consistency constraints between scene depth and surface normals enhances the model’s ability to capture the spatial structure of objects in satellite imagery. The proposed method achieves state-of-the-art performance in both novel view synthesis and DSM generation tasks under sparse-view conditions, outperforming both NeRF-based and traditional Multi-View Stereo (MVS) approaches.
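One common way to encode a depth-normal consistency constraint, sketched below, is to derive normals from the rendered depth map by finite differences and penalize their disagreement with the predicted normals. Camera intrinsics are folded into the pixel spacing for brevity; this generic form illustrates the idea and is not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

# Depth-to-normal consistency: normals obtained from depth gradients should
# agree (cosine similarity) with the normals the model predicts.
def depth_normal_consistency(depth, pred_normals):
    # depth: [B, 1, H, W]; pred_normals: [B, 3, H, W] (unit vectors)
    dzdx = depth[..., :, 1:] - depth[..., :, :-1]      # horizontal gradient
    dzdy = depth[..., 1:, :] - depth[..., :-1, :]      # vertical gradient
    dzdx = F.pad(dzdx, (0, 1, 0, 0))                   # restore H x W shape
    dzdy = F.pad(dzdy, (0, 0, 0, 1))
    n = torch.cat([-dzdx, -dzdy, torch.ones_like(depth)], dim=1)
    n = F.normalize(n, dim=1)                          # normals from depth
    return (1 - (n * pred_normals).sum(dim=1)).mean()  # 1 - cosine similarity
```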
A Novel Earth Surface Anomaly Detection Method Based on Collaborative Reasoning of Deep Learning and Remote Sensing Indexes
WANG Libo, GAO Zhi, WANG Qiao
Available online, doi: 10.11999/JEIT240882
Abstract:
  Objective  Earth Surface Anomalies (ESAs) refer to geographical phenomena that deviate from the normal state. They are characterized by wide distribution, high occurrence frequency, rapid evolution, and a large impact range. In recent years, sudden surface anomalies have occurred frequently, making remote sensing surface anomaly detection a prominent research topic. Although deep learning-based anomaly detection methods have made substantial progress, they still face two challenges: (1) limited learning ability under conditions of few samples, and (2) unreliable reasoning when identifying surface anomaly scenes with high inter-class similarity. To address these challenges, a novel surface anomaly detection method, DeepIndex, is proposed. This method leverages prior knowledge from large vision-language models to enhance few-sample learning and integrates remote sensing indexes to improve the reliability of identifying complex and similar surface anomaly scenes.  Methods  A novel scheme of “large-scale pre-trained foundational model + efficient fine-tuning” is employed to construct the entire network and implement training, thereby enabling efficient learning of surface anomaly features under conditions with few samples. Specifically, the foundational vision-language model, Contrastive Language-Image Pretraining (CLIP), is selected as the backbone of DeepIndex, with an efficient fine-tuning module developed to enhance few-sample learning. Leveraging the vision-language structure, DeepIndex can simultaneously encode image and text features, with the output category determined by text input, granting it open-set classification capability. Furthermore, DeepIndex innovatively integrates remote sensing indexes and physical mechanisms into the reasoning process, improving both interpretability and generalization performance. Specifically, DeepIndex first computes remote sensing indexes and applies an adaptive threshold segmentation method to generate binary segmentation maps. These maps are then processed to output the area ratio of the anomalous region. Based on the area ratio (with a default threshold of 0.1), potential surface anomaly categories are identified. The classification weights of these potential categories are then increased by 20%. Finally, DeepIndex uses the increased weights for classification, improving the identification of surface anomaly scenes with high inter-class similarity and enhancing reasoning reliability. Notably, DeepIndex increases weights only for categories with lower original confidence (<0.5), achieving a balance between regular and confused samples for stable classification. In summary, DeepIndex utilizes vision-language representation learning to develop a collaborative reasoning framework that integrates remote sensing indexes for surface anomaly detection. This framework improves the deep network’s reasoning capabilities and realizes the complementary advantages of deep learning and remote sensing indexes.  Results and Discussions  The effectiveness and superiority of the proposed DeepIndex are demonstrated using a self-constructed dataset, MultiSpectral Earth Surface Anomaly Detection (MS-ESAD), and the public dataset, NWPU45. The MS-ESAD dataset is challenging, containing 2,768 multispectral remote sensing images across six bands (red, green, blue, near-infrared, and two short-wave infrared bands) and three types of surface anomalies (wildfire, green tide, and blue algae). This dataset provides a foundation for surface anomaly detection research.
For evaluation, class Average Accuracy (AA) and Overall Accuracy (OA) metrics are used for both datasets. The ablation study (Tables 2 and 3) shows that the proposed DeepIndex collaborative reasoning framework significantly enhances zero-shot classification performance (by 9.84%) and improves the identification of confusing samples (by 7.39%). Quantitative and qualitative comparisons (Fig. 4, Table 4) further illustrate that DeepIndex achieves the best class AA (92.36%), which is 3.38% higher than the classic convolutional neural network ResNet and 0.42% higher than ViT. Additionally, compared to recent remote sensing scene classification networks, DeepIndex demonstrates more stable performance, owing to the integration of remote sensing index priors. For the NWPU45 dataset, experimental results (Fig. 5, Table 5) further highlight the advantages of DeepIndex under conditions with few samples (10% and 20% for training). Compared with advanced remote sensing image scene classification methods (e.g., EMSCNet) from the past two years, DeepIndex shows a slight accuracy advantage of 0.17% and 0.31%, respectively. These results demonstrate the strong application potential of DeepIndex for remote sensing image scene classification tasks, especially with limited training samples.  Conclusions  This paper combines physically constrained remote sensing indexes with deep networks and proposes a collaborative reasoning deep framework for Earth surface anomaly detection, named DeepIndex. Through large-scale pre-training and adaptive fine-tuning strategies, DeepIndex effectively learns highly generalized features from scarce samples. Additionally, DeepIndex adopts a unique reasoning pattern that utilizes remote sensing index priors to assist network discrimination, enhancing its ability to recognize complex and ambiguous surface anomaly scenes. Furthermore, this paper constructs a multispectral surface anomaly dataset that provides valuable data support for related research. The experimental results demonstrate that the integration of remote sensing indexes significantly improves classification performance under conditions with limited training samples. Compared with other advanced remote sensing scene classification methods, DeepIndex shows notable advantages in both accuracy and stability.
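The index-guided reasoning step lends itself to a compact sketch: compute a remote sensing index, segment it with an adaptive threshold, and boost low-confidence candidate classes when the anomalous-area ratio exceeds 0.1. The 0.1 ratio, 20% boost, and <0.5 confidence guard come from the abstract; the NDVI index and Otsu thresholding are illustrative stand-ins for the paper's actual indexes and segmentation method.

```python
import numpy as np
from skimage.filters import threshold_otsu

# Index-guided class-weight adjustment (illustrative). `nir` and `red` are
# 2-D band arrays, `probs` the classifier's probability vector, `class_idx`
# the anomaly class associated with the index.
def index_guided_probs(nir, red, probs, class_idx):
    ndvi = (nir - red) / (nir + red + 1e-8)        # example index (assumed)
    mask = ndvi > threshold_otsu(ndvi)             # adaptive threshold segmentation
    if mask.mean() > 0.1:                          # anomalous-area ratio check
        if probs[class_idx] < 0.5:                 # boost only low-confidence classes
            probs = probs.copy()
            probs[class_idx] *= 1.2                # 20% weight increase
            probs /= probs.sum()                   # renormalize
    return probs
```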
FSG: Feature-level Semantic-aware Guidance for multi-modal Image Fusion Algorithm
ZHANG Mei, JIN Ye, ZHU Jinhui, HE Lin
Available online, doi: 10.11999/JEIT250042
Abstract:
  Objective  Multimodal vision techniques offer greater advantages than unimodal ones in autonomous driving scenarios. Fused images from multiple modalities enhance salient radiation information from targets while preserving background texture and detail. Furthermore, such fused images improve the performance of downstream visual tasks, such as semantic segmentation, compared with visible-light images alone, thereby enhancing the decision accuracy of automated driving systems. However, most existing fusion algorithms prioritize visual quality and standard evaluation metrics, often overlooking the requirements of downstream tasks. Although some approaches attempt to integrate task-specific guidance, they are constrained by weak interaction between semantic priors and fusion processes, and fail to address cross-modal feature variability. To address these limitations, this study proposes a multimodal image fusion algorithm, termed Feature-level Semantic-aware Guidance (FSG), which leverages feature-level semantic information from segmentation networks to guide the fusion process. The proposed method aims to enhance the utility of fused images in advanced vision tasks by strengthening the alignment between semantic understanding and feature integration.  Methods  The proposed algorithm adopts a parallel fusion framework integrating a fusion network and a segmentation network. Feature-level semantic prior knowledge from the segmentation network guides the fusion process, aiming to enhance the semantic richness of the fused image and improve performance in downstream visual tasks. The overall architecture comprises a fusion network, a segmentation network, and a feature interaction mechanism connecting the two. Infrared and visible images serve as inputs to the fusion network, whereas only visible images, which are rich in texture and detail, are used as inputs to the segmentation network. The fusion network uses a dual-branch structure for modality-specific feature extraction, with each branch containing two Adaptive Gabor Convolution Residual (AGR) modules. A Multimodal Spatial Attention Fusion (MSAF) module is incorporated to effectively integrate features from different modalities. In the reconstruction phase, semantic features from the segmentation network are combined with image features from the fusion network via a Dual Feature Interaction (DFI) module, enhancing semantic representation before generating the final fused image.  Results and Discussions  This study includes fusion experiments and joint segmentation task experiments. For the fusion experiments, the proposed method is compared with seven state-of-the-art algorithms—DenseFuse, DIDFuse, U2Fusion, TarDal, SeAFusion, DIVFusion, and CDDFuse—across three datasets: MFNet, M3FD, and RoadScene. Both subjective and objective evaluations are conducted. For subjective evaluation, the fused images generated by each method are visually compared. For objective evaluation, six metrics are employed: Mutual Information (MI), Visual Information Fidelity (VIF), Average Gradient (AG), Sum of Correlation Differences (SCD), Structural Similarity Index Measure (SSIM), and Gradient-based Similarity Measurement (Q^{AB/F}). The results show that the proposed method performs consistently well across all datasets, effectively preserves complementary information from infrared and visible images, and achieves superior scores on all evaluation metrics. In the joint segmentation experiments, comparisons are made on the MFNet dataset.
Subjective evaluation is presented through semantic segmentation visualizations, and objective evaluation uses Intersection over Union (IoU) and mean IoU (mIoU) metrics. The segmentation results produced by the proposed method more closely resemble ground truth labels and achieve the highest or second-highest IoU scores across all classes. Overall, the proposed method not only yields improved visual fusion results but also demonstrates clear advantages in downstream segmentation performance.  Conclusions  This study proposes an FSG strategy for multimodal image fusion networks, designed to fully leverage semantic information to improve the utility of fused images in downstream visual tasks. The method accounts for the variability among heterogeneous features and integrates the segmentation and fusion networks into a unified framework. By incorporating feature-level semantic information, the approach enhances the quality of the fused images and strengthens their performance in segmentation tasks. The proposed DFI module serves as a bridge between the segmentation and fusion networks, enabling effective interaction and selection of semantic and image features. This reduces the influence of feature variability and enriches the semantic content of the fusion results. In addition, the proposed MSAF module promotes the complementarity and integration of features from infrared and visible modalities while mitigating the disparity between them. Experimental results demonstrate that the proposed method not only achieves superior visual fusion quality but also outperforms existing methods in joint segmentation performance.
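As an illustration of per-pixel attention-weighted fusion of two modalities, the minimal module below predicts a two-channel spatial weight map and blends the infrared and visible features accordingly. The layer sizes and wiring are assumptions made for illustration, not the exact MSAF design.

```python
import torch
import torch.nn as nn

# Minimal spatial-attention fusion: a conv predicts per-pixel weights over
# the two modalities, and the features are blended with those weights.
class SpatialAttentionFuse(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.att = nn.Sequential(
            nn.Conv2d(2 * channels, 2, kernel_size=7, padding=3),
            nn.Softmax(dim=1),                     # per-pixel weights, sum to 1
        )

    def forward(self, feat_ir, feat_vis):
        w = self.att(torch.cat([feat_ir, feat_vis], dim=1))  # [B, 2, H, W]
        return w[:, :1] * feat_ir + w[:, 1:] * feat_vis      # weighted blend
```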
3D Model Classification Based on Central Anchor Hard Triplet Loss and Multi-view Feature Fusion
GAO Xueyao, ZHANG Yunkai, ZHANG Chunxiang
Available online, doi: 10.11999/JEIT240633
Abstract:
  Objective  In view-based 3D model classification, deep learning algorithms extract more representative features from 2D projections to improve classification accuracy. However, several challenges remain. A single view captures information only from a specific perspective, often leading to the omission of critical features. To address this, multiple views are generated by projecting the 3D model from various angles. These multi-view representations provide more comprehensive information through fusion. Nonetheless, the feature content of each view differs, and treating all views equally may obscure discriminative information. Moreover, inter-view complementarity and correlations may be overlooked. Effective utilization of multi-view information is therefore essential to enhance the accuracy of 3D model classification.  Methods  A 3D model classification method based on Central Anchor Hard Triplet Loss (CAH Triplet Loss) and multi-view feature fusion is proposed. Firstly, multi-view sets of 3D models are used as input, and view features are extracted using a Deep Residual Shrinkage Network (DRSN). These features are then fused with the 2D shape distribution features D1, D2, and D3 to obtain fused features of the 2D views. Secondly, Shannon entropy is applied to evaluate the uncertainty of view classification based on the fused features. The multiple views of each 3D model are then ranked in descending order of view saliency. Thirdly, a triplet network based on an Attention-Enhanced Long Short-Term Memory (Att-LSTM) architecture is constructed for multi-view feature fusion. The LSTM component captures contextual dependencies among views, while a multi-head attention mechanism is integrated to fully capture inter-view relevance. Fourthly, metric learning is applied by combining CAH Triplet Loss with Cross-Entropy Loss (CE Loss) to optimize the fusion network. This combined loss function is designed to reduce the feature-space distance between similar samples while increasing the distance between different samples, thereby enhancing the network’s capacity to learn discriminative features from 3D models.  Results and Discussions  When DRSN is used to extract view features from 2D projections and softmax is applied for classification, the 3D model classification achieves the highest accuracy, as shown in Table 1. The integration of shape distribution features D1, D2, and D3 with view features yields a more comprehensive representation of the 3D model, which significantly improves classification accuracy (Table 2). Incorporating CAH Triplet Loss reduces intra-class distances and increases inter-class distances in the feature space. This guides the network to learn more discriminative feature representations, further improving classification accuracy, as illustrated in Figure 4. The application of Shannon entropy to rank view saliency enables the extraction of complementary and correlated information across multiple views. This ranking strategy enhances the effective use of multi-view data, resulting in improved classification performance, as shown in Table 3.  Conclusions  This study presents a novel multi-view 3D model classification framework that achieves improved performance through three key innovations. Firstly, a hybrid feature extraction strategy is proposed, combining view features extracted by the DRSN with 2D shape distribution features D1, D2, and D3. This fusion captures both high-level semantic and low-level geometric characteristics, enabling a comprehensive representation of 3D objects.
Secondly, a view saliency evaluation mechanism based on Shannon entropy is introduced. This approach dynamically assesses and ranks views according to their classification uncertainty, ensuring that the most informative views are prioritized and that the complementarity among views is retained. At the core of the architecture lies a feature fusion module that integrates Long Short-Term Memory (LSTM) networks with multi-head attention mechanisms. This dual-path structure captures sequential dependencies across ordered views through LSTM and models global inter-view relationships through attention, thereby effectively leveraging view correlation and complementarity. Thirdly, the proposed CAH Triplet Loss combines center loss and hard triplet loss to simultaneously minimize intra-class variation and maximize inter-class separation. Together with cross-entropy loss, this joint optimization enhances the network’s ability to learn discriminative features for robust 3D model classification.
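A minimal sketch of a center-anchored hard triplet term is shown below: each class center acts as the anchor, the farthest same-class sample as the hard positive, and the nearest other-class sample as the hard negative. The margin and the handling of the class centers are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Center-anchored hard triplet term (illustrative). `emb`: [N, D] embeddings,
# `labels`: [N] class ids, `centers`: [C, D] learnable class centers.
def cah_triplet_loss(emb, labels, centers, margin=0.5):
    loss = emb.new_zeros(())
    for c in labels.unique():
        anchor = centers[c]                                   # class center as anchor
        d = torch.norm(emb - anchor, dim=1)                   # distances to center
        pos, neg = d[labels == c], d[labels != c]
        if len(neg) == 0:
            continue
        loss = loss + F.relu(pos.max() - neg.min() + margin)  # hardest pos/neg pair
    return loss / len(labels.unique())
```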
Covert Communication Transmission Scheme in Low Earth Orbit Satellite Integrated Sensing and Communication Systems
ZHU Zhengyu, OUYANG Zebin, PAN Gaofeng, WANG Shuai, SUN Gangcan, CHU Zheng, HAO Fengyu
Available online, doi: 10.11999/JEIT250208
Abstract:
  Objective  Globalization has driven growing demand for ubiquitous communication. However, traditional terrestrial networks face limitations in areas with poor infrastructure, such as deserts, oceans, and airspace. Satellite communication, with its flexibility, wide frequency coverage, and high capacity, offers a viable alternative for these environments. Yet, due to the open nature of satellite channels and long-distance transmission, signals remain vulnerable to interception, posing serious security challenges. Conventional wireless security measures, including physical-layer techniques and encryption, are limited by assumptions about computational hardness and the statistical structure of intercepted ciphertexts. Covert communication aims to reduce the probability of signal detection by eavesdroppers, providing an alternative layer of security. In Low Earth Orbit (LEO) satellite systems, high mobility leads to dynamic channel variations and misalignment of transmission beams, further complicating covert communication. To mitigate these challenges, Integrated Sensing and Communication (ISAC) technology can be embedded in satellite systems. By detecting the approximate location of potential eavesdroppers and directing radar beams to degrade their detection capability, ISAC improves communication covertness. This study proposes a covert communication strategy tailored for LEO satellite ISAC systems, offering enhanced security in satellite-based infrastructure.  Methods  This paper investigates a covert communication scheme in LEO satellite ISAC systems. First, the system model of the LEO satellite ISAC system is proposed, and the covertness constraints within the system are analyzed. Based on this, an optimization problem is formulated with the objective of maximizing the total covert communication rate for multiple users, subject to constraints including satellite power limits, radar power limits, sensing performance thresholds, and covertness requirements. This optimization problem is non-convex and highly coupled in variables, making it intractable to solve directly. To address this, relaxation variables are introduced to simplify the objective function into a solvable form. Subsequently, an alternating optimization algorithm is employed to decompose the original problem into two subproblems. One subproblem is transformed into an equivalent convex problem using methods like SemiDefinite Relaxation (SDR) and Successive Convex Approximation (SCA), which is then solved iteratively. The obtained solution from this subproblem is then used to solve the other subproblem. Once the solutions to both subproblems are determined, the original problem is effectively solved.  Results and Discussions  This paper validates the performance of the proposed covert communication transmission scheme through numerical simulations. To demonstrate the effectiveness of the proposed scheme, a comparison with a system lacking ISAC technology is introduced. First, the convergence of the proposed algorithm is verified (Fig. 2). The algorithm achieves rapid convergence under different resource allocation configurations in the system (converging at the 7th iteration). Additionally, higher covertness requirements lead to lower covert communication rates, as stricter covertness constraints reduce the allowable satellite transmit power. Increasing radar power causes a slight decline in the covert rate because Bob is subjected to minor interference from the radar beam. 
However, since the primary radar beam is focused on Eve, the interference level remains limited. Second, a higher number of satellite antennas improves the system’s covert communication rate (Fig. 3). This is attributed to the increased spatial dimensions available for optimizing signal transmission and reception. More antennas enable additional independent signal paths, enhancing transmission performance and flexibility. When the satellite is equipped with 81 antennas and the covertness requirement is 0.1, the total covert rate reaches 17.18 bit/s/Hz. Compared to the system without ISAC technology, the proposed system demonstrates superior resistance to detection (Fig. 4). The radar beam introduces interference to eavesdroppers, and higher radar power degrades the eavesdropper’s detection performance. Finally, increasing the satellite transmit power improves the covert communication rate (Fig. 5). In the system without ISAC, the satellite power cannot be arbitrarily increased due to covertness constraints. However, the ISAC-enabled system leverages radar integration to grant the satellite greater flexibility in power allocation, thereby enhancing the covert rate. Notably, raising the lower bound of sensing performance slightly reduces the covert rate, as higher sensing demands require increased radar power, which introduces additional interference to Bob’s signal reception.  Conclusions  This paper investigates a covert communication transmission scheme in a LEO satellite-enabled ISAC system. An optimization problem is formulated to maximize the sum of multi-user covert communication rates, subject to constraints including covertness requirements, minimum sensing performance, radar power budget, and satellite power budget. The alternating optimization algorithm and SCA algorithm are jointly employed to design the radar and satellite beamforming, as well as the radar receiver filter. Simulation results demonstrate that, compared to a system without radar beam interference, the proposed system with radar beam interference significantly reduces the detectability of communication between Alice and Bob. Furthermore, the integration of radar enhances the satellite’s flexibility to increase transmit power, thereby improving the overall covert communication rate of the system.
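The alternating-optimization pattern used for the two coupled subproblems can be illustrated on a toy biconvex problem with closed-form block updates, as below. This shows only the alternating principle; the paper's actual subproblems are convexified with SDR and SCA rather than solved in closed form.

```python
import numpy as np

# Toy alternating optimization for min_{x, y} ||y * A x - b||^2:
# fix one block of variables, solve the resulting subproblem, then swap.
rng = np.random.default_rng(1)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x, y = np.ones(5), 1.0
for it in range(50):
    # Subproblem 1: fix y, least squares in x
    x = np.linalg.lstsq(A * y, b, rcond=None)[0]
    # Subproblem 2: fix x, scalar least squares in y
    Ax = A @ x
    y = float(Ax @ b / (Ax @ Ax))
    # in the paper, convergence of the covert-rate objective is checked here
```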
Multi-Resolution Spatio-Temporal Fusion Graph Convolutional Network for Attention Deficit Hyperactivity Disorder Classification
SONG Xiaoying, HAO Chunyu, CHAI Li
Available online  , doi: 10.11999/JEIT240872
Abstract:
  Objective  Predicting neurodevelopmental disorders remains a central challenge in neuroscience and artificial intelligence. Attention Deficit Hyperactivity Disorder (ADHD), a representative complex brain disorder, presents diagnostic difficulties due to its increasing prevalence, clinical heterogeneity, and reliance on subjective criteria, which impede early and accurate detection. Developing objective, data-driven classification models is therefore of significant clinical relevance. Existing graph convolutional network-based approaches for functional brain network analysis are constrained by several limitations. Most adopt single-resolution brain parcellation schemes, reducing their capacity to capture complementary features from multi-resolution functional Magnetic Resonance Imaging (fMRI) data. Moreover, the lack of effective cross-scale feature fusion restricts the integration of essential features across resolutions, hampering the modeling of hierarchical dependencies among brain regions. To address these limitations, this study proposes a Multi-resolution Spatio-Temporal Fusion Graph Convolutional Network (MSTF-GCN), which integrates spatiotemporal features across multiple fMRI resolutions. The proposed method substantially improves the accuracy and robustness of functional brain network classification for ADHD.  Methods  The MSTF-GCN improves learning performance through two main components: (1) construction of multi-resolution, multi-channel networks, and (2) comprehensive fusion of temporal and spatial information. Multiple brain atlases at different resolutions are employed to parcellate the brain and generate functional connectivity networks. Spatial features are extracted from these networks, and optimal nodal features are selected using Support Vector Machine–Recursive Feature Elimination (SVM–RFE). To preserve global temporal characteristics and capture hierarchical signal variations, both the original time series and their differential signals are processed using a temporal convolutional network. This structure enables the extraction of complex temporal features and inter-subject temporal correlations. Spatial features from different resolutions are then fused with temporal correlations to form population graphs, which are adaptively integrated via a multi-channel graph convolutional network. Non-imaging data are also integrated to produce effective multi-channel, multi-modal spatiotemporal fusion features. The final classification is performed using a fully connected layer.  Results and Discussions  The proposed MSTF-GCN model is evaluated for ADHD classification using two independent sites from the ADHD-200 dataset: Peking and NI. The model consistently outperforms existing methods, achieving classification accuracies of 75.92% at the Peking site and 82.95% at the NI site (Table 2, Table 3). Ablation studies confirm the contributions of two key components: (1) The multi-atlas, multi-resolution feature extraction strategy significantly enhances classification accuracy (Table 4), supporting the utility of complementary cross-scale topological information; (2) The multimodal fusion strategy, which incorporates non-imaging variables (gender and age), yields notable performance improvements (Table 5). Furthermore, t-SNE visualization and inter-class distance analysis (Fig. 6) show that MSTF-GCN generates a feature space with clearer class separation, reflecting the effectiveness of its multi-channel spatiotemporal fusion design. 
Overall, the MSTF-GCN model achieves superior performance compared with state-of-the-art methods and demonstrates strong robustness across sites, offering a promising tool for auxiliary diagnosis of brain disorders.  Conclusions  This study proposes a novel multi-channel graph embedding framework that integrates spatial topological and temporal features derived from multi-resolution fMRI data, leading to marked improvements in classification performance. Experimental results show that the MSTF-GCN method exceeds current state-of-the-art algorithms, with accuracy gains of 3.92% and 8.98% on the Peking and NI sites, respectively. These findings confirm its strong performance and cross-site robustness in ADHD classification. Future work will focus on constructing more expressive hypergraph neural networks to capture higher-order relationships within functional brain networks.
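The SVM-RFE selection step described above follows a standard pattern; a minimal scikit-learn sketch with synthetic data is shown below. The feature dimensionality (500 connectivity features) and the number of retained features (64) are assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))   # stand-in: 100 subjects x 500 features
y = rng.integers(0, 2, 100)           # ADHD vs. control labels (synthetic)

# A linear SVM ranks features by weight magnitude; RFE recursively prunes
# the weakest 10% per iteration until 64 nodal features remain.
selector = RFE(estimator=SVC(kernel="linear", C=1.0),
               n_features_to_select=64, step=0.1)
selector.fit(X, y)
X_selected = selector.transform(X)    # reduced (100, 64) feature matrix
```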
Self-Interference Measurements and Analysis of Full Duplex Arrays in U6G Frequency Band
Available online  , doi: 10.11999/JEIT241086
Abstract:
  Objective  The U6G frequency band spans a continuous 700 MHz bandwidth, aligning closely with the Sub-6 GHz range. It offers a balance between low-frequency coverage capabilities and high-frequency capacity advantages, making it suitable for the deployment of future 5G-A and 6G systems. With the growing demand for wireless communication services and the limited availability of spectrum resources in future networks, the need for full-duplex technology has emerged. A U6G full-duplex transceiver, with sufficient transmit-receive isolation, can transmit and receive simultaneously within the same frequency band, effectively doubling spectral efficiency compared to Time-Division Duplex (TDD) or Frequency-Division Duplex (FDD) systems. However, full-duplex systems with large-scale array antennas face the challenge of strong, multi-dimensional cross-coupled self-interference. Near-field coupled self-interference can degrade reception sensitivity, potentially leading to the saturation of low-noise amplifiers. Understanding the near-field coupling characteristics of self-interference between arrays is crucial for evaluating proposed full-duplex industrial standards and protocols. Currently, self-interference measurements for array systems primarily focus on the millimeter-wave band, and research on array-to-array self-interference in the U6G band remains relatively scarce, mostly limited to single-antenna configurations. This work utilizes an analog beamforming phased array platform capable of precise beam steering to conduct large-scale full-duplex array self-interference coupling channel measurements in the U6G band, completing nearly 3.6 million measurements. Through these self-interference coupling channel measurements between beams as well as between array elements, an in-depth analysis is provided of the angular and physical spatial distribution characteristics of transmit-receive isolation, and the inherent connection between element-to-element coupling and beam-to-beam coupling is revealed.  Methods  In this work, a 128T-128R phased array platform with analog beamforming capability is deployed in an outdoor environment. Frequency-domain measurement techniques are employed to acquire the frequency response of the self-interference coupling channels between different transmit and receive beams, as well as between array elements. The spatial and numerical distribution characteristics of the coupling self-interference are analyzed using transmit-receive isolation as the evaluation criterion. The measurement process utilizes a dual-port vector network analyzer, with stepwise frequency scanning conducted across the 6675–6875 MHz band (a total measurement bandwidth of 200 MHz) to measure the frequency response of the self-interference channels. For beam-to-beam coupling channels, the azimuth sweep range is set from –60° to +60° and the elevation sweep range from –30° to +30°, with a step interval of 2°, resulting in a total of 3,575,881 sets of beam-to-beam coupling channel data. For element-to-element coupling channels, only one pair of transmit-receive elements is excited at a time, while all other transmit and receive elements are turned off. This measurement process covers all possible transmit-receive element pairs, yielding a total of 16,384 sets of element-to-element coupling channel data.  
Results and Discussions  The analysis of the transmit-receive isolation between beams reveals that the maximum and minimum isolation between the transmit and receive beams are 52.17 dB and –6.25 dB, respectively. Approximately 95% of the isolation values fall between 10 dB and 40 dB, with a median isolation of 26.66 dB (Fig. 6). The isolation distribution between beams exhibits strong spatial symmetry and directionality (Fig. 7, Fig. 8, Fig. 9, Fig. 10). Specifically, steering the transmit and receive beams along the direction of the array causes significant variations in the self-interference coupling, with no transmit and receive beams consistently providing high or low isolation (Fig. 7). Moreover, in the U6G frequency band, the sensitivity of self-interference coupling to beam steering is much weaker than in the millimeter-wave frequency band (Fig. 9). Therefore, relying on beam steering to reduce self-interference in the U6G frequency band may be inefficient and result in suboptimal performance. The analysis of the transmit-receive isolation between array elements indicates that the maximum and minimum isolation between transmit and receive elements are 88.08 dB and 54.48 dB, respectively. Approximately 95% of the isolation values fall between 63 dB and 76 dB, with a median isolation of 69.43 dB (Fig. 11). Even with the same element spacing, the isolation between elements is not necessarily the same; multiple isolation mappings exist for the same distance (Fig. 12). This many-to-one mapping relationship is likely due to differences in multipath propagation between the positions of the transmit and receive elements, as well as amplitude-phase inconsistencies in the transmit and receive chains. Furthermore, by assigning beamforming weights to the non-directional element-to-element coupling channels, the transmit-receive isolation between beams can be reliably predicted. This approach accurately reproduces the self-interference coupling between the transmit and receive beams and, compared to the spherical wave model, better captures the realistic characteristics of self-interference in both spatial and numerical distributions (Fig. 13, Fig. 14).  Conclusions  This study examines the near-field self-interference coupling characteristics of U6G full-duplex array systems through nearly 3.6 million measurements. The measurement and analysis results demonstrate that the isolation distribution between transmit and receive beams exhibits strong spatial symmetry and directionality. In contrast to the millimeter-wave frequency band, self-interference coupling in the U6G band shows weaker sensitivity to beam steering. Therefore, relying solely on beam steering to reduce self-interference is insufficient to achieve the required receiver sensitivity, necessitating the adoption of additional active or passive spatial self-interference suppression techniques. In some cases, a combination of RF and digital-domain self-interference suppression techniques may also be necessary. Furthermore, measurements of the self-interference coupling channels between transmit and receive array elements reveal that multiple isolation mappings exist for the same element spacing, which cannot be accurately described by traditional spherical wave models. In particular, by assigning beamforming weights to the non-directional element-to-element coupling channels, the self-interference coupling characteristics between beams can be replicated, and the array isolation can be accurately predicted. 
These measurement and analysis results provide essential insights for the design of U6G full-duplex communication systems and lay the foundation for future work on self-interference channel modeling, beamforming optimization, and self-interference suppression.
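The beam-level prediction described above amounts to weighting the measured element-to-element coupling matrix with the transmit and receive beamforming vectors. The sketch below illustrates this computation with a random stand-in coupling matrix and uniform weights; the closing lines reproduce the sweep bookkeeping behind the reported measurement counts.

```python
import numpy as np

n_tx, n_rx = 128, 128                      # 128T-128R platform
rng = np.random.default_rng(1)
# Stand-in for the measured coupling matrix C: C[m, n] is the coupling
# from Tx element n to Rx element m (16,384 entries in total).
C = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) * 1e-4

def beam_isolation_db(C, w_tx, w_rx):
    """Predicted beam-to-beam isolation from element coupling:
    effective coupling h = w_rx^H C w_tx, isolation = -20 log10 |h|."""
    h = np.conj(w_rx) @ C @ w_tx
    return -20.0 * np.log10(np.abs(h))

w_tx = np.ones(n_tx) / np.sqrt(n_tx)       # broadside beams as an example
w_rx = np.ones(n_rx) / np.sqrt(n_rx)
print(f"predicted isolation: {beam_isolation_db(C, w_tx, w_rx):.2f} dB")

# Sweep bookkeeping: -60..+60 deg azimuth and -30..+30 deg elevation in
# 2-degree steps give 61 * 31 = 1891 directions per side, hence
# 1891**2 = 3,575,881 beam pairs, matching the reported data volume.
print(61 * 31, (61 * 31) ** 2)
```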
Identification of Non-Line-Of-Sight Signals Based on Direct Path Signal Residual and Support Vector Data Description
NI Xue, ZENG HaiYu, YANG Wendong
Available online  , doi: 10.11999/JEIT240960
Abstract:
  Objective  Current machine learning-based methods for Non-Line-of-Sight (NLOS) signal recognition either require the collection of a large amount of data from two different types of signals for various scenarios, or produce trained models that fail to generalize across different environments. These methods also do not simultaneously address the practical requirements of low training sample acquisition cost and good scene adaptation. This paper proposes a new NLOS recognition method that collects single-class signals from a single environment to train recognition models, which then demonstrate high accuracy when recognizing signals in different scenarios. This approach offers the advantages of low sample acquisition cost and strong environmental adaptability.  Methods  This paper proposes Direct Path (DP) signal residual feature parameters that exhibit significant differences between the two types of signals. The effectiveness of these parameters is theoretically analyzed, and they are combined with nine feature parameters identified in typical literature, forming various feature vectors to characterize the signals. This approach effectively enhances the accuracy of the recognition model. A class of signals with high feature similarity across different scenarios is used as training data, and a single recognition model is employed as the machine learning algorithm. The model is trained on signal samples collected in typical Line-of-Sight (LOS) channels to improve its scene adaptability. Based on the principles of Deep Support Vector Data Description (DSVDD), a reverse-expanded DSVDD model is designed for NLOS signal recognition, further improving the model’s accuracy in recognizing samples across different scenarios.  Results and Discussions  As shown in Table 4, in the signal recognition scenario where the test set and training set originate from the same scene, the Least Squares Support Vector Machine (LSSVM) model demonstrates the best recognition performance. This is achieved using hyperplanes trained with two types of signals, resulting in a recognition accuracy of over 95%. In comparison, the standard Support Vector Data Description (SVDD) model, which is trained using only single-class LOS signal samples, exhibits a performance loss relative to LSSVM, with a maximum accuracy decrease exceeding 5%. The recognition accuracy of the SVDD model trained with DP signal residual features improves compared to the standard SVDD model, with the largest accuracy gap relative to the LSSVM model remaining within 5%. Furthermore, the DSVDD model trained with DP signal residuals shows a further improvement, with the maximum accuracy loss reduced to less than 2% relative to the LSSVM model. In scenarios where the training set and test data come from different scenes, LSSVM requires two types of signals for training. However, the hyperplane trained with two types of signal samples from a single scene exhibits poor performance when recognizing signal samples from other scenarios, with a maximum accuracy of less than 75%. The SVDD model trained with DP signal residual eigenvalues incorporates features with significant differences between the two signal types, improving recognition accuracy to over 80%. Finally, the DSVDD model, trained with DP signal residual features and replacing the Gaussian kernel function in the SVDD model with a neural network, further enhances recognition accuracy, achieving a maximum accuracy exceeding 85%.  
Conclusions  A recognition method based on DP signal residual feature training for DSVDD is proposed to meet the practical requirements of low sample acquisition cost and strong environmental adaptability in typical NLOS signal recognition. Compared with the SVDD method, this approach improves upon the feature parameters, the model, and the model structure by introducing features with significant differences between the two types of signals, resulting in a substantial improvement in recognition performance. Additionally, the paper designs a reverse dimensionality expansion for DSVDD and incorporates it into NLOS signal recognition, further enhancing the accuracy of the recognition model across different scene samples. Compared to other typical machine learning algorithms, the proposed method requires the collection of only single-class signal data from a single scene and performs effectively in recognizing signal samples from other scenes. Although the proposed method outperforms typical single-class recognition approaches, the overall performance still has room for improvement. The theoretical analysis of how neural networks can better explore potential relationships between features is insufficient, and the full potential of neural networks in single-class recognition models has not been fully realized. Furthermore, due to time constraints, this study only simulated sample data collected from three scenarios, and the recognition performance in other typical scenarios requires further validation.
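As an illustration of the one-class training principle underlying this method, the sketch below implements the standard Deep SVDD objective in PyTorch: a network trained only on LOS feature vectors shrinks their distance to a fixed hypersphere center, and a large distance at test time flags a sample as NLOS. The 10-dimensional input (DP residual plus nine literature features), the network, and the threshold are assumptions, and the paper's reverse-expanded variant is not reproduced here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X_los = torch.randn(256, 10)   # stand-in single-class LOS feature vectors

# Final layer omits the bias, a standard Deep SVDD precaution against
# trivially collapsed solutions.
phi = nn.Sequential(nn.Linear(10, 32), nn.ReLU(),
                    nn.Linear(32, 8, bias=False))

with torch.no_grad():
    c = phi(X_los).mean(dim=0)         # fix the hypersphere center once

opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
for _ in range(200):                   # pull LOS embeddings toward c
    loss = ((phi(X_los) - c) ** 2).sum(dim=1).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    dist = ((phi(X_los) - c) ** 2).sum(dim=1)
    threshold = dist.quantile(0.95)    # score cutoff (an assumption)

def is_nlos(x):
    """Distance beyond the learned hypersphere flags NLOS."""
    with torch.no_grad():
        return ((phi(x) - c) ** 2).sum(dim=1) > threshold
```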
Diffusion Model and Edge Information Guided Single-photon Image Reconstruction Algorithm
ZHANG Dan, LIAN Qiusheng, YANG Yuchi
Available online  , doi: 10.11999/JEIT241063
Abstract:
  Objective  Quanta Image Sensors (QIS) are solid-state sensors that encode scene information into binary bit-streams. The reconstruction task for QIS consists of recovering the original scenes from these bit-streams, which is an ill-posed problem characterized by incomplete measurements. Existing reconstruction algorithms based on physical sensors primarily use Maximum-Likelihood Estimation (MLE), which may introduce noise-like components and result in insufficient sharpness, especially under low oversampling factors. Model-based optimization algorithms for QIS generally combine the likelihood function with an explicit or implicit image prior in a cost function. Although these methods provide superior quality, they are computationally intensive due to the use of iterative solvers. Additionally, intrinsic readout noise in QIS circuits can degrade the binary response, complicating the imaging process. To address these challenges, an image reconstruction algorithm, Edge-Guided Diffusion Model (EGDM), is proposed for single-photon sensors. This algorithm utilizes a diffusion model guided by edge information to achieve high-speed, high-quality imaging for QIS while improving robustness to readout noise.  Methods  The proposed EGDM algorithm incorporates a measurement subspace constrained by binary measurements into the unconditional diffusion model sampling framework. This constraint ensures that the generated images satisfy both data consistency and the natural image distribution. Due to high noise intensity in latent variables during the initial reverse diffusion stages of diffusion models, texture details may be lost, and structural components may become blurred. To enhance reconstruction quality while minimizing the number of sampling steps, a bilateral filter is applied to extract edge information from images generated by MLE. Additionally, the integration of jump sampling with a measurement subspace projection termination strategy reduces inference time and computational complexity, while preserving visual quality.  Results and Discussions  Experimental results on both the benchmark datasets, Set10 and BSD68 (Fig. 6, Fig. 7, Table 2), and the real video frame (Fig. 8) demonstrate that the proposed EGDM method outperforms several state-of-the-art reconstruction algorithms for QIS and diffusion-based methods in both objective metrics and visual perceptual quality. Notably, EGDM achieves an improvement of approximately 0.70 dB to 3.00 dB compared to diffusion-based methods for QIS in terms of Peak Signal-to-Noise Ratio (PSNR) across all oversampling factors. For visualization, the proposed EGDM produces significantly finer textures and preserves image sharpness. In the case of real QIS video sequences (Fig. 8), EGDM preserves more detailed information while mitigating blur artifacts commonly found in low-light video capture. Furthermore, to verify the robustness of the reconstruction algorithm to readout noise, the reconstruction of the original scene from the measurements is conducted under various readout noise levels. The experimental results (Table 3, Fig. 9, Fig. 10) demonstrate the effectiveness of the proposed EGDM method in suppressing readout noise, as it achieves the lowest average Mean Squared Error (MSE) and superior quality compared to other algorithms in terms of PSNR, particularly at higher noise levels. Visually, EGDM produces the best results, with sharp edges and clear texture patterns even under severe noise conditions. 
Compared to the EGDM algorithm without acceleration strategies, the implementation of jump sampling and measurement subspace projection termination strategies reduces the execution time by 5 seconds and 1.9 seconds, respectively (Table 4). Moreover, EGDM offers faster computation speeds than other methods, including deep learning-based reconstruction algorithms that rely on GPU-accelerated computing. After thorough evaluation, these experimental findings confirm that the high-performance reconstruction and rapid imaging speed make the proposed EGDM method an excellent choice for practical applications.  Conclusions  This paper proposes a single-photon image reconstruction algorithm, EGDM, based on a diffusion model and edge information guidance, overcoming the limitations of traditional algorithms that produce suboptimal solutions in the presence of low oversampling factors and readout noise. The measurement subspace defined by binary measurements is introduced as a constraint in the diffusion model sampling process, ensuring that the reconstructed images satisfy both data consistency and the characteristics of natural image distribution. The bilateral filter is applied to extract edge components from the MLE-generated image as auxiliary information. Furthermore, a hybrid sampling strategy combining jump sampling with measurement subspace projection termination is introduced, significantly reducing the number of sampling steps while improving reconstruction quality. Experimental results on both benchmark datasets and real video frames demonstrate that: (1) Compared with conventional image reconstruction algorithms for QIS, EGDM achieves excellent performance in both average PSNR and SSIM. (2) Under different oversampling factors, EGDM outperforms existing diffusion-based reconstruction methods by a large margin. (3) Compared with existing techniques, the EGDM algorithm requires less computational time while exhibiting strong robustness against readout noise, confirming its effectiveness in practical applications. Future research could focus on developing parameter-free reconstruction frameworks that preserve imaging quality and extending EGDM to address more complex environmental challenges, such as dynamic low-light or high dynamic range imaging for QIS.
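For intuition about the MLE initialization on which EGDM's edge extraction operates, the sketch below simulates the standard single-bit QIS forward model for one pixel and applies its closed-form MLE. The gain, exposure, and frame count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-bit QIS model (threshold q = 1): a jot outputs 1 iff it detects
# at least one photon, so P(b = 1) = 1 - exp(-alpha * theta).
alpha, theta_true, K = 4.0, 0.6, 64   # gain, normalized exposure, frames
photons = rng.poisson(alpha * theta_true, size=K)
bits = (photons >= 1).astype(float)   # binary bit-stream for this pixel

# Closed-form MLE inverts the Bernoulli rate; in EGDM the bilateral
# filter extracts edge guidance from the image assembled this way.
p_hat = np.clip(bits.mean(), 1e-6, 1 - 1e-6)
theta_mle = -np.log(1.0 - p_hat) / alpha
print(f"true exposure {theta_true:.3f}, MLE estimate {theta_mle:.3f}")
```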
An Audio-visual Generalized Zero-Shot Learning Method Based on Multimodal Fusion Transformer
YANG Jing, LI Xiaoyong, RUAN Xiaoli, LI Shaobo, TANG Xianghong, XU Ji
Available online  , doi: 10.11999/JEIT241090
Abstract:
  Objective  Audio-visual Generalized Zero-Shot Learning (GZSL) integrates audio and visual signals in videos to enable the classification of known classes and the effective recognition of unseen classes. Most existing approaches prioritize the alignment of audio-visual and textual label embeddings, but overlook the interdependence between audio and video, and the mismatch between model outputs and target distributions. This study proposes an audio-visual GZSL method based on a Multimodal Fusion Transformer (MFT) to address these limitations.  Methods  The MFT employs a transformer-based multi-head attention mechanism to enable effective cross-modal interaction between visual and audio features. To optimize the output probability distribution, the Kullback-Leibler (KL) divergence between the predicted and target distributions is minimized, thereby aligning predictions more closely with the true distribution. This optimization also reduces overfitting and improves generalization to unseen classes. In addition, cosine similarity loss is applied to measure the similarity of learned representations within the same class, promoting feature consistency and improving discriminability.  Results and Discussions  The experiments include both GZSL and Zero-Shot Learning (ZSL) tasks. The ZSL task requires classification of unseen classes only, whereas the GZSL task addresses both unseen and seen class classification to mitigate catastrophic forgetting. To evaluate the proposed method, experiments are conducted on three benchmark datasets: VGGSound-GZSLcls, UCF-GZSLcls, and ActivityNet-GZSLcls (Table 1). MFT is quantitatively compared with five ZSL methods and nine GZSL methods (Table 2). The results show that the proposed method achieves state-of-the-art performance on all three datasets. For example, on ActivityNet-GZSLcls, MFT exceeds the previous best ClipClap-GZSL method by 14.6%. This confirms the effectiveness of MFT in modeling cross-modal dependencies, aligning predicted and target distributions, and achieving semantic consistency between audio and visual features. Ablation studies (Tables 3–5) further support the contribution of each module in the proposed framework.  Conclusions  This study proposes a transformer-based audio-visual GZSL method that uses a multi-head self-attention mechanism to extract intrinsic information from audio and video data and enhance cross-modal interaction. This design enables more accurate capture of semantic consistency between modalities, improving the quality of cross-modal feature representations. To align the predicted and target distributions and reinforce intra-class consistency, KL divergence and cosine similarity loss are incorporated during training. KL divergence improves the match between predicted and true distributions, while cosine similarity loss enhances discriminability within each class. Extensive experiments demonstrate the effectiveness of the proposed method.
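The two auxiliary losses described above can be sketched in PyTorch as follows; the batch size, class count, embedding width, and unit loss weighting are assumptions rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 20)                           # fused audio-visual logits
target_probs = F.softmax(torch.randn(8, 20), dim=-1)  # stand-in target distribution

# KL divergence between predicted and target distributions
# (F.kl_div expects log-probabilities as input, probabilities as target).
kl_loss = F.kl_div(F.log_softmax(logits, dim=-1), target_probs,
                   reduction="batchmean")

# Cosine similarity loss pulling same-class embeddings together.
emb_a = torch.randn(8, 128)     # paired same-class representations
emb_b = torch.randn(8, 128)
cos_loss = (1.0 - F.cosine_similarity(emb_a, emb_b, dim=-1)).mean()

loss = kl_loss + cos_loss       # weighting coefficients omitted (assumption)
```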
Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy
CHEN Jiamei, SUN Huiwen, LI Yufeng, WANG Yupeng, BIE Yuxia
Available online  , doi: 10.11999/JEIT250021
Abstract:
  Objective  To address the future trend of network densification and spatial distribution, this study proposes a multi-base station air–ground integrated ultra-dense network architecture and develops a semi-distributed scheme for resource optimization. The network comprises coexisting macro, micro, and Unmanned Aerial Vehicle (UAV) base stations. A semi-distributed Double Deep Q Network (DDQN)-based power control scheme is designed to reduce computational burden, improve response speed, and overcome the lack of global optimization in conventional fully centralized approaches. The proposed scheme enhances energy efficiency by combining distributed decision-making at the base station level with centralized training via a network trainer, enabling a balance between computational complexity and performance. The DDQN algorithm facilitates local decision-making while centralized coordination ensures overall network optimization.  Methods  This study establishes a complex dense network model for air–ground integration with coexisting macro, micro, and UAV base stations, and proposes a semi-distributed DDQN scheme to improve network energy efficiency. The methods are as follows: (1) Construct an integrated air–ground dense network model in which macro, micro, and UAV base stations share the spectrum through a cooperative mechanism, thereby overcoming the performance bottlenecks of conventional heterogeneous networks. (2) Develop an improved semi-distributed DDQN algorithm that enhances Q-value estimation accuracy, addressing the limitations of traditional centralized and distributed control modes and mitigating Q-value overestimation observed in conventional Deep Q Network (DQN) approaches. (3) Introduce a disturbance factor to increase the probability of exploring random actions, strengthen the algorithm’s ability to escape local optima, and improve estimation accuracy.  Results and Discussions  Simulation results demonstrate that the proposed semi-distributed DDQN scheme effectively adapts to dense and complex network topologies, yielding marked improvements in both energy efficiency and total throughput relative to traditional DQN and Q-learning algorithms. Key results include the following: The total throughput achieved by DDQN exceeds that of the baseline DQN and Q-learning algorithms (Fig. 3). In terms of energy efficiency, DDQN exhibits a clear advantage, converging to 84.60%, which is 17.94% higher than DQN (69.42%) and 20.21% higher than Q-learning (67.50%) (Fig. 4). The loss value of DDQN also decreases more rapidly and stabilizes at a lower level. With increasing iterations, the loss curve becomes smoother and ultimately converges to 100, which is 100 lower than that of DQN (Fig. 5). Moreover, DDQN achieves the highest user access success rate compared with DQN and Q-learning (Fig. 6). When the access success rate reaches 80%, DDQN requires significantly fewer iterations than the other two algorithms. This advantage becomes more pronounced under high user density. For example, when the number of users reaches 800, DDQN requires fewer iterations than both DQN and Q-learning to achieve comparable performance (Fig. 7).  Conclusions  This study proposes a semi-distributed DDQN strategy for intelligent control of base station transmission power in ultra-dense air–ground networks. Unlike traditional methods that target energy efficiency at the individual base station level, the proposed strategy focuses on optimizing the overall energy efficiency of the network system. 
By dynamically adjusting the transmission power of macro, micro, and airborne base stations through intelligent learning, the scheme achieves system-level coordination and adaptation. Simulation results confirm the superior adaptability and performance of the proposed DDQN scheme under complex and dynamic network conditions. Compared with conventional DQN and Q-learning approaches, DDQN exhibits greater flexibility and effectiveness in resource control, achieving higher energy efficiency and sustained improvements in total throughput. These findings offer a new approach for the design and management of integrated air–ground networks and provide a technical basis for the development of future large-scale dense network architectures.
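The DDQN update at the core of the scheme, in which the online network selects the next action and the target network evaluates it, can be sketched as follows. The state encoding, discrete power-level action set, and network sizes are placeholders for the base-station power-control formulation.

```python
import torch
import torch.nn as nn

state_dim, n_actions, gamma = 8, 5, 0.99
online = nn.Linear(state_dim, n_actions)   # stand-in Q-networks
target = nn.Linear(state_dim, n_actions)
target.load_state_dict(online.state_dict())

def ddqn_targets(rewards, next_states, dones):
    """Decoupled selection/evaluation curbs DQN's Q-value overestimation."""
    with torch.no_grad():
        best_a = online(next_states).argmax(dim=1, keepdim=True)   # select
        q_next = target(next_states).gather(1, best_a).squeeze(1)  # evaluate
    return rewards + gamma * (1.0 - dones) * q_next

batch = 32                                  # one synthetic transition batch
states = torch.randn(batch, state_dim)
next_states = torch.randn(batch, state_dim)
actions = torch.randint(0, n_actions, (batch, 1))   # chosen power levels
rewards, dones = torch.randn(batch), torch.zeros(batch)

y = ddqn_targets(rewards, next_states, dones)
q_pred = online(states).gather(1, actions).squeeze(1)
loss = nn.functional.mse_loss(q_pred, y)    # minimized by the central trainer
```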
Non-orthogonal Prolate Spheroidal Wave Functions Signal Detection Method with Cross-terms
LU Faping, MAO Zhongyang, XU Zhichao, SHU Yihao, KANG Jiafang, WANG Feng, WANG Mengjiao
Available online  , doi: 10.11999/JEIT250052
Abstract:
  Objective  Non-orthogonal Shape Modulation based on Prolate Spheroidal Wave Functions (NSM-PSWFs) utilizes PSWFs with high time-frequency energy concentration as basis waveforms. This structure enables high spectral efficiency and time-frequency energy aggregation, making it a promising candidate for B5G/6G waveform design. However, due to the non-orthogonality of the PSWFs used for information transmission in NSM-PSWFs, mutual interference between non-orthogonal signals and poor bit error performance in coherent detection systems significantly limit their practical deployment. This issue is a common challenge in non-orthogonal modulation and access technologies. To address the problem of low detection performance resulting from mutual interference among non-orthogonal PSWFs, this study incorporates time-frequency domain characteristics into signal detection. A novel detection mechanism for non-orthogonal PSWFs in the time-frequency domain is proposed, with the aim of reducing interference between non-orthogonal PSWFs and enhancing detection performance.  Methods  Given the different time-frequency domain energy distribution characteristics of PSWF signals at various stages—particularly the "local" energy concentration in different regions—this study introduces cross-terms between signals. Based on an analysis of non-orthogonal signal time-frequency characteristics, with a focus on innovating detection mechanisms, a combined approach of theoretical modeling and numerical simulation is employed to explore novel methods for detecting non-orthogonal PSWF signals via cross-terms. Specifically: (1) The impact of interference between PSWF signals and Gaussian white noise on the time-frequency distribution of cross-terms is analyzed, demonstrating the feasibility of detecting PSWF signals in the time-frequency domain. (2) Building on this analysis, time-frequency characteristics are integrated into the detection process. A novel method for detecting non-orthogonal PSWFs based on cross-terms is proposed, accompanied by a strategy for selecting time-frequency feature parameters. The "integral value of cross-terms over symmetric time intervals at the frequency corresponding to the peak energy density of cross-terms" is chosen as the feature parameter. This shifts signal detection from the "one-dimensional energy domain (time or frequency)" to the "two-dimensional time-frequency energy domain," enabling detection by exploiting localized energy regions while simultaneously mitigating interference during statistical acquisition.  Results and Discussions  This study demonstrates the feasibility of detecting signals in the two-dimensional time-frequency domain and analyzes the impact of different PSWFs and AWGN on the distribution characteristics of cross-terms. Notably, AWGN interference can be regarded as a special form of "interference between PSWFs" exhibiting a linear superposition with PSWF-induced interference. The interference from PSWFs with time-domain parity opposite to that of the template signal can be eliminated through "symmetric time-interval integration" (Fig. 1, Table 1, Table 2). This establishes a theoretical foundation for the novel detection mechanism based on cross-terms and provides a reference for incorporating other two-dimensional distribution characteristics into signal detection. Additionally, a novel detection mechanism for non-orthogonal PSWFs based on cross-terms is proposed, utilizing time-frequency distribution characteristics for signal detection. 
This method effectively reduces interference between non-orthogonal PSWFs, thereby enhancing detection performance. It also offers valuable insights for exploring detection mechanisms based on other two-dimensional distribution characteristics. For example, compared to conventional coherent detection, the proposed method achieves superior performance, with an approximately 1 dB gain at a bit error rate of 4 × 10⁻⁵ (Fig. 4).  Conclusions  This paper demonstrates the feasibility of detecting PSWFs in the two-dimensional time-frequency domain and proposes a novel detection method for non-orthogonal PSWFs based on cross-terms. The proposed method transforms traditional coherent detection from “global energy in the time/frequency domain” to “local energy in the time-frequency domain”, significantly reducing interference between non-orthogonal signals and enhancing detection performance. This approach not only provides a new perspective for developing efficient detection methods for non-orthogonal signals but also serves as a valuable reference for investigating novel signal detection mechanisms in two-dimensional domains.
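To illustrate how such a cross-term statistic is formed, the toy sketch below computes a discrete cross Wigner-Ville distribution and integrates it over a symmetric time interval at the frequency of peak cross-term energy. An even/odd cosine-sine pair stands in for PSWFs of opposite time-domain parity; the signals, discretization, and decision rule are illustrative assumptions, not the paper's exact detector.

```python
import numpy as np

def cross_wvd(x, y):
    """Discrete cross Wigner-Ville distribution W[n, k] (bare-bones
    pseudo-WVD; windowing and interpolation are omitted)."""
    N = len(x)
    W = np.zeros((N, N), dtype=complex)
    for n in range(N):
        taumax = min(n, N - 1 - n)
        tau = np.arange(-taumax, taumax + 1)
        kern = np.zeros(N, dtype=complex)
        kern[tau % N] = x[n + tau] * np.conj(y[n - tau])
        W[n] = np.fft.fft(kern)
    return W

t = np.linspace(-1, 1, 256)
template = np.cos(6 * np.pi * t)          # even "PSWF" stand-in
interferer = np.sin(6 * np.pi * t)        # odd-parity interferer

W = cross_wvd(template + interferer, template)
k_star = np.argmax(np.abs(W).sum(axis=0))     # peak-energy frequency bin
feature = W[:, k_star].real.sum()             # symmetric-time integration
# The odd-parity interference contributes an (approximately) odd cross-term
# that cancels under symmetric integration, leaving the template's term.
print(f"decision statistic: {feature:.3f}")
```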
SealVerifier: Seal Verification System Based on Dual-stream Model
LEI Meng, NING Qiyue, JU Jinjun, ZOU Liang
Available online  , doi: 10.11999/JEIT241059
Abstract:
  Objective  Seals serve a critical legal function in scenarios such as document authentication and contract execution, acting as essential markers of document authenticity and legitimacy. However, the increasing sophistication of seal forgery techniques, driven by advances in digital technology, presents new challenges to existing verification methods. In particular, low-quality or blurred seal images substantially reduce the accuracy and reliability of traditional approaches, limiting their practical utility. To address these limitations, this study proposes SealVerifier, an automatic seal verification system based on a dual-stream model. The method is designed to improve recognition accuracy, generalization ability, and robustness to noise. SealVerifier contributes to the intelligent development of seal verification and offers technical support for secure digital document authentication, thereby facilitating the broader deployment of reliable seal verification technologies.  Methods  SealVerifier comprises an image enhancement module and a dual-stream verification model, designed to improve the accuracy and robustness of seal authentication. The framework follows a two-stage pipeline: image preprocessing and authenticity verification. In the preprocessing stage, the DeARegNet module is introduced to correct degradation caused by uneven stamping pressure, scanner variability, paper background complexity, and interference from document content. DeARegNet integrates a Denoising Adversarial Network (DAN) and a GeomPix alignment module to enhance seal image clarity and consistency. DAN employs adversarial training, consisting of a denoiser and a discriminator. The denoiser uses a multi-level residual dense connection module to extract fine-grained features and eliminate noise, thereby improving image resolution. The discriminator enforces denoising reliability by distinguishing between clean and denoised images using an adversarial loss. The GeomPix alignment module exploits geometric characteristics of circular and elliptical seals. It relies on a central pentagram positioning marker and the radial fan-shaped pixel density distribution to achieve high-precision alignment, significantly improving the accuracy and stability of image correction. In the verification stage, a dual-stream architecture combining EfficientNet and Streamlining Vision Transformer (SViT) is employed to extract local detail features and global structural information. EfficientNet performs efficient multi-scale feature extraction via compound scaling, capturing textures, edge contours, and subtle defects. SViT models global dependencies through self-attention mechanisms and enhances feature learning with high-dimensional multilayer perceptrons and denormalization techniques, thereby improving verification accuracy. To improve generalization and reduce inter-domain discrepancies among seal datasets, a Data Distribution Adapter (DDA) and Gradient Reversal Layer (GRL) are incorporated. These components use adversarial training to support the seal authenticity classifier—comprising EfficientNet and SViT—in learning domain-invariant features. This approach enhances robustness and adaptability in diverse application scenarios.  Results and Discussions  Experimental results demonstrate that the integration of the dual-stream architecture—EfficientNet for local detail extraction and SViT for global structural representation—enables SealVerifier to significantly improve verification accuracy. 
On a custom Chinese seal dataset comprising 30,699 image pairs, SealVerifier achieves precision, recall, and F1 scores of 91.34%, 96.83%, and 93.57%, respectively, outperforming existing methods (Table 3). The incorporation of a DDA and a dual loss function further reduces distributional discrepancies across seal datasets through adversarial training, enhancing both recognition accuracy and generalization performance (Table 4). Under noise interference, SealVerifier maintains high verification accuracy, confirming its robustness and applicability in real-world scenarios (Table 2).  Conclusions  This study proposes SealVerifier, a dual-stream model for fully automated seal authenticity verification. A Chinese seal dataset with complex backgrounds is constructed, and nine-fold cross-validation confirms the method’s effectiveness. SealVerifier integrates DeARegNet for image enhancement and combines EfficientNet and SViT to capture both fine-grained details and global semantic features. To address the limitations of conventional Vision Transformer (ViT) models, high-dimensional multilayer perceptrons and denormalization techniques are introduced, improving the model’s capacity to learn complex features and enhancing generalization and robustness. A DDA and dual loss function are also incorporated to mitigate dataset variability, enabling stable classification performance across heterogeneous seal images. Experimental results show that SealVerifier achieves precision, recall, and F1 scores of 91.34%, 96.83%, and 93.57%, respectively, demonstrating its performance advantage in seal verification tasks. Future work will explore high-precision alignment strategies for multi-view seal images to further reduce error and improve image correction accuracy under challenging imaging conditions.
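The GRL used here for domain-invariant feature learning is a standard construction: identity in the forward pass, negated (scaled) gradient in the backward pass. A minimal PyTorch sketch follows, with the feature width and scaling factor as assumptions.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; backward multiplies the gradient by -lambda, so the
    backbone is pushed toward features the domain classifier cannot separate."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def grl(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

# Usage sketch: seal features flow unchanged into the domain discriminator;
# during backpropagation the reversed gradient trains domain invariance.
feats = torch.randn(16, 256, requires_grad=True)
domain_logits = torch.nn.Linear(256, 2)(grl(feats, lamb=0.5))
```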
Overview of Security Issues and Defense Technologies for Low Earth Orbit Satellite Network
DU Xingkui, SHU Nina, LIU Chunsheng, YANG Fang, MA Tao, LIU Yang
Available online  , doi: 10.11999/JEIT240957
Abstract:
  Significance   In recent years, Low-Earth Orbit (LEO) satellite networks have experienced rapid development, demonstrating broad application prospects in mobile communications, the Internet of Things (IoT), maritime operations, and other domains. These networks are poised to become a critical component of next-generation network architectures. Currently, leading global and domestic commercial entities are actively deploying mega-constellations to enable worldwide mobile communication and broadband internet services. However, as the scale of LEO constellations expands, the satellite networks are increasingly exposed to both anthropogenic threats (e.g., cyberattacks) and environmental hazards (e.g., space debris). Existing review studies have systematically summarized research on security threats and defense mechanisms across the physical, network, and application layers of LEO satellite networks. Nevertheless, gaps remain in the prior literature. First, a lack of technical granularity: many studies provide taxonomies of security issues but fail to focus sufficiently on domain-specific cybersecurity challenges or delve into technical details. Second, an overemphasis on integrated space-terrestrial networks: existing reviews often prioritize the broader context of space-air-ground-sea integrated networks, obscuring the unique vulnerabilities inherent to LEO satellite architectures. Third, an imbalanced layer-specific analysis: current works predominantly address physical and link-layer security while insufficiently highlighting the distinct characteristics of network-layer threats. Building upon prior research, this paper presents a comprehensive review of security challenges and defense technologies in LEO satellite networks. By analyzing the inherent vulnerabilities of these systems, we provide an in-depth exploration of security threats, particularly those targeting network-layer integrity. Furthermore, we critically evaluate cutting-edge defense mechanisms developed to mitigate realistic threats, offering insights into their technical principles and implementation challenges.  Progress   This paper first elaborates on the architecture of LEO satellite networks, systematically analyzing the composition and functional roles of three core components: the space segment, ground segment, and user segment. It then summarizes the operational characteristics of LEO networks, including their dynamic multi-layer topology, globally ubiquitous coverage, low-latency data transmission, and resilient resource allocation mechanisms. These intrinsic characteristics fundamentally enable LEO networks to deliver high-quality communication services. Subsequently, this study identifies potential vulnerabilities across four dimensions: nodes, links, protocols, and infrastructure. Due to the open nature of satellite links, transmitted data are susceptible to eavesdropping, where adversaries may intercept satellite signals, predict orbital dynamics, and deploy surveillance systems preemptively. Prior research has addressed satellite communication security through physical-layer security designs and scenario-specific eavesdropping analyses. Through theoretical modeling and case studies, this work categorizes multiple Denial-of-Service (DoS) attack variants and explores routing attack risks inherent to the open architecture of LEO networks. Furthermore, it classifies electronic countermeasure interference types based on target scenarios and adversarial objectives. 
To counter these threats, the paper evaluates emerging defense technologies, including encryption-based security frameworks, resilient routing protocols, and digital twin-enabled virtualization platforms for network simulation and secure design optimization. Finally, it highlights cutting-edge AI-driven security solutions, such as machine learning-powered anomaly detection and federated learning for distributed threat intelligence.  Conclusions  This review critically examines the evolution of LEO satellite networks, identifying critical gaps in systematic analysis and comprehensive threat coverage within existing studies. By establishing a four-dimensional vulnerability framework—node vulnerabilities arising from harsh space environmental conditions, link vulnerabilities exacerbated by high orbital dynamics, protocol vulnerabilities stemming from commercial standardization compromises, and infrastructure vulnerabilities due to tight coupling with terrestrial internet systems—the study systematically classifies security threats across physical, network, and application layers. The paper further dissects attack methodologies unique to each threat category and evaluates advanced countermeasures. Notable innovations include quantum cryptography-enhanced encryption systems, fault-tolerant routing algorithms, virtualized network emulation environments, and AI-empowered security paradigms leveraging deep learning and federated learning architectures. These technologies not only significantly enhance the security posture of LEO networks but also demonstrate transformative potential for future adaptive security frameworks. However, challenges persist in balancing computational overhead with real-time operational constraints, necessitating further research into lightweight cryptographic primitives and cross-domain collaborative defense mechanisms. This synthesis provides a foundational reference for advancing next-generation satellite network security while underscoring the imperative for interdisciplinary innovation in space-terrestrial converged systems.  Prospects   Looking ahead, research on the security of LEO satellite networks will constitute a long-term and complex process. With the integration of emerging technologies such as quantum communication and artificial intelligence, security defense mechanisms in LEO satellite networks will evolve toward greater intelligence and automation. Emerging technologies are anticipated to play increasingly critical roles in this domain, particularly through advancements in adaptive intelligent networking technologies and intelligent networking protocol architectures. These developments will support the efficient convergence of space-air-ground-sea integrated networks. The application of deep learning methodologies to analyze network characteristics and construct corresponding neural network models will further enhance network adaptability and coordination. Concurrently, as commercial deployment of LEO satellite networks accelerates, the critical challenge of balancing security requirements with economic efficiency warrants in-depth investigation. Future research should prioritize cost-benefit analyses and explore optimal trade-offs between cybersecurity and service efficiency across diverse application scenarios. 
Furthermore, international collaboration is expected to assume a pivotal role in the security governance of LEO satellite networks, particularly through jointly establishing international standards and regulatory frameworks to address transnational security threats. This multilateral approach will be essential for maintaining the integrity and resilience of next-generation satellite infrastructures in an increasingly interconnected orbital environment.
Codebook Attack and Camouflage Solution in Intelligent Reflective Surface-aided Wireless Communications
LI Runyu, PENG Wei, ZHOU JianLong
Available online  , doi: 10.11999/JEIT240991
Abstract:
  Objective  Intelligent Reflective Surface (IRS) technology has demonstrated significant potential in enhancing Physical Layer Security (PLS). While the use of IRS to support PLS has been extensively studied, there is limited research addressing the security challenges inherent to the IRS system itself. In particular, when facing an attacker, obtaining the real-time codebook is crucial for mastering the entire IRS cascaded channel. The IRS controller, an IoT device with limited computational resources and security assurances, stores the real-time codebook and serves as the system's Achilles' heel. This paper proposes a new type of attack, the Controller Manipulation Attack (CMA). The CMA can be executed by an attacker who either compromises the IRS controller or infects it with malware, allowing for the malicious manipulation of phase shifts, which can degrade the rate of legitimate communication. Additionally, an attacker can retrieve the codebook information by exploiting the vulnerabilities of the IRS controller. Due to hardware constraints, the controller is a vulnerable, zero-trust device, making it easier for attackers to gain access to the codebook. With knowledge of the IRS geometric structure, operating frequency, codebook, and the location of the Base Station (BS), an attacker can infer the direction of the main lobe beam, thereby enabling more efficient passive eavesdropping. This passive eavesdropping represents a serious threat, especially in high-frequency scenarios with narrow beams, and is more covert than traditional pilot contamination attacks.  Methods  To address the codebook attack, a lightweight camouflage method is proposed at the physical layer. In this approach, the IRS phase shifts—termed the camouflage codebook—comprise both the real codebook and a fabricated one designed to deceive potential attackers. A subset of IRS elements is configured to produce ostensible phase shifts corresponding to the fake codebook. These elements do not radiate energy, serving solely to mislead attackers. Therefore, even if an attacker compromises the IRS controller and accesses the codebook, the retrieved information remains ineffective. To quantify the level of security provided, the Codebook-Secrecy-Rate (CSR) is defined as the difference in data rates between the real and camouflage codebooks. The optimization of discrete phase shifts for the IRS is formulated as an inner product maximization problem. Leveraging the structural properties of this formulation, a Divide-and-Sort (DaS) algorithm is proposed. This algorithm achieves global optimality with a computational complexity of O(2^B N). Based on the DaS solution, the CSR is maximized in the following steps: the optimal codebook for signal enhancement is first derived; subsequently, a subset of IRS elements is phase-shifted by π to act as inactive units providing destructive interference. Finally, a Tabu Search (TS) algorithm is employed to determine the optimal topology of the codebook configuration.  Results and Discussions  Simulation results confirm the performance of the proposed solution. Experiments are conducted across four IRS configurations. When the number of IRS elements exceeds 1,000 and each unit operates with 1-bit phase resolution, the average CSR reaches approximately 15–20 bit/(s·Hz), as shown in Fig. 5. 
Monte Carlo simulations evaluate the relationship between the number of active elements N_T and the total number of elements N. A linear correlation is observed, as depicted in Fig. 6. The CSR reaches its maximum when approximately half of the IRS units are active. In practical IRS-assisted communication systems, selecting the number of active units within the interval [N/2, N] offers a trade-off between signal enhancement and security. When the size of the real codebook approaches that of the fake codebook, the constructive gain from the real codebook is largely neutralized by the interference from the fake codebook. This configuration corresponds to the maximum achievable CSR for the system.  Conclusions  This study considers the codebook attack in IRS-aided communication systems and proposes a physical-layer camouflage codebook solution. Owing to the limited computational capacity of the IRS controller, which restricts the implementation of conventional security protocols, the controller remains vulnerable to compromise. An attacker with access to the IRS geometric structure, operating frequency, codebook, and BS location can infer the main lobe beam direction, facilitating efficient passive eavesdropping. In the proposed method, IRS elements are divided into two groups: one group operates normally to enhance legitimate signals, while the other is configured to generate deceptive phase shifts without energy radiation. This arrangement produces a camouflage codebook. Even if attackers gain control of the IRS controller, the obtained codebook includes phase information associated with inactive elements, resulting in a misleading beamforming pattern. To quantify the security level, the CSR is introduced. The optimization of the camouflage codebook is formulated as an inner product maximization problem. A DaS algorithm is used to derive the optimal codebook for signal enhancement, followed by TS to determine the phase shift topology that maximizes CSR. Simulation results support the effectiveness of the proposed approach.
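The inner-product maximization behind the codebook design can be illustrated with the element-wise projection below, in which a coarse sweep over a common rotation stands in for the sorting step that lets DaS reach the global optimum in O(2^B N). The channel realization, array size, and 1-bit phase resolution are assumptions.

```python
import numpy as np

def quantize_phases(h, B, mu=0.0):
    """Align each IRS phase with the cascaded channel up to a common
    rotation mu, then snap to the nearest of 2**B discrete levels."""
    levels = 2 * np.pi * np.arange(2 ** B) / (2 ** B)
    target = mu - np.angle(h)               # ideal continuous phases
    dist = np.abs(np.exp(1j * (levels[None, :] - target[:, None])) - 1)
    return levels[np.argmin(dist, axis=1)]

rng = np.random.default_rng(0)
N, B = 256, 1
h = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # cascaded channel

# Coarse DaS-style search: sweep the common rotation and keep the best
# quantized codebook; DaS sorts candidate rotations instead of sampling.
best_val, best_mu = max(
    (abs(np.sum(h * np.exp(1j * quantize_phases(h, B, mu)))), mu)
    for mu in np.linspace(0, 2 * np.pi / 2 ** B, 64, endpoint=False))
print(f"best inner-product magnitude: {best_val:.2f}")
```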
Research on Unmanned Aircraft Radio Frequency Signal Recognition Algorithm Based on Wavelet Entropy Features
LIU Bing, SHI Mingxin, LIU Jiaqi
Available online  , doi: 10.11999/JEIT250051
Abstract:
  Objective   With the rapid development and broad application of Unmanned Aerial Vehicle (UAV) technology, its use in military reconnaissance, agricultural spraying, logistics, and film production presents growing challenges in signal classification and safety supervision. Accurate classification of UAV Radio Frequency (RF) signals in complex electromagnetic environments is critical for real-time flight monitoring, autonomous obstacle avoidance, and communication reliability in multi-agent coordination. However, conventional recognition methods exhibit limitations in both feature extraction and classification accuracy, particularly under interference or multipath propagation, which severely reduces recognition performance and constrains practical implementation. To address this limitation, this study proposes a recognition algorithm based on wavelet entropy features and an optimized Support Vector Machine (SVM). The method enhances classification accuracy and robustness by extracting wavelet entropy features from UAV RF signals and optimizing SVM parameters using the Great Cane Rat Optimization Algorithm (GCRA). The proposed approach offers a reliable strategy for UAV signal identification under complex electromagnetic conditions. The results contribute to UAV airspace regulation and unauthorized flight detection and establish a foundation for future applications, including autonomous navigation and intelligent route planning. This work holds both theoretical value and practical relevance for supporting the secure and standardized advancement of UAV systems.   Methods   This study adopts a systematic approach to achieve accurate classification and recognition of UAV RF signals, including four key stages: data acquisition, feature extraction, classifier design, and performance verification. First, the publicly available DroneRFa dataset is selected as the experimental dataset. It contains RF signals from 24 mainstream UAV models (e.g., DJI Phantom 3, DJI Inspire 2) across three ISM frequency bands—915 MHz, 2.4 GHz, and 5.8 GHz (Fig. 1). Data collection follows a “pick-store-pick-store” protocol to preserve signal integrity and ensure accurate classification. During preprocessing, 50,000 sampling points are extracted from each channel (RF0_I, RF0_Q, RF1_I, RF1_Q), balancing data continuity and feature representativeness under hardware read/write constraints. Signal magnitudes are normalized to eliminate amplitude-related bias. For feature extraction, a three-level wavelet transform using the Daubechies “db4” wavelet is applied to decompose the signal at multiple scales. A four-dimensional feature matrix is constructed by computing wavelet spectral entropy (Figs. 2 and 3), which captures both time-frequency characteristics and energy distribution. Feature differences among UAV models are confirmed by F-test analysis (Table 1), establishing a robust foundation for classification. In the classifier design stage, the GCRA is applied to optimize the penalty parameter C and Gaussian kernel parameter γ of the SVM. The classification error rate serves as the fitness function during optimization (Fig. 5). Inspired by the foraging behavior of cane rats, GCRA offers improved global search performance. Finally, algorithm performance is evaluated using 10-fold cross-validation and benchmarked against unoptimized SVM, PSO-SVM, GA-SVM, and GWO-SVM (Table 3), demonstrating the robustness and reliability of the proposed method.   Results and Discussions   This study yields several key findings. 
For wavelet entropy feature extraction, the F-test confirms that features from all four channels are statistically significant (p < 0.05), demonstrating their effectiveness in distinguishing among UAV models (Table 1). In classifier optimization, the GCRA exhibits strong parameter search capability, with fitness convergence achieved within 50 iterations at approximately 0.03 (Fig. 6). The optimized SVM classifier reaches an average recognition accuracy of 98.5%, representing a 6.8 percentage point improvement over the traditional SVM (Table 3). At the individual model level, the highest recognition accuracy is observed for DJI Inspire 2 (99.0%), with all other models exceeding 97% (Table 2). Confusion matrix analysis indicates that all misclassification rates are below 3% (Table 2, Fig. 7). Notably, under identical experimental conditions, GCRA-SVM outperforms other optimization algorithms—achieving higher accuracy than PSO-SVM (94.7%) and GA-SVM (94.2%)—with lower variance (±0.00032), indicating greater stability (Table 3). These results validate the discriminative power of wavelet entropy features and highlight the enhanced performance and robustness of GCRA-based SVM optimization.   Conclusions   Through systematic theoretical analysis and experimental validation, this study reaches several key conclusions. The wavelet entropy-based feature extraction method effectively captures the time-frequency characteristics of UAV RF signals. By employing multi-scale decomposition and energy distribution analysis, it accurately identifies the unique signal features of various UAV models. Statistical tests confirm significant differences among the features of different UAV categories, providing a solid foundation for feature selection in UAV identification. The optimization of SVM parameters using the GCRA substantially enhances classification performance, achieving an average accuracy of 98.5% and a peak of 99% on the DroneRFa dataset, with excellent stability. This method addresses the technical challenge of UAV RF signal recognition in complex electromagnetic environments, with performance metrics fully meeting practical application needs. The findings offer a reliable technical solution for UAV flight supervision and lay the groundwork for advanced applications such as autonomous obstacle avoidance. Future research may focus on evaluating the method’s performance in high-noise environments and exploring fusion strategies with other models. Overall, this study makes significant contributions in terms of both theoretical innovation and engineering application.
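To make the feature-extraction step concrete, the following minimal Python sketch computes one wavelet spectral entropy value per channel using a three-level db4 decomposition, as described in the Methods. It relies on the PyWavelets package; the Shannon-entropy formulation is one common variant rather than necessarily the authors' exact definition, and the random arrays merely stand in for real DroneRFa recordings.

```python
import numpy as np
import pywt

def wavelet_spectral_entropy(x, wavelet="db4", level=3):
    """Shannon entropy (bits) of the wavelet subband energy distribution of x."""
    coeffs = pywt.wavedec(x, wavelet, level=level)   # [cA3, cD3, cD2, cD1]
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    p = energies / energies.sum()                    # normalized energy per subband
    return -np.sum(p * np.log2(p + 1e-12))

# One entropy value per channel -> a four-dimensional feature vector.
rng = np.random.default_rng(0)
channels = {name: rng.standard_normal(50_000)        # stand-in for real I/Q samples
            for name in ("RF0_I", "RF0_Q", "RF1_I", "RF1_Q")}
features = np.array([wavelet_spectral_entropy(x) for x in channels.values()])
print(features)   # one row of the 4-D feature matrix for a single recording
```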
Research on an EEG-based Neurofeedback System for the Auxiliary Intervention of Post-Traumatic Stress Disorder
TAN Lize, DING Peng, WANG Fan, LI Na, GONG Anmin, NAN Wenya, LI Tianwen, ZHAO Lei, FU Yunfa
Available online  , doi: 10.11999/JEIT250093
Abstract:
  Objective  The ElectroEncephaloGram (EEG)-based Neurofeedback Regulation (ENR) system is designed for real-time modulation of dysregulated stress responses to reduce symptoms of Post-Traumatic Stress Disorder (PTSD) and anxiety. This study evaluates the system’s effectiveness and applicability using a series of neurofeedback paradigms tailored for both PTSD patients and healthy participants.  Methods  Employing real-time EEG monitoring and feedback, the ENR system targets the regulation of alpha wave activity to alleviate mental health symptoms associated with dysregulated stress responses. The system integrates MATLAB and Unity3D to support a complete workflow for EEG data acquisition, processing, storage, and visual feedback. Experimental validation includes both PTSD patients and healthy participants to assess the system’s effects on neuroplasticity and emotional regulation. Primary assessment indices include changes in alpha wave dynamics and self-reported reductions in stress and anxiety.  Results and Discussions  Compared with conventional therapeutic methods, the ENR system shows significant potential in reducing symptoms of PTSD and anxiety. During functionality tests, the system effectively captures and regulates alpha wave activity, enabling real-time and efficient neurofeedback. Dynamic adjustment of feedback thresholds and task paradigms allows participants to improve stress responses and emotional states following training. Quantitative data indicate clear enhancements in EEG pattern modulation, while qualitative assessments reflect improvements in participants’ self-reported stress and anxiety levels.  Conclusions  This study presents an effective and practical EEG-based neurofeedback regulation system that proves applicable and beneficial for both individuals with PTSD and healthy participants. The successful implementation of the system provides a new technological approach for mental health interventions and supports ongoing personalized neuroregulation strategies. Future research should explore broader applications of the system across neurological conditions to fully assess its efficacy and scalability.
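As an illustration of the kind of computation such a neurofeedback loop performs, the sketch below estimates relative alpha-band (8–12 Hz) power from a short EEG window and compares it with a feedback threshold. This is a generic sketch, not the MATLAB/Unity3D implementation described above; the sampling rate, threshold, and data are placeholder assumptions.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.signal import welch

def relative_alpha_power(eeg, fs=250.0, band=(8.0, 12.0)):
    """Relative alpha-band power of one EEG window via a Welch PSD."""
    f, psd = welch(eeg, fs=fs, nperseg=min(len(eeg), int(2 * fs)))
    mask = (f >= band[0]) & (f <= band[1])
    return trapezoid(psd[mask], f[mask]) / trapezoid(psd, f)

# Minimal feedback decision: reward the participant when relative alpha
# power exceeds an individually tuned threshold (fixed stand-in value here).
rng = np.random.default_rng(1)
window = rng.standard_normal(500)    # stand-in for 2 s of single-channel EEG
threshold = 0.2                       # in practice adapted per participant/session
print("feedback ON" if relative_alpha_power(window) > threshold else "feedback OFF")
```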
Swin Transformer-based Wideband Wireless Image Transmission Semantic Joint Encoding and Decoding Method
SHEN Bin, LI Xuan, LAI Xuebing, YANG Shuhan
Available online  , doi: 10.11999/JEIT250039
Abstract:
  Objective  Conventional studies on image semantic communication primarily address simplified channel models, such as Gaussian and Rayleigh fading channels. However, real-world wireless communication environments are characterized by complex multipath fading, which necessitates advanced signal processing at both the transmitter and receiver. To address this challenge, this paper proposes a Wideband Wireless Image Transmission Semantic Communication (WWIT-SC) system based on the Swin Transformer. The proposed method enhances image transmission performance in multipath fading channels through end-to-end semantic joint encoding and decoding.  Methods  The WWIT-SC system adopts the Swin Transformer as the core architecture for semantic encoding and decoding. This network not only processes semantic image representations but also improves adaptability to complex channel conditions through a joint mechanism based on Channel State Information (CSI) and Coordinate Attention (CA). CSI, a key signal in wireless systems, enables accurate estimation of channel conditions. However, due to temporal variations in wireless channels, CSI is often subject to attenuation and distortion, reducing its effectiveness when used in isolation. To address this limitation, the system incorporates a CSI-guided CA mechanism that enables fine-grained mapping and adjustment of semantic features across subcarriers. This mechanism integrates spatial and channel-domain features to localize critical information adaptively, thereby accommodating the channel’s time-varying behavior. A Channel Estimation Subnetwork (CES) is further implemented at the receiver to correct CSI estimation errors introduced by noise and dynamic channel variations. The CES enhances CSI accuracy during decoding, resulting in improved semantic image reconstruction quality.  Results and Discussions   The WWIT-SC and CA-JSCC models are trained under fixed Signal-to-Noise Ratio (SNR) conditions and evaluated at the same SNR values. Across all SNR levels, the WWIT-SC model consistently outperforms CA-JSCC. Specifically, Peak Signal-to-Noise Ratio (PSNR) improves by 6.4%, 8.5%, and 9.3% at different bandwidth ratios (Fig. 4). Both models are also trained using SNR values randomly selected from the range [0, 15] dB and tested at various SNR levels. Although random SNR training leads to reduced overall performance compared to fixed SNR training, WWIT-SC maintains superior performance over CA-JSCC across all conditions. Under these settings, PSNR gains of up to 6.8%, 8.3%, and 9.8% are achieved at different bandwidth ratios (Fig. 4). Further evaluation is conducted by training both models on randomly cropped ImageNet images and testing them on the Kodak dataset. The WWIT-SC model trained on the larger dataset achieves up to a 4% PSNR improvement over CA-JSCC on Kodak (Fig. 6). A series of ablation experiments are conducted to assess the contributions of each module in WWIT-SC. First, the Swin Transformer is replaced with the Feature Learning (FL) module from CA-JSCC. Across all three bandwidth ratios, PSNR values for WWIT-SC exceed those of the modified WWIT-SC-FL variant at all SNR levels (Fig. 5(a)), confirming the importance of multi-scale feature extraction. Next, the CSI-CA module is replaced with the Channel Learning (CL) module from CA-JSCC. Again, WWIT-SC outperforms the modified WWIT-SC-CL model across all bandwidth ratios and SNR values (Fig. 
5(b)), highlighting the role of the long-range dependency mechanism in enhancing feature localization and adaptation. Finally, the CES is removed to assess its contribution. The original WWIT-SC model consistently achieves higher PSNR values than the variant without CES at all bandwidth ratios and SNR levels (Fig. 5(c)), demonstrating that the inclusion of CES substantially improves channel decoding accuracy.  Conclusions  This paper proposes a Swin Transformer-based WWIT-SC system, integrating Orthogonal Frequency Division Multiplexing (OFDM) technology to enhance semantic image transmission under multipath fading channels. The scheme employs the Swin Transformer as the backbone for the semantic encoder-decoder and incorporates a CSI-assisted CA mechanism to accurately map critical semantic features to subcarriers, adapting to time-varying channel conditions. In addition, a CES at the receiver compensates for channel estimation errors, improving CSI accuracy. Experimental results show that, compared to CA-JSCC, the WWIT-SC system achieves up to a 9.8% PSNR improvement. This work presents a novel solution for semantic image transmission in complex broadband wireless communication environments.
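For readers unfamiliar with the CA building block referenced above, the following PyTorch sketch implements a plain Coordinate Attention module that pools along the two spatial axes separately and gates features with the resulting attention maps. The CSI-guided variant and the subcarrier mapping described in the paper are not reproduced here, and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Minimal Coordinate Attention: axis-wise pooling + per-axis gating."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 4)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)               # (B, C, H, 1)
        pooled_w = x.mean(dim=2, keepdim=True).transpose(2, 3)  # (B, C, W, 1)
        y = self.act(self.conv1(torch.cat([pooled_h, pooled_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                # height-wise gates
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))  # width-wise gates
        return x * a_h * a_w

x = torch.randn(2, 32, 16, 16)
print(CoordAttention(32)(x).shape)   # torch.Size([2, 32, 16, 16])
```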
Exploration of Application of Artificial Intelligence Technology in Underwater Acoustic Network Routing Protocols
ZHAO Yihao, CHEN Yougan, LI Jianghui, WAN Lei, TAO Yi, WANG Xuchen, DONG Yanhan, TU Shen’ao, XU Xiaomei
Available online  , doi: 10.11999/JEIT250110
Abstract:
  Significance   In response to the strategic emphasis on maritime power, China has experienced growing demand for ocean resource exploration, ecological monitoring, and defense applications. Underwater acoustic networks provide an effective solution for data acquisition in these domains, with network performance largely dependent on the design and implementation of routing protocols. These protocols determine the transmission path and method, forming a foundation for optimizing underwater communication. Recent advances in Artificial Intelligence (AI) have prompted efforts to apply AI techniques to underwater acoustic network routing. By leveraging AI’s learning capacity, data insight capability, and adaptability, researchers aim to address challenges posed by dynamic underwater environments, energy limitations of nodes, and potential security threats. This paper examines the integration of AI technology into underwater acoustic network routing protocols and provides a critical evaluation of current research progress.   Progress   This paper reviews the application of AI technology in underwater acoustic network routing protocols, classifying existing approaches into flat and hierarchical routing categories. In flat routing, AI methods such as conventional heuristic algorithms, reinforcement learning, and deep learning have been applied to improve routing decisions. For hierarchical routing, AI is utilized not only for routing optimization but also for node clustering and layer structuring. These applications offer potential benefits, including enhanced routing efficiency, reduced energy consumption, improved end-to-end delay, and strengthened network security. Most performance evaluations are based on simulations. However, simulation environments vary considerably across studies, particularly in node quantity and density, ranging from small-scale to very large-scale networks. This variability complicates quantitative comparisons of performance metrics. Additionally, replicating these simulation scenarios in sea trials is limited by the logistical and financial constraints of deploying and recovering large numbers of nodes, thus impeding the validation of protocol performance under real-world conditions. The review further identifies critical challenges in applying AI to underwater acoustic networks. Many AI-based protocols operate under impractical assumptions, such as global knowledge of node positions and energy levels, which is rarely achievable in dynamic underwater settings. Maintaining such information requires substantial communication overhead, thereby increasing energy consumption and delay. Furthermore, the computational complexity of AI algorithms—particularly deep learning models—presents difficulties for implementation on underwater nodes with limited power, processing, and storage capacities. Few studies provide detailed complexity analyses, and hardware-based performance verifications remain scarce. This lack of real-world validation limits the assessment of the practical feasibility and effectiveness of AI-enabled routing protocols.  Conclusions  AI technology offers considerable potential for enhancing underwater acoustic network routing protocols by addressing key challenges such as environmental variability, energy constraints, and security threats. However, current research is constrained by several limitations. 
Many studies rely on unrealistic assumptions regarding the availability of complete node information, which is impractical in dynamic underwater settings. The acquisition and maintenance of such information entail substantial communication overhead, leading to increased energy consumption and delay. Moreover, the computational demands of AI algorithms—particularly deep learning models—often exceed the capabilities of resource-limited underwater nodes. Performance assessments remain predominantly simulation-based, with limited hardware implementation, thereby restricting the verification of real-world feasibility and effectiveness.  Prospects  Future research should prioritize the development of more accurate and realistic simulation platforms to support the evaluation of AI-based routing protocols. This includes the integration of advanced channel models and real-world observational data to improve simulation fidelity. Establishing standardized simulation conditions will also be essential for enabling consistent performance comparisons across studies. In parallel, greater emphasis should be placed on hardware implementation of AI algorithms, with efforts directed toward reducing algorithmic complexity and storage demands to accommodate the limitations of energy-constrained underwater nodes. Exploring cost-effective validation approaches, such as small-scale sea trials and semi-physical simulation frameworks, will be critical for assessing the practical performance and deployment feasibility of AI-enabled routing protocols.
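To illustrate the reinforcement-learning formulation that many of the surveyed flat-routing protocols adopt, the toy Python sketch below learns a next-hop choice with tabular Q-learning on a four-node topology. The topology, reward shaping, and hyperparameters are invented for illustration and do not correspond to any specific protocol in the review.

```python
import random

nodes = ["A", "B", "C", "SINK"]
neighbors = {"A": ["B", "C"], "B": ["C", "SINK"], "C": ["SINK"], "SINK": []}
Q = {(n, m): 0.0 for n in nodes for m in neighbors[n]}
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def reward(nxt):
    # Favor reaching the sink; penalize each extra acoustic hop (energy/delay).
    return 10.0 if nxt == "SINK" else -1.0

for _ in range(500):                          # training episodes
    node = "A"
    while node != "SINK":
        nbrs = neighbors[node]
        nxt = (random.choice(nbrs) if random.random() < eps
               else max(nbrs, key=lambda m: Q[(node, m)]))
        future = max((Q[(nxt, k)] for k in neighbors[nxt]), default=0.0)
        Q[(node, nxt)] += alpha * (reward(nxt) + gamma * future - Q[(node, nxt)])
        node = nxt

print(max(neighbors["A"], key=lambda m: Q[("A", m)]))  # learned next hop from A
```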
Millimeter-wave Radar Point Cloud Gait Recognition Method Under Open-set Conditions Based on Similarity Prediction and Automatic Threshold Estimation
DU Lan, LI Yiming, XUE Shikun, SHI Yu, CHEN Jian, LI Zhenfang
Available online  , doi: 10.11999/JEIT241034
Abstract:
  Objective  Radar-based gait recognition systems are typically developed under closed-set assumptions, limiting their applicability in real-world scenarios where unknown individuals frequently occur. This constraint presents challenges in security-critical settings such as surveillance and access control, where both accurate recognition of known individuals and reliable exclusion of unknown identities are essential. Existing methods often lack effective mechanisms to differentiate between known and unknown classes, leading to elevated false acceptance rates and security risks. To overcome this limitation, this study proposes an open-set recognition framework that integrates a similarity prediction network with an adaptive thresholding method based on Extreme Value Theory (EVT). The framework models the score distributions of known and unknown classes to enable robust identification of unfamiliar identities without requiring samples from unknown classes during training. The proposed method enhances the robustness and applicability of millimeter-wave radar-based gait recognition under open-set conditions, supporting its deployment in operational environments.  Methods  The proposed method comprises four key modules: point cloud feature extraction network training, similarity prediction network training, automatic threshold estimation, and open-set testing. A sequential training strategy is adopted to ensure robust learning. First, the point cloud feature extraction network is trained with a triplet loss function that encourages intra-class compactness and inter-class separability by pulling same-class samples closer and pushing different-class samples apart. This enables the network to learn stable and discriminative representations, even under variations in viewpoint or clothing. The extracted features are then input into a similarity prediction network trained to model the score distributions of known and unknown identities. By incorporating score-based constraints, the network learns a decision space in which known and unknown classes are more effectively separated. Following network optimization, an EVT-based thresholding module is employed. This module dynamically models the tail distributions of similarity scores and automatically determines a class-agnostic threshold by minimizing the joint false acceptance and false rejection rates. This adaptive and theoretically grounded strategy enhances the separation between known and unknown classes in the similarity space. Together, these modules improve the stability and accuracy of radar-based gait recognition under open-set conditions, supporting reliable operation in real-world scenarios where unfamiliar individuals may appear.  Results and Discussions  The proposed method improves distributional separation between known and unknown classes in the similarity score space through the similarity prediction network and distinguishes them effectively using adaptive thresholding. Experimental results show that the method consistently yields higher F1 scores across all openness levels compared with baseline approaches, indicating strong robustness to open-set variations (Table 1). Specifically, the method achieves an 87% recognition rate for known classes and a 96% rejection rate for unknown classes, outperforming all comparison methods (Fig. 7). Ablation experiments confirm that incorporating the similarity prediction module enhances recognition performance under high openness. 
Manually set thresholds, while effective under low openness, show substantial performance degradation under large openness (F1 score: 43.93%). In contrast, the proposed automatic thresholding module demonstrates superior generalization, improving the F1 score by 22.88% under large openness conditions (Table 2). Further analysis shows that the method significantly increases the score distribution gap between known and unknown classes, contributing to improved recognition reliability (Fig. 8). Comparative evaluations (Table 3) confirm that the method achieves superior open-set recognition performance. In addition, the employed point cloud feature extraction network captures temporal features at multiple time scales and uses an attention-based mechanism to adaptively aggregate information across frames and temporal resolutions. This contributes to more robust gait representations and further improves open-set recognition performance compared with other feature extraction networks (Table 4).  Conclusions  Building on previous work on robust feature extraction under complex covariate conditions, this study extends millimeter-wave radar point cloud gait recognition to open-set scenarios. The proposed method preserves the recognition strength of the original feature extraction network and enhances class discriminability by integrating a similarity prediction network. To address the limitations of manually defined rejection thresholds, an automatic threshold determination module based on EVT is introduced. Extensive experiments using measured millimeter-wave radar point cloud gait data confirm that the method reliably distinguishes between known and unknown individuals, demonstrating its effectiveness and robustness under open-set conditions.
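The following Python sketch illustrates the EVT idea behind the automatic thresholding module: fit a Generalized Pareto Distribution to the lower tail of known-class similarity scores and read the rejection threshold off a target tail probability. The scores here are synthetic, and the paper's joint minimization of false acceptance and false rejection rates is simplified to a single tail quantile.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
known_scores = rng.beta(8, 2, 2000)            # stand-in similarity scores

u = np.quantile(known_scores, 0.05)            # tail anchor: lowest 5% of scores
excess = u - known_scores[known_scores < u]    # exceedances below the anchor
shape, loc, scale = genpareto.fit(excess, floc=0.0)

p_reject = 0.01                                # target false-rejection probability
# P(score < t) = p_reject  <=>  P(excess > u - t) = p_reject / 0.05
t = u - genpareto.ppf(1 - p_reject / 0.05, shape, loc=0.0, scale=scale)
print(f"accept as known if similarity score >= {t:.3f}")
```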
Hierarchical Network-Based Multi-Task Learning Method for Fishway Water Level Prediction
SU Xin, QIN Zijian, JIA Lv, QIN Mingyu
Available online  , doi: 10.11999/JEIT241003
Abstract:
  Objective  The construction of dams and other large-scale water infrastructure projects has significant ecological consequences, particularly affecting fish migration patterns. These environmental changes pose substantial challenges to biodiversity conservation and resource management. One of the key challenges is the accurate and real-time prediction of water levels in fish passages, which is essential for mitigating the negative effects of dams on fish migration, maintaining ecological balance, and ensuring the sustainability of aquatic species. Traditional water level monitoring systems often face limitations, such as insufficient coverage, lack of real-time predictive capabilities, and an inability to capture complex temporal dependencies in water level fluctuations, leading to inaccurate or delayed predictions. Furthermore, the processing of long-term, high-dimensional water level data in dynamic environments remains a critical gap in existing systems. To address these issues, this study proposes a Hierarchical Network-based Fish Passage Monitoring System (HNFMS) and a novel Multi-Task (MT) learning model, Adaptive Sequence Self-Organizing Map Transformation based on Variational Mode Decomposition (AS-SOMVT). The HNFMS aims to enhance both the efficiency and coverage of water level monitoring by providing comprehensive and timely data. The AS-SOMVT model employs auxiliary sequences to improve prediction accuracy and manage dynamic, multi-dimensional water level data in real time. Through these innovations, this study aims to enhance fish passage monitoring, mitigate the ecological impact of dam construction on fish migration, and provide a robust tool for ecological conservation and resource management.  Methods  The HNFMS integrates a hierarchical network structure to improve both the efficiency and coverage of water level monitoring. To address the complex temporal dependencies inherent in water level fluctuations, this study introduces the AS-SOMVT MT learning model. This model leverages auxiliary sequences to enhance the ability to capture complex temporal relationships, ensuring accurate water level predictions. The approach enables real-time processing of multi-dimensional water level data, effectively managing the complexity of fluctuating water levels across varying conditions. Additionally, the study incorporates an Auxiliary Sequence Self-Organizing Map (AS-SOM) algorithm to optimize prediction efficiency for long sequences, further enhancing the model’s capacity to process high-dimensional, multi-variate water level data. The model also integrates a Variational Mode Decomposition (VMD) technique, which decomposes complex water level time series into different frequency components. This approach extracts key feature patterns with higher predictive value while filtering out noise and redundant information, improving data quality and enhancing the model’s predictive performance. To increase the robustness of the system, the study incorporates an ensemble of diverse machine learning techniques, including both deep learning models and traditional statistical methods. This ensemble is designed to adapt to varying environmental conditions and ensure robust performance across different situations.  Results and Discussions  The AS-SOMVT model significantly outperforms traditional models in water level prediction accuracy. 
The integration of auxiliary sequences allows the model to capture complex temporal dependencies more effectively, resulting in more reliable real-time predictions (Fig. 4). Furthermore, the incorporation of VMD improves the model’s ability to remove noise and extract crucial features, enhancing its adaptability to dynamic water level changes in real-world environments. Ablation experiments demonstrate that removing key components, such as feature Relationship modeling (Rel), Attention Pooling (AP), or MT Learning, leads to a substantial decline in model performance. This highlights the essential role these components play in improving predictive accuracy and managing complex patterns. Specifically, the removal of any of these components results in a marked decrease in precision and stability, highlighting the collaborative contribution of these elements within the MT learning framework. In multi-dimensional water level prediction tasks, the AS-SOMVT model performs exceptionally well, especially in dynamic environments. Additionally, the hierarchical structure of the HNFMS substantially enhances monitoring efficiency and coverage, providing more accurate and comprehensive water level data through real-time model adjustments (Fig. 8). In comparative experiments, the AS-SOMVT model consistently outperforms traditional models, particularly in forecasting multi-dimensional water levels, establishing it as a powerful tool for large-scale, real-time monitoring applications (Table 4).  Conclusions  The proposed HNFMS, combined with the AS-SOMVT MT learning model, offers an effective solution for real-time, accurate water level prediction in fish passages. This innovative approach not only enhances the efficiency and coverage of water level monitoring systems but also provides a valuable tool for mitigating the ecological impacts of dam constructions on fish migration. The integration of auxiliary sequences into the MT learning model has proven to be a critical factor in improving predictive performance, opening new opportunities for ecological conservation. As concerns about the ecological impacts of water infrastructure projects grow, the development of more accurate and efficient water level monitoring systems becomes increasingly vital for informing policy decisions, designing fish-friendly structures, and enhancing aquatic ecosystem management. This study presents a scientifically significant and practically necessary solution for promoting sustainable environmental practices. The integration of advanced machine learning techniques, such as MT learning and VMD, ensures the system can handle both short-term and long-term water level prediction tasks, addressing the complexities of environmental dynamics in real time. This research, therefore, makes a significant contribution to the field of environmental monitoring and provides essential insights for the future development of eco-friendly infrastructure.
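As a concrete view of the VMD preprocessing step, the sketch below decomposes a synthetic water-level series into a small number of frequency modes. It assumes the third-party vmdpy package (any VMD implementation would serve), and the penalty, mode count, and tolerance values are illustrative rather than the paper's settings.

```python
import numpy as np
from vmdpy import VMD

t = np.linspace(0, 10, 1000)
series = (np.sin(2 * np.pi * 0.2 * t)           # slow seasonal component
          + 0.5 * np.sin(2 * np.pi * 2.0 * t)   # faster fluctuation
          + 0.1 * np.random.default_rng(0).standard_normal(t.size))  # noise

# alpha: bandwidth penalty, tau: noise tolerance, K: number of modes,
# DC: whether to keep a DC mode, init: frequency init, tol: convergence tol.
u, u_hat, omega = VMD(series, alpha=2000, tau=0.0, K=3, DC=0, init=1, tol=1e-7)
print(u.shape)   # (3, 1000): three extracted modes, noise largely isolated
```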
Density Clustering Hypersphere-based self-adaptively Oversampling Algorithm for Imbalanced Datasets
TAO Xinmin, LI Junxuan, GUO Xinyue, SHI Lihang, XU Annan, ZHANG Yanping
Available online  , doi: 10.11999/JEIT241037
Abstract:
  Objective  Learning from imbalanced datasets presents significant challenges for the supervised learning community. Existing oversampling methods, however, have notable limitations when applied to complex imbalanced datasets. These methods can introduce noisy instances, leading to class overlap, and fail to effectively address within-class imbalance caused by low-density regions and small disjuncts. To overcome these issues, this study proposes the Density Clustering Hypersphere-based self-adaptively Oversampling algorithm (DCHO).  Methods  The DCHO algorithm first identifies clustering centers by dynamically calculating the density of minority class instances. Hyperspheres are then constructed around each center to guide clustering, and oversampling is performed within these hyperspheres to reduce class overlap. Oversampling weights are adaptively assigned according to the number of instances and the radius of each hypersphere, which helps mitigate within-class imbalance. To further refine the boundary distribution of the minority class and explore underrepresented regions, a boundary-biased random oversampling technique is introduced to generate synthetic samples within each hypersphere.  Results and Discussions  The DCHO algorithm dynamically identifies clustering centers based on the density of minority class instances, constructs hyperspheres, and assigns all minority class instances to corresponding clusters. This forms the foundation for oversampling. The algorithm further adjusts the influence of the cumulative density of instances within each hypersphere and the hypersphere radius on the allocation of oversampling weights through a defined trade-off parameter α. Experimental results indicate that this approach reduces class overlap and assigns greater oversampling weights to sparse, low-density regions, thereby generating more synthetic instances to improve representativeness and address within-class imbalance (Fig. 7). When the trade-off parameter is set to 0.5, the algorithm effectively incorporates both density and boundary distribution, improving the performance of subsequent classification tasks (Fig. 11).  Conclusions  Comparative results with other popular oversampling algorithms show that: (1) The DCHO algorithm effectively prevents class overlap by oversampling exclusively within the generated hypersphere. Meanwhile, the algorithm adaptively assigns oversampling weights based on the local density of instances within the hypersphere and its radius, thereby addressing the within-class imbalance issue. (2) By considering the relationship between the hypersphere radius and the density of the minority class instances, the balance parameter α is set to 0.5, which comprehensively addresses both the within-class imbalance caused by density and the enhancement of the minority class boundary distribution, ultimately improving classification performance on imbalanced datasets. (3) When applied to highly imbalanced datasets with complex boundaries, DCHO significantly improves the distribution of minority class instances, thereby enhancing the classifier's generalization ability.
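A toy Python sketch of the hypersphere-based oversampling idea follows: dense minority instances are chosen as centers, each sphere's radius comes from nearest-neighbor distances, and sparser spheres receive more synthetic samples. The density estimate, the weighting rule, and the omission of the boundary-biased step are simplifications of the DCHO algorithm, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
minority = rng.normal([0.0, 0.0], 0.5, (30, 2))        # stand-in minority class

# Local density: inverse of the mean distance to the 5 nearest minority neighbors.
dist = np.linalg.norm(minority[:, None] - minority[None], axis=2)
knn_mean = np.sort(dist, axis=1)[:, 1:6].mean(axis=1)
density = 1.0 / knn_mean

idx = np.argsort(density)[-3:]                          # 3 densest points as centers
centers, radii = minority[idx], knn_mean[idx]

# Sparser (larger-radius) spheres get more synthetic samples; this stands in
# for the paper's alpha-weighted trade-off between density and radius.
weights = radii / radii.sum()
counts = np.maximum((weights * 60).astype(int), 1)

def sample_in_sphere(center, r, n):
    """Roughly uniform points inside a 2-D disc of radius r around center."""
    direc = rng.normal(size=(n, 2))
    direc /= np.linalg.norm(direc, axis=1, keepdims=True)
    return center + direc * (r * np.sqrt(rng.uniform(size=(n, 1))))

synthetic = np.vstack([sample_in_sphere(c, r, n)
                       for c, r, n in zip(centers, radii, counts)])
print(synthetic.shape)                                  # new minority samples
```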
High Precision Large Aperture Array Calibration Method for Residual Separation of Near-field Effects in Darkroom
XU Libing, LIU Kaixin
Available online  , doi: 10.11999/JEIT241084
Abstract:
  Objective  Direction estimation is a critical aspect of array signal processing technology. Array errors are inevitably introduced during the manufacturing and installation processes. To mitigate the negative effects of these errors on the accuracy and resolution of direction estimation, arrays must be calibrated before deployment. In practical engineering applications, active array calibration is the primary method, and performing calibration in a darkroom, which shields against electromagnetic wave interference, enhances calibration performance. However, the size of the darkroom is limited. As the array aperture increases, the distance between the calibration source and the array may fail to meet the far-field condition, leading to the introduction of nonlinear near-field phase components in the received signal. Moreover, the absence of a precise positioning system in the darkroom may result in the actual installation position of the calibration source differing from its nominal position, further compromising calibration performance. To address these issues, this paper proposes a high-precision large aperture array calibration method that separates residual near-field effects in the darkroom. This method eliminates the phase residuals caused by deviations in the calibration source position and the near-field distance of the source, effectively calibrates the amplitude and phase errors of large aperture arrays in the darkroom, and thereby improves the accuracy of direction estimation.  Methods  The proposed method utilizes the nominal coordinates of the calibration source to compensate for the phase difference caused by the near-field distance and derives the formula for the near-field effect residual resulting from source position deviations. Next, an array amplitude and phase error estimation technique at nominal coordinates is proposed to solve for the array amplitude and phase error matrix containing the phase residual. This technique constructs a cost function based on the orthogonality between the subspace spanned by the array manifold vector with amplitude-phase errors and the noise subspace of the array's received signals. The least squares method is then applied to solve for the low-precision array error estimation results. To enhance precision, this method employs a near-field effect residual separation technique to separate the phase residuals from the solved array error matrix, thereby achieving high-precision array error estimation results. Through differential operations, the technique verifies that the near-field effect residual of each element in a uniform linear array is approximately proportional to the element's serial number. High-precision array calibration improves the accuracy of direction estimation.  Results and Discussions  This paper proposes a high-precision near-field calibration method for large-aperture uniform linear arrays. The method addresses the calibration problem under near-field conditions and mitigates the negative effects of calibration source position deviation on the active calibration algorithm. It requires only a single calibration source, demonstrating both innovation and practicality. In Section 5, the performance of the proposed method is analyzed in detail through simulation. First, Fig. 2 verifies an important conclusion of this algorithm: the near-field effect residual of each element in a uniform linear array is approximately proportional to the element's serial number. Figs. 
4 to 7 examine the influences of various factors on array calibration performance, including calibration source position deviation, array aperture, distance between the calibration source and the array, and signal frequency. All four factors significantly impact the magnitude of the near-field residual. Specifically, increasing source position deviation, array aperture, and signal frequency, as well as decreasing the distance of the calibration source, will all increase the phase residual, which negatively affects array calibration. In Fig. 4, the proposed method demonstrates greater tolerance to severe source position deviations, maintaining high accuracy in array error estimation even under such conditions. Fig. 5 shows that increasing the number of array elements, which is equivalent to enlarging the array aperture, increases the near-field effect residual. However, this method effectively removes the residual, aiding in achieving high-precision array error estimation and restoring high-precision direction estimation for the large-aperture array. Fig. 6 investigates the influence of the distance between the calibration source and the array. Without removal of the near-field effect residual, the array error estimation accuracy rapidly decreases as the distance decreases. However, the proposed method ensures that the accuracy decreases only slowly. When the near-field distance ranges from 20 m to 50 m, the accuracy remains nearly unchanged. This simulation clearly demonstrates the effectiveness of the method in removing near-field effect residuals. High-frequency signals theoretically provide excellent direction estimation accuracy. However, higher signal frequencies lead to more severe near-field phase residuals, and ineffective array calibration can further degrade direction estimation accuracy. In Fig. 7, without removing the near-field phase residual, the direction estimation accuracy does not improve even with an increase in signal frequency. Fortunately, after removing the near-field effect residual with the proposed method, high-precision direction estimation performance is restored for high-frequency signals.  Conclusions  This paper addresses the challenge that large-aperture arrays often fail to satisfy the far-field distance condition in a darkroom. Additionally, due to instrument measurement errors, obtaining precise calibration source position coordinates in the darkroom is difficult, which complicates array calibration. To address these issues, a high-precision large-aperture array calibration method for residual separation of near-field effects in a darkroom is proposed. This method not only achieves near-field array calibration but also mitigates the phase errors caused by source position deviation, resulting in high-precision estimation of array amplitude and phase errors. The method compensates for the phase difference of the near-field path and constructs a cost function to estimate array amplitude and phase errors using the orthogonality of the signal subspace. The calibration source position deviation introduces near-field effect phase residuals. This paper analyzes the relationship between the phase residual and the array element serial number, solves for and removes the phase residual, thus obtaining high-precision array error estimation results. Simulation results demonstrate that the proposed method has a high tolerance for source position deviation. 
Even under conditions of large array aperture, high signal frequency, and close proximity between the calibration source and the array, the method remains effective in calibrating the array and significantly improves the accuracy of direction estimation.
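The key observation that the near-field residual grows approximately linearly with the element serial number suggests a simple separation step, sketched below on synthetic numbers: fit a least-squares line to the estimated phase errors over the element index and subtract it. The actual method derives the residual from the calibration geometry; this sketch only demonstrates the linear-in-index separation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(32)                           # element serial numbers of the ULA
true_err = rng.normal(0.0, 0.05, n.size)    # the array's own phase errors (rad)
slope = 0.08                                # near-field residual per element (rad)
measured = true_err + slope * n             # low-precision estimate with residual

k, b = np.polyfit(n, measured, 1)           # least-squares line over element index
cleaned = measured - k * n                  # residual separated and removed

print(f"fitted slope {k:.3f} rad/element (true {slope})")
print(f"RMS deviation from true errors: before {np.std(measured - true_err):.3f},"
      f" after {np.std(cleaned - true_err):.3f} rad")
```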
Gas Station Inspection Task Allocation Algorithm in Digital Twin-assisted Reinforcement Learning
LIAN Yuanfeng, TIAN Tian, CHEN Xiaohe, DONG Shaohua
Available online  , doi: 10.11999/JEIT241027
Abstract:
  Objective  With the increasing quantity of equipment in gas stations and the growing demand for safety, Multi-Robot Task Allocation (MRTA) has become essential for improving inspection efficiency. Although existing MRTA algorithms offer basic allocation strategies, they have limited capacity to respond to emergent tasks and to manage energy consumption effectively. To address these limitations, this study integrates digital twin technology with a reinforcement learning framework. By incorporating Lyapunov optimization and decoupling the optimization objectives, the proposed method improves inspection efficiency while maintaining a balance between robot energy use and task delay. This approach enhances task allocation in complex gas station scenarios and provides theoretical support for intelligent unmanned management systems in such environments.  Methods  The DTPPO algorithm constructs a multi-objective joint optimization model for inspection task allocation, with energy consumption and task delay as the primary criteria. The model considers the execution performance of multiple robots and the characteristics of heterogeneous tasks. Lyapunov optimization theory is then applied to decouple the time-energy coupling constraints of the inspection objectives. Using the Lyapunov drift-plus-penalty framework, the algorithm balances task delay and energy consumption, which simplifies the original joint optimization problem. The decoupled objectives are solved using a strategy that combines digital twin technology with the Proximal Policy Optimization (PPO) algorithm, resulting in a task allocation policy for multi-robot inspection in gas station environments.  Results and Discussions  The DTPPO algorithm decouples long-term energy consumption and time constraints using Lyapunov optimization, incorporating their variations into the reward function of the reinforcement learning model. Simulation results show that the Pathfinding inspection path (Fig. 4) generated by the DTPPO algorithm improves the task completion rate by 1.94% compared to benchmark experiments. In complex gas station environments (Fig. 5), the algorithm achieves a 1.92% improvement. When the task quantity parameter is set between 0.1 and 0.5 (Fig. 8), the algorithm maintains a high task completion rate even under heavy load. With 2 to 6 robots (Fig. 9), the algorithm demonstrates strong adaptability and effectiveness in resource-constrained scenarios.  Conclusions  This study addresses the coupling between energy consumption and time by decoupling the objective function constraints through Lyapunov optimization. By incorporating the variation of Lyapunov drift-plus-penalty terms into the reward function of reinforcement learning, a digital twin-assisted reinforcement learning algorithm, named DTPPO, is proposed. The method is evaluated in multiple simulated environments, and the results show the following: (1) The proposed approach achieves a 1.92% improvement in task completion rate compared to the DDQN algorithm; (2) Lyapunov optimization improves performance by 5.89% over algorithms that rely solely on reinforcement learning; (3) The algorithm demonstrates good adaptability and effectiveness under varying task quantities and robot numbers. However, this study focuses solely on Lyapunov theory, and future research should explore the integration of Lyapunov optimization with other algorithms to further enhance MRTA methods.
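To show how a Lyapunov drift-plus-penalty term can be folded into a reinforcement-learning reward, as DTPPO does for energy and delay, the sketch below maintains a virtual energy-deficit queue and returns a reward combining its quadratic drift with a delay penalty. The queue definition, trade-off weight V, and all numbers are illustrative assumptions, not the paper's formulation.

```python
def drift_plus_penalty_reward(task_delay, energy_used, energy_budget,
                              queue, V=2.0):
    """Return (reward, updated virtual queue) for one allocation step."""
    new_queue = max(queue + energy_used - energy_budget, 0.0)  # energy deficit
    drift = 0.5 * (new_queue ** 2 - queue ** 2)   # Lyapunov drift of Q^2 / 2
    penalty = task_delay                          # the objective being penalized
    return -(drift + V * penalty), new_queue      # V trades delay vs. stability

q = 0.0
for step, (delay, e) in enumerate([(3.0, 1.2), (2.5, 0.8), (4.0, 1.5)]):
    r, q = drift_plus_penalty_reward(delay, e, energy_budget=1.0, queue=q)
    print(f"step {step}: reward {r:.2f}, virtual queue {q:.2f}")
```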
Non-zero Frequency Clutter Cancellation Method for Passive Bistatic Radar
CHEN Gang, SU Siyuan, WANG Jun, JIN Yi, XU Changzhi, ZHANG Meng, FU Shiwei
Available online  , doi: 10.11999/JEIT241018
Abstract:
  Objective and Methods   In passive bistatic radar systems, in addition to strong direct-path signals and zero-frequency multipath signals, non-zero frequency clutter echoes are also present. The conventional method is ineffective in removing these non-zero frequency clutter signals due to their strong randomness. To address this issue, several algorithms, such as the Extensive Cancellation Algorithm (ECA) and the Extensive Cancellation Algorithm by subCarrier (ECA-C), have been proposed. However, these methods have limitations in terms of computational cost and signal applicability. To overcome these challenges, this paper proposes a novel clutter cancellation method for passive bistatic radar. First, two types of clutter subspaces are constructed: the conventional clutter subspace and the extended clutter subspace. By designing and solving a new cost function, the optimal clutter cancellation weight factor is derived. The clutter signals, including non-zero frequency components, are then removed. Residual clutter signals are further suppressed through range-Doppler processing. Simulation analysis and real-data applications demonstrate that the proposed method reduces computational complexity while maintaining effective clutter cancellation performance.   Results and Discussions  As shown in Fig. 2a, the main lobe of the weak target echo is obscured by the sidelobes of strong clutter signals, preventing target detection. The noise floor level is 0.98 dB. In Fig. 2b, although the direct-path and zero-frequency clutter are suppressed using the conventional method, the target echo remains undetectable due to non-zero frequency clutter. The noise floor level is –28.84 dB, representing a reduction of approximately 28 dB in the noise floor. In contrast, Fig. 2c and Fig. 2d show that the target is detected when applying the extended ECA method and the proposed method, as the direct-path signal, zero-frequency clutter, and non-zero frequency clutter are effectively removed. The noise floor level is –43.6 dB, indicating a further reduction of approximately 15 dB compared with the conventional method. The clutter cancellation time for both methods increases with the clutter cancellation order and data length. However, the processing time growth of the extended ECA method is greater than that of the proposed method in both cases (Fig. 4). Validation using real data confirms that both targets are detected using the extended ECA method and the proposed method, as both effectively mitigate the effects of non-zero frequency clutter compared with the conventional method (Fig. 6). The processing time of the proposed method (13.56 s) is shorter than that of the extended ECA method (21.73 s). The results from real data further confirm the effectiveness of the proposed method.  Conclusions  This study proposes a new method for addressing the non-zero frequency clutter cancellation problem. In this approach, both the conventional clutter subspace and the extended clutter subspace are constructed. A new cost function is then designed and solved to achieve cancellation of both zero and non-zero frequency clutter. Residual clutter signals are further suppressed through range-Doppler processing. The performance of the proposed method is validated and compared with the extended ECA method through simulation results. Additionally, real-data applications confirm its effectiveness. 
This method effectively transforms high-order matrix operations into two low-order matrix operations, thereby reducing computational complexity. In practical applications, as the order of the clutter cancellation step increases, the computational advantage of the proposed method over the extended ECA method becomes more pronounced.
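The subspace-projection principle underlying such cancellation methods can be sketched compactly: stack delayed copies of the reference signal into a clutter subspace, solve least squares for the weights, and subtract the projection from the surveillance channel. The Python example below does this on synthetic data; the paper's specific cost function, the extended Doppler-shifted subspace, and the low-order factorization are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
N, delays = 4096, range(4)
ref = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# Surveillance channel: direct path + multipath clutter + weak target + noise.
surv = sum(0.5 ** d * np.roll(ref, d) for d in delays)
surv = surv + 0.01 * np.roll(ref, 40) * np.exp(2j * np.pi * 0.02 * np.arange(N))
surv = surv + 0.001 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

X = np.stack([np.roll(ref, d) for d in delays], axis=1)   # clutter subspace
w, *_ = np.linalg.lstsq(X, surv, rcond=None)              # LS cancellation weights
residual = surv - X @ w                                   # clutter removed
print(f"clutter power suppressed by "
      f"{10 * np.log10(np.mean(abs(surv)**2) / np.mean(abs(residual)**2)):.1f} dB")
```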
DTDS: Dilithium Dataset for Power Analysis
YUAN Qingjun, ZHANG Haojin, FAN Haopeng, GAO Yang, WANG Yongjuan
Available online  , doi: 10.11999/JEIT250048
Abstract:
  Objective  The development of quantum computing threatens the security of traditional cryptosystems and has accelerated the research and standardization of post-quantum cryptographic algorithms. The Dilithium digital signature algorithm is based on lattice theory and was selected by the U.S. National Institute of Standards and Technology (NIST) as a post-quantum cryptographic standard in 2024. Meanwhile, side-channel analysis of Dilithium, especially power analysis, has become a research hotspot. However, existing power analysis datasets mainly target classical block cipher algorithms such as AES, and the lack of datasets for newer algorithms such as Dilithium restricts research on side-channel security analysis methods.  Results and Discussions  For this reason, this paper collects and discloses the first power analysis dataset for the Dilithium algorithm, aiming to facilitate research on power analysis of post-quantum cryptographic algorithms. The dataset is based on the open-source reference implementation of Dilithium, running on a Cortex M4 processor and captured by a dedicated device, and contains 60,000 traces captured during the Dilithium signature process, as well as the signature source data and sensitive intermediate values corresponding to each trace.  Conclusions  The constructed DTDS dataset is further visualized and analyzed, and the execution process of the random polynomial generation function polyz_unpack and its effect on the traces are investigated in detail. Finally, the dataset is modeled and tested using template analysis and deep learning-based analysis to verify the validity and usefulness of the dataset. The dataset and code can be found at https://doi.org/10.57760/sciencedb.j00173.00001.
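As an example of a first-order analysis such a dataset enables, the sketch below correlates every trace sample with the Hamming weight of a sensitive intermediate value. The arrays are synthetic stand-ins with one injected leakage point; the dataset's actual file layout is documented at its DOI and is not assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_traces, n_samples = 1000, 500
intermediates = rng.integers(0, 256, n_traces)             # stand-in sensitive values
hw = np.array([bin(v).count("1") for v in intermediates])  # Hamming-weight model
traces = rng.standard_normal((n_traces, n_samples))
traces[:, 230] += 0.3 * hw                                  # injected leakage sample

# Column-wise Pearson correlation between traces and the HW model.
t = traces - traces.mean(axis=0)
h = hw - hw.mean()
corr = (t * h[:, None]).sum(axis=0) / (
    np.sqrt((t ** 2).sum(axis=0)) * np.sqrt((h ** 2).sum()))
print(int(np.argmax(np.abs(corr))))   # 230: the leaking sample index
```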
TTRC-ABE: Traitor Traceable and Revocable CLWE-based ABE Scheme from Lattices
LIU Yuan, WANG Licheng, ZHOU Yongbin
Available online  , doi: 10.11999/JEIT240997
Abstract:
  Objective  With the advancement of quantum computing, lattice-based cryptography has emerged as a key approach for constructing post-quantum secure cryptographic primitives due to its inherent resistance to quantum attacks. Among these primitives, lattice-based Attribute-Based Encryption (ABE) is particularly notable for its ability to provide fine-grained access control and flexible authorization, making it suitable for data-sharing applications, such as cloud computing and the Internet of Things (IoT). However, existing lattice-based ABE schemes, especially those based on Learning With Errors (LWE) or Ring-LWE (RLWE), exhibit limitations that hinder their practical deployment. A significant issue is the absence of traitor tracing and revocation mechanisms, which leaves these schemes vulnerable to key abuse, where malicious users can share decryption keys without detection or prevention. Furthermore, the exposure of attribute values in access policies creates a privacy risk, as sensitive user information may be inferred from these values. These limitations undermine the security and privacy of lattice-based ABE systems, limiting their applicability in real-world scenarios where accountability and privacy are critical. To address these challenges, this paper proposes a novel Traitor Traceable and Revocable CLWE-based ABE (TTRC-ABE) scheme, which employs a new variant of LWE called Cyclic Algebra LWE (CLWE). The proposed scheme aims to achieve three key objectives: (1) to introduce an efficient traitor tracing mechanism to identify malicious users and a revocation mechanism to prevent revoked users from decrypting messages; (2) to enhance attribute privacy by concealing attribute values in access policies; and (3) to improve the efficiency of lattice-based ABE schemes, specifically in terms of public key size, ciphertext size, and ciphertext expansion rate. By addressing these critical issues, TTRC-ABE contributes to the advancement of lattice-based cryptography and provides a viable solution for secure, privacy-preserving data sharing in quantum-vulnerable environments.  Methods  In the TTRC-ABE scheme, each user’s Global IDentity (GID) is bound to the leaf nodes of a complete binary tree. This binding enables the tracing of malicious users by identifying their GIDs embedded in decryption keys. To revoke compromised users, their GIDs are added to a revocation list, and the ciphertext is updated accordingly, ensuring that any revoked user cannot decrypt the message, even if they possess a valid decryption key. Additionally, the traditional one-dimensional attribute structure (attribute value only) is replaced with a two-dimensional structure (attribute label, attribute value). The attribute labels act as public identifiers, while the attribute values remain confidential. This separation allows for the concealment of sensitive attribute values while still enabling effective access control. A semi-access policy structure is combined with an extended Shamir’s secret sharing scheme over cyclic algebra to conceal attribute values in access policies, preventing adversaries from inferring sensitive user information. Furthermore, the proposed scheme utilizes CLWE, a new variant of LWE that offers improved efficiency and security properties. A formal security proof for TTRC-ABE is provided in the standard model. The security of the scheme relies on the hardness of the CLWE problem, which is believed to be resistant to quantum computing attacks.  
Results and Discussions  The proposed TTRC-ABE scheme demonstrates significant improvements over existing lattice-based ABE schemes in terms of functionality, security, and efficiency. The scheme successfully integrates traitor tracing and revocation features, effectively preventing key abuse by identifying malicious users and revoking their access to encrypted data. By adopting a two-dimensional attribute structure and a semi-access policy, the scheme conceals attribute values in access policies, ensuring that sensitive user information remains confidential, even when the access policy is publicly accessible. Performance analysis shows that TTRC-ABE supports traitor tracing and revocation, protects attribute privacy, and is resistant to quantum computing attacks (Table 2). Compared to related lattice-based ABE schemes, TTRC-ABE significantly reduces the public key size, ciphertext size, and average ciphertext expansion rate (Table 2, Fig. 7). These improvements enhance the practicality of the scheme for real-world applications, especially in resource-constrained environments.  Conclusions  This paper presents a novel TTRC-ABE scheme that addresses the limitations of existing lattice-based ABE schemes. By integrating traitor tracing and revocation mechanisms, the scheme effectively prevents key abuse and ensures system integrity. The introduction of a two-dimensional attribute structure and a semi-access policy enhances attribute privacy, safeguarding sensitive user information from leakage. Furthermore, the use of CLWE improves the scheme’s efficiency, reducing public key size, ciphertext size, and ciphertext expansion rate. Security analysis confirms that TTRC-ABE is secure in the standard model, making it a robust solution for post-quantum secure ABE. Future work will focus on extending the scheme to support more complex access policies, such as hierarchical and multi-authority structures, and optimizing its performance for large-scale applications. Additionally, the integration of TTRC-ABE with other cryptographic primitives, such as homomorphic encryption and secure multi-party computation, will be explored to enable more advanced data-sharing scenarios.
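For orientation, the sketch below implements classical Shamir (t, n) secret sharing over a prime field, the primitive that TTRC-ABE extends to cyclic algebras for policy hiding. The cyclic-algebra extension and the lattice machinery themselves are not shown, and the modulus is an arbitrary illustrative prime.

```python
import random

P = 2 ** 61 - 1                        # a Mersenne prime as the field modulus

def share(secret, t, n):
    """Split secret into n shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at 0 over GF(P)."""
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num = den = 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total

shares = share(123456789, t=3, n=5)
print(reconstruct(shares[:3]))         # 123456789
```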
A Model-Assisted Federated Reinforcement Learning Method for Multi-UAV Path Planning
LU Yin, LIU Jinzhi, ZHANG Min
Available online  , doi: 10.11999/JEIT241055
Abstract:
  Objective  The rapid advancement of low-altitude Internet of Things (IoT) applications has increased the demand for efficient sensor data acquisition. Unmanned Aerial Vehicles (UAVs) have emerged as a viable solution due to their high mobility and deployment flexibility. However, existing multi-UAV path planning algorithms show limited adaptability and coordination efficiency in dynamic and complex environments. To overcome these limitations, this study develops a model-assisted approach that constructs a hybrid simulated environment by integrating channel modeling with position estimation. This strategy reduces the interaction cost between UAVs and the real world. Building on this, a federated reinforcement learning-based algorithm is proposed, which incorporates a maximum entropy strategy, monotonic value function decomposition, and a federated learning framework. The method is designed to optimize two objectives: maximizing the data collection rate and minimizing the flight path length. The proposed algorithm provides a scalable and efficient solution for cooperative multi-UAV path planning under dynamic and uncertain conditions.  Methods  This study formulates the multi-UAV path planning problem as a multi-objective optimization task and models it using a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) to address dynamic environments with partially unknown device positions. To improve credit assignment and exploration efficiency, enhanced reinforcement learning algorithms are developed. The exploration capacity of individual agents is increased using a maximum entropy strategy, and a dynamic entropy regularization mechanism is incorporated to avoid premature convergence. To ensure global optimality of the cooperative strategy, the method integrates monotonic value function decomposition based on the QMIX algorithm. A multi-dimensional reward function is designed to guide UAVs in balancing competing objectives, including data collection, path length, and device exploration. To reduce interaction costs in real environments, a model-assisted training framework is established. This framework combines known information with neural networks to learn channel characteristics and applies an improved particle swarm algorithm to estimate unknown device locations. To enhance generalization, federated learning is employed to aggregate local experiences from multiple UAVs into a global model through periodic updates. In addition, an attention mechanism is introduced to optimize inter-agent information aggregation, improving the accuracy of collaborative decision-making.  Results and Discussions  Simulation results demonstrate that the proposed algorithm converges more rapidly and with reduced volatility (red curves in Fig. 3 and Fig. 4), due to a 70% reduction in interactions with the real environment achieved by the model-assisted framework. The federated learning mechanism further enhances policy generalization through global model aggregation. Under test conditions with an initial energy of 50–80 J, the data collection rate increases by 1.99–4.94%, and the flight path length decreases by 7.4–16.8% relative to the baseline model (Fig. 6 and Fig. 7), confirming the effectiveness of the reward function and exploration strategy (Fig. 5). The attention mechanism allows UAVs to identify dependencies among sensing targets and cooperative agents, improving coordination. As shown in Fig. 
2, the UAVs dynamically partition the environment to cover undiscovered devices, reducing path overlap and significantly improving collaborative efficiency.  Conclusions  This study proposes a model-assisted multi-UAV path planning method that integrates maximum entropy reinforcement learning, the QMIX algorithm, and federated learning to address the multi-objective data collection problem in complex environments. By incorporating modeling, dynamic entropy adjustment, and an attention mechanism within the Dec-POMDP framework, the approach effectively balances exploration and exploitation while resolving collaborative credit assignment in partially observable settings. The use of federated learning for distributed training and model sharing reduces communication overhead and enhances system scalability. Simulation results demonstrate that the proposed algorithm achieves superior performance in data collection efficiency, path optimization, and training stability compared with conventional methods. Future work will focus on coordination of heterogeneous UAV clusters and robustness under uncertain communication conditions to further support efficient data collection for low-altitude IoT applications.
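To make the value-decomposition step of the preceding abstract concrete, the following minimal sketch shows a QMIX-style monotonic mixing network: per-agent Q-values are combined into a joint value whose mixing weights are generated from the global state and forced non-negative, so maximizing each agent's Q also maximizes the team value. This is an illustration under assumed dimensions (N_AGENTS, STATE_DIM, HIDDEN), not the authors' implementation.

```python
# Minimal QMIX-style monotonic mixer (illustrative sketch, assumed sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

N_AGENTS, STATE_DIM, HIDDEN = 3, 16, 32  # assumptions for illustration

class MonotonicMixer(nn.Module):
    """Mixes per-agent Q-values into Q_tot with non-negative weights,
    guaranteeing dQ_tot/dQ_i >= 0 (the QMIX monotonicity constraint)."""
    def __init__(self):
        super().__init__()
        # Hypernetworks: mixing weights are functions of the global state.
        self.w1 = nn.Linear(STATE_DIM, N_AGENTS * HIDDEN)
        self.b1 = nn.Linear(STATE_DIM, HIDDEN)
        self.w2 = nn.Linear(STATE_DIM, HIDDEN)
        self.b2 = nn.Linear(STATE_DIM, 1)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, N_AGENTS); state: (batch, STATE_DIM)
        w1 = torch.abs(self.w1(state)).view(-1, N_AGENTS, HIDDEN)  # >= 0
        h = F.elu(torch.bmm(agent_qs.unsqueeze(1), w1).squeeze(1)
                  + self.b1(state))
        w2 = torch.abs(self.w2(state)).unsqueeze(2)                # >= 0
        return torch.bmm(h.unsqueeze(1), w2).squeeze(2) + self.b2(state)

mixer = MonotonicMixer()
q_tot = mixer(torch.randn(8, N_AGENTS), torch.randn(8, STATE_DIM))
print(q_tot.shape)  # torch.Size([8, 1])
```

Constraining the weights through abs() is what lets each UAV act greedily on its own Q-value while the centrally trained mixer still represents the cooperative objective.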
Low-Complexity Transform Domain Orthogonal Time Frequency Space Channel Equalization Algorithm
LIAO Yong, LIU Shuang, LI Xue
Available online  , doi: 10.11999/JEIT250013
Abstract:
  Objective  Orthogonal Time Frequency Space (OTFS) modulation is a key technique for high-mobility communication systems, offering robustness against severe Doppler shifts and multipath fading. It provides notable advantages in dynamic environments such as vehicular networks, high-speed rail communications, and unmanned aerial vehicle systems, where conventional orthogonal frequency-division multiplexing fails due to rapid channel variations and dense scattering. However, standard equalization algorithms, including Zero Forcing (ZF) and Minimum Mean Square Error (MMSE), are often ineffective in mitigating Inter-Symbol Interference (ISI) and Inter-Doppler Interference (IDI) under rich-scatterer conditions. These methods also require large-scale matrix inversion, resulting in prohibitively high computational complexity, particularly on OTFS grids with high dimensionality (e.g., M = 32 subcarriers, N = 16 symbols per frame). Most existing studies adopt single-scatterer models that do not reflect the interference structure in practical multipath channels. This study proposes a low-complexity transform domain OTFS equalization algorithm that incorporates block matrix decomposition, transform domain diagonalization, and decision feedback strategies. The algorithm aims to (1) reduce complexity by exploiting block sparsity and structural features of the Delay-Doppler (DD) domain channel matrix, (2) improve interference suppression in time-varying Doppler and dense scattering environments, and (3) validate performance using the 3GPP Extended Vehicular A (EVA) channel model, which simulates realistic high-speed scenarios with user velocities ranging from 121.5 km/h to 607.5 km/h and multiple scattering paths.  Methods  The proposed algorithm operates in three key stages. (1) Block-wise ISI elimination: leveraging the block-sparse structure of the DD-domain channel matrix, the algorithm partitions the channel into submatrices, each corresponding to a specific DD component. Guard intervals are introduced to suppress ISI arising from signal dispersion across the OTFS grid. Each submatrix $\mathbf{K}_{m,l}$ is modeled as a Toeplitz circulant matrix, enabling iterative cancellation of interference by subtracting previously estimated symbols. (2) Transform domain diagonalization: each Toeplitz circulant submatrix is diagonalized using Fourier-based operations. Specifically, the normalized FFT matrix $\mathbf{F}_N$ is applied to $\mathbf{K}_{m,l}$, converting it into diagonal form and transforming complex matrix inversion into element-wise division. This step reduces the computational complexity of MMSE equalization from $\mathcal{O}(N^3)$ to $\mathcal{O}(N\log N)$, where $N$ denotes the Doppler dimension of the OTFS resource grid. (3) Decision feedback refinement: a closed-loop decision feedback mechanism is introduced to iteratively improve symbol estimates. The demodulated symbols are re-modulated and fed back to update the channel matrix, thereby enhancing estimation accuracy and lowering pilot overhead. The algorithm is evaluated using the 3GPP EVA channel model, which reflects practical high-speed communication scenarios with user velocities between 121.5 km/h and 607.5 km/h, time-varying Doppler shifts, and multiple scatterers. Key system parameters include 32 subcarriers (M = 32), 16 symbols per frame (N = 16), and modulation formats ranging from QPSK to 64QAM.  
Results and Discussions  The performance of the proposed algorithm is evaluated against ZF, MMSE, Message Passing (MP), Maximal Ratio Combining (MRC), and Hybrid MP (HMP) detectors under low-speed (121.5 km/h) and high-speed (607.5 km/h) scenarios. Complexity reduction: the algorithm attains near-linear computational complexity, markedly lower than the $\mathcal{O}((MN)^3)$ matrix inversion required by ZF/MMSE and the iterative message-passing updates of MP. For M = 32 and N = 16, this results in a 99.5% reduction in operations compared with ZF (Table 2). Transform domain diagonalization simplifies matrix inversion into element-wise division, thereby eliminating $\mathcal{O}(N^3)$ operations. Interference suppression: in low-speed scenarios, the algorithm yields a 2.5 dB Bit Error Ratio (BER) improvement over ZF and MMSE at 15 dB SNR under 16QAM modulation (Figure 12). The decision feedback mechanism further reduces the Normalized Mean Square Error (NMSE) by 12.5 dB while lowering pilot overhead by 50% (Figure 11). In high-speed scenarios, the algorithm maintains superior performance, outperforming MRC and HMP by 1.7 dB and 1.0 dB, respectively, under 64QAM modulation (Figure 15). Modulation robustness: the algorithm consistently demonstrates performance gains across QPSK, 16QAM, and 64QAM. At high SNR with 64QAM, BER gains of 1.7 dB, 1.5 dB, and 1.0 dB are achieved over MRC, MP, and HMP, respectively (Figures 14 and 15). Transform domain processing efficiently diagonalizes the channel matrix and eliminates IDI (Figure 8), which is critical in scatterer-rich environments where non-diagonal components dominate interference. Practical validation: simulations using the 3GPP EVA model confirm the algorithm’s applicability in real-world high-mobility settings. At a user velocity of 607.5 km/h, the algorithm maintains a BER below $10^{-3}$ at 20 dB SNR, satisfying Ultra-Reliable and Low-Latency Communications (URLLC) criteria for Sixth-Generation (6G) networks.  Conclusions  This study presents a low-complexity approach to OTFS channel equalization, addressing both computational and interference challenges in high-mobility scenarios. By leveraging the block-sparse structure of the DD-domain channel matrix and applying Fourier-based diagonalization, the algorithm achieves near-linear complexity while maintaining competitive BER performance. The decision feedback mechanism further enhances robustness, enabling adaptive channel estimation with reduced pilot overhead. Key contributions include: (1) block-sparse matrix decomposition that facilitates sequential ISI elimination through the use of guard intervals and Toeplitz circulant structures; (2) Fourier-based diagonalization that replaces matrix inversion with element-wise division, reducing computational complexity by orders of magnitude; and (3) a closed-loop decision feedback scheme that improves NMSE by 12.5 dB while halving the required pilot overhead. Simulation results under the 3GPP EVA model confirm the algorithm’s suitability for high-speed applications, such as vehicular networks and high-speed rail communications. 
Future work will explore extensions to large-scale Multiple-Input Multiple-Output (MIMO) systems, adaptive channel tracking, and multi-user interference suppression, with the aim of integrating this framework into 6G URLLC systems.
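The transform-domain step above rests on a standard identity: a circulant matrix is diagonalized by the DFT, with eigenvalues equal to the FFT of its first column, so MMSE equalization reduces to element-wise division in the Fourier domain. The NumPy sketch below (illustrative sizes and noise level, not the paper's code) verifies that the fast estimator matches full matrix-inversion MMSE.

```python
# Circulant diagonalization: K = F^{-1} diag(FFT(c)) F, so MMSE becomes
# element-wise division instead of an O(N^3) inversion. Illustrative sketch.
import numpy as np

N = 16
c = np.random.randn(N) + 1j * np.random.randn(N)       # first column of K
K = np.array([np.roll(c, k) for k in range(N)]).T       # circulant matrix
lam = np.fft.fft(c)                                     # eigenvalues of K

x = (2 * np.random.randint(0, 2, N) - 1).astype(complex)  # BPSK symbols
sigma2 = 0.01
y = K @ x + np.sqrt(sigma2 / 2) * (np.random.randn(N) + 1j * np.random.randn(N))

# Classical MMSE: invert an N x N matrix.
x_mmse = np.linalg.solve(K.conj().T @ K + sigma2 * np.eye(N), K.conj().T @ y)

# Transform-domain MMSE: FFT, element-wise division, inverse FFT.
X = lam.conj() * np.fft.fft(y) / (np.abs(lam) ** 2 + sigma2)
x_fast = np.fft.ifft(X)

print(np.allclose(x_mmse, x_fast))  # True: same estimate, far cheaper
```

The two estimators agree to machine precision; the saving is that the FFT route costs O(N log N) per submatrix rather than cubic time.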
Collaborative Interference Resource Allocation Method Based on Improved Secretary Bird Algorithm
LI Yibing, SUN Liuqing, QI Changlong
Available online  , doi: 10.11999/JEIT240709
Abstract:
  Objective  In the complex electromagnetic environment of Networked Radars (NR), efficiently utilizing limited interference resources to reduce enemy detection capabilities and support successful penetration remains a critical challenge. Existing heuristic algorithms, while partially effective, do not jointly optimize interference patterns, beams, and power resources in multi-beam systems, limiting their applicability in penetration scenarios. To address this limitation, this study proposes an interference resource allocation strategy based on the Improved Secretary Bird Optimization Algorithm (ISBOA). The proposed strategy minimizes detection probability by integrating Cauchy mutation and global collaborative control, enabling the joint optimization of interference patterns, beams, and power across multiple jammers. This approach ensures rational resource allocation, enhances search capability, and improves convergence accuracy, thereby meeting the demands of penetration scenarios. The findings provide a novel solution for interference resource allocation in multi-beam systems against NR.  Methods  This study models the complex interference resource allocation problem as a multi-constrained nonlinear mixed-integer programming problem and addresses it using an improved intelligent optimization algorithm. A mixed-integer programming model incorporating interference patterns, beams, and power resources is established, with the detection and fusion probability of networked radar as the performance evaluation metric. The model accounts for the dynamic interactions between radars and jammers, as well as the pulse compression gains of various interference patterns. To overcome the limitations of the traditional Secretary Bird Optimization Algorithm (SBOA) in handling discrete variables and complex constraints, the study integrates Cauchy mutation and global collaborative control strategies. Cauchy mutation leverages its long-tail characteristics to enhance the algorithm’s global search capability, reducing the risk of convergence to local optima. The global collaborative control strategy incorporates penalty factors to ensure compliance with multi-variable constraints, enabling the simultaneous optimization of discrete and continuous variables.  Results and Discussions  This study presents an innovative interference resource allocation method for multi-beam jamming systems targeting networked radar, leveraging ISBOA. By integrating Cauchy mutation and global cooperative control strategies, ISBOA significantly enhances optimization performance. Simulation results indicate that ISBOA outperforms other algorithms, including the original SBOA, Harris Hawks Optimizer (HHO), and Sparrow Search Algorithm (SSA). In a scenario with six jammers and eight radars, ISBOA achieves an optimal function value of 0.6095, notably lower than 0.8158 (SBOA), 1.2666 (HHO), and 1.3679 (SSA) (Fig. 4). Moreover, ISBOA demonstrates faster convergence and greater stability across 50 independent experiments, yielding an average optimal function value of 0.6892 (Fig. 5) and a convergence error of 0.1449 (Fig. 6). ISBOA’s joint optimization of interference patterns, beams, and power resources enables more efficient allocation of jamming resources and reduces the detection probability of networked radar. This advantage is further validated across various scenarios, where ISBOA consistently outperforms other algorithms in solution quality and computational efficiency (Fig. 8). 
The experimental results highlight ISBOA’s robustness and adaptability, demonstrating its potential for application in complex battlefield environments.  Conclusions  This study proposes an optimization method for interference resource allocation in multi-beam jamming systems targeting networked radar scenarios, utilizing ISBOA. A mixed-integer programming model integrating interference patterns, beams, and power resources is developed. ISBOA incorporates Cauchy mutation and global cooperative control strategies to enhance global search capability and stability. Simulation results demonstrate that ISBOA outperforms the original SBOA, HHO, and SSA in terms of convergence speed and search efficiency. ISBOA exhibits superior stability and enables more rational allocation of interference resources, effectively reducing the detection probability of networked radar. Moreover, ISBOA demonstrates strong adaptability and robustness across various scenarios, providing an effective solution for interference resource allocation in complex battlefield environments.
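A minimal sketch of the two ISBOA ingredients described above, under assumed dimensions, power budget, and penalty factor (none taken from the paper): Cauchy mutation draws heavy-tailed perturbations around the best solution, enabling occasional long jumps out of local optima, while a penalty term folds a budget constraint into the fitness, mirroring the global collaborative control idea.

```python
# Illustrative sketch: Cauchy mutation + penalty-based constraint handling.
import numpy as np

rng = np.random.default_rng(0)
DIM, P_MAX, PENALTY = 8, 10.0, 100.0   # assumed sizes / budget / factor

def fitness(p):
    # Stand-in objective (e.g., a fused detection probability to minimize)
    # plus a penalty when the total jamming power exceeds the budget.
    obj = np.sum((p - 3.0) ** 2)
    violation = max(0.0, np.sum(p) - P_MAX)
    return obj + PENALTY * violation

def cauchy_mutate(best, scale=0.5):
    # standard_cauchy has heavy tails: most steps are small, a few are huge,
    # which is what lets the search escape local optima.
    cand = best + scale * rng.standard_cauchy(DIM)
    return np.clip(cand, 0.0, P_MAX)    # keep per-beam power feasible

best = rng.uniform(0.0, P_MAX / DIM, DIM)
for _ in range(2000):
    cand = cauchy_mutate(best)
    if fitness(cand) < fitness(best):   # greedy acceptance
        best = cand
print(fitness(best))
```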
Detection and Interaction Analysis of Place Cell Firing Information in Dual Brain Regions of Awake Active Rats
LI Ming, XU Wei, XU Zhaojie, MO Fan, YANG Gucheng, LV Shiya, LUO Jinping, JIN Hongyan, LIU Juntao, CAI Xinxia
Available online  , doi: 10.11999/JEIT250024
Abstract:
  Objective  Continuous monitoring of neural activities in free-moving rats is essential for understanding brain function but presents significant challenges regarding the stability and biocompatibility of the detection device. This study aims to provide comprehensive data on brain activity by simultaneously monitoring two brain regions. This approach is crucial for elucidating the neural encoding differences within these regions and the information exchange between them, both of which are integral to spatial memory and navigation processes. Spatial navigation is a fundamental behavior in rats, vital for their survival and interaction with their environment. Central to this behavior are place cells—neurons that selectively respond to an animal's location, forming the basis of spatial memory and navigation. This study focuses on the hippocampal CA1 region and the Barrel Cortex (BC), both of which are critical for spatial processing. By monitoring these regions simultaneously, the aim is to uncover the neural dynamics underlying spatial memory formation and retrieval. Understanding these dynamics provides insights into the neural mechanisms of spatial cognition and memory, which are fundamental to higher cognitive functions and are often disrupted in neurological disorders such as Alzheimer's disease and schizophrenia.  Methods  To achieve dual brain region monitoring, a four-electrode MicroElectrode Array (MEA) is designed to conform to the shape of the dual brain regions and is surface-modified with a Polypyrrole/Silver Nanowire (PPy/AgNW) nanocomposite material. Each probe of the MEA consists of eight recording sites with a diameter of 20 µm and one reference site. The MEA is fabricated using Microelectromechanical Systems (MEMS) technology and modified via an electrochemical deposition process. The PPy/AgNW nanocomposite modification is selected for its low impedance and high biocompatibility, which are critical for stable, long-term recordings. The deposition of PPy/AgNW is carried out using cyclic voltammetry. The stability of the modified MEA is assessed by cyclic voltammetry in phosphate-buffered saline to simulate in vivo charge/discharge processes. The MEA is then implanted into the CA1 and BC regions of rats, and neural activities are recorded during a two-week spatial memory task. Spike signals are analyzed to identify place cells and assess their firing patterns, while Local Field Potential (LFP) power is measured to evaluate overall neural activity. Mutual information analysis is performed to quantify the interaction between the two brain regions. The experimental setup includes a behavior arena where rats perform spatial navigation tasks, with continuous neural signal recording using the modified MEA.  Results and Discussions  The PPy/AgNW-modified MEA exhibits low impedance (53.01 ± 2.59 kΩ) at 1 kHz (Fig. 2). This low impedance is critical for high-fidelity signal acquisition, enabling the detection of subtle neural activities. The stability of the MEA is evaluated through 1000 cycles of cyclic voltammetry scanning, demonstrating high capacitance retention (92.51 ± 2.21%) and no significant increase in impedance (Fig. 3). These results suggest that the MEA maintains stable performance over extended periods, which is essential for long-term in vivo monitoring. The modified MEA successfully detects neural activities from the BC and CA1 regions over the two-week period. 
The average firing rates and LFP power in both regions progressively increase, indicating enhanced neural activity as the rats become more familiar with the spatial memory task (Fig. 4). This increase suggests that the rats' spatial memory and navigation abilities improve over time, likely due to increased familiarity with the environment and task requirements. Place cells are identified in the recorded neurons, confirming the presence of spatially selective neuronal activity (Fig. 5). The identification of place cells is a key finding, as these neurons are fundamental to spatial memory and navigation. Additionally, the spatial stability of place cells in the CA1 region is higher than in the BC region, indicating functional differences between these areas in spatial memory processing (Fig. 5). This suggests that the CA1 region plays a more critical role in spatial memory consolidation. Mutual information analysis reveals significant information exchange between the dual brain regions during the initial memory phase, suggesting a role in memory storage (Fig. 6). This inter-regional communication is crucial for understanding how spatial information is processed and stored in the brain. The observed increase in mutual information over time indicates that the interaction between the BC and CA1 regions becomes more pronounced as the rats engage in spatial navigation, highlighting the dynamic nature of neural interactions during memory formation and retrieval.  Conclusions  This study successfully demonstrated continuous dual brain region monitoring in freely moving rats using a PPy/AgNW-modified MEA. The findings reveal dynamic interactions between the BC and CA1 regions during spatial memory tasks and highlight the importance of place cells in memory formation. Monitoring neural activities in dual brain regions over extended periods provides new insights into the neural basis of spatial memory and navigation. The results suggest that the CA1 region plays a critical role in spatial memory consolidation, while the BC region also contributes to spatial processing. This distinction highlights the value of studying multiple brain regions simultaneously to gain a comprehensive understanding of neural processes. The PPy/AgNW-modified MEA serves as a powerful tool for investigating the complex neural mechanisms underlying spatial cognition and memory, with potential applications in related neurological disorders.
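As an illustration of the mutual information analysis mentioned above, the sketch below estimates I(X;Y) from a joint histogram of simultaneous firing rates of a CA1 unit and a BC unit. The synthetic rate series, bin count, and shared-drive construction are stand-ins, not recorded data or the authors' estimator settings.

```python
# Histogram-based mutual information between two firing-rate series (sketch).
import numpy as np

rng = np.random.default_rng(1)
shared = rng.normal(size=5000)                 # common drive -> dependence
rate_ca1 = shared + 0.5 * rng.normal(size=5000)
rate_bc = shared + 0.5 * rng.normal(size=5000)

def mutual_information(x, y, bins=16):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                           # joint probability
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y
    nz = pxy > 0                               # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

print(f"MI = {mutual_information(rate_ca1, rate_bc):.3f} bits")
```

A rising MI across sessions, as reported in the abstract, would correspond to the joint histogram becoming increasingly structured relative to the product of the marginals.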
LLM Channel Prediction Method for TDD OTFS Low-Earth-Orbit Satellite Communication Systems
YOU Yuxin, JIANG Xinglong, LIU Huijie, LIANG Guang
Available online  , doi: 10.11999/JEIT250105
Abstract:
Orthogonal Time Frequency Space (OTFS) modulation shows promise in Low Earth Orbit (LEO) satellite-to-ground communications. However, rapid Doppler shift variation and high latency in LEO systems lead to channel aging. Real-time channel estimation increases the computational complexity of onboard receivers and reduces transmission efficiency due to substantial pilot overhead. This study addresses a Ka-band Multiple-Input Single-Output (MISO) OTFS satellite-to-ground communication system by designing a Downlink (DL) channel prediction scheme based on Uplink (UL) channel estimation. A high-precision channel estimation method is proposed, combining matched filtering with data detection to extract UL Channel State Information (CSI). An Adaptive Sparse Large Language Model (ASLLM)-based channel prediction network is then constructed to predict DL CSI. Compared with existing methods, simulations show that the proposed approach achieves lower Normalized Mean Square Error (NMSE) and Bit Error Rate (BER), with improved generalization across multiple scenarios and within an acceptable computational complexity range.  Objective   LEO satellite communication systems offer advantages over Medium-Earth-Orbit (MEO) and Geostationary-Earth-Orbit (GEO) systems, particularly in terms of reduced transmission latency and lower path loss. Therefore, LEO satellites are considered a key element of the Sixth-Generation (6G) Non-Terrestrial Network (NTN) satellite internet architecture. However, high-mobility channels between LEO satellites and ground stations introduce significant challenges for conventional Orthogonal Frequency Division Multiplexing (OFDM), resulting in marked performance degradation. OTFS modulation, which operates in the Delay-Doppler (DD) domain, has been shown to outperform OFDM in high-mobility scenarios, Multiple-Input Multiple-Output (MIMO) systems, and millimeter-wave frequency bands. This performance advantage is attributed to its robustness to Doppler shifts and inter-symbol interference. In modern Time Division Duplexing (TDD) satellite communication systems, OTFS receivers require high-complexity real-time channel estimation, and transmitters rely on extensive pilot overhead to encode CSI for reliable data recovery. To mitigate these limitations, channel prediction schemes using UL CSI to predict DL CSI have been proposed. However, broadband MISO-OTFS systems with large antenna arrays and high-resolution transmission demand precise and efficient CSI prediction under rapidly varying DD-domain conditions. The dynamic and rapidly aging characteristics of DD domain CSI present significant challenges for accurate prediction in broadband, high-mobility, and large-scale antenna communication systems. To address this, an ASLLM-based channel prediction method is developed. The proposed method enables accurate prediction of DD-domain CSI under these conditions.  Methods  By modeling the input–output relationship of a MISO OTFS satellite-to-ground communication system, this study proposes a data-assisted fractional Doppler matched filtering algorithm for channel estimation. This method leverages the shift property of correlation functions and integrates iterative optimization through Minimum Mean Square Error (MMSE) signal detection to achieve accurate estimation of DD domain CSI. The resulting high-precision CSI serves as a reliable input for the subsequent prediction network. 
The task of predicting DL slot CSI from UL slot CSI is formulated as a minimization of the NMSE between the network’s predicted CSI and the true DL CSI. The proposed ASLLM prediction network consists of a preprocessing layer, an embedding layer, a Generative Pre-trained Transformer (GPT) layer, and an output layer. The raw DD-domain CSI is first processed through the preprocessing layer to extract convolutional features. In the embedding layer, a value attention module and a position attention module are applied to convert the CSI features into a structured, text-like input suitable for GPT processing. The value attention module adaptively extracts sparse feature values of the CSI, while the position attention module encodes positional characteristics in a non-trainable manner. The core of the prediction network is a pre-trained, open-source GPT-2 backbone, which is used to model and forecast the CSI sequence. The network output is then passed through a linear transformation layer to recover the predicted DD-domain CSI.  Results and Discussions  The satellite-to-ground channel is modeled using the NTN-TDL-D dual mobility channel and simulated with QuaDRiGa. First, the performance of the data-assisted matched filtering channel estimation method is validated (Fig. 7). At a Signal-to-Noise Ratio (SNR) of 20 dB, the BER reaches the order of 0.001 after three iterations. Next, training loss curves for several neural network models are compared (Fig. 8). The ASLLM model exhibits the fastest convergence and highest stability. It also achieves superior NMSE and BER performance in MMSE data detection compared with other approaches (Fig. 9). ASLLM demonstrates strong generalization across different channel models and varying terminal velocities (Fig. 10). However, in cross-frequency generalization scenarios, a small number of additional training samples are still required to maintain accuracy (Fig. 11). Finally, ablation experiments confirm the contribution of each core module within the ASLLM architecture (Table 2). Comparisons of network parameters, training time, and inference time indicate that the computational complexity of ASLLM remains within an acceptable range (Table 3).  Conclusions  This study proposes a channel prediction method for TDD MISO OTFS systems, termed ASLLM, tailored to high-mobility scenarios such as communication between LEO satellites and high-speed trains. The approach leverages high-precision historical UL CSI, obtained through a data-assisted matched filtering algorithm, to predict future DL CSI. By extracting sparse features from DD domain CSI, the method fine-tunes a pre-trained GPT-2 model—originally trained on general knowledge—to improve predictive accuracy. Simulation results show that: (1) considering both computational complexity and estimation accuracy, optimal stopping criteria for the channel estimation algorithm are defined as an iteration number of 3 and a threshold of 0.001; (2) ASLLM outperforms existing prediction methods in terms of convergence speed, NMSE, BER, and generalization capability; and (3) each module of the network contributes effectively to performance, while overall computational complexity remains within a feasible range.
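As a concrete reference for the training objective stated above, the sketch below computes the NMSE between a true and a predicted DD-domain CSI tensor. The grid shape and noise level are illustrative assumptions, not the paper's configuration.

```python
# NMSE between predicted and true DD-domain CSI (illustrative sketch).
import numpy as np

def nmse_db(h_true, h_pred):
    err = np.sum(np.abs(h_true - h_pred) ** 2)   # total squared error
    ref = np.sum(np.abs(h_true) ** 2)            # energy of the true CSI
    return 10.0 * np.log10(err / ref)

rng = np.random.default_rng(2)
shape = (32, 16)                                  # assumed delay x Doppler grid
h_true = rng.normal(size=shape) + 1j * rng.normal(size=shape)
h_pred = h_true + 0.1 * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
print(f"NMSE = {nmse_db(h_true, h_pred):.2f} dB")   # about -17 dB here
```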
Modeling and Simulation Analysis of Inter-satellites Pseudo-noise Ranging for Space Gravitational Wave Detection
SUN Chenying, YAO Weilai, LIANG Xindong, JIA Jianjun
Available online  , doi: 10.11999/JEIT250121
Abstract:
  Objective  Inter-satellite laser interferometry for space gravitational wave detection is constrained by orbital dynamics and other perturbations, which cause continuous variations in inter-satellite distances. Because the resulting arm-length mismatch prevents laser frequency noise from cancelling in common mode, this noise becomes the dominant noise source in the inter-satellite interferometry system. To suppress this noise, the Time Delay Interferometry (TDI) algorithm is applied during data post-processing, where a virtual equal-arm interferometer is synthesized by shifting and combining data streams. However, accurate TDI combinations depend on precise knowledge of absolute inter-satellite distances at the meter level. Any deviation in these measurements may propagate into errors in the final processed data. To address this issue, an inter-satellite ranging scheme based on Pseudo-Random Noise (PRN) is proposed. This method enables both inter-satellite ranging and data communication, providing theoretical support for autonomous satellite navigation as well as inter-satellite ranging and communication in space-based gravitational wave missions.  Methods  To reduce power consumption and spacecraft mass, the inter-satellite ranging task is implemented using existing laser links for scientific measurement. Only a small fraction of the available power is allocated to the ranging subsystem to avoid degrading the phase stability of science measurements. A low-depth Binary Phase-Shift Keying (BPSK) modulation scheme based on PRN is proposed to enable laser ranging and data communication as auxiliary functions of the high-precision inter-satellite interferometry system. The ranging system architecture incorporates a Digital Phase-Locked Loop (DPLL) for carrier synchronization and a Delay-Locked Loop (DLL) for PRN code synchronization. Theoretical limitations of ranging accuracy are systematically analyzed, including contributions from shot noise, integration time, inter-code interference, optical data bit encoding, and the impulse response of the DPLL. These analyses guide improvements in both the DPLL and DLL designs. A Direct Digital Synthesizer (DDS) is used to generate the heterodyne signal. Simulation verification of unidirectional ranging, bidirectional ranging, and inter-satellite data communication is performed on a Field Programmable Gate Array (FPGA) platform.  Results and Discussions  The simulation results (Table 2, Table 3) demonstrate that the optimization methods illustrated in Fig. 9 and Fig. 11 effectively reduce the effects of data encoding and inter-code interference, respectively, on the ranging accuracy of the delay-locked tracking loop. As shown in Table 4, in a single delay-locked tracking loop, the dominant factor limiting ranging accuracy is data bit encoding for optical communication when the local PRN code is absent; otherwise, shot noise becomes the primary source of error. Fig. 16 illustrates the distortion of the PRN code caused by the phasemeter pulse response, and shows that Manchester encoding significantly mitigates this distortion. The final simulation results after applying all optimization techniques are summarized in Table 5. With a modulation depth of approximately 0.4 rad, corresponding to an equivalent optical power of less than 4%, the Root Mean Square (RMS) errors for both unidirectional and bidirectional ranging are approximately 3 cm at a measurement rate of 3 Hz with an 80 MHz sampling frequency. 
For unidirectional ranging with data streams encoded at 19 kbps and 39 kbps, the corresponding RMS ranging errors are approximately 6 cm and 20 cm, respectively. Bidirectional ranging supports data transmission only at 19 kbps, yielding an RMS error of approximately 7 cm. When the phase modulation depth is reduced to 0.2 rad (corresponding to an equivalent optical power below 1%), the RMS ranging error is approximately 6 cm; if 19 kbps data are transmitted simultaneously, the RMS error increases to approximately 12 cm. These simulation results confirm that sub-meter absolute distance resolution is achievable under all tested conditions.  Conclusions  Based on the Taiji plan, an absolute distance measurement scheme utilizing low-depth phase modulation of PRN codes is proposed. A receiver model based on a DPLL and a DLL is established. The limiting factors affecting inter-satellite ranging accuracy are analyzed, leading to improvements in the ranging model. The simulation results, following comprehensive optimizations, show that the primary limiting factors of ranging accuracy are unavoidable shot noise and the encoding of data bits for optical communication. At a clock sampling rate of 80 MHz, with a PRN code phase modulation depth of 0.4 rad, the bidirectional ranging RMS error is approximately 7 cm when communication data is encoded at 19 kbps. When the modulation depth is reduced to 0.2 rad, the RMS error increases to approximately 12 cm while transmitting 19 kbps data concurrently. These simulation results demonstrate a clear improvement over meter-level accuracy, and the ranging model offers valuable insights for space gravitational wave detection and satellite autonomous navigation. Given the complexity of clock synchronization, it is assumed in this study that the clocks of the transmitter and receiver are fully synchronized. Further research will address clock synchronization issues, and electrical and optical experiments will be conducted to assess the performance of the proposed architecture in future work.
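The ranging principle itself is compact enough to sketch: correlate the received PRN code against a local replica and convert the correlation-peak lag into a distance. The code length, chip rate, and noise level below are illustrative assumptions, and the DPLL/DLL tracking dynamics analyzed in the paper are deliberately omitted.

```python
# PRN ranging by correlation peak (illustrative sketch, whole-chip delays).
import numpy as np

rng = np.random.default_rng(3)
N_CHIPS, F_CHIP, C = 1024, 1.0e6, 2.998e8      # chips, chip rate (Hz), m/s
prn = 2 * rng.integers(0, 2, N_CHIPS) - 1      # +/-1 pseudo-random code

true_delay = 137                               # delay in whole chips
rx = np.roll(prn, true_delay) + 0.8 * rng.normal(size=N_CHIPS)  # noisy echo

# Circular correlation via FFT, then pick the peak lag.
corr = np.fft.ifft(np.fft.fft(rx) * np.conj(np.fft.fft(prn))).real
est_delay = int(np.argmax(corr))
print(est_delay == true_delay)                 # True at this SNR
print(f"range = {est_delay / F_CHIP * C / 1e3:.1f} km")
```

In the actual receiver, the DLL refines this coarse peak to a small fraction of a chip, which is where the centimeter-level accuracies quoted above come from.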
Collaborative Multi-agent Trajectory Optimization for Unmanned Aerial Vehicles Under Low-altitude Mixed-obstacle Airspace
FENG Simeng, ZHANG Yunyi, LIU Kai, LI Baolong, DONG Chao, ZHANG Lei, WU Qihui
Available online  , doi: 10.11999/JEIT250012
Abstract:
  Objective  The rapid expansion of the low-altitude economy has driven the development of low-altitude intelligent networks as a key component of the Internet of Things (IoT). In such networks, the growing number of users challenges the ability of Unmanned Aerial Vehicles (UAVs) with mobile base stations to sustain data transmission quality. Efficient access technologies are therefore essential to ensure service quality as user density increases. At the same time, the growing complexity of airspace elevates the risk of in-flight collisions, necessitating integrated strategies to improve both communication efficiency and flight safety. This study proposes a collaborative trajectory planning framework for multiple UAVs operating in low-altitude, mixed-obstacle environments. The approach incorporates Non-Orthogonal Multiple Access (NOMA) to increase spectral efficiency and communication capacity, together with a discrete collision probability map for obstacle avoidance. A novel multi-UAV communication and obstacle-avoidance model is developed, and an optimized Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is introduced to schedule users and plan UAV trajectories. The objective is to maximize communication energy efficiency while ensuring reliable obstacle avoidance. The proposed method effectively enhances multi-UAV coordination in complex airspace and improves the overall communication performance.  Methods  To ensure energy efficiency and reliable obstacle avoidance for multiple UAVs operating in low-altitude, mixed-obstacle environments, a multi-user communication system model is proposed, incorporating collaborative multi-UAV trajectory planning. This model comprises two key components. First, a collision probability model based on discrete obstacles extends the conventional low-altitude obstacle representation into a probabilistic collision map. Second, a multi-user communication framework is constructed using fractional-order transmission energy allocation under NOMA, integrating both UAV communication and flight energy models within a unified UAV energy efficiency framework. Based on this model, the problem of maximizing energy efficiency is formulated, accounting for coordinated UAV communication and obstacle avoidance. To solve this problem, an integrated strategy is proposed. A multi-agent direction-preprocessing K-means++ algorithm is first used to enhance convergence during user scheduling optimization. Based on the optimized user allocation and environmental awareness, a state space is defined together with a 3D action space consisting of 27 directional movement options. The MADDPG algorithm is then trained by alternately updating Actor and Critic networks over the defined state-action space. Once trained, the network outputs trajectory planning policies that achieve both effective obstacle avoidance and optimized communication energy efficiency.  Results and Discussions  The proposed trajectory planning framework applies a user scheduling algorithm that dynamically allocates users at each time step, incorporating the positions of other UAVs, obstacles, and associated collision probabilities as environmental inputs. The MADDPG network is trained using a reward function defined by energy efficiency and collision probability, enabling the generation of trajectory planning solutions that maintain both communication performance and flight safety for multiple UAVs. 
Simulation results show that the planned trajectories (depicted by red, yellow, and blue lines) are shorter on average than those obtained using the traditional safety radius method (Fig. 3). Compared with trajectory planning approaches based on varying safety radius values, the proposed method achieves an approximately 8-fold reduction in average collision probability (Fig. 5). In terms of communication performance, the NOMA-based approach significantly outperforms Frequency-Division Multiple Access (FDMA). Furthermore, the proposed algorithm, incorporating multi-agent direction preprocessing optimization, yields an average improvement of 10.81% in communication energy efficiency over the non-optimized variant, as evaluated by the mean across multiple iterations (Fig. 6). The network also demonstrates rapid environmental adaptation within 20 training iterations and exhibits superior generalization compared to conventional reward-based reinforcement learning algorithms (Fig. 4).  Conclusions  This paper presents a multi-UAV collaborative communication and trajectory planning solution for ensuring both flight safety and communication performance in low-altitude mixed-obstacle airspace during multi-user operations. A UAV collaborative NOMA communication system model, based on a collision probability map, is developed. An optimized MADDPG algorithm for user scheduling is introduced to address the multi-UAV trajectory planning problem, aiming to maximize communication energy efficiency. The algorithm comprises two key components: first, a user scheduling algorithm based on K-means++ to establish user-UAV connection relationships; second, the MADDPG algorithm, which generates UAV trajectory planning solutions under dynamic environmental conditions and established connection relationships. Simulation results reveal the following key findings: (1) The optimized MADDPG algorithm enhances multi-UAV communication while ensuring flight safety; (2) The proposed algorithm significantly improves obstacle avoidance performance, reducing collision probability approximately 8-fold compared to traditional methods; (3) The inclusion of multi-agent direction preprocessing improves communication energy efficiency by 10.81%. However, this study only considers a low-altitude environment with mixed static obstacles. In real-world scenarios, obstacles may move or intrude dynamically, and future work should explore the impact of dynamic obstacles on trajectory planning.
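The 27-option action space mentioned above is easy to make concrete: each index maps to a step (dx, dy, dz) in {-1, 0, +1}^3, with the all-zero triple corresponding to hovering. The step size and axis convention below are assumptions for illustration, not the paper's parameters.

```python
# Enumerating the 3^3 = 27 directional movement options (sketch).
from itertools import product

ACTIONS = list(product((-1, 0, 1), repeat=3))   # all (dx, dy, dz) triples

def apply_action(pos, action_idx, step=5.0):
    """Move a UAV position by one discrete action, scaled by `step` meters."""
    dx, dy, dz = ACTIONS[action_idx]
    return (pos[0] + step * dx, pos[1] + step * dy, pos[2] + step * dz)

print(len(ACTIONS))                        # 27
print(ACTIONS[13])                         # (0, 0, 0): hover
print(apply_action((0.0, 0.0, 50.0), 26))  # step (+1, +1, +1) from 50 m altitude
```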
A Review and Prospect of Cybersecurity Research on Air Traffic Management Systems
WANG Buhong, LUO Peng, YANG Yong, ZHAO Zhengyang, DONG Ruochen, GUAN Yongjian
Available online  , doi: 10.11999/JEIT240966
Abstract:
  Significance   The air traffic management system is a critical national infrastructure that impacts both aerospace security and the safety of lives and property. With the widespread adoption of information, networking, and intelligent technologies, the modern air traffic management system has evolved into a space-air-ground-sea integrated network, incorporating heterogeneous systems and multiple stakeholders. The network security of the system can no longer be effectively ensured by device redundancy, physical isolation, security by obscurity, or human-in-the-loop strategies. Due to the stringent requirements for aviation airworthiness certification, the implementation of new cybersecurity technologies is often delayed. New types of cyberattacks, such as advanced persistent threats and supply chain attacks, are increasingly prevalent. Vulnerabilities in both hardware and software, particularly in embedded systems and industrial control systems, are continually being exposed, widening the attack surface and increasing the number of potential attack vectors. Cyberattack incidents are frequent, and the network security situation remains critical.   Progress   The United States’ Next Generation Air Transportation System (NextGen), the European Commission’s Single European Sky Air Traffic Management Research (SESAR), and the Civil Aviation Administration of China have prioritized cybersecurity in their development plans for next-generation air transportation systems. Several countries and organizations, including the United States, Japan, China, the European Union, and Germany, have established frameworks for the information security of air traffic management systems. Although network and information security for air traffic management systems is gaining attention, many countries prioritize operational safety over cybersecurity concerns. Existing security specifications and industry standards are limited in addressing network and information security. Most of them focus on top-level design and strategic directions, with insufficient attention to fundamental theories, core technologies, and key methodologies. Current review literature lacks a comprehensive assessment of assets within air traffic management systems, often focusing only on specific components such as aircraft or airports. Furthermore, research on aviation information security mainly addresses traditional concerns, without fully considering the intelligent and dynamic security challenges facing next-generation air transportation systems.   Conclusions   This paper comprehensively examines the complexity of the cybersecurity ecosystem in air traffic management systems, considering various entities such as e-enabled aircraft, Communication, Navigation, Surveillance/Air Traffic Management (CNS/ATM), smart airports, and intelligent computing. It focuses on asset categorization, information flow, threat analysis, attack modeling, and defense mechanisms, integrating dynamic flight phases to systematically review the current state of cybersecurity in air traffic management systems. Several scientific issues are identified that must be addressed in constructing a secure ecological framework for air traffic management. Based on the Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) model, this paper analyzes typical attack examples related to the four ecological entities (Figs. 7, 9, 12, and 14) and constructs an ATT&CK matrix for air traffic management systems (Fig. 15). 
Additionally, with the intelligent development goal of next-generation air transportation systems as a guide, ten typical applications of intelligent air traffic management are outlined (Fig. 13, Table 11), with a systematic analysis of the attack patterns and defense mechanisms of their intelligent algorithms (Tables 12, 13). These findings provide theoretical references for the development of smart civil aviation and the assurance of cybersecurity in China.   Prospects   Currently, the cybersecurity ecosystem of air traffic management systems is highly complex, with unclear mechanisms, indistinct boundaries for cybersecurity assets, and incomplete security assurance requirements. Moreover, there is a lack of comprehensive, systematic, and holistic cybersecurity design and defense mechanisms, which limits the ability to counter various subjective, human-driven, and emerging types of malicious cyberattacks. This paper highlights key research challenges in areas such as dynamic cybersecurity analysis, attack impact propagation modeling, human-in-the-loop cybersecurity analysis, and distributed intrusion detection systems. Cybersecurity analysis of air traffic management systems should be conducted within the dynamic operational environment of a space-air-ground-sea integrated network, accounting for the cybersecurity ecosystem and analyzing it across different spatial and temporal dimensions. As aircraft are cyber-physical systems, cybersecurity threat analysis should focus on the interrelated propagation mechanisms between security and safety, as well as their cascading failure models. Furthermore, humans serve as the last line of defense in cybersecurity. When performing threat modeling and risk assessment for avionics systems, it is crucial to fully incorporate “human-in-the-loop” characteristics to derive comprehensive and objective conclusions. Finally, the design, testing, certification, and updating of civil aviation avionics systems are constrained by strict airworthiness requirements, preventing the rapid implementation of advanced cybersecurity technologies. Distributed anomaly detection systems, however, currently represent an effective technical approach for combating cyberattacks in air traffic management systems.
Multi-Hop UAV Ad Hoc Network Access Control Protocol: Deep Reinforcement Learning-Based Time Slot Allocation Method
SONG Liubin, GUO Daoxing
Available online  , doi: 10.11999/JEIT241044
Abstract:
  Objective  Unmanned Aerial Vehicle (UAV) ad hoc networks have gained prominence in emergency and military operations due to their decentralized architecture and rapid deployment capabilities. However, the coexistence of saturated and unsaturated nodes in dynamic multi-hop topologies often results in inefficient time-slot utilization and network congestion. Existing Time Division Multiple Access (TDMA) protocols show limited adaptability to dynamic network conditions, while conventional Reinforcement Learning (RL)-based approaches primarily target single-hop or static scenarios, failing to address scalability challenges in multi-hop UAV networks. This study explores dynamic access control strategies that allow idle time slots of unsaturated nodes to be efficiently shared by saturated nodes, thereby improving overall network throughput.  Methods  A Deep Q-Learning-based Multi-Hop TDMA (DQL-MHTDMA) protocol is developed for UAV ad hoc networks. First, a backbone selection algorithm classifies nodes into saturated (high-traffic) and unsaturated (low-traffic) groups. The saturated nodes are then aggregated into a joint intelligent agent coordinated through UAV control links. Second, a distributed Deep Q-Learning (DQL) framework is implemented in each TDMA slot to dynamically select optimal transmission node sets from the saturated group. Two reward strategies are defined: (1) throughput maximization and (2) energy efficiency optimization. Third, the joint agent autonomously learns network topology and the traffic patterns of unsaturated nodes, adaptively adjusting transmission probabilities to meet the targeted objectives. Upon detecting topological changes, the agent initiates reconfiguration and retraining cycles to reconverge to optimal operational states.  Results and Discussions  Experiments conducted in static (16-node) and mobile (32-node) scenarios demonstrate the protocol’s effectiveness. As the number of iterations increases, the throughput gradually converges towards the theoretical optimum, reaching its maximum after approximately 2,000 iterations (Fig. 5). In Slot 4, the total throughput achieves the theoretical optimum of 1.8, while the throughput of Node 4 remains nearly zero. This occurs because the agent selects transmission sets {1, 8} or {2, 8} to share the channel, with transmissions from Node 1 preempting Node 4’s sending opportunities. Similarly, the total throughput of Slot 10 also attains the theoretical optimum of 1.8, resulting from the algorithm’s selection of conflict-free transmission sets {1} or {2} to share the channel simultaneously. The throughput of the DQL-MHTDMA algorithm is compared with that of other algorithms and the theoretical optimal value in odd-numbered time slots under Scenario 1. Across all time slots, the proposed algorithm achieves or closely approximates the theoretical optimum, significantly outperforming the traditional fixed-slot TDMA algorithm and the CF-MAC algorithm. Notably, the intelligent agent operates without prior knowledge of traffic patterns in each time slot or the topology of nodes beyond its own, demonstrating the algorithm’s ability to learn both slot occupancy patterns and network topology. This enables it to intelligently select the optimal transmission set to maximize throughput in each time slot. In the mobile (32-node) scenario, when the relay selection algorithm detects significant topological changes, the protocol is triggered to reselect actions. 
After each change, the algorithm rapidly converges to optimal action selection schemes and adaptively achieves near-theoretical-optimum throughput across varying topologies (Fig. 9). Under the optimal energy efficiency objective policy, energy efficiency in time slot 11 converges after 2,000 iterations, reaching a value close to the theoretical optimum (Fig. 10). Compared to the throughput-oriented algorithm, energy efficiency improves from 0.35 to 1. This occurs because the throughput-optimized algorithm preferentially selects transmission sets {1, 8} or {2, 8} to maximize throughput. However, as Node 11 lies within the 2-hop neighborhood of both Nodes 1 and 8, concurrent channel occupancy induces collisions, significantly degrading energy efficiency. In contrast, the energy-efficiency-optimized algorithm preferentially selects an empty transmission set (i.e., no scheduled transmissions), thereby maximizing energy efficiency while maintaining moderate throughput levels. The paper presents statistical comparisons of energy efficiency against theoretical optima across eight distinct time slots in the static (16-node) scenario. As demonstrated in multi-hop network environments, the proposed algorithm achieves or closely approaches theoretical optimum energy efficiency values in all slots. Furthermore, while maintaining energy efficiency guarantees, the algorithm delivers significantly higher throughput compared to conventional TDMA protocols.  Conclusions  This paper addresses the access control problem in multi-hop UAV ad hoc networks, where saturated and non-saturated nodes coexist. A DQL-MHTDMA protocol is proposed. By consolidating saturated nodes into a single large agent, the protocol learns network topology and time-slot occupation patterns to select optimal access actions, thereby maximizing throughput or energy efficiency in each time slot. Simulation results demonstrate that the algorithm exhibits fast convergence and stable performance, and achieves the theoretically optimal values for both throughput and energy efficiency objectives.
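A minimal sketch of the slot-level throughput reward implied above: a chosen transmission set earns rate only for transmitters with no other active node inside their 2-hop neighborhood, the standard graph interference model for TDMA. The toy topology and the per-link rate of 0.9 are assumptions for illustration, not the paper's network; they are chosen so that a conflict-free pair reproduces the slot optimum of 1.8 quoted in the abstract.

```python
# 2-hop conflict check and slot reward (illustrative sketch, toy topology).
ADJ = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 8: {9}, 9: {8}}  # assumed links

def two_hop(n):
    """All nodes within two hops of n (excluding n itself)."""
    hop1 = ADJ.get(n, set())
    hop2 = set().union(*(ADJ.get(m, set()) for m in hop1)) - {n}
    return hop1 | hop2

def slot_reward(tx_set, rate=0.9):
    # A transmitter earns `rate` only if no other member of the set lies
    # within its 2-hop neighborhood; conflicting nodes contribute nothing.
    return sum(rate for n in tx_set if not (two_hop(n) & (tx_set - {n})))

print(slot_reward({1, 8}))  # 1.8 -- conflict-free pair, the slot optimum
print(slot_reward({1, 3}))  # 0 -- nodes 1 and 3 interfere within 2 hops
```

The joint agent effectively learns this conflict structure from rewards alone, without being told ADJ in advance.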
Joint Design of Integrated Sensing And Communication Waveforms and Receiving Filters
LIU Tao, LI Xiangxuan, LI Yubo
Available online  , doi: 10.11999/JEIT241082
Abstract:
  Objective  With the continuous increase in wireless communication traffic and the growing scarcity of spectrum resources, the overlapping frequency bands of communication and radar systems have led to mutual interference. Therefore, Integrated Sensing And Communication (ISAC) has emerged as a critical area of research. A key technology in this field is the design of ISAC waveforms, which is of significant research interest. The Doppler resilience and low sidelobe levels of ISAC waveforms are essential for effective target detection and information transmission in ISAC scenarios. However, designing waveforms that optimize both radar and communication performance presents substantial challenges. To address these challenges, a method for the joint design of ISAC waveforms and receiving filters is proposed. An ISAC model based on mismatched filtering is proposed, and an optimization problem is formulated. The Iterative Twisted appROXimation (ITROX) algorithm is presented to solve this nonconvex problem with guaranteed convergence. This approach enables the design of unimodular ISAC waveforms with Doppler resilience, achieving enhanced performance in both communication and radar functions.  Methods  To design ISAC waveforms that optimize radar and communication performance, the concept of mismatched filtering is introduced to formulate an optimization problem. The requirements for Doppler-resilient ISAC waveforms are first analyzed, followed by the proposal of a waveform model based on mismatched filtering. An optimization problem is then formulated, with the objective of minimizing the Weighted Integrated Sidelobe Level (WISL) and the Loss-in-Processing Gain (LPG). Constraints include the unimodular property of the transmitted waveform, the phase difference between the transmitted ISAC waveform and the communication data-modulated waveform, and the energy of the receiving filter. To solve this nonconvex optimization problem, the task is transformed into identifying a suitable Mismatched Filtering Sequences Pair (MFSP) under multiple constraints. An ISAC waveform design algorithm based on an improved ITROX framework is proposed to simplify the optimization process. The core concept of the ITROX algorithm is to iteratively search for the optimal projection of the matrix set, with the goal of maximizing the main lobe and minimizing the sidelobes within the region of interest. This approach minimizes WISL and LPG, satisfying the objective function requirements. Additionally, the combination of the three constraints ensures that the waveform meets both communication and radar sensing requirements. The SQUAREd Iterative Method (SQUAREM) is employed to improve the algorithm's convergence speed. The balance between WISL and LPG is controlled by adjusting the coefficients.  Results and Discussions  The ITROX-based ISAC waveform design method proposed in this paper effectively solves the formulated optimization problem, resulting in unimodular ISAC waveforms with Doppler resilience. Compared to existing ISAC waveform methods, the proposed ISAC waveform demonstrates a lower sidelobe level and Symbol Error Rate (SER) within the region of interest, with only a minor sacrifice in LPG. This leads to significant improvements in both radar sensing and communication performance. Simulation results validate the effectiveness of the proposed ISAC waveforms. These results show that the proposed method exhibits excellent convergence, with WISL rapidly converging to a stable value as iterations increase (Fig. 1). 
When the LPG coefficient is set to 0.9, a low sidelobe level and SER are achieved, despite a processing gain loss of 0.91 dB (Fig. 2, Fig. 3). For the same phase difference threshold, the proposed ISAC waveform exhibits a lower SER than existing methods, indicating superior communication performance (Fig. 4). When comparing the ISAC waveform designed by this method to existing methods with the same time-delay interval width, the proposed waveform demonstrates a lower sidelobe level, with sidelobes nearly zero, approaching ideal correlation performance (Fig. 5, Fig. 6). This leads to significant improvements in target detection by the ISAC system. Furthermore, the proposed ISAC waveform exhibits excellent Doppler resilience, maintaining low sidelobe levels within the given Doppler interval (Fig. 6), which contributes to improved target detection performance.  Conclusions  This paper proposes a method for the joint design of ISAC waveforms and receiving filters based on Doppler resilience. By integrating the concept of mismatched filtering with the ISAC model, an optimization problem is formulated to minimize WISL and LPG without compromising communication quality. Additionally, an improved ITROX algorithm is proposed to effectively solve the formulated nonconvex optimization problem. The results demonstrate that the proposed scheme maintains near-ideal correlation performance within the region of interest under specified Doppler intervals, with only a minor sacrifice in LPG, and enables communication with a low SER. Compared to existing ISAC waveform methods, the proposed ISAC waveform exhibits a lower sidelobe level and SER, showing superior radar sensing and communication performance. Furthermore, low sidelobe levels can be achieved in one or more regions of interest to meet different requirements by appropriately adjusting the weighting coefficient. Future work could explore more efficient optimization algorithms to design ISAC waveforms with enhanced Doppler resilience.
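The two figures of merit traded off above are straightforward to compute for a waveform/filter pair. The sketch below evaluates WISL and LPG for a unimodular sequence s and a mismatched filter h; the sequence length, the uniform weights, and the way h is perturbed from s are illustrative assumptions, not the paper's design.

```python
# WISL and LPG of a waveform / mismatched-filter pair (illustrative sketch).
import numpy as np

rng = np.random.default_rng(4)
N = 64
s = np.exp(2j * np.pi * rng.random(N))                          # unimodular
h = s + 0.05 * (rng.normal(size=N) + 1j * rng.normal(size=N))   # filter

def wisl(s, h, w=None):
    r = np.correlate(h, s, mode="full")   # cross-correlation, 2N-1 lags
    w = np.ones(2 * N - 1) if w is None else w
    sl = np.abs(r) ** 2 * w
    return float(np.sum(sl) - sl[N - 1])  # exclude the zero-lag mainlobe

def lpg_db(s, h):
    # Gain lost by the mismatched filter relative to the matched filter
    # (0 dB when h == s, by the Cauchy-Schwarz inequality).
    num = np.abs(np.vdot(h, s)) ** 2
    den = (np.vdot(s, s) * np.vdot(h, h)).real
    return float(-10.0 * np.log10(num / den))

print(f"WISL = {wisl(s, h):.1f}, LPG = {lpg_db(s, h):.3f} dB")
```

The optimization in the paper shapes h (and s) so that wisl() collapses inside the Doppler region of interest while lpg_db() stays below roughly 1 dB.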
Energy Characteristic Map Based Resource Allocation Algorithm for High-density V2V Communications
QIU Gongan, LIU Yongsheng, ZHANG Guoan, LIU Min
Available online  , doi: 10.11999/JEIT250004
Abstract:
  Objective  In high-density scenarios, the random resource selection method cannot adequately control the high access collision probability of traffic safety messages under limited frequency resources. At the same time, the variable topology caused by high mobility increases the failure rate of Vehicle-to-Vehicle (V2V) links. Yet traffic safety messages require ultra-high reliability and ultra-low latency to ensure traffic safety and road efficiency in such scenarios. To address these challenges, integrating the energy characteristic parameters of sub-frames and sub-carriers into the resource block map has emerged as a promising approach. By incorporating distributed V2V links and designing effective reward functions, the access collision probability can be decreased and the dynamics of the variable topology smoothed while maintaining high resource efficiency, thereby better meeting the needs of dense traffic. This research offers an intelligent solution for resource allocation in Cellular Vehicle-to-Everything (C-V2X) and provides theoretical support for the coordinated access of limited frequency resources with diverse link quality.  Methods  Exploiting the sustained adjacency among neighboring vehicles in high-density V2V communications, an Energy Characteristic Map (ECM) based resource allocation algorithm is proposed using deep reinforcement learning. The ECM algorithm periodically renews the energy indicators of candidate resources to train the weight coefficient matrix of a two-layer Deep Neural Network (DNN) based on the characteristic results within the sensing window. The resulting map then serves as the action space of a double Deep Q-Network (DQN) agent, comprising a main DQN and a target DQN, that maximizes V2V throughput. The state space of the DQN model includes the energy indicators of candidate resources, such as the Received Signal Strength Indicator (RSSI) in sub-frames and the Signal-to-Interference plus Noise Ratio (SINR) in sub-carriers, along with dynamic factors such as the relative position and speed of other vehicles. The reward function is crucial for ensuring resource efficiency and safety-message performance during resource block selection; it accounts for factors such as the bandwidth and SINR of V2V links to optimize decision-making. Additionally, the discount factor determines the weight of future rewards, balancing immediate against long-term returns: a lower discount factor emphasizes immediate rewards and leads to frequent resource block reselection, whereas a higher discount factor enhances the robustness of occupied resources.  Results and Discussions  The ECM algorithm periodically renews the energy indicators of candidate resources based on the characteristic results within the sensing window, which then serve as the action space of the double DQN agent. With an appropriate reward function, the main DQN selects candidate resources with high energy indicators for the V2V links. The numerical relationship between the Packet Reception Ratio (PRR) and the energy indicators (Expressions 11 and 15) is analyzed using discrete-time Markov chains. Simulation results on WiLabV2Xsim show the end-to-end dissemination performance of safety messages under variable V2V distances, represented by the blue lines (Fig. 6, Fig. 7). 
The reliability, measured by the Packet Received Ratio (PRR), exceeds 0.95 at densities below 160 veh/km (the blue line), whereas the comparative schemes sustain a PRR above 0.95 only below 120 veh/km (the green line) and 90 veh/km (the red line), respectively (Fig. 10). Meanwhile, the latency, TD, remains below 3 ms at densities up to 180 veh/km (the blue line), whereas the comparative schemes keep TD below 3 ms only up to 160 veh/km (the green line) and about 80 veh/km (the red line), respectively (Fig. 11). The resource utilization, RU, exceeds 0.6 at densities up to 180 veh/km (the blue line), whereas the comparative schemes exceed 0.6 only up to 160 veh/km (the green line) and about 80 veh/km (the red line), respectively (Fig. 12), demonstrating a 10~20% improvement in resource efficiency. When the discount factor is set to 0.9 and the learning rate to 0.01 (Fig. 8, Fig. 9), the Vehicle User Equipment (VUE) selects resource blocks that balance immediate and long-term throughput, effectively improving the robustness of the main DQN and meeting advanced V2V service requirements such as platooning in C-V2X.  Conclusions  This paper addresses resource allocation in high-density V2V communications by integrating the ECM algorithm with a double DQN agent. The proposed resource selection scheme improves on the RSS method by establishing distributed V2V links over high-quality resource blocks to maximize throughput. The scheme is evaluated through safety-message dissemination simulations under variable density, and the results show that: (1) The proposed scheme achieves high reliability, with a PRR above 0.95, and ultra-low latency, with a TD below 3 ms, at densities up to 160 veh/km. (2) Resource efficiency is improved by 10~20% over the RSS method. (3) Long-term and short-term rewards are balanced by selecting a discount factor of 0.9 and a learning rate of 0.01, enhancing the robustness of the DQN model. However, this study does not consider distinct resource characteristics for heterogeneous messages with diverse Quality of Service (QoS) requirements, which should be accounted for in future work.
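To make the double-DQN step described above concrete, the following is a minimal sketch in which a two-layer DNN maps state features to Q-values over candidate resource blocks, the main DQN selects the next resource block, and the target DQN evaluates it; the feature dimension, hidden width, and variable names are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of a double-DQN target for resource-block selection.
import torch
import torch.nn as nn

class TwoLayerDNN(nn.Module):
    """Two-layer DNN mapping state features (e.g., RSSI, SINR, relative
    position/speed) to Q-values over candidate resource blocks."""
    def __init__(self, state_dim, n_resource_blocks, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_resource_blocks),
        )

    def forward(self, s):
        return self.net(s)

def double_dqn_target(main_dqn, target_dqn, reward, next_state, gamma=0.9):
    # Main DQN chooses the best next resource block; target DQN evaluates it.
    with torch.no_grad():
        best = main_dqn(next_state).argmax(dim=1, keepdim=True)
        next_q = target_dqn(next_state).gather(1, best).squeeze(1)
    return reward + gamma * next_q  # gamma = 0.9, matching the discount factor above

# Usage: a batch of 4 states with 6 assumed features and 10 candidate blocks.
main, target = TwoLayerDNN(6, 10), TwoLayerDNN(6, 10)
y = double_dqn_target(main, target, torch.ones(4), torch.randn(4, 6))
```

Decoupling selection (main DQN) from evaluation (target DQN) curbs the overestimation bias of a single network and, with a discount factor of 0.9, weights long-term throughput as described above.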
Left Atrial Scar Segmentation Method Combining Cross-Modal Feature Excitation and Dual Branch Cross Attention Fusion
RUAN Dongsheng, SHI Zhebin, WANG Jiahui, LI Yang, JIANG Mingfeng
Available online  , doi: 10.11999/JEIT240775
Abstract:
  Objective  Atrial Fibrillation (AF) is a common arrhythmia associated with increased mortality. The distribution and extent of left atrial fibrosis are critical for predicting the onset and persistence of AF, as fibrotic tissue alters cardiac electrical conduction. Accurate segmentation of left atrial scars is essential for identifying fibrotic lesions and informing clinical diagnosis and treatment. However, this task remains challenging due to the irregular morphology, sparse distribution, and small size of scars. Deep learning models often perform poorly in scar feature extraction owing to limited supervision of atrial boundary information, which results in detail loss and reduced segmentation accuracy. Although increasing dataset size can improve performance, medical image acquisition is costly. To address this, the present study integrates prior knowledge that scars are generally located on the atrial wall to enhance feature extraction while reducing reliance on large labeled datasets. Two boundary feature enhancement modules are proposed. The Cross-Modal feature Excitation (CME) module encodes atrial boundary features to guide the network’s attention to atrial structures. The Dual-Branch Cross-Attention (DBCA) fusion module combines Magnetic Resonance Imaging (MRI) and boundary features at a deeper level to enhance boundary scar representation and improve segmentation accuracy.  Methods  This study proposes an enhanced U-shaped encoder–decoder framework for left atrial scar segmentation, incorporating two modules: the CME module and the DBCA module. These modules are embedded within the encoder to strengthen attention on atrial boundary features and improve segmentation accuracy. First, left atrial cavity segmentation is performed on cardiac MRI using a pre-trained model to obtain a binary mask. This binary map undergoes dilation and erosion to generate a Signed Distance Map (SDM), which is then used together with the MRI as input to the model. The SDM serves as an auxiliary representation that introduces boundary constraints. The CME module, integrated within the encoder’s convolutional blocks, applies channel and spatial attention mechanisms to both MRI and SDM features, thereby enhancing boundary information and guiding attention to scar regions. To further reinforce boundary features at the semantic level, the DBCA module is positioned at the bottleneck layer. This module employs a two-branch cross-attention mechanism to facilitate deep interaction and fusion of MRI and boundary features. The bidirectional cross-attention enables SDM and MRI features to exchange cross-modal information, reducing feature heterogeneity and generating semantically enriched and robust boundary fusion features. A combined Dice and cross-entropy loss function is used during training to improve segmentation precision and scar region identification.  Results and Discussions  This study uses a dataset of 60 left atrial scar segmentations from the LAScarQS 2022 Task 1. The dataset is randomly divided into 48 training and 12 test cases. Several medical image segmentation models, including U-Net, nnUNet, and TransUNet, are evaluated. Results show that three-dimensional segmentation consistently outperforms two-dimensional approaches. The proposed method exceeds the baseline nnUNet, with a 2.17% improvement in Dice score and a 4.82% increase in accuracy (Table 1). Visual assessments confirm improved sensitivity to small scar regions and enhanced attention to boundaries (Fig. 6, Fig. 7). 
To assess model performance, comparative and ablation experiments are conducted. These include evaluations of encoder configurations (shared vs. independent), feature fusion strategies (CME, DBCA, and CBAM), and the fusion weight parameters α and β. An independent encoder incorporating both CME and DBCA modules achieves the highest performance (Table 3), with the optimal weight configuration at α = 0.7 and β = 0.3 (Table 5). The effect of different left atrial border widths (2.5 mm, 5.0 mm, and 7.5 mm) is also analyzed. A 5.0 mm width provides the best segmentation results, whereas 7.5 mm may extend beyond the relevant region and reduce accuracy (Table 6).  Conclusions  This study integrates the proposed CME and DBCA modules into the nnUNet framework to address detail loss and feature extraction limitations in left atrial scar segmentation. The findings indicate that: (1) The CME module enhances MRI feature representation by incorporating left atrial boundary information across spatial and channel dimensions, improving the model’s focus on scar regions; (2) The DBCA module enables effective learning and fusion of boundary and MRI features, further improving segmentation accuracy; (3) The proposed model outperforms existing medical image segmentation methods on the LAScarQS 2022 dataset, achieving a 2.17% increase in Dice score and a 4.82% gain in accuracy compared to the baseline nnUNet. Despite these improvements, current deep learning models remain limited in their sensitivity to small and poorly defined scars, which often results in segmentation omissions. Challenges persist due to the limited dataset size and the relatively small proportion of scar tissue within each image. These factors constrain the training process and model generalizability. Future work should focus on optimizing scar segmentation under small-sample conditions and addressing sample imbalance to improve overall performance.
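As a concrete reference for the training objective mentioned above, the following is a minimal sketch of a combined Dice and cross-entropy segmentation loss; the equal weighting and the smoothing constant are illustrative assumptions, not the paper's exact settings.

```python
# Hedged sketch of a combined Dice + cross-entropy loss for scar segmentation.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, ce_weight=0.5, eps=1e-6):
    """logits: (B, C, H, W) raw scores; target: (B, H, W) integer labels."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2 * inter + eps) / (denom + eps)).mean()   # soft Dice over classes
    return ce_weight * ce + (1 - ce_weight) * dice

# Usage: two classes (background vs. scar) on a 64x64 crop.
logits = torch.randn(2, 2, 64, 64)
target = torch.randint(0, 2, (2, 64, 64))
loss = dice_ce_loss(logits, target)
```

The Dice term counteracts the extreme foreground/background imbalance typical of small, sparse scars, while the cross-entropy term keeps per-pixel gradients well behaved.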
Real-time Adaptive Suppression of Broadband Noise in General Sensing Signals
WEN Yumei, ZHU Yu
Available online  , doi: 10.11999/JEIT250018
Abstract:
  Objective  Broadband noise is inevitable in sensing outputs due to thermal noise from the sensing system and various uncorrelated environmental disturbances. Adaptive filtering is a common method for removing such noise. At convergence, the adaptive filter output provides the optimal estimate of the sensing signal. However, during actual sensing, changes in the sensing signal alter the statistical characteristics of the output, so the adaptive process must re-adjust and converge to a new steady state. The filter output during this adjustment is not the optimal estimate and introduces distortion, thereby adding extra noise. Fast-converging adaptive algorithms are typically employed to improve the filter's response to such changes. Yet regardless of convergence speed and the method used to update filter coefficients, the adjustment process remains unavoidable, during which the filter output is distorted and additional noise is introduced. To keep the filter at steady state irrespective of changes in the sensing signal, a new adaptive filtering method is proposed. This method ensures that the input to the adaptive filter remains stationary, thereby preventing output distortion and the introduction of extra noise.  Methods  First, a threshold $R$ and a quantization scale $Q$ are defined in terms of the noise standard deviation $\sigma$, where $R = 3\sqrt{2}\sigma$ and $Q = 3\sigma$. A quantization transformation is applied to the sensing output $x(n)$ in real time, and the transformation result $q(n)$ is used as the new sequence to be filtered. When the absolute value of the first-order difference of $x(n)$ is no less than $R$, the sensing signal $s(n)$ is considered to have changed, and $p(n)$ is set to the quantization value of $x(n)$ according to $Q$. When the absolute value of the first-order difference of $x(n)$ is less than $R$, $s(n)$ is considered unchanged, and $p(n)$ keeps its previous value, i.e., $p(n) = p(n-1)$. Let $q(n) = x(n) - p(n)$; then $q(n)$ contains both the sensing-signal information and the noise. Although its variance may change slightly, the mean of $q(n)$ remains 0, so $q(n)$ stays approximately stationary. Next, $q(n - n_0)$ is used as the input to the adaptive filter, with $q(n)$ serving as the reference, where $q(n - n_0)$ is the time-delayed version of $q(n)$ and $n_0$ denotes the delay length. The method thus performs adaptive linear prediction of $q(n)$ and filters out the broadband noise.
Finally, the adaptive filter output $y(n)$ is compensated with $p(n)$ to obtain a noise-removed estimate of the sensing signal $s(n)$.  Results and Discussions  The maximum mean square errors produced by the proposed method and by conventional adaptive algorithms are compared using computer-simulated noisy band-limited step signals and noisy one-sided sinusoidal signals, and the Signal-to-Noise Ratio (SNR) improvements obtained during filtering are evaluated concurrently. For the noisy band-limited step signal (Table 1), the maximum mean square error of the proposed method is only 0.18% of that produced by the Recursive Least Squares (RLS) algorithm and 0.15%~0.19% of those produced by the Least Mean Square (LMS) algorithms; correspondingly, the SNR improvement is 25.88 dB higher than that of the RLS algorithm and 28.65 dB~32.35 dB higher than those of the LMS algorithms. For the noisy one-sided sinusoidal signal (Table 2), the maximum mean square error of the proposed method is 0.3% of that of the RLS algorithm and 0.06%~0.08% of those of the compared LMS algorithms, while the SNR improvement is 10.25 dB higher than that of the RLS algorithm and 26.53 dB~29.61 dB higher than those of the compared LMS algorithms. Figures 3 and 5 illustrate the quantization transformation results for the noisy band-limited step signal and the noisy sinusoidal signal, demonstrating stability and consistency with theoretical expectations. Real sensing outputs mainly comprise static or quasi-static signals (Figures 7 and 8), step or step-like signals (Figures 9 and 10), and periodic or quasi-periodic signals (Figures 11 and 12). Comparative analysis against common adaptive algorithms on varied real sensing outputs consistently shows superior filtering by the proposed method, with minimal distortion and no additional noise, regardless of whether the sensing signals change.  Conclusions  A new adaptive filtering method is proposed in this paper. It keeps the adaptive filter operating at steady state, avoiding the additional noise caused by distortion during re-adjustment to a new steady state. Results from computer simulations and real signal processing demonstrate that the proposed method filters both dynamic and static sensing signals effectively, outperforming commonly used adaptive algorithms.
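The quantization transformation lends itself to a compact sketch. The version below follows the rules stated above ($R = 3\sqrt{2}\sigma$, $Q = 3\sigma$, $q(n) = x(n) - p(n)$) and pairs the transformed sequence with a basic LMS linear predictor; the rounding rule used for the quantization value, the initialization of $p(0)$, and the LMS parameters are illustrative assumptions rather than the paper's exact choices.

```python
# Hedged sketch: quantization transformation + LMS linear prediction of q(n).
import numpy as np

def quantize_transform(x, sigma):
    R, Q = 3 * np.sqrt(2) * sigma, 3 * sigma      # threshold and quantization scale
    p = np.zeros_like(x)
    p[0] = Q * np.round(x[0] / Q)                 # assumed initialization
    for n in range(1, len(x)):
        if abs(x[n] - x[n - 1]) >= R:             # signal judged to have changed
            p[n] = Q * np.round(x[n] / Q)         # assumed rounding to the Q grid
        else:                                     # signal judged unchanged
            p[n] = p[n - 1]
    return x - p, p                               # q(n) = x(n) - p(n)

def lms_predict(q, n0=1, taps=16, mu=0.01):
    """Adaptive linear prediction of q(n) from the delayed input q(n - n0)."""
    w, y = np.zeros(taps), np.zeros_like(q)
    for n in range(n0 + taps, len(q)):
        u = q[n - n0 - taps + 1 : n - n0 + 1][::-1]   # delayed tap vector
        y[n] = w @ u
        w += mu * (q[n] - y[n]) * u                   # reference is q(n)
    return y

# Usage: a noisy step-like signal with known noise level sigma.
rng = np.random.default_rng(0)
sigma = 0.1
x = np.concatenate([np.zeros(500), np.ones(500)]) + sigma * rng.standard_normal(1000)
q, p = quantize_transform(x, sigma)
s_hat = lms_predict(q) + p                            # compensate output with p(n)
```

Because $p(n)$ absorbs the step changes, $q(n)$ stays zero-mean and approximately stationary, so the predictor never has to re-converge when the sensing signal jumps.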
Multi-Model Fusion-Based Abnormal Trajectory Correction Method for Unmanned Aerial Vehicles
WANG Wei, SHE Dingchen, WANG Jiaqi, HAN Dairu, JIN Benzhou
Available online  , doi: 10.11999/JEIT241026
Abstract:
  Objective  The opening of low-altitude airspace and the widespread deployment of Unmanned Aerial Vehicles (UAVs) have significantly increased low-altitude flight activities. Trajectory planning is essential for ensuring UAVs operate safely in complex environments. However, wireless remote control links are vulnerable to interference and spoofing attacks, leading to deviations from planned trajectories and posing serious safety risks. To mitigate these risks, UAV position parameters can be predicted and used to replace erroneous navigation system values, thereby correcting abnormal trajectories. Existing prediction-based correction methods, however, exhibit low efficiency and error accumulation over long-term predictions, limiting their practical application. To address these limitations, this study proposes a multi-model fusion method to improve the efficiency and accuracy of abnormal trajectory correction, providing a robust solution for real-world UAV operations.  Methods  A Long Short-Term Memory (LSTM)-Transformer prediction model, integrating LSTM and Transformer sub-models, is proposed to exploit the strengths of both architectures in time series forecasting. LSTM efficiently captures short-term dependencies in sequential data, whereas Transformer is well suited for modeling long-term dependencies. By combining these architectures, the proposed model enhances the capture of both short-term and long-term dependencies, reducing prediction errors. The overall framework of the LSTM-Transformer prediction model is illustrated in Fig. 3. The input time series data undergo preprocessing before being fed into the LSTM and Transformer sub-models, each generating a corresponding feature vector. These feature vectors are concatenated and further processed by a fully connected layer to extract intrinsic data features, ultimately producing the prediction results. To further optimize the model, a blockwise attention strategy is proposed; the detailed computation process is shown in Fig. 4. During self-attention calculations in the Transformer sub-model, the input sequence is divided into multiple sub-blocks, allowing for parallel computation. The results are then concatenated to obtain the final output. This approach effectively reduces the computational complexity of the Transformer sub-model while improving the efficiency of abnormal trajectory correction. The blockwise attention strategy not only enhances computational efficiency but also maintains prediction accuracy, making it a crucial component of the proposed method.  Results and Discussions  Experiments are conducted using a public dataset to predict UAV positional parameters, including longitude, latitude, and altitude. The dataset’s feature parameters are presented in Table 1. The trajectory correction performance of the proposed method is evaluated and compared with other correction methods using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). Fig. 5 and Fig. 6 present the error metrics of the proposed method in comparison with Support Vector Regression (SVR), CNN-LSTM, and LSTM-RF under different prediction step sizes and measurement noise standard deviation conditions. The results indicate that the proposed method achieves the lowest correction errors. At a prediction step size of 20 and a measurement noise standard deviation of 0.19, the proposed method achieves RMSE, MAE, and MAPE values of 0.2971, 0.2208, and 21.688%, respectively.
Compared with SVR, CNN-LSTM, and LSTM-RF, the RMSE is reduced by 39.52%, 6.22%, and 20.65%, the MAE by 45.5%, 8.46%, and 20.52%, and the MAPE by 8.955%, 2.03%, and 3.532%, respectively. Fig. 7 and Fig. 8 compare the proposed method with the original LSTM-Transformer, the Transformer with the blockwise attention optimization strategy, and individual LSTM and Transformer models in terms of error metrics under different prediction steps and measurement noise standard deviation conditions. When the prediction step is 20 and the measurement noise standard deviation is 0.19, the proposed method achieves RMSE reductions of 12.23%, 4.07%, 1.36%, and 3.48%, MAE reductions of 19.36%, 6.76%, 3.83%, and 4.21%, and MAPE reductions of 3.84%, 3.616%, 2.075%, and 2.087%, compared with the other four correction methods. These findings demonstrate the superior performance of the proposed method in reducing trajectory correction errors. The runtime efficiency of the proposed method under different prediction steps is evaluated, as shown in Fig. 9. With a prediction step size of 20, the proposed method completes the prediction in 0.699 s, which is 35.87% faster than the original LSTM-Transformer model. This confirms that the blockwise attention optimization strategy enhances correction efficiency. Finally, Fig. 10 presents trajectory comparisons, illustrating the accuracy of the proposed method. The predicted trajectories closely align with actual trajectories, outperforming baseline methods in correcting UAV abnormal trajectories under various conditions.  Conclusions  The proposed multi-model fusion method for UAV abnormal trajectory correction enhances correction efficiency and reduces errors more effectively than benchmark methods. The results demonstrate that the method achieves accurate and reliable trajectory correction, making it suitable for practical UAV applications.
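A minimal sketch of the blockwise attention idea follows: the sequence is split into fixed-length sub-blocks, self-attention is computed within each block in parallel, and the block outputs are concatenated, so the attention cost grows with the block length squared per block rather than with the full sequence length squared. Layer sizes and the block length are assumptions, not the paper's configuration.

```python
# Hedged sketch of blockwise self-attention for the Transformer sub-model.
import torch
import torch.nn as nn

class BlockwiseSelfAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4, block_len=10):
        super().__init__()
        self.block_len = block_len
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        assert t % self.block_len == 0, "pad the sequence to a multiple of block_len"
        blocks = x.reshape(b * t // self.block_len, self.block_len, d)
        out, _ = self.attn(blocks, blocks, blocks)   # attention within each block
        return out.reshape(b, t, d)              # concatenate block outputs

# Usage: a batch of 8 sequences of length 20 (e.g., a prediction step size of 20).
y = BlockwiseSelfAttention()(torch.randn(8, 20, 64))
```

Since the blocks attend independently, they can be processed in parallel, which is consistent with the runtime gain reported above.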
Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles
HE Jiang, YU Wanxin, HUANG Hao, JIANG Weiheng
Available online  , doi: 10.11999/JEIT240738
Abstract:
  Objective  The demand for Unmanned Aerial Vehicles (UAVs) in distributed sensing applications has increased significantly due to their low cost, flexibility, mobility, and ease of deployment. In these applications, the coordination of multi-UAV sensing tasks, communication strategies, and flight trajectory optimization presents a significant challenge. Although there have been preliminary studies on the joint optimization of UAV communication strategies and flight trajectories, most existing work overlooks the impact of the randomly distributed and dynamically updated task airspace model on the optimal design of UAV communication and flight strategies. Furthermore, accurate UAV energy consumption modeling is often lacking when establishing system design goals. Energy consumption during flight, sensing, and data transmission is a critical issue, especially given the UAV’s limited payload capacity and energy supply, and an accurate energy consumption model is essential for extending UAV operational time. To address the requirements of multiple UAVs performing distributed sensing, particularly when tasks are dynamically updated and data must be transmitted to ground base stations, this paper explores the optimal design of joint UAV sensing task allocation, base station association for data backhaul, flight strategy planning, and transmit power control.  Methods  To coordinate the relationships among UAVs, base stations, and sensing tasks, a protocol framework for multi-UAV distributed task sensing applications is first proposed. This framework divides the UAVs’ behavior during distributed sensing into four stages: cooperation, movement, sensing, and transmission. The framework ensures coordination in the UAVs’ movement to the task area, task sensing, and the backhaul transmission of sensed data. A sensing task model based on dynamic updates, a UAV movement model, a UAV sensing behavior model, and a data backhaul transmission model are then established. A revenue function, combining task sensing utility and task execution costs, is designed, leading to a joint optimization problem of UAV task allocation, communication base station association, and flight strategy. The objective is to maximize the long-term weighted difference between sensing utility and execution cost. Given that the optimization problem involves high-dimensional decision variables in both discrete and continuous forms, and the objective function is non-convex with respect to these variables, the problem is a typical non-convex Mixed-Integer Non-Linear Programming (MINLP) problem and falls within the NP-hard complexity class. Centralized optimization algorithms for this formulation require a central node with high computational capacity and the collection of substantial additional information, such as channel state and UAV location data, resulting in high information-interaction overhead and poor scalability. To overcome these challenges, the problem is reformulated as a Markov Game (MG). An effective algorithm is designed by leveraging the distributed coordination concept of Multi-Agent (MA) systems and the exploration capability of deep Reinforcement Learning (RL) within the optimization solution space. Specifically, due to the complex coupling between the continuous and discrete action spaces in the MG problem, a novel solution algorithm called Multi-Agent Independent-Learning Compound-Action Actor-Critic (MA-IL-CA2C) based on Independent Learning (IL) is proposed.
The core idea is as follows: first, the independent-learning algorithm is applied to extend single-agent RL to an MA environment. Then, deep learning is used to represent the high-dimensional action and state spaces. To handle the combined discrete and continuous action spaces, the UAV action space is decomposed into discrete and continuous components, with the Deep Q-Network (DQN) algorithm applied to the discrete space and the Deep Deterministic Policy Gradient (DDPG) algorithm to the continuous space.  Results and Discussions  The computational complexity of action selection and training for the proposed MA-IL-CA2C algorithm is theoretically analyzed. The results show that its complexity is almost equivalent to that of the two benchmark algorithms, DQN and DDPG. Additionally, the performance of the proposed algorithm is simulated and analyzed. When compared with the DQN, DDPG, and Greedy algorithms, the MA-IL-CA2C algorithm demonstrates lower network energy consumption throughout the network operation (Fig. 6), improved system revenue (Fig. 5, Fig. 8, and Fig. 9), and optimized UAV flight strategies (Fig. 7).  Conclusions  This paper addresses and solves the optimal design problems of joint UAV sensing task allocation, data backhaul base station association, flight strategy planning, and transmit power control for multi-UAV distributed task sensing. A new MA-IL-CA2C algorithm based on IL is proposed. The simulation results show that the proposed algorithm achieves better system revenue while minimizing UAV energy consumption.
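To illustrate the compound-action decomposition, the sketch below pairs a DQN head for the discrete component (e.g., the task or base-station choice) with a DDPG actor for the continuous component (e.g., transmit power); the network sizes, feature dimension, and power bound are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of compound-action selection: discrete via DQN, continuous via DDPG.
import torch
import torch.nn as nn

class DiscreteQHead(nn.Module):                  # DQN over discrete choices
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.q = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                               nn.Linear(64, n_actions))

    def forward(self, s):
        return self.q(s)

class ContinuousActor(nn.Module):                # DDPG actor over continuous actions
    def __init__(self, state_dim, act_dim, p_max=1.0):
        super().__init__()
        self.p_max = p_max
        self.pi = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                nn.Linear(64, act_dim), nn.Tanh())

    def forward(self, s):
        return 0.5 * (self.pi(s) + 1.0) * self.p_max   # squash into [0, p_max]

# Usage: each UAV agent evaluates its own local state independently (IL).
state = torch.randn(1, 12)                           # 12 assumed local-state features
choice = DiscreteQHead(12, 5)(state).argmax(dim=1)   # e.g., base-station association
power = ContinuousActor(12, 1, p_max=0.2)(state)     # e.g., transmit power (watts)
```

Each agent holding its own pair of networks mirrors the independent-learning setup, in which no central node needs to gather global channel or location information.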
Survey of Unified Representation Technology of Multi-dimensional Information for Low Altitude Intelligent Network
DONG Chao, CUI Can, JIA Ziye, ZHU Yian, ZHANG Lei, WU Qihui
Available online  , doi: 10.11999/JEIT240835
Abstract:
  Significance   The Low Altitude Intelligent Network (LAIN) has emerged as a critical productive force in recent years, particularly with the growing strategic role of the low-altitude economy in national development plans. As an integral part of smart city infrastructure and advanced air mobility systems, LAIN contributes both to economic growth and to airspace security. By integrating unmanned aerial vehicles, fifth-generation communication technologies, and artificial intelligence, LAIN enables real-time monitoring and provides services for urban traffic, agriculture, and disaster management. This integration optimizes resource allocation and enhances public safety. However, the rapid development of LAIN results in a vast array of distributed aircraft and ground equipment that generate large volumes of heterogeneous data in various formats. The absence of a unified representation standard significantly hinders the efficient utilization of data within the LAIN ecosystem, presenting substantial challenges for its widespread application in complex real-world scenarios. Therefore, the development of a unified data representation model for multi-dimensional and heterogeneous information within LAIN is essential to eliminate data heterogeneity, enhance data utilization efficiency, and promote the deep integration of the low-altitude economy with the digital economy.   Process   Existing research has explored innovative methods and technologies for information representation and for addressing potential challenges in the LAIN. However, current solutions remain domain-specific and lack adaptability to the dynamic environment of LAIN. The absence of targeted research and standards makes it difficult to establish a unified representation for multi-source data. To bridge this gap, a unified representation model for heterogeneous information is proposed for LAIN. This paper aims to address the challenges posed by complex data and information in the LAIN environment, particularly within the context of sixth-generation communication technologies, and to provide new approaches for data management and application in LAIN. First, the heterogeneous data types within LAIN are categorized, highlighting their key characteristics and application scenarios. A platform for LAIN data integration and fusion is then developed, incorporating multiple technologies to facilitate efficient data collection, transmission, processing, and visual display. Additionally, the challenges of achieving a unified representation of multi-dimensional and heterogeneous information within LAIN are analyzed. Finally, promising methods for data fusion and representation are discussed, including data fusion, spatiotemporal gridding data technology, multimodal technology, and knowledge graphs. These methods aim to establish a unified knowledge representation model and achieve semantic alignment, enabling the integration of data from diverse sources. Specifically, multi-source data are preprocessed to enhance understandability and availability through multi-level fusion, integrating multi-dimensional information from various sensors and data sources within a unified framework. Spatiotemporal gridding standardizes data formats and captures spatiotemporal changes, thereby effectively processing and integrating multi-source, multi-dimensional spatial data.
Furthermore, integrating multimodal data through multimodal technology is expected to improve decision-making accuracy, while the knowledge graph links multi-source data, constructing a knowledge network that standardizes and correlates information from various sources, formats, and semantics.   Prospects   With the advancement of multi-dimensional data unified representation technology, the LAIN is poised to integrate with edge computing, radio knowledge description languages, large language models, and other emerging technologies to enable intelligent analysis and autonomous decision-making for low-altitude systems. Specifically, data processing can be optimized through edge computing. By positioning edge devices closer to the terminal, edge computing facilitates preprocessing and preliminary analysis at the data source. This technology enhances response speed and efficiency, providing high-quality services for the rapid acquisition and unified representation of LAIN information. Data from various sensors and systems can be structured and represented in an organized manner, facilitating data exchange between different systems, enabling machine-readable spectrum management policies, and reducing interference incidents. Additionally, large language models can assist in constructing and refining knowledge graphs, advancing the intelligent operation and management of low-altitude aircraft. These promising technologies are expected to support further fusion and unified representation of LAIN data, laying a foundation for future research in the LAIN field.  Conclusions   This paper systematically addresses the challenges of multi-dimensional data representation in the LAIN through a combination of theoretical innovation and technological integration. The main contributions of this paper include: (1) A summary of related works in the field, with an introduction to potential heterogeneous data types, their key characteristics, and relevant application scenarios. (2) The proposal of a low-altitude information fusion and monitoring system, with an analysis of the challenges in achieving unified data representation. (3) The introduction of key technologies such as data fusion, spatiotemporal gridding data technology, multimodal technology, and knowledge graphs. Additionally, edge computing technology, radio knowledge description language, and large language model technology are integrated to enhance data fusion and unified representation in LAIN. The findings of this study provide both theoretical and technical support for the development of LAIN, fostering the efficient utilization and intelligent advancement of information resources.
Aerial Target Intention Recognition Method Integrating Information Classification Processing and Multi-scale Embedding Graph Robust Learning with Noisy Labels
SONG Zihao, ZHOU Yan, CAI Yichao, CHENG Wei, YUAN Kai, LI Hui
Available online  , doi: 10.11999/JEIT241074
Abstract:
  Objective  Aerial Target Intention Recognition (ATIR) predicts and assesses the intentions of non-cooperative targets by integrating information acquired and processed by various sensors. Accurate recognition enhances decision-making, aiding commanders and combatants in steering engagements favorably. Therefore, robust and precise recognition methods are essential. Advances in big data and detection technologies have driven research into deep-learning-based intention recognition. However, noisy labels in target intention recognition datasets hinder the reliability of traditional deep-learning models. To address this issue, this study proposes an intention recognition method incorporating Information Classification Processing (ICP) and multi-scale robust learning. The trained model demonstrates high accuracy even in the presence of noisy labels.  Methods  This method integrates an ICP network, a cross-scale embedding fusion mechanism, and multi-scale embedding graph learning. The ICP network performs cross-classification processing by analyzing attribute correlations and differences, facilitating the extraction of embeddings conducive to intention recognition. The cross-scale embedding fusion mechanism employs target sequences at different scales to train multiple Deep Neural Networks (DNNs) simultaneously, sequentially integrating robust embeddings from fine to coarse scales. During training, complementary information across scales enables a cross-teaching strategy, in which each encoder selects clean-label samples based on a small-loss criterion. Additionally, multi-scale embedding graph learning establishes relationships between labeled and unlabeled samples to correct noisy labels. Specifically, for high-loss unselected samples, the Speaker-listener Label PropagAtion (SLPA) algorithm refines their labels using the multi-scale embedding graph, improving model adaptation to the class distribution of target attribute sequences.  Results and Discussions  When the proportion of symmetric noise is 20% (Table 1), the test accuracy of the Cross-Entropy (CE) method exceeds 80%, demonstrating the effectiveness of the ICP network. The proposed method achieves both test accuracy and a Macro F1 score (M F1) above 92%. At higher noise levels of 50% symmetric noise and 40% asymmetric noise (Table 1), the performance of other methods declines significantly. In contrast, the proposed method maintains accuracy and M F1 above 84%, indicating greater stability and robustness. This strong performance can be attributed to: (1) Cross-scale fusion, which integrates complementary information from different scales, enhancing the separability and robustness of fused embeddings. This ensures the selection of high-quality samples and prevents performance degradation caused by noisy labels in label propagation. (2) SLPA in multi-scale embedding graph learning, which stabilizes label propagation even when the dataset contains a high proportion of noisy labels.  Conclusions  This study proposes an intelligent method for recognizing aerial target intentions in the presence of noisy labels. The method effectively addresses noisy labels by integrating an ICP network, a cross-scale embedding fusion mechanism, and multi-scale embedding graph learning. First, an embedding extraction encoder based on the ICP network is constructed using acquired target attributes.
The cross-scale embedding fusion mechanism then integrates encoder outputs from sequences at different scales, facilitating the extraction of multi-scale features and enhancing the reliability of clean samples identified by the small-loss criterion. Finally, multi-scale embedding graph learning, incorporating SLPA, refines noisy labels by leveraging selected clean labels. Experiments on the ATIR dataset across various noise types and levels demonstrate that the proposed method achieves significantly higher test accuracy and M F1 than other baseline approaches. Ablation studies further validate the effectiveness and robustness of the network architecture and mechanisms.
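The small-loss criterion used for clean-sample selection above can be stated compactly, as in the sketch below; the fixed keep ratio is an assumption for illustration, since the selection schedule is not reproduced here.

```python
# Hedged sketch of small-loss clean-sample selection for cross-teaching.
import torch
import torch.nn.functional as F

def select_clean(logits, labels, keep_ratio=0.8):
    """Return indices of the keep_ratio fraction of samples with the smallest loss."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    k = max(1, int(keep_ratio * len(labels)))
    return per_sample.topk(k, largest=False).indices   # likely clean-label samples

# Usage: a batch of 32 samples over six assumed intention classes.
logits = torch.randn(32, 6)
labels = torch.randint(0, 6, (32,))
clean_idx = select_clean(logits, labels)
```

In cross-teaching, the indices chosen by one scale's encoder would supervise the other, while the remaining high-loss samples are handed to the label-propagation step.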
Distributionally Robust Task Offloading for Low-Altitude Intelligent Networks
JIA Ziye, JIANG Guanwang, CUI Can, ZHANG Lei, WU Qihui
Available online  , doi: 10.11999/JEIT240799
Abstract:
  Objective   The rapid development of Low-Altitude Intelligent Networks (LAINs) and the widespread adoption of Multi-access Edge Computing (MEC) have introduced challenges related to the random variability in task data sizes, which constrains the efficiency of LAIN-assisted MEC networks. Although task offloading has been extensively studied, most existing research overlooks the uncertainty in task sizes. This randomness can lead to unexpected outages and inefficient resource utilization, making it difficult to meet quality-of-service requirements. Distributionally Robust Optimization (DRO) based on uncertainty sets is a promising approach to addressing these challenges. By formulating and solving a DRO problem that accounts for task uncertainties, this study provides a robust and conservative solution applicable to various LAIN-related scenarios.  Methods   This study proposes an LAIN-assisted MEC network comprising multiple hovering Unmanned Aerial Vehicles (UAVs), a High-Altitude Platform (HAP), and Ground Users (GUs). To accurately model task size randomness, three probabilistic distance metrics, the L1 norm, the L∞ norm, and the Fortet-Mourier (FM) distance, are introduced to construct uncertainty sets based on historical data. A DRO problem is then formulated using these uncertainty sets to optimize task offloading decisions within the proposed network. The objective is to minimize system latency under the worst-case probability distribution of task sizes, thereby enhancing system robustness. The proposed DRO problem, structured as a minimization-maximization mixed-integer programming model, is solved iteratively through decomposition into an inner and an outer problem. The inner problem, a linear programming problem, is addressed using standard solvers such as GUROBI. For the outer problem, the low-complexity Branch and Bound (BB) method is employed to solve the integer programming component efficiently by systematically exploring subsets of the solution space and pruning infeasible regions using upper and lower bounds. To handle large-scale and multi-constraint scenarios, a heuristic Binary Whale Optimization Algorithm (BWOA) is further integrated to accelerate convergence. Therefore, the Distributionally Robust Task Offloading Optimization Algorithm (DRTOOA) is developed by combining BB and BWOA. Initially, BB determines a subset of binary variables, followed by BWOA optimization for the remaining variables. This process is repeated iteratively until convergence is achieved.  Results and Discussions   The performance of the proposed DRTOOA is evaluated through numerical simulations. System latency is analyzed under the three probabilistic distance metrics used for constructing uncertainty sets (Fig. 3). As the tolerance of the uncertainty sets increases, system latency rises across all metrics. Notably, the latency achieved via DRTOOA is lower than that obtained using the Exhaustive Method (EM) but higher than that using the BB method, demonstrating its robustness against uncertainties. In terms of computational efficiency, DRTOOA outperforms other benchmark algorithms by achieving the shortest latency, highlighting its effectiveness in solving large-scale problems (Fig. 4). Among the three probabilistic distance metrics, the FM metric yields the lowest system latency with relatively stable performance as the tolerance changes (Fig. 5). Additionally, the impact of uncertainty set tolerance on the probability distribution of task sizes is examined (Fig. 6).
As the tolerance decreases, the task size distribution aligns more closely with the reference distribution; conversely, increasing the tolerance results in a higher probability of larger task sizes. Notably, optimization based on the FM probabilistic distance metric exhibits greater stability under varying tolerances. Furthermore, the impacts of HAP quota limitations and the number of GUs on system latency are analyzed (Figs. 7 and 8). System latency decreases as HAP quotas increase, indicating that additional HAP resources alleviate task processing pressure. Conversely, an increase in the number of GUs leads to higher system latency due to the greater computational demand. Overall, DRTOOA effectively optimizes system latency and demonstrates superior performance compared with other baseline algorithms in terms of robustness and computational efficiency.  Conclusions   This study addresses the task offloading problem in LAIN-assisted MEC networks, considering the uncertainty in task sizes. By constructing uncertainty sets based on different probabilistic distance metrics and formulating a DRO problem, the DRTOOA is proposed, effectively integrating the BB method with the BWOA. Simulation results demonstrate that: (1) Compared with the BB method and the EM, DRTOOA effectively reduces system latency, demonstrating higher efficiency in problem-solving. (2) Among the three probabilistic distance metrics (FM distance, L1 norm distance, and L∞ norm distance), the FM metric is the most stable, yielding the lowest system latency under the same conditions. (3) System latency is influenced by factors such as the tolerance of uncertainty sets, HAP quota limitations, and the number of GUs. However, this study assumes static or quasi-static network nodes for simplification, limiting the consideration of UAV flexibility and dynamicity. Future research should explore the impact of UAV and HAP mobility, as well as real-world factors such as communication interference and equipment failures, on task offloading decisions and overall system performance.
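To make the inner problem concrete, the sketch below computes the worst-case task-size distribution within an L1-ball uncertainty set around a reference (empirical) distribution by linearizing the L1 constraint into a small linear program; SciPy's linprog stands in for the GUROBI solver mentioned above, and the latency values and tolerance are illustrative.

```python
# Hedged sketch of the inner worst-case problem over an L1 uncertainty set.
import numpy as np
from scipy.optimize import linprog

def worst_case_distribution(latency, p_ref, theta):
    """max_p latency @ p  s.t.  ||p - p_ref||_1 <= theta,  p in the simplex."""
    K = len(p_ref)
    c = np.concatenate([-latency, np.zeros(K)])   # linprog minimizes, so negate
    I = np.eye(K)
    # Auxiliary u with |p - p_ref| <= u and sum(u) <= theta:
    A_ub = np.block([[I, -I], [-I, -I], [np.zeros((1, K)), np.ones((1, K))]])
    b_ub = np.concatenate([p_ref, -p_ref, [theta]])
    A_eq = np.concatenate([np.ones((1, K)), np.zeros((1, K))], axis=1)  # sum(p) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (2 * K))
    return res.x[:K]

# Usage: three discretized task sizes with illustrative latencies.
latency = np.array([1.0, 2.0, 4.0])     # latency per task-size bin
p_ref = np.array([0.5, 0.3, 0.2])       # empirical distribution from history
print(worst_case_distribution(latency, p_ref, theta=0.2))   # mass shifts to worst bin
```

With tolerance theta = 0, the worst case collapses to the empirical distribution; growing theta shifts probability mass toward the highest-latency bins, which is the conservatism trade-off observed above.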
Research on Security, Privacy, and Energy Efficiency in Unmanned Aerial Vehicle-Assisted Federated Edge Learning Communication Systems
LU Weidang, FENG Kai, DING Yu, LI Bo, ZHAO Nan
Available online  , doi: 10.11999/JEIT240847
Abstract:
  Objective  Unmanned Aerial Vehicle-Assisted Federated Edge Learning (UAV-Assisted FEL) communication addresses the data isolation problem and mitigates data leakage risks in terminal devices. However, eavesdroppers may exploit model updates in FEL to recover original private data, significantly threatening the system’s privacy and security.  Methods  To address this issue, this study proposes a secure aggregation and resource optimization scheme for UAV-Assisted FEL communication systems. Terminal devices train local models using local data and update parameters, which are transmitted to a UAV serving as the global aggregator. The UAV aggregates these parameters to generate new global model parameters. Eavesdroppers attempt to intercept the transmitted parameters to reconstruct the original data. To enhance secure privacy energy efficiency, the transmission bandwidth, CPU frequency, and transmit power of terminal devices, along with the CPU frequency of the UAV, are jointly optimized. An evolutionary Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to solve this optimization problem. The algorithm intelligently interacts with the system to achieve secure aggregation and resource optimization while meeting latency and energy consumption requirements.  Results and Discussions  The simulation results validate the effectiveness of the proposed scheme. The experiments evaluate the effects of the scheme on key performance metrics, including system cost, secure transmission rate, and secure privacy energy efficiency, from multiple perspectives. As shown in Fig. 2, as the number of terminal devices increases, system cost, secure transmission rate, and secure privacy energy efficiency all increase. These results indicate that the proposed scheme ensures system security and enhances energy efficiency, even in multi-device scenarios. As shown in Fig. 3, under varying global iteration counts, the system balances latency and energy consumption by either extending the duration to lower energy consumption or increasing energy consumption to reduce latency. The secure transmission rate rises with the number of global iterations, as fewer iterations allow the system to tolerate higher energy consumption and latency per iteration, leading to reduced transmission power from terminal devices to meet system constraints. Additionally, secure privacy energy efficiency improves with increasing global iterations, further demonstrating the scheme’s capacity to ensure system security and reduce system cost as global iterations increase. As shown in Fig. 4, during UAV flight, secure privacy energy efficiency fluctuates, with higher secure transmission rates observed when the communication environment between terminal devices and the UAV is more favorable. As shown in Fig. 5, the proposed scheme is compared with two baseline schemes: Scheme 1, which minimizes system latency, and Scheme 2, which minimizes system energy consumption. The proposed scheme significantly outperforms both baselines in cost overhead. Scheme 1 achieves a slightly higher secure transmission rate than the proposed scheme because it minimizes latency at the expense of higher energy consumption. Conversely, Scheme 2 shows a considerably lower secure transmission rate because it prioritizes minimizing energy consumption, resulting in lower transmission power and a compromised secure transmission rate.
The results indicate that the secure privacy energy efficiency of the proposed scheme significantly exceeds that of the baseline schemes, further demonstrating its effectiveness.  Conclusions  To enhance data transmission security and reduce system costs, this paper proposes a secure aggregation and resource optimization scheme for UAV-Assisted FEL. Under constraints of limited computational and communication resources, the scheme jointly optimizes the transmission bandwidth, CPU frequency, and transmission power of terminal devices, along with the CPU frequency of the UAV, to maximize the secure privacy energy efficiency of the UAV-Assisted FEL system. Given the complexity of the time-varying system and the strong coupling of multiple optimization variables, an advanced DDPG algorithm is developed to solve the optimization problem. The problem is first modeled as a Markov Decision Process, followed by the construction of a reward function positively correlated with the secure privacy energy efficiency objective. The proposed DDPG network then intelligently generates joint optimization variables to obtain the optimal solution for secure privacy energy efficiency. Simulation experiments evaluate the effects of the proposed scheme on key system performance metrics from multiple perspectives. The results demonstrate that the proposed scheme significantly outperforms other benchmark schemes in improving secure privacy energy efficiency, thereby validating its effectiveness.
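As one illustration of a reward positively correlated with secure privacy energy efficiency, the sketch below takes a Wyner-style secrecy rate per unit energy and subtracts a penalty when the latency or energy constraint is violated; the secrecy-rate form, the constants, and the penalty are assumptions, not the paper's exact definitions.

```python
# Hedged sketch of a constraint-penalized reward for the DDPG agent.
import numpy as np

def secrecy_rate(bandwidth, snr_legit, snr_eve):
    """Legitimate-link capacity minus eavesdropper capacity, floored at zero."""
    return bandwidth * max(0.0, np.log2(1 + snr_legit) - np.log2(1 + snr_eve))

def reward(bandwidth, snr_legit, snr_eve, energy, latency,
           latency_max=1.0, energy_max=5.0, penalty=10.0):
    eff = secrecy_rate(bandwidth, snr_legit, snr_eve) / energy  # rate per joule
    if latency > latency_max or energy > energy_max:
        eff -= penalty                         # constraint-violation penalty
    return eff

# Usage with normalized bandwidth and illustrative SNRs and costs.
print(reward(bandwidth=1.0, snr_legit=10.0, snr_eve=1.0, energy=2.0, latency=0.5))
```

Keeping the reward proportional to secrecy rate over energy makes the agent trade transmit power against leakage risk, matching the latency/energy balance discussed above.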
An Improved Modulation Recognition Method Based on Hybrid Kolmogorov-Arnold Convolutional Neural Network
ZHENG Qinghe, LIU Fanglin, YU Lisu, JIANG Weiwei, HUANG Chongwen, LI Bin, SHU Feng
Available online  , doi: 10.11999/JEIT250161
Abstract:
  Objective  With the rapid growth of communication devices and the increasing complexity of electromagnetic environments, spectrum efficiency has become a critical performance metric for sixth-generation communication systems. Modulation recognition is an essential function of dynamic spectrum access, aiming to automatically identify the modulation scheme of received signals to enhance spectrum utilization. In practice, wireless signals are often affected by multipath propagation, interference, and noise, which pose challenges for accurate recognition. To address these issues, this study proposes a deep learning-based approach using an end-to-end model that eliminates manual feature extraction, mitigates the limitations of handcrafted features, and improves recognition accuracy. By transferring general knowledge from signal classification to modulation recognition, a well-generalized method based on a hybrid Kolmogorov-Arnold Convolutional Neural Network (KA-CNN) is developed. This approach supports reliable communication in applications such as intelligent transportation, the Internet of Things (IoT), vehicular ad hoc networks, and satellite communication.  Methods  The proposed modulation recognition method first decomposes the signal into a multi-dimensional wavelet domain using a dual-tree complex wavelet packet transform. Different frequency components are then combined to construct a multi-scale signal representation, enabling the neural network to learn consistent features across frequencies. A deep learning structure, KA-CNN, is designed by integrating spline functions with nonlinear activation functions to enhance nonlinear fitting and the continuous learning of periodic features. Spline functions are used to address the curse of dimensionality. To improve adaptability to varying signal parameters and enhance generalization across communication scenarios, multilevel grid training with Lipschitz regularization constraints is applied. In KA-CNN, the hybrid module transfers the characteristics of the spline function into convolution operations, which improves the model's capacity to capture complex mappings between input signals and modulation schemes while retaining the efficiency of the Kolmogorov-Arnold network. This enhances both the expressive power and adaptability of deep learning models under complex communication conditions.  Results and Discussions  In the experimental phase, modulation recognition performance tests, ablation studies, and comparative analyses are conducted on three publicly available datasets (RadioML 2016.10a, RadioML 2018.01a, and CSPB.ML.2023) to evaluate the performance of KA-CNN. Results show that KA-CNN achieves modulation recognition accuracies of 65.14%, 65.56%, and 78.40% on RadioML 2016.10a, RadioML 2018.01a, and CSPB.ML.2023, respectively (Figure 6). The main performance limitation arises in distinguishing QPSK from 8PSK, AM-DSB from WBFM, and high-order QAM modulation types (Figure 7). The maximum difference in KA-CNN's recognition accuracy across signal representations reaches 2.04%, 3.46%, and 4.54% on the three datasets, demonstrating the influence of signal representation (Figure 8). The wavelet packet transform constructs a multi-scale time-frequency representation of signals that is insensitive to the maximum decomposition scale L and supports the complementary learning of different modulation features.
The hybrid Kolmogorov-Arnold convolutional module and the multi-dimensional perceptual cascade attention mechanism play key roles in enhancing modulation recognition accuracy, particularly under relatively high Signal-to-Noise Ratio (SNR) conditions (Figure 9). Additionally, finer grids and higher decomposition orders improve the model's ability to extract discriminative signal features, thereby increasing recognition accuracy (Figure 10). Finally, a comparative evaluation against several deep learning models (GGCNN, Transformer, PR-LSTM, and MobileViT) confirms the superior performance of KA-CNN (Figure 11).  Conclusions  This study proposes a hybrid KA-CNN to address the reduced modulation recognition accuracy caused by noise and parameter variation, as well as the limited generalization across communication scenarios in existing deep learning models. By integrating spline functions with nonlinear activation functions, KA-CNN mitigates the curse of dimensionality and improves its capacity for the continuous learning of periodic features. A dual-tree complex wavelet packet transform is used to construct a multi-scale signal representation, enabling the model to extract consistent features across frequencies. The model is trained using multilevel grids with Lipschitz regularization constraints to enhance adaptability to varying signal parameters and improve generalization. Experimental results on three public datasets demonstrate that KA-CNN improves modulation recognition accuracy and exhibits robust generalization, particularly under low SNRs.
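The hybrid idea of combining a learnable spline with a standard activation can be sketched as follows, echoing the Kolmogorov-Arnold formulation; the piecewise-linear spline, the grid size, and the SiLU base are simplifying assumptions rather than the paper's module.

```python
# Hedged sketch of a KAN-style activation: learnable spline + SiLU base term.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplineActivation(nn.Module):
    def __init__(self, channels, grid_points=8, x_min=-3.0, x_max=3.0):
        super().__init__()
        self.register_buffer("grid", torch.linspace(x_min, x_max, grid_points))
        self.coef = nn.Parameter(torch.zeros(channels, grid_points))  # spline values
        self.w_base = nn.Parameter(torch.ones(channels))

    def forward(self, x):                        # x: (B, C, H, W) feature maps
        g, c = self.grid, self.coef
        step = g[1] - g[0]
        idx = ((x - g[0]) / step).clamp(0, len(g) - 2)
        i0 = idx.floor().long()
        frac = idx - i0.float()
        ch = torch.arange(x.shape[1], device=x.device).view(1, -1, 1, 1)
        spline = c[ch, i0] + frac * (c[ch, i0 + 1] - c[ch, i0])  # linear interpolation
        base = self.w_base.view(1, -1, 1, 1) * F.silu(x)
        return base + spline

# Usage: drop-in replacement for the activation after a convolution layer.
y = SplineActivation(16)(torch.randn(2, 16, 8, 8))
```

Refining the grid (more grid points) corresponds loosely to the multilevel grid training noted above: a finer grid lets the per-channel spline resolve sharper, more periodic response shapes.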
Dual-Memristor Brain-Like Chaotic Neural Network and Its Application in IoMT Data Privacy Protection
LIN Hairong, DUAN Chenxing, DENG Xiaoheng, Geyong Min
Available online  , doi: 10.11999/JEIT241133
Abstract:
  Objective  In recent years, frequent breaches of medical data have posed significant threats to patient privacy and health security, highlighting the urgent need for effective solutions that protect medical data during transmission. To address this challenge, this paper proposes a novel data privacy protection method for the Internet of Medical Things (IoMT) based on a dual-memristor brain-like chaotic neural network.  Methods  Leveraging the synaptic bionic characteristics of memristors, a dual-memristor brain-like chaotic neural network model based on the Hopfield neural network is developed. The complex chaotic dynamics of this model are thoroughly analyzed using nonlinear dynamics tools, including bifurcation diagrams, Lyapunov exponent spectra, phase portraits, time-domain waveforms, and basins of attraction. To validate its practicality and reliability, a hardware platform is built on a Microcontroller Unit (MCU), and hardware experiments confirm the model's complex dynamic behaviors. On this basis, an efficient IoMT data privacy protection method is designed that exploits the complex chaotic properties of the dual-memristor brain-like chaotic neural network, and a comprehensive security analysis of the encryption of color medical image data is performed.  Results and Discussions  The results demonstrate that the proposed network not only exhibits complex grid-like multi-structure chaotic attractors but also allows attractor displacements to be regulated by planar initial conditions, significantly enhancing its potential for cryptographic applications. Experimental findings indicate that this method performs exceptionally well across key metrics, including a large key space, low pixel correlation, high key sensitivity, and strong robustness against noise and data loss attacks.  Conclusions  This study presents an innovative and effective solution for protecting medical data privacy in IoMT environments, providing a solid foundation for the development of secure technologies in intelligent healthcare systems.
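A generic sketch of a chaos-based encryption pipeline is given below for orientation: a small Hopfield-type ODE is integrated to produce a keystream that XOR-diffuses the image bytes. The weight matrix, step size, and keystream extraction are illustrative, and the ODE is a standard three-neuron Hopfield form, not the paper's dual-memristor model.

```python
# Hedged sketch: Hopfield-style chaotic keystream + XOR diffusion (illustrative only).
import numpy as np

W = np.array([[2.0, -1.2, 0.0],
              [1.8, 1.7, 1.15],
              [-4.75, 0.0, 1.1]])               # assumed synaptic weight matrix

def hopfield_keystream(x0, n_bytes, dt=0.01, burn_in=5000):
    x = np.array(x0, dtype=float)
    out = []
    for k in range(burn_in + n_bytes):
        x = x + dt * (-x + W @ np.tanh(x))      # Euler step of the Hopfield ODE
        if k >= burn_in:                        # discard the transient
            out.append(int(abs(x[0]) * 1e6) % 256)
    return np.array(out, dtype=np.uint8)

def xor_cipher(image_bytes, key=(0.1, 0.2, 0.3)):
    ks = hopfield_keystream(key, image_bytes.size)   # key = initial conditions
    return image_bytes ^ ks                     # XOR: decryption repeats the call

# Usage: encrypt and decrypt a random 16x16 single-channel image.
img = np.random.randint(0, 256, (16, 16), dtype=np.uint8)
enc = xor_cipher(img.ravel()).reshape(16, 16)
dec = xor_cipher(enc.ravel()).reshape(16, 16)
assert (dec == img).all()
```

Using the initial conditions as the key is what makes initial-condition sensitivity, such as the planar displacement behavior reported above, directly relevant to key space and key sensitivity.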
3D Reconstruction of Metro Tunnel Based on Path Likelihood Model and HMM Sequence Matching Localization
HU Zhaozheng, WANG Shuheng, MENG Jie, FENG Feng, ZHU Ziwei, LI Weigang
Available online  , doi: 10.11999/JEIT241122
Abstract:
  Objective  As the operational mileage of metro systems in China continues to increase, the inspection and maintenance of metro tunnels have become more critical, and accurate 3D reconstruction of metro tunnels is essential for construction, inspection, and maintenance. However, in severely degraded tunnel environments, existing laser- or vision-based Simultaneous Localization And Mapping (SLAM) algorithms often struggle to construct maps and face limitations in complex scenarios. To address this challenge, this paper proposes a method for large-scale 3D reconstruction of metro tunnels that combines a Path Likelihood Model (PLM) with Hidden Markov Model (HMM) sequence matching. The 3D reconstruction task is divided into two key processes, odometer localization and graph-optimization-based reconstruction, and high-precision 3D reconstruction is achieved by effectively addressing both.  Methods  For odometer-based localization, this paper presents a method that incorporates the PLM. The PLM is built using kernel density estimation over the vehicle's track path, representing the vehicle's positional information as a probability distribution. Within a particle filter framework, this method converts the constructed PLM into position observations of the vehicle, and data from the onboard Inertial Measurement Unit (IMU) and the wheel speed sensor are integrated to enhance localization accuracy. To minimize cumulative errors in odometer-based localization, this paper reformulates loop closure detection as a sequence matching problem solved with the Viterbi algorithm within the HMM framework. This method effectively addresses the instability of single-frame matching in loop closure detection and significantly improves overall performance. For the reconstruction problem, this paper presents a method based on large-scale factor graph optimization. By optimizing the pose graph with multiple constraints, it enables high-precision 3D reconstruction of extensive metro tunnels.  Results and Discussions  The proposed method and model are tested and validated at the WeiJianian-ShuangShuianian and ShaHeyuan-DongZikou metro stations in Chengdu. The experimental results are as follows: the effectiveness of the proposed method is confirmed through two sets of ablation experiments, DR and DR+PATH. Furthermore, comparison with two notable open-source LiDAR algorithms, LIO-SAM and Faster-LIO, demonstrates the superiority of this method. The reconstruction accuracy achieved is high, and the reconstruction error remains consistent even as the running distance increases. Therefore, the method is suitable for application in real operational processes.  Conclusions  This paper addresses the challenges of 3D reconstruction in metro tunnels by proposing a novel algorithm that combines the PLM with HMM sequence matching. The PLM is developed using drawing information, which serves as the foundation for the reconstruction process. Within the particle filtering framework, the likelihood model is used to correct errors from the IMU and wheel speed sensor, yielding accurate odometer readings for the onboard robot. Furthermore, the loop matching problem is reformulated as an HMM sequence matching problem, and constructing loop constraints effectively eliminates accumulated positioning errors.
Finally, the pose and loop constraints derived from the odometry are integrated into a large-scale factor graph optimization model, enabling high-precision 3D reconstruction of the metro tunnel. Field tests conducted at the WeiJianian-ShuangShuianian and ShaHeyuan-DongZikou metro stations in Chengdu, together with comparisons against other algorithms, demonstrate that the proposed PLM and HMM sequence matching algorithm significantly improves 3D reconstruction accuracy in metro tunnels, particularly in severely degraded environments.
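To make the sequence-matching step concrete, the following minimal Python sketch shows how a Viterbi pass selects the most likely sequence of map places for a run of query frames, in the spirit of the HMM-based loop-closure matching described above; the similarity scores, transition model, and all names are illustrative assumptions rather than the authors' implementation.

```python
# Minimal Viterbi sequence-matching sketch for loop-closure detection,
# assuming per-frame place-similarity scores are already available.
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """log_init: (S,), log_trans: (S, S), log_emit: (T, S).
    Returns the most likely hidden place sequence for T query frames."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]           # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)       # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # (S, S): previous -> current
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 4 candidate map places, 3 query frames.
rng = np.random.default_rng(0)
sim = rng.random((3, 4))                     # frame-to-place similarity (placeholder)
log_emit = np.log(sim / sim.sum(axis=1, keepdims=True))
log_trans = np.log(np.full((4, 4), 0.1) + 0.6 * np.eye(4))  # favor staying in place
log_init = np.log(np.full(4, 0.25))
print(viterbi(log_init, log_trans, log_emit))
```

Matching a whole sequence this way, rather than accepting the best single-frame match, is what suppresses the single-frame instability the abstract mentions.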
Research on Weld Defect Detection Method Based on Improved DETR
DAI Zheng, LIU Xiaojia, PAN Quan
Available online  , doi: 10.11999/JEIT241009
Abstract:
  Objective  Welding technology plays a pivotal role in industrial manufacturing, where X-ray image evaluation serves as a critical inspection method for assessing the internal quality of weld seams. X-ray inspection is effective in identifying defects such as slag inclusions, incomplete penetration, and porosity, which helps prevent structural failures and ensures the reliability and durability of welded components. This process is a fundamental quality control measure in industrial manufacturing. However, challenges persist in the assessment of weld seam X-ray images, particularly in relation to high workloads and inefficiencies. Conventional models often experience multi-scale feature information loss during feature extraction due to the significant variation in the size and morphology of defects, such as porosity, slag inclusions, and incomplete penetration, found in large structural weld seams. To address these limitations, the Detection Transformer with Concatenated Expand Convolutions and Augmented Feature Pyramid Networks (CADETR) model is proposed to improve detection performance for weld defects in large structural components.  Methods  The CADETR model is proposed for detecting weld defects in large structural components. The model comprises three core components: the DETR network, concatenated expand convolution (CEC) network, and Augmented Feature Pyramid Network (AFPN). The DETR network applies multi-head self-attention mechanisms to effectively capture global contextual relationships among feature map positions, enhancing perceptual capability and detection accuracy for weld defects. The CEC module adopts a composite expanded convolution structure, widening convolutional kernel receptive fields and significantly improving feature extraction for defects across various scales. The AFPN module reinforces multi-scale defect feature extraction by integrating hierarchical feature maps and employing a feature batch elimination mechanism, reducing overfitting and enhancing generalization performance in multi-scale defect detection. Additionally, a Penalized Cross Entropy Loss (PCE-Loss) function is proposed, which applies increased penalties to incorrect defect predictions, further improving model robustness and precision.  Results and Discussions  The performance of the CADETR defect detection model is evaluated through a comparative analysis with multiple models, including Faster RCNN, ECASNet, GeRCNN, DETR, MDCBNet, HPRT-DETR, and YOLOv1. Weld seam X-ray image data are input into each model, with variations in loss values recorded during the training process. Model performance in defect detection is assessed using Precision, Recall, and mAP metrics. Experimental results show that the CADETR model exhibits slightly higher loss values compared to HPRT-DETR and YOLOv1 but lower values than other benchmark models (Fig. 7). The CADETR model demonstrates superior performance in mAP, achieving 91.6%, exceeding all comparative models (Table 3). The CADETR model proves particularly effective in detecting defects characterized by a high proportion of small targets and significant shape variations (Fig. 8).  Conclusions  This study addresses the challenges of detecting weld defects with significant variations in size and morphology in large structural components through the CADETR weld defect detection model. 
Evaluation using a welded seam X-ray image dataset revealed the following key findings: (1) The sequential integration of the CEC module, AFPN module, and PCE-Loss function into the baseline DETR framework improved mAP by 4.6%, 4.5%, and 3.4%, respectively, validating the contribution of each component. (2) The CADETR model achieved a 91.6% mAP for weld defect detection, with a single-image inference time of 0.036 s. (3) Compared to the original DETR, CADETR demonstrated a 9.9% improvement in mAP. For future implementation, the CADETR model will be deployed in a Browser/Server (B/S) architecture-based weld defect detection system, where both software algorithms and computational hardware resources will be hosted on cloud servers. This design ensures stable operational workflows and facilitates cross-platform data resource sharing.
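The abstract describes PCE-Loss only as applying increased penalties to incorrect defect predictions; the sketch below shows one plausible reading of that idea, with the penalty form and value chosen purely for illustration rather than taken from the paper.

```python
# Hedged sketch of a penalized cross-entropy: misclassified samples receive
# an extra penalty weight. The penalty form is an assumption for illustration.
import torch
import torch.nn.functional as F

def penalized_cross_entropy(logits, targets, penalty=2.0):
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample CE
    wrong = (logits.argmax(dim=1) != targets).float()        # 1 where prediction is wrong
    weights = 1.0 + (penalty - 1.0) * wrong                  # upweight incorrect predictions
    return (weights * ce).mean()

logits = torch.tensor([[2.0, 0.1, 0.3], [0.2, 0.1, 1.5]])
targets = torch.tensor([0, 0])   # second sample is misclassified
print(penalized_cross_entropy(logits, targets).item())
```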
A Medical Video Segmentation Algorithm Integrating Neighborhood Attention and State Space Model
DING Jianrui, ZHANG Ting, LIU Jiadong, NING Chunping
Available online  , doi: 10.11999/JEIT240755
Abstract:
  Objective  Accurate segmentation of lesions in medical videos is crucial for clinical diagnosis and treatment. Unlike static medical images, videos provide continuous temporal information, enabling tracking of lesion evolution and morphological changes. However, existing segmentation methods primarily focus on processing individual frames, failing to effectively capture temporal correlations across frames. While self-attention mechanisms have been used to model long-range dependencies, their quadratic computational complexity renders them inefficient for high-resolution video segmentation. Additionally, medical videos are often affected by motion blur, noise, and illumination variations, which further hinder segmentation accuracy. To address these challenges, this paper proposes a novel medical video segmentation algorithm that integrates neighborhood attention and a State Space Model (SSM). The approach aims to efficiently capture both local and global spatiotemporal features, improving segmentation accuracy while maintaining computational efficiency.  Methods  The proposed approach comprises two key stages: local feature extraction and global temporal modeling, designed to efficiently capture both spatial and temporal dependencies in medical video segmentation. In the first stage, a deep convolutional network is used to extract spatial features from each video frame, providing a detailed representation of anatomical structures. However, relying solely on spatial features is insufficient for medical video segmentation, as lesions often undergo subtle morphological changes over time. To address this, a neighborhood attention mechanism is introduced to capture short-term dependencies between adjacent frames. Unlike conventional self-attention mechanisms, which compute relationships across the entire frame, neighborhood attention selectively attends to local regions around each pixel, reducing computational complexity while preserving essential temporal coherence. This localized attention mechanism enables the model to focus on small but critical changes in lesion appearance, making it more robust to motion and deformation variations. In the second stage, an SSM module is integrated to capture long-range dependencies across the video sequence. Unlike Transformer-based approaches, which suffer from quadratic complexity due to the self-attention mechanism, the SSM operates with linear complexity, significantly improving computational efficiency while maintaining strong temporal modeling capabilities. To further enhance the processing of video-based medical data, a 2D selective scanning mechanism is introduced to extend the SSM from 1D to 2D. This mechanism enables the model to extract spatiotemporal relationships more effectively by scanning input data along multiple directions and merging the results, ensuring that both local and global temporal structures are well represented. The combination of neighborhood attention for local refinement and SSM-based modeling for long-range dependencies enables the proposed method to achieve a balance between segmentation accuracy and computational efficiency. The model is trained and evaluated on multiple medical video datasets to verify its effectiveness across different segmentation scenarios, demonstrating its capability to handle complex lesion appearances, background noise, and variations in imaging conditions.  
Results and Discussions  The proposed method is evaluated on three widely used medical video datasets: thyroid ultrasound, CVC-ClinicDB, and CVC-ColonDB. The model achieves Intersection Over Union (IOU) scores of 72.7%, 82.3%, and 72.5%, respectively, outperforming existing state-of-the-art methods. Compared to the Vivim model, the proposed method improves IOU by 5.7%, 1.7%, and 5.5%, highlighting the advantage of leveraging temporal information. In terms of computational efficiency, the model achieves 23.97 frames per second (fps) on the thyroid ultrasound dataset, making it suitable for real-time clinical applications. A comparative analysis against several state-of-the-art methods, including UNet, TransUNet, PraNet, U-Mamba, LKM-UNET, RMFG, SALI, and Vivim, demonstrates that the proposed method consistently outperforms these approaches, particularly in complex scenarios with significant background noise, occlusions, and motion artifacts. Specifically, on the CVC-ClinicDB dataset, the proposed model achieves an IOU of 82.3%, exceeding the previous best approach (80.9%). On the CVC-ColonDB dataset, which presents additional challenges due to lighting variations and occlusions, the model attains an IOU of 72.5%, outperforming the previous best method (70.8%). These results highlight the importance of incorporating both local and global temporal information to enhance segmentation accuracy and robustness in medical video analysis.  Conclusions  This study proposes a medical video segmentation algorithm that integrates neighborhood attention and an SSM to capture both local and global spatiotemporal features. This integration enables an effective balance between segmentation accuracy and computational efficiency. Experimental results demonstrate the superiority of the proposed method over existing approaches across multiple medical video datasets. The main contributions include: the combined use of neighborhood attention and SSM for efficient spatiotemporal feature extraction; a 2D selective scanning mechanism that extends SSMs for video-based medical segmentation; improved segmentation performance exceeding that of state-of-the-art models while maintaining real-time processing capability; and enhanced robustness to background noise and lighting variations, improving reliability in clinical applications. Future work will focus on incorporating prior knowledge and anatomical constraints to refine segmentation accuracy in cases with ambiguous lesion boundaries; developing advanced boundary refinement strategies for challenging scenarios; extending the framework to multi-modal imaging data such as CT and MRI videos; and optimizing the model for deployment on edge devices to support real-time processing in point-of-care and mobile healthcare settings.
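The 2D selective-scanning idea, unfolding a feature map along several directions, running a 1D sequence model on each path, and merging the results, can be sketched as follows; the toy recurrence below is a stand-in for the actual SSM, and all shapes are illustrative.

```python
# Sketch of 2D selective scanning: four directional unfoldings of a feature
# map, a 1D recurrence on each, and an averaged merge.
import numpy as np

def scan_1d(seq, decay=0.9):
    """Toy linear recurrence h_t = decay * h_{t-1} + x_t, standing in for an SSM."""
    h, out = np.zeros(seq.shape[1]), []
    for x in seq:
        h = decay * h + x
        out.append(h.copy())
    return np.stack(out)

def selective_scan_2d(fmap):
    """fmap: (H, W, C). Scan row-major, reversed, column-major, reversed; merge."""
    H, W, C = fmap.shape
    paths = [
        fmap.reshape(H * W, C),                       # left-to-right, top-to-bottom
        fmap.reshape(H * W, C)[::-1],                 # same path, reversed
        fmap.transpose(1, 0, 2).reshape(H * W, C),    # top-to-bottom, left-to-right
        fmap.transpose(1, 0, 2).reshape(H * W, C)[::-1],
    ]
    outs = [scan_1d(p) for p in paths]
    outs[1] = outs[1][::-1]                           # undo reversals before merging
    outs[3] = outs[3][::-1]
    outs[2] = outs[2].reshape(W, H, C).transpose(1, 0, 2).reshape(H * W, C)
    outs[3] = outs[3].reshape(W, H, C).transpose(1, 0, 2).reshape(H * W, C)
    return sum(outs).reshape(H, W, C) / 4.0           # average the four directions

print(selective_scan_2d(np.ones((4, 5, 3))).shape)    # (4, 5, 3)
```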
Wire Length Driven Tension Refine Based Macro Placer
ZHU Yanzhen, YAN Haopeng, CAI Shuting, GAO Peng
Available online  , doi: 10.11999/JEIT241079
Abstract:
  Objective  With the introduction of reuse methodologies in integrated circuit design, the utilization of macro cells in Very Large Scale Integration (VLSI) has significantly increased. However, the considerable size difference between macro cells and standard cells presents a significant challenge for circuit placers. This study proposes a novel macro placer, WIMPlace, based on tension fine-tuning and wirelength-driven approaches. The aim is to address issues such as density imbalance and degradation of solution quality observed in existing mixed-size placers, thereby providing a more effective solution for VLSI design.  Methods  The proposed method consists of four stages: preprocessing, pre-placement, macro cell fine-tuning, and macro legalization. Initially, a weight-based partitioning approach is employed to group standard cells with macros into supersets, addressing density issues during the initial placement (Section 3.1). In the pre-placement stage, the DREAMPlace 2.0 tool is used for placing standard cells, and the initial positions of macro cells are determined based on the locations of these clusters (Section 3.2). A local tension model, inspired by the principle of surface tension in liquids, is then adopted to fine-tune the positions of macros, ensuring that connections between standard cells and macros are as compact as possible (Section 3.3, Fig. 2). Finally, a constraint graph-based macro legalization strategy is applied to prevent overlaps between macros (Section 3.4, Fig. 3).  Results and Discussions  Experimental results demonstrate that WIMPlace achieves exceptional performance on the MMS benchmark, outperforming other advanced mixed-size placers, such as ePlace-MS and DREAMPlace 4.0. Specifically, in 15 out of 16 cases, it achieves the shortest wirelength, with average reductions of 4.31% and 2.39%, respectively (Section 4, Table 2). Additionally, WIMPlace exhibits excellent solution stability, showing a linear increase in runtime as the number of cells increases (Section 4, Fig. 5), indicating that the algorithm not only optimizes wirelength effectively but also demonstrates high computational efficiency. Notably, in the newblue3 case, despite the macro cells occupying a significant portion of the chip area, WIMPlace still demonstrates strong adaptability.  Conclusions  In summary, WIMPlace, as proposed in this paper, is an efficient macro cell placer that achieves gradual fine-tuning optimization of macro cells by combining gradient field movements based on a surface tension analogy and employing preprocessing techniques to balance macros with their associated standard cells. Compared to existing mixed-size placers, WIMPlace demonstrates superior performance across multiple key metrics, particularly in wirelength optimization. Future work could focus on integrating additional design objectives, such as timing, congestion, and thermal management, to enhance the applicability and flexibility of WIMPlace. This study provides new perspectives and technical approaches for VLSI design.
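A minimal sketch of a surface-tension-style fine-tuning step in the spirit of the local tension model described above: each macro moves along the resultant of net-weighted unit pulls toward its connected standard cells. The force form and step size are assumptions for illustration, not WIMPlace's actual formulation.

```python
# Illustrative tension step: a macro is pulled toward connected standard cells.
import numpy as np

def tension_step(macro_pos, cell_pos, weights, step=0.1):
    """macro_pos: (2,); cell_pos: (N, 2); weights: (N,) net weights."""
    pulls = cell_pos - macro_pos                  # vectors toward connected cells
    dists = np.linalg.norm(pulls, axis=1, keepdims=True) + 1e-9
    force = (weights[:, None] * pulls / dists).sum(axis=0)  # net-weighted unit pulls
    return macro_pos + step * force               # move along the resultant tension

macro = np.array([10.0, 10.0])
cells = np.array([[12.0, 10.0], [10.0, 14.0], [6.0, 10.0]])
w = np.array([2.0, 1.0, 1.0])
for _ in range(5):
    macro = tension_step(macro, cells, w)
print(macro)   # macro drifts toward the heavily connected cells
```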
Reconfigurable Intelligent Surface-empowered Covert Communication Strategies for D2D Systems
LV Lu, ZHENG Pengwei, YANG Long, CHEN Jian
Available online  , doi: 10.11999/JEIT250045
Abstract:
  Objective   The rising demand for secure communication in sensitive data transmission scenarios has increased interest in covert communication research. Existing Device-to-Device (D2D) covert communication solutions typically employ additional uncertainty mechanisms, such as artificial noise, leading to elevated energy consumption and implementation complexity. This study addresses these issues by investigating a novel covert communication strategy enabled by Reconfigurable Intelligent Surfaces (RIS). The strategy exploits RIS to enhance wireless propagation for legitimate users and simultaneously introduces controlled phase-shift uncertainty to impair eavesdropping effectiveness. The primary objective is to maximize the covert communication rate among D2D users while maintaining a low probability of detection and guaranteeing the Quality of Service (QoS) requirements for cellular users.  Methods   The proposed framework consists of an RIS-assisted D2D communication network comprising one cellular user, one pair of D2D users, and an eavesdropper aiming to detect ongoing communications. A comprehensive optimization problem is established to jointly optimize the transmit powers of both the cellular and D2D transmitters, as well as the phase shifts of the RIS, to maximize the covert communication rate for D2D users. Given the non-convex nature and highly interdependent variables within the optimization problem, an alternating optimization algorithm utilizing Gaussian randomization is developed. This algorithm iteratively determines the optimal transmission powers and RIS phase-shift configurations, adhering strictly to constraints on power consumption, RIS characteristics, and covert communication detection probabilities. Additionally, Successive Interference Cancellation (SIC) is integrated at the D2D receiver to effectively mitigate interference from cellular communications, facilitating accurate decoding of covert signals.  Results and Discussions   Simulation results confirm the efficacy of the proposed RIS-enabled covert communication strategy, showing significant performance enhancements over traditional methods. The inclusion of RIS notably improves the covert communication rate for D2D transmissions. For instance, increasing the number of RIS reflective elements further enhances system performance by introducing greater uncertainty in the received signals at the eavesdropper, thus complicating detection efforts (Fig. 8). Furthermore, it is observed that the cellular user’s transmit power inherently acts as an effective shield, increasing confusion for eavesdropping attempts and thus reducing detection accuracy. Convergence of the proposed optimization algorithm is validated through iterative simulation experiments, demonstrating stable and reliable performance across varied conditions and constraints (Fig. 4). Additionally, Monte Carlo simulations verify the accuracy of the analytical expressions derived for the minimum average detection error probability achievable by the eavesdropper, highlighting the critical role of RIS in generating sufficient energy uncertainty to ensure covert communication effectiveness (Fig. 5, Fig. 6). Comparative analyses further illustrate the superior performance of the proposed RIS-based approach relative to conventional artificial noise techniques, particularly in scenarios demanding high covert communication rates. 
Moreover, the integration of RIS and SIC methods demonstrates notable benefits; SIC efficiently reduces interference from cellular signals, maintaining the cellular user’s QoS without compromising the integrity of covert signals decoded at the D2D receiver.  Conclusions  This study proposes an advanced RIS-empowered covert communication strategy tailored specifically for D2D networks. The approach successfully leverages RIS-induced phase-shift uncertainty and capitalizes on cellular transmissions as natural interference sources, significantly enhancing covert communication capabilities. Through joint optimization of transmission power allocation and RIS configurations, the proposed method effectively maximizes the covert communication rate while satisfying QoS constraints for cellular users. These promising results establish a solid foundation for future exploration into active RIS-assisted communication schemes and the development of sophisticated optimization strategies aimed at further improving covert communication effectiveness.
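The Gaussian randomization step mentioned in the Methods can be sketched as follows for a generic relaxed solution; the quadratic objective below is a placeholder for the paper's covert-rate expression, and all names are illustrative.

```python
# Hedged sketch of Gaussian randomization: recover a unit-modulus phase
# vector from a relaxed (PSD) covariance solution V of an SDR problem.
import numpy as np

def gaussian_randomization(V, objective, trials=200, seed=0):
    """V: (N, N) PSD relaxed solution; returns the best unit-modulus v found."""
    rng = np.random.default_rng(seed)
    w, U = np.linalg.eigh(V)
    L = U * np.sqrt(np.clip(w, 0, None))            # factor so that V = L @ L^H
    best_v, best_val = None, -np.inf
    n = V.shape[0]
    for _ in range(trials):
        r = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        v = np.exp(1j * np.angle(L @ r))            # project sample onto unit modulus
        val = objective(v)
        if val > best_val:
            best_v, best_val = v, val
    return best_v, best_val

N = 8
A = np.eye(N) + 0.1 * np.ones((N, N))               # toy quadratic objective v^H A v
v, val = gaussian_randomization(np.eye(N), lambda v: np.real(v.conj() @ A @ v))
print(val)
```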
One-sided Personalized Differential Privacy Random Response Algorithm Driven by User Sensitive Weights
LIU Zhenhua, WANG Wenxin, DONG Xinfeng, WANG Baocang
Available online  , doi: 10.11999/JEIT250099
Abstract:
  Objective  One-sided differential privacy has received increasing attention in privacy protection due to its ability to shield sensitive information. This mechanism ensures that adversaries cannot substantially reduce uncertainty regarding record sensitivity, thereby enhancing privacy. However, its use in practical datasets remains constrained. Specifically, the random response algorithm under one-sided differential privacy performs effectively only when the proportion of sensitive records is low, but yields limited results in datasets with high sensitivity ratios. Examples include medical records, financial transactions, and personal data in social networks, where sensitivity levels are inherently high. Existing algorithms often fail to meet privacy protection requirements in such contexts. This study proposes an extension of the one-sided differential privacy random response algorithm by introducing user-sensitive weights. The method enables efficient processing of highly sensitive datasets while substantially improving data utility and maintaining privacy guarantees, supporting secure analysis and application of high-sensitivity data.  Methods  This study proposes a one-sided personalized differential privacy random response algorithm comprising three key stages: sensitivity specification, personalized sampling, and fixed-value noise addition. In the sensitivity specification stage, user data are mapped to sensitivity weight values using a predefined sensitivity function. This function reflects both the relative importance of each record to the user and its quantified sensitivity level. The resulting sensitivity weights are then normalized to compute a comprehensive sensitivity weight for each user. In the personalized sampling stage, the data sampling probability is adjusted dynamically according to the user’s comprehensive sensitivity weight. Unlike uniform-probability sampling employed in conventional methods, this personalized approach reduces sampling bias and improves data representativeness, thereby enhancing utility. In the fixed-value noise addition stage, the noise amount is determined in proportion to the comprehensive sensitivity weight. In high-sensitivity scenarios, a larger noise value is added to reinforce privacy protection; in low-sensitivity scenarios, the noise is reduced to preserve data availability. This adaptive mechanism allows the algorithm to balance privacy protection with utility across different application contexts.  Results and Discussions  The primary innovations of this study are reflected in three areas. First, a one-sided personalized differential privacy random response algorithm is proposed, incorporating a sensitivity specification function to allocate personalized sensitivity weights to user data. This design captures user-specific sensitivity requirements across data attributes and improves system efficiency by minimizing user interaction. Second, a personalized sampling method based on comprehensive sensitivity weights is developed to support fine-grained privacy protection. Compared with conventional approaches, this method dynamically adjusts sampling strategies in response to user-specific privacy preferences, thereby increasing data representativeness while maintaining privacy. Third, the algorithm’s sensitivity shielding property is established through theoretical analysis, and its effectiveness is validated via simulation experiments. 
The results show that the proposed algorithm outperforms the traditional one-sided differential privacy random response algorithm in both data utility and robustness. In high-sensitivity scenarios, improvements in query accuracy and robustness are particularly evident. When the data follow a Laplace distribution, for the sum function, the Root Mean Square Error (RMSE) produced by the proposed algorithm is approximately 77.28% of that generated by the traditional algorithm, with the threshold upper bound set to 0.6 (Fig. 4(c)). When the data follow a normal distribution, in the coefficient of variation function, the RMSE produced by the proposed algorithm remains below 200 regardless of whether the upper bound of the threshold t is 0.7, 0.8, or 0.9, while the RMSE of the traditional algorithm consistently exceeds 200 (Fig. 5(g,h,i)). On real-world datasets, the proposed algorithm achieves higher data utility across all three evaluated functions compared with the traditional approach (Fig. 6).  Conclusions  The proposed one-sided personalized differential privacy random response algorithm achieves effective performance under an equivalent level of privacy protection. It is applicable not only in datasets with a low proportion of sensitive records but also in those with high sensitivity, such as healthcare and financial transaction data. By integrating sensitivity specification, personalized sampling, and fixed-value noise addition, the algorithm balances privacy protection with data utility in complex scenarios. This approach offers reliable technical support for the secure analysis and application of highly sensitive data. Future work may investigate the extension of this algorithm to scenarios involving correlated data in relational databases.
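A heavily simplified sketch of the three-stage pipeline described above (sensitivity specification, personalized sampling, fixed-value noise addition); every function and constant here is an illustrative assumption, not the paper's mechanism.

```python
# Toy pipeline: per-attribute sensitivities -> comprehensive weight ->
# weight-dependent sampling -> weight-proportional fixed-value noise.
import numpy as np

rng = np.random.default_rng(1)

def comprehensive_weight(attr_sensitivities):
    """Normalize per-attribute sensitivity scores into one weight in [0, 1]."""
    return float(np.asarray(attr_sensitivities, dtype=float).mean())

def respond(value, weight, eps=1.0, noise_scale=1.0):
    """Report a record with weight-dependent sampling and fixed-value noise."""
    # Personalized sampling: highly sensitive records are reported less often.
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0) * (1.0 - weight) + 0.5 * weight
    if rng.random() > p_keep:
        return None                                  # record suppressed this round
    # Fixed-value noise whose magnitude grows with the sensitivity weight.
    return value + noise_scale * weight * rng.choice([-1.0, 1.0])

w = comprehensive_weight([0.2, 0.9, 0.5])            # per-attribute sensitivities
print(w, respond(42.0, w))
```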
SHI Huaifeng, ZHOU Long, PAN Chengsheng, CAO Kangning, LIU Chaofan, LV Miao
Available online  , doi: 10.11999/JEIT241132
Rank-Two Beamforming Algorithm Based on Alternating Optimization Assisted by Intelligent Reflecting Surface
ZHOU Kai, YU Lan, GUO Qiang
Available online  , doi: 10.11999/JEIT241107
Abstract:
  Objective  To address the limitations of current optimization methods for Intelligent Reflecting Surface (IRS)-aided communication systems, such as high computational complexity, the lack of closed-form solutions, and real-time transmission constraints, this study proposes an efficient joint active-passive beamforming algorithm to improve spectral efficiency and real-time performance. As the number of users increases, conventional rank-1 beamforming lacks sufficient design flexibility, highlighting the need for advanced approaches to avoid performance bottlenecks. This challenge is central to the practical deployment of large-scale Multiple-Input Single-Output (MISO) systems.  Methods  A hierarchical optimization framework is proposed to resolve the non-convex design problem in IRS-assisted MISO systems. A joint beamforming model is developed for downlink multi-user scenarios, incorporating Alamouti Space-Time Block Coding (STBC) and rank-2 beamforming to maximize the Weighted Sum Rate (WSR) under total power and IRS unit modulus constraints. The framework jointly optimizes the transmit and reflection matrices to improve spectral efficiency. To address the non-convexity of the formulation, an alternating optimization strategy is adopted. At the base station, a Weighted Minimum Mean-Square Error (WMMSE) algorithm is applied to refine the rank-2 beamforming design and ensure efficient power allocation. For IRS phase shift optimization, an improved Riemannian Gradient Algorithm (RGA) is proposed. This algorithm integrates restart mechanisms and dynamic scaling vector transmission to accelerate convergence by avoiding local optima. Step size sensitivity is reduced using relaxed Wolfe conditions, which improves computational efficiency without loss of global optimality.  Results and Discussions  The improved Riemannian gradient optimization algorithm achieves faster convergence and markedly higher WSR performance than conventional algorithms, attributed to the incorporation of restart strategies and dynamic scaling vector transmission mechanisms (Fig. 3). The proposed rank-2 beamforming scheme yields substantially better system performance than traditional rank-1 techniques (Fig. 3). Simulations further evaluate the effect of varying the number of IRS reflection elements. Across different configurations, the proposed algorithm consistently enhances WSR and outperforms benchmark algorithms (Fig. 4). In addition, it maintains robust performance under varying base station transmit power levels and antenna counts, with rank-2 beamforming preserving clear advantages over rank-1 designs (Fig. 5, Fig. 6). Finally, simulation results identify optimal IRS deployment positions. System performance peaks when the IRS is placed near the base station or users, whereas intermediate placement leads to performance degradation, highlighting the critical role of deployment strategy in practical applications (Fig. 7).  Conclusions  This study addresses the problem of spectral efficiency maximization in IRS-aided communication systems by proposing a joint rank-2 beamforming and alternating optimization framework. For transmit-side optimization, the WMMSE algorithm is applied to enable efficient power allocation in the rank-2 beamforming design. In parallel, an improved RGA is developed for optimizing the IRS phase shift matrix. This algorithm incorporates adaptive initial step selection based on relaxed Wolfe conditions and integrates restart strategies to avoid local optima. 
Simulation results confirm that the proposed framework achieves faster convergence and higher user sum rate performance compared to conventional algorithms. Moreover, rank-2 beamforming consistently provides superior system efficiency relative to traditional rank-1 methods across a range of scenarios.
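One Riemannian gradient step on the unit-modulus manifold, of the kind the improved RGA builds on, can be sketched as follows; the toy objective and fixed step size are assumptions, and the paper's restart and relaxed-Wolfe refinements are omitted.

```python
# Sketch of Riemannian gradient ascent on the complex-circle manifold
# (|theta_i| = 1), as used for IRS phase-shift optimization. The objective
# theta^H A theta is a placeholder, not the paper's WSR expression.
import numpy as np

def rga_step(theta, egrad, step=0.1):
    """theta: (N,) unit-modulus phases; egrad: Euclidean gradient at theta."""
    rgrad = egrad - np.real(egrad * theta.conj()) * theta  # project to tangent space
    theta_new = theta + step * rgrad
    return theta_new / np.abs(theta_new)                   # retract back onto |theta_i| = 1

N = 16
A = np.eye(N) + 0.05 * np.ones((N, N), dtype=complex)     # toy Hermitian objective matrix
rng = np.random.default_rng(2)
theta = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
for _ in range(50):
    theta = rga_step(theta, 2 * A @ theta)                 # gradient of theta^H A theta
print(np.real(theta.conj() @ A @ theta))                   # objective value after ascent
```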
A Multiparameter Spoofing Detection Method Based on Parallel CNN-Transformer Neural Network with Gating Mechanism
ZHUANG Xuebin, NIU Ben, LIN Zijian, ZHANG Linjie
Available online  , doi: 10.11999/JEIT240977
Abstract:
  Objective  Global Navigation Satellite Systems (GNSS) provide location, velocity, and timing services globally and are widely used. However, their signals are highly susceptible to interference from natural environments or human factors, and existing single-parameter and multi-parameter detection methods have limitations. In an increasingly complex electromagnetic environment, satellite navigation systems face a growing risk of deception and interference. Therefore, it is essential to refine deception interference detection techniques to enhance the generality and adaptability of detection algorithms. This study proposes a multi-parameter deception interference detection algorithm that addresses the limitations of existing methods, ensures the secure and reliable operation of GNSS receivers, and contributes to the safety and stability of satellite navigation systems.  Methods  Key information is first extracted from the receiver tracking stage, and five observation metrics are selected: code rate, discriminator result, Doppler shift, carrier-to-noise ratio, and the SQM index. Because the raw values fluctuate considerably, sliding-window processing based on Moving Variance (MV) and Moving Mean (MA) is applied to obtain nine feature parameters, forming a multidimensional time-series sample. This approach better captures signal feature trends, reduces the effect of data fluctuations, and provides a stable and reliable data foundation for subsequent detection. A Parallel CNN-Transformer Neural Network (PCTN) based on a gating mechanism is then constructed, consisting of three convolutional neural network modules, eight Transformer encoder modules, and one gating module. The gating mechanism learns the weights of the two branches, fuses their outputs, and detects deception interference signals. The model is evaluated using the TEXBAT dataset and an actual dataset, and its performance is compared with that of five existing algorithms.  Results and Discussions  The PCTN algorithm performs well on the TEXBAT dataset. As shown in Fig. 6, its classification accuracy for real signals reaches 99.222%, exceeding that of the five comparison algorithms. The ROC curve (Fig. 8) and evaluation metrics (Table 3) indicate that the PCTN algorithm achieves the highest AUC value and outperforms others in accuracy, precision, recall, and F1 score, demonstrating stable classification performance across various deception scenarios and effectively distinguishing deception signals from real signals. A deception interference collection platform collects actual data, and after fine-tuning, the model is tested. The PCTN algorithm maintains significant advantages, achieving the highest AUC value in the ROC curve (Fig. 10). As shown in Table 4, its detection accuracy remains above 94.5%, exceeding other algorithms. Compared with its performance on the TEXBAT dataset, the PCTN algorithm exhibits a decrease of only 5% on the actual dataset, significantly smaller than the degradation observed for the other algorithms. This demonstrates its robustness, strong generalization capability, and effectiveness in detecting deception interference in new scenarios.  Conclusions  This study proposes a multi-parameter deception interference detection algorithm based on Deep Learning (DL). The method extracts multiple parameter features from the receiver tracking stage, forms multidimensional time series samples, and employs the PCTN model for detection. Experimental results demonstrate that, compared with five existing algorithms, the proposed method offers significant advantages. 
On the TEXBAT dataset, it achieves high accuracy across various deception scenarios. On the actual dataset, it exhibits better generalization performance and effectively differentiates deceptive signals from real signals, even with new datasets. Future research can focus on deploying the algorithm on hardware platforms to enable real-time and accurate deception interference detection in practical satellite navigation scenarios. This will further enhance the security of satellite navigation systems and support the reliable application of satellite navigation technology in complex electromagnetic environments.
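The sliding-window feature construction described in the Methods, a moving mean (MA) and moving variance (MV) over each raw tracking-loop metric, can be sketched as follows; the window length and metric names are illustrative, and the exact nine-parameter combination used in the paper is not reproduced here.

```python
# Sketch of MA/MV sliding-window features over tracking-loop metrics.
import numpy as np

def moving_stats(x, win=20):
    """Return (MA, MV) of 1-D series x over a trailing window of length win."""
    ma = np.convolve(x, np.ones(win) / win, mode="valid")
    ma2 = np.convolve(x * x, np.ones(win) / win, mode="valid")
    return ma, ma2 - ma**2                       # Var = E[x^2] - E[x]^2

rng = np.random.default_rng(3)
metrics = {name: rng.standard_normal(500)        # placeholders for code rate, Doppler, etc.
           for name in ["code_rate", "discriminator", "doppler", "cn0", "sqm"]}
features = []
for series in metrics.values():
    ma, mv = moving_stats(series)
    features.extend([ma, mv])
sample = np.stack(features, axis=0)              # multidimensional time-series sample
print(sample.shape)                              # (10, 481) with these toy settings
```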
A Survey on Trajectory Planning and Resource Allocation in Unmanned Aerial Vehicle-assisted Edge Computing Networks
WANG Kan, CAO Tielin, LI Xujie, LI Hongyan, LI Meng, ZHOU Momiao
Available online  , doi: 10.11999/JEIT241071
Abstract:
  Significance   Unmanned Aerial Vehicle-assisted Mobile Edge Computing (UAV-MEC) is recognized for its flexible deployment, rapid response, wide-area coverage, and distributed computing capabilities, demonstrating significant potential in smart cities, environmental monitoring, and emergency rescue. Traditional ground-based MEC systems, constrained by fixed edge server deployments, are inadequate for dynamic user distributions and remote area demands. The integration of UAVs with MEC is a critical advancement, where dynamic trajectory planning and resource allocation enhance network energy efficiency, computational efficiency, and service quality, supporting the development of low-altitude intelligent networking. This integration addresses the efficient offloading and real-time processing of computation-intensive tasks through air-ground collaborative optimization, providing foundational technical support for future 6G networks and low-altitude economies. Existing surveys in this field predominantly focus on the integration of UAVs and MEC from a resource allocation perspective, while trajectory planning and its joint optimization with resource allocation are largely overlooked. Furthermore, the distinctions between online and offline optimization are insufficiently addressed in existing surveys, necessitating a systematic analysis of current theories and methods to guide future research.  Progress   In UAV-MEC, joint optimization of trajectory planning and resource allocation has progressed in both online and offline domains. Algorithm frameworks, including alternating optimization and reinforcement learning, have been shown to effectively balance computational complexity with optimization performance. (1) Offline optimization: (a) Energy efficiency optimization: Existing studies employ alternating optimization methods, such as Block Coordinate Descent (BCD) and Successive Convex Approximation (SCA), as well as heuristic algorithms, including differential evolution and dynamic programming, to jointly optimize trajectories, task offloading, and resource allocation, minimizing energy consumption for both users and UAVs. Further reductions in system energy consumption are achieved by integrating Wireless Power Transfer (WPT) and Reconfigurable Intelligent Surfaces (RIS). (b) Latency optimization: Non-Orthogonal Multiple Access (NOMA) and task scheduling strategies are utilized to minimize user-perceived latency. A multi-UAV collaborative framework based on game theory and reinforcement learning is proposed. (c) Multi-objective optimization: The Dinkelbach method is introduced to address fractional programming problems, facilitating joint optimization of computational efficiency, throughput, and secure capacity. Digital Twin (DT) technology is integrated to approximate global optimality. (2) Online optimization: (a) Lyapunov framework: Long-term stochastic problems are decoupled into per-slot optimizations through temporal decomposition. Convex optimization is combined with dynamic trajectory and resource allocation adjustments to adapt to time-varying channels and user mobility. (b) Reinforcement learning: Multi-Agent Deep Reinforcement Learning (MADRL) is applied to multi-UAV collaboration, with expert knowledge guidance and noise injection incorporated to accelerate algorithm convergence. (3) Hybrid optimization: A “pre-planning + online adjustment” strategy is proposed. 
In the offline phase, clustering algorithms and particle swarm optimization are used to generate high-quality samples for training Deep Neural Networks (DNNs). In the online phase, incremental learning is applied to dynamically fine-tune DNNs for unknown scenarios, balancing global planning with real-time responsiveness.  Conclusions  Despite notable advancements, several critical challenges in UAV-assisted MEC remain unresolved: (1) Incomplete future state information: The formulation of offline optimization problems typically assumes full knowledge of environmental state information over future time horizons. However, in multi-UAV scenarios involving multi-dimensional parameters, acquiring complete and accurate state information across extended periods remains difficult, limiting the applicability of offline methods. (2) Real-time multi-UAV coordination: Enhancing system efficiency and task completion quality requires real-time coordination among UAVs. This process demands extensive information exchange within UAV swarms, complex obstacle avoidance, and high-dimensional control adjustments. Collaborative computation offloading and multi-UAV trajectory planning remain challenging due to their inherent complexity. (3) Security vulnerabilities in air-ground links: The line-of-sight propagation and open transmission environment of UAV-assisted MEC networks expose offloaded data to risks such as eavesdropping, data tampering, and signal interference. Current approaches predominantly rely on physical-layer security, whereas active defense mechanisms against emerging threats, including deep-fake signal attacks, are still underdeveloped. (4) Lack of integration across air-space-ground networks: The absence of standardized interfaces and unified cross-domain resource scheduling protocols hinders the coordination of spectrum, computing, and caching resources between satellites and UAV-MEC systems. This limitation restricts the realization of globally orchestrated heterogeneous networks. (5) Energy constraints and service quality tradeoff: UAV endurance directly affects service coverage and operational sustainability. Although energy efficiency is emphasized in both offline and online optimization strategies, a fundamental tradeoff persists between energy consumption and edge service quality. These technical bottlenecks continue to restrict the transition of UAV-MEC from theoretical frameworks to large-scale real-world deployment.   Prospects   Future research is expected to progress in the following directions: (1) Intelligent environmental perception will advance through the integration of Gated Recurrent Units (GRUs) or Temporal Convolutional Networks (TCNs) for dynamic parameter prediction, while Generative Adversarial Networks (GANs) will be leveraged to complement incomplete environmental state information. (2) Multi-UAV collaboration and energy efficiency will be enhanced through the development of service migration mechanisms and DT-driven MADRL frameworks, combined with solar/WPT technologies to improve UAV endurance. (3) Secure communication mechanisms will evolve with the combination of beamforming, physical-layer security techniques, and homomorphic encryption-based task offloading protocols to address eavesdropping and data tampering in air-ground channels. 
(4) Heterogeneous network integration will focus on exploring a “cloud-edge-device-satellite” architecture to expand UAV-MEC coverage and robustness in 6G networks, together with the development of satellite-assisted cross-domain resource scheduling algorithms. (5) Green computing paradigms will emerge through the integration of energy harvesting and service migration mechanisms to reduce computing loads, promoting sustainable low-altitude intelligent computing ecosystems.
An Improved Interacting Multiple Model Algorithm and Its Application in Airport Moving Target Tracking
LU Qixing, TANG Xinmin, QI Ming, GUAN Xiangmin
Available online  , doi: 10.11999/JEIT241150
Abstract:
  Objective  With the rapid growth of air traffic and expanding airport infrastructure, airport surfaces have become increasingly complex and congested. Higher aircraft density on taxiways and runways, increased ground vehicles, and obstacles complicate surface operations and heighten the risk of conflicts due to limited situational awareness. Pilots and ground controllers may struggle to obtain accurate environmental data, leading to potential safety hazards. To enhance surface surveillance and reduce collision risks, this study proposes an improved Interacting Multiple Model (IMM) filtering algorithm with adaptive transition probabilities. Unlike traditional IMM algorithms that rely on a fixed Markov transition probability matrix, the proposed method dynamically adjusts state transition probabilities to better adapt to operational conditions. This approach enhances tracking accuracy and improves aircraft trajectory prediction on airport surfaces, thereby increasing the safety and stability of ground operations.  Methods  The proposed algorithm integrates observation data and filtering residuals, constructing a fuzzy inference system for maneuver intensity using a fuzzy inference algorithm. This system infers the mapping relationship between observation data and the explicit state set in the Hidden Markov Model (HMM), deriving the corresponding state sequence. This process accurately captures target state changes, enhancing behavior prediction. The Baum-Welch algorithm in HMM is applied to solve the state transition matrix and update the observation probability matrix in real time, optimizing the adaptive update strategy for state transition probabilities. This improves model adaptability and accuracy across different environments. The algorithm integrates the fuzzy inference system for maneuver intensity with HMM and incorporates it into the IMM algorithm, forming a Fuzzy Hidden Markov-Interacting Multiple Model (FHMM-IMM) algorithm for real-time maneuvering target estimation. This approach significantly enhances tracking accuracy, particularly in complex and dynamic environments, ensuring high precision and stability for practical applications.  Results and Discussions  The proposed improved IMM algorithm is validated using actual airport surface ADS-B trajectory data. The results show that the algorithm adaptively adjusts parameters under non-equidistant prediction conditions, maintaining stable tracking performance (Figure 8). The position, velocity, and acceleration tracking error curves in both two-dimensional and one-dimensional Cartesian coordinates indicate a significant reduction in overall error, enhancing tracking accuracy (Figures 9, 10, and 11). Comparison with other algorithms confirms that the proposed method achieves a more stable tracking trajectory with lower errors, demonstrating superior performance (Figures 12, 13, 14, and 15). According to Tables 2, 3, and 4, the two-dimensional position tracking accuracy improves by 63.5%, 54.3%, 40.3%, and 22.7%. The X-direction position tracking accuracy improves by 44.9%, 51.8%, 33.8%, and 35.2%, while the Y-direction position tracking accuracy improves by 63.9%, 62.9%, 52.7%, and 43.4%. The algorithm meets the real-time operational requirements of airport surface monitoring, further validating its effectiveness.  Conclusions  This study highlights the importance of precise four-dimensional trajectory tracking and prediction for airport surface aircraft, particularly in complex environments. 
Accurate trajectory tracking enhances taxiing safety and operational efficiency, addressing the challenges posed by increasing aircraft density on runways and taxiways. To improve tracking accuracy, an improved IMM algorithm with adaptive transition probabilities, based on Kalman filtering, is proposed. The main contributions are as follows: (1) A fuzzy inference system for maneuver intensity is developed, deriving explicit and hidden state sets and corresponding state sequences to capture target dynamics more accurately. (2) The FHMM-IMM algorithm is introduced for real-time estimation of maneuvering targets, incorporating time-varying state transition probabilities to enhance multi-model tracking and prediction in dynamic environments. (3) Experimental validation using real ADS-B trajectory data demonstrates that the FHMM-IMM algorithm achieves superior trajectory fitting, significantly reducing model errors. It also improves tracking accuracy for position, velocity, and acceleration in both two-dimensional and one-dimensional scenarios, verifying the effectiveness of the proposed model. These improvements provide a more precise and real-time solution for airport surface aircraft trajectory prediction and tracking, contributing to enhanced operational safety and efficiency.
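The adaptive transition-probability idea can be illustrated with a count-based simplification of the Baum-Welch-driven update described above; this is a sketch under that simplifying assumption, not the paper's exact estimator.

```python
# Illustrative update of an IMM Markov transition matrix from an inferred
# model-state sequence (count-based simplification of Baum-Welch).
import numpy as np

def update_transition_matrix(state_seq, n_states, prior=1.0):
    counts = np.full((n_states, n_states), prior)   # Dirichlet-style smoothing
    for s, s_next in zip(state_seq[:-1], state_seq[1:]):
        counts[s, s_next] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

seq = [0, 0, 1, 1, 1, 2, 2, 0, 0, 0]                # e.g., CV / CA / CT model indices
P = update_transition_matrix(seq, 3)
print(np.round(P, 3))                               # each row sums to 1
```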
Two-stage Multistatic Passive Localization of a Moving Object Under Unknown Transmitter Position and Velocity
ZUO Yan, CHEN Wangrong, PENG Dongliang
Available online  , doi: 10.11999/JEIT240664
Abstract:
  Objective  This study addresses target localization in multistatic passive radar systems under conditions where the transmitter’s position and velocity are unknown. Multistatic passive radar systems utilize covert deployment, exhibit resilience to jamming, and provide wide-area coverage. Conventional localization techniques rely on precise transmitter state information, which is often unavailable due to the mobility of transmitters mounted on dynamic platforms. Environmental disturbances can further introduce inaccuracies in position and velocity measurements. In non-cooperative scenarios, direct acquisition of transmitter state information is typically infeasible. Existing localization methods, such as the Two-Step Weighted Least Squares (TSWLS) approach, exhibit a threshold effect under high noise conditions, while the Semi-Definite Programming (SDP) method achieves Cramér-Rao Lower Bound (CRLB) accuracy but incurs excessive computational costs, limiting real-time applicability. To address these challenges, a localization algorithm is formulated that enables high-precision tracking of moving targets under uncertain transmitter conditions while maintaining relatively low computational complexity.  Methods  Multistatic passive radar localization systems employ two receiving channels: the reference channel, which receives direct signals from the transmitter, and the surveillance channel, which captures signals reflected from the target. Delay-Doppler cross-correlation between the reflected and reference signals enables the measurement of time delay and Doppler shift. The time delay from the reference channel corresponds to the distance between the transmitter and receivers, denoted as the Direct Range (DR), while the associated Doppler shift represents the Direct Range Rate (DRR). Similarly, the bistatic time delay from the surveillance channel corresponds to the sum of the distances between the target, transmitter, and receivers, referred to as the Bistatic Range (BR), with the associated Doppler shift representing the Bistatic Range Rate (BRR). A two-stage localization algorithm is proposed for estimating the positions and velocities of both the target and transmitter. In the first stage, a Constrained Weighted Least Squares (CWLS) problem is formulated using DR and DRR measurements to estimate the transmitter’s position and velocity. In the second stage, the estimated transmitter state is incorporated into the DR/DRR and BR/BRR measurements to construct a new CWLS problem. This problem is then solved using the Quasi-Newton method to determine the position and velocity of the moving target.  Results and Discussions  Compared with traditional localization approaches that rely solely on indirect path information (BR/BRR), incorporating direct path information (DR/DRR) for joint estimation improves target localization accuracy when the transmitter’s position and velocity are unknown (Figure 1). The performance of the proposed two-stage localization algorithm is evaluated through Monte Carlo simulations and compared with the TSWLS method, the SDP approach, and the CRLB. Estimation accuracy is assessed using the Mean Squared Error (MSE), while algorithm complexity is evaluated based on runtime. In scenarios with only four receivers, the TSWLS algorithm fails to provide accurate estimates, whereas the proposed algorithm maintains localization performance, with deviations occurring only when noise reaches 25 dB (Figure 2). 
When five receivers with uncorrelated noise are used, the TSWLS algorithm deviates from the CRLB and exhibits a threshold effect at a measurement noise level of 10 dB. At 30 dB, the proposed algorithm reduces the MSE for target position estimation by approximately 7 m² compared to the TSWLS algorithm, slightly outperforming the SDP algorithm, and reduces the MSE for target velocity estimation by approximately 10 (m/s)², approaching the localization accuracy of the SDP algorithm (Figure 3). When five receivers with correlated noise are used, the TSWLS algorithm begins to deviate from the CRLB at a noise variance of 15 dB and exhibits significant performance degradation at 30 dB. Under these conditions, the proposed algorithm reduces the MSE for target position estimation by approximately 5 m² compared to the TSWLS algorithm, slightly outperforming the SDP algorithm, and reduces the MSE for target velocity estimation by approximately 7 (m/s)², achieving localization accuracy comparable to the SDP algorithm (Figure 4). While the SDP algorithm has higher computational complexity and longer runtime, the proposed algorithm achieves a shorter runtime while maintaining localization accuracy, demonstrating good real-time performance (Table 1).  Conclusions  This study investigates the localization of a moving target in a multistatic passive radar system when the transmitter’s position and velocity are unknown. By leveraging time delay and Doppler frequency shift measurements from both direct and indirect paths, a quadratic constraint model is formulated and iteratively solved using the Quasi-Newton method. Simulation results demonstrate that the proposed algorithm can achieve CRLB accuracy even under high-noise conditions. Compared with localization algorithms based on joint transmitter and target estimation using TSWLS and SDP, the proposed algorithm achieves lower computational complexity and enables three-dimensional moving target localization with only four receivers. The proposed method, designed for a single-transmitter multiple-receiver system, can be directly extended to multiple-transmitter multiple-receiver configurations. Additionally, this method assumes time synchronization between the transmitter and receivers. Future research will focus on extending multistatic passive radar localization to scenarios where the transmitter is not synchronized with the receivers.
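For reference, the four measurement types defined in the Methods (DR, DRR, BR, BRR) follow directly from the geometry; the sketch below computes them for one transmitter, one static receiver, and a moving target, with arbitrary example values.

```python
# DR/DRR/BR/BRR from positions and velocities (static receiver assumed).
import numpy as np

def measurements(t_pos, t_vel, tx_pos, tx_vel, rx_pos):
    """Return (DR, DRR, BR, BRR) for one transmitter-receiver pair."""
    d_tx_rx = tx_pos - rx_pos
    dr = np.linalg.norm(d_tx_rx)                        # direct range
    drr = (tx_vel @ d_tx_rx) / dr                       # direct range rate
    d_tx_tg, d_tg_rx = t_pos - tx_pos, t_pos - rx_pos
    r1, r2 = np.linalg.norm(d_tx_tg), np.linalg.norm(d_tg_rx)
    br = r1 + r2                                        # bistatic range
    brr = ((t_vel - tx_vel) @ d_tx_tg) / r1 + (t_vel @ d_tg_rx) / r2  # bistatic range rate
    return dr, drr, br, brr

tgt_p, tgt_v = np.array([600.0, 400.0, 300.0]), np.array([10.0, -5.0, 3.0])
tx_p, tx_v = np.array([0.0, 0.0, 100.0]), np.array([20.0, 0.0, 0.0])
rx_p = np.array([1000.0, 0.0, 0.0])
print(measurements(tgt_p, tgt_v, tx_p, tx_v, rx_p))
```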
Research on Low-Power Transmission Method for Group-Connected Beyond-Diagonal Reconfigurable Intelligent Surface-assisted Communication Systems
WANG Hong, LI Peiqi, LI Heyi, WANG Peiyu
Available online  , doi: 10.11999/JEIT241029
Abstract:
  Objective  Reconfigurable Intelligent Surface (RIS) technology enables dynamic reconfiguration of the wireless communication environment. Among recent advancements, Beyond-Diagonal RIS (BD-RIS) has emerged as a novel architecture whose phase-shift matrix is not constrained to diagonal form. This allows simultaneous adjustment of phase and amplitude, offering greater design flexibility and improved system performance. However, while prior studies have primarily focused on BD-RIS-assisted downlink systems, the uplink counterpart remains unexplored. Unlike downlink transmission, where only the total base station power is constrained, uplink transmission imposes individual power limitations on each user, necessitating different optimization models. Therefore, existing downlink-oriented design approaches cannot be directly applied to uplink scenarios. This study proposes a low-power transmission method tailored for BD-RIS-assisted uplink systems, addressing the unique constraints and challenges of uplink communication.  Methods  This study investigates a group-connected BD-RIS-assisted uplink communication system to minimize total transmit power by jointly optimizing the equalizer, user transmit power, and BD-RIS phase-shift matrix. The Minimum Mean-Square Error (MMSE) equalizer is employed to maximize the Signal-to-Interference-plus-Noise Ratio (SINR) of each received signal. Subsequently, an analytical expression linking user transmit power and the phase-shift matrix is derived. The phase-shift optimization problem is then reformulated as an unconstrained univariate optimization problem. Finally, an alternating optimization approach is applied to iteratively refine the equalizer, user transmit power, and BD-RIS phase-shift matrix, achieving minimal system transmit power.  Results and Discussions  The proposed scheme is compared with benchmark methods, and simulation results demonstrate its superior convergence (Fig. 2) and performance (Figs. 3 and 4). The group-connected BD-RIS achieves lower total transmit power than the traditional single-connected RIS (Figs. 3 and 4) due to its ability to manipulate both the phase and amplitude of signals, leading to enhanced system performance. Furthermore, larger group sizes in the group-connected BD-RIS result in improved performance (Figs. 3 and 4), as increased group size provides greater design flexibility, further optimizing system efficiency.  Conclusions  To address the limitations of existing BD-RIS research, this study investigates a group-connected BD-RIS-assisted uplink communication system and proposes a method to minimize total transmit power. An optimization problem is formulated to minimize user transmit power, and an alternating optimization approach is employed to iteratively refine the equalizer, user transmit power, and BD-RIS phase-shift matrix. Specifically, the MMSE equalizer maximizes each user’s SINR, a closed-form expression for user power is derived, and the phase-shift optimization problem is transformed into an unconstrained single-variable optimization problem, achieving minimal system power consumption. Simulation results indicate that, compared with the traditional single-connected RIS, the group-connected BD-RIS achieves lower system transmit power, with performance improving as group size increases. This study assumes perfect channel state information; however, in practical RIS-assisted communication systems, accurately obtaining ideal channel state information is challenging. 
Future research should consider the effects of non-ideal channel state information.
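The MMSE equalizer used in the Methods has the standard closed form w_k = (H P Hᴴ + σ²I)⁻¹ h_k √p_k, which maximizes the SINR of user k; the sketch below evaluates it for a random placeholder channel (the actual BD-RIS cascaded channel is not modeled here).

```python
# MMSE receive equalizer and post-equalization SINR for a toy uplink.
import numpy as np

rng = np.random.default_rng(4)
M, K = 8, 3                                      # receive antennas, users
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
p = np.array([1.0, 0.5, 0.8])                    # user transmit powers
sigma2 = 0.1                                     # noise power

R = H @ np.diag(p) @ H.conj().T + sigma2 * np.eye(M)   # received-signal covariance
W = np.linalg.solve(R, H * np.sqrt(p))                 # column k = MMSE filter of user k

k = 0                                            # SINR of user 0 after equalization
sig = p[k] * abs(W[:, k].conj() @ H[:, k]) ** 2
intf = sum(p[j] * abs(W[:, k].conj() @ H[:, j]) ** 2 for j in range(K) if j != k)
noise = sigma2 * np.linalg.norm(W[:, k]) ** 2
print(sig / (intf + noise))
```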
Optimal and Suboptimal Architectures of Millimeter-wave Large-scale Arrays for 6G
HONG Wei, XU Jun, CHEN Jixin, HAO Zhangcheng, ZHOU Jianyi, YU Zhiqiang, YANG Guangqi, JIANG Zhihao, YU Chao, HU Yun, HOU Debin, ZHU Xiaowei, CHEN Zhe, ZHOU Peigen
Available online  , doi: 10.11999/JEIT250109
Abstract:
  Objective  Beamforming array technology is a key enabler for modern radio systems, evolving across three primary dimensions: advancements in electromagnetic theory, innovations in semiconductor manufacturing, and iterations in system architectures. The development of mobile communication has driven the progress of beamforming array technologies. Hybrid beamforming array technology, in particular, was established as a critical solution for 5G millimeter-wave communications under the 3GPP Release 15 standard. To address the needs of future 6G communication and sensing, millimeter-wave beamforming arrays will evolve towards ultra-large-scale (>1000 elements), intelligent (AI-driven), and heterogeneous (integration of photonics and electronics) architectures, serving as the foundation for ubiquitous intelligent connectivity. This article investigates optimal and suboptimal large-scale beamforming array architectures for 6G millimeter-wave communications.  Methods  Large-scale beamforming arrays can be classified into three primary types: analog-domain beamforming arrays, digital-domain beamforming arrays, and hybrid-domain beamforming arrays. These arrays can be further categorized into single-beam and multi-beam configurations based on the number of supported beams. Each category includes various architectural variants (Figure 1). Analog-domain beamforming arrays (Figure 2) consist of passive and active beamforming arrays. Active beamforming arrays are further divided into Radio Frequency (RF) phase-shifting, Intermediate Frequency (IF) phase-shifting, and Local Oscillator (LO) phase-shifting architectures. Digital-domain implementations (Figure 3) include symmetric and asymmetric full-digital beamforming architectures. Hybrid-domain configurations (Figure 4) offer various combinations, such as architectures integrating RF phase-shifting phased subarrays with digital beamforming, or hybrid multi-beam arrays that combine passive beamforming networks with digital processing. In terms of performance, the symmetric full-digital beamforming architecture (Figure 5) is considered the optimal solution among all beamforming arrays. However, it faces challenges such as high system costs, excessive power consumption, and increased complexity due to the need for numerous high-speed ADCs and real-time processing of large data streams. To address these limitations in symmetric full-digital multi-beam large-scale arrays, an asymmetric large-scale full-digital multi-beam array architecture was proposed (Figure 6). Additionally, a spatial-domain orthogonal hybrid beamforming array (Figure 8) is proposed, which uses differentiated beamforming techniques across spatial dimensions to implement large-scale hybrid beamforming arrays.  Results and Discussions  Current 5G millimeter-wave base stations primarily utilize hybrid beamforming massive MIMO array architectures (Figure 7), which integrate RF phase-shifting phased subarrays with digital-domain beamforming. In this configuration, each two-dimensional RF phase-shifting phased subarray is connected to a dedicated Up-Down Converter (UDC), followed by secondary digital beamforming processing. However, this architecture limits independent beam control flexibility, often leading to the abandonment of digital beamforming in practical implementations. Therefore, each beam achieves only subarray-level gain. 
For mobile communication base stations, the adoption of asymmetric full-digital phased arrays (Figure 6), which include large-scale transmit arrays and smaller receive arrays, offers an optimal balance among cost, power consumption, complexity, and performance. This configuration meets uplink/downlink traffic demands while enabling wider receive beams (corresponding to compact receive arrays) that support accurate angle-of-arrival estimation. Theoretically, hybrid multi-beam architectures that combine digital beamforming in the horizontal dimension with analog beamforming in the vertical dimension (or vice versa) can reduce system complexity, cost, and power consumption without degrading performance, potentially outperforming current 5G mmWave hybrid beamforming solutions. However, these architectures face limitations due to beam bundling. Building upon the 4-subarray hybrid beamforming architecture (256 elements) used in existing 5G mmWave base stations (Figure 7), a spatial-domain orthogonal hybrid beamforming array is proposed (Figure 8). In this configuration, vertical-dimension elements are grouped into sixteen 1D RF phase-shifting phased subarrays (16 elements each), with each subarray connected to individual UDCs and ADC/DAC chains. The 16 data streams are processed through horizontal-dimension digital beamforming, achieving spatial orthogonality between the analog and digital beamforming domains. This architecture preserves full-aperture gain for each beam, supporting enhanced transmission rates and system capacity. This hybrid multi-beam solution maintains the same beamforming chip count as conventional hybrid architectures (Figure 7) for identical array scales, requiring only an increase in UDC channels from 4 to 16, with minimal cost impact. The proposed solution supports 16 simultaneous beams/data streams, resulting in a significant capacity improvement. For dual-polarization configurations, this extends to 32 beams/data streams, further enhancing system capacity. In horizontal-digital/vertical-analog implementations, all beams align along the horizontal plane, with vertical scanning limited to simultaneous elevation adjustment (Figure 9). Although vertical beam grouping enables independent elevation control, it results in beam gain degradation.  Conclusions  From a performance perspective, the symmetric full-digital beamforming array architecture can be considered the optimal solution. However, it is hindered by high system complexity and cost. The asymmetric full-digital beamforming array architecture significantly reduces system complexity and cost while closely approaching the performance of its symmetric counterpart, making it a suboptimal yet practical choice for large-scale beamforming arrays. Additionally, the spatially orthogonal hybrid beamforming array architecture—such as the design combining vertical analog phased subarrays with horizontal digital beamforming—represents another suboptimal solution. Notably, this hybrid architecture outperforms current 5G mmWave hybrid beamforming systems.
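To make the gain argument concrete, the following back-of-envelope sketch in Python (assuming isotropic elements and lossless combining; the element counts are the ones quoted above) contrasts the per-beam gain of the conventional 4-subarray hybrid configuration with the proposed spatial-domain orthogonal configuration.

```python
import math

N_TOTAL = 256  # total elements, matching the 256-element array scale above

# Conventional hybrid (Figure 7): four RF phase-shifting subarrays; with
# digital beamforming abandoned in practice, each beam sees only the gain
# of its own 64-element subarray.
gain_conv_db = 10 * math.log10(N_TOTAL / 4)

# Spatial-domain orthogonal hybrid (Figure 8): sixteen 16-element vertical
# analog subarrays plus 16-stream horizontal digital beamforming, so each
# beam is formed over the full 256-element aperture.
gain_orth_db = 10 * math.log10(N_TOTAL)

print(f"conventional hybrid, per-beam gain : {gain_conv_db:.1f} dB")  # ~18.1 dB
print(f"spatial-orthogonal, per-beam gain  : {gain_orth_db:.1f} dB")  # ~24.1 dB
print("simultaneous streams: 4 -> 16 (32 with dual polarization)")
```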
Scene Text Detection Based on High Resolution Extended Pyramid
WANG Manli, DOU Zeya, CAI Mingzhe, LIU Qunpo, SHI Yannan
Available online  , doi: 10.11999/JEIT241017
Abstract:
  Objective  Text detection, a critical branch of computer vision, has significant applications in text translation, autonomous driving, and bill information processing. Although existing text detection methods have improved detection performance, several challenges remain in complex natural scenes. Scene text exhibits substantial scale variations, making multi-scale text detection difficult. Additionally, inadequate feature utilization hampers the detection of small-scale text. Furthermore, increasing the receptive field often necessitates reducing image resolution, which results in severe spatial information loss and diminished feature saliency. To address these challenges, this study proposes the High-Resolution Extended Pyramid Network (HREPNet), a scene text detection method based on a high-resolution extended pyramid structure.  Methods  First, an improved feature pyramid was constructed by incorporating a high-resolution extension layer and a super-resolution feature module to enhance text resolution features and address the issue of low-resolution text. Additionally, a multi-scale feature extraction module was integrated into the backbone network to facilitate feature transfer. By leveraging a multi-branch dilated convolution structure and an attention mechanism, the model effectively captured multi-scale text features, mitigating the challenge posed by significant variations in text scale. Finally, an efficient feature fusion module was proposed to selectively integrate high-resolution and multi-scale features, thereby minimizing spatial information loss and addressing the problem of insufficient effective features.  Results and Discussions  Ablation experiments demonstrated that the simultaneous application of the High-Resolution Extended Pyramid (HREP), the Multi-scale Feature Extraction Module (MFEM), and the Efficient Feature Fusion Module (EFFM) significantly enhanced the model’s text detection performance. Compared with the baseline, the proposed method improved accuracy and recall by 6.3% and 8.9%, respectively, while increasing the F-measure by 7.6%. These improvements can be attributed to MFEM, which enhances multi-scale text detection, facilitates efficient feature transmission from the top to the bottom of the high-resolution extended pyramid, and supports the extraction of text features at different scales. This process enables HREP to generate high-resolution features, thereby substantially improving the detection of low-resolution and small-scale text. Moreover, the large number of feature maps generated by HREP and MFEM are refined through EFFM, which effectively suppresses spatial redundancy and enhances feature expression. The proposed method demonstrated significant improvements in detecting text across different scales, with a more pronounced effect on small-scale text compared to large-scale text. Visualization results illustrate that, for small-scale text images (384 pixels), the detected text box area of the proposed method aligns more closely with the actual text area than that of the baseline method. Experimental results confirm that HREPNet significantly improves the accuracy of small-scale text detection. Additionally, for large-scale text images (2,048 pixels), the number of correctly detected text boxes increased considerably, demonstrating a substantial improvement in recall for large-scale text detection. Comparative experiments on public datasets further validated the effectiveness of HREPNet. 
The F-measure improved by 7.6% on ICDAR2015, 5.5% on CTW1500, and 3.0% on Total-Text, with significant enhancements in both precision and recall.  Conclusions  To address challenges related to large-scale variation, low resolution, and insufficient effective features in natural scene text detection, this study proposes a text detection network based on a High-Resolution Extended Pyramid, complemented by the MFEM and the EFFM. Ablation experiments demonstrate that each proposed improvement enhances text detection performance compared with the baseline model, with the modules complementing each other to further optimize model performance. Comparative experiments on text images of different scales show that HREPNet improves text detection across various scales, with a more pronounced enhancement for small-scale text. Furthermore, experiments on natural scene and curved text demonstrate that HREPNet outperforms other advanced algorithms across multiple evaluation metrics, exhibiting strong performance in both natural scene and curved text detection. The method also demonstrates robustness and generalization capabilities. However, despite its robustness, the model has a relatively large number of parameters, which leads to slow inference speed. Future research will focus on optimizing the network to reduce the number of parameters and improve inference speed while maintaining accuracy, recall, and F-measure.
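As an illustration of the multi-branch dilated convolution with attention described in the Methods, the following minimal PyTorch sketch shows one plausible form of such a block; the module name, three-branch layout, dilation rates (1, 2, 4), and squeeze-and-excitation-style attention are assumptions of this sketch rather than the authors' exact MFEM design.

```python
import torch
import torch.nn as nn

class MultiBranchDilatedBlock(nn.Module):
    """Illustrative MFEM-style block: parallel dilated branches + channel attention."""
    def __init__(self, channels: int):
        super().__init__()
        # Three parallel 3x3 branches with growing receptive fields.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # merge branch outputs
        # Squeeze-and-excitation-style attention to reweight channels.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        multi = torch.cat([b(x) for b in self.branches], dim=1)
        fused = self.fuse(multi)
        return x + fused * self.attn(fused)  # residual, attention-gated

feats = torch.randn(1, 64, 96, 96)               # e.g., one pyramid level
print(MultiBranchDilatedBlock(64)(feats).shape)  # torch.Size([1, 64, 96, 96])
```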
CFS-YOLO: An Early Fire Detection Method via Coarse and Fine Grain Search and Focus Modulation
FANG Xianjin, JIANG Xuefeng, XU Liuquan, FANG Zhongyi
Available online  , doi: 10.11999/JEIT240928
Abstract:
  Objective   Fire is a frequent disaster, and detecting early fire phenomena effectively can significantly reduce casualties. Traditional fire detection methods, which rely on sensor devices, struggle to accurately detect fires in open spaces. With the development of deep learning, fire detection can be automated through image capture devices like cameras, improving detection accuracy. However, early-stage fires are small and often obscured by occlusion or fire-like objects. Fire detection models, such as Faster Region-based Convolutional Neural Networks (Faster R-CNN) and You Only Look Once (YOLO), often fail to meet real-time detection requirements due to their large number of parameters, which slow down inference. Additionally, existing models face challenges in preserving fire edges and color features, leading to reduced detection accuracy. To address these issues, this paper proposes CFS-YOLO, an early-stage fire recognition model that incorporates coarse- and fine-grained search and focus modulation, enhancing both the speed and accuracy of fire detection.  Methods   To enhance the detection efficiency of the model, a coarse- and fine-grained search strategy is introduced to optimize the lightweight structure of the Unit Inference Block (UIB) module, which consists of four possible instantiations (Fig. 2). A coarse-grained search quickly evaluates different network architectures by adapting the network topology, adding optional convolutional modules to the UIB, and modifying the arrangement and combination of the modules. Dimensionality tuning is performed during the search process to select feature map dimensions and convolutional kernel sizes, generating candidate architectures by expanding or compressing the network width. During the filtering process, candidate architectures are evaluated based on multiple performance metrics. A multi-objective optimization approach is used to find the Pareto-optimal solution, retaining candidate architectures that balance accuracy and efficiency. Weight sharing is employed to improve parameter reuse. The fine-grained search refines the candidate architectures from the coarse-grained search, dynamically adjusting hyperparameters such as learning rate, batch size, regularization coefficient, and optimization algorithm according to the model's performance during training. It analyzes and adjusts each module layer by layer to accelerate convergence and better adapt to data complexity. To address the challenges posed by complex scenes and interfering objects, a focus-modulated attention mechanism is introduced, as shown in Fig. 3. The input fire images are processed through a lightweight linear layer, followed by selective aggregation of contextual information to the modulators of each query token through a hierarchical contextualization module and gating mechanism. These aggregated modulators are injected into each query token via affine transformations to generate outputs. This approach helps tackle the challenges of detecting small targets or objects in complex backgrounds, effectively capturing long-range dependencies and contextual information in the image. Finally, to account for the effects of anchor shape and angle, the model introduces a ShapeIoU loss function (Fig. 4). This function considers the influence of the distance, shape, and angle between the ground-truth and anchor boxes on bounding-box regression, enabling accurate measurement of the similarity between the ground-truth and predicted boxes.  
  Results and Discussions   Table 1 presents the results of the ablation experiments. The results show that CFS-YOLO achieved optimal performance. Compared to the baseline model, CFS-YOLO improves precision, recall, and F1 score by 13.33%, 4.96%, and 9.36%, respectively, and increases the frame rate by 22 fps. The model also shows significant improvements in APflame, APsmoke, and mAP, with increases of 11.1%, 16.2%, and 13.65%, respectively, validating the model's effectiveness. Fig. 6 illustrates the detection heat map for the ablation model, demonstrating that the combination of the focus-modulated attention mechanism and the ShapeIoU loss function effectively captures key features, confirming their synergistic effect. Fig. 7 shows the loss curve plots for IoU and ShapeIoU. At the 80th epoch, the loss of the baseline model stabilizes and converges to 0.5. In contrast, the bounding box loss and DFL loss with the ShapeIoU loss function converge to 0.3 by the 40th epoch, while the classification loss reaches 0.15 by the 80th epoch, highlighting the effectiveness of the ShapeIoU loss function. Table 2 compares the performance with several state-of-the-art target detection models, while Table 3 presents a comparison of different flame detection algorithms. The results show that CFS-YOLO leads in performance and demonstrates higher computational efficiency, indicating its potential application value in the flame detection field. Fig. 10 and Fig. 11 provide visualizations of the CFS-YOLO detection results, showing its excellent performance in capturing fire information despite background interference and small fire targets.  Conclusions   CFS-YOLO demonstrates outstanding performance in early fire detection, achieving detection speeds of up to 75 fps. It provides high inference speeds, meeting the requirements for real-time detection. Compared to state-of-the-art object detection models, CFS-YOLO outperforms them in both detection accuracy and speed.
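The role of a shape-aware regression loss can be illustrated with a short NumPy sketch; the penalty form and the shape_weight parameter below are illustrative assumptions, not the exact ShapeIoU formulation used in the paper.

```python
import numpy as np

def iou(box_a, box_b):
    """Plain IoU of two (x1, y1, x2, y2) boxes."""
    x1, y1 = np.maximum(box_a[:2], box_b[:2])
    x2, y2 = np.minimum(box_a[2:], box_b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter + 1e-9)

def shape_aware_loss(pred, gt, shape_weight=0.5):
    """1 - IoU plus a width/height-mismatch penalty normalized by the
    ground-truth scale (penalty form is an assumption of this sketch)."""
    w_p, h_p = pred[2] - pred[0], pred[3] - pred[1]
    w_g, h_g = gt[2] - gt[0], gt[3] - gt[1]
    shape_pen = (abs(w_p - w_g) + abs(h_p - h_g)) / (w_g + h_g)
    return (1.0 - iou(pred, gt)) + shape_weight * shape_pen

pred = np.array([10., 10., 50., 40.])
gt   = np.array([12., 12., 52., 44.])
print(f"loss = {shape_aware_loss(pred, gt):.3f}")
```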
Mixture Distribution-Based Truth Discovery Algorithm under Local Differential Privacy
ZHANG Pengfei, AN Jianlong, CHENG Xiang, ZHANG Zhikun, SUN Li, ZHANG Ji, ZHU Yibo
Available online  , doi: 10.11999/JEIT240936
Abstract:
  Objective  Mobile crowd sensing is recognized as one of the most significant means of data collection, wherein a fundamental challenge lies in discovering the “truth” from a multitude of sensing data of varying quality. To address potential privacy leakage issues during the truth discovery process, existing methods often incorporate local differential privacy techniques to protect the data submitted by workers. However, these methods fail to adequately consider the negative impact of Gaussian noise, which reflects worker quality, on the accuracy of the noisy “truth”. Moreover, directly applying the Laplace mechanism for privacy protection introduces excessive noise due to the randomness and unbounded nature of the Laplace distribution, resulting in poor precision and utility of truth discovery. Additionally, existing truth discovery methods are either designed for discrete value scenarios or often fail to strictly satisfy Local Differential Privacy (LDP). Therefore, designing a truth discovery algorithm based on mixed distributions that strictly adheres to LDP poses a significant challenge. This is particularly true in continuous value scenarios, where balancing privacy protection with the accuracy of truth discovery, as well as efficiently optimizing the complexity of mixed distribution models to enhance algorithm precision and efficiency, remains a critical issue to be resolved.  Methods  A novel algorithm, termed Mixture distributiOn-based truth discOvery under local differeNtial privacy (MOON), is proposed. This algorithm primarily considers both the Gaussian noise inherent in the data uploaded by workers, which reflects their quality, and the exogenous Laplace noise injected to protect private data. Based on the mixed noise distribution, new iterative equations for truth discovery are designed. Specifically, each worker first injects Laplace noise into their sensed data and uploads the noise-added data to the server. Subsequently, a probabilistic model combining Gaussian and Laplace noise is constructed and jointly estimated. Finally, the constrained optimization problem is solved using the Lagrange multiplier method to derive iterative equations for worker quality and the noisy “true value”.  Results and Discussions  Experimental results demonstrate that, across two real-world datasets, as the privacy budget ε increases, the MOON algorithm exhibits the least impact on utility compared to other benchmark algorithms. Furthermore, when compared to the state-of-the-art TESLA algorithm, MOON achieves at least a 20% improvement in precision (Fig. 3). In the context of truth discovery, weight updating is a critical component. Therefore, the experiments also validate the differences between the mixed noise weight distribution derived by the MOON algorithm and the true weight distribution across different datasets (Fig. 4). The results indicate that the weight distribution obtained by MOON is closer to the true distribution, aligning with the utility analysis presented in the algorithm analysis section. This is attributed to the smaller scale of noise added to high-quality data. Additionally, the runtime of the MOON algorithm is generally higher than that of the non-privacy-preserving truth discovery algorithm NoPriv, being approximately twice as long (Fig. 5), which is consistent with the theoretical analysis. This is due to the injection of Laplace noise into the data uploaded by workers in MOON, necessitating more iterations to converge to the final truth. 
However, since both runtimes are measured in seconds, this discrepancy is considered acceptable in practical applications.  Conclusions  Existing truth discovery algorithms that satisfy LDP fail to adequately account for the negative impact of Gaussian noise, which reflects worker quality, on the accuracy of the noisy “truth”. Moreover, while directly applying the Laplace mechanism for noise addition strictly ensures LDP compliance, the randomness and unbounded nature of the Laplace distribution result in excessive noise injection. To address these issues, the MOON algorithm is proposed in this work. Theoretical analysis demonstrates that MOON achieves privacy protection while maintaining low computational and communication complexity. Experimental results on two real-world datasets show that, compared to the latest advancements, MOON improves the precision of the derived “truth” by 20% with minimal additional computational overhead. In future work, the potential social relationships among workers, which may lead to similarities in the data submitted by certain workers, as well as functional dependencies among task attributes, will be leveraged to further enhance the accuracy of truth discovery under local differential privacy.
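The overall pipeline, worker-side Laplace perturbation followed by server-side reweighted truth estimation, can be sketched in a few lines of NumPy. The Laplace mechanism is standard; the CRH-style weight update is a generic stand-in for MOON's Lagrangian-derived iterative equations, and all numerical values are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ldp_report(value, sensitivity, epsilon):
    """Worker-side Laplace mechanism: epsilon-LDP for the given sensitivity
    (assumed value range), per the standard Laplace-mechanism guarantee."""
    return value + rng.laplace(scale=sensitivity / epsilon)

# Toy setting: 20 workers observe a scalar ground truth 25.0 with
# worker-specific Gaussian quality noise, then privatize with Laplace noise.
truth, n = 25.0, 20
quality = rng.uniform(0.5, 3.0, n)  # per-worker sensing noise std
reports = np.array([ldp_report(truth + rng.normal(0, q),
                               sensitivity=10.0, epsilon=2.0)
                    for q in quality])

# CRH-style iteration (a generic stand-in for MOON's mixed-noise updates):
# alternate between estimating the truth and re-weighting workers by error.
est = reports.mean()
for _ in range(10):
    weights = 1.0 / ((reports - est) ** 2 + 1e-6)  # low error -> high weight
    est = np.average(reports, weights=weights)
print(f"estimated truth: {est:.2f} (ground truth {truth})")
```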
Research on Resource Scheduling of Distributed CNN Inference System Based on AirComp
LIU Qiaoshou, DENG Yifeng, HU Haonan, YANG Zhenwei
Available online  , doi: 10.11999/JEIT241022
Abstract:
  Objective   In traditional AirComp systems, computational accuracy is directly affected by the alignment of received signal phases from different transmitters. When applied to distributed federated learning and distributed inference systems, phase misalignment can introduce computational errors, reducing model training and inference accuracy. This study proposes the Multiple-Output Single-Input AirComp (MOSI-AirComp) system, in which transmitted signals in each computation round originate from the same node, thereby eliminating signal phase alignment issues.  Methods  (1) A dual-branch training model is proposed, increasing network complexity only during training. The traditional model is extended to a dual-branch structure, where the lower branch retains the original model, and the upper branch incorporates additional loss layers for training. (2) A MOSI-AirComp-based weight-power control scheme is introduced. Each node is equipped with multiple transmitting antennas and a single receiving antenna. Pre-trained model weights are offloaded to task nodes as part of the power control factor, which adjusts transmission power during inference. This optimization enhances signal amplitude for convolution operations while reducing computation time. Since data transmission originates from the same node, phase alignment issues are avoided. AirComp integrates signals from multiple antennas for convolution summation, enabling over-the-air convolution. (3) A TSP-based node selection algorithm is proposed, using weight mean and path as evaluation parameters to determine the optimal transmission path, ensuring efficient data transmission.  Results and Discussions  Compared to the traditional network model, the dual-branch training model significantly improves inference accuracy under small-scale fading. For the MNIST and CIFAR-10 datasets, accuracy increases by 2%–18% and 0.4%–11.2% under different SNR values (Fig. 5 and Fig. 6). The MSE decreases by 0.056–0.154 and 0.047–0.23 under different maximum node power budgets (Fig. 7). In noise-only scenarios, inference accuracy improves by 0.7%–5.5% and 0.3%–7.1% under different SNR values (Fig. 5 and Fig. 6), while the MSE decreases by 0.035–0.152 and 0.056–0.253 under different maximum node power budgets (Fig. 8).  Conclusions  A MOSI-AirComp system is proposed to address the phase alignment issue inherent in traditional AirComp scenarios. The system enables over-the-air convolution through a power control scheme and enhances the traditional network model with a dual-branch structure. The upper branch simulates multiplicative Rayleigh fading using loss layers and incorporates model data into the convolution layer output of the lower branch to simulate additive noise effects. To account for node limitations in IoT networks, a model-weight-improved Traveling Salesman Problem (TSP) node selection algorithm is proposed. Future advancements in AirComp deployment for distributed computing and communication frameworks hold promise, particularly with the rapid development of 6G and IoT.
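The node selection step can be illustrated with a greedy nearest-neighbor heuristic for the TSP; the cost matrix below, which blends inter-node distance with a per-node mean-weight score, and the blend factor of 20 are assumptions of this sketch rather than the paper's exact algorithm.

```python
import numpy as np

def nearest_neighbor_path(cost):
    """Greedy TSP heuristic: from node 0, repeatedly visit the cheapest
    unvisited node according to the given cost matrix."""
    n = cost.shape[0]
    path, visited = [0], {0}
    while len(path) < n:
        last = path[-1]
        nxt = min((j for j in range(n) if j not in visited),
                  key=lambda j: cost[last, j])
        path.append(nxt)
        visited.add(nxt)
    return path

rng = np.random.default_rng(1)
pos = rng.uniform(0, 100, (6, 2))                 # node coordinates (toy)
dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
weight_mean = rng.uniform(0, 1, 6)                # per-node model-weight score
cost = dist - 20.0 * weight_mean[None, :]         # favor high-weight nodes
print("visit order:", nearest_neighbor_path(cost))
```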
A Novel Silicon Carbide (SiC) MOSFET with Diode Integration Technology
MA Chao, CHEN Weizhong, ZHANG Bo
Available online  , doi: 10.11999/JEIT250180
Abstract:
This paper proposes a novel double-trench Silicon Carbide (SiC) MOSFET that integrates a Schottky diode structure to improve reverse recovery and switching characteristics. In the proposed design, the conventional right-side trench channel is replaced by a Schottky diode, and a split-gate structure is connected to the source. The Schottky diode suppresses body diode conduction and eliminates the bipolar degradation effect. The split gate reduces the coupling area between the gate and drain, thereby lowering the feedback capacitance and gate charge. In addition, when the split gate is connected to a high potential, it attracts electrons to form an accumulation layer near the source, which increases electron density. During reverse conduction, the current flows through the Schottky diode, while the split gate enhances electron concentration and thus current density. The split-gate structure also shields the gate from the drain, reducing the Gate–Drain Charge (QGD) and improving switching performance.  Objective  Conventional Double-Trench MOSFETs (DT-MOS) typically require an external anti-parallel diode to function as a freewheeling diode in converter and inverter systems. This necessitates additional module area and increases parasitic capacitance and inductance. Utilizing the body diode as a freewheeling diode could reduce cost and save space. However, this approach presents two major challenges. First, due to the wide bandgap of SiC, the turn-on voltage of the intrinsic body diode rises significantly (approaching 3 V), which increases switching loss. Second, bipolar conduction of the body diode induces the bipolar degradation effect, which compromises long-term device reliability. This paper presents a new DT-MOS, referred to as SDT-MOS, with an integrated Schottky diode, demonstrated using Sentaurus TCAD. In the proposed structure, the conventional right-side channel is replaced with a Schottky junction, and a source-connected split gate is embedded in the gate oxide. The SDT-MOS achieves low power consumption and reduced reverse recovery current.  Methods  Sentaurus TCAD is used to simulate and analyze the electrical performance of the proposed structure and its conventional counterpart. The simulation includes key physical models, such as mobility saturation under high electric fields, Auger recombination, Okuto–Crowell impact ionization, bandgap narrowing, and incomplete ionization. To improve simulation accuracy and align the results with experimental data, interface traps and fixed charges at the SiC/SiO2 interface are also considered.  Results and Discussions  The Miller capacitance (Crss or CGD) extracted at Vds of 400 V is 29 pF/cm² for the SDT-MOS, representing a 61% reduction compared to the DT-MOS, which has a CGD of 74 pF/cm². This reduction is primarily attributed to the integrated split-gate structure, which decreases the capacitive coupling between the gate and drain electrodes (Fig. 7). The total switching loss (Eon + Eoff) of the SDT-MOS is 1.58 mJ/cm², which is 59.3% lower than that of the DT-MOS (3.88 mJ/cm²), due to the improved switching characteristics enabled by the split gate (Fig. 10). In addition, the peak reverse recovery current (IRRM) and reverse recovery charge (QRR) of the SDT-MOS are 165 A/cm² and 1.39 μC/cm², representing reductions of 31.3% and 54%, respectively, compared to the DT-MOS (Fig. 11).  Conclusions  A novel double-trench SiC MOSFET (SDT-MOS) with an integrated Schottky diode has been numerically investigated. In this structure, the right-side channel of a conventional DT-MOS is replaced with a Schottky diode, and a split gate is connected to the source. 
This configuration results in improved switching and reverse recovery performance. With appropriate optimization of key design parameters, the SDT-MOS retains the fundamental characteristics of a standard MOSFET. Compared with the conventional DT-MOS, the proposed device suppresses body diode conduction, mitigates bipolar degradation, and achieves a 64.9% reduction in QGD. Switching loss is reduced by 59.3%, and QRR is reduced by 54%. These enhancements make the SDT-MOS a strong candidate for high-efficiency, high-power-density applications.
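As a quick arithmetic check, the percentage reductions reported above follow directly from the quoted values (a sketch over the numbers given in the abstract only; baselines not quoted are left out):

```python
# Verify the reported reductions from the values quoted in the abstract.
metrics = {
    "CGD (pF/cm^2), Fig. 7":       (74.0, 29.0),   # DT-MOS vs. SDT-MOS
    "Eon+Eoff (mJ/cm^2), Fig. 10": (3.88, 1.58),
}
for name, (dt_mos, sdt_mos) in metrics.items():
    reduction = (dt_mos - sdt_mos) / dt_mos * 100
    print(f"{name}: {reduction:.1f}% reduction")
# Prints 60.8% (reported as ~61%) and 59.3%, matching the abstract.
```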
Research on Fast Iterative TDOA Localization Method Based on Spatial Grid Gradients
WANG Jie, WU Linghao, BU Xiangxi, LI Hang, LIANG Xingdong
Available online  , doi: 10.11999/JEIT241105
Abstract:
  Objective  In various application scenarios such as Unmanned Aerial Vehicle (UAV) formation, emergency rescue, and low-altitude intelligent networks, passive localization technologies that offer low latency and high precision are of significant practical value. The Time Difference of Arrival (TDOA) localization method is widely adopted for wireless signal source localization because the receiving system transmits no signal of its own and adapts well to different environments. Among the various methods used to enhance the accuracy of TDOA localization, the Taylor Iterative Method has gained significant popularity. However, this method requires the calculation of a Taylor expansion for each iteration, resulting in a high computational load. This computational burden often leads to issues such as poor real-time performance and degraded accuracy, which hinder the application of TDOA localization technology in low-latency engineering contexts. To overcome these challenges, this paper proposes a novel TDOA rapid iterative localization method based on spatial grid gradients. The proposed method significantly reduces computation time while maintaining high localization accuracy.  Methods  The proposed approach is based on the concept of spatial gridization, incorporating insights derived from the inherent gradient relationships between neighboring grid points. These relationships are leveraged to integrate the grid framework into an iterative compensation model. This integration addresses the performance limitations associated with grid width in traditional gridization algorithms, thereby enhancing the efficiency of the iterative localization process. The overall computational process is divided into two distinct stages: preprocessing and iterative localization. The preprocessing stage occurs during the system's initialization phase and includes constructing the spatial grid, calculating the TDOA gradients between grid points, and establishing the grid-based iterative matrix. Once this preprocessing is complete, the results are stored and readily accessible for future localization processes. During the localization stage, the precomputed iteration matrix is directly invoked, together with an initial value for the target's position. The method then calculates and compensates for the deviation between the initial value and the actual target position. By employing a grid-based approach, the significant computational workload typically encountered during iterative localization is shifted to the preprocessing phase. This leads to a marked reduction in localization time, significantly improving computational efficiency.  Results and Discussions  To validate the effectiveness and performance of the proposed algorithm, simulations and field experiments are conducted. The results are compared with those of the classic spatial gridization algorithm and the Taylor Iterative Method. It is observed that the classic spatial gridization algorithm experiences a significant loss in localization accuracy as the grid width increases, accompanied by a dramatic increase in computation time. In contrast, the proposed algorithm remains unaffected by grid width and outperforms the traditional spatial gridization method in both localization accuracy and computation time (Fig. 2). 
A deeper comparison of the proposed algorithm with the Taylor Iterative Method is made by analyzing the effects of TDOA estimation errors, initial value errors, and iteration thresholds on the performance of both algorithms. Specifically, under varying TDOA estimation errors, the proposed algorithm reduces the average computation time by 76% compared to the Taylor Iterative Method, while maintaining similar localization accuracy (Fig. 3). Under varying initial value errors, the proposed algorithm reduces average computation time by 78%, with comparable localization accuracy (Fig. 4). As the iteration threshold increases, both algorithms experience a slight reduction in localization accuracy; however, their overall performance remains similar. In this scenario, the proposed algorithm still reduces computation time by approximately 76% when compared to the Taylor Iterative Method (Fig. 5). To further verify the applicability of the proposed algorithm in real-world scenarios, field experiments are also conducted. The field test results confirm the validity of the proposed method, demonstrating a 78% reduction in computation time compared to the Taylor Iterative Method, while maintaining comparable localization accuracy (Table 1).  Conclusions  The proposed TDOA fast iterative localization method, based on spatial grid gradients, effectively reduces computational complexity while maintaining localization accuracy. This method is well suited to passive localization applications with stringent real-time requirements, significantly enhancing both the efficiency and practicality of TDOA localization systems. Future work will focus on expanding the applicability of this algorithm by integrating it with other localization techniques, such as Time of Arrival (TOA), Angle of Arrival (AOA), and Frequency Difference of Arrival (FDOA). This integration is expected to facilitate the development of low-altitude economic activities and contribute to advancing the capabilities of localization technologies.
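The precompute-then-iterate idea can be illustrated with a simplified two-dimensional NumPy sketch: the pseudo-inverse of the TDOA Jacobian is stored per grid point during preprocessing, and localization repeatedly applies the stored matrix of the nearest grid point instead of recomputing a Taylor expansion. The sensor layout, grid spacing, and iteration count are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

C = 299_792_458.0  # propagation speed (m/s)
sensors = np.array([[0., 0.], [800., 0.], [0., 800.], [800., 800.]])

def tdoa(p):
    """TDOAs (s) of point p relative to sensor 0 for the remaining sensors."""
    d = np.linalg.norm(sensors - p, axis=1)
    return (d[1:] - d[0]) / C

def jacobian(p):
    u = (p - sensors) / np.linalg.norm(sensors - p, axis=1)[:, None]
    return (u[1:] - u[0]) / C  # d(tdoa)/d(position)

# Preprocessing: store the pseudo-inverse of the TDOA Jacobian at each grid
# point, so no Taylor expansion is recomputed during localization.
grid = np.array([[x, y] for x in range(100, 800, 200)
                        for y in range(100, 800, 200)], float)
pinv_table = [np.linalg.pinv(jacobian(g)) for g in grid]

# Localization: start from an initial value, repeatedly apply the stored
# matrix of the nearest grid point to the measurement residual.
target = np.array([430., 560.])
meas = tdoa(target)
est = np.array([300., 300.])  # crude initial value
for _ in range(8):
    k = np.argmin(np.linalg.norm(grid - est, axis=1))
    est = est + pinv_table[k] @ (meas - tdoa(est))
print("estimate:", est.round(2))  # converges near [430, 560]
```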
Personalized Tensor Decomposition Based High-order Complementary Cloud API Recommendation
SUN Mengmeng, LIU Xiaowei, CHEN Wenhui, SHEN Limin, YOU Dianlong, CHEN Zhen
Available online  , doi: 10.11999/JEIT250003
Abstract:
  Objective  With the emergence of the cloud era in the Internet of Things, cloud Application Programming Interfaces (APIs) have become essential for managing data element dynamics, facilitating AI algorithm implementation, and coordinating access to computing resources. Cloud APIs have developed into critical digital infrastructure that supports the digital economy and the operation of service-oriented software. However, the rapid expansion of cloud APIs has complicated users’ decision-making processes and the promotion of cloud APIs. This situation underscores the urgent need for effective cloud API recommendation methods to foster the development of the API economy and encourage the widespread adoption of cloud APIs. While existing research has focused on modeling invocation preferences, search keywords, or a combination of both to recommend suitable cloud APIs for a given Mashup, it does not address the need for personalized high-order complementary cloud APIs in practical software development. Personalized high-order complementary cloud API recommendation aims to provide developers with APIs that align with their personalized invocation preferences and complement the other APIs in their query set, thereby addressing the developers’ joint interests.  Methods  To address this issue, a Personalized Tensor Decomposition-based High-order Complementary cloud API Recommendation (PTDHCR) method is proposed. First, the invocation relationships between Mashups and cloud APIs, as well as the complementary relationships between cloud APIs, are represented as a three-dimensional tensor. RESCAL tensor decomposition is applied to jointly learn and uncover personalized asymmetric complementary relationships between cloud APIs. Second, a personalized high-order complementary perception network is designed to account for the varying influence of different complementary relationships on recommendations. This network dynamically calculates the attention of a Mashup to the complementary relationships between different query and candidate cloud APIs using the multi-modal features of the Mashup, query cloud APIs, and candidate cloud APIs. Finally, the personalized complementary relationships are extended to higher orders, yielding a comprehensive personalized complementarity between candidate cloud APIs and the query set.  Results and Discussions  Extensive experiments are conducted on two real cloud API datasets. First, PTDHCR is compared with 11 baseline methods suitable for personalized high-order complementary cloud API recommendation. The experimental results (Tables 2 and 3) show that, on the PWA dataset, PTDHCR outperforms the best baseline by 0.12%, 0.14%, 1.46%, and 2.93% in terms of AUC. HR improves by 0.91%, 1.01%, 3.54%, and 10.84%, while RMSE decreases by 0.33%, 0.7%, 1.36%, and 2.67%. PTDHCR also performs well on the HGA dataset, significantly outperforming the baseline methods in AUC, HR@10, and RMSE metrics. Second, experiments are conducted with varying complementary thresholds to evaluate PTDHCR’s performance at different complementary orders. The experimental results (Figures 2 and 3) indicate that PTDHCR’s recommendation performance improves progressively as the complementary order increases. This improvement is attributed to the method’s ability to incorporate more complementary information, thereby enhancing its recommendation capability. 
Next, a comparison experiment is performed to assess whether the personalized high-order complementary perception network can better capture high-order complementary relationships than the mean-value and semantic similarity-based methods. The experimental results (Figures 5 and 6) demonstrate that the personalized high-order complementary perception network outperforms the other methods. This is due to the network’s ability to consider the contribution of different complementary relationships and dynamically compute the Mashup’s attention to each complementary relationship. Finally, an example is provided, evaluating the predicted probability of a Mashup invoking other candidate cloud APIs, given that it has already invoked the “Google Maps API” and the “Google AdSense API.” This example illustrates the personalized nature of the high-order complementary cloud API recommendation achieved by the PTDHCR method.  Conclusions  Existing methods fail to address the actual needs of developers for personalized high-order complementary cloud APIs in the development of service-oriented software. This paper defines the recommendation problem of personalized high-order complementary cloud APIs and proposes a solution. A personalized high-order complementary cloud API recommendation method based on tensor decomposition is introduced. Initially, the invocation relationships between Mashups and cloud APIs, as well as the complementary relationships between cloud APIs, are modeled as a three-dimensional tensor. RESCAL tensor decomposition is then applied to jointly learn and explore the personalized asymmetric complementary relationships. Additionally, a high-order complementary perception network is constructed to dynamically compute Mashups’ attention towards various complementary relationships, which extends these relationships to higher orders. Experimental results show that PTDHCR outperforms state-of-the-art cloud API recommendation methods on real cloud API datasets. PTDHCR offers an effective approach to address the cloud API selection problem and contributes to the healthy development and popularization of the cloud API economy.
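The factorization step can be sketched with a small RESCAL-style model, X_k ≈ A R_k A^T, whose per-slice matrices R_k capture asymmetric relations; the plain gradient-descent updates below are a readable stand-in for the alternating least-squares fits usually used with RESCAL, and the tensor contents, rank, and learning rate are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_api, rank, n_rel = 8, 3, 2  # APIs, latent rank, relation slices

# Toy relational tensor: slice 0 = invocation-derived co-occurrence between
# APIs, slice 1 = asymmetric complementarity (both binary for illustration).
X = (rng.random((n_rel, n_api, n_api)) < 0.25).astype(float)

# RESCAL-style factorization X_k ~ A @ R_k @ A.T, fit by gradient descent.
A = rng.normal(0, 0.1, (n_api, rank))
R = rng.normal(0, 0.1, (n_rel, rank, rank))
lr = 0.01
for _ in range(2000):
    for k in range(n_rel):
        E = A @ R[k] @ A.T - X[k]               # reconstruction error
        gA = E @ A @ R[k].T + E.T @ A @ R[k]    # standard RESCAL gradient
        gR = A.T @ E @ A
        A -= lr * gA
        R[k] -= lr * gR

# Asymmetric complementarity: score(i -> j) need not equal score(j -> i).
scores = A @ R[1] @ A.T
print("score(0->1) vs score(1->0):",
      round(scores[0, 1], 3), round(scores[1, 0], 3))
```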
An Efficient Lightweight Network for Intra-pulse Modulation Identification of Low Probability of Intercept Radar Signals
WANG Xudong, WU Jiaxin, CHEN Binbin
Available online  , doi: 10.11999/JEIT240848
Abstract:
  Objective  Low Probability of Intercept (LPI) radar enhances stealth, survivability, and operational efficiency by reducing the likelihood of detection, making it widely used in military applications. However, accurately analyzing the intra-pulse modulation characteristics of LPI radar signals remains a key challenge for radar countermeasure technologies. Traditional methods for identifying radar signal modulation suffer from poor noise resistance, limited applicability, and high misclassification rates. These limitations necessitate more robust approaches capable of handling LPI radar signals under low Signal-to-Noise Ratios (SNRs). This study proposes an advanced deep learning-based method for LPI radar signal recognition, integrating Hybrid Dilated Convolutions (HDC) and attention mechanisms to improve performance in low SNR environments.  Methods  This study proposes a deep learning-based framework for LPI radar signal modulation recognition. The training dataset covers 12 types of LPI radar signals, including BPSK, Costas, LFM, NLFM, four polyphase codes, and four polytime codes. To enhance model robustness, a comprehensive preprocessing pipeline is applied. Initially, raw signals undergo Smoothed Pseudo-Wigner-Ville Distribution (SPWVD) and Choi-Williams Distribution (CWD) time-frequency analysis to generate two-dimensional time-frequency feature maps. These maps are then processed through grayscale conversion, Wiener filtering for denoising, principal component extraction, and adaptive cropping. A dual time-frequency fusion method is subsequently applied, integrating SPWVD and CWD to enhance feature distinguishability (Fig. 2). Based on this preprocessed data, the model employs a modified GhostNet architecture, Dilated CBAM-GhostNet (DCGNet). This architecture integrates HDC and the Convolutional Block Attention Module (CBAM), optimizing efficiency while enhancing the extraction of spatial and channel-wise information (Fig. 7). HDC expands the receptive field, enabling the model to capture long-range dependencies, while CBAM improves feature selection by emphasizing the most relevant spatial and channel-wise features. The combination of HDC and CBAM strengthens feature extraction, improving recognition accuracy and overall model performance.  Results and Discussions  This study analyzes the effects of different preprocessing methods, network architectures, and computational complexities on LPI radar signal modulation recognition. The results demonstrate that the proposed framework significantly improves recognition accuracy, particularly under low SNR conditions. A comparison of four time-frequency analysis methods shows that SPWVD and CWD achieve higher recognition accuracy (Fig. 8). These datasets are then fused to evaluate the effectiveness of image enhancement techniques. Experimental results indicate that, compared to datasets without image enhancement, the fusion of SPWVD and CWD reduces signal confusion and improves feature discriminability, leading to better recognition performance (Fig. 9). Comparative experiments validate the contributions of HDC and CBAM to recognition performance (Fig. 10). The proposed architecture consistently outperforms three alternative network structures under low SNR conditions, demonstrating the effectiveness of HDC and CBAM in capturing spatial and channel-wise information. Further analysis of three attention mechanisms confirms that CBAM enhances feature extraction by focusing more effectively on relevant time-frequency regions (Fig. 11). 
To comprehensively evaluate the proposed network, its performance is compared with ResNet50, MobileNetV2, and MobileNetV3 using the SPWVD and CWD fusion-based dataset (Fig. 12). The results show that the proposed network outperforms the other three networks under low SNR conditions, confirming its superior recognition capability for low SNR radar signals. Finally, computational complexity and storage requirements are assessed using floating-point operations and parameter count (Table 2). The results indicate that the proposed network maintains relatively low computational complexity and parameter count, ensuring high efficiency and low computational cost. Overall, the proposed deep learning framework improves radar signal recognition performance while maintaining efficiency.  Conclusions  This study proposes a deep learning-based method for LPI radar signal modulation recognition using the DCGNet model, which integrates dilated convolutions and attention mechanisms. The framework incorporates an advanced image enhancement preprocessing pipeline, leveraging SPWVD and CWD time-frequency feature fusion to improve feature distinguishability and recognition accuracy, particularly under low SNR conditions. Experimental results confirm that DCGNet outperforms existing methods, demonstrating its practical potential for radar signal recognition. Future research will focus on optimizing the model further and extending its applicability to a wider range of radar signal types and scenarios.
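A minimal PyTorch sketch of an HDC stack is shown below; the dilation rates (1, 2, 5) follow the common rule of thumb for avoiding gridding artifacts, and treating them, or the channel count, as the paper's exact choices would be an assumption.

```python
import torch
import torch.nn as nn

class HDCBlock(nn.Module):
    """Hybrid Dilated Convolution stack: successive 3x3 convolutions with
    dilation rates 1, 2, 5 enlarge the receptive field without gridding."""
    def __init__(self, channels: int):
        super().__init__()
        self.stack = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in (1, 2, 5)
        ])

    def forward(self, x):
        return self.stack(x)

# A fused SPWVD/CWD time-frequency map with 16 feature channels (toy sizes).
tf_map = torch.randn(1, 16, 224, 224)
print(HDCBlock(16)(tf_map).shape)  # torch.Size([1, 16, 224, 224])
```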
Cooperative Spectrum Sensing Method Against Spectrum Sensing Data Falsification Attacks Based on Multiscale Entropy
WANG Anyi, GONG Jianchao, ZHU Tao
Available online  , doi: 10.11999/JEIT241091
Abstract:
  Objective  With the rapid development of 5G and Internet of Things (IoT) technologies and the increasing number of devices accessing wireless networks, Cognitive Radio (CR) technology offers an effective solution to alleviate spectrum resource scarcity. CR allows Secondary Users (SUs) to perform spectrum sensing and share the Primary User (PU) frequency band. However, Cooperative Spectrum Sensing (CSS) is vulnerable to Spectrum Sensing Data Falsification (SSDF) attacks by Malicious Users (MUs), which degrade sensing performance. Existing anti-SSDF algorithms, while reducing the effects of SSDF attacks, face challenges in detecting MUs under complex attack strategies. This study proposes the use of multiscale entropy to enhance anti-SSDF attack schemes. By updating the reputation value of SU sensing results through multiscale analysis, the detection performance and MU detection rate of the CSS algorithm under various attack strategies are significantly improved. This work provides a solution to the problem of SSDF attacks under complex strategies and offers a theoretical foundation for CSS technology in areas with scarce spectrum resources.  Methods  The multiscale entropy algorithm calculates the reputation value of each SU using a sliding window model. It extracts effective features from the local sensing results of the SU and converts them into weights, which are then used to update the SU’s reputation value. The final reputation value is compared with a threshold to identify MUs. The sliding window model collects SU sensing results from different time slots and computes the reputation value by comparing them with the Fusion Center (FC) judgment results. A normalization function processes the updated reputation value to derive the final global decision. A higher reputation value indicates that the SU’s local sensing result is more reliable, while a lower reputation value suggests the user may be an MU.  Results and Discussions  The multiscale entropy algorithm performs multiscale analysis of the SU sensing results based on a sliding window model. This approach mitigates the impact of MUs on CSS system performance by extracting effective features to counter Independent Attacks (IA) and Collaborative Attacks (CA). Simulation results show that CA affects CSS performance more significantly than IA (Fig. 3, Fig. 4). The proposed algorithm effectively identifies MUs under both attack strategies (Fig. 7, Fig. 9), demonstrating its effectiveness. Additionally, the algorithm exhibits low complexity (Table 1). When the attack probability exceeds 0.4, the MU detection rate improves by an average of 3.56% and 0.77% over the two baseline algorithms under IA, and by 6.45% and 36.92% under CA. These results highlight the strong anti-attack capability of the proposed algorithm.  Conclusions  This paper addresses the SSDF attack problem in CSS. The detection capability for MUs is constrained by the reliability of the global decision. To mitigate this issue, a CSS method based on multiscale entropy is proposed. The method, built on the sliding window model, utilizes multiscale entropy to extract feature information that enhances judgment accuracy, thereby updating the reputation value and improving the global decision. Simulation results demonstrate that the proposed algorithm exhibits strong resistance to SSDF attacks and performs well in MU detection under both IA and CA, with lower complexity. 
This approach is particularly suitable for regions with scarce spectrum resources, ensuring the reliable operation of CR systems and enabling efficient spectrum utilization by effectively identifying MUs. Future work will explore the application of deep learning techniques in CSS against SSDF attacks, aiming to further enhance network performance.
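The sliding-window reputation idea can be sketched in NumPy: each SU's agreement rate with the fusion center's majority decision over the recent window serves as its reputation, and low-reputation users are flagged. The agreement-rate update is a simplified stand-in for the entropy-weighted update in the paper, and the 0.7 trust threshold and attack parameters are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n_su, n_slots, window = 10, 200, 20
malicious = np.array([0, 1])                 # toy ground-truth MU indices
p_attack = 0.6                               # probability of flipping a report

pu_state = rng.integers(0, 2, n_slots)       # true PU occupancy per slot
reports = np.tile(pu_state, (n_su, 1))
honest_err = rng.random((n_su, n_slots)) < 0.1   # 10% honest sensing errors
reports = np.where(honest_err, 1 - reports, reports)
flip = rng.random((len(malicious), n_slots)) < p_attack
reports[malicious] = np.where(flip, 1 - reports[malicious], reports[malicious])

# Sliding-window reputation: agreement rate with the FC majority decision.
fc = (reports.sum(axis=0) > n_su / 2).astype(int)    # majority fusion
rep = np.array([(reports[i, -window:] == fc[-window:]).mean()
                for i in range(n_su)])
print("reputations:", rep.round(2))
print("flagged MUs:", np.where(rep < 0.7)[0])
```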
Dynamic Spectrum Access Algorithm for Evaluating Spectrum Stability in Cognitive Vehicular Networks
MA Bin, YANG Zumin, XIE Xianzhong
Available online  , doi: 10.11999/JEIT240927
Abstract:
  Objective  With the exponential growth of vehicle terminals and the widespread adoption of cognitive vehicular network applications, the existing licensed spectrum resources are inadequate to meet the communication demands of Cognitive Vehicular Networks (CVN). The rapid development of CVN and the increasing complexity of vehicular communication scenarios have intensified spectrum resource scarcity. Dynamic Spectrum Access (DSA) technology has emerged as a key solution to alleviate this scarcity by enabling efficient use of underutilized spectrum bands. While current DSA algorithms ensure basic spectrum utilization, they struggle to comprehensively evaluate spectrum stability and meet the differentiated stability requirements of vehicular network applications. For example, safety-critical applications such as collision avoidance systems demand ultra-reliable, low-latency communication, while infotainment applications prioritize high throughput. This paper proposes a novel framework integrating spectrum stability assessment with deep reinforcement learning. The framework constructs a multi-dimensional parameter-based model for spectrum stability, designs a reinforcement learning architecture incorporating gated mechanisms and dueling neural networks, and establishes a dynamically adaptive reward function to enable intelligent spectrum resource allocation. This research offers a solution for vehicular network spectrum management that combines theoretical depth with practical engineering value, paving the way for more reliable and efficient vehicular communication systems.  Methods  This study employs an integrated approach to address the spectrum allocation challenges in CVN. A time-series prediction model is developed using Long Short-Term Memory (LSTM) neural networks, which leverage three-dimensional time-series data of Signal-to-Noise Ratio (SNR), Received Signal Strength (RSS), and bandwidth to make multi-step predictions for future cycles. The rate of change for each parameter is calculated as a stability evaluation metric, providing a quantitative measure of spectrum stability. To ensure consistency in the evaluation process, the rate of change for each parameter is normalized using Min-Max normalization, and the standardized results are input into the K-Means algorithm for stability clustering of the rate-of-change vectors. By calculating the centroid coordinates of each cluster and their norms, a stability index is derived, forming the stability assessment model. Building upon the Deep Q-Network (DQN), a Gated Recurrent Unit (GRU) is introduced to create a temporal state encoder that captures the temporal dependencies in spectrum data. Additionally, a Dueling Network architecture is employed to decouple the state value and action advantage functions, enabling more accurate estimation of the long-term value of spectrum allocation decisions. The reward function incorporates trade-off coefficients to achieve a reasonable allocation of spectrum resources with different stability levels, ensuring a balance between spectrum utilization and collision probability while meeting the diverse stability requirements of vehicular network applications. The proposed framework is designed to be scalable and adaptable to various vehicular network scenarios, including urban, highway, and rural environments.  Results and Discussions  Simulation results show that the optimized stepwise prediction algorithm significantly improves performance. 
In both the training and test sets, the algorithm achieves a Mean Squared Error (MSE) of less than 0.1, with no significant overfitting observed (Fig. 5, Fig. 6). This indicates that the algorithm generalizes well to unseen data, making it suitable for real-world deployment. Additionally, the loss function of the proposed algorithm decreases significantly as the number of iterations increases, converging around 150 iterations (Fig. 7). The prediction accuracy also stabilizes around 150 iterations (Fig. 8), suggesting that the algorithm achieves consistent performance within a reasonable training period. These results demonstrate that the proposed prediction algorithm can deliver high-accuracy multi-step predictions for stability parameters across a sufficient number of channels, providing a solid foundation for spectrum stability assessment. Furthermore, the proposed access algorithm consistently outperforms comparative algorithms in terms of spectrum utilization over 20 iterations, while maintaining lower collision probabilities (Fig. 9, Fig. 10). As the number of iterations increases, the cumulative stability index and throughput of the proposed algorithm steadily improve, exceeding the performance of comparative algorithms at all stages. This demonstrates that the proposed algorithm can meet the diverse requirements of vehicle terminals for channel stability and throughput, while ensuring high spectrum utilization and low collision probability. As the number of vehicle terminals increases, the proposed algorithm exhibits faster convergence compared to other algorithms, confirming its robustness in large-scale scenarios. These findings highlight the potential of the proposed framework to meet the growing demands of next-generation vehicular networks.  Conclusions  This study proposes an integrated "evaluation-decision-optimization" spectrum management paradigm for CVN. By proposing a multi-dimensional time-series feature-based spectrum stability quantification framework and designing a hybrid deep reinforcement learning architecture incorporating gated mechanisms and dueling networks, the research addresses the critical challenge of balancing spectrum efficiency with stability in dynamic vehicular environments. The development of an interpretable reward function enables intelligent spectrum allocation that adapts to diverse quality-of-service requirements, ensuring that both safety-critical and non-safety-critical applications receive the necessary resources. Experimental results show significant improvements in spectrum utilization, collision probability, and system throughput compared to traditional approaches, while maintaining robust performance in large-scale scenarios. These findings advance the theoretical understanding of spectrum management in CVN and provide a practical framework for implementing adaptive DSA solutions in next-generation intelligent transportation systems. Future research will explore extending the proposed framework to support multi-agent scenarios, where multiple vehicles and infrastructure nodes collaboratively optimize spectrum allocation. Additionally, integrating edge computing and federated learning techniques could further enhance the scalability and efficiency of the framework. The proposed methodology offers a scalable and efficient approach to spectrum resource allocation, paving the way for more reliable and high-performance vehicular communication networks.
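The stability assessment step, normalizing the parameter rates of change and clustering channels so that each centroid's norm serves as a stability index, can be sketched with scikit-learn; the random rate-of-change data, the three-cluster setting, and the reading of a smaller norm as a more stable channel group are assumptions of this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(4)

# Per-channel rate-of-change vectors for (SNR, RSS, bandwidth), e.g. derived
# from LSTM multi-step predictions; random values stand in for real data.
rates = np.abs(rng.normal(0, 1, (30, 3)))

# Min-Max normalize each parameter's rate of change, then cluster channels;
# the centroid norm is the cluster's instability index (smaller = stabler).
scaled = MinMaxScaler().fit_transform(rates)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaled)
instability = np.linalg.norm(km.cluster_centers_, axis=1)

for c in np.argsort(instability):
    members = np.where(km.labels_ == c)[0]
    print(f"cluster {c}: index={instability[c]:.2f}, "
          f"channels={members.tolist()}")
```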
Joint Design of Transmission Sequences and Receiver Filters Based on the Generalized Cross Ambiguity Function
WEN Cai, WEN Shu, ZHANG Xiang, XIAO Hao, LI Zhangping
Available online  , doi: 10.11999/JEIT240905
Abstract:
  Objective  A set of orthogonal waveforms with favorable correlation properties enhances target detection and anti-jamming performance in Multiple-Input Multiple-Output (MIMO) radar systems. Jointly designing the transmit sequence set and receive filter bank introduces additional degrees of freedom, reducing auto- and cross-correlation. However, research on their joint design based on the Generalized Cross Ambiguity Function (GCAF) is limited and primarily focuses on reducing the peak sidelobe level. Since a low Integrated Sidelobe Level (ISL) is also critical for radar imaging and target detection, this study formulates the joint design problem with the objective of minimizing the ISL of the GCAF, subject to mainlobe gain and dynamic range constraints.  Methods  This paper proposes a Maximum Block Improvement–Successive Convex Approximation (MBI-SCA) method for the nonconvex optimization problem involving High-Order Polynomials (HOP). The MBI algorithm decomposes the nonconvex problem into multiple subproblems, which are then solved iteratively using the SCA method. To further reduce computational cost, an Alternating Direction Penalty Method (ADPM) is introduced. This algorithm, which supports parallel implementation, dynamically updates the penalty factor in each iteration, ensuring the penalty term gradually converges to zero. This guarantees algorithm convergence and accelerates the search for a better feasible solution.  Results and Discussions  The proposed MBI-SCA algorithm converges in approximately 12 iterations, while the MBI-ADPM algorithm achieves faster convergence in about 10 iterations (Fig. 1). The running time of the MBI-ADPM algorithm increases as the sequence length varies from 16 to 512, whereas the MBI-SCA algorithm exhibits an overall increase with a dip at a length of 2^8 = 256, likely due to a decrease in the number of iterations when the SCA method solves the subproblem (Fig. 2). Both algorithms demonstrate strong performance, with GCAF values in the locally optimized region significantly lower than those in the unoptimized region, all below –200 dB (Fig. 3). However, MBI-ADPM achieves better local optimization, reducing GCAF values to –320 dB, whereas MBI-SCA reaches only –260 dB (Fig. 4). The parameter K determines the range of the optimization interval; as K increases from 5 to 35, the ISL values of both methods also increase. For MBI-SCA, the optimal range of K is 5 ≤ K ≤ 15, where the integrated sidelobe levels remain below –50 dB, meeting the low sidelobe requirement. In contrast, MBI-ADPM performs best when K is 5 or 10, achieving an objective function value close to –300 dB (Fig. 7).  Conclusions  This paper proposes a joint design method for transmit waveforms and receive filters that minimizes the GCAF ISL under mainlobe gain and dynamic range constraints, addressing the reduction of integrated auto- and cross-correlation sidelobe levels in MIMO radar waveform sets. To solve the quartic nonconvex optimization problem, the original problem is first decomposed into manageable subproblems using the MBI algorithm, which are then solved iteratively with the SCA algorithm. To further reduce computational complexity, the ADPM algorithm is introduced to solve the SCA subproblems. 
Simulation results demonstrate that the MBI-ADPM algorithm converges faster and achieves a lower ISL than MBI-SCA for shorter distance intervals of interest.
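As an illustration of the design objective only, the following sketch (toy parameters, plain correlations rather than the paper's GCAF with receive filters) computes the integrated sidelobe level of a waveform set's auto- and cross-correlations, the quantity the joint design drives down:

```python
# Toy ISL computation for a waveform set: sums the squared autocorrelation
# sidelobes and cross-correlation levels, without the receive filter bank or
# the GCAF generalization used in the paper.
import numpy as np

def correlation_isl(waveforms: np.ndarray) -> float:
    """waveforms: (M, N) complex array of M unit-modulus length-N sequences."""
    M, N = waveforms.shape
    isl = 0.0
    for m in range(M):
        for p in range(M):
            r = np.correlate(waveforms[m], waveforms[p], mode="full")
            if m == p:
                r[N - 1] = 0.0  # exclude the zero-lag mainlobe of autocorrelations
            isl += np.sum(np.abs(r) ** 2)
    return isl

rng = np.random.default_rng(0)
M, N = 4, 64
X = np.exp(1j * 2 * np.pi * rng.random((M, N)))  # constant-modulus phase codes
print(f"ISL of a random set: {10 * np.log10(correlation_isl(X)):.1f} dB")
```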
A High-quality Factor Mode-localized MEMS Electric Field Sensor
WANG Guijie, CHU Zhaozhi, YANG Pengfei, RAN Lifang, PENG Chunrong, LI Jianhua, ZHANG Bo, WEN Xiaolong
Available online  , doi: 10.11999/JEIT241008
Abstract:
  Objective  High-performance Micro-Electro-Mechanical Systems (MEMS) Electric Field Sensors (EFS) are essential for measuring atmospheric electric fields and non-contact voltage. The mode localization effect can significantly improve resolution and is a recent focus in EFS research. However, in weakly coupled resonant systems, mode aliasing occurs when the quality factor is low, hindering the extraction of valid amplitude information. This study proposes a novel resonant MEMS EFS based on mode localization. The sensor employs a Double-Ended Tuning Fork (DETF) structure and a T-shaped tether to minimize energy loss, achieving a high quality factor and resolution while effectively mitigating mode aliasing. This study presents theoretical analysis and numerical simulations. A prototype is fabricated and tested at a pressure of $10^{-3}\;\mathrm{Pa}$. Experimental results demonstrate that within an electric field range of $0 \sim 90\;\mathrm{kV/m}$, the EFS exhibits a resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$ and a quality factor of 42,423.  Methods  The sensor comprises two coupled resonators based on a tuning fork and T-shaped tether structure. It utilizes the principle of mode localization and an amplitude ratio output metric to enhance electric field sensing performance and prevent mode aliasing. The primary measurement principle is based on the transmission of induced charge from the electric field sensing electrode to the perturbed electrode of Resonator 1 through an electrical connection. This perturbed electrode generates a negative electrostatic perturbation, inducing mode localization in the coupled resonators. The resulting change in the amplitude ratio enables electric field detection. Furthermore, the tuning fork and T-shaped tether structure are designed to minimize clamping and anchor losses, thereby achieving a high quality factor and effectively mitigating mode aliasing.  Results and Discussions  This study presents a mode-localized MEMS EFS that achieves a high quality factor of 42,423 and a high resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$, effectively preventing mode aliasing. Experiments are conducted in a vacuum chamber at a pressure of $10^{-3}\;\mathrm{Pa}$. The vacuum environment leads to heat accumulation from the amplifiers on the circuit board, increasing the board’s temperature and causing temperature drift in the sensor. Temperature drift is identified as the primary source of error in sensor testing. Future work will focus on testing the sensor chip with vacuum packaging to mitigate temperature drift caused by the vacuum chamber. Further optimization of the chip and circuit structures will be carried out to minimize the effects of feedthrough and parasitic capacitance. Additionally, a differential structure will be designed to enhance common-mode rejection.  Conclusions  This study addresses mode aliasing in weakly coupled structures by proposing a mode-localized EFS based on a DETF and a T-shaped tether design. The DETF reduces clamping losses, while the T-shaped tether minimizes anchor losses. These structural optimizations reduce energy dissipation, enhance the quality factor, and effectively mitigate mode aliasing.
The structural design, working principle, and sensitivity characteristics of the sensor are analyzed through numerical simulations, demonstrating that a lower quality factor under the same coupling strength can induce mode aliasing. The sensor fabrication process is introduced, and a prototype is developed. A testing system is established to evaluate the sensor’s performance in both open-loop and closed-loop configurations. Experimental results indicate that under a pressure of $10^{-3}\;\mathrm{Pa}$ and within an electric field range of $0 \sim 90\;\mathrm{kV/m}$, the sensor achieves a quality factor of 42,423, a resolution of $32\;(\mathrm{V/m})/\sqrt{\mathrm{Hz}}$, and a sensitivity of $0.0336\;/(\mathrm{kV/m})$. The sensor demonstrates a high quality factor and excellent electric field resolution while effectively mitigating mode aliasing in mode-localized sensors. This work provides valuable insights for EFS research and the structural design of mode-localized sensors.
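For readers unfamiliar with the output metric, the following toy two-degree-of-freedom model (arbitrary stiffness values, not the authors' device parameters) shows how a small electrostatic stiffness perturbation shifts the mode-shape amplitude ratio that the sensor reads out:

```python
# Toy mode-localization illustration: two weakly coupled unit-mass resonators.
# A negative stiffness perturbation dk on resonator 1 (electrostatic spring
# softening) shifts the amplitude ratio of the lower mode, the sensor output.
import numpy as np

def amplitude_ratio(k=1.0, kc=1e-3, dk=0.0):
    K = np.array([[k + kc + dk, -kc],
                  [-kc,          k + kc]])   # stiffness matrix of the 2-DOF system
    _, V = np.linalg.eigh(K)                 # mode shapes (columns)
    mode = V[:, 0]                           # lower mode
    return abs(mode[0] / mode[1])

for dk in [0.0, -1e-5, -5e-5, -1e-4]:
    print(f"dk = {dk: .0e}  ->  amplitude ratio = {amplitude_ratio(dk=dk):.4f}")
```

With weak coupling (small kc), even a tiny perturbation produces a large ratio change, which is why the metric is so sensitive; the paper's contribution is keeping the quality factor high enough that the two mode peaks remain separable.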
Patch-based Adversarial Example Generation Method for Multi-spectral Object Tracking
MA Jiayi, XIANG Xinyu, YAN Qinglong, ZHANG Hao, HUANG Jun, MA Yong
Available online  , doi: 10.11999/JEIT240891
Abstract:
  Objective   Current research on tracker-oriented adversarial sample generation primarily focuses on the visible spectral band, leaving a gap in addressing multi-spectral conditions, particularly the infrared spectrum. To address this, this study proposes a novel patch-based adversarial sample generation framework for multi-spectral object tracking. By integrating adversarial texture generation modules and adversarial shape optimization strategies, the framework disrupts the tracking model’s interpretation of target textures in the visible spectrum and impairs the extraction of thermal salient features in the infrared spectrum, respectively. Additionally, tailored loss functions, including mis-regression loss, mask interference loss, and maximum feature discrepancy loss, guide the generation of adversarial patches, leading to the expansion or deviation of tracking prediction boxes and weakening the correlation between template and search frames in the feature space. Research on adversarial sample generation contributes to the development of robust object tracking models resistant to interference in practical scenarios.  Methods   The proposed framework integrates two key components. A Generative Adversarial Network (GAN) synthesizes texture-rich patches to interfere with the tracker’s semantic understanding of target appearance. This module employs upsampling layers to generate adversarial textures that disrupt the tracker’s ability to recognize and localize targets in the visible spectrum. A deformable patch algorithm dynamically adjusts geometric shapes to disrupt thermal saliency features. By optimizing the length of radial vectors, the algorithm generates adversarial shapes that interfere with the tracker’s extraction of thermal salient features, which are critical for infrared object tracking. Tailored loss functions are designed for different trackers. Mis-regression loss and mask interference loss guide attacks on region-proposal-based trackers (e.g., SiamRPN) and mask-guided trackers (e.g., SiamMask), respectively. These losses mislead the regression branches of region-proposal-based trackers and degrade the mask prediction accuracy of mask-guided trackers. Maximum feature discrepancy loss reduces the correlation between template and search features in deep representation space, further weakening the tracker’s ability to match and track targets. The adversarial patches are generated through iterative optimization of these losses, ensuring cross-spectral attack effectiveness.  Results and Discussions   Experimental results validate the method’s effectiveness. In the visible spectrum, the proposed framework achieves attack success rates of 81.57% (daytime) and 81.48% (night) against SiamRPN, significantly outperforming state-of-the-art methods PAT and MTD (Table 1). For SiamMask, success rates reach 53.65% (day) and 52.77% (night), demonstrating robust performance across different tracking architectures (Fig. 3). In the infrared spectrum, the method attains attack success rates of 71.43% (day) and 81.08% (night) against SiamRPN, exceeding the HOTCOLD method by more than 30% (Table 2). For SiamMask, the success rates reach 65.95% (day) and 65.85% (night), highlighting the effectiveness of the adversarial shape optimization strategy in disrupting thermal salient features. Multi-scene robustness is further demonstrated through qualitative results (Fig. 4),
which show consistent attack performance across diverse environments, including roads, grasslands, and playgrounds under varying illumination conditions. Ablation studies confirm the necessity of each loss component. The combination of mis-regression and feature discrepancy losses improves the SiamRPN attack success rate to 75.95%, while the mask and feature discrepancy losses enhance SiamMask attack success to 65.91% (Table 3). Qualitative and quantitative experiments demonstrate that the adversarial samples proposed in this study effectively increase attack success rates against trackers in multi-spectral environments. These results highlight the framework’s ability to generate highly effective adversarial patches across both visible and infrared spectra, offering a comprehensive solution for multi-spectral object tracking security.   Conclusions   This study addresses the gap in multi-spectral adversarial attacks on object trackers by proposing a novel patch-based adversarial example generation framework. The method integrates a texture generation module for visible-spectrum attacks and a shape optimization strategy for thermal infrared interference, effectively disrupting trackers’ reliance on texture semantics and thermal salient features. By designing task-specific loss functions, including mis-regression loss, mask interference loss, and maximum feature discrepancy loss, the framework enables precise attacks on both region-proposal and mask-guided trackers. Experimental results demonstrate the adversarial patches’ strong cross-spectral transferability and environmental robustness, causing trackers to deviate from targets or produce excessively enlarged bounding boxes. This work not only advances multi-spectral adversarial attacks in object tracking but also provides insights into improving model robustness against real-world perturbations. Future research will explore dynamic patch generation and extend the framework to emerging transformer-based trackers.
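The deformable-patch idea can be sketched as follows; the radii below are random placeholders, whereas the paper optimizes them against the tracker losses:

```python
# Sketch of the radial-vector shape parameterization only: a patch shape is
# defined by K radial-vector lengths from a center and rasterized to a binary
# mask. In the paper these lengths are the optimization variables of the
# adversarial shape; here they are random, just to show the parameterization.
import numpy as np

def radial_shape_mask(radii: np.ndarray, size: int = 64) -> np.ndarray:
    """radii: (K,) radial-vector lengths in pixels, one per angular sector."""
    K = len(radii)
    yy, xx = np.mgrid[0:size, 0:size]
    cx = cy = size / 2.0
    r = np.hypot(xx - cx, yy - cy)
    theta = np.arctan2(yy - cy, xx - cx) % (2 * np.pi)
    sector = (theta / (2 * np.pi) * K).astype(int) % K
    return (r <= radii[sector]).astype(np.uint8)   # 1 inside the shape

rng = np.random.default_rng(1)
mask = radial_shape_mask(rng.uniform(10, 30, size=16))
print(mask.sum(), "patch pixels")                   # area of the adversarial shape
```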
Error State Kalman Filter Multimodal Fusion SLAM Based on MICP Closed-loop Detection
CHEN Dan, CHEN Hao, WANG Zichen, ZHANG Heng, WANG Changqing, FAN Lintao
Available online  , doi: 10.11999/JEIT240980
Abstract:
  Objective   Single-sensor Simultaneous Localization and Mapping (SLAM) technology has limitations, including large mapping errors and poor performance in textureless or low-light environments. Laser-based 2D-SLAM performs poorly in natural and dynamic outdoor environments and cannot capture object information below the scanning plane of a single-line LiDAR. Additionally, long-term operation leads to cumulative errors from sensor noise and model inaccuracies, significantly affecting positioning accuracy. This study proposes an Error State Kalman Filter (ESKF) multimodal tightly coupled 2D-SLAM algorithm based on LiDAR MICP closed-loop detection to enhance environmental information acquisition, trajectory estimation, and relative pose estimation. The proposed approach improves SLAM accuracy and real-time performance, enabling high-precision and complete environmental mapping in complex real-world scenarios.   Methods   Firstly, sensor data is spatiotemporally synchronized, and the LiDAR point cloud is denoised. MICP matching closed-loop detection, including initial ICP matching, sub-ICP matching, and key ICP matching, is then applied to optimize point cloud matching. Secondly, an odometry error model and a point cloud matching error model between LiDAR and machine vision are constructed. Multi-sensor data is fused using the ESKF to obtain more accurate pose error values, enabling real-time correction of the robot’s pose. Finally, the proposed MICP-ESKF SLAM algorithm is compared with several classic SLAM methods in terms of closed-loop detection accuracy, processing time, and robot pose accuracy under different data samples, and is experimentally validated on the Turtlebot2 robot platform.   Results and Discussions   This study addresses the reduced accuracy of 2D grid maps due to accumulated odometry errors in large-scale mobile robot environments. To overcome the limitations of visual and laser SLAM, the paper examines LiDAR Multi-layer Iterative Closest Point (MICP) matching closed-loop detection and proposes a visual-laser odometry tightly coupled SLAM method based on the ESKF. The SLAM algorithm incorporating MICP closed-loop detection achieves higher accuracy than the Cartographer algorithm on the test set. Compared to the Karto algorithm, the proposed MICP-ESKF-SLAM algorithm shows significant improvements in detection accuracy and processing speed. As shown in Table 2, the multimodal MICP-ESKF-SLAM algorithm has the lowest median relative pose error, approximately 3% of that of the Gmapping algorithm. The average relative pose error is reduced by about 40% compared to the MICP-SLAM algorithm, demonstrating the advantages of the proposed approach in high-precision positioning. Furthermore, multi-sensor fusion via ESKF effectively reduces cumulative errors caused by frequency discrepancies and sensor noise, ensuring timely robot pose updates and preventing map drift.  Conclusions   This study proposes a 2D-SLAM algorithm that integrates MICP matching closed-loop detection with ESKF. By estimating errors, optimizing state updates, and applying corrected increments to the main state, the approach mitigates cumulative drift caused by random noise and internal error propagation in dynamic environments. This enhances localization and map construction accuracy while improving the real-time performance of multi-sensor tightly coupled SLAM.
The ESKF multi-sensor tightly coupled SLAM algorithm based on multi-layer ICP matching closed-loop detection is implemented on the Turtlebot2 experimental platform for large-scale scene localization and mapping. Experimental results demonstrate that the proposed algorithm effectively integrates LiDAR and machine vision data, achieving high-accuracy robot pose estimation and stable performance in dynamic environments. It enables the accurate construction of a complete, drift-free environmental map, addressing the challenges of 2D mapping in complex environments that single-sensor SLAM algorithms struggle with, thereby providing a foundation for future research on intelligent mobile robot navigation.
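As a minimal sketch of the ESKF machinery (a 1-D pose/velocity toy with made-up noise values, not the paper's odometry and MICP measurement models): the nominal state is propagated by dead reckoning, the error state is estimated from an external position fix, then injected into the nominal state and reset.

```python
# Minimal error-state Kalman filter toy: estimate the *error* of a dead-
# reckoned nominal state from noisy position fixes, then inject and reset.
import numpy as np

dt = 0.1
F = np.array([[1, dt], [0, 1]])        # error-state transition
Q = np.diag([1e-4, 1e-3])              # process noise
H = np.array([[1.0, 0.0]])             # position-only measurement
R = np.array([[0.05]])                 # measurement noise

x_nom = np.array([0.0, 1.0])           # nominal [position, velocity]
P = np.eye(2) * 1e-2                   # covariance of the error state

rng = np.random.default_rng(0)
true_pos = 0.0
for _ in range(50):
    true_pos += 1.05 * dt              # true velocity differs from the model
    x_nom = F @ x_nom                  # nominal propagation (dead reckoning)
    P = F @ P @ F.T + Q                # error-state covariance propagation
    z = true_pos + rng.normal(0, 0.05) # e.g. a matched-scan position fix
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    dx = (K @ (z - H @ x_nom)).ravel() # estimated error state
    x_nom = x_nom + dx                 # inject the correction
    P = (np.eye(2) - K @ H) @ P        # error reset
print(f"final pose {x_nom[0]:.3f} vs truth {true_pos:.3f}")
```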
A Cybersecurity Entity Recognition Approach Based on Character Representation Learning and Temporal Boundary Diffusion
HU Ze, LI Wenjun, YANG Hongyu
Available online  , doi: 10.11999/JEIT240953
Abstract:
  Objective  The vast amount of unstructured cybersecurity information available online holds significant value. Named Entity Recognition (NER) in cybersecurity facilitates the automatic extraction of such information, providing a foundation for cyber threat analysis and knowledge graph construction. However, existing cybersecurity NER research remains limited, primarily relying on general-purpose approaches that struggle to generalize effectively to domain-specific datasets, often resulting in errors when recognizing cybersecurity-specific terms. Some recent studies decompose the NER task into entity boundary detection and entity classification, optimizing these subtasks separately to enhance performance. However, the representation of complex cybersecurity entities often exceeds the capability of single-feature semantic representations, and existing boundary detection methods frequently produce misjudgments. To address these challenges, this study proposes a cybersecurity entity recognition approach based on character representation learning and temporal boundary diffusion. The approach integrates character-level feature extraction with a boundary diffusion network based on a denoising diffusion probabilistic model. By focusing on optimizing entity boundary detection, the proposed method improves performance in cybersecurity NER tasks.  Methods  The proposed approach divides the NER task into two subtasks: entity boundary detection and entity classification, which are processed independently, as illustrated (Fig. 1). For entity boundary detection, a Question-Answering (QA) framework is adopted. The framework first generates questions about the entities to be extracted, concatenates them with the corresponding input sentences, and encodes them using a pre-trained BERT model to extract preliminary semantic features. Character-level feature extraction is then performed using a Dilated Convolutional Residual Character Network (DCR-CharNet), which processes character-level information through dilated residual blocks. Dilated convolution expands the model’s receptive field, capturing broader contextual information, while a self-attention mechanism dynamically identifies key features. These components enhance the global representation of input data and provide multi-dimensional feature representations. A Temporal Boundary Diffusion Network (TBDN) is then applied for entity boundary detection. TBDN employs a fixed forward diffusion process that introduces Gaussian noise to entity boundaries at each time step, progressively blurring them. A learnable reverse diffusion process subsequently predicts and removes noise at each time step, enabling the gradual recovery of accurate entity boundaries and leading to precise boundary detection. For entity classification, an independent network is trained to assign labels to detected entities. Like boundary detection, this subtask also adopts a QA framework. A cybersecurity-specific pre-trained language model, SecRoBERTa, encodes the concatenated question and input data to extract entity classification features. These features are then processed through a linear-layer-based entity classifier, which outputs the recognized entity type.  Results and Discussions   The performance of the proposed approach is evaluated on the DNRTI cybersecurity dataset, with comparative results against baseline methods presented (Table 3). 
The proposed approach achieves a 0.40% improvement in F1-score over UTERMMF, a model incorporating character-level, part-of-speech, and positional features along with inter-word relationship classification. Compared to CTERMRFRAT, which employs an adversarial training framework, the proposed approach improves the F1-score by 1.65%. Additionally, it outperforms BERT+BiLSTM+CRF by 5.20% and achieves gains of 12.21%, 17.90%, and 18.31% over BERT, CNN+BiLSTM+CRF, and IDCNN+CRF, respectively. These results highlight that boundary detection accuracy is a key factor limiting NER performance, and optimizing boundary detection methods can significantly enhance overall model effectiveness. The proposed approach’s emphasis on boundary detection enables more accurate identification of entity boundaries, contributing to higher F1-scores. However, in terms of accuracy, it slightly underperforms CNN+BiLSTM+CRF. This discrepancy is attributed to class imbalance in the dataset, where certain categories are overrepresented while others are underrepresented. The approach demonstrates strong performance in handling minority categories, but its focus on rare entities slightly reduces prediction accuracy for common categories, affecting overall accuracy. Despite this trade-off, the approach enhances entity boundary detection, reducing misidentifications and improving precision and recall, thereby increasing the F1-score. Errors in boundary detection may propagate to the entity classification stage, impacting overall accuracy. However, the proposed two-stage approach, which prioritizes boundary detection optimization, ensures more precise boundary identification, which is crucial for improving NER performance. In terms of computational efficiency, the proposed approach is compared with DiffusionNER (Table 4), another diffusion-based NER model. Results indicate that the proposed approach requires fewer parameters, achieves faster inference speeds, and delivers higher F1-scores under the same hardware and software conditions.  Conclusions  Enhancing boundary detection efficiency significantly improves NER performance. The proposed approach reduces resource consumption while achieving superior performance compared to recent baseline methods in cybersecurity NER tasks.
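A stripped-down numerical sketch of the boundary-diffusion idea follows; the noise schedule is arbitrary, the reverse pass is deterministic (DDIM-style) for brevity, and an oracle stands in for TBDN's learned noise predictor:

```python
# Toy boundary diffusion: entity span boundaries, normalized to [0, 1], are
# blurred by a fixed forward Gaussian process; a learned network (TBDN in the
# paper) would predict the noise so the reverse pass recovers sharp boundaries.
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.05, T)
alphas = 1.0 - betas
abar = np.cumprod(alphas)

rng = np.random.default_rng(0)
x0 = np.array([0.20, 0.45])            # normalized (start, end) of a gold span

eps = rng.normal(size=x0.shape)        # forward: x_T = sqrt(abar)x0 + sqrt(1-abar)eps
x = np.sqrt(abar[-1]) * x0 + np.sqrt(1 - abar[-1]) * eps

for t in reversed(range(T)):           # deterministic reverse pass
    eps_hat = eps                      # oracle stand-in for the learned predictor
    x0_hat = (x - np.sqrt(1 - abar[t]) * eps_hat) / np.sqrt(abar[t])
    abar_prev = abar[t - 1] if t > 0 else 1.0
    x = np.sqrt(abar_prev) * x0_hat + np.sqrt(1 - abar_prev) * eps_hat

print("recovered boundaries:", np.round(x, 3))   # -> [0.2, 0.45]
```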
Networking and Resource Allocation Methods for Opportunistic UAV-assisted Data Collection
SUN Weihao, WANG Hai, QIN Zhen, QU Yuben
Available online  , doi: 10.11999/JEIT241053
Abstract:
  Objective  Unmanned Aerial Vehicles (UAVs) tasked with customized operations, such as environmental monitoring and intelligent logistics, are referred to as opportunistic UAVs. While traversing the task area, these UAVs can be leveraged by ground nodes in regions that are either uncovered or heavily loaded, enabling the UAVs to function as data storage. This reduces the operational costs associated with deploying dedicated UAVs for data collection. In practice, however, the flight paths of opportunistic UAVs are uncontrolled, and the data-uploading capabilities of ground nodes in various regions vary. To enhance efficiency, ground nodes can actively form a network, pre-aggregate data, and allocate resources to cluster head nodes located advantageously for data transmission. Despite extensive research into networking technologies, two key challenges remain. First, existing studies predominantly focus on static networking strategies, overlooking the reliability of data aggregation in mobile scenarios. Ground nodes involved in tasks such as emergency response, disaster relief, or military reconnaissance may exhibit mobility. The dynamic topology of these mobile nodes, coupled with non-line-of-sight transmission path loss and severe signal fading, creates substantial challenges for reliable transmission, leading to bit errors, packet losses, and retransmissions. Therefore, mobile ground nodes must dynamically adjust their subnet data transmission strategies based on the time-varying relative distances between cluster members and heads. Second, most studies focus on data aggregation capacity within subnets but fail to consider the uploading capabilities of cluster heads. In opportunistic communication scenarios, where UAV flight paths are uncontrolled, the data-uploading capacity of each subnet is constrained by the minimum of the data collected, aggregation capacity, and uploading capability. Therefore, effective networking strategies for opportunistic UAV-assisted data collection must account for the relationships between cluster members, cluster heads, and UAVs. Coordinated resource allocation and subnet formation strategies are essential to improving system performance. In summary, exploring networking and resource allocation methods for opportunistic UAV-assisted data collection is of significant practical importance.  Methods  Due to the interdependent nature of the subnet data transmission, resource allocation, and formation strategies, the problem presents a large state space that is difficult to solve directly. To address this, a decomposition approach is applied. First, given the subnet formation strategy, the paper sequentially derives the closed-form solutions for the subnet data transmission and resource allocation strategies, significantly simplifying the original problem. Next, the subnet formation subproblem is modeled as a formation game. An altruistic networking criterion is proposed, and using potential game theory, it is proven that the formulated game has at least one pure strategy Nash equilibrium. A subnet formation strategy based on the best response method is proposed. Finally, the convergence and complexity of the proposed algorithm are analyzed.  Results and Discussions  Simulation results confirm the effectiveness of the proposed algorithm. As shown in the networking diagram, the algorithm predominantly selects nodes near the flight path as cluster heads due to their superior data uploading capabilities (Fig. 2, Fig. 3(a)).
The data uploaded is constrained by the minimum values of the data collected, data aggregation capacity, and data uploading capacity, creating a bottleneck. In this context, the algorithm balances subnet data aggregation and uploading capacities, ultimately improving transmission efficiency (Fig. 3(b)). Additionally, the relationship between distance and subnet data transmission strategy is evaluated. Specifically, the proposed transmission strategy reduces the amount of data aggregated for reliability as the distance increases, while increasing data aggregation for efficiency when the distance decreases (Fig. 4). This dynamic transmission approach enhances reliability as the amount of aggregated data fluctuates (Fig. 5(a)). Furthermore, the proposed algorithm outperforms benchmark networking schemes with increasing iteration numbers, demonstrating up to a 56.3% improvement (Fig. 5(b)). Finally, regardless of variations in flight speed, the proposed algorithm consistently shows superior transmission efficiency (Fig. 5(c)).  Conclusions  This paper explores terrestrial networking and resource allocation methods to enhance the transmission efficiency of opportunistic UAV-assisted data collection. The strategies for subnet data transmission, resource allocation, and formation are jointly addressed. The paper derives closed-form solutions for the subnet data transmission and resource allocation strategies sequentially, followed by the formulation of the subnet formation strategy as a formation game, which is solved using the best response method. Extensive simulation results validate the performance improvements. However, this study considers only scenarios with a single opportunistic UAV. In practical applications, multiple UAVs may coexist, requiring further analysis of the time-varying relationships between cluster heads and UAVs in future work.
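A toy best-response loop illustrates the formation-game solution concept; the utilities and capacities below are invented, not the paper's models:

```python
# Toy best-response subnet formation: each ground node repeatedly switches to
# the cluster head that maximizes a global (altruistic) objective capturing
# the min(aggregated data, head uplink capacity) bottleneck. For a potential
# game this converges to a pure-strategy Nash equilibrium.
import numpy as np

rng = np.random.default_rng(2)
N, H = 12, 3                            # ground nodes and candidate cluster heads
gain = rng.uniform(0.2, 1.0, (N, H))    # node -> head aggregation quality
upload = np.array([3.0, 5.0, 2.0])      # each head's uplink capacity to the UAV

def system_utility(assign):
    return sum(min(gain[assign == h, h].sum(), upload[h]) for h in range(H))

assign = rng.integers(0, H, N)
changed = True
while changed:                           # best-response dynamics
    changed = False
    for i in range(N):
        best = max(range(H), key=lambda h: system_utility(
            np.where(np.arange(N) == i, h, assign)))
        if best != assign[i]:
            assign[i], changed = best, True
print("equilibrium assignment:", assign)
print("system utility:", round(system_utility(assign), 3))
```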
A Decision-making Method for UAV Conflict Detection and Avoidance System
TANG Xinmin, LI Shuai, GU Junwei, GUAN Xiangmin
Available online  , doi: 10.11999/JEIT240503
Abstract:
  Objective   With the rapid increase in UAV numbers and the growing complexity of airspace environments, Detect-and-Avoid (DAA) technology has become essential for ensuring airspace safety. However, the existing Detect and AvoID Alerting Logic for Unmanned Systems (DAIDALUS) algorithm, while capable of providing basic avoidance strategies, has limitations in handling multi-aircraft conflicts and adapting to dynamic, complex environments. To address these challenges, integrating the DAIDALUS output strategies into the action space of a Markov Decision Process (MDP) model has emerged as a promising approach. By incorporating an MDP framework and designing effective reward functions, it is possible to enhance the efficiency and cost-effectiveness of avoidance strategies while maintaining airspace safety, thereby better meeting the needs of complex airspaces. This research offers an intelligent solution for UAV avoidance in multi-aircraft cooperative environments and provides theoretical support for the coordinated management of shared airspace between UAVs and manned aircraft.   Methods   The guidance logic of the DAIDALUS algorithm dynamically calculates the UAV’s collision avoidance strategy based on the current state space. These strategies are then used as the action space in an MDP model to achieve autonomous collision avoidance in complex flight environments. The state space in the MDP model includes parameters such as the UAV's position, speed, and heading angle, along with dynamic factors like the relative position and speed of other aircraft or potential threats. The reward function is crucial for ensuring the UAV balances flight efficiency and safety during collision avoidance. It accounts for factors such as success rewards, collision penalties, proximity to target point rewards, and distance penalties to optimize decision-making. Additionally, the discount factor determines the weight of future rewards, balancing the importance of immediate versus future rewards. A lower discount factor typically emphasizes immediate rewards, leading to faster avoidance actions, while a higher discount factor encourages long-term flight safety and resource consumption.  Results and Discussions   The DAIDALUS algorithm calculates the UAV’s collision avoidance strategy based on the current state space, which then serves as the action space in the MDP model. By defining an appropriate reward function and state transition probabilities, the MDP model is established to explore the impact of different discount factors on collision avoidance. Simulation results show that the optimal flight strategy, calculated through value iteration, is represented by the red trajectory (Fig. 7). The UAV completes its flight in 203 steps, while the comparative experiment trajectory (Fig. 8) consists of 279 steps, demonstrating a 27.2% improvement in efficiency. When the discount factor is set to 0.99 (Fig. 9, Fig. 10), the UAV selects a path that balances immediate and long-term safety, effectively avoiding potential collision risks. The airspace intrusion rate is 5.8% (Fig. 11, Fig. 12), with the closest distance between the threat aircraft and the UAV being 343 m, which meets the safety requirements for UAV operations.  Conclusions   This paper addresses the challenge of UAV collision avoidance in complex environments by integrating the DAIDALUS algorithm with a Markov Decision Process model.
The proposed decision-making method enhances the DAIDALUS algorithm by using its guidance strategies as the action space in the MDP. The method is evaluated through multi-aircraft conflict simulations, and the results show that: (1) The proposed method improves efficiency by 27.2% over the DAIDALUS algorithm; (2) Long-term and short-term rewards are considered by selecting a discount factor of 0.99 based on the relationship between the discount factor and reward values at each time step; (3) In multi-aircraft conflict scenarios, the UAV effectively handles various conflicts and maintains a safe distance from threat aircraft, with a clear airspace intrusion rate of only 5.8%. However, this study only considers ideal perception capabilities, and real-world flight conditions, including sensor noise and environmental variability, should be accounted for in future work.
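The role of the discount factor can be seen in a minimal value-iteration toy (a 1-D corridor with invented rewards, not the DAIDALUS action space):

```python
# Minimal value iteration: success reward, conflict penalty, and per-step cost
# combine with the discount factor gamma to determine the avoidance policy.
import numpy as np

n_states, goal, hazard = 10, 9, 4
actions = [-1, +1]                       # simplified "maneuver" choices
gamma = 0.99                             # the discount factor studied in the paper

def step(s, a):
    s2 = min(max(s + a, 0), n_states - 1)
    r = -1.0                              # per-step time/fuel cost
    if s2 == hazard: r -= 50.0            # conflict / airspace-intrusion penalty
    if s2 == goal:   r += 10.0            # success reward at the target
    return s2, r

V = np.zeros(n_states)
for _ in range(500):                      # value iteration to convergence
    V = np.array([0.0 if s == goal else
                  max(step(s, a)[1] + gamma * V[step(s, a)[0]] for a in actions)
                  for s in range(n_states)])
policy = ["goal" if s == goal else
          max(actions, key=lambda a: step(s, a)[1] + gamma * V[step(s, a)[0]])
          for s in range(n_states)]
print("action per state:", policy)
```

Lowering gamma toward, say, 0.8 makes the -50 hazard penalty dominate nearby decisions and the long-horizon goal reward fade, which is the immediate-versus-long-term trade-off the abstract describes.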
Personalized Federated Learning Method Based on Coalition Game and Knowledge Distillation
SUN Yanhua, SHI Yahui, LI Meng, YANG Ruizhe, SI Pengbo
Available online  , doi: 10.11999/JEIT221203
Abstract:
To overcome the limitations of Federated Learning (FL) when both the data and the models of the clients are heterogeneous, and to improve accuracy, a personalized Federated learning algorithm with Coalition game and Knowledge distillation (pFedCK) is proposed. Firstly, each client uploads its soft predictions on a public dataset and downloads the k most correlated soft predictions. Then, the method applies the Shapley value from coalition game theory to measure the multi-wise influences among clients and to quantify each client’s marginal contribution to the personalized learning performance of the others. Lastly, each client identifies its optimal coalition, distills the corresponding knowledge into its local model, and trains on its private dataset. The results show that, compared with state-of-the-art algorithms, this approach achieves superior personalized accuracy, with an improvement of about 10%.
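As a sketch of the coalition-game step with a hypothetical characteristic function (in pFedCK the coalition values would come from distillation performance on the public dataset):

```python
# Shapley value over client coalitions: each client's marginal contribution
# to a target client's personalized performance, averaged over all orderings.
# The coalition values v(S) below are invented for illustration.
from itertools import combinations
from math import factorial

clients = ["A", "B", "C"]
v = {(): 0.0, ("A",): 0.0, ("B",): 0.05, ("C",): 0.02,
     ("A", "B"): 0.06, ("A", "C"): 0.03, ("B", "C"): 0.08,
     ("A", "B", "C"): 0.10}              # v(S): accuracy gain from coalition S

def shapley(player):
    n, total = len(clients), 0.0
    others = [c for c in clients if c != player]
    for k in range(n):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += w * (v[tuple(sorted(S + (player,)))] - v[tuple(sorted(S))])
    return total

for c in clients:
    print(c, f"{shapley(c):.4f}")        # the contributions sum to v(grand coalition)
```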
The Range-angle Estimation of Target Based on Time-invariant and Spot Beam Optimization
Wei CHU, Yunqing LIU, Wenyug LIU, Xiaolong LI
Available online  , doi: 10.11999/JEIT210265
Abstract:
The application of Frequency Diverse Array and Multiple-Input Multiple-Output (FDA-MIMO) radar to range-angle estimation of targets has attracted increasing attention. The FDA simultaneously provides degrees of freedom of the transmit beampattern in both angle and range. However, its performance is degraded by the periodicity and time-varying nature of the beam pattern. Therefore, an improved Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm is proposed to estimate the target’s parameters based on a new waveform synthesis model of the Time Modulation and Range Compensation FDA-MIMO (TMRC-FDA-MIMO) radar. Finally, the proposed method is compared with the identical-frequency-increment FDA-MIMO radar system, the logarithmically increasing frequency offset FDA-MIMO radar system, and the MUltiple SIgnal Classification (MUSIC) algorithm through the Cramér-Rao lower bound and the root mean square error of range and angle estimation, verifying its excellent performance.
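For orientation, a textbook one-dimensional ESPRIT for a uniform linear array is sketched below; this is the generic algorithm, not the paper's TMRC-FDA-MIMO waveform synthesis model:

```python
# Generic ESPRIT sketch: the rotational invariance between two identical,
# shifted subarrays yields the direction-of-arrival estimates.
import numpy as np

rng = np.random.default_rng(0)
M, snapshots = 8, 200                    # sensors, snapshots
d = 0.5                                  # element spacing in wavelengths
angles_true = np.deg2rad([-20.0, 15.0])

A = np.exp(1j * 2 * np.pi * d * np.outer(np.arange(M), np.sin(angles_true)))
S = rng.normal(size=(2, snapshots)) + 1j * rng.normal(size=(2, snapshots))
X = A @ S + 0.05 * (rng.normal(size=(M, snapshots))
                    + 1j * rng.normal(size=(M, snapshots)))

R = X @ X.conj().T / snapshots           # sample covariance
_, V = np.linalg.eigh(R)
Es = V[:, -2:]                           # signal subspace (2 sources)
# Rotational invariance: Es[1:] ~ Es[:-1] @ Psi; Psi's eigenvalues carry angles
Psi = np.linalg.lstsq(Es[:-1], Es[1:], rcond=None)[0]
phases = np.angle(np.linalg.eigvals(Psi))
print(np.sort(np.rad2deg(np.arcsin(phases / (2 * np.pi * d)))))  # ~ [-20, 15]
```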
Special Topic on Low Altitude Intelligent Networking
Wide-Area Multilateration Time Synchronization Method Based on Signal Arrival Time Modeling
TANG Xinmin, ZHOU Yang, LU Qixing, GUAN Xiangmin
Available online  , doi: 10.11999/JEIT240670
Abstract:
  Objective  Wide-Area Multilateration (WAM), a high-precision positioning technology currently under nationwide deployment, is widely applied in aircraft positioning on airport surfaces and in terminal areas. However, as WAM depends on collaborative signal processing across multiple stations, challenges such as time synchronization and computational complexity continue to constrain positioning accuracy. This study develops a mathematical model for time synchronization and “same-message” extraction based on Time Of Arrival (TOA), achieving synchronization by calculating the “synchronized start time” of ground sensors. The proposed method offers low computational complexity and is straightforward to implement. To enhance TOA estimation accuracy and reduce synchronization error, a joint filtering strategy—Variable Moving Average Filtering and Kalman (VMAF-Kalman)—is proposed to minimize TOA counting deviations introduced by clock drift. The model addresses synchronization challenges in distributed station deployments and employs joint filtering to correct initial clock source deviations.  Methods  This study addresses the challenge of high-precision TOA acquisition by proposing a joint filtering method that combines VMAF-Kalman. This approach filters the phase difference count between the GPS 1 Pulse Per Second (1PPS) signal and the local crystal oscillator 1PPS signal, producing a stable reference clock to mitigate the effects of noise and oscillator aging that induce clock drift. Therefore, stable TOA counting with a precision of 2.5 ns is achieved in Field-Programmable Gate Arrays (FPGAs). To resolve synchronization issues in distributed WAM systems, a time synchronization model based on TOA is proposed, which determines the synchronized start time of remote stations. Additionally, a same-message extraction model is developed to identify the TOA of identical messages, enabling accurate multilateration positioning.  Results and Discussions  Two experiments evaluate the proposed method and model: a filtering performance comparison and an actual flight trajectory positioning experiment for time synchronization validation. The latter includes two simulation scenarios: Scenario 1 consists of drone positioning tests, and Scenario 2 consists of civil aviation aircraft positioning tests. The simulation results indicate that the joint filtering method outperforms single filtering approaches, reducing TOA counting errors by 36.84% and 25.36% in the respective scenarios. Both the drone and civil aviation tests demonstrate high positioning accuracy, with errors and update rates meeting standard requirements. These findings confirm the practicality of the proposed method and the improved synchronization accuracy of the model.  Conclusions  Firstly, the proposed VMAF-Kalman joint filtering method demonstrates clear advantages over single filtering algorithms in both performance and hardware efficiency. Simulation results show that the output of the PID controller remains within a narrower fluctuation range, while TOA counting errors are reduced by 36.84% and 25.36%, respectively. These findings confirm that joint filtering stabilizes clock signals, improves TOA counting accuracy in FPGAs, and reduces synchronization errors. Secondly, the time synchronization and same-message extraction models developed in this study simplify existing synchronization methods by enabling WAM synchronization and TOA extraction through algorithmic computation alone. 
Simulations incorporating actual flight data in low-altitude airspace, verified across multiple positioning algorithms, further validate the model. Drone test results show that vertical Root Mean Square Error (RMSE) and deviation remain within 20 m, with horizontal RMSE below 10 m. For civil aviation aircraft, all algorithms achieved accuracy rates above 80%, with average errors under 300 m and position update intervals within 5 s, meeting established standards. The experimental outcomes confirm the feasibility and applicability of the proposed model for high-precision WAM time synchronization.
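The joint-filtering idea can be sketched in one dimension (hypothetical drift and noise magnitudes; a fixed window stands in for the variable moving average):

```python
# Toy 1-D version of the joint filtering: smooth the GPS-1PPS vs. local-1PPS
# phase-difference count with a moving average, then track its slow drift
# with a scalar Kalman filter (random-walk state model).
import numpy as np

rng = np.random.default_rng(0)
n = 600
drift = np.cumsum(np.full(n, 0.002))             # slow oscillator drift (counts)
counts = drift + rng.normal(0, 0.5, n)           # noisy phase-difference counts

win = 20                                          # moving-average window
ma = np.convolve(counts, np.ones(win) / win, mode="valid")

x, P = ma[0], 1.0                                 # Kalman state: true offset
q, r = 1e-5, 0.05                                 # process / measurement noise
est = []
for z in ma:
    P += q                                        # predict
    K = P / (P + r)                               # Kalman gain
    x += K * (z - x)                              # update with averaged count
    P *= 1 - K
    est.append(x)
print(f"raw std {counts.std():.3f} -> filtered residual std "
      f"{np.std(np.array(est) - drift[win - 1:]):.3f}")
```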
Jointly Optimized Deployment and Power for Unmanned Aerial Vehicle - Satellite Assisted Cell-Free Massive MIMO Systems
ZHAO Haitao, LIU Ying, WANG Qin, LIU Miao, ZHU Hongbo
Available online  , doi: 10.11999/JEIT240058
Abstract:
  Objective   This study addresses persistent limitations in resource availability, cognitive adaptability, and spatial coverage in traditional Cell-Free massive Multiple-Input Multiple-Output (CF-mMIMO) systems. A novel framework is proposed that integrates power control and Unmanned Aerial Vehicle (UAV) placement within a Low Earth Orbit (LEO) satellite-assisted downlink architecture. The objective is to enhance communication efficiency and system robustness in coverage-constrained wireless environments, particularly under dynamic user distributions and challenging propagation conditions.  Methods   The proposed framework adopts a hybrid optimization model that jointly considers user association, power allocation, and UAV deployment, based on the known spatial distribution of ground users and access points. With LEO satellite support, the architecture extends coverage and strengthens transmission links. The optimization problem aims to maximize the minimum achievable user data rate, subject to constraints on coverage, power, and cross-layer interference. Owing to the nonconvex and coupled nature of the variables, an iterative algorithm is developed using block coordinate descent and successive convex approximation. The original problem is decomposed into three interdependent subproblems—user association, power allocation, and UAV positioning—which are solved alternately to obtain a near-optimal solution.   Results and Discussions   Simulation results confirm that the proposed framework significantly improves system-wide throughput, communication robustness, and spectral efficiency. Compared with conventional CF-mMIMO systems, the integration of UAVs and LEO satellites enhances adaptability to non-uniform user distributions and challenging wireless environments. The strategy enables real-time adjustment of UAV positions and transmission power, improving load balancing, reducing interference, and expanding service coverage. Performance metrics, including the minimum user rate and total system capacity, demonstrate the proposed method’s effectiveness in complex, heterogeneous network settings.  Conclusions   This study proposes a scalable and adaptive approach for next-generation communication networks by integrating aerial and satellite components into terrestrial CF-mMIMO systems. The combination of intelligent UAV deployment and adaptive power control enables efficient resource management while maintaining high reliability and wide-area coverage. The proposed strategy represents a promising direction for future air-space-ground integrated networks, supporting high-throughput, energy-efficient, and resilient wireless services in both urban and remote scenarios.
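The decomposition can be illustrated with a toy two-block alternating optimization of the max-min rate (invented channel constants; the paper additionally handles user association and solves each nonconvex subproblem by SCA):

```python
# Toy block-coordinate optimization: alternate between a power block (equalize
# SNRs, which maximizes the minimum rate for a fixed position in this simple
# noise-limited model) and a UAV-position block (coarse grid search).
import numpy as np

rng = np.random.default_rng(3)
users = rng.uniform(0, 100, (6, 2))        # ground user positions (m)
P_tot, h0, noise = 1.0, 1e-3, 1e-9         # power budget, channel constant, noise

def rates(pos, p):
    d2 = ((users - pos) ** 2).sum(axis=1) + 100.0 ** 2   # UAV at 100 m altitude
    return np.log2(1 + p * h0 / (d2 * noise))

pos = np.array([50.0, 50.0])
p = np.full(len(users), P_tot / len(users))
for _ in range(10):
    d2 = ((users - pos) ** 2).sum(axis=1) + 100.0 ** 2
    p = P_tot * d2 / d2.sum()              # block 1: power allocation
    grid = [np.array([x, y]) for x in np.linspace(0, 100, 21)
                             for y in np.linspace(0, 100, 21)]
    pos = max(grid, key=lambda q: rates(q, p).min())     # block 2: position
print(f"UAV position {np.round(pos, 1)}, min rate {rates(pos, p).min():.3f} bit/s/Hz")
```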
Low-complexity MRC Receiver Algorithm Based on OTFS System
WANG Zhenduo, JI Tianzhi, SUN Rongchen
Available online  , doi: 10.11999/JEIT241056
Abstract:
  Objective  Ultra-high-speed mobile applications—such as Unmanned Aerial Vehicles (UAVs), high-speed railways, satellite communications, and vehicular networks—place increasing demands on communication systems, particularly under high-Doppler conditions. Orthogonal Time Frequency Space (OTFS) modulation offers advantages in such environments due to its robustness against Doppler effects. However, conventional receiver algorithms rely on computationally intensive matrix operations, which limit their efficiency and degrade real-time performance in high-mobility scenarios. This paper proposes a low-complexity Maximum Ratio Combining (MRC) receiver for OTFS systems that avoids matrix inversion by exploiting the structural characteristics of OTFS channel matrices in the Delay-Doppler (DD) domain. The proposed receiver achieves high detection performance while substantially reducing computational complexity, supporting practical deployment in ultra-high-speed mobile communication systems.  Methods  The proposed low-complexity receiver algorithm applies MRC in the DD domain to iteratively extract and coherently combine multipath components. This approach enhances Bit Error Rate (BER) performance by optimizing signal aggregation while avoiding computationally intensive operations. To further reduce complexity, the algorithm incorporates interleaving and deinterleaving operations that restructure the channel matrix into a sparse upper triangular Heisenberg form. This transformation enables efficient matrix decomposition and facilitates simplified processing. To address the computational burden associated with matrix inversion during symbol detection, a low-complexity LDL decomposition algorithm is introduced. Compared with conventional matrix inversion techniques, this method substantially reduces computational overhead. Furthermore, a low-complexity inversion method for lower triangular matrices is implemented to further improve efficiency during the decision process. Simulation results confirm that the proposed receiver achieves BER performance comparable to that of traditional MRC algorithms while significantly lowering computational complexity.  Results and Discussions  Simulation results confirm that the proposed low-complexity MRC receiver achieves BER performance comparable to that of conventional MRC receivers while substantially improving computational efficiency under high-mobility conditions (Fig. 3). The algorithm is evaluated across a range of environments, including scenarios characterized by high-speed motion and complex multipath interference. It outperforms Linear Minimum Mean Square Error (LMMSE) equalizers and Gauss–Seidel iterative equalization algorithms. Despite its reduced complexity, the proposed receiver maintains the same BER performance as traditional MRC methods. The algorithm demonstrates effective scalability as the number of symbols and subcarriers increases. Under conditions of increased system complexity, the receiver sustains computational efficiency without performance degradation (Fig. 4, Fig. 5). These results support its suitability for practical deployment in high-speed mobile communication systems employing OTFS modulation. The receiver also exhibits strong resilience to variations in wireless channel models. Across both typical urban multipath scenarios and high-velocity vehicular conditions, it maintains stable BER performance (Fig. 8). In addition, the receiver demonstrates robust tolerance to Doppler shift fluctuations and variable noise levels. 
These characteristics enable its application in dynamic environments with rapidly changing channel conditions. The algorithm’s efficiency and performance stability make it particularly well suited for real-time implementation in ultra-high-mobility networks, including UAV systems, high-speed rail communications, and other next-generation wireless platforms. By reducing computational complexity without compromising detection accuracy, the proposed receiver supports large-scale deployment of OTFS-based systems, addressing key performance and scalability challenges in emerging communication infrastructures.  Conclusions  This study proposes a low-complexity MRC receiver algorithm for OTFS systems. By introducing an interleaver and deinterleaver, the channel matrix is transformed into a sparse upper triangular form, enabling efficient inversion with reduced computational cost. In addition, the receiver integrates a low-complexity LDL decomposition algorithm and an upper triangular matrix inversion method to further minimize the computational burden associated with matrix operations. Simulation results confirm that the proposed receiver achieves equivalent BER performance to conventional MRC receivers. Moreover, under identical channel conditions, it demonstrates superior BER performance relative to linear receivers.
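The LDL-based complexity reduction can be sketched with generic linear algebra (a random Hermitian system, not the paper's interleaved OTFS channel structure):

```python
# Solve the Hermitian system arising in equalization via an LDL^H
# factorization and triangular solves, avoiding an explicit matrix inverse.
import numpy as np
from scipy.linalg import ldl, solve_triangular

rng = np.random.default_rng(0)
n = 64
Hm = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))   # stand-in channel
y = rng.normal(size=n) + 1j * rng.normal(size=n)              # received vector

A = Hm.conj().T @ Hm + 0.1 * np.eye(n)   # Hermitian positive definite system
b = Hm.conj().T @ y

lu, d, perm = ldl(A, lower=True)         # A = lu @ d @ lu^H; lu[perm] triangular
Lt = lu[perm]                            # unit lower-triangular factor
w = solve_triangular(Lt, b[perm], lower=True, unit_diagonal=True)
v = np.linalg.solve(d, w)                # d is (block-)diagonal: cheap to solve
u = solve_triangular(Lt.conj().T, v, lower=False, unit_diagonal=True)
x = np.empty_like(u)
x[perm] = u                              # undo the symmetric permutation
print(np.allclose(A @ x, b))             # True: solved with no explicit inverse
```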
Differentially Private Federated Learning Based Wideband Spectrum Sensing for the Low-Altitude Unmanned Aerial Vehicle Swarm
DONG Peihao, JIA Jibin, ZHOU Fuhui, WU Qihui
Available online  , doi: 10.11999/JEIT241042
Abstract:
  Objective  Wideband Spectrum Sensing (WSS) for Unmanned Aerial Vehicles (UAVs) in low-altitude intelligent networks is essential for efficient spectrum monitoring and utilization. However, sampling at the Nyquist rate incurs high hardware and computational costs. Moreover, the high mobility of UAVs subjects them to rapidly changing spectral environments, which significantly reduces sensing accuracy and presents major challenges for UAV-based WSS.  Methods  A low-complexity Feature-Splitting Wideband Spectrum Sensing neural Network (FS-WSSNet) is proposed to achieve high sensing accuracy while reducing the operational cost of UAVs through sub-Nyquist sampling. To integrate spectral knowledge and computational resources across multiple UAVs and enable adaptation to varying spectrum environments, an online model adaptation algorithm based on Differential Privacy Federated Transfer Learning (DPFTL) is further proposed. Before model parameters are uploaded to a central computation platform, noise is added according to local differential privacy constraints. This enables spectrum knowledge sharing while preserving data privacy within the UAV swarm, allowing FS-WSSNet on each UAV to rapidly adapt to dynamic spectral conditions.  Results and Discussions  Simulation results demonstrate the effectiveness of the proposed FS-WSSNet and the DPFTL-based online model adaptation algorithm. FS-WSSNet achieves substantially higher prediction accuracy than the comparison models, confirming that omitting convolutional layers degrades performance and supporting the design rationale of FS-WSSNet (Fig. 3). In addition, FS-WSSNet consistently outperforms the baseline scheme across all Signal-to-Noise Ratio (SNR) conditions (Fig. 4). Its Receiver Operating Characteristic (ROC) curve, which lies closer to the top-left corner, indicates improved detection performance across various thresholds (Fig. 5). FS-WSSNet also exhibits significantly lower computational complexity compared with the baseline (Table 1). Furthermore, under the proposed DPFTL-based scheme (Algorithm 1), FS-WSSNet maintains robust performance across different target scenarios without requiring local adaptation samples. This approach not only preserves data privacy but also improves the model’s generalization ability (Figs. 6–9).  Conclusions  This study proposes a cooperative WSS scheme based on DPFTL for low-altitude UAV swarms. First, data received by UAVs are processed using multicoset sampling to enable cost-efficient sub-Nyquist acquisition. The resulting signals are input into a low-complexity FS-WSSNet for accurate and efficient spectrum detection. An online model adaptation algorithm based on DPFTL is then developed, introducing noise to model parameters before upload to ensure data privacy. By supporting spectrum knowledge sharing and collaborative training, the algorithm effectively integrates the computational and data resources of multiple UAVs to construct a robust model adaptable to various scenarios. Simulation results confirm that the proposed scheme provides an efficient WSS solution for resource-constrained low-altitude UAV networks, achieving both privacy protection and adaptability across scenarios.
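The privacy-preserving upload step can be sketched as follows (hypothetical clipping bound and noise scale; FS-WSSNet's actual parameter vectors replace the toy updates):

```python
# Toy differentially private federated upload: each UAV clips its model
# update and adds Gaussian noise locally *before* upload; the platform only
# averages the noisy updates, so raw spectrum data never leaves the UAV.
import numpy as np

rng = np.random.default_rng(0)

def privatize(update, clip=1.0, sigma=0.05):
    """Clip the update norm, then add Gaussian noise scaled to the clip bound."""
    update = update * min(1.0, clip / np.linalg.norm(update))
    return update + rng.normal(0.0, sigma * clip, update.shape)

n_uavs, dim = 8, 1000
true_dir = rng.normal(size=dim)                      # shared spectrum knowledge
local = [true_dir + 0.1 * rng.normal(size=dim) for _ in range(n_uavs)]

uploaded = [privatize(u) for u in local]             # noise added before upload
avg = np.mean(uploaded, axis=0)                      # federated averaging
cos = avg @ true_dir / (np.linalg.norm(avg) * np.linalg.norm(true_dir))
print(f"cosine similarity of the noisy average to the clean update: {cos:.2f}")
```

Averaging across the swarm partially cancels the per-UAV noise, which is why collaboration preserves utility under the privacy constraint.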
A Localization Algorithm for Multiple Radiation Sources in Low-altitude Intelligent Networks Based on Sparse Tensor Completion and Density Peaks Clustering
CHEN Zhibo, GUO Daoxing
Available online  , doi: 10.11999/JEIT241050
Abstract:
  Objective   This paper addresses key technologies for multi-source localization in low-altitude intelligent networks, aiming to achieve precise spatial localization of multiple unknown radiation sources in dynamic low-altitude environments. The localization is based on signal strength data collected by spectrum monitoring devices mounted on Unmanned Aerial Vehicles (UAVs). Traditional localization methods encounter three major challenges in practical applications: significant spatial sparsity of measurement data due to the constrained flight trajectories of UAVs, signal strength fluctuations caused by environmental noise and shadow fading, and exponential increases in algorithm complexity as the number of unknown radiation sources grows. These factors lead to a substantial decline in localization performance in dynamic low-altitude scenarios, highlighting the need for a more robust multi-source localization framework.  Methods   To address these issues, this study proposes a collaborative localization algorithm that integrates sparse tensor completion with an improved Density Peak Clustering (DPC) method. The proposed approach decomposes multi-source localization into two progressive stages: three-dimensional tensor reconstruction and density peak detection. First, the sparse measurement data from UAVs are modeled as a three-dimensional sparse tensor containing spatial coordinates and signal strength, fully characterizing the spatial distribution of signals in the target area. A tensor completion network based on convolutional autoencoders is then designed to intelligently infer the signal strength in unmeasured regions through deep feature learning, effectively alleviating the data sparsity issue. Based on the reconstructed complete signal distribution, an improved DPC algorithm is introduced. By incorporating an adaptive truncation distance to optimize local density calculations and constructing a decision graph using Mahalanobis distance, the algorithm accurately identifies density peaks (i.e., radiation source locations) and suppresses outliers.   Results and Discussions   The innovation of this method is reflected in the following three aspects: (1) Enhanced noise robustness: the method reconstructs the signal spatial distribution through tensor completion and eliminates pseudo-peaks caused by noise interference using DPC clustering. Under noise power conditions of –20 dBm, the algorithm achieves a missed detection probability of 16.62% and a false alarm probability of 11.13%, while maintaining an average localization error of 12.15 m (Fig. 11, Fig. 12); (2) Improved weak signal detection capability: by utilizing local density features rather than traditional signal strength threshold detection, the localization performance for low-power radiation sources is improved. Under conditions with radiation source transmission power of 5 dBm to 10 dBm and at a 30% sampling rate, the algorithm achieves a missed detection probability of 3.12% and a false alarm probability of 3.56%, significantly outperforming two baseline algorithms (Fig. 9, Fig. 10); (3) Optimized multi-source resolution performance: simulation experiments demonstrate that in scenarios with 10 coexisting radiation sources, the method achieves an average localization error of 6.42 m, representing a 46.94% improvement over the existing best method’s performance of 12.10 m. Additionally, the fluctuation in localization error across scenarios with 2 to 10 radiation sources is maintained within ±9% (Fig. 7, Fig. 8).  
Conclusions   This study constructs a two-stage localization framework, “tensor completion-density clustering,” which combines radio map estimation with the improved DPC algorithm for the first time, addressing the challenges of sparse measurement, noise interference, and multi-source coupling in low-altitude scenarios. The proposed algorithm can reconstruct the three-dimensional signal strength distribution from sparse measurement data obtained by UAVs and accurately localize multiple unknown radiation sources. It maintains strong performance under complex conditions, such as sparse measurements, environmental noise, and multi-source scenarios. This method provides a practical and robust solution for UAV spectrum monitoring applications. The technology offers theoretical support for tasks such as the rapid traceability of interference sources in emergency communications and collaborative spectrum sensing in UAV swarms, with significant application potential in areas such as smart city aerial monitoring and battlefield electromagnetic situational awareness.
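The density-peak detection stage can be sketched on synthetic data (the paper additionally adapts the truncation distance and builds the decision graph with the Mahalanobis distance):

```python
# Core of density-peaks clustering on point samples of a signal-strength map:
# sources appear as points with high local density rho whose distance delta
# to any denser point is large; rho * delta ranks the decision graph.
import numpy as np

rng = np.random.default_rng(4)
sources = np.array([[20.0, 20.0], [70.0, 60.0]])
pts = np.vstack([s + rng.normal(0, 3, (80, 2)) for s in sources]
                + [rng.uniform(0, 100, (40, 2))])          # background clutter

D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)    # pairwise distances
dc = np.percentile(D[D > 0], 2)                            # truncation distance
rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1               # Gaussian local density

delta = np.zeros(len(pts))
for i in range(len(pts)):
    denser = rho > rho[i]
    delta[i] = D[i, denser].min() if denser.any() else D[i].max()

peaks = np.argsort(rho * delta)[-2:]                       # two density peaks
print("estimated source locations:\n", pts[peaks].round(1))
```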
Radar, Navigation and Array Signal Processing
Methods for Enhancing Positioning Reliability in Indoor and Underground Satellite-shielded Environments
YI Qingwu, HUANG Lu, YU Baoguo, LIAO Guisheng
Available online  , doi: 10.11999/JEIT240870
Abstract:
This paper proposes a method to enhance the reliability of indoor positioning by combining an unsupervised autoencoder with nonlinear filtering. A Denoising Variational AutoEncoder model assisted by a deep Convolutional Neural Network (DVAE-CNN) is designed to regulate the positioning results from multiple aspects, including measurement data quality evaluation, target state transition modeling, and weight update strategies aided by environmental prior information. This approach addresses the issue of low positioning reliability caused by information loss, errors, and disturbances in complex indoor environments. Compared to positioning results without the reliability control mechanism, the proposed method improves average positioning accuracy by 74.6% and positioning reliability by 88.2%. Extensive experiments conducted in the venues of the Beijing 2022 Winter Olympics demonstrate that the proposed method provides highly robust, reliable, and continuous positioning services, showing significant potential for practical application and promotion.   Objective  With the rapid development of indoor positioning technologies, ensuring high reliability and trustworthiness in complex indoor and underground satellite-shielded environments remains a critical challenge. Existing methods often prioritize accuracy and continuity but neglect reliability under environmental disturbances such as signal loss, noise, and multipath effects. To address these limitations, this study proposes a multi-level trustworthiness enhancement framework by integrating an unsupervised Denoising Variational AutoEncoder with a Convolutional Neural Network (DVAE-CNN) and nonlinear particle filtering. The goal is to improve positioning reliability through data quality assessment, environmental prior information fusion, and adaptive state transition constraints, thereby supporting robust location-based services in challenging environments like the 2022 Beijing Winter Olympics venues.  Methods  The proposed framework combines a DVAE-CNN model for denoising and feature extraction with a particle filtering mechanism incorporating environmental priors and sensor data. The DVAE-CNN evaluates measurement data quality by reconstructing noisy inputs and identifying anomalies through reconstruction probability thresholds. Concurrently, nonlinear particle filtering integrates multi-source heterogeneous data (e.g., inertial sensors, Wi-Fi, and indoor maps) to constrain particle distributions based on motion patterns and structural boundaries. A weight update strategy dynamically adjusts particle importance using prior knowledge, while adaptive step-length estimation refines Pedestrian Dead Reckoning (PDR) to reduce cumulative errors.  Results and Discussions  Extensive experiments in controlled environments and real-world Olympic venues demonstrate significant improvements. Compared to baseline methods without trustworthiness mechanisms, the proposed approach achieves a 74.6% increase in average positioning accuracy and an 88.2% enhancement in reliability. In dynamic tests at the Beijing Winter Olympics venues, the method eliminated trajectory jumps caused by signal loss and improved coverage continuity by 34%, ensuring seamless navigation in complex indoor spaces. The fusion of DVAE-CNN-based anomaly detection and environmental constraints effectively suppressed "wall-penetrating" particles, enhancing result plausibility.  
Conclusions  This study addresses the critical issue of positioning trustworthiness in indoor and underground environments by integrating data-driven anomaly detection with multi-source fusion. Key contributions include: (1) A DVAE-CNN model that improves data quality assessment and noise resilience; (2) A particle filtering framework leveraging environmental priors and adaptive PDR for robust state estimation; (3) Validation in high-stakes scenarios, achieving sub-meter accuracy and high reliability. Limitations, such as PDR’s cumulative errors, warrant further exploration. Future work will focus on real-time optimization and sensor noise modeling for broader applications.
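The map-constrained weight update described above lends itself to a short sketch. The Python fragment below is illustrative only: the occupancy grid, the process-noise levels, and the Gaussian likelihood around an already-screened position fix are assumptions for demonstration, not the authors' implementation. It shows how an environmental prior can zero the weights of particles whose predicted motion crosses a wall, the mechanism by which "wall-penetrating" particles are suppressed.

```python
import numpy as np

rng = np.random.default_rng(0)

def crosses_wall(grid, p_from, p_to, cell=0.25, n_checks=8):
    """Return True if the straight segment between two positions passes
    through an occupied (wall) cell of a binary map. Assumes positions
    lie inside the mapped area; grid and cell size are illustrative."""
    for t in np.linspace(0.0, 1.0, n_checks):
        x, y = p_from + t * (p_to - p_from)
        if grid[int(y / cell), int(x / cell)]:   # 1 = wall
            return True
    return False

def pf_step(particles, weights, step_len, heading, z, grid, sigma_z=1.5):
    """One predict/update cycle of a map-constrained particle filter.
    particles: (N, 2) positions; z: a position-like measurement (e.g. a
    Wi-Fi fix) already screened by the anomaly detector."""
    old = particles.copy()
    # Propagate with PDR step length and heading plus process noise.
    noise = rng.normal(0.0, [0.1, 0.05], size=(len(particles), 2))
    d, th = step_len + noise[:, 0], heading + noise[:, 1]
    particles += np.column_stack([d * np.cos(th), d * np.sin(th)])
    # Environmental prior: kill particles that walked through a wall.
    for k in range(len(particles)):
        if crosses_wall(grid, old[k], particles[k]):
            weights[k] = 0.0
    # Measurement update with a Gaussian likelihood around z.
    weights *= np.exp(-np.sum((particles - z) ** 2, axis=1) / (2 * sigma_z**2))
    s = weights.sum()
    weights = weights / s if s > 0 else np.full(len(weights), 1 / len(weights))
    return particles, weights
```

Resampling and the DVAE-CNN quality gate would sit around this step: a measurement flagged as anomalous by the reconstruction-probability threshold would simply skip the likelihood update.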
Global Navigation Satellite System Partial Ambiguity Resolution Method Integrating Ionospheric Delay Correction and Multi-frequency Signal Optimization
ZHANG Xu, YANG Jie
Available online  , doi: 10.11999/JEIT240682
Abstract:
  Objective  Global Navigation Satellite System (GNSS) high-precision positioning is widely applied due to its accuracy. However, the integrity of the Ambiguity Resolution (AR) process remains limited, particularly in occluded environments and over long baselines. Traditional AR methods are often affected by ionospheric delay errors, which become substantial when the ionospheric conditions differ between reference and rover stations. This paper proposes a Modified Partial Ambiguity Resolution (MPAR) method that integrates ionospheric delay correction models with multi-frequency signal optimization. The combined approach improves GNSS positioning accuracy and reliability under varied environmental and baseline conditions.  Methods  To reduce the effect of ionospheric delay on AR, this study incorporates an ionospheric delay correction model into the geometry-free Cascade Integer Resolution (ICIR) method. ICIR resolves the integer ambiguities of Extra-Wide Lane (EWL), Wide Lane (WL), and Narrow Lane (NL) combinations using carrier phase measurements with different wavelengths. The ionospheric delay correction model enables compensation for differential delays between stations, improving AR accuracy, particularly over long baselines. To further enhance data usage, especially in cases of low-quality observations, a two-stage partial AR strategy is employed. In the first stage, the ICIR method is applied to an optimal subset of satellites selected based on tri-frequency availability and high elevation angles. For the non-optimal subset, which may include satellites with limited frequencies or weaker signal quality, the Least-Squares AMBiguity Decorrelation Adjustment (LAMBDA) method is used in geometric mode, with assistance from the ambiguity-fixed results of the optimal subset. This integrated approach reduces computational complexity and improves the AR success rate and reliability. The MPAR method proceeds as follows: (1) select the optimal satellite subset based on frequency availability and elevation angle; (2) apply the ICIR method to resolve ambiguities in this subset; (3) use the fixed ambiguities from the optimal subset to assist in resolving ambiguities for the non-optimal subset via the LAMBDA method; (4) obtain the final integer ambiguity solution for the full epoch.  Results and Discussions  The proposed MPAR method is validated using two datasets collected under different environments: one from Tokyo, characterized by complex urban occlusion and long baselines (approximately 1 700 meters), and another from Wuhan, featuring an open campus environment and short baselines (approximately 600 meters). The results show that the MPAR method outperforms traditional PAR methods in positioning accuracy, AR success rate, and computational efficiency. As shown in Fig. 3 and Fig. 5, satellite visibility in the Tokyo dataset is significantly affected by occlusion, leading to fewer available satellites compared to the Wuhan dataset. Despite these challenges, the MPAR method achieves the highest success rate and the lowest Average Standard Deviation (ASD) in all tested scenarios, including GPS, BDS, and dual-system modes (Table 2 and Table 4). In the Tokyo dataset, the MPAR method reduces the ASD by up to 40% compared to traditional methods, reflecting its robustness in complex environments. The AR success rate also significantly improves with the MPAR method.
As presented in Table 3 and Table 5, the MPAR method achieves AR success rates exceeding 90% in all tested scenarios, with a peak rate of 99.4% in the GPS/BDS dual-system mode of the Wuhan dataset. These results demonstrate the effectiveness of the proposed method in enhancing AR reliability under challenging conditions. In terms of computational efficiency, the MPAR method exhibits balanced performance. Although the use of the ionospheric delay correction model slightly increases computational complexity, the overall efficiency remains competitive, with an average solution time of approximately 0.13 seconds per epoch (Table 3 and Table 5). This performance supports the suitability of the MPAR method for real-time applications. Furthermore, Table 3 (Tokyo dataset) and Table 5 (Wuhan dataset) summarize the performance metrics of the five AR methods evaluated. The MPAR-ICIR method achieves the highest AR success rates across all systems and environments, reaching 93.1% and 99.4% for the Tokyo and Wuhan datasets, respectively. Notably, the MPAR-ICIR method maintains a high success rate while reducing computation time compared to other methods, indicating its efficiency. These results support the effectiveness and robustness of the proposed MPAR-ICIR method in improving GNSS positioning performance.  Conclusions  This study proposes an MPAR method for high-precision GNSS positioning. By integrating ionospheric delay correction models with multi-frequency signal optimization, MPAR combines the strengths of geometry-free and geometry-based AR strategies. The method effectively reduces the effect of ionospheric delay, particularly over long baselines and in occluded environments. Experimental results confirm that MPAR improves positioning accuracy, AR success rate, and computational efficiency relative to conventional methods. Its consistent performance across varied environments and baseline lengths highlights its suitability for broad application in high-precision GNSS positioning.
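The cascaded fixing that the ICIR stage performs can be illustrated with a toy. In the Python sketch below, the wavelengths are representative tri-frequency combination values, the observations are fabricated, and inter-station ionospheric delays are assumed to be already compensated by the correction model described above; none of it reproduces the paper's exact formulation. Each ambiguity-fixed range tightens the float ambiguity of the next, shorter-wavelength combination, with rounding validated against a sigma threshold.

```python
import numpy as np

# Representative combination wavelengths in metres (illustrative; the
# actual EWL/WL/NL values depend on constellation and frequency triplet).
LAM = {"EWL": 5.861, "WL": 0.862, "NL": 0.108}

def round_and_check(n_float, sigma, k=3.0):
    """Fix an ambiguity by rounding when the float estimate lies within
    k*sigma of the nearest integer; otherwise report failure."""
    n_int = int(np.rint(n_float))
    return n_int, abs(n_float - n_int) <= k * sigma

def cascade_fix(obs):
    """Toy cascade integer resolution: fix EWL first, then use its
    ambiguity-fixed range to tighten the WL float ambiguity, then WL for
    NL. obs[c] = (phase range in m, coarse code range in m or None,
    sigma of the float ambiguity in cycles)."""
    fixed, rng_est = {}, None
    for comb in ("EWL", "WL", "NL"):
        phase_m, coarse_m, sigma = obs[comb]
        ref = coarse_m if rng_est is None else rng_est
        n_float = (ref - phase_m) / LAM[comb]    # float ambiguity, cycles
        n_int, ok = round_and_check(n_float, sigma)
        if not ok:
            return fixed, False                  # defer the rest
        fixed[comb] = n_int
        rng_est = phase_m + n_int * LAM[comb]    # ambiguity-fixed range
    return fixed, True

obs = {  # fabricated numbers consistent with a ~20,000 km range
    "EWL": (19_999_985.367, 20_000_002.60, 0.05),
    "WL":  (19_999_993.468, None,          0.05),
    "NL":  (20_000_002.194, None,          0.04),
}
print(cascade_fix(obs))   # -> ({'EWL': 3, 'WL': 11, 'NL': 7}, True)
```

A combination failing the threshold test would, in the MPAR scheme, be deferred to the LAMBDA step that is assisted by the already-fixed optimal subset.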
Wireless Communication and Internet of Things
Research on Signal Detection of Adaptive O-OFDM Symbol Decomposition in Rough Set Information System
JIA Kejun, CHE Jiaqi, LIU Jiaxin, XIAN Yuqin, QIN Cuicui, YANG Boran
Available online  , doi: 10.11999/JEIT240864
Abstract:
  Objective  Adaptive Optical Orthogonal Frequency Division Multiplexing Symbol Decomposition with Serial Transmission (O-OFDM-ASDST) effectively suppresses the nonlinear clipping distortion of Light-Emitting Diodes (LEDs) in Visible Light Communication (VLC). However, incorporating decomposition symbols into the O-OFDM-ASDST system introduces Additive White Gaussian Noise (AWGN), which degrades the Bit Error Rate (BER). To address this issue, this study proposes an O-OFDM-ASDST signal detection algorithm based on the Rough Set Theory (RST) information system and the indiscernibility relation of granular computing in artificial intelligence.  Methods  An O-OFDM-ASDST signal detection algorithm is proposed based on the RST information system and the indiscernibility relation of granular computing in artificial intelligence. The algorithm consists of two stages: RST preprocessing and attribute reduction reconstruction. In the first stage, the RST information system is constructed by using preprocessed time-domain sampled values as the universe of discourse. The signal characteristics of these sampled values are converted into symbolic attributes, serving as the conditional attributes of the RST information system, while the upper and lower amplitude thresholds are designated as decision attributes. The RST attribute dependency formula, combined with an attribute importance-based addition and deletion method, is applied to establish decision rules and classify the information system. In the second stage, the indiscernibility relation is derived from the decision rules, and attribute reduction is performed on the constructed information system. This reduction process is applied to the time-domain sampled values within the upper and lower thresholds, followed by reconstruction.  Results and Discussions  The performance of the proposed detection algorithm is verified using the Monte Carlo simulation method. The results demonstrate that this algorithm effectively suppresses AWGN at the O-OFDM-ASDST receiver, enhances BER performance, and significantly reduces computational complexity and processing delay. For instance, when the PhotoDetector (PD) is positioned at the center of the room [3, 3, 0.85], the ACO-OFDM-ASDST system achieves Signal-to-Noise Ratio (SNR) gains of approximately 1 dB and 1.2 dB under 4QAM and 16QAM modulation, respectively, at a BER of 10–5. The DCO-OFDM-ASDST system achieves SNR gains of approximately 1 dB and 2 dB under the same conditions (Fig. 7). Similarly, when the PD is located at the edge of the room [0.5, 0.5, 0.85], the ACO-OFDM-ASDST system achieves SNR gains of approximately 0.8 dB and 1.1 dB for 4QAM and 16QAM, respectively, at a BER of 10–5, while the DCO-OFDM-ASDST system achieves SNR gains of approximately 2.5 dB and 3.2 dB (Fig. 8). The proposed detection algorithm also maintains favorable BER performance under different DC bias levels. For example, under 16QAM modulation with DC bias values of 0.3 V, 0.4 V, and 0.6 V, the DCO-OFDM-ASDST system achieves SNR gains of approximately 2.2 dB, 2 dB, and 0.8 dB, respectively, at a BER of 10–5 (Fig. 9). Furthermore, the complexity of the proposed detection algorithm in the ACO-OFDM-ASDST system is only one-tenth that of the comparison signal detection algorithm (Fig. 12). As the number of symbol decompositions increases, the proposed algorithm requires fewer computing resources compared to the comparison detection algorithm.
For instance, in the ACO-OFDM-ASDST system with 4QAM, 16QAM, and 64QAM modulation, when the number of symbol decompositions is 4, the comparison detection algorithm requires 4096 units of computational resources, whereas the proposed detection algorithm requires 408, 600, and 736, respectively, reducing computational resource consumption to approximately 1/10, 1/7, and 1/6 of the comparison algorithm's (Fig. 13). Additionally, the proposed detection algorithm exhibits lower processing latency.  Conclusions  The O-OFDM-ASDST signal detection algorithm is implemented using the RST information system and the indiscernibility relation, effectively suppressing AWGN in decomposed symbols. The simulation results confirm the effectiveness of the proposed algorithm, demonstrating superior BER performance compared to other signal detection methods. Notably, high BER performance is maintained even at the room’s edge, highlighting the algorithm’s reliability, coverage, and robustness. Additionally, the proposed algorithm exhibits low complexity and reduced processing delay. It not only mitigates LED nonlinear distortion but also effectively suppresses AWGN in decomposition symbols, thereby enhancing BER transmission performance and improving the overall O-OFDM system performance.
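The rough-set machinery of the first stage, an information system, indiscernibility classes, attribute dependency, and reduction, can be illustrated compactly. The Python sketch below uses a fabricated symbolic table: the attribute symbols and the amplitude-threshold decision column are stand-ins rather than the paper's actual symbolization. It computes the dependency of the decision on the conditional attributes and greedily deletes attributes that do not reduce it, in the spirit of the attribute importance-based addition and deletion method described above.

```python
def ind_classes(table, attrs):
    """Partition object indices into indiscernibility classes under the
    attribute subset attrs (rows equal on attrs are merged)."""
    classes = {}
    for i, row in enumerate(table):
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(i)
    return list(classes.values())

def dependency(table, cond_attrs, decision):
    """Rough-set dependency gamma(C, D): the fraction of objects whose
    condition class lies entirely within one decision class."""
    dec = ind_classes(table, [decision])
    pos = sum(len(c) for c in ind_classes(table, cond_attrs)
              if any(c <= d for d in dec))
    return pos / len(table)

def reduce_attrs(table, cond_attrs, decision):
    """Greedy deletion: drop any conditional attribute whose removal
    does not lower the dependency of the decision on the conditions."""
    gamma_full = dependency(table, cond_attrs, decision)
    kept = list(cond_attrs)
    for a in list(cond_attrs):
        trial = [x for x in kept if x != a]
        if trial and dependency(table, trial, decision) >= gamma_full:
            kept = trial
    return kept

# Each row: symbolic attributes of one time-domain sample (columns 0-1)
# and a decision column (1 = within the amplitude thresholds).
table = [
    ("low",  "rising",  0), ("low",  "falling", 0),
    ("mid",  "rising",  1), ("mid",  "falling", 1),
    ("high", "rising",  0), ("high", "falling", 0),
]
print(reduce_attrs(table, [0, 1], 2))   # -> [0]: slope attribute is redundant
```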
Regularized Neural Network-Based Normalized Min-Sum Decoding for LDPC Codes
ZHOU Hua, ZHOU Ming, ZHANG Likang
Available online  , doi: 10.11999/JEIT240860
Abstract:
  Objective  The application of deep learning in communications has demonstrated significant potential, particularly in Low-Density Parity-Check (LDPC) code decoding. As a rapidly evolving branch of artificial intelligence, deep learning effectively addresses complex optimization problems, making it suitable for enhancing traditional decoding techniques in modern communication systems. The Neural-network Normalized Min-Sum (NNMS) algorithm has shown improved performance over the Min-Sum (MS) algorithm by incorporating trainable neural network models. However, NNMS decoding assigns independent training weights to each edge in the Tanner graph, leading to excessive training complexity and high storage overhead due to the large number of weight parameters. This significantly increases computational demands, posing challenges for implementation in resource-limited hardware. Moreover, the excessive number of weights leads to overfitting, where the model memorizes training data rather than learning generalizable features, degrading decoding performance on unseen codewords. This issue limits the practical applicability of NNMS-based decoders and necessitates advanced regularization techniques. Therefore, this study explores methods to reduce NNMS decoding complexity, mitigate overfitting, and enhance the decoding performance of LDPC codes.  Methods  Building on the traditional NNMS decoding algorithm, this paper proposes two partial weight-sharing models: VC-SNNMS (sharing weights for edges from variable nodes to check nodes) and CV-SNNMS (sharing weights for edges from check nodes to variable nodes). These models apply a weight-sharing strategy to specific edge types in the bipartite graph, reducing the number of training weights and computational complexity. To mitigate neural network overfitting caused by the high complexity of NNMS and its variants, a regularization technique is proposed. This leads to the development of the Regularized NNMS (RNNMS), Regularized VC-SNNMS (RVC-SNNMS), and Regularized CV-SNNMS (RCV-SNNMS) algorithms. Regularization refines network parameters by modifying the loss function and gradients, penalizing excessively large weights or redundant features. By reducing model complexity, this approach enhances the generalization ability of the decoding neural network, ensuring robust performance on both training and test data.  Results and Discussions  To evaluate the effectiveness of the proposed algorithms, extensive simulations are conducted under various Signal-to-Noise Ratio (SNR) conditions. The performance is assessed in terms of Bit Error Rate (BER), decoding complexity, and convergence speed. Additionally, a comparative analysis of NNMS, SNNMS, VC-SNNMS, CV-SNNMS, and their regularized variants systematically examines the effects of weight-sharing and regularization on neural network-based decoding. Simulation results show that for an LDPC code with a block length of 576 and a code rate of 0.75, when BER = 10–6, the RNNMS, RVC-SNNMS, and RCV-SNNMS algorithms achieve SNR gains of 0.18 dB, 0.22 dB, and 0.27 dB, respectively, compared to their corresponding NNMS, VC-SNNMS, and CV-SNNMS algorithms. Notably, the RVC-SNNMS algorithm demonstrates the best performance, with SNR gains of 0.55 dB, 0.51 dB, and 0.22 dB compared to the Belief Propagation (BP), NNMS, and SNNMS algorithms, respectively (Fig. 3). Furthermore, under different numbers of decoding iterations, the RVC-SNNMS algorithm consistently outperforms the others in BER performance.
Specifically, at BER = 10–6 with 15 decoding iterations, it achieves SNR gains of 0.57 dB and 0.1 dB compared to the NNMS and SNNMS algorithms, respectively (Fig. 4). Similarly, for an LDPC code with a block length of 1056, when BER = 10–5 and 10 decoding iterations are used, the RVC-SNNMS algorithm attains SNR gains of 0.34 dB and 0.08 dB compared to the NNMS and SNNMS algorithms, respectively (Fig. 5).  Conclusions  This study investigates the performance of NNMS and SNNMS for LDPC code decoding and proposes two partial weight-sharing algorithms, VC-SNNMS and CV-SNNMS. Simulation results show that weight-sharing strategies effectively reduce training complexity while maintaining competitive BER performance. To address the overfitting issue associated with the high complexity of NNMS-based algorithms, regularization is incorporated, leading to the development of RNNMS, RVC-SNNMS, and RCV-SNNMS. Regularization effectively mitigates overfitting, enhances network generalization, and improves error-correcting performance for various LDPC codes. Simulation results indicate that the RVC-SNNMS algorithm achieves the best decoding performance due to its reduced complexity and the improved generalization provided by regularization.
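A single check-node update of the weighted min-sum rule, together with the regularized loss, can be sketched in a few lines. The Python fragment below is a minimal illustration: the shared weight w_cv mimics the CV-SNNMS sharing pattern (one weight for all check-to-variable edges rather than one per edge), and the L2 penalty coefficient lam is an assumed hyperparameter, not a value from the paper.

```python
import numpy as np

def check_node_update(v2c, w_cv):
    """Weighted (normalized) min-sum update at one check node.
    v2c: incoming variable-to-check messages along its edges. Each
    extrinsic output is the sign product and minimum magnitude of the
    other inputs, scaled by the single shared weight w_cv."""
    v2c = np.asarray(v2c, dtype=float)
    out = np.empty_like(v2c)
    for k in range(len(v2c)):
        rest = np.delete(v2c, k)
        out[k] = w_cv * np.prod(np.sign(rest)) * np.min(np.abs(rest))
    return out

def regularized_loss(bce, weights, lam=1e-3):
    """Cross-entropy loss plus an L2 penalty on the trainable weights;
    the gradient gains a 2*lam*w term that shrinks the oversized
    weights associated with overfitting."""
    return bce + lam * float(np.sum(np.square(weights)))

# One degree-4 check node with a shared weight of 0.8 (illustrative):
print(check_node_update([-1.2, 0.7, 2.5, -0.4], w_cv=0.8))
# -> [-0.32  0.32  0.32 -0.56]
```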
Image and Intelligent Information Processing
Model and Data Dual-driven Joint Limited-Angle CT Reconstruction and Metal Artifact Reduction Method
SHI Baoshun, CHENG Shizhan, JIANG Ke, FU Zhaoran
Available online  , doi: 10.11999/JEIT240703
Abstract:
  Objective  Computed Tomography (CT) is widely used in medical diagnostics due to its non-destructive, non-contact imaging capabilities. To lower cancer risk from radiation exposure, clinical practice often limits the scanning angle, a configuration referred to as Limited-Angle CT (LACT). Incomplete projection data in LACT leads to wedge-shaped artifacts in reconstructions using Filtered Back-Projection (FBP) algorithms. These artifacts worsen in the presence of metallic implants. Although LACT reconstruction without metal and full-angle CT Metal Artifact Reduction (MAR) have been extensively studied, the joint task of Limited-Angle and Metal Artifact Reduction (LAMAR) has received limited attention. This study proposes a model- and data-driven CT network that integrates a Task Selection (TS) module to apply appropriate gradient descent steps for different tasks. This enables simultaneous processing of LACT and LAMAR. The network also incorporates dual-domain information interaction during alternating iterations to reconstruct high-quality CT images.  Methods  First, a dual-domain reconstruction model integrating model-driven and data-driven components is constructed to address the joint task of LACT reconstruction and LAMAR. The model comprises four components: an image-domain data fidelity term, a projection-domain data fidelity term, an image-domain regularization term, and a projection-domain regularization term. These terms are solved using an alternating iteration strategy. The image- and projection-domain subproblems are addressed using the proximal gradient descent algorithm, with the iterative process unrolled into a Deep Neural Network (DNN). Each stage of the deep unrolling network includes three components: a TS module, a projection-domain subnetwork, and an image-domain subnetwork. The TS module dynamically determines gradient descent step sizes for the LACT and LAMAR tasks by comparing image-domain FBP reconstruction results with predefined thresholds. The projection-domain subnetwork is shared by both tasks. Finally, the data-driven proximal network comprises the projection-domain and image-domain subnetworks. The projection-domain subnetwork includes an encoder, a dual-branch structure, and a decoder. The encoder has two stages, each consisting of a convolutional layer followed by an activation function; the decoder mirrors this architecture. A Transformer-based long-range branch incorporates non-metal trace information into a self-attention mechanism to guide correction of metal trace data using contextual information from non-metal regions. A short-range branch, composed of six residual blocks, extracts local features. The outputs of the two branches are fused using a weighted strategy before being passed to the decoder. The image-domain subnetwork is implemented as an attention-based U-Net. Channel and spatial attention mechanisms are applied before each of the four downsampling operations in the U-Net encoder. This design allows the decoder to more effectively leverage encoded information for high-quality CT image reconstruction without increasing the number of network parameters.  Results and Discussions  Experimental results on both LACT reconstruction and LAMAR tasks show that the proposed method outperforms existing CT reconstruction algorithms in both qualitative and quantitative evaluations.
Quantitative comparisons (Table 1) indicate that the proposed method achieves higher average Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), and lower Root Mean Square Error (RMSE), for both tasks across three angular ranges. Specifically, average PSNR improvements over the best-performing baseline methods reach 2.78 dB, 2.88 dB, and 2.32 dB across the three angular ranges. Qualitative comparisons (Fig. 4 and Fig. 5) show that reconstructing CT images and projection data through alternating iterations, combined with dual-domain information interaction, enables the network to effectively suppress composite artifacts and improve the reconstruction of soft tissue regions and fine structural details. These results consistently exceed those of existing approaches. Visual assessment of reconstruction performance on the clinical dataset for the LAMAR task (Fig. 6) further demonstrates the method’s effectiveness in reducing metal artifacts around implants. The reconstructed images exhibit clearer structural boundaries and improved tissue visibility, indicating strong generalization to previously unseen clinical data.  Conclusions  To address the combined task of LACT reconstruction and LAMAR, this study proposes a dual-domain, model- and data-driven reconstruction framework. The optimization problem is solved using an alternating iteration strategy and unfolded into a model-driven CT reconstruction network, with each subnetwork trained in a data-driven manner. In the projection-domain network, a TS module identifies the presence of metallic implants in the initial CT estimates, allowing a single model to simultaneously handle cases with and without metal. A trace-aware projection-domain proximal subnetwork, integrating Transformer and convolutional neural network architectures, is designed to capture both local and non-local contextual features for restoring metal-corrupted regions. In the image-domain network, a U-Net architecture enhanced with channel and spatial attention mechanisms is used to maximize spatial feature utilization and improve reconstruction quality. Experimental results on the AAPM and DeepLesion datasets confirm that the proposed method consistently outperforms existing algorithms under various limited-angle conditions and in the presence of metal artifacts. Further evaluation on the SpineWeb dataset demonstrates the network’s generalization capability across clinical scenarios.
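One stage of the unrolled alternating iteration can be sketched as two gradient steps followed by learned proximal mappings, with a task-dependent step size. In the Python fragment below, the forward projector, its adjoint, and the proximal subnetworks are passed in as callables (identity stubs in the toy usage), and the metal-detection gate is a simple intensity threshold standing in for the TS module's actual rule; all of these are assumptions for illustration.

```python
import numpy as np

def unrolled_stage(x, s, y, A, At, prox_img, prox_proj,
                   steps=(0.5, 1.0), metal_thresh=3000.0):
    """One stage of a dual-domain unrolled reconstruction (sketch).
    x: current image estimate; s: current sinogram estimate;
    y: measured (possibly metal-corrupted) projections.
    A / At: forward projector and its adjoint, passed as callables.
    prox_img / prox_proj: learned proximal mappings (subnetworks),
    stubbed here as callables. The TS-style gate below picks the
    gradient step size per task from a simple intensity test on the
    current image (an illustrative stand-in for the module's rule)."""
    mu = steps[1] if float(np.max(x)) > metal_thresh else steps[0]
    # Projection-domain gradient step on ||s - y||^2, then prox.
    s = prox_proj(s - mu * (s - y))
    # Image-domain gradient step on ||A(x) - s||^2, then prox.
    x = prox_img(x - mu * At(A(x) - s))
    return x, s

# Toy usage with identity operators and identity "proximal" nets:
ident = lambda z: z
x, s = np.zeros((4, 4)), np.ones((4, 4))
for _ in range(3):                      # three unrolled stages
    x, s = unrolled_stage(x, s, np.ones((4, 4)), ident, ident, ident, ident)
```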
Satellite Navigation
Research on GRI Combination Design of eLORAN System
LIU Shiyao, ZHANG Shougang, HUA Yu
Available online  , doi: 10.11999/JEIT201066
Abstract:
To solve the problem of Group Repetition Interval (GRI) selection in the construction of supplementary transmitting stations for the enhanced LORAN (eLORAN) system, a screening algorithm based on Cross-Rate Interference (CRI) is proposed from a mathematical point of view. First, the method considers the requirement of conveying second (timing) information and, on this basis, conducts a first screening by comparing the mutual CRI with adjacent Loran-C stations in neighboring countries. Second, a further screening is conducted through permutation and pairwise comparison, and the optimal GRI combination schemes are obtained by taking the data-rate requirements and system specifications into account. Finally, in view of the high-precision timing requirements of the new eLORAN system, an optimized selection is made among the multiple optimal combinations. The analysis results show that the average interference rate of the optimal combination scheme obtained by this algorithm is comparable to that between the current navigation chains while satisfying the timing requirements, providing reference suggestions and a theoretical basis for the construction of a high-precision ground-based timing system.
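The two-stage screening can be illustrated with a toy interference metric. In the Python sketch below, the CRI between two chains is approximated as the fraction of one chain's pulse groups that fall within a small tolerance of the other's over one common period; the GRI designators, tolerance, and interference limit are fabricated for demonstration and do not correspond to actual chain assignments or to the paper's exact formula.

```python
from itertools import combinations
from math import gcd

def cri(g1, g2, tol=3):
    """Toy cross-rate interference between two chains with group
    repetition intervals g1, g2 (in units of 10 us): the fraction of
    chain-1 pulse groups falling within tol units of a chain-2 pulse
    group over one common period. A stand-in for the paper's metric."""
    period = g1 * g2 // gcd(g1, g2)
    hits = sum(min(t % g2, g2 - t % g2) <= tol
               for t in range(0, period, g1))
    return hits / (period // g1)

def screen(candidates, existing, limit):
    """Two-stage screening: first drop candidate GRIs that interfere
    with the existing Loran-C chains beyond the limit, then rank the
    surviving pairs by their worst mutual CRI (the permutation and
    pairwise-comparison stage)."""
    stage1 = [g for g in candidates
              if all(cri(g, e) <= limit for e in existing)]
    return sorted(combinations(stage1, 2),
                  key=lambda p: max(cri(*p), cri(*reversed(p))))

# Fabricated designators (x10 us), not actual chain assignments:
print(screen([5970, 6000, 6731, 7430, 8390],
             existing=[4990, 5430], limit=0.01)[:3])
```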