Articles in press have been peer-reviewed and accepted. They are not yet assigned to volumes or issues but are citable by Digital Object Identifier (DOI).
Evaluation of DeepION model based on SPP Navigation Positioning During Active Solar Condition
WANG Zitong, FU Haiyang, JIANG Zhuojun, CAI Dijia
 doi: 10.11999/JEIT250662
Abstract:
  Objective  Accurate characterization of ionospheric variability is a critical prerequisite for reliable Global Navigation Satellite System (GNSS) positioning, especially during geomagnetic storms when rapid and highly structured disturbances occur. Existing empirical and physics-based ionospheric models often struggle to represent storm-time ionospheric dynamics and small-scale irregularities in real time. This study aims to develop a unified data-driven ionospheric modeling framework that takes GNSS-derived Slant Total Electron Content (STEC) time series (estimated from GNSS observations) as input and learns the spatiotemporal mappings to key ionospheric parameters, including STEC, Vertical Total Electron Content (VTEC), and the Rate of TEC Index (ROTI). By leveraging deep operator learning, the proposed framework seeks to enhance short-term ionospheric modeling and forecasting capability under disturbed conditions and to provide more reliable ionospheric corrections for single-frequency GNSS positioning.  Methods  This study proposes a unified data-driven ionospheric modeling framework, named DeepION, based on the Deep Operator Network (DeepONet) architecture. The framework takes STEC time series as the primary input, and learns nonlinear spatiotemporal mappings to key ionospheric parameters. Specifically, DeepION enables modeling and prediction of STEC and VTEC, while ROTI is subsequently derived from the predicted STEC series. In the network design, a convolutional neural network (CNN) is employed as the branch network to extract spatiotemporal features from historical STEC time series. The trunk network consists of a multi-layer fully connected architecture with periodic time encoding, whose inputs include GNSS observation geometry and temporal information, enabling the model to capture the continuous temporal dynamics of ionospheric behavior. 
During data preprocessing, a VTEC-based modeling strategy is first applied to estimate and remove receiver Differential Code Biases (DCB), thereby obtaining high-quality STEC observations. The model is then trained and validated using STEC observations from the May 2024 geomagnetic storm. The model outputs include ray-path STEC values, gridded VTEC fields, and derived ROTI time series. Furthermore, the proposed framework is evaluated by incorporating the model-derived VTEC corrections into GNSS Single Point Positioning (SPP) experiments. The modeled and observed ionospheric parameters are compared under both geomagnetically quiet and disturbed conditions to comprehensively assess the modeling accuracy and practical performance of DeepION.  Results and Discussions  The experimental results demonstrate that the proposed DeepION model can robustly characterize ionospheric spatiotemporal variability under different space weather conditions, capturing both large-scale structures and small-scale disturbances during geomagnetic storms. For STEC forecasting, the model achieves a Root Mean Square Error (RMSE) of 12.8 TECU over a 3-day prediction horizon, maintaining high consistency with observed GNSS measurements (Fig. 4). Moreover, the model effectively predicts ionospheric irregularities, as shown by the close match between predicted and observed ROTI time series at the mid-latitude station NVSK (Fig. 5). For VTEC modeling, DeepION-generated global VTEC maps accurately reproduce equatorial anomalies and storm-enhanced density regions, closely matching the CODE-SH benchmark while outperforming empirical models such as Klobuchar and NeQuick in both spatial resolution and structural fidelity (Fig. 6). 
Further analysis of ray-path level performance shows that STEC derived from DeepION-based VTEC mapping yields the lowest residual errors at the mid-to-high latitude station NLIB, achieving an RMSE of 6.80 TECU, outperforming Klobuchar, NeQuick, and slightly improving upon CODE-SH (Fig. 7). In GNSS positioning applications, SPP results indicate that DeepION-derived ionospheric corrections consistently reduce positioning errors at both CUSV and NLIB stations, particularly in the vertical and geometric components during storm-time conditions, demonstrating enhanced robustness under intensified geomagnetic disturbances (Fig. 8, Fig. 9).  Conclusions  This study presents DeepION, a data-driven ionospheric modeling framework based on the Deep Operator Network architecture, which learns spatiotemporal relationships between GNSS-derived STEC observations and key ionospheric parameters. With a CNN-based branch network and a periodically encoded trunk network, DeepION models and predicts STEC and VTEC, and then derives ROTI from the predicted STEC series. Experiments using global GNSS data during the May 2024 geomagnetic storm show that DeepION can capture storm-time ionospheric variability and achieves stable performance in STEC forecasting and global VTEC reconstruction. Compared with conventional empirical and physics-based models, DeepION provides improved modeling accuracy and spatial representation. Furthermore, GNSS Single Point Positioning experiments indicate that ionospheric corrections derived from DeepION lead to reduced positioning errors at both mid- and high-latitude stations, particularly in the vertical and geometric components under disturbed geomagnetic conditions. These results highlight the practical value of DeepION for GNSS ionospheric correction during space weather events. 
Overall, DeepION offers a scalable framework for data-driven ionospheric modeling, and future work will extend it to multi-GNSS constellations, longer prediction lead times, and additional ionospheric observations.
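As an illustration of how ROTI can be derived from a STEC series, the following minimal Python sketch uses the common definition of ROTI as the standard deviation of the rate of TEC (ROT) over a sliding window, together with a single-layer (thin-shell) obliquity factor relating slant and vertical TEC. The function names, the 30 s sampling interval, the 10-sample window, and the 450 km shell height are illustrative assumptions, not values taken from the paper, and the sketch says nothing about how DeepION itself produces its STEC predictions.

```python
import math
from statistics import pstdev

EARTH_RADIUS_KM = 6371.0


def rot(stec, dt_seconds):
    """Rate of TEC (TECU/min) between consecutive epochs of one satellite arc."""
    minutes = dt_seconds / 60.0
    return [(b - a) / minutes for a, b in zip(stec, stec[1:])]


def roti(stec, dt_seconds=30.0, window=10):
    """ROTI: population standard deviation of ROT over a sliding window."""
    r = rot(stec, dt_seconds)
    return [pstdev(r[i:i + window]) for i in range(len(r) - window + 1)]


def mapping_factor(elevation_deg, shell_height_km=450.0):
    """Thin-shell obliquity factor M, so that STEC is approximately M * VTEC."""
    ratio = (EARTH_RADIUS_KM * math.cos(math.radians(elevation_deg))
             / (EARTH_RADIUS_KM + shell_height_km))
    return 1.0 / math.sqrt(1.0 - ratio ** 2)


if __name__ == "__main__":
    # Synthetic series (hypothetical data): a smooth ramp versus a jittery one.
    smooth = [0.5 * i for i in range(20)]
    jittery = [0.5 * i + (1.0 if i % 2 else 0.0) for i in range(20)]
    print("max ROTI, smooth series :", max(roti(smooth)))
    print("max ROTI, jittery series:", max(roti(jittery)))
    print("obliquity factor at 30 deg elevation:", mapping_factor(30.0))
```

A smooth STEC ramp gives a constant ROT and therefore a ROTI of zero, while rapid epoch-to-epoch fluctuations raise ROTI, which is why the index is used as an indicator of small-scale irregularities.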
A Point Cloud Slice-based UAV SLAM for 3D Reconstruction of Large Container Port Areas
HU Zhaozheng, ZUO Zhihang, XU Cong, TAO Qianwen, LIU Chao, MENG Jie
 doi: 10.11999/JEIT251112
Abstract:
  Objective  With the continuous advancement of port intelligence, the demand for digital management in container port areas is increasingly growing. In large container yard scenarios, 3D reconstruction of the yard environment can be achieved by utilizing drone Simultaneous Localization and Mapping (SLAM) technology. However, container port areas contain an abundance of repetitive semantic structural information, where traditional semantic matching methods suffer from low efficiency and poor accuracy. Furthermore, during the 3D reconstruction process conducted by drones over container port areas, the lanes between yards present large feature-sparse regions, which can easily lead to odometry degradation. Additionally, the extensive presence of repetitive scene features also interferes with loop closure detection. To address these issues, this paper proposes a slicing method for rapid feature extraction, which is further optimized based on the characteristics of the container yard scenario. A UAV point cloud slicing SLAM method tailored for large-scale container port 3D reconstruction is introduced, enabling high-precision 3D reconstruction.  Methods  To address point cloud semantic extraction, this paper proposes a point cloud slicing method for rapid feature extraction, which quickly extracts the principal direction and divides the point cloud into multiple layers to efficiently obtain multi-layer semantic point clouds. The slicing method is further optimized based on the characteristics of the container yard scenario: the principal plane extraction is simplified using the direction of gravity, and the elevation range of each container layer is adaptively obtained through point cloud gradient changes to construct multi-layer sliced point clouds. 
Subsequently, a progressive adaptive LiDAR odometry based on sliced point clouds is constructed, which adaptively identifies degraded scenarios using elevation slices and employs an incremental iterative strategy for inter-layer slice fusion matching, thereby improving the accuracy, efficiency, and stability of the LiDAR odometry. In addition, a factor graph optimization method that fuses information from sliced point clouds is designed. By performing fusion voting on the matching results of multi-layer sliced point clouds, erroneous results are filtered out and the impact of repetitive structures on loop closure detection is reduced; slice factors are then used to construct factor graph edges, enhancing global optimization and achieving efficient and stable 3D reconstruction.  Results and Discussions  The feasibility and effectiveness of the proposed method are verified through testing in Carla simulations and real-world scenarios at a large container port in Wuhan. Results are as follows: First, through comparative analysis with three algorithms—RANSAC, Region Growth, and 3DG_SEG—the efficiency and accuracy of the proposed semantic extraction algorithm are demonstrated. Furthermore, by comparing mapping trajectories with two renowned open-source LiDAR algorithms, FAST-LIO2 and Faster-LIO, the superiority of the proposed odometry method is proven. Finally, comparisons of speed and confidence level are conducted with six algorithms: ICP, NDT, GICP, Fast_GICP, Scan Context+ICP, and Quatro. Simultaneously, the loop closure detection module from LIO-SAM is integrated into FAST-LIO2, and the Scan Context module into Faster-LIO. The mapping trajectories are then compared with that of the proposed algorithm, validating the effectiveness of the proposed loop closure detection algorithm. The proposed method achieves high 3D reconstruction accuracy; therefore, it is suitable for practical application in operational processes.  
Conclusions  The proposed method uses an efficient point cloud slicing technique and a multi-layer slice matching mechanism. Points within the same elevation range form a slice point cloud (Slice), and the segmentation process is called slice generation. This enables efficient and robust 3D reconstruction in large-scale scenes with repetitive features. First, the LiDAR point cloud is aligned to the Z-axis using IMU-derived gravity direction. A sliding window records density gradient changes to adaptively determine each layer’s elevation range. This simplifies slicing and reduces the impact of non-standard containers or ground height variations on semantic extraction. Multi-layer slice data are then integrated into the odometry module to detect degenerate scenarios. Under normal conditions, progressive slice matching initializes pose estimation; otherwise, IMU-based iterative Kalman filtering is used. Finally, fusion voting removes outliers from multi-layer slice matching results. The best match initializes loop closure for global container point cloud registration, enabling dual-stage loop closure detection and slice factor construction. Integrating slice point cloud information into factor graph optimization unifies coordinates and achieves efficient, robust 3D reconstruction.
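The slice-generation idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: it assumes the point cloud is already gravity-aligned so that z is elevation, builds a histogram of point counts per elevation bin, and starts a new slice wherever the count rises sharply from one bin to the next. The function name `slice_by_elevation` and the parameters `bin_size` and `min_jump` are hypothetical.

```python
from collections import Counter


def slice_by_elevation(points, bin_size=0.5, min_jump=3):
    """Group gravity-aligned (x, y, z) points into elevation slices.

    A new slice starts at every elevation bin whose point count exceeds
    the count of the bin below it by at least `min_jump`, a crude
    stand-in for detecting density-gradient changes.
    """
    z_min = min(p[2] for p in points)

    def bin_of(z):
        return int((z - z_min) // bin_size)

    counts = Counter(bin_of(p[2]) for p in points)
    n_bins = max(counts) + 1
    # Bins where the density jumps upward mark the start of a new layer.
    starts = [b for b in range(1, n_bins)
              if counts[b] - counts[b - 1] >= min_jump]

    slices = {}
    for p in points:
        layer = sum(1 for s in starts if s <= bin_of(p[2]))
        slices.setdefault(layer, []).append(p)
    return slices


if __name__ == "__main__":
    # Synthetic scene (hypothetical): ground plus two container-top layers.
    ground = [(i, 0.0, 0.02 * i) for i in range(5)]
    first = [(i, 0.0, 2.6 + 0.02 * i) for i in range(5)]
    second = [(i, 0.0, 5.2 + 0.02 * i) for i in range(5)]
    for layer, pts in sorted(slice_by_elevation(ground + first + second).items()):
        print(layer, len(pts))
```

Because the boundaries come from the data rather than from a fixed layer height, the same routine tolerates moderate changes in ground level, which is the motivation the abstract gives for adaptive elevation ranges.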
Secure and Covert MIMO Short Packet Communications with Location-Uncertain Malicious Nodes
TIAN Bo, YANG Weiwei, YANG Xiaoqin, BAI Mengmeng
 doi: 10.11999/JEIT260059
Abstract:
  Objective  This paper investigates secure and covert short packet communication in multi-input multi-output (MIMO) wireless systems with location-uncertain malicious nodes over quasi-static Rician fading channels. In the considered scenario, a legitimate transmitter sends confidential short packets to a legitimate receiver, while multiple monitoring nodes attempt to detect whether the transmission exists and multiple eavesdropping nodes attempt to intercept the confidential information. Since malicious nodes may remain silent and their exact positions are unavailable to the legitimate system, their spatial uncertainty brings significant challenges to joint covertness and secrecy analysis. To address this problem, this paper establishes a unified analytical and optimization framework for secure covert short packet transmission, aiming to characterize the coupling relationship among covertness, secrecy, and reliability, and to improve the average effective secrecy and covert rate (AESCR).  Methods  The transmitter adopts singular value decomposition (SVD)-based precoding, and the legitimate receiver applies maximum ratio combining (MRC) to enhance the legitimate link. The monitoring nodes and eavesdropping nodes are modeled as two independent Poisson point processes (PPPs) outside a circular protection zone centered at the transmitter, which captures the spatial randomness of malicious nodes. For covertness analysis, each monitoring node is assumed to perform optimal likelihood ratio detection with full knowledge of the system model, noise power, channel state, and codebook information. By using the Chernoff bound and the Bhattacharyya coefficient, a theoretical lower bound on the minimum detection error probability of a single monitoring node is first derived. Then, by combining stochastic geometry with the distribution of the strongest monitoring node, a tractable lower bound on the average minimum detection error probability is obtained. 
For secrecy analysis, the finite blocklength normal approximation is used to account for both decoding error and information leakage penalties. The legitimate channel is statistically characterized according to the Rician fading condition, while the strongest eavesdropping node is analyzed through stochastic geometry. Based on these results, an approximate analytical expression for the average secrecy rate is derived. Furthermore, AESCR is introduced as a comprehensive performance metric that jointly reflects reliability, secrecy, and covertness. Under the average covert constraint and the short packet length constraint, a joint optimization problem for transmit power and packet length is formulated. By exploiting the monotonic properties of the objective function and the covert constraint, the original coupled optimization problem is transformed into a one-dimensional search problem.  Results and Discussions  Simulation results verify the accuracy of the theoretical derivations and reveal the influence of key system parameters. Both the simulated average minimum detection error probability and its theoretical lower bound decrease as the packet length increases, and higher transmit power further reduces the detection error probability, indicating that excessive power makes the transmission more exposed to monitoring nodes (Fig. 2). Increasing the number of monitoring-node antennas strengthens spatial reception capability and further degrades covertness (Fig. 2). Enlarging the protection zone improves covertness because malicious nodes are forced to remain farther away from the transmitter, whereas increasing the monitoring-node density weakens this benefit by raising the probability that a strong monitoring node appears near the protection-zone boundary (Fig. 3). 
The average secrecy rate increases with packet length and gradually approaches the asymptotic secrecy-capacity upper bound, because the finite blocklength rate penalty becomes smaller when the packet length grows (Fig. 4). The proposed AESCR first increases and then decreases with packet length, confirming the existence of an optimal packet length; this phenomenon results from the tradeoff between reduced finite-blocklength penalty and increased detection exposure (Fig. 5). Larger malicious-node density and more malicious-node antennas reduce system performance, since they enhance both monitoring and eavesdropping capabilities (Fig. 5). Relaxing the covert constraint improves the achievable AESCR, because the system can select a higher transmit power or a more favorable packet length (Fig. 6). The results under different Rician factors also show that the proposed analytical framework is applicable to both Rician and Rayleigh fading conditions (Fig. 6). Increasing the number of legitimate receive antennas improves AESCR, and a larger transmit antenna array brings additional SVD precoding gain (Fig. 7). Compared with benchmark schemes, the proposed joint optimization of transmit power and packet length consistently outperforms the scheme with fixed packet length and power-only optimization, demonstrating the necessity of jointly balancing reliability, secrecy, and covertness in MIMO short packet transmission (Fig. 8).  Conclusions  This paper develops a stochastic-geometry-based analytical framework for MIMO secure covert short packet communication with location-uncertain multi-antenna malicious nodes. By deriving a lower bound on the average minimum detection error probability, obtaining an approximate analytical expression for the average secrecy rate, and introducing AESCR, the proposed framework reveals the fundamental tradeoff among covertness, secrecy, and reliability under finite blocklength transmission. 
The results show that increasing the number of legitimate transmit and receive antennas improves secure covert performance, whereas higher malicious-node density and more malicious-node antennas degrade system performance. The existence of an optimal packet length further demonstrates that packet length and transmit power must be jointly designed. Therefore, the proposed joint optimization method provides an effective solution for secure covert short packet transmission in mission-critical and low-latency wireless systems.
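To make the finite-blocklength penalty concrete, the sketch below evaluates a commonly used normal approximation for the secrecy rate of a short packet: the capacity difference between the legitimate and eavesdropping channels, minus one dispersion term for the decoding error probability and one for the information leakage. The abstract does not give the paper's exact expression, so this form, the complex-AWGN dispersion formula, and all parameter names and values are assumptions for illustration only.

```python
import math
from statistics import NormalDist


def q_inv(p):
    """Inverse of the Gaussian Q-function."""
    return NormalDist().inv_cdf(1.0 - p)


def capacity(snr):
    """Shannon capacity in bits per channel use."""
    return math.log2(1.0 + snr)


def dispersion(snr):
    """Channel dispersion of a complex AWGN channel at the given SNR."""
    return (1.0 - (1.0 + snr) ** -2) * math.log2(math.e) ** 2


def secrecy_rate(snr_b, snr_e, n, eps=1e-3, delta=1e-3):
    """Normal approximation of the secrecy rate for blocklength n.

    eps is the decoding error probability at the legitimate receiver and
    delta is the information leakage level at the eavesdropper.
    """
    rate = (capacity(snr_b) - capacity(snr_e)
            - math.sqrt(dispersion(snr_b) / n) * q_inv(eps)
            - math.sqrt(dispersion(snr_e) / n) * q_inv(delta))
    return max(rate, 0.0)


if __name__ == "__main__":
    for n in (100, 300, 1000, 10000):
        print(n, secrecy_rate(snr_b=10.0, snr_e=1.0, n=n))
```

Both penalty terms scale as one over the square root of the packet length, so the approximate rate climbs toward the asymptotic secrecy capacity as the packet grows, consistent with the trend reported for Fig. 4.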
One-step Reconstruction Diffusion Model for Poisoning Attack on QoS-aware cloud API Recommender System
TAN Zeyu, WANG Haoyuan, QI Mingyang, SUN Mengmeng, SHEN Limin, CHEN Zhen
 doi: 10.11999/JEIT260115
Abstract:
  Objective  In the cloud era, cloud Application Programming Interface (cloud API), as the best carrier for data output, capability replication and service delivery, has become an indispensable core element for service-oriented software development and operation. With the rapid increase in the number of cloud APIs, it is difficult for users to choose from a large number of cloud APIs with the same functions. For this purpose, researchers introduced Quality of Service (QoS) to effectively differentiate cloud APIs based on their non-functional attributes. Therefore, QoS-aware cloud API recommender systems (QARS) are gradually playing an increasingly important role in guiding users to choose the most suitable cloud API. However, existing research mainly focuses on improving the accuracy of QARS, ignoring the security risks brought about by the economic benefits of cloud APIs and the openness of the network environment. These risks are especially evident in the threats posed by poisoning attacks. Attackers manipulate the recommendations by injecting fake users, causing serious damage to the fairness and credibility of the QoS-aware cloud API recommender system. To counter the threat of poisoning attacks, this paper reveals the attack mechanisms of diffusion model-based attack methods from the perspective of learning defense through attacking, inspiring the design of corresponding defense methods.  Methods  This paper systematically defines the attack process of poisoning attacks and fake user profiles, and proposes attack scales to flexibly simulate poisoning attacks. Then, to reveal the attack principle of the diffusion model-based attack method, this paper further proposes a Preference guided one-step reconstruction Diffusion model-based Poisoning Attack framework (PDPA) to simulate poisoning attacks. 
Following the collaborative principle that similar users may have similar preferences toward cloud APIs, the fake users generated by the attack method need to ensure that both their QoS values and the distribution of cloud API invocations remain similar to those of real users, thereby exploiting the collaborative influence of fake users to interfere with the QARS's modeling of user preferences. Therefore, to effectively carry out poisoning attacks, PDPA aims to generate fake users that are similar to real users. Firstly, PDPA uses the One-step reconstruction Diffusion Model (ODM) to model the QoS data and the invocation distribution of real users, respectively. ODM avoids the error accumulation that occurs during the iterative denoising process caused by the noise dependence of standard diffusion models, enabling ODM to generate fake user cloud API invocation behaviors similar to those of real users, thereby ensuring that fake users can effectively have a collaborative influence. Subsequently, in order to improve the attack performance, PDPA systematically selects fake users with a preference for invoking the target cloud API to fill the maximum QoS value. This not only enhances the aggressiveness of fake users, but also alleviates the interference of the target cloud API's addition on the invocation behavior of fake users, ensuring the concealment of fake users.  Results and Discussions  The experiment was conducted in the real-world QoS dataset WS-DREAM. Firstly, this paper uses six recommendation methods as target recommender systems, and six baseline attack methods to simulate poisoning attacks. The experimental results (Table 3) reveal the vulnerability of the recommender system to poisoning attacks. Each attack method can cause damage to the accuracy of the recommender system. 
PDPA achieves the best attack performance in most experimental settings, which is attributed to its sufficient modeling of user invocation preferences, thereby enabling fake users to effectively exert collaborative influence on the QARS. Secondly, fake users generated by ODM and by the standard diffusion model are compared in terms of F1 and their distribution in latent space. The experimental results (Figure 2) verify that ODM is superior to the standard diffusion model in stealth, a finding also reflected in the low-dimensional visualization. Subsequently, an ablation study is conducted on each module of PDPA. The experimental results (Tables 4 and 5) verify that each module of PDPA is necessary for the attack performance and concealment of fake users. Finally, MAE and F1 are compared across attack scales to verify the impact of attack scale on the attack effect and concealment of fake users. The experimental results (Figure 3 and Table 6) indicate that increasing the attack scale effectively enhances attack performance but also increases the number of detected fake users.  Conclusions  To counter the threat of poisoning attacks, this paper explores the attack process and key attack parameters of poisoning attacks, and reveals the vulnerability of the QoS-aware cloud API recommender system by simulating poisoning attacks. This paper simulates poisoning attacks on QARS by constructing PDPA, which demonstrates the significant potential of diffusion models in poisoning attacks and validates the necessity of separately modeling QoS data and cloud API invocations through ablation studies. Furthermore, PDPA reveals the underlying mechanism of generating fake users via diffusion models, providing insights for designing targeted countermeasures.
Joint Optimization Method for Pairwise Constrained Projection Clustering Integrating Two-row Update Strategy
ZHU Jianyong, CHEN Kun, YANG Hui, NIE Feiping
 doi: 10.11999/JEIT260111
Abstract:
  Objective  As data structures grow increasingly complex, conventional unsupervised clustering techniques often fail to achieve satisfactory performance. Semi-supervised clustering, which leverages limited prior information, has thus become increasingly popular due to its ability to improve clustering quality. While existing methods have made progress, they suffer from two critical drawbacks. First, traditional constrained projection clustering algorithms typically adopt a two-step independent strategy: learning the projection matrix first and then performing k-means clustering. This separation causes the projection deviation to propagate directly to the clustering process without correction, leading to the accumulation of learning errors. Moreover, applying pairwise constraints only at the projection stage deviates from the core principle of using prior information to guide the clustering process. Second, many current methods, such as those based on spectral clustering, handle pairwise constraints implicitly (e.g., through eigen-decomposition of a modified similarity matrix). This implicit handling often fails to strictly satisfy constraints, particularly Cannot-Link constraints, which are non-transitive, resulting in high constraint violation rates. To this end, this paper proposes a Joint Optimization Method for Pairwise Constrained Projection Clustering Integrating Two-row Update Strategy (PCITUS). The primary objective is to unify dimensionality reduction and clustering into a single framework to avoid information loss and to design an explicit optimization strategy that minimizes constraint violations while enhancing computational efficiency.  Methods  The proposed PCITUS model integrates constraint projection and clustering into a unified objective function to achieve collaborative optimization, while directly optimizing pairwise constraints. 
First, the algorithm utilizes the transitive property of Must-Link (ML) constraints, where all samples belonging to the same ML connected component are merged into a single "hyper-point" in the feature space. This preprocessing step naturally ensures that all ML constraints are satisfied. Subsequently, a trade-off parameter is introduced to incorporate projection learning as a regularization term within the clustering framework, enabling the two components to be jointly optimized within a unified objective. Moreover, prior information is embedded into the clustering process by transforming pairwise constraints into row-wise constraints on the indicator matrix. Next, an improved coordinate descent method solves for the discrete indicator matrix directly, which improves computational efficiency and yields better solutions. Furthermore, a core innovation is the two-row simultaneous optimization strategy designed for handling Cannot-Link (CL) constraints. PCITUS explicitly checks for CL conflicts by simultaneously evaluating the objective function values obtained when conflicting rows are moved to sub-optimal classes, and selects the scenario yielding the higher value.  Results and Discussions  Extensive experiments were conducted on 8 benchmark datasets and compared against 9 state-of-the-art semi-supervised clustering algorithms. The quantitative results in terms of Accuracy (ACC) and Normalized Mutual Information (NMI) demonstrate the superiority of PCITUS (Table 4 and Table 5). PCITUS achieves the highest performance on most datasets. Notably, on the Mushroom dataset, the NMI metric improved by 7.2% compared to the second-best algorithm. 
The comparison with CNP (a two-step projection method) confirms that the unified framework effectively mitigates error propagation and information loss. This also stems from the fact that a better projection space can lead to a clearer clustering structure, while a more reasonable clustering structure, in turn, guides the formation of a more discriminative projection space. The effectiveness of the explicit constraint handling is further illustrated in Fig. 1: PCITUS exhibits no ML constraint violations due to the hyper-point merging strategy. For CL constraints, because of the two-row simultaneous optimization strategy, PCITUS maintains an extremely low violation rate (e.g., 0.57% on Mushroom and 0.41% on Satimage), significantly outperforming methods that handle constraints implicitly. Additionally, parameter sensitivity analysis (Fig. 2) indicates that the performance of PCITUS is stable across a wide range of the trade-off parameter, and noise sensitivity experiments (Fig. 3a and Fig. 3b) highlight its robustness. The convergence curves (Fig. 3c and Fig. 3d) and runtime comparisons (Table 7) validate its computational efficiency, showing rapid convergence and typically reaching a stable objective function value within approximately 10 iterations.  Conclusions  To tackle the difficulties in optimizing Cannot-Link constraints, as well as the inherent limitations of traditional constraint projection clustering frameworks based on a two-step separation scheme, this paper presents PCITUS, a novel semi-supervised clustering framework that jointly optimizes pairwise constraint projection and clustering structures. By integrating the projection objective into the clustering framework as a regularizer, the proposed method ensures that the subspace learning and data partitioning processes mutually enhance each other, jointly approaching the global optimum. 
Furthermore, pairwise constraints are integrated throughout the entire learning process, ensuring that prior knowledge is fully utilized during optimization. The introduction of the coordinate descent method with a specific "two-row simultaneous update strategy" allows for the direct and precise allocation of Cannot-Link constraints, significantly reducing constraint violations. Experimental results validate that PCITUS not only outperforms existing algorithms in clustering performance but also exhibits strong robustness to parameter variations.
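The hyper-point preprocessing and the constraint-violation metric can be illustrated with a short Python sketch. It merges samples connected by Must-Link pairs using a union-find structure and then measures the fraction of Cannot-Link pairs that end up in the same cluster. The function names are hypothetical, and the sketch covers only this bookkeeping, not the projection learning or the two-row coordinate descent update.

```python
def merge_must_links(n_samples, must_links):
    """Return a hyper-point id for each sample.

    Samples in the same Must-Link connected component share one id, so
    any clustering of hyper-points satisfies every ML constraint.
    """
    parent = list(range(n_samples))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n_samples)]


def cl_violation_rate(labels, cannot_links):
    """Fraction of Cannot-Link pairs assigned to the same cluster."""
    if not cannot_links:
        return 0.0
    violated = sum(1 for a, b in cannot_links if labels[a] == labels[b])
    return violated / len(cannot_links)


if __name__ == "__main__":
    hyper = merge_must_links(6, [(0, 1), (1, 2), (4, 5)])
    print("hyper-point ids:", hyper)
    # Hypothetical cluster labels per sample, for illustration only.
    labels = [0, 0, 0, 0, 1, 1]
    print("CL violation rate:", cl_violation_rate(labels, [(0, 3), (2, 4)]))
```

Must-Link is transitive, so connected components capture it exactly; Cannot-Link is not, which is why it cannot be handled by the same merge and has to be enforced during optimization.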
Energy-Efficient Trajectory Planning and Resource Optimization for UAV Relay Communications over Hybrid RF/FSO Links
LI Baolong, PAN Wenwei, JIANG Hao, FENG Simeng, WU Qihui
 doi: 10.11999/JEIT260139
Abstract:
  Objective  In low-altitude communication networks, hybrid radio-frequency/free-space optical (RF/FSO) UAV relaying can effectively alleviate RF spectrum congestion and enhance uplink data aggregation efficiency. However, in obstacle-rich urban environments, FSO backhaul links are highly susceptible to blockage and may experience intermittent outages, resulting in a severe mismatch between the RF uplink arrival rate and the FSO backhaul service rate. Meanwhile, UAV trajectory planning is constrained by obstacle avoidance and flight dynamics. To address these coupled challenges, this paper investigates an energy-efficiency maximization problem by jointly optimizing multiuser non-orthogonal multiple access (NOMA)-based RF access and the UAV’s three-dimensional obstacle-avoiding trajectory, while incorporating buffer-assisted RF/FSO rate decoupling.  Methods  A time-slotted UAV relaying model is considered, in which multiple ground users upload data to the UAV via an RF link using NOMA. The UAV decodes the superposed signals using successive interference cancellation (SIC) and determines the decoding order in each slot according to the received power ranking. The successfully received data are then forwarded to the base station (BS) through an FSO backhaul link. Urban blockage is modeled using 3D geometric obstacles, and a visibility test is employed to determine whether each relevant link is in line-of-sight (LOS) or non-line-of-sight (NLOS), thereby capturing the spatially correlated and time-varying characteristics of the RF access rate and the intermittent FSO backhaul capacity. To suppress the blockage-induced mismatch between uplink and backhaul rates, a finite-capacity buffer is deployed at the UAV. In each slot, the forwardable amount is jointly limited by the instantaneous FSO backhaul capability and the amount of data available in the buffer, while buffer-capacity constraints prevent overflow. 
System energy efficiency is defined as the ratio of the cumulative data successfully delivered to the BS over the mission horizon to the UAV propulsion energy consumption, where the propulsion power is modeled as a function of the UAV’s velocity and acceleration to reflect the impact of flight dynamics. Under 3D flight-region boundaries, prescribed start and end locations, discrete-time kinematic equations, maximum velocity and acceleration limits, and obstacle collision-avoidance constraints, a non-convex optimization problem is formulated with cross-slot multiuser transmit powers and the UAV 3D trajectory as decision variables. Furthermore, an alternating optimization framework is developed. With a fixed trajectory, the propulsion energy is fixed and maximizing energy efficiency becomes equivalent to increasing the end-to-end successfully forwarded data, yielding a power-optimization subproblem. Due to NOMA coupling and logarithmic rate expressions, this subproblem remains non-convex and is handled via successive convex approximation (SCA). With fixed transmit powers, particle swarm optimization (PSO) is used to search candidate 3D trajectories in a continuous space. To ensure feasibility under strict dynamics and safety constraints, a quadratic-programming (QP) projection is employed to enforce velocity and acceleration constraints, and collision checks are performed on trajectory waypoints and inter-slot line segments to guarantee obstacle-free flight. These two optimization procedures are alternately performed, resulting in a joint design that satisfies flight-dynamics feasibility and collision avoidance while significantly improving energy efficiency.  Results and Discussion  Simulations are conducted in an urban airspace containing multiple users, a BS, and dense 3D obstacles. Blockage causes frequent LOS/NLOS switching as the UAV moves. Figures 2 and 3 present comparisons of the 3D trajectory and its planar projection, respectively. 
Compared to the initial trajectory, the optimized trajectory exhibits clear detours and necessary altitude adjustments, and achieves collision-free flight while satisfying velocity and acceleration constraints, thus validating the feasibility and safety of the proposed trajectory planning approach. Figure 4 presents the energy-efficiency convergence behavior under different user transmit-power budgets. The proposed alternating optimization typically stabilizes within a small number of outer iterations. Meanwhile, the converged energy efficiency increases with higher power budgets, demonstrating the synergy between power control and trajectory adaptation. Furthermore, Figure 5 depicts the buffer evolution over time. It is observed that the buffer gradually accumulates when the backhaul is blocked or experiences strong fading, and is quickly drained once the UAV enters regions where LOS backhaul becomes available and FSO capacity improves. In order to further quantify the buffering gain, Figure 6 compares the system energy efficiency achieved by the proposed buffering mechanism and the no-buffer scheme. Compared to the no-buffer scheme, the proposed mechanism enables store-and-forward-based temporal smoothing during backhaul interruptions, thereby significantly improving system energy efficiency. Figure 7 illustrates the energy-efficiency convergence behavior under different buffer capacities. It is observed that as the buffer capacity increases, the converged energy-efficiency level is significantly improved. This is because a larger buffer enhances the UAV’s ability to temporarily store incoming data, thereby effectively alleviating data accumulation and transmission blockage when the access-link rate and backhaul-link rate are mismatched or when the backhaul link is constrained. 
Figure 8 compares the performance of four schemes, namely a non-optimized baseline, a power optimization scheme, a trajectory optimization scheme, and the proposed joint power-and-trajectory optimization scheme. It is found that the coordinated design of power allocation and obstacle-avoiding trajectory substantially improves end-to-end energy efficiency, and that trajectory optimization often plays a more dominant role under blockage-limited conditions.  Conclusion  The paper investigates a hybrid RF/FSO UAV relaying scheme with NOMA and an onboard buffering mechanism for low-altitude urban communication environments. Given the dense obstacles, frequent blockage, the fragility of FSO links, and stringent flight-dynamics constraints, an energy-efficiency maximization problem is formulated for the joint optimization of multiuser NOMA power allocation and the UAV trajectory. Accordingly, an SCA-based power-allocation method and an obstacle-avoiding trajectory design combining PSO with QP projection are developed. The obtained trajectory satisfies flight-dynamics feasibility and collision-avoidance requirements while significantly improving throughput per unit propulsion energy. Simulation results demonstrate that the planned trajectory can effectively avoid obstacles, and the onboard buffer provides an effective cushion between RF access and FSO backhaul to mitigate rate mismatch. In addition, the proposed method consistently outperforms benchmark schemes in terms of energy efficiency. Meanwhile, the trajectory optimization is shown to be generally more effective than power allocation in improving the overall system performance.
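The buffer-assisted rate decoupling and the energy-efficiency ratio described in this abstract can be illustrated with a minimal Python sketch. The slot ordering (admit arrivals first, then forward), the function names, and the numbers used here are assumptions made for illustration only; they are not taken from the paper.

```python
def simulate_buffer(arrivals, backhaul_caps, capacity):
    """Per-slot store-and-forward with a finite buffer (illustrative model).

    arrivals      -- data received over the RF access link in each slot
    backhaul_caps -- FSO backhaul capability in each slot (0 when blocked)
    capacity      -- buffer size; arrivals that do not fit are dropped
    """
    level = 0.0
    delivered, levels = [], []
    for arrived, cap in zip(arrivals, backhaul_caps):
        # Assumed ordering: admit only what fits, so the buffer never overflows.
        level += min(arrived, capacity - level)
        # Forwarded amount is limited by both the backhaul and the buffered data.
        sent = min(cap, level)
        level -= sent
        delivered.append(sent)
        levels.append(level)
    return delivered, levels


def simulate_no_buffer(arrivals, backhaul_caps):
    """Without a buffer, data is forwarded only in the slot in which it arrives."""
    return [min(a, c) for a, c in zip(arrivals, backhaul_caps)]


def energy_efficiency(delivered, propulsion_energy):
    """Cumulative delivered data divided by propulsion energy."""
    return sum(delivered) / propulsion_energy


# Hypothetical scenario: backhaul blocked for two slots, then restored.
arrivals = [4.0, 4.0, 4.0]
backhaul = [0.0, 0.0, 10.0]
with_buffer, levels = simulate_buffer(arrivals, backhaul, capacity=6.0)
without_buffer = simulate_no_buffer(arrivals, backhaul)
```

In this toy scenario the buffer fills while the backhaul is blocked and drains once it is restored, so more data is delivered for the same propulsion energy than in the no-buffer case.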
Non-Terrestrial Network Architecture and Key Technologies for Civil Aviation
LIU Xiangnan, QIU Yu, HUANG Zhipeng, ZHANG Haijun
 doi: 10.11999/JEIT260348
Abstract:
  Significance   Currently, civil aviation communications rely heavily on terrestrial base stations and narrowband satellite communications. This setup leaves significant coverage blind spots in scenarios such as remote airspace, transoceanic routes, and polar flights, and therefore fails to meet the high-reliability requirements of core operations such as real-time flight monitoring and engine health data transmission. It also suffers from bandwidth constraints, poor passenger connectivity, and insufficient communication resilience in emergency scenarios.  Progress   In this context, we review the evolution of Non-Terrestrial Network (NTN) technologies in the civil aviation sector and track the latest domestic and international research progress on civil aviation NTNs. To enable NTNs to better serve civil aviation, we approach the topic from three perspectives (network frameworks, mobility management, and resource management) and introduce key technologies in network architecture, access and mobility management, and novel resource control mechanisms within NTN systems.  Conclusions  For civil aviation, NTN can not only completely fill the coverage gaps of terrestrial communications, but also balance high-speed passenger connectivity with efficient transmission of airline operational data, enhancing the industry’s operational efficiency and service quality. It lays a technical foundation for cutting-edge scenarios like future air-space integrated transportation and civil aviation unmanned aerial vehicle networking, serving as a key enabler to address civil aviation’s communication challenges and drive the industry’s upgrade toward greater safety, efficiency, and intelligence.  
Prospects   With the continuous advancement of key technologies such as networking architecture design, mobility management, and resource management, the proposed solutions are expected to offer more efficient, stable, and intelligent communication support for the civil aviation industry. In the long term, such NTN-enabled communication frameworks will play an essential role in supporting the digital transformation and intelligent upgrading of civil aviation operations.
A Tensor Framework for ISAC: Information Fusion Enhanced Channel Estimation and Target Localization
YU Weijia, DU Jianhe, CHEN Yuanzhi, HE Jing, ZHANG Peng, GUAN Yalin
 doi: 10.11999/JEIT251371
Abstract:
  Objective  Communication and sensing systems are evolving toward higher frequency bands, larger antenna arrays, and greater miniaturization, driving their increasing convergence in terms of hardware architecture, channel characteristics, and signal processing. This synergy gives rise to integrated sensing and communication (ISAC), in which the joint estimation of channel and sensing target parameters has become a primary research hotspot. Although existing studies have realized the co-estimation of these two categories of parameters based on a unified tensor framework, several limitations remain. On the one hand, current research focuses primarily on parameter estimation itself, without further transforming the multidimensional estimation results into precise localization of scatterer points (SPs), mobile terminals, and sensing targets, which makes it difficult to achieve a complete spatial characterization of the wireless propagation environment. On the other hand, limited attention has been paid to the fusion mechanism between channel and sensing target parameter information, thereby hampering the further improvement of parameter estimation and localization accuracy.  Methods  To address the problems of parameter estimation and localization for channels/sensing targets in millimeter-wave multiple-input multiple-output ISAC systems, a tensor decomposition algorithm based on information fusion is proposed. First, a unified fourth-order parallel factor model is constructed at the base station for the estimation of uplink channel and sensing target parameters. To reduce computational complexity, the fourth-order tensor model is transformed into a third-order form, and the trilinear alternating least squares method is adopted to estimate the three factor matrices. 
Furthermore, by exploiting the special structure of a factor matrix, the proposed algorithm incorporates a closed-form decomposition to decouple the coupled factor matrix, from which the angle of departure, angle of arrival, time delay, Doppler shift, and coefficients are extracted from the four estimated factor matrices. On this basis, the localization of mobile transmitter (MT), SPs, and sensing targets is realized separately using geometric relationships, while the estimation accuracy of SPs is effectively improved by fusing the Doppler shift and position information of SPs and sensing targets. Besides, the Cramér-Rao bound is derived to establish a theoretical performance benchmark for the five parameters.  Results and Discussions  The first simulation experiment shows that the proposed algorithm and the Op-QALS algorithm outperform the Co-SVD-BALS algorithm in both channel/sensing target parameter estimation and localization (Fig. 2, Fig. 3, Fig. 4). With information fusion, the proposed algorithm achieves the best performance in Doppler shift and position estimation for SPs (Fig. 2(d), Fig. 4(a)). This is attributed to the fact that both the proposed algorithm and Op-QALS algorithm fully exploit the multi-dimensional structure of the received signal, and the fusion operation further enhances the estimation capability of the proposed algorithm, whereas the Co-SVD-BALS algorithm suffers from severe error accumulation during its stepwise factor matrix estimation. Moreover, the average processing time (APT) required by the proposed algorithm for localization is slightly higher than that of Co-SVD-BALS algorithm, but significantly lower than that of Op-QALS algorithm (Table 1 and Table 2). Therefore, the proposed algorithm achieves excellent parameter estimation and localization performance at a reasonable computational cost. 
The second simulation experiment shows that under two signal-to-noise ratio levels, the localization accuracy of all algorithms improves gradually with the increase of $K$, while the proposed algorithm maintains comparable SP and MT localization accuracy to Op-QALS algorithm, but with notably lower APT (Fig. 5). Furthermore, the incorporation of the fusion operation does not significantly increase the APT of the proposed algorithm (Fig. 5(d)). The third simulation experiment indicates that increasing $M_{\mathrm{RE}}\left(M_{\mathrm{RE}}^{\mathrm{s}}\right)$ and $N$ helps enhance the ability of the proposed algorithm to resolve multipath signals, thereby obtaining more precise localization performance (Fig. 6).  Conclusions  This paper proposes a unified tensor framework-based information fusion algorithm for channel/sensing target parameter estimation and localization. By exploiting the Vandermonde structure of a factor matrix, the proposed algorithm maintains estimation accuracy while reducing complexity. Besides, fusion operation further improves SP estimation and localization without significantly increasing computational overhead. Future work will extend the algorithm to more general array configurations and explore higher-order tensor processing in multi-base-station cooperation or multi-user access scenarios.
Physical-layer Security in Visible Light Communications: Fundamental Theories, Key Techniques, and Future Challenges
WANG Jinyuan, YAN Xinrun, LIN Zihan, LI Yuanyuan, LI Zheng, ZHANG Xin
 doi: 10.11999/JEIT260338
Abstract:
  Significance   Due to the broadcast nature of optical signals, information security represents a critical research direction in visible light communication (VLC). Conventional encryption techniques address network security issues at the upper layers of the protocol stack through access control, cryptographic protection, and end-to-end encryption. However, their security relies on the assumption that eavesdroppers possess limited computational capabilities, an assumption that currently faces significant challenges. In recent years, physical layer security (PLS) has emerged as a novel information security paradigm and has attracted considerable attention from researchers worldwide. PLS exploits the randomness, heterogeneity, and distinctiveness between the main channel and the eavesdropping channel to achieve secure information transmission at the physical layer. To date, extensive research achievements have been made regarding PLS techniques in conventional radio frequency wireless communications (RFWC). Nevertheless, due to substantial differences in frequency bands, transmitted signals, power representations, and channel characteristics, PLS research results from RFWC systems cannot be directly applied to VLC. Although scholars worldwide have conducted research on VLC PLS technology, the foundational theories, key techniques, and future challenges involved in VLC PLS still lack a systematic review. To bridge this gap, this paper presents a comprehensive survey of VLC PLS technology.  Progress   To evaluate and enhance system performance, a classic VLC PLS system model—comprising the received signal model, the input constraint model, and the channel gain model—is initially established. A comprehensive theoretical framework for performance evaluation is then developed, encompassing instantaneous performance metrics, statistical performance metrics, and asymptotic performance metrics. 
Specifically, to characterize instantaneous performance, existing works on instantaneous secrecy capacity and instantaneous secrecy rate across different scenarios are summarized. As statistical performance metrics, average secrecy capacity, average secrecy rate, secrecy outage probability, probability of strictly positive secrecy capacity, and interception probability are analyzed. To demonstrate asymptotic performance, secrecy diversity order and secrecy degrees of freedom are derived. Furthermore, to enhance the PLS performance, advanced technologies, including secure beamforming, artificial noise, physical region protection, secure coding, and secure diversity, are summarized.  Prospects   Despite existing research achievements, numerous challenges remain in VLC PLS. This paper identifies four critical challenges: (i) Accurate PLS performance limit: Deriving exact expression of secrecy capacity under VLC's unique physical constraints remains challenging. (ii) Incomplete evaluation framework: Some key metrics widely used in RFWC have not been investigated in VLC, and the construction of a comprehensive VLC PLS performance evaluation framework remains unresolved. (iii) Limitations of existing methods: Conventional PLS performance enhancement methods typically adopt a “modeling-optimization-verification” separated research paradigm, often falling into a vicious cycle of “inaccurate modeling-suboptimal solutions-limited performance gains”. Therefore, it is imperative to integrate novel technologies (such as deep learning, reinforcement learning, and digital twins) to construct a data-model dual-driven framework for VLC PLS performance enhancement. (iv) Hardware platform gap: The absence of dedicated hardware platforms featuring adversarial topologies and real-time processing capabilities significantly impedes the practical deployment of VLC PLS technologies. 
Therefore, addressing these challenges is essential for transitioning VLC PLS from theoretical advances to commercial applications.  Conclusions  The broadcast nature of optical signals renders VLC systems vulnerable to eavesdropping attacks. This paper presents a comprehensive survey of PLS in VLC, covering system models, performance metrics (instantaneous, statistical, and asymptotic), and key performance enhancement technologies including secure beamforming, artificial noise, physical region protection, secure coding, and secure diversity. Despite significant progress, challenges remain in establishing accurate performance bounds, complete evaluation frameworks, novel enhancement techniques, and practical hardware implementations. By exploiting channel disparities at the physical layer without relying on complex encryption, PLS represents a paradigm shift in security assurance, paving the way for next-generation secure and reliable VLC networks.
A Multimodal Sentiment Analysis Model with Multi-source Knowledge guided Visual Confidence Perception
PENG Juhong, ZHANG Zhi, LIU Peng, GE Wenhui, LIU Chen, LIAO Lingxin, ZHANG Kai
 doi: 10.11999/JEIT260063
Abstract:
  Objective  Multimodal sentiment analysis is often affected by visual noise from complex environments, image-text sentiment inconsistency, and imbalanced modality contributions. When all modalities are treated without distinction, visual noise can degrade model performance. A robust mechanism is therefore needed to evaluate visual confidence and filter redundant visual information.  Methods  A Multimodal Sentiment Analysis Model with Multi-Source Knowledge-guided Visual confidence Perception (MKVP) is proposed (Fig. 1). A multi-source knowledge guidance matrix is constructed using syntactic-dependency, sentiment-intensity, and aspect-focused operators (Fig. 2). Guided by this matrix, the Visual Confidence Perception (VCP) module measures semantic affinity and dynamically suppresses irrelevant visual noise (Fig. 3). A dual-stream parallel interaction module is then used to support deep cross-modal alignment, and a global gated fusion mechanism further adjusts the fusion weights of different modalities.  Results and Discussions  Extensive experiments are conducted on the MVSA-Single, MVSA-Multiple, and HFM datasets. The proposed MKVP model achieves accuracy and F1 scores of 77.56% and 76.70%, 72.72% and 70.66%, and 87.26% and 86.78%, respectively. Compared with the baseline models, the accuracy and F1 score are improved by 2.45% and 3.68%, 2.19% and 2.21%, and 1.83% and 1.91%, respectively (Table 3). Ablation studies show that each component contributes to performance, especially the VCP module, which filters visual noise and improves feature quality (Table 5). Feature-space visualization further confirms that the VCP module refines semantic representations by promoting clearer clustering of samples with the same sentiment polarity (Fig. 4). Case studies on mismatched image-text samples also verify the ability of the model to resolve cross-modal semantic conflicts (Table 6). 
Model-complexity analysis shows that MKVP maintains high computational efficiency and low inference latency (Table 8).  Conclusions  The proposed MKVP framework reduces the effects of visual noise and image-text sentiment inconsistency in multimodal sentiment analysis. By using multi-source knowledge to guide visual confidence perception and combining dual-stream interaction with dynamic gated fusion, the model learns robust sentiment representations from noisy multimodal data. This method provides an efficient and reliable solution for complex social media scenarios.
A Testability Evaluation Method Based on Reconvergent Fan-Out
WU Wenjun, LIANG Huaguo, YOU Chang, DOU Xianrui, XIAO Jiahui, LU Yingchun
 doi: 10.11999/JEIT251286
Abstract:
  Objective  As the scale and structural complexity of integrated circuits continue to increase, accurate testability evaluation becomes essential for Trojan detection, fault diagnosis, and test-point optimization in modern Design-for-Testability (DFT) flows. Metrics such as controllability, observability, and fault coverage depend on reliable probabilistic modeling of signal propagation. However, existing analytical and learning-based approaches often lose accuracy in circuits with dense Reconvergent Fan-Out (RFO) structures, where strong signal correlation invalidates classical independence assumptions and causes substantial estimation bias. Although several enhanced techniques attempt to incorporate structural information, many have high computational cost or limited scalability in deeper or highly reconvergent logic networks. This work addresses these limitations by proposing a testability evaluation method that incorporates RFO structural characteristics to improve modeling accuracy while maintaining practical computational efficiency.  Methods  The proposed approach starts with a structural analysis algorithm that identifies RFO regions through topological traversal of the circuit. A dedicated RFO recognition mechanism maps each root fan-out node to its corresponding RFO nodes, capturing the structural dependencies that govern correlated signal behavior and providing the basis for accurate probabilistic modeling. Building on this structural extraction, a weighted conditional probability model is formulated to correct testability distortion in reconvergent regions. Unlike previous optimization schemes, the weighting strategy assigns influence-based weights derived from the contribution of each root node to the target node, yielding probability estimates that more accurately reflect actual testability behavior. 
An efficient computational framework is also developed to integrate conditional probability propagation and weight selection into a single topological traversal process, thereby maintaining low algorithmic complexity while improving accuracy.  Results and Discussions  The proposed method is evaluated on representative benchmark circuits from the ISCAS-85, ISCAS-89, ITC’99, and EPFL suites. Performance is assessed in terms of controllability accuracy, ordering consistency, fault coverage estimation, and runtime efficiency. For controllability prediction, the method achieves an average RMSE of 0.0568, which corresponds to an average reduction of 25% relative to existing techniques, as reported in Table 2. Ordering consistency also improves, with the average Spearman correlation coefficient reaching 0.935, outperforming existing techniques. Fault coverage estimation shows similarly strong performance, with an average relative error of 3.64%, which is lower than that of previously reported methods, as shown in Table 1. Runtime analysis further indicates that the proposed framework maintains practical computational efficiency. Across all benchmark circuits, the method achieves an average speedup of 7× while preserving high accuracy, as illustrated in Figure 5.  Conclusions  This work addresses the degradation in testability evaluation accuracy caused by RFO structures in integrated circuits by proposing a reconvergent-fan-out-aware testability analysis method. The presented RFO structure identification algorithm extracts reconvergent information at the topological level and establishes explicit mappings between root nodes and RFO nodes. On this structural basis, a weighted conditional probability model is constructed to mitigate probability distortion induced by signal correlation in RFO regions. An efficient computational framework is further developed to integrate the full computation into a streamlined traversal-based process. 
Experimental results show that the proposed technique achieves low controllability RMSE and high ordering consistency relative to simulation-based ground truth. In testability estimation, the predicted fault coverage values also closely match the simulation results. The method maintains this accuracy with low computational overhead.
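The bias that reconvergent fan-out introduces under the classical independence assumption can be seen on a toy circuit. The sketch below is a minimal Python illustration and uses plain conditioning on the root node (the law of total probability), not the weighted conditional probability model proposed in the paper. The circuit (a two-input multiplexer whose select signal `x` fans out and reconverges) and all function names are assumptions chosen for illustration.

```python
from itertools import product


def p_and(a, b):
    """Signal probability of AND, assuming independent inputs."""
    return a * b


def p_or(a, b):
    """Signal probability of OR, assuming independent inputs."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def p_not(a):
    """Signal probability of NOT."""
    return 1.0 - a


def mux_naive(px, pa, pb):
    """y = (x AND a) OR (NOT x AND b), treating the two branches as independent.

    Both branches depend on the root x, so this estimate is biased.
    """
    return p_or(p_and(px, pa), p_and(p_not(px), pb))


def mux_conditioned(px, pa, pb):
    """Fix the root x to 1 and to 0, propagate, then combine the two cases."""
    y_given_1 = mux_naive(1.0, pa, pb)
    y_given_0 = mux_naive(0.0, pa, pb)
    return px * y_given_1 + (1.0 - px) * y_given_0


def mux_exact(px, pa, pb):
    """Reference value obtained by enumerating all input assignments."""
    total = 0.0
    for x, a, b in product((0, 1), repeat=3):
        weight = ((px if x else 1.0 - px)
                  * (pa if a else 1.0 - pa)
                  * (pb if b else 1.0 - pb))
        if (x and a) or ((not x) and b):
            total += weight
    return total


naive = mux_naive(0.5, 0.5, 0.5)
conditioned = mux_conditioned(0.5, 0.5, 0.5)
exact = mux_exact(0.5, 0.5, 0.5)
```

With all input probabilities set to 0.5, the independence-based estimate is 0.4375 while the exact value is 0.5; conditioning on the root recovers the exact value on this small circuit.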
Aerial Spatio-Temporal Image Generation via Latent Diffusion Models
SHANG Yuying, HOU Yingyan, LIU Zinan, LU Wanxuan, HUANG Yuhong, WANG Yixiao, YU Hongfeng, FU Kun
 doi: 10.11999/JEIT260165
Abstract:
  Objective  Aerial Earth observation plays a pivotal role in environmental monitoring, disaster warning, and urban planning. However, constraints such as flight-platform endurance and mission-window timeliness often prevent acquired aerial imagery from fully characterizing the long-term evolution of the Earth's surface. Although pre-trained latent diffusion models have shown strong potential for image generation, their application in aerial scenarios remains challenging because of the scarcity of high-quality temporal annotation data and semantic-visual misalignment caused by variable observation scales. To address these challenges, this paper proposes ASTIG, a training-free framework for Aerial Spatio-Temporal Image Generation. By leveraging the generative priors of pre-trained latent diffusion models and Large Language Models (LLMs), ASTIG provides a new paradigm for semantically controllable aerial spatio-temporal image generation.  Methods  ASTIG consists of three coordinated components. First, a dynamic semantic decomposition process is proposed to parse complex descriptions of aerial scene evolution into frame-level visual prompts, thereby compensating for the lack of temporal semantic annotations in existing aerial image-text datasets. Second, a Linguistic Binding (LB) strategy is proposed to establish explicit associations between key ground objects and their corresponding visual attributes within the cross-attention mechanism of the diffusion model, thereby improving the semantic response precision of the generated images. Third, a Temporal Anchor Attention (TAA) mechanism is incorporated. It uses dual reference frames to maintain subject stability and background consistency across the generated spatio-temporal image sequence, thus suppressing inter-frame temporal drift under training-free conditions.  
Results and Discussions  ASTIG and the baseline methods are evaluated on 7,236 high-quality aerial spatio-temporal descriptions using six automated metrics, including subject consistency, background consistency, temporal flickering, motion smoothness, aesthetic quality, and imaging quality. Quantitative results (Tables 1 and 2) show that ASTIG outperforms the baseline methods in spatio-temporal image generation, with improvements of 3.91% in subject consistency and 4.57% in motion smoothness over the frame-prompt baseline. Qualitative comparisons (Fig. 4) further show its strong ability to model long-term surface evolution in aerial imagery. Ablation studies validate the individual effectiveness of the LB strategy and the TAA mechanism (Table 3 and Fig. 5). Sensitivity analyses of the intervention steps (Table 4 and Fig. 6) and binding strength (Table 5 and Fig. 7) further identify suitable parameter settings. Extension experiments from satellite perspectives (Figs. 8 and 9) also show that ASTIG has the potential to generalize beyond aerial platforms to broader Earth observation scenarios.  Conclusions  This paper proposes ASTIG, a training-free framework for aerial spatio-temporal image generation that addresses the scarcity of high-quality long-term temporal data and semantic-visual misalignment. By leveraging the generative priors of pre-trained latent diffusion models and LLMs, ASTIG integrates a dynamic semantic decomposition process, an LB strategy, and a TAA mechanism to improve temporal semantic construction, semantic response precision, and inter-frame consistency. Experimental results show that ASTIG outperforms existing baseline methods across multiple automated evaluation metrics, providing a new paradigm for aerial spatio-temporal image generation. As a training-free method, ASTIG is still limited by the prior knowledge of the backbone model. 
Future work will examine geometric correction and nadir-view prior constraints to better align the generated results with the physical properties of satellite imagery.
Joint Channel Estimation and Diagnosis for Blocked RIS-Assisted Multi-User Multipath Millimeter-Wave Systems
LI Shuangzhi, LIU Cong, WANG Ning, HAN Gangtao, GUO Xin
 doi: 10.11999/JEIT260093
Abstract:
  Objective  Reconfigurable Intelligent Surface (RIS) can effectively modulate Millimeter-Wave (mmWave) signals and reshape the wireless propagation environment. In practical deployments, however, RIS elements are vulnerable to adverse weather and physical obstructions, which cause unpredictable distortion and motivate joint channel estimation and blockage diagnosis. Most existing studies focus on single-user systems, whereas multi-user scenarios remain insufficiently studied. This gap creates an opportunity to exploit the common RIS blockage vector and the shared RIS-Base Station (BS) channel across users. This paper therefore proposes a low-complexity framework for joint channel estimation and blockage diagnosis by exploiting the sparsity and correlation of multi-user cascaded channels.  Methods  Under the assumption that all User Equipment (UE) shares the same RIS-BS channel and is affected by a common RIS blockage vector, the problem is divided into two stages. First, a target UE is selected. The sparsity of the mmWave channel and blockage vector, together with the linear dependence among RIS-BS paths, is used to formulate a sparse recovery problem. A hierarchical Bayesian model is then adopted, and an efficient Sparse Bayesian Learning (SBL) algorithm is used for joint recovery. Second, partial Channel State Information (CSI) obtained from the target UE is used to construct a common channel matrix that combines the RIS-BS channel and blockage information. Channel estimation for the remaining UEs is then reformulated as another sparse recovery problem.  Results and Discussions  A low-complexity strategy for cascaded channel estimation and blockage diagnosis is developed by exploiting the sparsity and correlation of multi-user cascaded channels and the commonality of the RIS blockage vector. Ideal estimation results are used as a theoretical lower bound, and the proposed algorithm is compared with two benchmark schemes. 
Simulation results show that the proposed algorithm consistently outperforms the benchmark schemes (Fig. 1). Specifically, a higher target-user Signal-to-Noise Ratio (SNR) improves the Normalized Mean Square Error (NMSE), which confirms the importance of target-user selection (Fig. 2). The algorithm also shows good convergence as the number of iterations increases (Fig. 3), and its performance approaches the ideal case more closely as the number of time frames increases (Fig. 4). In addition, the method remains robust as the number of blocked elements increases (Fig. 5). More BS antennas further improve performance by enhancing array orthogonality (Fig. 6). By exploiting path correlation, the proposed method achieves better estimation accuracy with slightly lower runtime (Table 1). However, estimation accuracy decreases as the number of paths increases because the model becomes more complex (Figs. 7 and 8).  Conclusions  This paper proposes a joint channel estimation and blockage diagnosis framework for blocked RIS-assisted multi-user multipath mmWave systems. Simulation results show that the method approaches the theoretical performance bound in complex multipath environments. It also maintains clear performance advantages under high blockage rates while reducing computational complexity through the use of common channel structures. This study provides a practical solution to performance degradation in RIS deployment, clarifies the effects of key parameters, and offers guidance for system design. Because practical blockages often exhibit block-sparse or structured-sparse characteristics, future work may incorporate structured priors, such as group sparsity and Markov random fields, into the SBL framework to capture spatial correlation and improve diagnostic accuracy and robustness.
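For readers unfamiliar with the NMSE metric used in these results, the following minimal Python sketch shows one common way to compute it for a single channel realization. The abstract does not state the exact definition or averaging used in the paper, so the formula below (squared estimation error divided by squared norm of the true channel), the function names, and the sample values are assumptions.

```python
import math


def nmse(h_est, h_true):
    """Squared estimation error normalized by the true channel energy.

    Both arguments are equally sized sequences of complex channel coefficients.
    """
    if len(h_est) != len(h_true):
        raise ValueError("estimate and reference must have the same length")
    error_energy = sum(abs(e - t) ** 2 for e, t in zip(h_est, h_true))
    reference_energy = sum(abs(t) ** 2 for t in h_true)
    return error_energy / reference_energy


def nmse_db(h_est, h_true):
    """NMSE on a decibel scale (undefined for a perfect estimate)."""
    return 10.0 * math.log10(nmse(h_est, h_true))


# Hypothetical two-coefficient channel with a 10% amplitude error on each tap.
h_true = [1 + 0j, 0 + 1j]
h_est = [1.1 + 0j, 0 + 0.9j]
value = nmse(h_est, h_true)
value_db = nmse_db(h_est, h_true)
```

A lower NMSE indicates a more accurate estimate, which is the sense in which a higher target-user SNR "improves" the NMSE above.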
Construction of MDS Entanglement-Assisted Quantum Error-Correcting Codes
QU Yuanyue, GAO Jian
 doi: 10.11999/JEIT251251
Abstract:
  Objective  Entanglement-Assisted Quantum Error-Correcting Codes (EAQECCs) provide an effective way to protect quantum information by using pre-shared entanglement between the sender and receiver. Existing constructions of EAQECCs mainly rely on classical cyclic or constacyclic codes and often require strong algebraic constraints, which limit the range of achievable parameters. This paper develops a general and systematic framework for constructing new families of EAQECCs from Twisted Reed-Solomon (TRS) codes over finite fields. The study has two aims. The first is to extend classical Reed-Solomon-based code design to the twisted setting so that richer algebraic structures can be used. The second is to determine the exact number of maximally entangled pairs required to attain the quantum Singleton bound. The final objective is to construct Maximum-Distance Separable (MDS) EAQECCs with greater flexibility and broader parameter ranges than existing methods.  Methods  The proposed method starts from the definition of TRS codes over finite fields. A twist parameter is introduced into the generator matrix, which changes the structure of the corresponding parity-check matrices. By systematically analyzing the associated coset-sum matrices in the twisted and untwisted cases, the rank of the relevant matrix product is determined. This rank equals the number of required entangled pairs and therefore provides the theoretical basis for the construction of EAQECCs. A detailed algebraic analysis shows that the matrix contains a submatrix with entries $M_{l,j}=\sum\nolimits_{y\in W}\left(\xi^{j}y\right)^{tl}$, which simplifies to $t\zeta^{jl}$ under suitable group-theoretic conditions. The resulting matrix is a Vandermonde matrix, and its full rank gives an explicit characterization of the entanglement structure. This property is then used to construct MDS EAQECCs. 
Based on these results, two families of EAQECCs are derived according to the number of entangled pairs. The corresponding parameters are tabulated and are shown to satisfy the quantum Singleton bound with equality, which confirms that the constructed codes are MDS.  Results and Discussions  Comprehensive parameter analysis and explicit examples verify the theoretical results. Comparative analysis further shows the flexibility of the proposed framework. Unlike previous constructions that require divisibility conditions such as \begin{document}$ a\mid (q+1) $\end{document}and \begin{document}$ a\mid (q-1) $\end{document}, the present approach remains applicable under broader algebraic settings and thus extends the feasible range of code parameters. This difference is summarized in the remark section and verified numerically. A systematic comparison with existing MDS EAQECCs (Table 4) reveals several new parameter regimes that are not accessible with classical or cyclic-code-based constructions. In particular, the proposed method yields larger code lengths and more flexible entanglement consumption rates \begin{document}$ \dfrac{c}{n} $\end{document}, which improves both the efficiency and the generality of EAQECCs. The algebraic consistency observed across all tested cases supports the correctness and general applicability of the TRS-based framework.  Conclusions  This study establishes an algebraic framework for constructing MDS EAQECCs from TRS codes. By rigorously analyzing the rank properties of coset-sum matrices, the required entanglement is determined precisely, and the conditions under which the constructed codes attain the quantum Singleton bound are identified. Two broad classes of MDS EAQECCs are obtained, corresponding to \begin{document}$ a\mid \left(q+1\right) $\end{document} and \begin{document}$ a\mid \left(q-1\right) $\end{document}, respectively, and both are verified by explicit examples and tabulated results. 
Compared with existing studies, the proposed approach not only generalizes earlier constructions but also extends the achievable parameter space to cases not covered by Reed-Solomon-code- or cyclic-code-based frameworks. The derived codes show improved structural flexibility, clearer algebraic characterization, and potential value for high-performance quantum information systems. This work therefore provides a unified perspective for the development of algebraically optimized EAQECCs and offers a basis for future studies of TRS-based quantum code families and their efficient encoding implementations.
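The coset-sum simplification mentioned in the Methods can be checked numerically in a small prime field. The sketch below is illustrative only and rests on assumptions that do not come from the paper: W is taken to be the multiplicative subgroup of order t, ξ a primitive element, ζ = ξ^t, and the field is chosen as F_13 with t = 3. Under these assumptions every y in W satisfies y^t = 1, so each summand reduces to ξ^{jtl} and the sum collapses to t·ζ^{jl}.

```python
# Illustrative check of  sum_{y in W} (xi^j * y)^(t*l) = t * zeta^(j*l)  in a prime field.
# Assumptions (not from the paper): the field is F_13, xi = 2 is a primitive root mod 13,
# W is the order-t subgroup of F_13^*, and zeta = xi^t.

P = 13   # field size (prime, so arithmetic is plain arithmetic mod P)
XI = 2   # primitive element of F_13
T = 3    # order of the subgroup W; must divide P - 1


def subgroup(order):
    """Return the multiplicative subgroup of F_P^* with the given order."""
    step = (P - 1) // order
    return [pow(XI, step * i, P) for i in range(order)]


def coset_sum(j, l):
    """Left-hand side: sum over y in W of (xi^j * y)^(t*l), computed in F_P."""
    xi_j = pow(XI, j, P)
    return sum(pow(xi_j * y % P, T * l, P) for y in subgroup(T)) % P


def closed_form(j, l):
    """Right-hand side: t * zeta^(j*l) with zeta = xi^t."""
    zeta = pow(XI, T, P)
    return T * pow(zeta, j * l, P) % P


if __name__ == "__main__":
    for j in range(4):
        for l in range(1, 4):
            print(j, l, coset_sum(j, l), closed_form(j, l))
```

Because the entries have the form t·ζ^{jl}, the matrix is a scaled Vandermonde matrix in the powers of ζ, which is the structure the abstract relies on for the rank argument.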
GNN-driven Beamforming and Resource Allocation for RIS-assisted MISO-OFDMA Multi-group Multicast System
MA Yu, DING Chunxia, JIN Weijie, LI Xiao, JIN Shi
 doi: 10.11999/JEIT251381
[Abstract](92) [FullText HTML](46) [PDF 3666KB](13)
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RISs) have strong potential to improve coverage and Spectral Efficiency (SE) in future wireless networks. However, when RISs are applied to wideband Multiple-Input Single-Output Orthogonal Frequency Division Multiple Access (MISO-OFDMA) systems, their practical benefits are limited by two key challenges. First, RIS reflection coefficients may not match the frequency-selective channel conditions across all subcarriers. Second, subcarrier allocation, Base Station (BS) active beamforming, and RIS passive beamforming are strongly coupled. These challenges become more serious in multi-group multicast scenarios, where shared data streams increase inter-group interference. Therefore, this article proposes a Graph Neural Network (GNN)-driven optimization framework to maximize the system SE through joint active beamforming, passive beamforming, and subcarrier allocation.  Methods  To address the optimization difficulty caused by the strong coupling among subcarrier allocation, BS active beamforming, and RIS passive beamforming, this work develops a model-driven GNN optimization framework. The objective is to maximize the system SE. First, a complete system model containing the BS, RIS, and multi-group multicast users is established (Fig. 1). The formulation includes practical constraints, such as the BS transmit power limit, the unit-modulus constraint of RIS elements, and the binary constraint on subcarrier allocation. To satisfy the multicast requirement, the SE of each group is defined as the minimum SE among all users in that group. This definition further increases the non-convexity of the optimization problem. The first component of the proposed network, GNN1 (Fig. 3), contains an initialization layer and a message-update layer. 
For each subcarrier \begin{document}$ n\in \mathcal{N} $\end{document}, every user is modeled as a node, and the input to GNN1 is the set of channel matrices \begin{document}$ \left\{{\mathbf{H}}_{k,n},k\in \mathcal{K}\right\} $\end{document}. Because standard GNNs process real-valued features, each complex channel vector is decomposed into its real and imaginary parts and used as the node feature representation. Group-level aggregation (Fig. 4) and RIS-level aggregation (Fig. 5) are then performed. GNN2 (Fig. 6) takes the subcarrier-wise embeddings generated by GNN1 as input and constructs an expanded graph with group nodes (Fig. 7) and an RIS node (Fig. 8). By aggregating messages among subcarrier nodes, group nodes, and the RIS node, GNN2 fuses cross-subcarrier information and captures the global coupling among system components. Based on the integrated representation, GNN2 outputs the BS active beamforming matrix and RIS passive beamforming vector. Output-layer normalization is used to satisfy the physical constraints. Finally, given the beamforming parameters, subcarrier allocation is performed using the maximum-SE criterion. The learning objective is defined as maximizing the total SE.  Results and Discussions  The proposed GNN algorithm consistently outperforms all random benchmark schemes, including APG-randAllocate, APG-randActive, and APG-randPassive, across the full transmit power range from 0 to 20 dBm. This advantage indicates that the proposed method can dynamically handle subcarrier allocation and joint active and passive beamforming optimization. It also maintains stable and superior performance under large transmit-power variations. Overall, the system SE of all schemes increases monotonically with BS transmit power because higher transmit power improves the received signal-to-noise ratio and increases the achievable rate. 
Compared with the benchmark methods, the GNN adaptively coordinates BS active beamforming and RIS passive beamforming at different power levels and better uses the reflection gain provided by the RIS. Therefore, the GNN maintains a consistent performance advantage across the full power range. Even in the high-power region, it outperforms APG and LAO, which further verifies its robustness (Fig. 10). When the number of RIS elements varies, the GNN maintains a clear performance advantage over both APG and LAO. In general, the system SE increases with the number of RIS elements because a larger RIS provides higher array gain and improves the equivalent channel conditions. According to the numerical results, the proposed GNN achieves a spectral efficiency of 2.0665 bit/(s·Hz), which is approximately 6.94% and 3.65% higher than those of LAO and APG, respectively. Meanwhile, the average computational time of the GNN is only about 0.0075 s, which is approximately 4% of that required by the benchmark methods. These results demonstrate that the proposed GNN effectively uses the performance gain provided by RIS scaling and achieves a good balance between system performance and computational complexity (Fig. 11 and Table 2). The relationship between system SE and the number of user groups is then examined under fixed settings for the number of transmit antennas and users. The overall SE decreases as the number of user groups increases. This decrease occurs because more multicast groups lead to stronger inter-group interference and because limited subcarrier resources must be shared among more groups. In all considered scenarios, the proposed GNN consistently outperforms LAO. Although its SE is slightly lower than that of APG, the GNN still achieves about 98% of APG performance while requiring only about 4% of the computational time. 
This result indicates that the proposed method can reduce computational overhead while maintaining near-optimal system performance, which is useful for real-time or large-scale deployment (Fig. 12). The generalization ability of the proposed GNN is further evaluated by training the model at a fixed transmit power and testing it over a wide transmit power range from 0 to 20 dBm. The training and testing curves almost overlap, indicating that the proposed GNN generalizes well to unseen transmit power levels. Across the full power range, the GNN consistently outperforms the LAO and APG benchmarks, further confirming its robustness and adaptability under different transmission conditions (Fig. 13).  Conclusions  For the RIS-assisted MISO-OFDMA system, this paper formulates a joint optimization problem for subcarrier allocation, BS active beamforming, and RIS passive beamforming to maximize the system SE. A model-driven GNN method is proposed to solve this problem. Comparative experiments with benchmark algorithms are conducted to validate the proposed method. The results demonstrate that the proposed GNN algorithm consistently outperforms LAO and APG in overall performance. It also exhibits strong robustness under different numbers of user groups and transmit power settings, which supports its potential for practical deployment in complex engineering scenarios.
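The constraint handling described in the Methods (output-layer normalization, the minimum-SE definition for a multicast group, and maximum-SE subcarrier allocation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' network: the function names, the total-power form of the BS constraint, and the per-subcarrier SE table are all hypothetical, and the GNN itself is omitted.

```python
import math

# Hypothetical helpers illustrating the constraints named in the abstract.
# None of these names or data layouts come from the paper.


def normalize_ris(raw):
    """Project raw complex outputs onto the unit circle (unit-modulus constraint)."""
    return [z / abs(z) if abs(z) > 0 else 1 + 0j for z in raw]


def scale_to_power(beams, p_max):
    """Rescale beamforming vectors so the total transmit power equals p_max.

    Assumes a single total-power budget; meeting it with equality also
    satisfies a "power <= p_max" limit.
    """
    total = sum(abs(w) ** 2 for beam in beams for w in beam)
    if total == 0:
        return beams
    scale = math.sqrt(p_max / total)
    return [[w * scale for w in beam] for beam in beams]


def group_se(sinrs):
    """Multicast SE of one group: the minimum SE over the users in that group."""
    return min(math.log2(1.0 + s) for s in sinrs)


def allocate_subcarriers(se_table):
    """se_table[n][g] is the SE of group g on subcarrier n.

    Returns, for each subcarrier, the index of the group with the largest SE
    (a plain maximum-SE rule, which yields a binary allocation).
    """
    return [max(range(len(row)), key=row.__getitem__) for row in se_table]


if __name__ == "__main__":
    print(normalize_ris([3 + 4j, -2j]))
    print(group_se([3.0, 1.0, 7.0]))
    print(allocate_subcarriers([[0.5, 1.2], [2.0, 0.1]]))
```

Taking the minimum over users is what makes the group rate non-smooth, which is consistent with the abstract's remark that this definition increases the non-convexity of the problem.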
2026, 48(4): 1-1.  
[Abstract](96) [FullText HTML](63) [PDF 689KB](16)
Abstract:
2026, 48(4): 1-4.  
[Abstract](72) [FullText HTML](62) [PDF 287KB](11)
Abstract:
Special Topic on Security and Privacy Protection in Cyber-Physical Systems
AutoPenGPT: Drift-Resistant Penetration Testing Driven by Search-Space Convergence and Dependency Modeling
HUANG Weigang, FU Lirong, LIU Peiyu, DU Linkang, YE Tong, XIA Yifan, WANG Wenhai
2026, 48(4): 1401-1411.   doi: 10.11999/JEIT250873
[Abstract](544) [FullText HTML](305) [PDF 3167KB](59)
Abstract:
  Objective  Industrial Control Systems (ICS) are widely deployed in critical sectors and often contain long-standing vulnerabilities due to strict availability requirements and limited patching opportunities. The increasing exposure of external management and access infrastructure has expanded the attack surface and allows adversaries to pivot from boundary components into fragile production networks. Continuous penetration testing of these components is essential but remains costly and difficult to scale when carried out manually. Recent work examines Large Language Models (LLMs) for automated penetration testing; however, existing systems often experience strategy drift and intention drift, which produce incoherent testing behaviors and ineffective exploitation chains.  Methods  This study proposes AutoPenGPT, a multi-agent framework for automated Web security testing. AutoPenGPT uses an adaptive exploration-space convergence mechanism that predicts likely vulnerability types from target semantics and constrains LLM-driven testing through a dynamically updated payload knowledge base. To reduce intention drift in multi-step exploitation, a dependency-driven strategy module rewrites historical feedback, models step dependencies, and generates coherent, executable strategies in a closed-loop workflow. A semi-structured prompt embedding scheme is also developed to support heterogeneous penetration testing tasks while preserving semantic integrity.  Results and Discussions  AutoPenGPT is evaluated on Capture-the-Flag (CTF) benchmarks and real-world ICS and Web platforms. On CTF datasets, it achieves 97.62% vulnerability-type detection accuracy and an 80.95% requirement completion rate, exceeding state-of-the-art tools by a wide margin. In real-world deployments, it reaches approximately 70% requirement completion and identifies seven previously undisclosed vulnerabilities, demonstrating practical effectiveness.  Conclusions   The contributions are threefold. 
(1) Strategy drift and intention drift in LLM-driven penetration testing are examined and addressed through adaptive exploration and dependency-aware strategy mechanisms that stabilize long-horizon testing behaviors. (2) AutoPenGPT is designed and implemented as a multi-agent penetration testing system that integrates semantic vulnerability prediction, closed-loop strategy generation, and semi-structured prompt embedding. (3) Extensive evaluation on CTF and real-world ICS and Web platforms confirms the effectiveness and practicality of the system, including the discovery of previously unknown vulnerabilities.
Resilient Average Consensus for Second-Order Multi-Agent Systems: Algorithms and Application
FANG Chongrong, HUAN Yuehui, ZHENG Wenzhe, BAO Xianchen, LI Zheng
2026, 48(4): 1412-1423.   doi: 10.11999/JEIT251155
[Abstract](330) [FullText HTML](187) [PDF 4881KB](45)
Abstract:
  Objective  Multi-Agent Systems (MASs) are central to collaborative tasks in dynamic environments, and consensus algorithms are essential for applications such as formation control. However, MASs are vulnerable to misbehaviors (e.g., malicious attacks or accidental faults) that disrupt consensus and degrade system performance. Existing resilient consensus methods for first-order systems are insufficient for second-order MASs, where both position and velocity states must be considered. This study develops a resilient average consensus framework for second-order MASs that maintains accurate collaboration under misbehaviors. The main challenges are distributed error detection and compensation for two-dimensional state errors (position and velocity) using one-dimensional acceleration inputs.  Methods  The study derives sufficient conditions for second-order average consensus under misbehaviors using graph theory and Lyapunov stability analysis. The system is modeled as an undirected graph \begin{document}$ \mathcal{G}=(\mathcal{V},\mathcal{E}) $\end{document}, and agents follow double-integrator dynamics. Two algorithms are proposed. Finite Input-Errors Detection-Compensation (FIDC): For finite control input errors, Detection Strategies 1 and 2 use two-hop communication to detect discrepancies in neighbors’ states or control inputs. Compensation Scheme 1 generates input sequences that satisfy the consensus conditions in Corollary 1. Infinite Attack Detection-Compensation (IADC): For infinite errors in control inputs, velocities, and positions, the detection strategies are extended to identify falsified data. Compensation Schemes 2 and 3 reduce the effect of these errors, and an exponentially decaying error bound isolates persistent attackers. The algorithms are fully distributed and require no global information.  Results and Discussions  Simulations on a 10-agent network demonstrate the effectiveness of the algorithms. 
Under FIDC, agents reach exact average consensus despite finite input errors caused by malicious or faulty agents (Fig. 3). IADC ensures consensus among normal agents after isolating malicious agents that exceed the error bound (Fig. 4). Experiments on a multi-robot platform confirm resilience to real-world faults (e.g., actuator failures) and attacks (e.g., false data injection). In fault scenarios, FIDC reduces the deviation of the formation center from 180 mm to 34 mm (Fig. 6). Under attacks, IADC isolates malicious robots, allowing normal agents to converge correctly (Fig. 7). Analyses of relaxed Assumption 1 (non-adjacent misbehaving agents) show that Detection Strategy 3 and majority voting address certain connected malicious topologies (Fig. 2), although complex cases need further study.   Conclusions  This work presents a resilient average consensus framework for second-order MASs. Theoretically, the study provides sufficient conditions for consensus under misbehaviors. The FIDC and IADC algorithms enable distributed detection, compensation, and isolation of errors. Simulations and physical experiments verify that the methods achieve accurate average consensus under both finite and infinite errors. Future research will explore extensions to directed networks, time-varying topologies, and higher-dimensional systems.
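For readers unfamiliar with the nominal problem, the attack-free baseline that FIDC and IADC protect can be sketched as a discrete-time simulation of double-integrator agents on an undirected graph. This is a standard textbook-style consensus law, not the FIDC or IADC algorithm; the gains, step size, ring topology, and Euler discretization are assumptions chosen for illustration.

```python
# Nominal (attack-free) second-order consensus on an undirected graph.
# Assumptions (not from the paper): Euler discretization with step dt,
# gains kp and kv, and a 4-agent ring topology in the usage example.


def simulate(positions, velocities, edges, steps=2000, dt=0.05, kp=1.0, kv=2.0):
    """Simulate double-integrator agents under a position/velocity consensus law."""
    x = list(positions)
    v = list(velocities)
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:  # undirected: each edge is used in both directions
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(steps):
        # Acceleration input: pull toward neighbors in both position and velocity.
        u = [
            -sum(kp * (x[i] - x[j]) + kv * (v[i] - v[j]) for j in neighbors[i])
            for i in range(n)
        ]
        x = [x[i] + dt * v[i] for i in range(n)]
        v = [v[i] + dt * u[i] for i in range(n)]
    return x, v


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    final_x, final_v = simulate([0.0, 2.0, 4.0, 10.0], [0.0] * 4, ring)
    print(final_x, final_v)
```

On an undirected graph the inputs sum to zero at every step, so the average velocity is preserved; with zero initial velocities the average position is preserved as well, which is why the agents settle at the initial mean. A single agent that injects an erroneous input breaks this invariant, which is the kind of error the detection and compensation schemes in the abstract are designed to address.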
Data-Driven Secure Control for Cyber-Physical Systems under Denial-of-Service Attacks: An Online Mode-Dependent Switching-Q-Learning Algorithm
ZHANG Ruifeng, YANG Rongni
2026, 48(4): 1424-1433.   doi: 10.11999/JEIT250746
[Abstract](450) [FullText HTML](217) [PDF 2095KB](69)
Abstract:
  Objective   The open network architecture of Cyber-Physical Systems (CPSs) enables flexibility and scalability, but also increases vulnerability to cyber-attacks. In particular, Denial-of-Service (DoS) attacks represent a predominant threat, causing packet loss and performance degradation by channel jamming. CPSs under dormant and active DoS attacks can be modeled as dual-mode switched systems with stable and unstable subsystems, respectively. Therefore, switched system theory provides a promising framework for secure control design with high degrees of freedom and reduced conservatism. However, exact modeling of practical CPSs remains difficult due to attacks and noise. Although Q-learning-based control shows potential for unknown CPSs, a critical gap persists for switched systems with unstable modes, especially in establishing an evaluable stability criterion. Hence, learning-based secure control design and an evaluable security criterion for unknown CPSs under DoS attacks remain open problems.  Methods   An online mode-dependent switching-Q-learning algorithm is proposed to study data-driven secure control and an evaluable criterion for unknown CPSs under DoS attacks. First, CPSs under dormant and active DoS attacks are transformed into switched systems with stable and unstable subsystems, respectively. Then, the optimal control problem of the value function is addressed for model-based switched systems by constructing a Generalized Switching Algebraic Riccati Equation (GSARE) and deriving the corresponding mode-dependent optimal security controller. The existence and uniqueness of the GSARE solution are proved. Based on these results, a data-driven optimal security control law is developed through a novel online mode-dependent switching-Q-learning algorithm. 
Finally, by using the learned control gains and parameter matrices, a data-driven evaluable security criterion related to attack frequency and duration is established under switching and subsystem constraints.  Results and Discussions   Comparative experiments using a wheeled robot are conducted to verify the efficiency and advantages of the proposed methods. First, comparison between the model-based result (Theorem 1) and the data-driven result (Algorithm 1) shows that the optimal control gains and parameter matrices under threshold errors are successfully obtained from both the GSARE and the proposed learning algorithm, as indicated by the iterative curves (Fig. 2 and Fig. 3). Meanwhile, the tracking errors of the CPS converge to zero under the proposed data-driven controller (Fig. 5), ensuring exponential stability and verifying algorithm effectiveness. Second, the learning process curves (Fig. 4) show that although the initial learned control gain is not stabilizing, Algorithm 1 still converges to an optimal stabilizing gain. This result reduces conservatism compared with existing Q-learning approaches that require stabilizing initial gains. Third, comparison between the proposed data-driven evaluable security criterion (Theorem 2) and existing criteria shows that, even when the learned switching parameters do not satisfy conventional dwell-time constraints, the proposed criterion yields attack frequency and duration bounds under new switching and subsystem constraints. As shown in Tab. 1, the proposed criterion is less conservative than existing evaluable criteria. Finally, applying the learned controller and obtained DoS constraints to robot tracking control demonstrates faster and more accurate trajectory tracking compared with existing Q-learning controllers (Fig. 6 and Fig. 7), confirming the advantages of the proposed approach.  
Conclusions   Based on switched system theory and learning-based control, an online mode-dependent switching-Q-learning algorithm and a corresponding evaluable security criterion are presented for unknown CPSs under DoS attacks. (1) By representing CPSs under dormant and active DoS attacks as switched systems with stable and unstable subsystems, respectively, the security problem is transformed into a stabilization problem with increased design freedom and reduced conservatism. (2) A novel online mode-dependent switching-Q-learning algorithm is developed for unknown switched systems with unstable modes, and comparative experiments show reduced conservatism relative to existing Q-learning methods. (3) A data-driven evaluable security criterion is established to characterize attack frequency and duration under switching and subsystem constraints, demonstrating lower conservatism than existing criteria based on single-subsystem or dwell-time constraints.
A Learning-Based Security Control Method for Cyber-Physical Systems Based on False Data Detection
MIAO Jinzhao, LIU Jinliang, SUN Le, ZHA Lijuan, TIAN Engang
2026, 48(4): 1434-1443.   doi: 10.11999/JEIT250537
[Abstract](782) [FullText HTML](263) [PDF 1117KB](127)
Abstract:
  Objective  Cyber-Physical Systems (CPS) constitute the backbone of critical infrastructures and industrial applications, but the tight coupling of cyber and physical components renders them highly susceptible to cyberattacks. False data injection attacks are particularly dangerous because they compromise sensor integrity, mislead controllers, and can trigger severe system failures. Existing control strategies often assume reliable sensor data and lack resilience under adversarial conditions. Furthermore, most conventional approaches decouple attack detection from control adaptation, leading to delayed or ineffective responses to dynamic threats. To overcome these limitations, this study develops a unified secure learning control framework that integrates real-time attack detection with adaptive control policy learning. By enabling the dynamic identification and mitigation of false data injection attacks, the proposed method enhances both stability and performance of CPS under uncertain and adversarial environments.  Methods  To address false data injection attacks in CPS, this study proposes an integrated secure control framework that combines attack detection, state estimation, and adaptive control strategy learning. A sensor grouping-based security assessment index is first developed to detect anomalous sensor data in real time without requiring prior knowledge of attacks. Next, a multi-source sensor fusion estimation method is introduced to reconstruct the system’s true state, thereby improving accuracy and robustness under adversarial disturbances. Finally, an adaptive learning control algorithm is designed, in which dynamic weight updating via gradient descent approximates the optimal control policy online. This unified framework enhances both steady-state performance and resilience of CPS against sophisticated attack scenarios. 
Its effectiveness and security performance are validated through simulation studies under diverse false data injection attack settings.  Results and Discussions  Simulation results confirm the effectiveness of the proposed secure adaptive learning control framework under multiple false data injection attacks in CPS. As shown in Fig. 1, system states rapidly converge to steady values and maintain stability despite sensor attacks. Fig. 2 demonstrates that the fused state estimator tracks the true system state with greater accuracy than individual local estimators. In Fig. 3, the compensated observation outputs align closely with the original, uncorrupted measurements, indicating precise attack estimation. Fig. 4 shows that detection indicators for sensor groups 2–5 increase sharply during attack intervals, while unaffected sensors remain near zero, verifying timely and accurate detection. Fig. 5 further confirms that the estimated attack signals closely match the true injected values. Finally, Fig. 6 compares different control strategies, showing that the proposed method achieves faster stabilization and smaller state deviations. Together, these results demonstrate robust control, accurate state estimation, and real-time detection under unknown attack conditions.  Conclusions  This study addresses secure perception and control in CPS under false data injection attacks by developing an integrated adaptive learning control framework that unifies detection, estimation, and control. A sensor-level anomaly detection mechanism is introduced to identify and localize malicious data, substantially enhancing attack detection capability. The fusion-based state estimation method further improves reconstruction accuracy of true system states, even when observations are compromised. At the control level, an adaptive learning controller with online weight adjustment enables real-time approximation of the optimal control policy without requiring prior knowledge of the attack model. 
Future research will extend the proposed framework to broader application scenarios and evaluate its resilience under diverse attack environments.
A Two-Stage Framework for CAN Bus Attack Detection by Fusing Temporal and Deep Features
TAN Mingming, ZHANG Heng, WANG Xin, LI Ming, ZHANG Jian, YANG Ming
2026, 48(4): 1444-1453.   doi: 10.11999/JEIT250651
[Abstract](620) [FullText HTML](334) [PDF 1955KB](47)
Abstract:
  Objective  The Controller Area Network (CAN), the de facto standard for in-vehicle communication, is inherently vulnerable to cyberattacks. Existing Intrusion Detection Systems (IDSs) face a fundamental trade-off: achieving fine-grained classification of diverse attack types often requires computationally intensive models that exceed the resource limitations of on-board Electronic Control Units (ECUs). To address this problem, this study proposes a two-stage attack detection framework for the CAN bus that fuses temporal and deep features. The framework is designed to achieve both high classification accuracy and computational efficiency, thereby reconciling the tension between detection performance and practical deployability.  Methods  The proposed framework adopts a “detect-then-classify” strategy and incorporates two key innovations. (1) Stage 1: Temporal Feature-Aware Anomaly Detection. Two custom features are designed to quantify anomalies: Payload Data Entropy (PDE), which measures content randomness, and ID Frequency Mean Deviation (IFMD), which captures behavioral deviations. These features are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network that exploits contextual temporal information to achieve high-recall anomaly detection. (2) Stage 2: Deep Feature-Based Fine-Grained Classification. Triggered only for samples flagged as anomalous, this stage employs a lightweight one-dimensional ParC1D-Net. The core ParC1D Block (Fig. 4) integrates depthwise separable one-dimensional convolution, Squeeze-and-Excitation (SE) attention, and a Feed-Forward Network (FFN), enabling efficient feature extraction with minimal parameters. Stage 1 is optimized using BCEWithLogitsLoss, whereas Stage 2 is trained with Cross-Entropy Loss.  Results and Discussions  The efficacy of the proposed framework is evaluated on public datasets. (1) State-of-the-art performance. 
On the Car-Hacking dataset (Table 4), an accuracy and F1-score of 99.99% are achieved, exceeding advanced baselines. On the more challenging Challenge dataset (Table 5), superior accuracy (99.90%) and a competitive F1-score (99.70%) are also obtained. (2) Feature contribution analysis. Ablation studies (Tables 6 and 7) confirm the critical role of the proposed features. Removal of the IFMD feature results in the largest performance reduction, highlighting the importance of behavioral modeling. A synergistic effect is observed when PDE and IFMD are applied together. (3) Spatiotemporal efficiency. The complete model remains lightweight at only 0.39 MB. Latency tests (Tables 8 and 9) demonstrate real-time capability, with average detection times of 0.62 ms on a GPU and 0.93 ms on a simulated CPU (batch size = 1). A system-level analysis (Section 3.5.4) further shows that the two-stage framework is approximately 1.65 times more efficient than a single-stage model in a realistic sparse-attack scenario.  Conclusions  This study establishes the two-stage framework as an effective and practical solution for CAN bus intrusion detection. By decoupling detection from classification, the framework resolves the trade-off between accuracy and on-board deployability. Its strong performance, combined with a minimal computational footprint, indicates its potential for securing real-world vehicular systems. Future research could extend the framework and explore hardware-specific optimizations.
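The abstract describes Payload Data Entropy (PDE) only as a measure of content randomness. One plausible reading, assumed here and not confirmed by the paper, is the Shannon entropy of the byte values in a CAN frame's data field. The sketch below shows that reading; the function name and the choice of byte-level symbols are hypothetical.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Shannon entropy (in bits) of the byte-value distribution of one payload.

    Assumed definition for illustration; the paper's exact PDE formula may differ.
    """
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)  # iterating over bytes yields integer byte values
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    print(payload_entropy(bytes(8)))         # constant payload
    print(payload_entropy(bytes(range(8))))  # eight distinct byte values
```

Under this reading, a constant payload scores zero and an 8-byte payload with all-distinct bytes scores the maximum of 3 bits, so randomized attack payloads such as fuzzing frames would tend to stand out from regular sensor traffic.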
Modeling, Detection, and Defense Theories and Methods for Cyber-Physical Fusion Attacks in Smart Grid
WANG Wenting, TIAN Boyan, WU Fazong, HE Yunpeng, WANG Xin, YANG Ming, FENG Dongqin
2026, 48(4): 1454-1468.   doi: 10.11999/JEIT250659
[Abstract](568) [FullText HTML](496) [PDF 1035KB](62)
Abstract:
  Significance   Smart Grid (SG), the core of modern power systems, enables efficient energy management and dynamic regulation through cyber-physical integration. However, its high interconnectivity makes it a prime target for cyberattacks, including False Data Injection Attacks (FDIAs) and Denial-of-Service (DoS) attacks. These threats jeopardize the stability of power grids and may trigger severe consequences such as large-scale blackouts. Therefore, advancing research on the modeling, detection, and defense of cyber-physical attacks is essential to ensure the safe and reliable operation of SGs.  Progress   Significant progress has been achieved in cyber-physical security research for SGs. In attack modeling, discrete linear time-invariant system models effectively capture diverse attack patterns. Detection technologies are advancing rapidly, with physical-based methods (e.g., physical watermarking and moving target defense) complementing intelligent algorithms (e.g., deep learning and reinforcement learning). Defense systems are also being strengthened: lightweight encryption and blockchain technologies are applied to prevention, security-optimized Phasor Measurement Unit (PMU) deployment enhances equipment protection, and response mechanisms are being continuously refined.  Conclusions  Current research still requires improvement in attack modeling accuracy and real-time detection algorithms. Future work should focus on developing collaborative protection mechanisms between the cyber and physical layers, designing solutions that balance security with cost-effectiveness, and validating defense effectiveness through high-fidelity simulation platforms. This study establishes a systematic theoretical framework and technical roadmap for SG security, providing essential insights for safeguarding critical infrastructure.  
Prospects   Future research should advance in several directions: (1) deepening synergistic defense mechanisms between the information and physical layers; (2) prioritizing the development of cost-effective security solutions; (3) constructing high-fidelity information-physical simulation platforms to support research; and (4) exploring the application of emerging technologies such as digital twins and interpretable Artificial Intelligence (AI).
ReXNet: A Trustworthy Framework for Space-air Security Integrating Uncertainty Quantification and Explainability
LIU Zhuang, CHEN Yuran, ZHANG Jiatong, JIANG Yujing, WANG Xuhui
2026, 48(4): 1469-1479.   doi: 10.11999/JEIT251159
[Abstract](363) [FullText HTML](196) [PDF 8945KB](55)
Abstract:
  Objective  The Space-Air-Ground Integrated Network (SAGIN) has emerged as a strategic infrastructure for national development. However, its security vulnerabilities are increasingly evident. The physical, network, and application layers of SAGIN face different security challenges that require targeted protection strategies. Aerospace scenarios require both high predictive accuracy and transparent decision making. Therefore, more robust, reliable, and interpretable intelligent methods are needed to support network security and system trustworthiness.  Methods  A detection framework is proposed that integrates Uncertainty Quantification (UQ) and eXplainable Artificial Intelligence (XAI). In the front-end stage, a Bayesian deep learning method based on Monte Carlo Dropout is adopted to enable probabilistic prediction modeling. This approach separates and quantifies epistemic uncertainty and aleatoric uncertainty, which improves model reliability. In the back-end stage, SHAP and LIME are applied to provide feature attribution for each prediction, improving model interpretability and transparency. Moreover, the intermediate layer of the framework allows flexible replacement of deep learning backbones, enabling adaptation to different space and aerospace application scenarios.  Results and Discussions  Extensive experiments were conducted on representative space-air security datasets, including UAV swarm fault detection, ADS-B injection attacks, and network fraud detection. The experimental results show that the proposed framework achieves high-precision anomaly detection. It also evaluates prediction confidence and identifies unknown samples outside the model knowledge boundary. In addition, the framework generates logically consistent and traceable explanations for model decisions, which improves interpretability and operational reliability. 
The results indicate that the combined use of UQ and XAI improves the robustness and trustworthiness of intelligent models in aerospace security applications.  Conclusions  This study improves the reliability and transparency of anomaly detection models in the space-air domain. It reflects a transition in artificial intelligence applications from focusing only on prediction accuracy to emphasizing system trustworthiness. Future work will promote practical deployment of the framework. The focus will include real-time processing capability, lightweight implementation, and operation in resource-constrained environments such as onboard and on-orbit systems. These efforts support more secure, autonomous, and efficient operation of SAGIN and contribute to the sustainable development of future space-air information networks.
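The separation of epistemic and aleatoric uncertainty described in the Methods can be illustrated with a short sketch. It assumes one common entropy-based decomposition over repeated stochastic forward passes (dropout kept active at inference); the abstract does not state which decomposition ReXNet uses, and the function name `decompose_uncertainty` and the array layout are illustrative assumptions.

```python
import numpy as np

def decompose_uncertainty(probs, eps=1e-12):
    """Split predictive uncertainty from Monte Carlo Dropout samples.

    probs: array of shape (T, N, C) holding softmax outputs from T
    stochastic forward passes over N samples and C classes (assumed layout).
    Returns (total, aleatoric, epistemic), each of shape (N,).
    """
    mean_probs = probs.mean(axis=0)  # average prediction, shape (N, C)
    # Total uncertainty: entropy of the averaged prediction.
    total = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Aleatoric part: average entropy of the individual passes.
    aleatoric = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    # Epistemic part: the remainder (mutual information between
    # prediction and model parameters).
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Passes that agree on an ambiguous prediction: uncertainty is aleatoric.
agree = np.array([[[0.5, 0.5]], [[0.5, 0.5]]])
# Passes that are individually confident but disagree: uncertainty is epistemic.
disagree = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
```

Under this decomposition, a sample with high epistemic uncertainty is a candidate for the "unknown samples outside the model knowledge boundary" mentioned in the abstract.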
LLM-based Data Compliance Checking for Internet of Things Scenarios
LI Chaohao, WANG Haoran, ZHOU Shaopeng, YAN Haonan, ZHANG Feng, LU Tianyang, XI Ning, WANG Bin
2026, 48(4): 1480-1494.   doi: 10.11999/JEIT250704
[Abstract](489) [FullText HTML](284) [PDF 5467KB](47)
Abstract:
  Objective  The implementation of regulations such as the Data Security Law of the People’s Republic of China, the Personal Information Protection Law of the People’s Republic of China, and the European Union General Data Protection Regulation (GDPR) has established data compliance checking as a central mechanism for regulating data processing activities, ensuring data security, and protecting the legitimate rights and interests of individuals and organizations. However, the characteristics of the Internet of Things (IoT), defined by large numbers of heterogeneous devices and the dynamic, extensive, and variable nature of transmitted data, increase the difficulty of compliance checking. Logs and traffic data generated by IoT devices are long, unstructured, and often ambiguous, which results in a high false-positive rate when traditional rule-matching methods are applied. In addition, the dynamic business environments and user-defined compliance requirements further increase the complexity of rule design, maintenance, and decision-making.  Methods  A large language model-driven data compliance checking method for IoT scenarios is proposed to address the identified challenges. In the first stage, a fast regular expression matching algorithm is employed to efficiently screen potential non-compliant data based on a comprehensive rule database. This process produces structured preliminary checking results that include the original non-compliant content and the corresponding violation type. The rule database incorporates current legislation and regulations, standard requirements, enterprise norms, and customized business requirements, and it maintains flexibility and expandability. By relying on the efficiency of regular expression matching and generating structured preliminary results, this stage addresses the difficulty of reviewing large volumes of long IoT text data and enhances the accuracy of the subsequent large language model review. 
In the second stage, a Large Language Model (LLM) is employed to evaluate the precision of the initial detection results. For different categories of violations, the LLM adaptively selects different prompt words to perform differentiated classification detection.  Results and Discussions  Data are collected from 52 IoT devices operating in a real environment, including log and traffic data (Table 2). A compliance-checking rule library for IoT devices is established in accordance with the Cybersecurity Law, the Data Security Law, other relevant regulations, and internal enterprise information-security requirements. Based on this library, the collected data undergo a first-stage rule-matching process, yielding a false-positive rate of 64.3% and identifying 55 080 potential non-compliant data points. Three aspects are examined: benchmark models, prompt schemes, and role prompts. In the benchmark model comparison, eight mainstream large language models are used to evaluate detection performance (Table 5), including Qwen2.5-32B-Instruct, DeepSeek-R1-70B, and DeepSeek-R1-0528 with different parameter configurations. After review and testing by the large language model, the initial false-positive rate is reduced to 6.9%, which demonstrates a substantial improvement in the quality of compliance checking. The model’s own error rate remains below 0.01%. The prompt-engineering assessment shows that prompt design exerts a strong effect on review accuracy (Table 6). When general prompts are applied, the final false-positive rate remains high at 59%. When only chain-of-thought prompts or concise sample prompts are used, the false-positive rate is reduced to approximately 12% and 6%, respectively, and the model’s own error rate decreases to about 30% and 13%. Combining these strategies further reduces the error rate of the small-sample prompt approach to 0.01%. The effect of system-role prompt words on review accuracy is also evaluated (Table 7). 
Simple role prompts yield higher accuracy and F1 scores than the absence of role prompts, whereas detailed role prompts provide a clearer overall advantage than simple role prompts. Ablation experiments (Table 8) further examine the contribution of rule classification and prompt engineering to compliance checking. Knowledge supplementation is applied to reduce interference and misjudgment among rules, lower prompt redundancy, and decrease the false-alarm rate during large language model review.  Conclusions  A large language model-driven data compliance checking method for IoT scenarios is presented. The method is designed to address the challenge of assessing compliance in large-scale unstructured device data. Its feasibility is verified through rationality analysis experiments, and the results indicate that false-positive rates are effectively reduced during compliance checking. The initial rule-based method yields a false-positive rate of 64.3%, which is reduced to 6.9% after review by the large language model. Additionally, the error introduced by the model itself is maintained below 0.01%.
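The two-stage flow described above (regular-expression screening followed by LLM review with a prompt chosen per violation type) can be sketched as follows. This is a minimal illustration under stated assumptions: the rules in `RULES`, the prompt templates in `PROMPTS`, and the `ask_llm` callable are hypothetical placeholders, not the rule database, prompts, or model interface used in the paper.

```python
import re
from dataclasses import dataclass

# Hypothetical rule database: violation type -> compiled pattern.
RULES = {
    "phone_number": re.compile(r"\b1[3-9]\d{9}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Hypothetical per-category prompt templates for the second stage.
PROMPTS = {
    "phone_number": "Is '{content}' a personal phone number in this log line? "
                    "Answer yes or no.\nLine: {context}",
    "email": "Is '{content}' a personal email address in this log line? "
             "Answer yes or no.\nLine: {context}",
}

@dataclass
class Finding:
    violation_type: str  # which rule fired
    content: str         # the matched text
    line_no: int         # 1-based line number in the input

def screen(lines):
    """Stage 1: fast rule matching that yields structured preliminary results."""
    findings = []
    for i, line in enumerate(lines, 1):
        for vtype, pattern in RULES.items():
            for m in pattern.finditer(line):
                findings.append(Finding(vtype, m.group(0), i))
    return findings

def review(findings, lines, ask_llm):
    """Stage 2: keep only findings that the reviewer confirms.

    ask_llm is any callable mapping a prompt string to an answer string.
    """
    confirmed = []
    for f in findings:
        prompt = PROMPTS[f.violation_type].format(
            content=f.content, context=lines[f.line_no - 1])
        if ask_llm(prompt).strip().lower().startswith("yes"):
            confirmed.append(f)
    return confirmed

log_lines = [
    "user=alice contact=13812345678",
    "seq=17000000000 status=ok",   # sequence number that looks like a phone number
    "mail to bob@example.com",
]
# Stand-in reviewer for demonstration only; a real deployment would call an LLM.
stub_llm = lambda prompt: "no" if "seq=" in prompt else "yes"
```

In this toy input the rule stage flags the sequence number as a phone number, which is the kind of rule-matching false positive the second stage is meant to remove.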
A Complexity-Reduced Active Interference Cancellation Algorithm in f-OFDM
CHEN Hao, WEN Jiangang, ZOU Yuanping, HUA Jingyu, SHENG Bin
2026, 48(4): 1495-1504.   doi: 10.11999/JEIT251172
[Abstract](245) [FullText HTML](128) [PDF 5498KB](23)
Abstract:
  Objective  Due to spectrum scarcity and diverse communication requirements, a waveform technology with high spectral efficiency, flexible subband configuration, and support for asynchronous communication is required for Sixth Generation mobile communication (6G). Among the candidate waveforms, filtered Orthogonal Frequency Division Multiplexing (f-OFDM) is considered a promising solution that satisfies these requirements. By applying subband filtering, f-OFDM enables flexible subband configuration and asynchronous transmission. However, the filtering mechanism inevitably introduces intrinsic interference into the system. A dominant component of this interference is InTer-subBand Interference (ITBI), which is mainly caused by Out-Of-Band Emission (OOBE) leakage from adjacent subbands. Therefore, suppressing subband OOBE is essential for reducing ITBI and improving the performance of f-OFDM systems. Based on the structure of f-OFDM systems, a Complexity-Reduced Active Interference Cancellation (CRAIC) algorithm is proposed to suppress the OOBE of f-OFDM subbands and improve overall system performance.  Methods  First, based on the spectral structure of f-OFDM, a subset of data subcarriers in the target subband is used to generate Cancellation Carriers (CCs). A CRAIC optimization model for f-OFDM systems is then constructed under the constraint of CCs power. The cost function is defined according to the superposed spectrum of data subcarriers and CCs at Desired Frequency Points (DFPs). Second, by introducing a real-complex domain transformation and reformulating the optimization model, the original complex-domain CRAIC programming problem is converted into a real-domain Second-Order Cone Programming (SOCP) problem, which enables efficient computation. 
Furthermore, computer simulations evaluate the effects of key parameters on CRAIC performance, including the number of CCs ($M$), the number of data subcarriers used to generate CCs ($K$), and the number of DFPs ($Q$). Based on these evaluations, practical recommendations are provided for configuring CRAIC parameters in f-OFDM systems.  Results and Discussions  Simulation results show that in the edge region of the adjacent subband, the proposed CRAIC algorithm produces the steepest Power Spectral Density (PSD) roll-off compared with the conventional ZP and Origin schemes. This result indicates that CRAIC provides the strongest ITBI suppression in this region and achieves the lowest Bit Error Rate (BER) for Edge Subcarriers (ESs) in the adjacent subband. Specifically, CRAIC achieves a maximum PSD reduction of 4 dB and 12 dB compared with ZP and Origin, respectively (Fig. 2a). This result occurs because the right $Q/2$ DFPs are largely located in the edge region of SB2, which leads to effective spectral suppression in this area. Therefore, the BER at the edge of SB2 is significantly lower for CRAIC than for Origin, and a visible performance improvement is also observed compared with ZP (Fig. 3a). Furthermore, the effects of key parameters $M$, $K$ and $Q$ are examined through simulations. The results show that increasing $M$ continuously improves OOBE suppression capability (Fig. 4a), although spectral efficiency gradually decreases. In contrast, increasing $K$ and $Q$ produces only limited performance improvement. When these parameters exceed certain values, further increases do not provide additional gains (Fig. 5a and Fig. 6a). 
Based on these observations, $M=4$, $K=8$, $Q=4$ are selected as typical parameter settings for the scenario considered in this study. Under this configuration, CRAIC ($K=8$) achieves significant improvements in ES BER compared with Origin and ZP (Fig. 8a), whereas the BER of Internal Subcarriers (ISs) remains nearly the same as that of the two benchmark schemes (Fig. 8b). Compared with the full-scale CRAIC scheme ($K=20$), CRAIC ($K=8$) reduces the size of the data-subcarrier mapping matrix by 60% while causing only limited BER degradation (Fig. 8a). These results indicate that the proposed algorithm preserves the performance of the full-scale Active Interference Cancellation (AIC) scheme while substantially reducing computational complexity.  Conclusions  A CRAIC algorithm for filtered OFDM systems is studied. The CRAIC optimization model is constructed under the constraint of CC power, and the cost function is defined based on the superposed spectrum of selected data subcarriers and CCs at DFPs. Through real-imaginary domain conversion and model reformulation, the complex-domain optimization problem is converted into a real-domain SOCP problem. Simulation results show that the CRAIC algorithm effectively reduces the PSD of the target subband, particularly in the transition region of the adjacent subband, which leads to clear improvement in edge BER performance. The effects of key parameters are also evaluated. Increasing $M$ increases the performance gain of CRAIC compared with ZP, although spectral efficiency decreases. Increasing $K$ improves OOBE suppression, although the gain gradually decreases and computational complexity increases. 
Increasing $Q$ does not continuously reduce PSD. Overall, the CRAIC algorithm improves subband isolation in f-OFDM systems, reduces ITBI, and improves system performance.
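The conversion of a complex-domain problem into a real-domain one, as mentioned in the Methods and Conclusions, typically relies on stacking real and imaginary parts. The sketch below shows that standard stacking for a cost of the form ‖Ax + b‖₂ with a power constraint on x. The symbols `A` (spectrum of the CCs at the DFPs), `b` (spectrum of the data subcarriers at the DFPs), and `x` (CC weights) are assumptions for illustration; the abstract does not give the exact model.

```python
import numpy as np

def to_real(A, b):
    """Map complex (A, b) to real (A_r, b_r) so that
    ||A x + b||_2 == ||A_r x_r + b_r||_2 with x_r = [Re(x); Im(x)]."""
    A_r = np.block([[A.real, -A.imag],
                    [A.imag,  A.real]])
    b_r = np.concatenate([b.real, b.imag])
    return A_r, b_r

def stack(x):
    """Stack the real and imaginary parts of a complex vector."""
    return np.concatenate([x.real, x.imag])

# Random complex instance (sizes are arbitrary illustrative choices).
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))
b = rng.normal(size=4) + 1j * rng.normal(size=4)
x = rng.normal(size=3) + 1j * rng.normal(size=3)

A_r, b_r = to_real(A, b)
cost_complex = np.linalg.norm(A @ x + b)
cost_real = np.linalg.norm(A_r @ stack(x) + b_r)
```

Because both the cost and the power of `x` are preserved by the stacking, a norm objective with a norm constraint in the complex domain becomes a second-order cone program over real variables.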
Delay Deterministic Routing Algorithm Based on Inter-controller Cooperation for Multi-layer Low Earth Orbit Satellite Networks
HUANG Longhui, DING Xiaojin, ZHANG Gengxin
2026, 48(4): 1505-1516.   doi: 10.11999/JEIT251100
[Abstract](360) [FullText HTML](257) [PDF 2128KB](20)
Abstract:
  Objective  The massive scale and large number of satellites in multi-layer Low Earth Orbit (LEO) constellations produce highly dynamic network topologies. Coupled with time-varying traffic loads, this condition causes temporal fluctuations in satellite network resources, such as available link queue size and link bandwidth. These variations make it difficult to establish stable end-to-end transmission paths and guarantee Quality of Service (QoS). To address this problem, Software-Defined Networking (SDN) is applied to multi-layer LEO constellations. SDN controllers collect network state information and enable unified management of network resources. The constellation is divided into multiple regions, with a controller deployed in each region to coordinate the operation of the constellation. A deterministic delay routing algorithm is designed within the SDN controller to compute inter-region transmission paths for traffic and satisfy deterministic delay requirements.   Methods   A deterministic delay routing algorithm based on controller cooperation is proposed for multi-layer LEO constellations. First, a regional division strategy and controller deployment scheme are designed. The satellite network is partitioned into multiple regions, each managed by a designated controller. Second, criteria are defined for Inter-Satellite Links (ISLs) between satellites within the same layer and across different layers to characterize link communication states. Third, a Time-Varying Graph (TVG) model represents the network topology and link resource attributes, including bandwidth, queue size, and link duration. This model is combined with a multi-destination Lagrange relaxation method to optimize path selection. The resulting paths satisfy both delay and delay jitter constraints. Adjacent regional controllers exchange network state information to support cooperative computation of feasible inter-region transmission paths.   
Results and Discussions   To evaluate the proposed method, a simulation system for multi-layer LEO constellations was developed. The performance of the algorithm was tested under different data transmission rates. Compared with IUDR, the proposed method improves network performance by reducing end-to-end delay, delay jitter, and packet loss rate, and by increasing throughput. At a data transmission rate of 3 Mbps, the average end-to-end delay is reduced by 16.0% (Fig. 3(a)), delay jitter by 37.9% (Fig. 3(b)), and packet loss rate by 37.2% (Fig. 3(c)). Throughput increases by approximately 2% (Fig. 3(d)). In terms of signaling overhead, the proposed algorithm achieves a higher Reduction-Improvement Gain Ratio, which increases by approximately 111.8% compared with IUDR. This result indicates superior overall performance of the proposed Delay Deterministic Routing Algorithm based on Inter-Controller Cooperation (DDRA-ICC). Additionally, the proposed method shows lower time complexity for route computation than IUDR.   Conclusions   To address deterministic delay requirements for traffic transmission in multi-layer LEO constellations, a controller cooperation-based deterministic delay routing algorithm is proposed. Performance evaluation under different load conditions shows that: (1) Compared with IUDR, the proposed algorithm reduces the average end-to-end delay, delay jitter, and packet loss rate by 16.0%, 37.9%, and 37.2%, respectively, and increases the average throughput by approximately 2%. (2) Although the additional overhead of DDRA-ICC is comparable to that of IUDR, the packet loss rate decreases further to 2.96%, representing a reduction of 52.49%, and the Reduction-Improvement Gain Ratio reaches 1.97. These results indicate lower packet loss, a higher Reduction-Improvement Gain Ratio, and a better balance between signaling overhead and reliability. Therefore, the proposed method provides advantages in ensuring deterministic traffic transmission. 
Future work may consider additional practical factors, such as satellite node failures and their effects on network performance, to further improve system capability.
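The idea of Lagrange relaxation for delay-constrained path selection can be sketched on a static graph. The example below is a simplified single-destination illustration, not the paper's multi-destination method on a Time-Varying Graph: it folds the delay constraint into the edge weight as cost + λ·delay and searches over λ by bisection. The graph layout, function names, and the upper bound `lam_hi` are assumptions, and the result is a feasible path rather than a guaranteed optimum.

```python
import heapq

def shortest_path(graph, src, dst, lam):
    """Dijkstra on the aggregated weight cost + lam * delay.

    graph: dict mapping node -> list of (neighbor, cost, delay).
    """
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        w, u = heapq.heappop(heap)
        if u == dst:
            break
        if w > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost, delay in graph.get(u, []):
            nw = w + cost + lam * delay
            if nw < best.get(v, float("inf")):
                best[v] = nw
                prev[v] = u
                heapq.heappush(heap, (nw, v))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def path_totals(graph, path):
    """Sum cost and delay along a path (assumes no parallel edges)."""
    total_cost = total_delay = 0.0
    for u, v in zip(path, path[1:]):
        for nbr, cost, delay in graph[u]:
            if nbr == v:
                total_cost += cost
                total_delay += delay
                break
    return total_cost, total_delay

def constrained_path(graph, src, dst, max_delay, lam_hi=100.0, iters=30):
    """Return a low-cost path whose delay is at most max_delay, or None."""
    path = shortest_path(graph, src, dst, 0.0)
    if path is None:
        return None
    if path_totals(graph, path)[1] <= max_delay:
        return path  # the cheapest path already meets the delay bound
    lo, hi, feasible = 0.0, lam_hi, None
    for _ in range(iters):
        mid = (lo + hi) / 2
        candidate = shortest_path(graph, src, dst, mid)
        if path_totals(graph, candidate)[1] <= max_delay:
            feasible, hi = candidate, mid  # try a smaller delay penalty
        else:
            lo = mid                       # penalize delay more strongly
    return feasible

# Toy topology: A-B-D is cheap but slow, A-C-D is expensive but fast.
topology = {
    "A": [("B", 1, 10), ("C", 5, 2)],
    "B": [("D", 1, 10)],
    "C": [("D", 5, 2)],
    "D": [],
}
```

With a delay bound of 30 the cheap path A-B-D is returned, whereas a bound of 10 forces the faster path A-C-D. Delay jitter constraints and time-varying link attributes, which the paper handles, are outside this sketch.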
Two-Channel Joint Coding Detection for Cyber-Physical Systems Against Integrity Attacks
MO Xiaolei, ZENG Weixin, FU Jiawei, DOU Keqin, WANG Yanwei, SUN Ximing, LIN Sida, SUI Tianju
2026, 48(4): 1517-1527.   doi: 10.11999/JEIT250729
[Abstract](265) [FullText HTML](180) [PDF 1979KB](42)
Abstract:
  Objective  Cyber-Physical Systems (CPS) are widely applied across infrastructure, aviation, energy, healthcare, manufacturing, and transportation, as computing, control, and sensing technologies advance. Due to the real-time interaction between information and physical processes, such systems are exposed to security risks during data exchange. Attacks on CPS can be grouped into availability, integrity, and reliability attacks based on information security properties. Integrity attacks manipulate data streams to disrupt the consistency between system inputs and outputs. Compared with the other two types, integrity attacks are more difficult to detect because of their covert and dynamic nature. Existing detection strategies generally modify control signals, sensing signals, or system models. Although these approaches can detect specific categories of attacks, they may reduce control performance and increase model complexity and response delay.  Methods  A joint additive and multiplicative coding detection scheme for the two-channel structure of control and output is proposed. Three representative integrity attacks are tested, including a control-channel bias attack, an output-channel replay attack, and a two-channel covert attack. These attacks remain stealthy by partially or fully obtaining system information and manipulating data so the residual-based χ² detector output stays below the detection threshold. The proposed method introduces paired additive watermarking signals with positive and negative patterns, together with paired multiplicative coding and decoding matrices on both channels. These additional unknown signals and parameters introduce information uncertainty to the attacker and cause the residual statistics to deviate from the expected values constructed using known system information. The watermarking pairs and matrix pairs operate through different mechanisms. One uses opposite-sign injection, while the other uses a mutually inverse transformation. 
Therefore, normal control performance is maintained when no attack is present. The time-varying structure also prevents attackers from reconstructing or bypassing the detection mechanism.  Results and Discussions  Simulation experiments on an aerial vehicle trajectory model are conducted to assess both the influence of integrity attacks on flight paths and the effectiveness of the proposed detection scheme. The trajectory is modeled using Newton’s equations of motion, and attitude dynamics and rotational motion are omitted to focus on positional behavior. Detection performance with and without the proposed method is compared under the three attack scenarios (Fig. 2, Fig. 3, Fig. 4). The results show that the proposed scheme enables effective identification of all attack types and maintains stable system behavior, demonstrating its practical applicability and improvement over existing approaches.  Conclusions  This study addresses the detection of integrity attacks in CPS. Three representative attack types (bias, replay, and covert attacks) are modeled, and the conditions required for their successful execution are analyzed. A detection approach combining additive watermarking and multiplicative encoding matrices is proposed and shown to detect all three attack types. The design uses paired positive-negative additive watermarks and paired encoding and decoding matrices to ensure accurate detection while maintaining normal control performance. A time-varying configuration is adopted to prevent attackers from reconstructing or bypassing the detection elements. Using an aerial vehicle trajectory simulation, the proposed approach is demonstrated to be effective and applicable to cyber-physical system security enhancement.
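The claim that paired watermarks and paired coding matrices leave normal operation unaffected can be checked with a small numerical sketch. It assumes one plausible arrangement on a single channel (add the watermark, encode with `M`, transmit, decode with the inverse of `M`, subtract the watermark); the abstract does not specify the exact order or the matrices used, so `M`, `transmit`, and the signal values are illustrative assumptions.

```python
import numpy as np

# Hypothetical coding matrix and its inverse (determinant is 1).
M = np.array([[2.0, 1.0],
              [1.0, 1.0]])
M_inv = np.linalg.inv(M)

def transmit(u, w, attack=None):
    """Send signal u over one channel with watermark w and coding matrix M.

    attack: optional additive bias injected on the channel by an adversary.
    Returns the signal recovered at the receiving side.
    """
    sent = M @ (u + w)           # positive watermark, then encoding
    if attack is not None:
        sent = sent + attack     # adversary tampers with the coded signal
    return M_inv @ sent - w      # decoding, then negative watermark

u = np.array([1.0, -2.0])        # nominal control signal
w = np.array([0.3, 0.7])         # watermark known only to sender and receiver
bias = np.array([0.5, 0.0])      # bias the attacker intends to inject

clean = transmit(u, w)
attacked = transmit(u, w, attack=bias)
```

Without an attack the receiver recovers `u` exactly, so control performance is unchanged. With an attack the injected bias arrives as `M_inv @ bias` rather than `bias`, so an attacker who does not know `M` cannot shape the residual as intended.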
Dynamic State Estimation of Distribution Network by Integrating High-degree Cubature Kalman Filter and Long Short-Term Memory Under False Data Injection Attack
XU Daxing, SU Lei, HAN Heqiao, WANG Hailun, ZHANG Heng, CHEN Bo
2026, 48(4): 1528-1538.   doi: 10.11999/JEIT250805
[Abstract](344) [FullText HTML](267) [PDF 3457KB](55)
Abstract:
  Objective  Dynamic state estimation of distribution networks is presented as a core technique for maintaining secure and stable operation in cyber-physical power systems. Its practical performance is limited by strong system nonlinearity, high-dimensional state characteristics, and the threat posed by False Data Injection Attack (FDIA). A method that integrates High-degree Cubature Kalman Filter (HCKF) with Long Short-Term Memory network (LSTM) is proposed. HCKF is applied to enhance estimation precision in nonlinear high-dimensional scenarios. The estimation outputs from HCKF and Weighted Least Squares (WLS) are combined for rapid FDIA identification using residual-based analysis. The LSTM model is then employed to reconstruct measurement data of compromised nodes and refine state estimation results. The approach is validated on the IEEE 33-bus distribution system, demonstrating reliable accuracy enhancement and effective attack resilience.  Methods   The strong nonlinearity of distribution networks limits the estimation accuracy of dynamic methods based on the Cubature Kalman Filter (CKF). A hybrid measurement state estimation model that combines data from Phasor Measurement Unit (PMU) and Supervisory Control And Data Acquisition (SCADA) is established. HCKF is applied to enhance estimation performance in nonlinear, high-dimensional scenarios by generating higher-order cubature points. Under FDIA, the estimation outputs from WLS and HCKF are jointly assessed, allowing rapid intrusion detection through residual evaluation and state consistency checking. Once an attack is identified, an LSTM model performs time-series prediction to reconstruct the measurement data of compromised nodes. The reconstructed data replace abnormal values, enabling correction of the final state estimation.  
Results and Discussions  Experiments on the IEEE 33-bus distribution system show that without FDIA, HCKF achieves higher estimation accuracy for voltage magnitude and phase angle than CKF. The Average voltage Relative Error (ARE) of voltage magnitude decreases by 57.9%, and the corresponding phase-angle error decreases by 28.9%, confirming the superiority of the method for strongly nonlinear and high-dimensional state estimation. Under FDIA, residual-based detection effectively identifies cyber attacks and avoids false alarms and missed detections. The prediction error of LSTM for the measurement data of compromised nodes and their associated branches remains on the order of 10⁻⁶, indicating high reconstruction fidelity. The combined HCKF-LSTM method maintains stable state tracking after intrusion, and its performance exceeds that of WLS and the adaptive Unscented Kalman Filter.  Conclusions  The dynamic state estimation method that integrates HCKF and LSTM enhances adaptability to strong nonlinearity and high-dimensional characteristics of distribution networks. Rapid and accurate FDIA identification is achieved through residual evaluation, and LSTM reconstructs the measurement data of compromised nodes with high reliability. The method maintains high estimation accuracy under normal operation and preserves stability and precision under cyber intrusion. It offers technical support for secure and stable operation of distribution networks in the presence of malicious attacks.
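Why the method pairs a WLS estimate with a dynamic filter estimate can be illustrated on a linear toy model. The sketch below assumes a linear measurement model z = Hx + e, which is a simplification of the nonlinear distribution-network equations, and uses a fixed vector `x_pred` as a stand-in for the HCKF prediction. It shows the well-known property that an injection of the form a = Hc leaves the WLS residual unchanged, whereas a consistency check against the predicted state exposes it. Names and thresholds are illustrative assumptions.

```python
import numpy as np

def wls_estimate(H, z, sigma):
    """Weighted least squares estimate and weighted residual sum of squares."""
    W = np.diag(1.0 / sigma**2)
    gain = H.T @ W @ H
    x_hat = np.linalg.solve(gain, H.T @ W @ z)
    r = z - H @ x_hat
    return x_hat, float(r @ W @ r)

def consistency_alarm(x_wls, x_pred, tol):
    """Flag an attack when the WLS state departs from the predicted state."""
    return bool(np.max(np.abs(x_wls - x_pred)) > tol)

H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])           # toy measurement matrix
sigma = np.full(4, 0.01)              # measurement standard deviations
x_true = np.array([1.0, 0.5])
x_pred = x_true.copy()                # stand-in for the dynamic filter prediction

z_clean = H @ x_true                  # noise-free for a deterministic demo
c = np.array([0.1, 0.0])
z_attacked = z_clean + H @ c          # stealthy false data injection

x_clean, J_clean = wls_estimate(H, z_clean, sigma)
x_att, J_att = wls_estimate(H, z_attacked, sigma)
alarm_clean = consistency_alarm(x_clean, x_pred, tol=0.05)
alarm_att = consistency_alarm(x_att, x_pred, tol=0.05)
```

In this toy case the residual statistic stays near zero under attack while the estimated state shifts by `c`, so only the comparison with the predicted state raises an alarm.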
Security Protection for Vessel Positioning in Smart Waterway Systems Based on Extended Kalman Filter-Based Dynamic Encoding
TANG Fengjian, YAN Xia, SUN Zeyi, ZHU Zhaowei, YANG Wen
2026, 48(4): 1539-1548.   doi: 10.11999/JEIT250846
[Abstract](382) [FullText HTML](162) [PDF 1446KB](56)
Abstract:
  Objective  With the rapid development of intelligent shipping systems, vessel positioning data face severe privacy leakage risks during wireless transmission. Traditional privacy-preserving methods, such as differential privacy and homomorphic encryption, suffer from data distortion, high computational overhead, or reliance on costly communication links, making it difficult to achieve both data integrity and efficient protection. This study addresses the characteristics of vessel stabilization systems and proposes a dynamic encoding scheme enhanced by time-varying perturbations. By integrating the Extended Kalman Filter (EKF) and introducing unstable temporal perturbations during encoding, the scheme uses receiver-side acknowledgments (ACK feedback) to achieve reference-time synchronization and independently generates synchronized perturbations through a shared random seed. Theoretical analysis and simulations show that the proposed method achieves nearly zero precision loss in state estimation for legitimate receivers, whereas decoding errors of eavesdroppers grow exponentially after a single packet loss, effectively countering both single- and multi-channel eavesdropping attacks. The shared-seed synchronization mechanism avoids complex key management and reduces communication and computational costs, making the scheme suitable for resource-constrained maritime wireless sensor networks.  Methods  The proposed dynamic encoding scheme introduces a time-varying perturbation term into the encoding process. The perturbation is governed by an unstable matrix to induce exponential error growth for eavesdroppers. The encoded signal is constructed from the difference between the current state estimate and a time-scaled reference state, combined with the perturbation term. A shared random seed between legitimate parties enables deterministic and synchronized generation of the perturbation sequence without online key exchange. 
At the legitimate receiver, the perturbation is canceled during decoding, enabling accurate state recovery. Local state estimation at each sensor node is performed using EKF, and the overall communication process is reinforced by acknowledgment-based synchronization to maintain consistency between the sender and receiver.  Results and Discussions  Simulations are conducted in a wireless sensor network with four sensors tracking vessel states, including position, velocity, and heading. The results indicate that legitimate receivers achieve nearly zero estimation error (Fig. 3), while eavesdroppers experience exponentially growing errors after a single packet loss. The error growth rate correlates with the instability of the perturbation matrix, confirming the theoretical divergence. In multi-channel scenarios, independent perturbation sequences per channel prevent cross-channel correlation attacks. The scheme maintains low communication and computational overhead, making it practical for maritime environments. Furthermore, the method demonstrates strong adaptability to packet loss and channel variations, fulfilling SOLAS requirements for data integrity and reliability.  Conclusions  A dynamic encoding scheme with time-varying perturbations is proposed for privacy-preserving vessel state estimation. By integrating EKF with an unstable perturbation mechanism, the method ensures high estimation precision for legitimate users and exponential error growth for eavesdroppers. 
The main contributions are as follows: (1) an encoding framework that achieves zero precision loss for legitimate receivers; (2) a lightweight synchronization mechanism based on shared random seeds, which removes complex key management; and (3) theoretical guarantees of exponential error divergence for eavesdroppers under single- or multi-channel attacks. The scheme is robust to packet loss and channel asynchrony, complies with SOLAS data integrity requirements, and is suitable for resource-limited maritime networks. Future work will extend the method to nonlinear vessel dynamics, adaptive perturbation optimization, and validation in real maritime communication environments.
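The shared-seed synchronization in contribution (2) can be illustrated with a scalar sketch. It covers only one aspect of the scheme: sender and receiver generate the same perturbation sequence from a common seed, so the receiver can cancel it exactly. The reference-state term, the ACK logic, and the packet-loss analysis from the paper are omitted, and the recursion p_k = a·p_{k-1} + ξ_k with a > 1 is an assumed scalar stand-in for the unstable perturbation matrix.

```python
import numpy as np

def perturbation_sequence(seed, steps, a=1.2):
    """Deterministic perturbation driven by a seeded generator.

    a > 1 makes the recursion unstable, so the magnitude tends to grow.
    """
    rng = np.random.default_rng(seed)
    p = np.zeros(steps)
    prev = 0.0
    for k in range(steps):
        prev = a * prev + rng.normal()
        p[k] = prev
    return p

def encode(estimates, seed):
    """Sender side: mask the state estimates with the perturbation."""
    return estimates + perturbation_sequence(seed, len(estimates))

def decode(encoded, seed):
    """Receiver side: regenerate the perturbation locally and cancel it."""
    return encoded - perturbation_sequence(seed, len(encoded))

estimates = np.linspace(0.0, 1.0, 20)      # toy sequence of state estimates
shared_seed = 2024                         # agreed offline by sender and receiver

encoded = encode(estimates, shared_seed)
recovered = decode(encoded, shared_seed)   # legitimate receiver
guessed = decode(encoded, shared_seed + 1) # party without the correct seed
```

The legitimate receiver recovers the estimates up to floating-point rounding, whereas decoding with a different seed leaves a residual perturbation. No key material is exchanged online in this sketch, which mirrors the lightweight synchronization the abstract describes.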
Overviews
Privacy-preserving Computation in Trustworthy Face Recognition: A Comprehensive Survey
YUAN Lin, WU Yanshang, ZHANG Liyuan, ZHANG Yushu, WANG Nannan, GAO Xinbo
2026, 48(4): 1549-1568.   doi: 10.11999/JEIT251063
[Abstract](781) [FullText HTML](396) [PDF 3747KB](115)
Abstract:
  Significance   With the widespread deployment of face recognition in Cyber-Physical Systems (CPS), including smart cities, intelligent transportation, and public safety infrastructures, privacy leakage has become a central concern for both academia and industry. Unlike many biometric modalities, face recognition operates in highly visible and loosely controlled environments, such as public spaces, consumer devices, and online platforms, where facial image acquisition is easy and pervasive. This exposure makes facial data especially vulnerable to unauthorized collection and misuse. Insufficient protection may lead to identity theft, unauthorized tracking, and deepfake generation, which threaten individual rights and reduce trust in digital systems. Therefore, facial data protection is not only a technical issue but also a significant societal and ethical challenge. This work integrates fragmented research across computer vision, cryptography, and privacy-preserving computation. It provides a unified perspective that guides the development of trustworthy face recognition ecosystems that balance usability, regulatory compliance, and public trust.  Contributions   This paper systematically reviews recent advances in privacy-preserving computation for face recognition, covering both theoretical foundations and practical implementations. The architecture and application pipeline of face recognition systems are first examined, and privacy risks at each stage are identified. At the data collection stage, unauthorized or covert capture of facial images introduces immediate risks of misuse. During model training and deployment, gradient leakage, membership inference, and overfitting may expose sensitive information about individuals contained in training data. At the inference stage, adversaries may reconstruct facial images, perform unauthorized recognition, or associate identities across datasets, which compromises anonymity. 
To address these threats, existing approaches are classified into four major privacy-preserving paradigms: data transformation, distributed collaboration, image generation, and adversarial perturbation. Within these paradigms, ten representative techniques are analyzed. Cryptographic computation, including homomorphic encryption and secure multiparty computation, enables recognition without revealing raw data but often introduces substantial computational overhead. Frequency-domain learning converts images into spectral representations to suppress identifiable details while retaining discriminative features. Federated learning decentralizes model training and reduces centralized data exposure, although it remains vulnerable to gradient inversion attacks. Image generation techniques, such as face synthesis and virtual identity modeling, reduce reliance on real facial data during training and evaluation. Differential privacy introduces calibrated noise to provide statistical privacy guarantees, whereas face anonymization obscures identifiable visual traits. Template protection and anti-reconstruction mechanisms defend stored facial features against reverse engineering. Adversarial privacy protection introduces imperceptible perturbations that interfere with machine recognition yet preserve human visual perception. Several representative studies in each category are further examined. Commonly used evaluation datasets are summarized. A comparative analysis is conducted across multiple dimensions, including face recognition performance, privacy protection effectiveness, and practical usability. This analysis systematically identifies the strengths and limitations of different types of methods.   Prospects   Several research directions are identified for future work. A primary challenge is to achieve a dynamic balance between privacy protection and system utility. 
Excessive protection may degrade recognition accuracy, whereas insufficient safeguards expose users to unacceptable risks. Adaptive mechanisms that adjust privacy levels according to context, task requirements, and user consent are therefore required. Another promising direction is the development of inherently privacy-aware recognition paradigms, such as feature representations that minimize identity leakage by design. The establishment of standardized evaluation frameworks for privacy risk and usability is also essential. Such frameworks would enable reproducible benchmarking and facilitate real-world deployment. The emergence of generative foundation models, including diffusion models and large multimodal models, further changes the research landscape. These models enable synthetic data generation and controllable identity representations. However, they also enable more advanced attacks, such as high-fidelity face reconstruction and identity impersonation. Addressing these dual effects requires interdisciplinary collaboration across computer vision, cryptography, law, and ethics, supported by appropriate regulation and continued methodological development.  Conclusions  This paper provides a comprehensive reference for researchers and practitioners engaged in trustworthy face recognition. By integrating advances from multiple disciplines, it promotes the development of effective facial privacy protection technologies and supports the secure, reliable, and ethically responsible deployment of face recognition in practical scenarios. The long-term goal is to establish face recognition as a trustworthy component of CPS that balances functionality, privacy protection, and societal trust.
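The "calibrated noise" idea behind differential privacy, mentioned among the surveyed techniques, can be illustrated with the textbook Laplace mechanism. The sketch below is not taken from the reviewed paper: the function names, the counting query, and the sensitivity of 1 are assumptions chosen for illustration only.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count: float, epsilon: float, rng: random.Random,
                  sensitivity: float = 1.0) -> float:
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    # A sensitivity of 1 is assumed here (adding or removing one person
    # changes a count by at most 1).
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(epsilon: float, trials: int, seed: int) -> float:
    # Average absolute deviation from the true count over many releases.
    rng = random.Random(seed)
    true_count = 100.0
    total = sum(abs(private_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials
```

A smaller epsilon gives a stronger privacy guarantee at the price of a noisier answer, which is the privacy-utility trade-off the abstract describes.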
A Review of Causal Feature Learning in Deep Learning Image Classification Models
WANG Xiaodong, JIANG Ling, LI Huihui, WANG Buhong
2026, 48(4): 1569-1590.   doi: 10.11999/JEIT250738
[Abstract](627) [FullText HTML](433) [PDF 2069KB](90)
Abstract:
  Significance   Deep learning is built on statistical correlations rather than causal relationships. Therefore, such models face major challenges in generalization, interpretability, and stability. Unlike human cognition, which mainly depends on causal discovery and use, current deep learning models remain at the bottom of the Pearl Causal Hierarchy (PCH). Therefore, integrating causal inference into deep learning has become a major research goal. As a core branch of deep learning, image classification models, represented by Convolutional Neural Networks (CNNs), show these limitations particularly clearly. Thus, causal inference is urgently needed to address this bottleneck. Among the available approaches for incorporating causal inference into these models, Causal Feature Learning (CFL), a framework that combines unsupervised machine learning with causal inference, shows clear advantages. Previous studies have confirmed that causal relationships are implicitly embedded in the pixel information of input images in image classification tasks. According to the Causal Coarsening Theorem (CCT), causal knowledge can be obtained from observed image data at low experimental cost. In classification tasks, the optimal solution is given by the Markov Boundary (MB) of the causal Bayesian network for the class variable. These theories strongly support efforts to connect deep image classification models with causal inference through CFL. Overall, the importance of CFL has become increasingly evident, and it is regarded as a promising breakthrough direction for next-generation models.  Progress   This paper provides a comprehensive review of CFL in deep learning image classification models from three core aspects: statistical causal inference theory, correlation analysis methods, and CFL implementations. 
First, the relevant definitions of CFL and its two mainstream statistical implementation frameworks are introduced, including causal discovery based on the Structural Causal Model (SCM) and causal effect estimation based on the Rubin Causal Model (RCM). Second, correlation analysis methods for deep learning image classification models, which are located at the threshold of the PCH, are systematically summarized from three perspectives: forward, backward, and horizontal. Third, with these auxiliary tools as a foundation, progress in CFL for image classification is classified into four main directions: Causal Feature Discovery (CFD), Causal Feature Effect Estimation (CFEE), Causal Representation Learning (CRL), and Spurious Correlation Removal (SCR). CFD is based on the SCM framework and aims to derive confounding-free causal graphs through explicit or implicit causal intervention analysis of image data or models. Under the RCM framework, CFEE uses observed image data to quantitatively evaluate the causal effects of features, while addressing the lack of counterfactual samples and confounding bias. CRL focuses on selecting or extracting high-dimensional features from image data to learn causal relationships and identify low-dimensional cross-image representations. SCR removes non-causal features from images and preserves causal features through different methods. In addition, available toolkits, top conference resources, and academic organizations are listed. This paper also discusses key technical issues and future research directions.  Conclusions  This review summarizes the technological development of CFL. Overall, substantial progress has been made, although challenges remain in different research directions. CFD has the advantage of following the basic logic of causal theory, with clear and simple structures that are easy to understand. However, CFD still faces immature processing methods for high-dimensional image data and limited generalization ability. 
CFEE can effectively distinguish causal features from confounding features. Its evaluation results are closer to real decision-making logic and show strong general applicability. Common limitations of CFEE include the requirement for observable confounders, strong dependence on causal assumptions, and limited computational efficiency. CRL offers greater flexibility in representation dimensions and can identify causal factors that drive classification while excluding non-causal factors. Its main unresolved problems include generalization bias, factor coupling, prior dependence, weak evaluation, and high cost. SCR is highly targeted but has poor generalization ability. From a broader perspective, CFL should not be restricted to specific methods. Any method that aims to construct causal relationships from microvariables, such as image pixels, to causal macrovariables, such as global semantics, can be considered part of this field. Therefore, CFL remains an open research topic.  Prospects   The goal of causal inference is to move beyond correlation and clarify the causal relationships among variables by designing more rigorous experiments or using more advanced statistical methods. This requires deeper assumptions about feature relationships and broader exploration of underlying causal chains. Both remain highly challenging and are likely to become major focuses of future research in this field. 
To address the technical challenges in CFL, this paper proposes the following future directions: (1) unifying construction paradigms and establishing standards for image-based SCMs to improve the standardization and consistency of causal discovery; (2) developing RCM methods supported by generative artificial intelligence to address sample scarcity in causal effect estimation; (3) reforming models to learn new image causal representations, thereby fundamentally addressing the inherent limitations of CNNs in CFL; and (4) integrating spurious correlation analysis with reinforcement learning, and using reinforcement learning to equip deep learning image classification models with meta-learning capability for causal exploration. It can be expected that, once these key issues in CFL are resolved, the accuracy, generalization, interpretability, and stability of deep learning image classification models will improve substantially.
A Review of Research on Voiceprint Fault Diagnosis of Transformers
GONG Wenjie, LIN Guosong, WEI Xiaoguang
2026, 48(4): 1591-1607.   doi: 10.11999/JEIT251076
[Abstract](589) [FullText HTML](624) [PDF 6935KB](63)
Abstract:
  Significance   Voiceprint fault diagnosis of transformers has become an active research area for ensuring safe and reliable operation of power systems. Traditional monitoring methods, such as dissolved gas analysis, infrared temperature measurement, and online partial discharge monitoring, exhibit limited real-time capability and rely heavily on expert experience. These limitations hinder effective detection of early-stage faults. Voiceprint fault diagnosis captures operational voiceprint signals from transformers and enables non-contact monitoring for early anomaly warning. This approach offers advantages in real-time performance, sensitivity, and fault coverage. This review systematically traces the technological evolution from traditional signal analysis to deep learning and compares the advantages, limitations, and application scenarios of different models across multiple dimensions. Key challenges are identified, including limited robustness to noise and imbalanced datasets. Potential research directions are proposed, including integration of physical mechanisms with data-driven methods and improvement of diagnostic transparency and interpretability. These analyses provide theoretical support and practical guidance for promoting the transition of voiceprint fault diagnosis from laboratory research to engineering applications.  Progress   Research on voiceprint fault diagnosis of transformers has progressed from traditional signal analysis to an intelligent recognition paradigm based on deep learning, reflecting a clear technological evolution. A bibliometric analysis of 188 papers from the CNKI and Web of Science databases shows that annual publications remained at 1~10 papers between 1997 and 2020, corresponding to an exploratory stage. Studies during this period focused mainly on fundamental voiceprint signal processing methods, including acoustic wave detection, wavelet transform, and Empirical Mode Decomposition (EMD). 
After 2020, Variational Mode Decomposition (VMD), Mel spectrum, and Mel Frequency Cepstral Coefficients (MFCC) were gradually applied to voiceprint feature extraction. Since 2021, publication output has increased rapidly and reached a historical peak in 2023. This growth was driven by advances in image and speech processing technologies. Early studies emphasized time-domain and frequency-domain analysis of voiceprint signals. Recent research increasingly converts voiceprint signals into two-dimensional time-frequency spectrogram representations. Model architectures have evolved from single-channel feature inputs with single-model outputs to complex frameworks with multi-channel feature extraction and multi-model fusion. Classical machine learning models, including Gaussian Mixture Model (GMM), Support Vector Machine (SVM), Random Forest (RF), and Back Propagation Neural Network (BPNN), form the foundation of voiceprint fault diagnosis but are limited in handling high-dimensional features. Deep learning models, such as Convolutional Neural Network (CNN), Residual Neural Network (ResNet), Recurrent Neural Network (RNN), and Transformer, demonstrate advantages in automatic feature extraction and complex pattern recognition, although they require substantial computational resources.  Conclusions  This review summarizes the technological development of transformer voiceprint fault diagnosis from machine learning to deep learning. Although deep learning methods achieve high recognition accuracy for complex voiceprint signals, five major challenges remain. These challenges include limited robustness to noise in non-stationary environments, severe data imbalance caused by scarce fault samples, the black-box nature of deep learning models, fragmented evaluation systems resulting from inconsistent data acquisition standards, and insufficient cross-modal fusion of multi-source data. 
Sensitivity to environmental noise limits diagnostic performance under varying operating conditions. Data imbalance reduces recognition accuracy for rare fault types. Limited interpretability restricts fault mechanism analysis and diagnostic credibility. Inconsistent sensor placement and sampling parameters lead to poor comparability across datasets. Single-modal voiceprint analysis restricts effective utilization of complementary information from other data sources. Addressing these challenges is essential for advancing voiceprint fault diagnosis from laboratory validation to field deployment.  Prospects   Future research should focus on five directions. First, noise-robust voiceprint feature extraction methods based on physical mechanisms should be developed to address non-stationary interference in complex operating environments. Second, the lack of real-world fault data should be alleviated by constructing electromagnetic field-structural mechanics-acoustic coupling models of transformers to generate high-fidelity voiceprint fault samples, while unsupervised clustering methods should be applied to improve annotation efficiency and quality. Third, explainable deep learning architectures for voiceprint fault diagnosis that incorporate physical mechanisms should be designed. Attention mechanisms combined with SHapley Additive exPlanations, Grad-CAM, and physical equations can support process-level and post hoc interpretation of diagnostic results. Fourth, industry-wide collaboration is required to establish standardized voiceprint data acquisition protocols, benchmark datasets, and unified evaluation systems. Fifth, cross-modal fusion models based on multi-channel and multi-feature analysis should be developed to enable integrated transformer fault diagnosis through comprehensive utilization of multi-source information.
Dataset
A Large-Scale Multimodal Instruction Dataset for Remote Sensing Agents
WANG Peijin, HU Huiyang, FENG Yingchao, DIAO Wenhui, SUN Xian
2026, 48(4): 1608-1622.   doi: 10.11999/JEIT250818
[Abstract](1148) [FullText HTML](376) [PDF 3337KB](197)
Abstract:
  Objective   The rapid advancement of Remote Sensing (RS) technology has reshaped Earth observation research, shifting the field from static image analysis to intelligent, goal-oriented cognitive decision-making. Modern RS systems are expected to perceive complex scenes, reason over heterogeneous information, decompose high-level objectives into executable subtasks, and make decisions under uncertainty. These requirements motivate the development of RS agents, which extend perception models to include reasoning, planning, and interaction functions. However, existing RS datasets remain task-centric and fragmented, as they are usually designed for single-purpose supervised learning such as object detection or land-cover classification. They seldom support multimodal reasoning, instruction following, or multi-step decision-making, all of which are essential for agentic workflows. Current RS vision-language datasets also have limited scale, constrained modality coverage, and simplified text annotations, with insufficient use of non-optical data such as Synthetic Aperture Radar (SAR) and infrared imagery. They further lack instruction-driven interactions that reflect real human-agent collaboration. This study constructs a large-scale multimodal image-text instruction dataset tailored for RS agents. The objective is to establish a unified data foundation that supports perception, reasoning, planning, and decision-making. By training models on structured instructions across diverse modalities and task categories, the dataset supports the development and evaluation of next-generation RS foundation models with agentic capability.  Methods   The dataset is built through a systematic and extensible framework that integrates multi-source RS imagery with instruction-oriented textual supervision. A unified input-output paradigm is defined to ensure compatibility across heterogeneous tasks and model architectures. 
This paradigm formalizes interactions between visual inputs and language instructions, allowing models to process image pixels, text descriptions, spatial coordinates, region references, and action-oriented outputs. A standardized instruction schema encodes task objectives, constraints, and expected responses in a consistent format. The construction process includes three stages. (1) Data collection and integration: multimodal RS imagery is aggregated from authoritative sources, covering optical, SAR, and infrared modalities with different spatial resolutions, scene types, and geographic distributions. (2) Instruction generation: a hybrid strategy combines rule-based templates with refinement by Large Language Models (LLMs). Template-based generation ensures task completeness and structural consistency, whereas LLM rewriting improves linguistic diversity and instruction complexity. (3) Task categorization and organization: the dataset is organized into nine core task categories and 21 sub-datasets that span low-level perception, mid-level reasoning, and high-level decision-making. A validation pipeline performs automated syntax and format checks, cross-modal consistency verification, and manual review of representative samples to ensure semantic alignment between images and instructions.  Results and Discussions   The dataset contains more than 2 million multimodal instruction samples, making it one of the largest and most comprehensive instruction resources in the RS domain. The inclusion of optical, SAR, and infrared imagery supports cross-modal learning and reasoning across heterogeneous sensing mechanisms. Compared with existing RS datasets, this dataset emphasizes instruction diversity, task compositionality, and agent-oriented interaction rather than isolated perception tasks. 
Baseline experiments conducted using state-of-the-art multimodal LLMs and RS foundation models show that the dataset supports evaluation across the full spectrum of agentic capabilities, from visual grounding and reasoning to high-level decision-making. The experiments also highlight challenges inherent to RS data, including extreme scale variation, dense object distributions, and long-range spatial dependencies. These challenges indicate important research directions for improving multimodal reasoning and planning in complex RS environments.  Conclusions   This work presents a large-scale multimodal image-text instruction dataset designed for RS agents. By organizing data across nine task categories and 21 sub-datasets, it provides a unified and extensible benchmark for agent-centric RS research. The contributions include: (1) a unified multimodal instruction paradigm for RS agents; (2) a 2-million-sample dataset covering optical, SAR, and infrared modalities; (3) empirical validation demonstrating support for end-to-end agentic workflows from perception to decision-making; and (4) a comprehensive evaluation benchmark based on baseline experiments. Future work will extend the dataset to temporal and video-based RS scenarios, integrate dynamic decision-making processes, and further improve reasoning and planning capability in real-world, time-varying environments.
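The standardized instruction schema and the automated format checks described in the Methods can be pictured with a small record type. This is a minimal sketch under stated assumptions: the field names, the `InstructionSample` class, and the `format_errors` helper are hypothetical and are not taken from the released dataset; only the three modalities come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List

# Modalities named in the abstract; the lowercase labels are an assumption.
ALLOWED_MODALITIES = {"optical", "sar", "infrared"}

@dataclass
class InstructionSample:
    # All field names are hypothetical placeholders for illustration.
    image_id: str
    modality: str
    task: str
    instruction: str
    response: str
    constraints: List[str] = field(default_factory=list)

def format_errors(sample: InstructionSample) -> List[str]:
    """Return a list of format problems; an empty list means the sample passes."""
    errors = []
    if sample.modality not in ALLOWED_MODALITIES:
        errors.append(f"unknown modality: {sample.modality}")
    for name in ("image_id", "task", "instruction", "response"):
        if not getattr(sample, name).strip():
            errors.append(f"empty field: {name}")
    return errors

example = InstructionSample(
    image_id="scene_0001",
    modality="sar",
    task="visual_grounding",
    instruction="Locate all ships in the harbor region.",
    response="[[120, 44, 180, 90]]",
)
```

A check of this kind covers only the syntax and format stage; the cross-modal consistency verification and manual review mentioned in the abstract would need separate tooling.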
Cryption and Network Information Security
Optimized Implementation of Low-Depth Lightweight S-Boxes
FENG Zixi, LIU Yupeng, DOU Guowei, LIU Chengle
2026, 48(4): 1623-1632.   doi: 10.11999/JEIT250690
[Abstract](216) [FullText HTML](121) [PDF 655KB](24)
Abstract:
  Objective  With the rapid development and widespread deployment of the Internet of Things (IoT), embedded systems, and mobile computing devices, secure communication and data protection on resource-constrained platforms have become a central focus in information security. These devices are typically characterized by severe limitations in computational capability, storage capacity, and energy consumption. These limitations make traditional cryptographic algorithms inefficient or even infeasible in such environments. In response, lightweight cryptographic algorithms have been proposed as an effective class of solutions. Their primary objective is to achieve security levels comparable to those of traditional algorithms while significantly reducing hardware and computational overhead through algorithmic simplification and structural optimization. These algorithms are designed to operate efficiently under tight resource constraints and are particularly suitable for applications such as sensor networks, smart cards, RFID systems, and wearable devices. From the perspective of hardware implementation, the design of lightweight cryptographic algorithms must consider multiple performance metrics, including throughput, latency, power efficiency, chip area, and circuit depth. Among these metrics, chip area and circuit depth are particularly critical because they directly affect production cost and computational speed. The Substitution-box (S-Box), as the core nonlinear component that provides confusion in most symmetric encryption schemes, plays a decisive role in determining both the security and implementation efficiency of the entire cipher. Therefore, efficient methods for realizing low-area and low-depth S-Boxes are of fundamental importance for the design of secure and practical lightweight cryptographic systems.  
Methods  In this work, a novel S-Box optimization algorithm based on Boolean satisfiability (SAT) solving is proposed to optimize two key hardware metrics simultaneously: logic area and circuit depth. A circuit model with depth k and width w is constructed for this purpose. Under a given area constraint, SAT-solving techniques are used to determine whether the circuit model can implement the target S-Box. By iteratively adjusting the circuit depth, width, and area parameters, an optimized S-Box implementation is obtained. The method is specifically developed for 4-bit S-Boxes, which are widely used in many lightweight block ciphers, and it provides implementations that are highly efficient in both structural compactness and computational depth. This dual optimization approach reduces hardware cost while maintaining low latency, making it particularly suitable for scenarios in which both performance and energy efficiency are critical. The proposed method begins by transforming the S-Box implementation problem into a formal SAT problem, which enables the use of powerful SAT solvers to exhaustively explore possible logic-level representations. In this transformation, a diverse set of logic gates, including 2-input, 3-input, and 4-input gates, is used to construct flexible logic networks. To enforce area and depth constraints, arithmetic operations such as binary addition and comparator logic are encoded into SAT-compatible Boolean constraints, which guide the solver toward low-area and low-depth solutions. To further accelerate the solving process and avoid redundant search paths, symmetry-breaking constraints are introduced. These constraints eliminate logically equivalent but structurally different representations, thereby significantly reducing the size of the solution space. The CaDiCaL SAT solver, known for its speed and efficiency in handling large-scale SAT problems, is used to compute optimized S-Box implementations that minimize both depth and area. 
The proposed approach not only generates efficient implementations but also provides a general modeling framework that can be extended to other logic-synthesis problems in cryptographic hardware design.  Results and Discussions  To validate the effectiveness of the proposed optimization method, a comprehensive set of experiments is conducted on 4-bit S-Boxes from several representative lightweight block ciphers, including Joltik, Piccolo, Rectangle, Skinny, Lblock, Lac, Midori, and Prøst. The results show that the method consistently produces high-quality implementations that are competitive with, or superior to, existing state-of-the-art results in both chip area and circuit depth. Specifically, for the S-Boxes of Joltik and Piccolo, as well as those used in Skinny and Rectangle, the generated implementations match the best known results in both metrics, indicating that the method can successfully reproduce optimal or near-optimal designs. In the cases of Lblock and Lac, although the logic area remains similar to previous results, the circuit depth is significantly reduced from 10 to 3, representing a substantial improvement in processing latency and suitability for real-time applications. For the inverse S-Box of the Rectangle cipher, the proposed implementation achieves the same circuit depth as previous designs but reduces the area from 24.33 Gate Equivalents (GE) to 17.66 GE, yielding a more compact and efficient realization. The optimization results for the Midori S-Box further confirm the effectiveness of the method: the depth is reduced from 4 to 3, and the area is reduced from 20.00 GE to 16.33 GE. For the Prøst cipher’s S-Box, two alternative implementations are presented to illustrate the trade-off between area and depth. The first achieves a depth of 4 with an area of 22.00 GE, matching the best known depth but at a higher area cost. The second increases the depth to 5 but reduces the area significantly to 13.00 GE. 
These results show that the method supports flexible optimization under different design constraints and also provides deeper insight into the complexity and trade-offs of S-Box implementation.  Conclusions  This paper presents a SAT-based method for jointly optimizing S-Box hardware implementations in terms of area and circuit depth. By modeling S-Box realization as a satisfiability problem and using advanced constraint encoding, multi-input logic gates, and symmetry-breaking techniques, the method effectively reduces hardware complexity while maintaining or improving depth performance. Extensive experiments on various 4-bit S-Boxes show that the proposed approach matches or outperforms existing results, particularly in reducing circuit depth and improving logic compactness. This makes it well suited to lightweight cryptographic systems operating under strict constraints on silicon area, speed, and energy consumption. Despite these advantages, the method still has limitations. Although it achieves optimal or near-optimal results for 4-bit S-Boxes, scalability to larger instances, such as 5-bit or 8-bit S-Boxes, remains challenging because of the exponential growth of the search space and solving time. As model complexity increases, solving becomes computationally expensive and may fail to converge in practice. Future work will focus on improving modeling efficiency and solver performance through refined constraint generation, stronger pruning strategies, and heuristic-guided search, with the goal of extending the method to more complex S-Boxes and other nonlinear components in lightweight and post-quantum cryptographic systems.
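Whatever encoding is used, any circuit returned by the SAT solver must reproduce the target S-Box truth table within the chosen depth bound. The sketch below shows only that final check, written as a plain netlist evaluator rather than a SAT model. The 4-bit mapping `toy_target`, the netlist, and the two-input gate library are assumptions made for illustration; they are not an S-Box or a circuit from the paper, and gate-equivalent area weights are omitted because they depend on the cell library.

```python
from typing import Callable, Dict, List, Tuple

# Two-input gate library (the paper also uses 3- and 4-input gates).
GATES: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

Gate = Tuple[str, int, int]  # (gate type, index of wire a, index of wire b)

def evaluate(netlist: List[Gate], outputs: List[int],
             n_inputs: int = 4) -> Tuple[List[int], int]:
    """Return the truth table of the circuit and its depth in gate levels."""
    table = []
    for x in range(1 << n_inputs):
        # Wires 0..n_inputs-1 carry the input bits, least significant first.
        wires = [(x >> i) & 1 for i in range(n_inputs)]
        for op, a, b in netlist:
            wires.append(GATES[op](wires[a], wires[b]))
        table.append(sum(wires[w] << i for i, w in enumerate(outputs)))
    # Depth: inputs sit at level 0, each gate is one level above its deepest input.
    level = [0] * n_inputs
    for _, a, b in netlist:
        level.append(1 + max(level[a], level[b]))
    return table, max(level[w] for w in outputs)

def toy_target(x: int) -> int:
    # Hypothetical invertible 4-bit mapping used as the specification.
    x0, x1, x2, x3 = [(x >> i) & 1 for i in range(4)]
    y0 = x0 ^ (x1 & x2)
    y3 = x3 ^ (y0 | x1)
    return y0 | (x1 << 1) | (x2 << 2) | (y3 << 3)

# Candidate circuit: wires 4..7 are the gate outputs, in order.
NETLIST: List[Gate] = [("AND", 1, 2), ("XOR", 0, 4), ("OR", 5, 1), ("XOR", 3, 6)]
OUTPUTS = [5, 1, 2, 7]  # wires driving y0, y1, y2, y3

table, depth = evaluate(NETLIST, OUTPUTS)
matches = table == [toy_target(x) for x in range(16)]
```

In a SAT formulation, the same truth-table constraint is imposed for all 16 inputs at once, and the gate types and wire connections become Boolean unknowns instead of fixed values.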
Battery Pack Multi-fault Diagnosis Algorithm Based on Dual-Perspective Spectral Attention Fusion
LIU Mingjun, GU Shenyu, YIN Jingde, ZHANG Yifan, DONG Zhekang, JI Xiaoyue
2026, 48(4): 1633-1645.   doi: 10.11999/JEIT251156
[Abstract](436) [FullText HTML](299) [PDF 11488KB](45)
Abstract:
  Objective  With the rapid growth of electric vehicles and their widespread deployment, battery pack faults have become more frequent, creating an urgent need for efficient fault diagnosis methods. Although deep learning-based approaches have achieved notable progress, existing studies remain limited in addressing multiple fault types, such as Internal Short Circuit (ISC), sensor noise, sensor drift, and State-Of-Charge (SOC) inconsistency, and in modeling the coupling relationships among these faults. To address these limitations, a multi-fault diagnosis algorithm for battery packs based on dual-perspective spectral attention is proposed. A dual-perspective tokenization module is designed to extract spatiotemporal features from battery data, whereas a spectral attention mechanism addresses non-stationary time-series characteristics and captures long-term dependencies, thereby improving diagnostic performance.   Methods  To improve spatiotemporal feature extraction and fault diagnosis performance, a dual-perspective spectral attention fusion algorithm for battery pack multi-fault diagnosis is proposed. The overall architecture consists of four core modules (Fig. 3): a dual-perspective tokenization module, a spectral attention module, a feature fusion module, and an output module. The dual-perspective tokenization module applies positional encoding to jointly model temporal and spatial dimensions, enabling comprehensive spatiotemporal feature representation. When combined with the spectral attention mechanism, the capability of the model to handle non-stationary characteristics is strengthened, leading to improved diagnostic performance. In addition, to address the lack of comprehensive publicly available datasets for battery pack fault diagnosis, a new dataset is constructed, covering ISC, sensor noise, sensor drift, and SOC inconsistency faults. 
The dataset includes three operating conditions, FUDS, UDDS, and US06, which alleviates data scarcity in this research field.  Results and Discussions  Experimental results indicate that the proposed method improves average precision, recall, F1 score, and accuracy by 10.98%, 12.64%, 13.84%, and 13.45%, respectively, compared with existing optimal fault diagnosis methods. Comparison experiments under different operating conditions (Table 6) support this conclusion. Conventional convolutional neural network methods perform well in local feature extraction; however, fixed-size convolution kernels are not well suited to time features with varying frequencies, which limits long-term temporal dependency modeling and global feature capture. Recurrent neural network-based methods show reduced computational efficiency when large-scale datasets are processed. Transformer-based models face constraints in spatial feature extraction and in representing temporal variations. By contrast, the proposed algorithm addresses these limitations through an integrated architectural design. Ablation experiments demonstrate the contribution of each module to overall performance (Table 7), and the complete framework improves average F1 score and accuracy by 8.86% and 9.31%, respectively, compared with ablation variants. Robustness analysis under simulated noise conditions (Table 8) shows that the proposed method achieves accuracy improvements ranging from 49.95% to 124.34% over baseline methods at noise levels from –2 dB to –8 dB, indicating strong noise resistance.  Conclusions  A multi-fault diagnosis algorithm for battery packs is presented that integrates dual-perspective tokenization and spectral attention to combine spatiotemporal and spectral information. The dual-perspective tokenization module performs tokenization and positional encoding along temporal and spatial axes, which improves spatiotemporal representation. 
The spectral attention mechanism strengthens modeling of non-stationary signals and long-term dependencies. Experiments under FUDS, UDDS, and US06 driving cycles show that the proposed method outperforms existing multi-fault diagnosis approaches, with average gains of 13.84% in F1 score and 13.45% in accuracy. Ablation studies confirm that both modules contribute substantially and that their combination enables effective handling of complex time-series features. Under high-noise conditions (–2 dB, –4 dB, –6 dB, and –8 dB), the method also shows improved robustness, with accuracy gains of 49.95%, 90.39%, 112.01%, and 124.34%, respectively, compared with baseline methods. Several limitations remain. First, the data are mainly derived from laboratory simulations, and further validation under real-world operating conditions is required. Second, the effect of fault severity on battery management system hierarchical decision making has not been fully addressed, and future work will focus on establishing a fault severity grading strategy. Third, physical interpretability requires further improvement, and subsequent studies will explore the integration of equivalent circuit models or electrochemical mechanism models to balance diagnostic accuracy and interpretability.
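The robustness analysis above injects noise at levels from –2 dB to –8 dB. As a minimal sketch, assuming these levels denote the signal-to-noise ratio (SNR) and that the noise is zero-mean white Gaussian (the abstract states neither), the following Python shows how noise of a given SNR could be scaled and added to a measured trace. The function names are hypothetical and are not taken from the paper.

```python
import math
import random


def noise_std_for_snr(signal, snr_db):
    """Standard deviation of white noise that yields the requested SNR (in dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return math.sqrt(noise_power)


def add_noise(signal, snr_db, rng=None):
    """Return a noisy copy of `signal`; a seeded generator keeps runs repeatable."""
    rng = rng or random.Random(0)
    std = noise_std_for_snr(signal, snr_db)
    return [x + rng.gauss(0.0, std) for x in signal]
```

At negative SNR values the noise power exceeds the signal power, which is why these settings are described as high-noise conditions.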
Research on the Architecture of Dual-field Reconfigurable Polynomial Multiplication Unit for Lattice-based Post-quantum Cryptography
CHEN Tao, ZHAO Wangpeng, BIE Mengni, LI Wei, NAN Longmei, DU Yiran, FU Qiuxing
2026, 48(4): 1646-1658.   doi: 10.11999/JEIT250929
[Abstract](308) [FullText HTML](189) [PDF 2471KB](22)
Abstract:
  Objective  Polynomial multiplication accounts for more than 80% of the computational time in lattice cryptography algorithms. The Number Theoretic Transform (NTT) and Fast Fourier Transform (FFT) reduce the computational complexity of polynomial multiplication from quadratic order, O(n^2), to quasi-linear order, O(n log n). However, mainstream lattice cryptography algorithms, including Kyber, Dilithium, and Falcon, differ considerably in their parameter sets and polynomial multiplication implementations. To support polynomial multiplication under multiple parameter configurations and improve resource utilization, a dual-field reconfigurable polynomial multiplication unit architecture is proposed.  Methods  First, the computational network for polynomial multiplication is extracted according to the parameter characteristics of Kyber, Dilithium, and Falcon. The internal dual-field multiplication operations are optimized at the algorithm level. Next, a dual-field reconfigurable polynomial multiplication unit architecture is designed for the polynomial multiplication network. The dual-field reconfigurable multiplication unit is further optimized to improve computational speed. Finally, a parallelism analysis is conducted to improve resource utilization of the computational architecture. The proposed architecture achieves the highest area efficiency when supporting 1-lane 64 bit, 2-lane 32 bit, or 4-lane 16 bit operations.  Results and Discussions  The architecture is experimentally validated on the Xilinx FPGA XC7V2000TFLG1925. It simultaneously supports one channel of complex-form floating-point operations or two channels of 17 to 32 bit internal NTT operations and four channels of 16 bit internal NTT operations. At an operating frequency of 169 MHz, the architecture reduces the area-time product by more than 50%.  
Conclusions  The proposed dual-field reconfigurable processing unit architecture provides advantages in scalability, area efficiency, and core unit performance. Its configurable bit-width design adapts more easily to traditional cryptographic processors and provides a practical approach for migrating conventional public-key cryptosystems to post-quantum cryptography.
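To illustrate why the NTT speeds up polynomial multiplication, the following Python sketch multiplies two polynomials by transforming, multiplying pointwise, and transforming back. It is a toy example under stated assumptions: the modulus q = 17, the length n = 8, and the root of unity omega = 2 are chosen only for readability, and the product is taken modulo x^n - 1 (cyclic). Kyber and Dilithium work modulo x^n + 1 with far larger parameters, and nothing here reflects the hardware architecture of the paper.

```python
def ntt(a, omega, q):
    """Recursive radix-2 NTT of `a` (length a power of two) modulo q."""
    n = len(a)
    if n == 1:
        return a[:]
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q              # butterfly: one modular multiplication
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out


def intt(values, omega, q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1."""
    n = len(values)
    inv_n = pow(n, -1, q)
    inv_omega = pow(omega, -1, q)
    return [x * inv_n % q for x in ntt(values, inv_omega, q)]


def ntt_cyclic_mul(a, b, omega, q):
    """Multiply a and b modulo (x^n - 1, q) in O(n log n) operations."""
    fa, fb = ntt(a, omega, q), ntt(b, omega, q)
    return intt([x * y % q for x, y in zip(fa, fb)], omega, q)


def schoolbook_cyclic_mul(a, b, q):
    """Reference O(n^2) multiplication used to check the NTT result."""
    n = len(a)
    out = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[(i + j) % n] = (out[(i + j) % n] + x * y) % q
    return out
```

For example, with `a = [1, 2, 3, 4, 0, 0, 0, 0]` and `b = [5, 6, 7, 0, 0, 0, 0, 0]`, both routines return the same coefficient list, but the NTT version needs on the order of n log n modular multiplications rather than n^2.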
Physical Layer Key Generation Method for Integrated Sensing and Communication Systems
LIU Kexin, HUANG Kaizhi, PEI Xinglong, JIN Liang, CHEN Yajun
2026, 48(4): 1659-1667.   doi: 10.11999/JEIT251034
[Abstract](440) [FullText HTML](251) [PDF 2417KB](68)
Abstract:
  Objective  Integrated Sensing And Communication (ISAC) has become a central technology in Sixth-Generation (6G) wireless networks, enabling simultaneous data transmission and environmental sensing. However, the characteristics of ISAC systems, including highly directional sensing signals and the risk of sensitive information leakage to malicious sensing targets, create specific security challenges. Physical layer security provides lightweight methods to enhance confidentiality. In secure transmission, approaches such as artificial noise injection and beamforming can partially improve secrecy, although they may reduce sensing accuracy or communication efficiency. Their effect also depends on the quality advantage of legitimate channels over eavesdropping channels. For Physical Layer Key Generation (PLKG), existing work has only demonstrated basic feasibility. Most current schemes adopt a radar-centric design, which limits compatibility with communication protocols and restricts key generation rates. This paper proposes a PLKG method tailored for ISAC systems. It aims to maximize the Sum Key Generation Rate (SKGR) under sensing accuracy constraints through a Twin Delayed Deep Deterministic Policy Gradient (TD3)-based joint communication and sensing beamforming algorithm, thereby improving the security performance of ISAC systems.  Methods  A Multiple-Input Multiple-Output (MIMO) ISAC system is considered, where a base station (Alice) equipped with multiple antennas communicates with single-antenna users (Bobs) and senses a malicious target (Eve). The system operates under a Time-Division Duplex (TDD) protocol to leverage channel reciprocity. A PLKG protocol designed for ISAC systems is developed, including channel estimation, joint communication and sensing beamforming, and key generation. The SKGR is derived in closed form, and sensing accuracy is evaluated using the Cramér-Rao Bound (CRB). 
To maximize the SKGR under CRB constraints, a non-convex optimization problem for the joint design of communication and sensing beamforming matrices is formulated. Given its NP-hardness, an algorithm based on TD3 is proposed. TD3 employs dual critic networks to reduce overestimation, delayed policy updates to enhance stability, and target policy smoothing to improve robustness. The state includes channel state information, the actions correspond to beamforming matrices, and the reward function combines SKGR, CRB, and power constraints.  Results and Discussions  Simulation results confirm the effectiveness of the proposed design. The TD3-based algorithm achieves a stable SKGR of 18.5 bit/channel use after training (Fig. 4), outperforming benchmark schemes such as Deep Deterministic Policy Gradient (DDPG), greedy search, and random algorithms. The SKGR increases monotonically with transmit power because of reduced noise interference (Fig. 5). Increasing the number of antennas also improves SKGR, although the gain diminishes as power per antenna decreases. The scheme maintains stable SKGR across different distances to the eavesdropper (Fig. 6), demonstrating the robustness of PLKG against eavesdropping attacks. The proposed algorithm manages the complex optimization problem effectively and adapts to dynamic system conditions, offering a practical approach for secure ISAC systems.  Conclusions  This paper presents a PLKG method for ISAC systems. The proposed protocol generates consistent keys between the base station and communication users. The SKGR maximization problem with sensing constraints is solved using a TD3-based algorithm that jointly optimizes communication and sensing beamforming matrices. Simulation results show that the method outperforms benchmark schemes, with significant gains in SKGR and adaptability to system conditions. The study establishes a basis for integrating PLKG into ISAC to strengthen security without reducing sensing performance. 
Future work will examine real-time implementation and scalability in large networks.
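Two of the TD3 mechanisms named above, dual critics and target policy smoothing, both act on the critic's training target. The sketch below shows that target for a single transition in plain Python. It is a generic illustration of TD3, not the paper's implementation: actions are scalars for brevity, the networks are stood in for by ordinary callables, and all parameter names and default values are assumptions. Delayed policy updates concern the training loop and are not shown.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0,
               rng=random):
    """Critic target for one transition under TD3."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    next_action = target_actor(next_state) + noise
    next_action = max(-act_limit, min(act_limit, next_action))
    # Dual critics: the smaller estimate counters overestimation bias.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

In a beamforming setting the action would be a vector of matrix entries rather than a scalar, with the same clipping applied elementwise.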
Wireless Communication and Internet of Things
UAV-assisted Mobile Edge Computing based on Hybrid Hierarchical DRL in the Internet of Vehicles
YANG Miaoyan, FANG Xuming
2026, 48(4): 1668-1677.   doi: 10.11999/JEIT250743
[Abstract](339) [FullText HTML](243) [PDF 2634KB](42)
Abstract:
  Objective  In the Internet of Vehicles (IoV), the use of Unmanned Aerial Vehicles (UAVs) to address increasing edge computing demand has become a key direction in 6G research. However, when Deep Reinforcement Learning (DRL) is applied to optimize system latency, the action space grows exponentially with the number of vehicles, which makes training difficult and convergence slow. This study proposes a two-layer hybrid solution for UAV-assisted Mobile Edge Computing (MEC) based on DRL, termed Hybrid Hierarchical Deep Reinforcement Learning (HHDRL).  Methods  The HHDRL algorithm adopts a two-layer architecture to decompose complex optimization tasks. The upper layer uses an agent based on Proximal Policy Optimization (PPO) and a multi-head actor network to manage user offloading and UAV control policies. The N heads determine offloading decisions for N users, including local processing or offloading to associated CAPs or the UAV. A separate UAV flight-control head selects discrete acceleration actions to satisfy practical control constraints. The lower layer applies a computationally efficient greedy algorithm to prioritize resources based on task characteristics. This hybrid hierarchical design reduces the computational cost associated with DRL-only resource allocation.  Results and Discussions  The performance of the HHDRL scheme was evaluated through numerical simulations using a Rician fading channel model, a UAV flight energy consumption model, and system parameters such as mission data sizes of 9 to 18 Mbit and mission complexities of 2 000 to 3 000 cycles/bit. Figure 3 shows that HHDRL converges faster than standard DRL, although the final reward is slightly lower. Figure 4 indicates that HHDRL maintains the user delay fairness of DRL. The evaluation in Figure 5 shows that the proposed method reduces system latency by approximately 71% to 91% compared with a random baseline and by 1% to 12% compared with the original DRL algorithm. 
Figure 6 shows training time results for different numbers of users; HHDRL consistently achieves shorter training times, and its training time grows more slowly as the number of users increases. This results from the reduced DRL output action space. When the PPO-based upper layer is replaced with other DRL algorithms, the scheme still outperforms the random baseline and achieves performance comparable to non-hierarchical DRL, demonstrating the generality of the architecture. Figure 8 shows that computational resources have the strongest effect on latency because computation typically dominates total task processing time. Figure 9 presents UAV trajectory optimization. Figure 9(a) shows realistic velocity changes under discrete acceleration control. Figure 9(b) shows that the UAV adjusts its position to track dynamic user distribution while maintaining stable flight.  Conclusions  This study presents an HHDRL algorithm that integrates DRL with a greedy strategy in a hierarchical framework to address the training challenges of UAV-assisted MEC in IoV scenarios. The simulations show that (1) the proposed method accelerates convergence and reduces training time compared with standard DRL; (2) its latency performance is comparable to DRL and significantly better than heuristic and random baselines; and (3) the framework effectively manages task offloading, resource allocation, and UAV trajectory optimization under practical constraints. Future work will extend the framework to multi-UAV collaboration and more complex environments.
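The statement that the action space grows exponentially with the number of vehicles, and that a multi-head actor avoids this, can be made concrete with a short count. The sketch below assumes three offloading choices per user (local, CAP, UAV) and a hypothetical number of discrete UAV acceleration actions; both values are illustrative and are not reported in the abstract.

```python
def joint_action_count(num_users, choices_per_user=3, uav_actions=5):
    """Size of a single flat action space that enumerates every combination."""
    return (choices_per_user ** num_users) * uav_actions


def multi_head_output_count(num_users, choices_per_user=3, uav_actions=5):
    """Number of actor outputs when each user and the UAV get their own head."""
    return num_users * choices_per_user + uav_actions


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, joint_action_count(n), multi_head_output_count(n))
```

With these assumed values, ten users give 295 245 joint actions but only 35 actor outputs, so the factored policy scales linearly in the number of users.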
Low-Complexity Joint Estimation Algorithm for Carrier Frequency Offset and Sampling Frequency Offset in 5G-NTN Low Earth Orbit Satellite Communications
GONG Xianfeng, LI Ying, LIU Mingyang, ZHAI Shenghua
2026, 48(4): 1678-1687.   doi: 10.11999/JEIT251086
[Abstract](474) [FullText HTML](262) [PDF 4958KB](57)
Abstract:
  Objective   The Doppler effect is a major impairment in Low Earth Orbit (LEO) satellite communications within 5G Non-Terrestrial Networks (5G-NTN). It introduces Carrier Frequency Offset (CFO), Sampling Frequency Offset (SFO), and Inter-Subcarrier Frequency Offset (ISFO) across subcarriers. Although existing estimation algorithms focus mainly on CFO and SFO, the effect of ISFO is insufficiently addressed. ISFO becomes highly detrimental to receiver performance when Orthogonal Frequency-Division Multiplexing (OFDM) systems use a large number of subcarriers and high-order modulation. Moreover, under joint CFO and SFO conditions, conventional Maximum Likelihood Estimation (MLE) methods often require one- or two-dimensional grid searches. This results in high computational cost. To reduce this cost, two joint estimation algorithms for CFO and SFO are proposed.  Methods   The influence of non-ideal factors at the transmitter, receiver, and channel, such as local oscillator offset, SFO in Digital-to-Analog Converters (DACs) and Analog-to-Digital Converters (ADCs), and the Doppler effect, is analyzed. A mathematical model for the received OFDM signal is developed, and the mechanism through which SFO and ISFO distort the phase of frequency-domain subcarriers is derived. Leveraging the pilot structure of 5G-NTN, two joint CFO and SFO estimation algorithms are proposed. (1) Algorithm 1 uses the sequence correlation between the received frequency-domain Demodulation Reference Signal (DMRS) vectors. After phase pre-compensation is applied, the normalized cross-correlation vector is computed. An objective function is constructed from this vector, and its unimodal behavior in the main lobe is used to estimate the parameters through a bisection search. (2) Algorithm 2 treats the estimation parameter as analogous to a CFO in single-carrier systems and adopts an L&R-based autocorrelation method to derive approximate closed-form expressions.  
Results and Discussions   A computational complexity analysis compares the proposed algorithms with one-dimensional (1D-ML) and two-dimensional (2D-ML) grid-search MLE methods. Numerical results show that Algorithm 1 reduces complexity substantially. The number of complex multiplications, which represent the main computational cost, is 4% of that of the 2D-ML method, 8% of that of Algorithm 2, and 44% of that of the 1D-ML method. Although Algorithm 2 is more computationally demanding, it yields a closed-form estimation expression. The performance of each algorithm is evaluated through the Mean Square Error (MSE) of the estimated parameters. Simulations show that for a subcarrier number of 3072, the 1D-ML algorithm performs slightly better than the others at Signal-to-Noise Ratios (SNRs) below 5 dB. However, because robust modulation schemes such as BPSK and QPSK typically used at low SNRs tolerate larger offsets, the medium-to-high SNR range is of greater practical relevance. In this range, all four algorithms demonstrate comparable estimation performance.  Conclusions  This study addresses the effect of Doppler in 5G-NTN LEO satellite communications by analyzing the mechanism and influence of ISFO and by proposing two joint estimation algorithms for CFO and SFO. First, a mathematical model of the received signal is established considering non-ideal factors such as CFO, SFO, and ISFO. The combined effect of SFO and ISFO on OFDM signals is derived to be equivalent to their linear superposition, which expands the range of the equivalent SFO. Second, the objective function is defined using the cross-correlation vector of two DMRS sequences. By using its unimodal behavior within the main lobe, a bisection search enables fast convergence. Third, the parameter determined by SFO and ISFO is treated as analogous to the CFO in single-carrier systems, allowing an approximate closed-form estimation solution to be obtained through the L&R method. 
Finally, complexity analysis and performance simulations show that the proposed algorithms provide significant computational savings and strong estimation performance. These results can support the development of 5G-NTN LEO satellite payloads and terminal products.
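Algorithm 2 borrows the L&R (Luise and Reggiannini) autocorrelation estimator from single-carrier CFO estimation. The sketch below implements the classic single-carrier form of that estimator in plain Python so the closed-form step is visible. It is not the paper's frequency-domain adaptation: the input is assumed to be a modulation-free complex tone, and the function and parameter names are illustrative.

```python
import cmath
import math


def lr_frequency_estimate(z, max_lag, sample_period=1.0):
    """Classic L&R estimate of the frequency of a complex tone z[k].

    Unambiguous for |f| < 1 / ((max_lag + 1) * sample_period).
    """
    n = len(z)
    acc = 0j
    for m in range(1, max_lag + 1):
        # Sample autocorrelation at lag m.
        r_m = sum(z[k] * z[k - m].conjugate() for k in range(m, n)) / (n - m)
        acc += r_m
    # The phase of the summed autocorrelations is pi * f * (max_lag + 1) * T.
    return cmath.phase(acc) / (math.pi * (max_lag + 1) * sample_period)
```

Because the estimate is a single phase computation over a sum of correlations, no grid search is needed, which is the property the abstract attributes to the closed-form solution.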
Secrecy Rate Maximization Algorithm for IRS Assisted UAV-RSMA Systems
WANG Zhengqiang, KONG Weidong, WAN Xiaoyu, FAN Zifu, DUO Bin
2026, 48(4): 1688-1697.   doi: 10.11999/JEIT250452
[Abstract](408) [FullText HTML](195) [PDF 1839KB](56)
Abstract:
  Objective  Under the stringent requirements of Sixth-Generation (6G) mobile communication networks for spectral efficiency, energy efficiency, low latency, and wide coverage, Unmanned Aerial Vehicle (UAV) communication has emerged as a key solution for 6G and beyond, leveraging its Line-of-Sight propagation advantages and flexible deployment capabilities. Functioning as aerial base stations, UAVs enhance network performance by improving spectral efficiency and connection reliability, and they are particularly valuable in critical scenarios such as emergency communications, remote area coverage, and maritime operations. However, UAV communication systems face two challenges in high-mobility environments: severe multi-user interference in dense access scenarios, which substantially degrades system performance, and physical-layer security threats arising from the broadcast nature and spatial openness of wireless channels, which allow malicious interception of transmitted signals. Rate-Splitting Multiple Access (RSMA) mitigates these challenges by decomposing user messages into common and private streams, thereby providing a flexible interference management mechanism that balances decoding complexity with spectral efficiency. This makes RSMA especially suitable for high-density user access scenarios. In parallel, Intelligent Reflecting Surfaces (IRS) have emerged as a promising technology to dynamically reconfigure wireless propagation through programmable electromagnetic unit arrays. IRS improves the quality of legitimate links while reducing the capacity of eavesdropping links, thereby enhancing physical-layer security in UAV communications. Existing research has centered on conventional multiple access schemes, and the potential of RSMA in IRS-assisted UAV communication systems remains largely unexplored. 
Against this background, this paper investigates secure transmission strategies in IRS-assisted UAV-RSMA systems.  Methods  This paper investigates the effect of eavesdroppers on the security performance of UAV communication systems and proposes an IRS-assisted RSMA-based UAV communication model. The system comprises a multi-antenna UAV base station, an IRS mounted on a building, multiple single-antenna legitimate users, and multiple single-antenna eavesdroppers. The optimization problem is formulated to maximize the system secrecy rate by jointly optimizing precoding vectors, common secrecy rate allocation, IRS phase shifts, and UAV positioning. The problem is highly non-convex due to the strong coupling among these variables, rendering direct solutions intractable. To overcome this challenge, a two-layer optimization framework is developed. In the inner layer, with UAV position fixed, an alternating optimization strategy divides the problem into two subproblems: (1) joint optimization of precoding vectors and common secrecy rate allocation and (2) optimization of IRS phase shifts. Non-convex constraints are transformed into convex forms using techniques such as Successive Convex Approximation (SCA), relaxation variables, first-order Taylor expansion, and Semidefinite Relaxation (SDR). In the outer layer, the Particle Swarm Optimization (PSO) algorithm determines the UAV deployment position based on the optimized inner-layer variables.  Results and Discussions  Simulation results show that the proposed algorithm outperforms RSMA without IRS, NOMA with IRS, and NOMA without IRS in terms of secrecy rate. Fig. 2 illustrates that the secrecy rate increases with the number of iterations and converges under different UAV maximum transmit power levels and antenna configurations. Fig. 3 demonstrates that increasing UAV transmit power significantly enhances the secrecy rate for both the proposed and benchmark schemes. 
This improvement arises because higher transmit power strengthens the signal received by legitimate users, increasing their achievable rates and enhancing system secrecy performance. Fig. 4 indicates that the secrecy rate grows with the number of UAV antennas. This improvement is due to expanded signal coverage and greater spatial degrees of freedom, which amplify effective signal strength in legitimate user channels. Fig. 5 shows that both the proposed scheme and NOMA with IRS achieve higher secrecy rate as the number of IRS reflecting elements increases. The additional elements provide greater spatial degrees of freedom, improving channel gains for legitimate users and strengthening resistance to eavesdropping. In contrast, benchmark schemes operating without IRS assistance exhibit no performance improvement and maintain constant secrecy rate. This result highlights the critical role of the IRS in enabling secure communications. Finally, Fig. 6 demonstrates the optimal UAV position when $P_{\max} = 30\text{ dBm}$. Deploying the UAV near the center of legitimate users and adjacent to the IRS minimizes the average distance to users, thereby reducing path loss and fully exploiting IRS passive beamforming. This placement strengthens legitimate signals while suppressing the eavesdropping link, leading to enhanced secrecy performance.  Conclusions  This study addresses secure communication scenarios with multiple eavesdroppers by proposing an IRS-assisted secure resource allocation algorithm for UAV-enabled RSMA systems. An optimization problem is formulated to maximize the system secrecy rate under multiple constraints, including UAV transmit power, by jointly optimizing precoding vectors, common rate allocation, IRS configurations, and UAV positioning. Due to the non-convex nature of the problem, a hierarchical optimization framework is developed to decompose it into two subproblems. 
These are effectively solved using techniques such as SCA, SDR, Gaussian randomization, and PSO. Simulation results confirm that the proposed algorithm achieves substantial secrecy rate gains over three benchmark schemes, thereby validating its effectiveness.
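The quantity being maximized, the secrecy rate, is commonly defined as the legitimate user's achievable rate minus the eavesdropper's rate, floored at zero. The sketch below computes that generic per-link quantity in Python, taking the strongest eavesdropper as the worst case. This is an assumption made for illustration: the paper's RSMA formulation with common and private streams is more involved, and the function name and inputs are hypothetical.

```python
import math


def secrecy_rate(sinr_legit, sinr_eavesdroppers):
    """Worst-case secrecy rate in bit/s/Hz for linear-scale SINR values."""
    rate_legit = math.log2(1.0 + sinr_legit)
    # The most capable eavesdropper determines how much information leaks.
    rate_eve = max(math.log2(1.0 + s) for s in sinr_eavesdroppers)
    return max(0.0, rate_legit - rate_eve)
```

For instance, `secrecy_rate(15.0, [3.0, 1.0])` gives 2 bit/s/Hz, and the value drops to zero once any eavesdropper's SINR matches or exceeds that of the legitimate user. This is why raising legitimate channel gain and degrading the eavesdropping link both improve the objective.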
Radio Map Enabled Path Planning for Multiple Cellular-Connected Unmanned Aerial Vehicles
ZHOU Decheng, WANG Wei, SHAO Xiang, CHEN Mei, XIAO Jianghao
2026, 48(4): 1698-1707.   doi: 10.11999/JEIT250821
[Abstract](354) [FullText HTML](182) [PDF 2063KB](39)
Abstract:
  Objective  In collaborative operation scenarios of cellular-connected Unmanned Aerial Vehicles (UAVs), conflict avoidance strategies often cause unbalanced service quality. Traditional schemes focus on reducing total task completion time but do not ensure service fairness. To address this issue, a radio map-assisted cooperative path planning scheme is proposed. The objective is to minimize the maximum weighted sum of task completion time and communication disconnection time across all UAVs to improve service fairness in multi-UAV scenarios.  Methods  A Signal-to-Interference-plus-Noise Ratio (SINR) map is constructed to assess communication quality. The two-dimensional airspace is discretized into grids, and link gain maps are generated through ray tracing and Axis-Aligned Bounding Box detection to determine Line-of-Sight (LoS) or Non-Line-of-Sight (NLoS) conditions. The SINR map is produced by selecting, for each grid, the base station with the highest expected SINR. To solve the optimization problem, an Improved Conflict-Based Search (ICBS) algorithm with a hierarchical structure is developed. At the high-level stage, proximity conflicts are managed to maintain safety distances, and the cost function is reformulated to emphasize fairness by minimizing the maximum weighted time. The low-level stage applies a bidirectional A* algorithm for single-UAV path planning, using parallel search to improve efficiency while meeting the constraints set by the high-level stage.  Results and Discussions  The proposed scheme is evaluated through simulations across different scenarios. Building heights and positions are shown, where base station locations are marked by red stars and building heights are represented with color gradients from light to dark to indicate increasing height (Fig. 2). The wireless propagation characteristics between UAVs and ground base stations are demonstrated by the SINR map at an altitude of 60 m (Fig. 
3), which shows significant SINR degradation in areas affected by building blockage and co-channel interference, resulting in communication blind zones. Trajectory planning results for four UAVs at an altitude of 60 m with an SINR threshold of 2 dB show that all UAVs avoid signal blind zones and complete tasks without collision risks under the proposed scheme (Fig. 4). The trade-off between task completion time and disconnection time is controlled by the weight coefficient (Fig. 5). The maximum weighted time increases monotonically as the weight coefficient increases, whereas the maximum disconnection time decreases. The bidirectional A* algorithm achieves higher computational efficiency than Dijkstra’s and traditional A* algorithms while maintaining optimal solution quality (Table 1). All three algorithms yield identical weighted times, confirming the optimality of the bidirectional A* approach, and its runtime is reduced significantly due to parallel search. Compared with three benchmark schemes, the proposed scheme achieves the lowest maximum weighted time for different SINR thresholds (Fig. 6). Performance analysis at different UAV altitudes shows that the proposed scheme maintains stable maximum weighted time below 75 m, while sharp increases appear above 75 m due to intensified interference from non-serving base stations (Fig. 7). The scalability analysis further shows clear improvements over benchmark schemes, especially when conflicts occur more frequently (Fig. 8).  Conclusions  To address fairness in cellular-connected multi-UAV systems, a radio map-assisted path planning scheme is proposed to minimize the maximum weighted time. Based on a discretized SINR map, an ICBS algorithm is developed. At the high-level stage, proximity conflicts and a reformulated cost function ensure safety and fairness, and at the low-level stage, a bidirectional A* algorithm increases search efficiency. 
Simulation results show that the proposed scheme lowers the maximum weighted time compared with benchmark schemes and improves fairness and overall multi-UAV collaboration performance.
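The low-level, single-UAV search over a discretized SINR map can be sketched as follows. For brevity this uses ordinary unidirectional A* rather than the bidirectional variant of the paper, and it omits the high-level conflict constraints. The cost model is an assumption: each grid move costs one time unit, and entering a cell whose SINR is below the threshold adds `weight` as a disconnection penalty. All names are illustrative.

```python
import heapq


def plan_path(sinr_map, start, goal, sinr_threshold, weight):
    """A* on a 4-connected grid; returns (weighted_time, path)."""
    rows, cols = len(sinr_map), len(sinr_map[0])

    def heuristic(cell):
        # Manhattan distance never overestimates, since every step costs >= 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0.0}
    parent = {}
    frontier = [(heuristic(start), 0.0, start)]
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return cost, path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < rows and 0 <= c < cols:
                disconnected = sinr_map[r][c] < sinr_threshold
                new_cost = cost + 1.0 + (weight if disconnected else 0.0)
                if new_cost < best_cost.get((r, c), float("inf")):
                    best_cost[(r, c)] = new_cost
                    parent[(r, c)] = cell
                    heapq.heappush(
                        frontier, (new_cost + heuristic((r, c)), new_cost, (r, c)))
    return float("inf"), []
```

A larger `weight` makes the planner detour around blind zones at the expense of flight time, while a smaller one lets it cut through them, which mirrors the trade-off controlled by the weight coefficient in the abstract.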
Entropy-Enhanced Quantum Ripple Synergy Planning Method for Emergency Path of Unmanned Aerial Vehicles Driven by Survival Probability
WANG Enliang, ZHANG Zhen, SUN Zhixin
2026, 48(4): 1708-1718.   doi: 10.11999/JEIT250694
[Abstract](350) [FullText HTML](189) [PDF 4452KB](18)
Abstract:
  Objective  Natural disaster emergency rescue places stringent requirements on the timeliness and safety of Unmanned Aerial Vehicle (UAV) path planning. Conventional optimization objectives, such as minimizing total distance, often fail to reflect the critical time-sensitive priority of maximizing the survival probability of trapped victims. Moreover, existing algorithms struggle with the complex constraints of disaster environments, including no-fly zones, caution zones, and dynamic obstacles. To address these challenges, this paper proposes an Entropy-Enhanced Quantum Ripple Synergy Algorithm (E2QRSA). The primary goals are to establish a survival probability maximization model that incorporates time decay characteristics and to design a robust optimization algorithm capable of efficiently handling complex spatiotemporal constraints in dynamic disaster scenarios.  Methods  E2QRSA enhances the Quantum Ripple Optimization framework through four key innovations: (1) information entropy-based quantum state initialization, which guides population generation toward high-entropy regions; (2) multi-ripple collaborative interference, which promotes beneficial feature propagation through constructive superposition; (3) entropy-driven parameter control, which dynamically adjusts ripple propagation according to search entropy rates; and (4) quantum entanglement, which enables information sharing among elite individuals. The model employs a survival probability objective function that accounts for time-sensitive decay, base conditions, and mission success probability, subject to constraints including no-fly zones, caution zones, and dynamic obstacles.  Results and Discussions  Simulation experiments are conducted in medium- and large-scale typhoon disaster scenarios. The proposed E2QRSA achieves the highest survival probabilities of 0.847 and 0.762, respectively (Table 1), exceeding comparison algorithms such as SEWOA and PSO by 4.2% to 16.0%. 
Although the paths generated by E2QRSA are not the shortest, they are the most effective in maximizing survival chances. The ablation study (Table 3) confirms the contribution of each component, with the removal of multi-ripple interference causing the largest performance decrease (9.97%). The dynamic coupling between search entropy and ripple parameters (Fig. 2) is validated, demonstrating the effectiveness of the adaptive control mechanism. The entanglement effect (Fig. 4) is shown to maintain population diversity. In terms of constraint satisfaction, E2QRSA-planned paths consume only 85.2% of the total available energy (Table 5), ensuring a safe return, and all static and dynamic obstacles are successfully avoided, as visually verified in the 3D path plots (Figs. 6 and 7).  Conclusions  E2QRSA effectively addresses the challenge of UAV path planning for disaster relief by integrating adaptive entropy control with quantum-inspired mechanisms. The survival probability objective captures the essential requirements of disaster scenarios more accurately than conventional distance minimization. Experimental validation demonstrates that E2QRSA achieves superior solution quality and faster convergence, providing a robust technical basis for strengthening emergency response capabilities.
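The abstract does not define its search entropy, so the following Python sketch only shows the standard Shannon entropy that such a measure is commonly built on. It assumes the population has already been binned (for example, by fitness value or by grid cell), which is a hypothetical preprocessing step and not a detail taken from the paper.

```python
import math


def shannon_entropy(bin_counts):
    """Shannon entropy in bits of a histogram given as non-negative counts."""
    total = sum(bin_counts)
    entropy = 0.0
    for count in bin_counts:
        if count > 0:  # empty bins contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

A population spread evenly over four bins has an entropy of 2 bits, while one that has collapsed into a single bin has an entropy of 0. An entropy-driven controller could use such a drop as a signal to widen the search.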
Radar, Sonar, Navigation and Array Signal Processing
Peak-to-Average Power Ratio Reduction Theory and Method for Orthogonal Time Frequency Space Systems via Nonzero-Unitary Precoding
ZENG Junlong, JIANG Zhanjun, LIU Haoxiang, ZHANG Huawei, LI Cuiran
2026, 48(4): 1719-1728.   doi: 10.11999/JEIT250888
[Abstract](221) [FullText HTML](120) [PDF 2431KB](27)
Abstract:
  Objective  Orthogonal Time Frequency Space (OTFS) and its variants provide robust performance in high-mobility doubly selective channels. However, their inherently high Peak-to-Average Power Ratio (PAPR) limits power amplifier efficiency and practical implementation. Recent observations have revealed a mismatch between theory and practice. Some OTFS variants obtained by changing the orthogonal basis, such as DCT-based designs, reduce PAPR while maintaining an OTFS-like Bit Error Rate (BER). However, the prevailing explanation mainly attributes reliability to constant-modulus unitary transforms and does not directly account for such non-constant-modulus cases. Therefore, it remains unclear which unitary bases preserve the channel-hardening behavior that stabilizes effective gains and protects BER, and which unitary choices may degrade performance even though they are mathematically unitary. This paper aims to close this gap by establishing a verifiable and more general condition for BER-robust unitary precoding, and by developing a waveform and precoder design approach that suppresses PAPR without sacrificing reliability in OTFS and typical OTFS-like variants.  Methods  A waveform design framework based on nonzero-unitary precoding is established. An upper bound on effective channel-gain fluctuation is derived. It is shown that when the precoder satisfies a nonzero and near-uniform energy-spreading condition, the variance of the effective channel coefficients decreases as the time-frequency grid grows, indicating the emergence of a channel-hardening effect. On the basis of this result, waveform design is formulated as a peak-power minimization problem over the unitary precoder. The objective is to reduce the maximum instantaneous power while preserving the unitary structure required by the modulation framework. A CVX-based solver is used to provide a performance-reference benchmark for the formulated objective. 
For engineering implementation, an efficient algorithm is developed using the Alternating Direction Method of Multipliers (ADMM). In this method, the original nonconvex design is decomposed into low-cost sub-updates together with a unitary projection step, which enables scalable computation.  Results and Discussions  Simulation results under representative doubly selective channels with high terminal speeds show that the proposed precoder design achieves noticeable PAPR suppression while maintaining the BER close to that of conventional constant-modulus unitary precoding. In addition, the CVX-based benchmark reveals the attainable performance region, and the ADMM-based implementation approaches this reference with a favorable PAPR-BER trade-off. The computational advantage is also validated. Compared with general-purpose convex optimization, the ADMM solver reduces the overall runtime and complexity by roughly three orders of magnitude for typical OTFS parameter settings, which supports real-time or near-real-time deployment. The observed performance trends are consistent with the theoretical insight that near-uniform energy spreading stabilizes effective channel gains and prevents spiky basis vectors from degrading robustness. Furthermore, the framework is applicable to OTFS variants because basis selection and waveform shaping can be interpreted equivalently as unitary-precoder design within the same optimization architecture.  Conclusions  A theoretical and algorithmic solution for PAPR suppression in OTFS systems is presented through nonzero-unitary precoding. Channel hardening is established under a nonzero and near-uniform energy-spreading condition, which provides a principled justification for seeking low-PAPR solutions beyond constant-modulus transforms. A peak-power minimization formulation is adopted to translate this insight into waveform optimization, and a CVX benchmark is provided to quantify the achievable performance reference. 
A low-complexity ADMM algorithm is then constructed to enable scalable computation through simple sub-updates and unitary projection, while keeping BER performance essentially unchanged. The proposed approach provides a unified low-PAPR waveform design paradigm for OTFS and its variants, with theoretical generality, computational efficiency, and controllable performance under high-mobility doubly selective channels.
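The PAPR quantity that the design above minimizes can be illustrated with a minimal sketch. It uses the standard definition (peak instantaneous power divided by mean power). The abstract does not give the signal model, so the function name `papr_db` and the two sample blocks are assumptions chosen only for illustration.

```python
import cmath
import math

def papr_db(samples):
    """Peak-to-average power ratio of a block of complex samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    mean_power = sum(powers) / len(powers)
    return 10.0 * math.log10(max(powers) / mean_power)

n = 16
# Constant-modulus block: every sample has unit magnitude (illustrative chirp phases).
flat = [cmath.exp(2j * math.pi * k * k / n) for k in range(n)]
# Spiky block: all energy concentrated in a single sample.
spiky = [1.0 + 0j] + [0j] * (n - 1)

print(papr_db(flat))   # close to 0 dB
print(papr_db(spiky))  # 10*log10(16), about 12 dB
```

A block whose energy is spread evenly has a PAPR near 0 dB, whereas a block with one dominant sample reaches 10·log10(N) dB, which is the kind of peak a power amplifier handles poorly.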
Research on Recognition Method in Mixture Scenarios of Ships and Floating Targets
DING Hao, LI Ao, CAO Zheng, LIU Ningbo, WANG Guoqing, SUN Dianxing
2026, 48(4): 1729-1739.   doi: 10.11999/JEIT251119
[Abstract](311) [FullText HTML](208) [PDF 3787KB](34)
Abstract:
  Objective  In radar maritime target detection scenarios, when two or more targets are located within the same range cell, mixture echoes are generated, such as echoes containing both ships and floating targets. Existing target recognition methods exhibit notable limitations in these scenarios because they typically focus on the Doppler channel with the strongest energy in the time-frequency domain. To address this issue, a target recognition method that integrates mode reconstruction and time-frequency features is proposed. The aim is to distinguish individual targets without prior knowledge of whether the received echoes contain mixture targets, thereby avoiding reliance on high range resolution or multipolarization information.  Methods  The core idea is to introduce Variational Mode Decomposition (VMD) to decompose radar echoes into multiple modal components, thereby enabling Doppler-channel separation. To address spurious modes and the fragmented representation of a single target across multiple modes after decomposition, an energy-constrained mode filtering method and a spectral-consistency-based mode clustering method are proposed for effective mode selection and reconstruction. Based on the reconstructed signals, time-frequency differences between ships and floating targets are analyzed in terms of micromotion and signal complexity. Features are extracted from two perspectives: motion stability and the disorder degree of energy distribution, referred to as VF and REDDC features, respectively. These features enable accurate identification of individual targets.  Results and Discussions  Experiments are conducted using X-band radar measured data under sea states 2 to 4 (Table 1 and Table 2). The results show that the proposed method achieves an average recognition accuracy of 97.32% in mixture scenarios. This performance significantly exceeds that of the existing four-feature recognition method (Table 3) and other advanced methods (Fig. 9). 
The effect of frequency separation between different targets is further examined. When the time-frequency ridge spacing exceeds 70 Hz, the recognition accuracy reaches 97.93% (Fig. 11). This result also provides empirical guidance for selecting an appropriate clustering threshold during the mode reconstruction stage. When mixture scenarios change to single-target scenarios due to relative motion, the proposed method achieves an average recognition accuracy of 93.34%. This value is 4.62 percentage points higher than that of the existing four-feature method (88.72%) (Table 4). Additional analysis indicates that the observation duration used for feature extraction should be no less than 0.25 s to maintain the expected recognition accuracy (Fig. 12).  Conclusions  This study examines recognition problems in maritime multi-target mixture scenarios. VMD is applied to separate the constituent components of mixture echoes. To address spurious modes and fragmented representation of target information across multiple modes, an energy-constrained mode filtering method and a spectral-consistency-based mode clustering method are proposed. VF and REDDC features are extracted from the perspectives of structural characteristics and signal complexity. A Support Vector Machine (SVM) classifier is then used for target recognition. Performance analysis confirms that the proposed method effectively identifies each constituent target in mixture echoes and maintains strong recognition performance in single-target scenarios. Future work will improve computational efficiency and real-time capability by optimizing the stopping criteria of VMD iterations and will further examine the application boundaries of the method using measured data under higher sea states.
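The energy-constrained mode filtering step can be pictured with a short sketch. The abstract does not state the actual criterion, so the rule used here (keep a mode when its share of total energy reaches a fixed fraction), the name `filter_modes_by_energy`, and the 0.05 threshold are all assumptions. The VMD decomposition itself is not implemented, and the modes are toy sequences.

```python
def filter_modes_by_energy(modes, min_fraction=0.05):
    """Return indices of modes whose share of total energy is at least min_fraction.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    energies = [sum(abs(x) ** 2 for x in mode) for mode in modes]
    total = sum(energies)
    if total == 0:
        return []
    return [i for i, e in enumerate(energies) if e / total >= min_fraction]

modes = [
    [1.0, -1.0, 1.0, -1.0],    # strong component, energy 4
    [0.5, 0.5, -0.5, -0.5],    # weaker component, energy 1
    [0.01, -0.01, 0.01, 0.0],  # near-empty mode, likely spurious
]
kept = filter_modes_by_energy(modes)
print(kept)  # indices of the two energetic modes
```

In this toy case the third mode carries a negligible share of the energy and is discarded, while both energetic components survive for the later clustering and reconstruction stage.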
DGCN-MFW: A Lightweight Human Action Recognition Network for Millimeter-Wave Radar 3D Point Clouds
DING Xuanyu, JIN Biao, ZHANG Zhenkai
2026, 48(4): 1740-1750.   doi: 10.11999/JEIT251087
[Abstract](380) [FullText HTML](209) [PDF 4318KB](35)
Abstract:
  Objective  Millimeter-wave radar 3D point clouds provide important spatial cues for human action recognition. However, their inherent disorder complicates feature extraction, and actions rely on temporal correlations across multiple frames, which makes single-frame analysis prone to error. In this paper, a dynamic graph convolutional network is proposed for long 3D point-cloud sequences to improve recognition performance and efficiency through multi-scale feature fusion, adaptive frame weighting, and cross-attention.  Methods  A dynamic graph convolutional network solution, DGCN-MFW, is proposed with three core components: dynamic graph convolution feature extraction, multi-scale feature fusion, and adaptive temporal frame weighting. In Step 1, dynamic graph convolution is used to automatically construct spatial geometry through local directed neighborhood graphs, and the neighborhoods are updated online. This design avoids manual graph construction and improves feature robustness. In Step 2, multi-scale feature fusion is applied to jointly extract and integrate point-cloud features across spatial and temporal dimensions, thereby capturing local details and global semantics. In Step 3, adaptive frame weighting is introduced to learn the importance of each frame, emphasize discriminative key frames, and suppress noisy or unimportant frames. Cross-attention is further used to enable information exchange between the center frame and its context, compensating for the limitations of single-frame analysis caused by motion blur, occlusion, or pose ambiguity.  Results and Discussions  The proposed network extracts features through dynamic graph convolution, performs multi-scale feature fusion and adaptive frame weighting, and ultimately completes human action recognition. It achieves strong performance on the public TI and Vayyar millimeter-wave radar point-cloud datasets. With only 2.06M parameters and 4.51 GFLOPs, it outperforms existing methods (Tables 2, 3, and 4). 
Ablation experiments confirm that both core modules substantially improve recognition accuracy (Table 1). The confusion matrices indicate accuracy above 99% for most actions on the two datasets, demonstrating superior recognition performance (Figs. 10 and 11). However, its scalability, parameter efficiency, and processing efficiency for large-scale data still require improvement. Future work will therefore focus on further lightweight design and architectural optimization to improve efficiency.  Conclusions  To address the two main challenges in mmWave radar 3D point-cloud-based human action recognition, an action recognition algorithm based on a dynamic graph convolutional network and multi-feature fusion is proposed. A multi-scale feature fusion module and cross-scale interaction are used to extract local and global features, which improves spatial representation. An adaptive frame-weighting module and a cross-attention mechanism are adopted to capture the temporal evolution of actions. The method achieves accuracies of 98.32% and 99.48% on two datasets with 2.06M parameters and 4.51 GFLOPs, outperforming mainstream models. It provides a new solution for high-precision, low-resource mmWave radar action recognition and is suitable for real-time scenarios such as industrial human-machine interaction, intelligent security, and healthcare.
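The local directed neighborhood graphs mentioned in Step 1 can be sketched as a k-nearest-neighbor construction. This is an assumption about how such a graph might be built, not the authors' implementation. The name `knn_graph`, the choice of k, and the toy frame are hypothetical, and raw coordinates stand in for the learned features on which a dynamic variant would rebuild the graph.

```python
import math

def knn_graph(points, k):
    """Return directed edges (i, j) where j is one of the k nearest neighbors of i."""
    edges = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges

# One toy radar frame: three clustered points and one distant outlier.
frame = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0), (5.0, 5.0, 5.0)]
edges = knn_graph(frame, k=2)
print(edges)
```

The graph is directed: the outlier links to its two nearest cluster points, but no cluster point links back to the outlier, so isolated noisy returns do not contaminate the neighborhoods of the main cluster.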
Image and Intelligent Information Processing
Multi-Scale Deformable Alignment-Aware Bidirectional Gated Feature Aggregation for Stereoscopic Image Generation from a Single Image
ZHANG Chunlan, QU Yuwei, NIE Lang, LIN Chunyu
2026, 48(4): 1751-1762.   doi: 10.11999/JEIT250760
[Abstract](265) [FullText HTML](186) [PDF 12920KB](19)
Abstract:
  Objective  The generation of stereoscopic images from a single image usually relies on depth as a prior, which often leads to geometric misalignment, occlusion artifacts, and texture blurring. Recent studies have therefore shifted toward end-to-end learning of alignment transformation and rendering within the image or feature domain. By adopting a content-based feature transformation and alignment mechanism, high-quality novel images can be generated without explicit geometric information. However, three main challenges remain. First, fixed convolution has limited ability to model large-scale geometric and disparity changes, which restricts feature alignment performance. Second, texture and structural information are tightly coupled in network representations, and hierarchical modeling and dynamic fusion mechanisms are often absent. This limitation makes it difficult to preserve fine details while maintaining semantic consistency. Third, existing supervision strategies mainly focus on reconstruction errors and provide limited constraints on the intermediate alignment process, which reduces the efficiency of cross-view feature consistency learning. To address these challenges, a Multi-Scale Deformable Alignment-Aware Bidirectional Gated Feature Aggregation network is proposed for stereoscopic image generation from a single image.  Methods  First, to address image misalignment and distortion caused by the inability of fixed convolution to adapt to geometric deformation and disparity changes, a Multi-Scale Deformable Alignment (MSDA) module is proposed. This module employs multi-scale deformable convolution to adaptively adjust sampling positions based on image content, enabling effective alignment between source and target features across different scales. 
Second, to address texture blurring and structural distortion in synthesized images, a feature decoupling strategy is adopted to guide shallow layers to learn texture information and deeper layers to model structural information. A Texture-Structure Bidirectional Gating Feature Aggregation (Bi-GFA) module is designed to achieve dynamic complementarity and efficient fusion of texture and structural features. Third, to improve cross-view feature alignment accuracy, a Learnable Alignment-Guided Loss (LAG) function is proposed. This loss guides the alignment network to adaptively refine the offset field at the feature level, thereby improving the fidelity and semantic consistency of the synthesized images.  Results and Discussions  This study focuses on scene-level image synthesis from a single image. Quantitative results show that the proposed method performs better than all compared methods in terms of PSNR, SSIM, and LPIPS. The method also maintains stable performance across different dataset sizes and scene complexities, indicating strong generalization ability and robustness (Tab. 1 and Tab. 2). Qualitative comparisons indicate that the generated images are visually closest to the ground-truth images and exhibit high overall sharpness and detail fidelity. In the outdoor KITTI dataset, pixel alignment errors of foreground objects are effectively reduced (Fig. 4). In indoor scenes, facial and hair textures are clearly reconstructed. High-frequency regions, such as champagne towers and balloon edges, present sharp contours and accurate color reproduction without visible artifacts or blurring. Both global illumination and local structural details are well preserved, producing high perceptual quality (Fig. 5). Ablation experiments further confirm the effectiveness of the proposed MSDA, Bi-GFA, and LAG modules (Tab. 3).  
Conclusions  A Multi-Scale Deformable Alignment-Aware Bidirectional Gated Feature Aggregation network is proposed to address strong dependence on ground-truth depth, geometric misalignment and distortion, texture blurring, and structural distortion in stereoscopic image generation from a monocular image. The MSDA module improves the flexibility and accuracy of cross-view feature alignment. The Texture-Structure Bi-GFA module enables complementary fusion of texture details and structural information. The LAG further refines offset field estimation and improves the fidelity and semantic consistency of the synthesized images. Experimental results show that the proposed method performs better than existing advanced methods in structural reconstruction, texture clarity, and viewpoint consistency, while maintaining strong generalization ability and robustness. Future work will examine the effect of different depth estimation strategies on system performance and investigate more efficient network architectures and model compression methods to reduce computational cost and support real-time stereoscopic image generation.
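Deformable convolution reads a feature map at fractional positions, which is usually done with bilinear interpolation. The sketch below shows only that sampling step. The offset is hand-picked, whereas in the network it would be predicted from image content; the name `bilinear_sample` and the toy feature map are assumptions.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D grid (list of rows) at fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp the position to the grid
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

feature = [[0.0, 1.0, 2.0],
           [3.0, 4.0, 5.0],
           [6.0, 7.0, 8.0]]
# Regular grid position (1, 1) shifted by a hand-picked offset of (+0.5, -0.5).
value = bilinear_sample(feature, 1 + 0.5, 1 - 0.5)
print(value)
```

Because the interpolation is differentiable with respect to the sampling position, offsets of this kind can be learned end to end, which is what allows the sampling grid to follow disparity changes.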
Small Object Detection Algorithm for UAV Aerial Images in Complex Environments
LIU Jie, LIU Shuhao, TIAN Ming, CUI Zhigang
2026, 48(4): 1763-1773.   doi: 10.11999/JEIT251126
[Abstract](574) [FullText HTML](349) [PDF 6050KB](77)
Abstract:
  Objective  Small object detection is critical in applications such as Unmanned Aerial Vehicle (UAV) inspection and intelligent transportation systems, where accurate perception of diminutive targets is essential for operational reliability and safety. It supports automated identification and tracking of challenging targets. However, the limited pixel size of small objects, combined with frequent occlusion and blending with the background, introduces strong background noise and leads to poor performance and high false-negative rates in existing detection models. To address these issues and to achieve high-precision detection of small objects in complex scenes, this study proposes HAR-DETR, an enhanced version of the RT-DETR baseline model.  Methods  HAR-DETR is designed for small object detection in aerial images and integrates three major improvements: Aggregated Attention, RFF-FPN (Recalibrated Feature Fusion Network-FPN), and a High-Resolution Detection Branch (HRDB). In the backbone, Aggregated Attention strengthens the model’s focus on relevant features of small objects. By expanding the receptive field, the model captures detailed edge and texture information, improving multi-scale feature extraction. During feature fusion, RFF-FPN selectively integrates high- and low-level features to retain critical spatial information and context. This supports better reconstruction of edges and contours of small objects and improves localization and recognition, particularly when object details are partially obscured by cluttered backgrounds or variable lighting. The HRDB emphasizes edge features of small objects, enhancing perception and improving robustness and precision.  
Results and Discussions  The model is compared with commonly used object detection models, including YOLOv5, YOLOv8, and YOLOv10, using precision, recall, and mAP metrics to assess performance in small object detection. Experimental results show that HAR-DETR outperforms the comparative models on the VisDrone2019 dataset (Table 1). The mAP50 and mAP50-95 increase by 3.8% and 3.2%, respectively, relative to the baseline model (Table 2). These results demonstrate superior detection performance in aerial images under complex conditions. GradCAM heatmaps are used for comparative analysis and show consistent improvements across all proposed components compared with the baseline model (Fig. 6). In the generalization experiment, the VisDrone2019 validation set and RSOD dataset are evaluated under identical training settings. The results confirm that HAR-DETR maintains strong generalization across heterogeneous tasks (Tables 3 and 4).  Conclusions  This work addresses false positives and false negatives in small object detection for aerial images captured in complex environments by using HAR-DETR. Aggregated Attention is used in the backbone to expand the receptive field and improve global feature extraction. During feature fusion, the RFF-FPN structure strengthens feature representation. A high-resolution detection head further increases sensitivity to edge textures of small objects. Evaluation on the VisDrone2019 and RSOD datasets shows: (1) mAP50 and mAP50-95 improve by 3.8% and 3.2%, respectively, reaching 51.2% and 32.1%, which reduces false negatives and false positives; (2) HAR-DETR outperforms mainstream object detection models, confirming its effectiveness; (3) the model achieves high accuracy in cross-dataset training, demonstrating strong generalization. 
These results show that HAR-DETR has stronger semantic representation and spatial awareness, adapts well to varied aerial perspectives and target distributions, and provides a more versatile solution for UAV visual perception in complex environments.
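The mAP50 and mAP50-95 metrics reported above rest on Intersection over Union (IoU): a detection counts as correct at mAP50 when its IoU with a ground-truth box is at least 0.5, and mAP50-95 averages over thresholds from 0.5 to 0.95. The sketch below computes IoU only. The box coordinates are made-up values chosen to show how sensitive small objects are to localization error.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

truth = (10, 10, 20, 20)       # a 10x10-pixel object
shift_3px = (13, 10, 23, 20)   # same size, shifted 3 px to the right
shift_4px = (14, 10, 24, 20)   # same size, shifted 4 px to the right

print(iou(truth, shift_3px))   # still above the 0.5 threshold
print(iou(truth, shift_4px))   # falls below the 0.5 threshold
```

For a 10-pixel object, one extra pixel of localization error is enough to turn a correct detection into a miss at the 0.5 threshold, which is why edge and contour cues matter so much for small targets.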
Research on Ultrasound Imaging Algorithm Fused with Diffusion Model
YUAN Ye, HUANG Minshang, YANG Weifeng
2026, 48(4): 1774-1784.   doi: 10.11999/JEIT251083
[Abstract](312) [FullText HTML](150) [PDF 7319KB](59)
Abstract:
  Objective   Medical ultrasound imaging uses ultrasonic waves to probe human tissues and forms images by processing returning echoes. It has become an essential clinical diagnostic tool because it is noninvasive, safe, and capable of real-time imaging. However, conventional ultrasound imaging remains fundamentally limited by factors such as the finite width of ultrasonic pulses, variations in tissue acoustic impedance, and the complexity of echo signals. These factors lead to persistent challenges, including limited spatial resolution, severe speckle noise, and off-axis artifacts. These limitations directly reduce lesion detectability and diagnostic accuracy. Traditional approaches based on hardware optimization and signal processing algorithms, such as adaptive beamforming, have provided only incremental improvement. Their performance is often constrained by physical laws, computational complexity, and dependence on manual parameter tuning. Recent deep learning methods, particularly those based on Generative Adversarial Networks (GANs), have shown promising performance, but they suffer from training instability and limited interpretability. The diffusion model, an emerging state-of-the-art generative framework, has shown strong robustness and generalization in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) reconstruction. However, its application in ultrasound imaging remains largely unexplored. This study aims to address this gap by developing a novel diffusion model-based framework for high-quality ultrasound image formation and to provide a stable, efficient, and interpretable solution for improving ultrasound image quality.  Methods   A novel ultrasound imaging method based on a Denoising Diffusion Probabilistic Model (DDPM) is proposed. 
The core of the method is a multi-scale diffusion network architecture designed to progressively refine a low-quality ultrasound image, such as one generated by a simple Delay-And-Sum (DAS) beamformer, into a high-quality image. The process includes forward and reverse stages. In the forward stage, Gaussian noise is gradually added to a high-quality ground-truth image over a series of time steps. In the reverse stage, the model is trained to learn the conditional denoising function. A custom denoising network takes a low-resolution DAS image as conditional input and fuses it with the noisy image at each denoising step through residual connections and feature-wise transformations at multiple scales. This deep fusion mechanism enables the network to incorporate the underlying anatomical structure from the low-quality input while iteratively removing noise and artifacts through the diffusion process. The model is trained on a dataset of paired low-quality and high-quality ultrasound images, in which the high-quality images serve as the training target. The training objective is to maximize the variational lower bound of the likelihood, thereby enabling the network to reverse the noising process. The proposed method is quantitatively compared with traditional DAS, Minimum Variance (MV) beamforming, and a representative GAN-based super-resolution method using Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity index (SSIM).  Results and Discussions   The proposed diffusion model demonstrates superior performance in improving ultrasound image quality. Quantitatively, the method achieves a mean PSNR of 35.2 dB and an SSIM of 0.933, with a PSNR improvement of 4.5 dB over conventional beamforming methods, while maintaining excellent structural fidelity. The method also consistently outperforms adaptive MV beamforming and GAN-based methods across all evaluation metrics, including contrast-to-noise ratio. Visual assessment supports these quantitative results. 
The generated images show markedly reduced speckle noise and substantially improved boundary definition of anatomical structures. Notably, these improvements are achieved without the blurring or artificial textures commonly observed in other deep learning-based methods. The multi-scale architecture with conditional feature injection effectively preserves structural integrity, as shown by the clear and continuous edges in the output images. The progressive denoising nature of the method also provides inherent interpretability for the image refinement process. Unlike the opaque single-step generation used in many other deep learning models, this method provides a transparent, stepwise enhancement pathway from the initial input to the final output. In addition, the training process remains stable and convergent, avoiding the instability that frequently affects adversarial training methods. Ablation experiments confirm the critical role of the deep fusion mechanism, and resolution analysis verifies substantial improvement in both lateral and axial resolution compared with all baseline methods.  Conclusions   This study develops and validates a novel ultrasound imaging method based on a diffusion model. The proposed framework effectively addresses key limitations of conventional methods and existing deep learning-based approaches. It avoids the complex matrix computations and manual parameter tuning required by adaptive beamformers and provides a more stable training framework than GAN-based methods. The results show that the method can substantially improve image quality by increasing PSNR and maintaining excellent structural similarity, thereby producing images with suppressed noise, reduced artifacts, and improved resolution. The multi-scale diffusion process preserves anatomical structures and provides a degree of interpretability for the image generation process. 
This work establishes diffusion models as a promising new framework for advanced ultrasound imaging and provides a robust, high-performance technical route for overcoming current bottlenecks in ultrasound image quality, with broad potential clinical value.
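The forward stage of a DDPM has a closed form: a noisy sample at step t can be drawn directly as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the cumulative product of (1 − β_t). The sketch below illustrates this under assumptions: the abstract does not state the noise schedule, so the linear schedule, its endpoints, the step count, and the toy pixel values are all hypothetical.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def forward_noise(x0, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]

alpha_bars = alpha_bar_schedule(1000)
rng = random.Random(0)
clean = [0.2, 0.8, 0.5, 0.1]  # toy stand-in for image pixels
slightly_noisy = forward_noise(clean, alpha_bars[0], rng)
mostly_noise = forward_noise(clean, alpha_bars[-1], rng)
```

At the first step almost all of the clean signal is retained, while at the last step the signal weight is close to zero and the sample is essentially Gaussian noise. The reverse network is trained to undo this progression, here conditioned on the DAS image.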
A Long-Short Term Fusion Spiking Neural Network for Detecting Tiny Moving Targets in Dynamic Vision
LI Miao, ZHANG Heng, CHEN Nuo, SHI Yangsi, HE Shiman, AN Wei
2026, 48(4): 1785-1794.   doi: 10.11999/JEIT250785
[Abstract](329) [FullText HTML](202) [PDF 2950KB](29)
Abstract:
  Objective  Long-distance electro-optical surveillance systems are widely used for applications such as space debris monitoring and unauthorized drone flight warning. In such systems, targets appear randomly and move rapidly. Because of the long detection distance, targets appear extremely small in the optical sensor and lack obvious morphological or texture features; therefore, they are classified as tiny moving targets. Conventional tiny-target perception mechanisms adopt the “image frame imaging + artificial neural network processing” paradigm. This approach generates large data volumes and requires high computational power and energy consumption, which restricts system lightweight deployment. In recent years, inspired by bionic perception and brain-like processing, the paradigm of “dynamic vision detection + brain-like processing” has emerged as a new direction. Dynamic vision provides low redundancy and high temporal resolution. However, its output is not regular image frames but sparse event streams, which require new processing methods. The Spiking Neural Network (SNN) is regarded as the third-generation neural network. It uses sparse connections and spike-based representations and naturally matches the asynchronous event triggering and bright-dark pulse output of dynamic vision sensors. Existing SNN-based methods mainly focus on targets with clear shapes in scenarios such as autonomous driving and are not well suited for tiny moving targets in long-distance electro-optical surveillance systems. To address this problem, a Long-Short Term Fusion SNN is proposed to support the application of dynamic vision in tiny moving target detection.  Methods  The proposed network architecture contains four main components. First, a short-term feature extraction module, the Spiking Swin Transformer (SST), is designed to capture the morphological expansion characteristics of tiny targets. 
This module focuses on spatiotemporal correlations across adjacent time steps and spatial regions. It integrates a spiking self-attention mechanism to enhance the learning of irregular pixel correlations and temporal dependencies. Second, a long-term feature extraction module, the Spiking ConvLSTM (SCL), is proposed to learn motion continuity embedded in long temporal sequences. A longer temporal range provides richer learnable motion features. The SCL is designed based on the ANN-style ConvLSTM architecture and takes advantage of the inherent temporal processing capability of spiking recurrent neural networks to strengthen long-term temporal memory. Third, features from the SST and SCL branches are aligned and integrated through tensor alignment and additive fusion, forming the Spiking Feature Pyramid Network (SFPN). This module performs spiking pyramid operations to fuse cross-scale spatiotemporal features across different network depths. Finally, a detection head is used to extract and identify tiny targets.  Results and Discussions  The proposed algorithm is validated using real dynamic vision data for drone detection. Experimental results show clear performance improvements across several evaluation metrics. Compared with methods that rely only on short-term temporal features, the proposed method increases recall by about 1.3% and improves precision by 0.9%, which allows more reliable detection of tiny moving targets. Analysis of the F1-score further indicates that the recall gain is obtained while false alarms are reduced. These results confirm that the dual-path spiking memory network for long-term feature extraction strengthens the ability of the model to identify subtle target characteristics. In particular, the integration of long-term temporal features improves discrimination between noise events and genuine tiny targets.  
Conclusions  This study addresses tiny moving target detection under dynamic vision and proposes a method based on Long-Short Term Fusion SNN. Considering the morphological expansion characteristics and motion continuity of tiny targets, the SST module and the SCL module are designed to extract short-term and long-term temporal features. Multi-scale dual-path features are fused through a spiking pyramid module. By learning high-dimensional features across different temporal windows, the method enables deeper mining and automatic learning of limited surface features of tiny targets. Experiments on real dynamic vision data verify the performance advantage of the proposed method, achieving a recall rate above 95% and outperforming comparison algorithms. Ablation experiments further demonstrate that long-term temporal feature learning and larger temporal data ranges improve tiny target detection performance. The proposed method enables natural integration between sparse event streams from dynamic vision sensors and spiking neural mechanisms. It provides algorithmic support for applying the “bionic detection + brain-like processing” perception paradigm in long-distance electro-optical surveillance systems.
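The spike-based processing that underlies an SNN can be illustrated with a single leaky integrate-and-fire (LIF) neuron. The abstract does not specify the neuron model, so the use of LIF, the threshold, the leak factor, and the input sequence below are assumptions chosen only to show how sparse inputs accumulate into spikes over time.

```python
def lif_neuron(inputs, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    threshold and leak are hypothetical values, not taken from the paper.
    """
    membrane, spikes = 0.0, []
    for current in inputs:
        membrane = leak * membrane + current  # leaky integration of the input
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0                    # hard reset after a spike
        else:
            spikes.append(0)
    return spikes

# Two strong events close together, then a run of weaker events.
events = [0.6, 0.6, 0.0, 0.0, 0.3, 0.3, 0.3, 0.3]
spikes = lif_neuron(events)
print(spikes)
```

The neuron fires quickly on the two strong events and only after four steps on the weaker run, which shows how the membrane state carries temporal memory: persistent weak input along a motion track can still trigger a spike, while isolated noise events decay away.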
Multi-scale Frequency Adapter and Dual-path Attention for Time Series Forecasting
YANG Zhenzhen, XU Yi, WAN Chengye, YANG Yongpeng
2026, 48(4): 1795-1805.   doi: 10.11999/JEIT251188
[Abstract](309) [FullText HTML](170) [PDF 4941KB](35)
Abstract:
  Objective  With the rapid development of big data technology, time series data are increasingly used in meteorology, power systems, finance, and other fields. However, mainstream forecasting methods face challenges in multi-scale modeling and frequency-domain feature extraction, which limit the ability to capture dynamic properties and periodic patterns in complex datasets. Traditional statistical approaches, such as AutoRegressive Integrated Moving Average (ARIMA), rely on assumptions of linear relationships and therefore perform poorly when applied to nonlinear or high-dimensional time series data. Although deep learning methods, particularly those based on convolutional neural networks and Transformer architectures, improve forecasting accuracy through advanced feature extraction and long-range dependency modeling, limitations remain in efficiently extracting and integrating multi-scale features in both temporal and frequency domains. These limitations reduce stability and forecasting accuracy, especially in dynamic and heterogeneous applications. This study proposes an intelligent forecasting framework that models multi-scale information and improves prediction accuracy across different scenarios.  Methods  A Multi-scale Frequency Adapter and Dual-path Attention (MFADA) framework is proposed for time series forecasting. The framework integrates two key modules: the Multi-scale Frequency Adapter (MFA) and the Multi-scale Dual-path Attention (MDA). The MFA module captures multi-scale frequency features through adaptive pooling and deep convolution operations. This design improves sensitivity to different frequency components and supports modeling of both short-term and long-term dependencies. The MDA module applies a multi-scale attention mechanism to strengthen fine-grained modeling across temporal and feature dimensions. It enables effective extraction and fusion of comprehensive time-domain and frequency-domain information. 
The framework is designed with computational efficiency to ensure scalability. Experiments on eight public datasets verify the effectiveness and robustness of the proposed method compared with existing time series forecasting approaches.  Results and Discussions  Extensive experiments were conducted on eight publicly available multivariate datasets, including ECL, Weather, ETT (ETTm1, ETTm2, ETTh1, ETTh2), Solar-Energy, and Traffic. Evaluation metrics include Mean Absolute Error (MAE) and Mean Squared Error (MSE). Model complexity was assessed through parameter count, FLoating Point Operations (FLOPs), and training time. Comparisons were performed with state-of-the-art models, including Fredformer, Peri-midFormer, iTransformer, TFformer, PatchTST, MSGNet, TimesNet, and TCM. Results show that MFADA achieves superior forecasting performance on most datasets and forecasting horizons (Table 1). The model obtains the best average MSE and MAE of 0.163 and 0.261 on ECL, representing decreases of 13.2% and 17.3% compared with TimesNet for forecasting length 96. On the periodic ETTm1 dataset, the average MSE reaches 0.377, which is 5.3% lower than MSGNet. Ablation experiments (Table 2) confirm the contributions of the MFA and MDA modules. Removing MFA or replacing MDA with standard self-attention increases forecasting errors on ECL, Weather, ETTh1, and ETTh2. These results indicate the complementary roles of both modules in modeling complex temporal patterns. Complexity analysis (Fig. 2) shows that MFADA achieves a balanced trade-off among forecasting accuracy, parameter efficiency, and training time, outperforming Fredformer, MSGNet, and TimesNet. Visualization results for ECL and ETTh2 (Fig. 3, Fig. 4) demonstrate that MFADA effectively follows ground-truth trends, captures turning points, and improves prediction accuracy at both global and local levels. 
Performance on the Traffic dataset is relatively weaker because of strong spatial correlations in the data, which indicates potential directions for future research.  Conclusions  This paper proposes MFADA, a time series forecasting method that integrates multi-scale frequency adaptation and dual-path attention mechanisms. MFADA presents four main advantages: (1) The MFA module effectively extracts and integrates multi-scale frequency-domain features through pyramid pooling and channel gating, which improves representation across different temporal scales. (2) The MDA module captures multi-scale dependencies in both temporal and feature dimensions, enabling fine-grained dynamic modeling. (3) The architecture maintains computational efficiency through lightweight convolution and pooling operations. (4) Experimental results across eight datasets and multiple forecasting horizons demonstrate strong generalization ability, particularly for multivariate and long-term forecasting tasks. These results show that MFADA improves both accuracy and efficiency in time series forecasting and provides useful directions for research and practical applications. Future work will explore the integration of spatial correlation information to further improve model applicability.
Research on Proximal Policy Optimization for Autonomous Long-Distance Rapid Rendezvous of Spacecraft
LIN Zheng, HU Haiying, DI Peng, ZHU Yongsheng, ZHOU Meijiang
2026, 48(4): 1806-1819.   doi: 10.11999/JEIT250844
[Abstract](498) [FullText HTML](379) [PDF 3329KB](61)
Abstract:
  Objective   With increasing demands from deep-space exploration, on-orbit servicing, and space debris removal missions, autonomous long-distance rapid rendezvous capabilities are required for future space operations. Traditional trajectory planning approaches based on analytical methods or heuristic optimization show limitations when complex dynamics, strong disturbances, and uncertainties are present, which makes it difficult to balance efficiency and robustness. Deep Reinforcement Learning (DRL) combines the approximation capability of deep neural networks with reinforcement learning-based decision-making, which supports adaptive learning and real-time decisions in high-dimensional continuous state and action spaces. In particular, Proximal Policy Optimization (PPO) is a representative policy gradient method because of its training stability, sample efficiency, and ease of implementation. Integration of DRL with PPO for spacecraft long-distance rapid rendezvous is therefore expected to overcome the limits of conventional methods and provide an intelligent, efficient, and robust solution for autonomous guidance in complex orbital environments.   Methods   A spacecraft orbital dynamics model is established by incorporating J2 perturbation, together with uncertainties arising from position and velocity measurement errors and actuator deviations during on-orbit operations. The long-distance rapid rendezvous problem is formulated as a Markov Decision Process, in which the state space includes position, velocity, and relative distance, and the action space is defined by impulse duration and direction. Fuel consumption and terminal position and velocity constraints are integrated into the model. On this basis, a DRL framework based on PPO is constructed. The policy network outputs maneuver command distributions, whereas the value network estimates state values to improve training stability. 
To address convergence difficulties caused by sparse rewards, an enhanced dense reward function is designed by combining a position potential function with a velocity guidance function. This design guides the agent toward the target while enabling gradual deceleration and improved fuel efficiency. The optimal maneuver strategy is obtained through simulation-based training, and robustness is evaluated under different uncertainty conditions.   Results and Discussions   Based on the proposed DRL framework, comprehensive simulations are conducted to assess effectiveness and robustness. In Case 1, three reward structures are examined: sparse reward, traditional dense reward, and an improved dense reward that integrates a relative position potential function with a velocity guidance term. The results show that reward design strongly affects convergence behavior and policy stability. Under sparse rewards, insufficient process feedback limits exploration of feasible actions. Traditional dense rewards provide continuous feedback and enable gradual convergence, but terminal velocity deviations are not fully corrected at later stages, which leads to suboptimal convergence and incomplete satisfaction of terminal constraints. In contrast, the improved dense reward guides the agent toward favorable behaviors from early training stages while penalizing undesirable actions at each step, which accelerates convergence and improves robustness. The velocity guidance term allows anticipatory adjustments during mid-to-late approach phases rather than delaying corrections to the terminal stage, resulting in improved fuel efficiency. Simulation results show that the maneuvering spacecraft performs 10 impulsive maneuvers, achieving a terminal relative distance of 21.326 km, a relative velocity of 0.005 0 km/s, and a total fuel consumption of 111.212 3 kg. To evaluate robustness under realistic uncertainties, 1 000 Monte Carlo simulations are performed. 
As summarized in Table 6, the mission success rate reaches 63.40%, and fuel consumption in all trials remains within acceptable bounds. In Case 2, PPO performance is compared with that of Deep Deterministic Policy Gradient (DDPG) for a multi-impulse fast-approach rendezvous mission. PPO results show five impulsive maneuvers, a terminal separation of 2.281 8 km, a relative velocity of 0.003 8 km/s, and a total fuel consumption of 4.148 6 kg. DDPG results show a fuel consumption of 4.322 5 kg, a final separation of 4.273 1 km, and a relative velocity of 0.002 0 km/s. Both methods satisfy mission requirements with comparable fuel use. However, DDPG requires a training time of 9 h 23 min, whereas PPO converges within 6 h 4 min, indicating lower computational cost. Overall, the improved PPO framework provides better learning efficiency, policy stability, and robustness.  Conclusions   The problem of autonomous long-distance rapid rendezvous under J2 perturbation and uncertainties is investigated, and a PPO-based trajectory optimization method is proposed. The results demonstrate that feasible maneuver trajectories satisfying terminal constraints can be generated under limited fuel and transfer time, with improved convergence speed, fuel efficiency, and robustness. The main contributions include: (1) development of an orbital dynamics framework that incorporates J2 perturbation and uncertainty modeling, with formulation of the rendezvous problem as a Markov Decision Process; (2) design of an enhanced dense reward function that combines position potential and velocity guidance, which improves training stability and convergence efficiency; and (3) simulation-based validation of PPO robustness in complex orbital environments. Future work will address sensor noise, environmental disturbances, and multi-spacecraft cooperative rendezvous in more complex mission scenarios to further improve practical applicability and generalization.
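The improved dense reward described in this abstract (a position potential term combined with a velocity guidance term) can be illustrated with a minimal sketch. The abstract does not give the reward equations, so the function name, the weights `k_pos`, `k_vel`, `k_fuel`, and the linear speed profile `v_ref_per_km` below are all hypothetical choices made for illustration, not the authors' formulation.

```python
def shaped_reward(prev_dist_km, dist_km, rel_speed_kms, fuel_used_kg,
                  k_pos=1.0, k_vel=0.5, k_fuel=0.01, v_ref_per_km=1e-3):
    """Hypothetical per-step dense reward (all weights are assumptions)."""
    # Position potential: phi = -distance, so the reward is the decrease
    # in relative distance achieved during this step.
    progress = k_pos * (prev_dist_km - dist_km)
    # Velocity guidance: the desired closing speed shrinks with the remaining
    # distance, which encourages gradual deceleration near the target.
    desired_speed = v_ref_per_km * dist_km
    speed_penalty = k_vel * abs(rel_speed_kms - desired_speed)
    # Fuel term: penalize propellant spent in this step.
    fuel_penalty = k_fuel * fuel_used_kg
    return progress - speed_penalty - fuel_penalty
```

Because the progress term is a difference of potentials, it gives feedback at every step instead of only at the terminal state, which is the property the abstract attributes to dense rewards.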
Image Deraining Driven by CLIP Visual Embedding
SUN Jin, CUI Yuntong, TIAN Hongwei, HUANG Changcheng, WANG Jigang
2026, 48(4): 1820-1831.   doi: 10.11999/JEIT251066
[Abstract](331) [FullText HTML](218) [PDF 5379KB](49)
Abstract:
  Objective  Rain streaks introduce visual distortions that degrade image quality and significantly impair downstream vision tasks such as feature extraction and object detection. This work addresses the problem of single-image rain streak removal. Existing methods often rely heavily on restrictive priors or synthetic datasets. This dependence limits robustness and generalization because such data differ from complex and unstructured real-world scenarios. Contrastive Language-Image Pre-training (CLIP) demonstrates strong zero-shot generalization through large-scale image-text contrastive learning. Motivated by this property, this study proposes FCLIP-UNet, a visual-semantic-driven deraining architecture designed to improve rain removal and generalization in real-world rainy environments.  Methods  FCLIP-UNet adopts a U-Net encoder-decoder architecture and formulates deraining as pixel-level detail regression guided by high-level semantic features. During the encoding stage, textual queries are omitted. Instead, the first four layers of a frozen CLIP-RN50 are employed to extract robust features that are decoupled from rain distribution. These features exploit the semantic representation capability of CLIP to suppress diverse rain patterns. To guide accurate image restoration, a collaborative decoding architecture that integrates ConvNeXt-T and an Upsampling DepthWise convolution Block (UpDWBlock) is adopted. The decoder employs ConvNeXt-T in place of conventional convolution modules to expand the receptive field and capture global contextual information. It parses rain streak patterns by using semantic priors extracted from the encoder. Under the constraint of these priors, UpDWBlock reduces information loss during upsampling and reconstructs fine-grained image details. Multi-level skip connections compensate for information loss introduced during encoding. 
In addition, a Layer-wise Differentiated Feature Perturbation Strategy (LDFPS) is incorporated to enhance robustness and adaptability in complex real-world rainy scenes.  Results and Discussions  Comprehensive evaluations are conducted on the Rain13K composite dataset by comparing the proposed model with ten state-of-the-art deraining algorithms. FCLIP-UNet shows consistently superior performance across all five testing subsets of Rain13K. In particular, the method outperforms the second-best approach on both datasets: on Test100 by 0.32 dB in Peak Signal-to-Noise Ratio (PSNR) and 0.006 in Structural Similarity Index Measure (SSIM); on Test2800 by 0.14 dB and 0.002, respectively. On Rain100H and Rain100L, FCLIP-UNet achieves competitive results, including the best SSIM on Rain100H and comparable results on other metrics (Table 3). To evaluate model generalization, the Rain13K-pretrained FCLIP-UNet is further tested on three datasets with different rainfall distribution characteristics: SPA-Data, HQ-RAIN, and MPID (Table 4, Fig. 7). Qualitative and quantitative evaluations are also conducted on the real-world NTURain-R dataset (Table 5, Figs. 8~10). These results consistently demonstrate the strong generalization capability of FCLIP-UNet. Ablation experiments on Rain100H validate the proposed encoder design and confirm the effectiveness of both UpDWBlock and LDFPS (Tables 6, 811). Additional ablation studies show that the use of LDFPS, combined with a 1:1 weighting ratio between L1 loss and perceptual loss, provides the best performance for FCLIP-UNet (Tables 9~11).  Conclusions  This study proposes FCLIP-UNet, a deraining network designed for real-world generalization by leveraging the CLIP paradigm. Three main contributions are presented. First, image deraining is formulated as a pixel-level regression task that reconstructs rain-free images from high-level semantic features. 
A frozen CLIP image encoder extracts representations that remain stable across different rain distributions, thereby reducing domain shifts caused by diverse rain models. Second, a decoder that integrates ConvNeXt-T with an UpDWBlock is designed, and an LDFPS is proposed to improve robustness to unseen rain distributions. Third, a composite loss function jointly optimizes pixel-level accuracy and perceptual consistency. Experiments on both synthetic and real-world rainy datasets show that FCLIP-UNet effectively removes rain streaks, preserves fine image details, and achieves strong deraining performance with reliable generalization capability.
Circuit and System Design
Genetic-algorithm-optimized All-metal Metasurface for Cross-band Stealth via Low-cost Computer Numerical Control Fabrication
ZHANG Ming, ZHANG Najiao, LI Jialei, LI Kang, Vazgen MELIKYAN, YANG Lin, HOU Weimin
2026, 48(4): 1832-1842.   doi: 10.11999/JEIT251080
[Abstract](433) [FullText HTML](290) [PDF 5952KB](32)
Abstract:
  Objective  Traditional electromagnetic stealth materials face the practical challenge of achieving both microwave absorption and infrared stealth. Conventional solutions, including geometric optimization and multilayer composite coatings, often suffer from narrow bandwidth, complex fabrication, and limited cross-band compatibility. This study proposes a genetic algorithm-optimized all-metal random coding metasurface that enables concurrent broadband Radar Cross Section (RCS) reduction and low infrared emissivity on a monolithic metallic platform, thereby addressing these practical limitations.  Methods  Monolithic all-metal C-shaped resonant units are employed. The design is based on the Pancharatnam-Berry geometric phase, in which the reflection phase is regulated by the rotation angle of the unit. Coding schemes of 2-bit, 3-bit, and 4-bit are implemented, corresponding to 4, 8, and 16 discrete phase states. A MATLAB-CST co-simulation framework is established. CST extracts unit responses using the Finite Element Method (FEM), whereas MATLAB applies a genetic algorithm to optimize the phase distribution for scattering energy diffusion. All-metal metasurface prototypes (150×150 mm², 10×10 array) are fabricated using Computer Numerical Control (CNC) cutting.  Results and Discussions  Genetic algorithm optimization converges within 6~8 generations. Increasing the number of coding bits enhances phase randomness. The 4-bit metasurface achieves an average 10 dB RCS reduction over 11~18.4 GHz. Simulation results agree with anechoic chamber measurements under oblique incidence angles from 0° to 60°. Infrared imaging confirms the low emissivity of the metallic surface. Compared with conventional composite or multilayer structures, the all-metal design simplifies fabrication, prevents interfacial mismatch, and improves structural stability. The metasurface demonstrates broadband, wide-angle, and cross-band stealth performance.  
Conclusions  This study presents a genetic algorithm-optimized all-metal random coding metasurface that achieves cross-band stealth compatibility. The design addresses the persistent challenge of realizing both microwave performance and thermal management in conventional stealth materials. Three main technical contributions are demonstrated. (1) The monolithic copper structure provides greater than 99.9% infrared reflectivity in the 8~14 μm band, verified by FLIR imaging, and achieves an average 10 dB RCS reduction over 11~18.4 GHz. (2) The single-material configuration removes the risk of delamination. The CNC-fabricated prototype maintains structural integrity under 60° oblique incidence and reduces fabrication cost by approximately 78% compared with lithographic processing. (3) The co-simulation optimization framework converges within eight generations for 4-bit coding, enabling broadband scattering manipulation over 7.4 GHz. The proposed metasurface combines fabrication reliability, cost efficiency, and dual-band stealth capability. These characteristics provide a practical basis for large-scale deployment in military stealth systems and satellite platforms that require multispectral concealment and long-term structural durability.
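The relation between coding bits, discrete phase states, and unit rotation stated in the Methods of this abstract can be made concrete with a small sketch. It assumes the standard Pancharatnam-Berry relation, in which rotating a unit by an angle θ shifts the cross-polarized reflection phase under circular polarization by 2θ in magnitude (the sign depends on handedness). The abstract does not state this factor explicitly, and the function name `coding_states` is hypothetical.

```python
def coding_states(bits):
    """List (reflection_phase_deg, rotation_angle_deg) for an n-bit coding scheme.

    Assumes the textbook PB relation |phase| = 2 * rotation angle; the
    sign convention (handedness) is ignored in this sketch.
    """
    n_states = 2 ** bits          # 2-bit -> 4, 3-bit -> 8, 4-bit -> 16 states
    step = 360.0 / n_states       # uniform phase spacing between states
    return [(k * step, k * step / 2.0) for k in range(n_states)]
```

Under this assumption, a 4-bit scheme uses 16 states spaced 22.5° apart in phase, which corresponds to rotating the C-shaped unit in 11.25° increments.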
One-pass Architectural Synthesis for Continuous-Flow Microfluidic Biochips Based on Deep Reinforcement Learning
LIU Genggeng, JIAO Xinyue, PAN Youlin, HUANG Xing
2026, 48(4): 1843-1852.   doi: 10.11999/JEIT251058
[Abstract](415) [FullText HTML](227) [PDF 3358KB](50)
Abstract:
Continuous-Flow Microfluidic Biochips (CFMBs) are widely applied in biomedical research because of miniaturization, high reliability, and low sample consumption. As integration density increases, design complexity significantly rises. Conventional stepwise design methods treat binding, scheduling, layout, and routing as separate stages, with limited information exchange across stages, which leads to reduced solution quality and extended design cycles. To address this limitation, a one-pass architectural synthesis method for CFMBs is proposed based on Deep Reinforcement Learning (DRL). Graph Convolutional Neural networks (GCNs) are used to extract state features, capturing structural characteristics of operations and their relationships. Proximal Policy Optimization (PPO), combined with the A* algorithm and list scheduling, ensures rational layout and routing while providing accurate information for operation scheduling. A multiobjective reward function is constructed by normalizing and weighting biochemical reaction time, total channel length, and valve count, enabling efficient exploration of the decision space through policy gradient updates. Experimental results show that the proposed method achieves a 2.1% reduction in biochemical reaction time, a 21.3% reduction in total channel length, and a 65.0% reduction in valve count on benchmark test cases, while maintaining feasibility for larger-scale chips.  Objective  CFMBs have gained sustained attention in biomedical applications because of miniaturization, high reliability, and low sample consumption. With increasing integration density, design complexity escalates substantially. Traditional stepwise design methods often yield suboptimal solutions, extended design cycles, and feasibility limitations for large-scale chips. To address these challenges, a one-pass architectural synthesis framework is proposed that integrates DRL to achieve coordinated optimization of binding, scheduling, layout, and routing.  
Methods  All CFMB design tasks are integrated into a unified optimization framework formulated as a Markov decision process. The state space includes device binding information, device locations, operation priorities, and related parameters, whereas the action space adjusts device placement, operation-to-device binding, and operation priority. High-dimensional state features are extracted using GCNs. PPO is applied to iteratively update policies. The reward function accounts for biochemical reaction time, total flow-channel length, and the number of additional valves. These metrics are evaluated using the A* algorithm and list scheduling, normalized, and weighted to balance trade-offs among objectives.  Results and Discussions  Based on the current state and candidate actions, architectural solutions are generated iteratively through PPO-guided policy updates combined with the A* algorithm and list scheduling. The defined reward function enables the generation of CFMB architectures with improved overall quality. Experimental results show an average reduction of 2.1% in biochemical reaction time, an average reduction of 21.3% in total flow-channel length, and an average reduction of 65.0% in additional valve count compared with existing methods. These improvements reduce manufacturing cost and operational risk.  Conclusions  A one-pass architectural synthesis method for CFMBs based on DRL is proposed to address flow-layer design challenges. By applying GCN-based state feature extraction and PPO-based policy optimization, the multiobjective design problem is transformed into a sequential decision-making process that enables joint optimization of binding, scheduling, layout, and routing. Experimental results obtained from multiple benchmark test cases confirm improved performance in biochemical reaction completion time, total channel length, and valve count, while preserving scalability for larger chip designs.
A Triple Modular Redundancy Voter Insertion Algorithm Utilizing Stagnation-Aware Probabilistic Reordering
LIU Zhaoting, LIU Peng
2026, 48(4): 1853-1862.   doi: 10.11999/JEIT250825
[Abstract](188) [FullText HTML](155) [PDF 1224KB](37)
Abstract:
  Objective  With the rapid development of integrated circuit technology, performance degradation and failure of electronic devices in high-energy particle radiation environments become increasingly prominent. High reliability is required in applications such as aerospace, the nuclear industry, petroleum exploration, and deep-sea detection. Among the available reliability enhancement techniques, Triple Modular Redundancy (TMR) is widely regarded as one of the most effective methods. In TMR, three identical copies of a digital circuit operate in parallel with the same input, and the correct output is obtained through majority voting when one copy fails. Common implementation methods include fine-grained TMR, system-level partitioning, and state synchronization. State synchronization is a key step in TMR-based radiation hardening because it restores registers to the correct state after a fault. This process is achieved by inserting synchronization voters, but the resulting resource overhead is often high. This study proposes a new synchronization voter insertion algorithm to reduce hardware cost. The objective is to develop and validate an algorithm that avoids exponential runtime complexity and, relative to existing methods, reduces the number of required synchronization voters.  Methods  After circuit preprocessing, the synchronization voter insertion task is formulated as a Feedback Vertex Set Problem (FVSP). The memory circuit is first extracted from the digital circuit to exclude nodes outside the candidate range and reduce circuit size. A Feedback Vertex Set (FVS) is then solved to identify the flip-flop nodes at which synchronization voters should be inserted. By inserting voters at the outputs of these flip-flops, all cycles containing memory elements are broken, and state synchronization is ensured. In implementation, a Simulated Annealing (SA) algorithm is used. 
Topological ordering is adopted to avoid direct loop detection and to reduce the time complexity of cycle checking. To improve search efficiency and solution quality, a Stagnation-Aware Probabilistic Reordering (SAPR) scheme is incorporated into the SA framework. A priority-based mechanism is applied during topological reordering to reduce conflicts and false conflict judgments in critical search steps. The candidate-set update strategy is also refined so that insertion positions with the fewest conflicts are selected in the topological ordering. When the FVS is not improved over multiple iterations, reordering is triggered with a certain probability to balance computational cost and the ability to escape local optima.  Results and Discussions  The quality of the FVS obtained by the SAPR-SA-FVSP algorithm is evaluated by comparison with three other methods. The proposed method shows higher probabilities of achieving the minimum average, best, and worst values, which indicates better overall solution quality (Table 3). Furthermore, SAPR-SA-FVSP shows a smaller mean standard deviation, which indicates better stability. The average standard deviations over all test graphs are 0.596 34 for SA-FVSP, 0.667 55 for the Nonuniform Neighborhood Sampling (NNS)-based SA method, 0.651 93 for dynamic-threshold reordering, and 0.562 17 for SAPR-SA-FVSP, confirming the superior stability of the proposed method (Table 4). Using the ISCAS89 and ITC'99 benchmark circuits, the proposed voter insertion algorithm is further compared with the critical path-based voter insertion algorithm and the highest-fanout flip-flop algorithm. Across all test cases, SAPR-SA-FVSP yields the smallest number of synchronization voters. The maximum reduction reaches 78.88% relative to the critical path-based method and 74.05% relative to the highest-fanout flip-flop algorithm (Table 5). The proposed algorithm also shows better speed and robustness. It runs successfully on all test cases without failure. 
The average execution times on the circuits for which all three algorithms complete successfully are 9 880.19 ms for the critical path-based algorithm, 9 625.04 ms for the highest-fanout flip-flop algorithm, and 3 389.73 ms for the proposed algorithm.  Conclusions  The proposed SAPR strategy improves the conventional SA-FVSP method and yields better solution quality and greater stability. On this basis, a resource-efficient synchronization voter insertion algorithm is proposed for restoring correct register states in TMR-hardened digital circuits. The algorithm divides the task into memory-circuit extraction and FVSP solving. Its completeness and efficiency are demonstrated theoretically, and substantial reductions in synchronization voter insertion are verified on benchmark circuits relative to the critical path-based and highest-fanout flip-flop methods. The proposed method therefore provides an effective approach for reducing hardware overhead while maintaining high reliability in TMR hardening of digital circuits.
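The Methods of this abstract rest on two ideas: a set of flip-flops is a valid voter location set if removing it breaks every cycle (a feedback vertex set), and acyclicity can be checked through a topological ordering instead of explicit loop detection. The sketch below illustrates only that validity check, using Kahn's algorithm. It is not the SAPR-SA-FVSP search itself, and the function name and edge-list representation are assumptions made for illustration.

```python
from collections import deque

def is_feedback_vertex_set(edges, removed):
    """Return True if deleting `removed` leaves the directed graph acyclic."""
    removed = set(removed)
    nodes = {n for edge in edges for n in edge} - removed
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        # Keep only edges whose endpoints both survive the removal.
        if u in nodes and v in nodes:
            succ[u].append(v)
            indeg[v] += 1
    # Kahn's algorithm: repeatedly peel off nodes with no incoming edges.
    queue = deque(n for n in nodes if indeg[n] == 0)
    ordered = 0
    while queue:
        u = queue.popleft()
        ordered += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # A full topological order exists only if no cycle remains.
    return ordered == len(nodes)
```

For a graph with the cycles a→b→c→a and c→d→c, removing only node c breaks both loops, so a single synchronization voter at that flip-flop output would suffice in this toy case.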
Optimal Federated Average Fusion of Gaussian Mixture-Probability Hypothesis Density Filters
XUE Yu, XU Lei
2026, 48(4): 1863-1874.   doi: 10.11999/JEIT250759
[Abstract](302) [FullText HTML](152) [PDF 2571KB](44)
Abstract:
  Objective  To realize optimal decentralized fusion tracking of uncertain targets, this study proposes a federated average fusion algorithm for Gaussian Mixture-Probability Hypothesis Density (GM-PHD) filters, designed with a hierarchical structure. Each sensor node operates a local GM-PHD filter to extract multi-target state estimates from sensor measurements. The fusion node performs three key tasks: (1) maintaining a master filter that predicts the fusion result from the previous iteration; (2) associating and merging the GM-PHDs of all filters; and (3) distributing the fused result and several parameters to each filter. The association step decomposes multi-target density fusion into four categories of single-target estimate fusion. We derive the optimal single-target estimate fusion both in the absence and presence of missed detections. Information assignment applies the covariance upper-bounding theory to eliminate correlation among all filters, enabling the proposed algorithm to achieve the accuracy of Bayesian fusion. Simulation results show that the federated fusion algorithm achieves optimal tracking accuracy and consistently outperforms the conventional Arithmetic Average (AA) fusion method. Moreover, the relative reliability of each filter can be flexibly adjusted.  Methods  The multi-sensor multi-target density fusion is decomposed into multiple groups of single-target component merging through the association operation. Federated filtering is employed as the merging strategy, which achieves the Bayesian optimum owing to its inherent decorrelation capability. Section 3 rigorously extends this approach to scenarios with missed detections. To satisfy federated filtering’s requirement for prior estimates, a master filter is designed to compute the predicted multi-target density, thereby establishing a hierarchical architecture for the proposed algorithm. 
In addition, auxiliary measures are incorporated to compensate for the observed underestimation of cardinality.  Results and Discussions  modified Mahalanobis distance (Fig. 3). The precise association and the single-target decorrelation capability together ensure the theoretical optimality of the proposed algorithm, as illustrated in Fig. 2. Compared with conventional density fusion, the Optimal Sub-Pattern Assignment (OSPA) error is reduced by 8.17% (Fig. 4). The advantage of adopting a small average factor for the master filter is demonstrated in Figs. 5 and 6. The effectiveness of the measures for achieving cardinality consensus is also validated (Fig. 7). Another competitive strength of the algorithm lies in the flexibility of adjusting the average factors (Fig. 8). Furthermore, the algorithm consistently outperforms AA fusion across all missed detection probabilities (Fig. 9).  Conclusions  This paper achieves theoretically optimal multi-target density fusion by employing federated filtering as the merging method for single-target components. The proposed algorithm inherits the decorrelation capability and single-target optimality of federated filtering. A hierarchical fusion architecture is designed to satisfy the requirement for prior estimates. Extensive simulations demonstrate that: (1) the algorithm can accurately associate filtered components belonging to the same target, thereby extending single-target optimality to multi-target fusion tracking; (2) the algorithm supports flexible adjustment of average factors, with smaller values for the master filter consistently preferred; and (3) the superiority of the algorithm persists even under sensor malfunctions and high missed detection rates. Nonetheless, this study is limited to GM-PHD filters with overlapping Fields Of View (FOVs). Future work will investigate its applicability to other filter types and spatially non-overlapping FOVs.
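The single-target merging and information assignment steps mentioned in this abstract can be illustrated with the textbook federated filter in scalar form. This is a sketch under stated assumptions: the abstract gives no equations, the missed-detection cases it derives are not covered, and treating the paper's "average factors" as the information-sharing factors `factors` below is an assumption. Both function names are hypothetical.

```python
def federated_fuse(estimates, variances):
    """Information-form fusion of decorrelated scalar estimates."""
    info = [1.0 / p for p in variances]           # information = inverse variance
    fused_var = 1.0 / sum(info)                   # P_f^{-1} = sum_i P_i^{-1}
    fused_mean = fused_var * sum(i * x for i, x in zip(info, estimates))
    return fused_mean, fused_var

def assign_information(fused_var, factors):
    """Split the fused information among filters (covariance upper bounding).

    Each filter receives variance P_f / beta_i, with the beta_i summing to 1,
    so refusing the unchanged estimates recovers P_f exactly.
    """
    assert abs(sum(factors) - 1.0) < 1e-9
    return [fused_var / b for b in factors]
```

For example, `federated_fuse([1.0, 3.0], [1.0, 1.0])` returns `(2.0, 0.5)`, and distributing that result with equal factors hands each filter a variance of 1.0 again. A smaller factor gives a filter a larger assigned variance, so its prior carries less weight in the next fusion round.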