Latest Articles

Articles in press have been peer-reviewed and accepted. They have not yet been assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Construction of MDS Codes and NMDS Codes Based on Cyclic Subgroup of \begin{document}$ \mathbb{F}_{{q}^{2}}^{*} $\end{document}
DU Xiaoni, XUE Jing, QIAO Xingbin, ZHAO Ziwei
Available online, doi: 10.11999/JEIT251204
Abstract:
  Objective  With the rapid development of modern communication technologies, the demand for higher performance and efficiency in error-correcting codes has intensified. Error-correcting codes are used to detect and correct errors introduced during transmission. Thanks to their superior algebraic structure, straightforward encoding and decoding algorithms, and ease of implementation, linear codes have become the most widely used class of error-correcting codes in communication systems. Their parameters are constrained by classical bounds, such as the Singleton bound: for a linear code of length \begin{document}$ n $\end{document} and dimension \begin{document}$ k $\end{document}, the minimum distance \begin{document}$ d $\end{document} satisfies \begin{document}$ d\leq n-k+1 $\end{document}. When \begin{document}$ d=n-k+1 $\end{document}, the code is known as a maximum distance separable (MDS) code. MDS codes are widely used in distributed storage systems and over random-error channels. If \begin{document}$ d=n-k $\end{document}, the code is termed almost MDS (AMDS); when both a code and its dual are AMDS, the code is near MDS (NMDS). Owing to their distinctive geometric structure, NMDS codes have important applications in cryptography and combinatorics. There has been sustained, in-depth research worldwide on constructing structurally simple, high-performance MDS and NMDS codes. Against this backdrop, this paper constructs several families of MDS and NMDS codes of length \begin{document}$ q+3 $\end{document} over the finite field \begin{document}$ {\mathbb{F}}_{{{q}^{2}}} $\end{document} of even characteristic, leveraging the cyclic subgroup \begin{document}$ {U}_{q+1} $\end{document}. Furthermore, several families of optimal locally repairable codes (LRCs) are presented.
LRCs enable efficient failure recovery by accessing only a small set of local nodes, thereby reducing repair overhead and improving system availability, which makes them attractive for distributed and cloud-storage settings.  Methods  In 2021, Wang et al. constructed 3-dimensional NMDS codes using elliptic curves over finite fields \begin{document}$ {\mathbb{F}}_{q} $\end{document}. In 2023, Heng et al. obtained several classes of 4-dimensional NMDS codes by appending appropriate column vectors to a base generator matrix. In 2024, Ding et al. presented four classes of 4-dimensional NMDS codes. They also determined the locality of the corresponding dual codes and constructed four classes of distance-optimal and dimension-optimal LRCs. Building upon the aforementioned works, this paper combines the unit circle \begin{document}$ {U}_{q+1} $\end{document} in \begin{document}$ {\mathbb{F}}_{{{q}^{2}}} $\end{document} with elliptic curves to construct generator matrices. By augmenting these matrices with two additional column vectors, several classes of MDS and NMDS codes of length \begin{document}$ q+3 $\end{document} are obtained. Furthermore, the locality of the constructed NMDS codes is precisely determined, resulting in several classes of optimal LRCs.  Results and Discussions  In 2023, Heng et al. constructed a class of generator matrices whose second-row entries range over \begin{document}$ \mathbb{F}_{q}^{*} $\end{document}, while the elements of the other rows are composed of nonconsecutive powers of the elements of the second row. Building on this work, Yin et al. constructed generator matrices with elements taken from \begin{document}$ {U}_{q+1} $\end{document} in 2025, thereby obtaining new matrices and constructing infinite families of MDS and NMDS codes.
Following this line of research, the present paper expands such generator matrices by appending two column vectors whose elements are selected from the finite field \begin{document}$ {\mathbb{F}}_{{{q}^{2}}} $\end{document}. The resulting matrices serve as generator matrices for several new classes of MDS and NMDS codes of length \begin{document}$ q+3 $\end{document}. Notably, several classes of NMDS codes with identical parameters but distinct weight distributions are obtained. Computing the minimum locality of the NMDS codes constructed in this paper shows that some of them are optimal LRCs that satisfy the Singleton-like, Cadambe-Mazumdar, Plotkin-like, and Griesmer-like bounds. Furthermore, all of the constructed MDS codes are Griesmer codes, whereas the NMDS codes are near-Griesmer codes. These results indicate that the constructions presented in this paper are more general and unified in nature.  Conclusions  In this paper, several families of MDS and NMDS codes of length \begin{document}$ q+3 $\end{document} over \begin{document}$ {\mathbb{F}}_{{{q}^{2}}} $\end{document} are obtained by leveraging elements from the unit circle \begin{document}$ {U}_{q+1} $\end{document} in conjunction with oval polynomials, and by appending two additional column vectors with entries in \begin{document}$ {\mathbb{F}}_{q} $\end{document} to the resulting matrices. Furthermore, the minimum locality of the constructed NMDS codes is analyzed, revealing that some of them are optimal LRCs. The results demonstrate that the proposed framework generalizes previous constructions and that the obtained codes are either optimal or near-optimal with respect to the Griesmer bound.
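The MDS condition above is easy to verify by brute force on small examples. A minimal Python sketch, using a generic Reed-Solomon construction over GF(7) purely for illustration (this is not the paper's U_{q+1}-based construction):

```python
from itertools import product

p = 7  # work over the prime field GF(7), so arithmetic is plain modular arithmetic

def min_distance(G):
    """Brute-force minimum distance of the linear code generated by G over GF(p)."""
    k, n = len(G), len(G[0])
    best = n
    for msg in product(range(p), repeat=k):
        if all(m == 0 for m in msg):
            continue
        # codeword = message vector times generator matrix, reduced mod p
        cw = [sum(m * g for m, g in zip(msg, col)) % p for col in zip(*G)]
        best = min(best, sum(c != 0 for c in cw))
    return best

# Reed-Solomon-style generator matrix: row i holds the i-th powers of the points
points = [1, 2, 3, 4, 5, 6]          # the nonzero elements of GF(7)
k = 3
G = [[pt ** i % p for pt in points] for i in range(k)]

n, d = len(points), min_distance(G)
print(n, k, d, d == n - k + 1)        # 6 3 4 True: Singleton bound met -> MDS
```

Each nonzero codeword is the evaluation of a nonzero polynomial of degree below k at n distinct points, so it has at most k−1 zeros and weight at least n−k+1, meeting the Singleton bound with equality.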
Total Coloring on Planar Graphs of Nested n-Pointed Stars
SU Rongjin, FANG Gang, ZHU Enqiang, XU Jin
Available online, doi: 10.11999/JEIT250861
Abstract:
  Objective  Many combinatorial optimization problems in real life can be regarded as graph coloring problems. One classic theme in graph coloring is total coloring, which combines vertex coloring and edge coloring. Research on total coloring, both past and present, centers on the famous Total Coloring Conjecture (TCC), proposed in the 1960s. For graphs (including planar graphs) with maximum degree less than six, the correctness of TCC has been verified by enumerating the maximum degree. A powerful discharging technique has been used to prove that TCC holds for planar graphs with maximum degree greater than six. This method, which is effective for many problems on planar graphs, involves identifying reducible configurations and formulating detailed discharging rules. However, the discharging method becomes limited when applied to planar graphs with maximum degree exactly six. In this difficult case, researchers have identified certain graphs satisfying TCC under structural constraints, such as graphs free of cycles of length four and graphs without two adjacent triangles. Recent results relax these constraints: TCC has been shown to hold for planar graphs free of 4-fan subgraphs and for planar graphs with maximum average degree less than twenty-three fifths. Hence, it remains unconfirmed whether planar graphs with maximum degree six that contain a 4-fan subgraph or have maximum average degree not less than twenty-three fifths satisfy TCC. To verify whether TCC applies to such graphs, this paper studies total coloring on one family of such planar graphs, called nested n-pointed stars, with the aim of showing that TCC holds for them.
Methods  This article mainly adopts theoretical methods. One is mathematical induction, which is widely used to prove theorems, especially those involving countable parameters. Another is the construction method, which requires constructing explicit instances to support propositions. Case enumeration is also employed for completeness and clarity. To start, an n-pointed star is obtained by attaching a triangle to each outer side of an n-polygon (n≥3) and then connecting, one by one, all the vertices of these triangles that do not lie on the n-polygon into a new n-polygon. Based on the new n-polygon, a nested n-pointed star with l layers, denoted by \begin{document}$ G_{n}^{l} $\end{document}, is formed by iterating this operation layer by layer. Nested n-pointed stars obtained in this way have maximum degree exactly six. It is shown that nested n-pointed stars contain 4-fan subgraphs and have maximum average degree larger than twenty-three fifths. Since nested n-pointed stars are iterative planar graphs, mathematical induction on the number of layers of \begin{document}$ G_{n}^{l} $\end{document} is used to demonstrate that nested n-pointed stars have a total 8-coloring, through the following steps: (1) Show that \begin{document}$ G_{n}^{1} $\end{document} has a total 8-coloring; (2) Suppose that \begin{document}$ G_{n}^{l-1} $\end{document} has a total 8-coloring; (3) Prove that \begin{document}$ G_{n}^{l} $\end{document} has a total 8-coloring. Further, a nested n-pointed star \begin{document}$ G_{n}^{l} $\end{document} is referred to as a type I graph if it admits a total 7-coloring. When \begin{document}$ n=3k $\end{document}, the construction method is used to show that \begin{document}$ G_{3k}^{l} $\end{document} is a type I graph.
The value of \begin{document}$ k $\end{document} is classified as odd \begin{document}$ (k=2m-1) $\end{document} or even \begin{document}$ (k=2m) $\end{document}. In each case, a total 7-coloring of \begin{document}$ G_{3k}^{l} $\end{document} is obtained by assigning colors to all the vertices and edges directly.  Results and Discussions  It is demonstrated by induction on the number of layers of \begin{document}$ G_{n}^{l} $\end{document} that nested n-pointed stars satisfy the Total Coloring Conjecture (as illustrated in Fig. 5). Only five distinct colors are assigned to the vertices and edges of \begin{document}$ G_{3k}^{1} $\end{document} to identify a special total 5-coloring of \begin{document}$ G_{3k}^{1} $\end{document} (see Fig. 6(a) and Fig. 8(a)). The other two colors are alternately assigned to the edges connecting the two polygons lying on layer 1 and layer 2. A total 7-coloring of \begin{document}$ G_{3k}^{2} $\end{document} is then obtained by coloring the vertices and edges of the polygon lying on layer 2 (see Fig. 6(b) and Fig. 8(b)). Next, it is extended to a total 7-coloring of \begin{document}$ G_{3k}^{3} $\end{document} (see Fig. 7(a) and Fig. 9(a)). After a permutation operation, another total 7-coloring of \begin{document}$ G_{3k}^{3} $\end{document} is obtained (see Fig. 7(b) and Fig. 9(b)). The coloring pattern on the outermost layer of this total 7-coloring is exactly the same as that on the outermost layer of \begin{document}$ G_{3k}^{1} $\end{document}, which means that total 7-colorings of \begin{document}$ G_{3k}^{4},G_{3k}^{5},\cdots ,G_{3k}^{l} $\end{document} can be obtained by the same extension. Therefore, \begin{document}$ G_{3k}^{l} $\end{document} is a type I graph.  Conclusions  This paper verifies that the Total Coloring Conjecture holds for nested n-pointed stars, which have maximum degree exactly 6 and contain 4-fan subgraphs. It shows that \begin{document}$ G_{3k}^{l} $\end{document} is a type I graph.
A question then arises: is \begin{document}$ G_{n}^{l} $\end{document} a type I graph when \begin{document}$ n\neq 3k $\end{document}? It is not hard to construct a total 7-coloring of \begin{document}$ G_{n}^{l} $\end{document} if \begin{document}$ n=4 $\end{document} or \begin{document}$ n=5 $\end{document}, so both \begin{document}$ G_{4}^{l} $\end{document} and \begin{document}$ G_{5}^{l} $\end{document} are type I graphs. For other \begin{document}$ n\neq 3k $\end{document}, whether \begin{document}$ G_{n}^{l} $\end{document} is a type I graph remains to be determined.
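The constraints that every total coloring in the induction must satisfy can be captured in a small checker. A minimal sketch (the triangle K3 stands in as a toy instance; a nested n-pointed star would be verified the same way, just with a larger edge list):

```python
def is_total_coloring(vcol, ecol):
    """Check the total-coloring constraints: adjacent vertices, adjacent edges,
    and incident vertex/edge pairs must all receive distinct colors."""
    for (u, v), c in ecol.items():
        if vcol[u] == vcol[v]:
            return False                      # adjacent vertices share a color
        if c in (vcol[u], vcol[v]):
            return False                      # edge clashes with an endpoint
        for (x, y), c2 in ecol.items():
            if (u, v) != (x, y) and {u, v} & {x, y} and c == c2:
                return False                  # adjacent edges share a color
    return True

# A total 3-coloring of the triangle K3 (maximum degree 2, so 3 = Δ + 1 colors)
vcol = {"a": 0, "b": 1, "c": 2}
ecol = {("a", "b"): 2, ("b", "c"): 0, ("c", "a"): 1}
print(is_total_coloring(vcol, ecol))   # True
```

A "type I" graph in the abstract's sense is one admitting such a coloring with Δ + 1 colors; for the nested stars, Δ = 6 and the paper constructs total 7-colorings explicitly.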
A Miniaturized SSVEP Brain-Computer Interface System
CAI Yu, WANG Junyang, JIANG Chuanli, LUO Ruixin, LV Zhengchao, YU Haiqing, HUANG Yongzhi, JUNG Tzyy-Ping, XU Minpeng
Available online, doi: 10.11999/JEIT251223
Abstract:
  Objective  The practical deployment of brain-computer interface (BCI) systems in daily-life scenarios is constrained by the bulkiness of acquisition hardware and the tethering cables required for reliable operation. While portable systems have been developed, achieving concurrent goals of significant device compactness, complete user mobility, and high decoding performance remains a challenge. This study aims to design, implement, and validate a wearable steady-state visual evoked potential (SSVEP) BCI system. The primary goal is to realize an integrated system featuring ultra-miniaturized, concealable acquisition hardware and a stable architecture that operates without the need for synchronization cables, and to demonstrate that this approach delivers online performance comparable to conventional laboratory systems, thereby advancing the feasibility of truly wearable BCIs.  Methods  A system-level solution was developed, centered on a distributed architecture to achieve wearability and hardware simplification. The core of the system is an ultra-miniaturized acquisition node. Each node, functioning as an independent EEG acquisition unit, integrates a Bluetooth Low Energy (BLE) system-on-chip (CC2640R2F), a high-precision analog-to-digital converter (ADS1291), a battery, and an electrode into a single encapsulated module. Through optimized 6-layer PCB design and a stacked assembly, the module dimensions were reduced to 15.12 mm × 14.08 mm × 14.31 mm (3.05 cm³) with a weight of 3.7 g. Each node incorporates a single active electrode, and all nodes share a common reference electrode connected via a thin, short wire. This design reduces scalp connections and enables a hair-clip structure for concealed placement within the user's hair.
Multiple such nodes form a star network coordinated by a master device, which manages communication with a stimulus-presentation computer. To enable cable-free operation while maintaining data integrity, a synchronization strategy was implemented to address timing uncertainties inherent in distributed wireless systems. This strategy combines hardware-event detection with software-based clock management to align stimulus markers with the multi-channel EEG data streams without dedicated synchronization cables. The master device coordinates this process and streams the synchronized data to the computer for real-time processing. System evaluation was conducted in two phases. Foundational performance metrics included physical characteristics, key electrical parameters (input-referred noise: 3.91 μVpp; common-mode rejection ratio: 132.99 dB), and synchronization accuracy across different network scales. Application-level performance was assessed through a 40-command online SSVEP spelling experiment with six subjects in an unshielded room with common RF interference. Four nodes were placed at positions Pz, PO3, PO4, and Oz. EEG epochs (0.14–3.14 s post-stimulus) were analyzed using canonical correlation analysis (CCA) and ensemble task-related component analysis (e-TRCA) to compute recognition accuracy and information transfer rate (ITR).  Results and Discussions  The implemented system successfully achieved its design objectives. Each acquisition node attained an ultra-compact form factor (3.05 cm³, 3.7 g) suitable for concealed wear, with a battery life exceeding 5 hours at a 1000 Hz sampling rate. The electrical performance confirmed its capability for high-quality SSVEP acquisition. The cable-free synchronization strategy provided the necessary temporal stability for system operation. Evaluation showed that over 95% of event markers were aligned with the EEG data stream with an error of less than 1 millisecond (Fig. 4), meeting the requirements for SSVEP-BCI applications. This reliable synchronization contributed to the quality of the recorded neural signals. Grand-averaged SSVEP responses across subjects exhibited clear and stable waveforms with precise phase alignment (Fig. 5). The signal-to-noise ratio at the fundamental stimulation frequency exceeded 10 dB for all 40 commands (Fig. 6), confirming good signal quality. In the online spelling experiment, the system demonstrated robust decoding performance. Using the e-TRCA algorithm with a 3-second data window, an average recognition accuracy of (95.00 ± 2.04)% was achieved. The system reached a peak ITR of (147.24 ± 30.52) bits/min with a short 0.4-second data length (Fig. 7). A comparative analysis with existing SSVEP-BCI systems (Table 1) shows that the proposed system, under constraints of miniaturization, cable-free use, and a reduced number of electrodes (four channels), achieved accuracy comparable to some cable-dependent laboratory systems while demonstrating improved wearability.  Conclusions  This work presents the development and validation of a wearable SSVEP-BCI system that integrates ultra-miniaturized hardware with a distributed, cable-free architecture. The system demonstrates that through coordinated design at the hardware and system levels, it is possible to overcome traditional trade-offs between device size, user freedom, and decoding capability. The acquisition node, at 3.7 g and 3.05 cm³, represents a significant step toward concealable wearability. The implemented synchronization strategy supported reliable operation without dedicated cables. The overall system, evaluated in a realistic environment, delivered online performance competitive with many cable-dependent setups, achieving 95.00% recognition accuracy and a peak ITR of 147.24 bits/min in a 40-target task.
Therefore, this study provides a comprehensive system-level solution, contributing a practical platform that facilitates the transition of high-performance BCIs from the laboratory toward everyday wearable applications.
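The ITR figures quoted above are conventionally computed with the Wolpaw formula for an N-target task. A sketch (how the per-selection time is accounted, e.g. whether gaze-shift intervals are included, is an assumption here and varies between studies):

```python
import math

def itr_bits_per_min(n_targets, accuracy, trial_seconds):
    """Wolpaw information transfer rate for an N-target BCI, in bits/min.
    Assumes errors are distributed uniformly over the N-1 wrong targets."""
    p, n = accuracy, n_targets
    bits = math.log2(n)
    if 0 < p < 1:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * 60 / trial_seconds

# e.g. a 40-target speller at 95% accuracy with a 3 s selection window
print(round(itr_bits_per_min(40, 0.95, 3.0), 1))   # ≈ 95.4 bits/min
```

Shorter data windows raise ITR sharply as long as accuracy holds up, which is why the peak ITR in the abstract occurs at the 0.4-second data length rather than the 3-second one.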
Wavelet Transform and Attentional Dual-Path EEG Model for Virtual Reality Motion Sickness Detection
CHEN Yuechi, HUA Chengcheng, DAI Zhian, FU Jingqi, ZHU Min, WANG Qiuyu, YAN Ying, LIU Jia
Available online, doi: 10.11999/JEIT251233
Abstract:
  Objective  Virtual Reality Motion Sickness (VRMS) poses a significant challenge to the widespread adoption of immersive Virtual Reality (VR) technologies; the symptom is primarily caused by sensory conflict between the vestibular and visual systems. Current assessment methods largely rely on subjective reports, which interrupt the experience and lack real-time capability. To address this, an objective and direct detection method is needed. This paper proposes a novel dual-path fusion model, the Wavelet Transform Attentional Network (WTATNet), which combines wavelet transform and attention mechanisms. The objective is to achieve classification of resting-state Electroencephalograph (EEG) signals collected before and after VR motion stimulus exposure, thereby providing a robust tool for VRMS detection and facilitating further research into its causes and mitigation strategies.  Methods  The proposed WTATNet model consists of two parallel pathways for feature extraction from EEG signals. The first pathway employs a two-dimensional Discrete Wavelet Transform (2D-DWT) applied simultaneously to the time and electrode dimensions of the EEG, which is reshaped into a 2D matrix based on the spatial layout of the scalp electrodes (using either horizontal or vertical arrangements). This process decomposes the signal to capture multi-scale spatiotemporal features. The resulting wavelet coefficients are then fed into Convolutional Neural Network (CNN) layers for further feature extraction. The second pathway processes the EEG through a one-dimensional CNN layer for initial filtering, followed by a dual-attention mechanism comprising a channel attention module and an electrode attention module. These modules dynamically recalibrate the importance of features in the channel and electrode dimensions, respectively, enhancing the model's focus on task-relevant information.
Finally, features from both pathways are fused and passed through fully connected layers for classification into pre-exposure (non-VRMS) and post-exposure (VRMS) states, validated against subjective questionnaire results. The model was trained and evaluated using ten-fold cross-validation on a dataset collected from 22 subjects exposed to VRMS via the game "Ultrawings2", with performance assessed using accuracy, precision, recall, and F1-score.  Results and Discussions  The WTATNet model demonstrated superior performance in classifying VRMS-related EEG. It achieved an average accuracy of 98.39%, F1-score of 98.39%, precision of 98.38%, and recall of 98.40%, outperforming several classical and state-of-the-art EEG models such as ShallowConvNet, EEGNet, Conformer, and FBCNet (Table 2). Ablation studies (Tables 3 & 4) confirmed the contribution of each component: removing the wavelet transform path, the electrode attention module, or the channel attention module led to accuracy drops of 1.78%, 1.36%, and 1.01%, respectively, highlighting their importance. The 2D-DWT proved significantly more effective than the 1D-DWT in the proposed model, underscoring the value of joint spatiotemporal analysis. Furthermore, experiments with randomized electrode ordering (Table 5) resulted in markedly lower performance compared with spatially coherent layouts (horizontal/vertical), validating that the 2D-DWT effectively leverages the intrinsic spatial correlations among scalp electrodes. Visualizations of the extracted features using t-SNE (Figs. 5 & 6) showed that WTATNet learned more discriminative features than baseline and ablated models.  Conclusions  The proposed dual-path hybrid model, WTATNet, integrates wavelet transform and attention mechanisms for highly accurate VRMS detection using resting-state EEG.
The model effectively combines the interpretable, multi-scale spatiotemporal features from the 2D-DWT with the adaptive feature weighting of the channel and electrode attention modules. The experimental results demonstrate that WTATNet achieves state-of-the-art performance, offering an objective, robust, and non-intrusive method for detecting VRMS. This approach not only holds immediate application for assessing user comfort in VR but also provides a valuable tool and technical pathway for investigating the underlying neural mechanisms of VRMS and developing countermeasures. Furthermore, the WTATNet framework shows promising potential for generalization to other EEG decoding tasks in neuroscience and clinical medicine.
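The joint electrode-by-time 2D-DWT at the core of the first pathway can be illustrated with a single-level Haar transform; a minimal numpy sketch (the Haar basis and the toy 8×128 "EEG" array are assumptions, as the abstract does not specify the wavelet used):

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2D Haar DWT: filter along time, then along electrodes,
    yielding the approximation (LL) and detail (LH, HL, HH) sub-bands."""
    def step(a):  # pairwise orthonormal average / difference along the last axis
        lo = (a[..., 0::2] + a[..., 1::2]) / np.sqrt(2)
        hi = (a[..., 0::2] - a[..., 1::2]) / np.sqrt(2)
        return lo, hi
    lo, hi = step(x)                       # along the time axis
    ll, lh = step(lo.swapaxes(-1, -2))     # along the electrode axis
    hl, hh = step(hi.swapaxes(-1, -2))
    return (ll.swapaxes(-1, -2), lh.swapaxes(-1, -2),
            hl.swapaxes(-1, -2), hh.swapaxes(-1, -2))

# Toy "EEG": 8 electrodes (rows, ordered by scalp layout) x 128 time samples
rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 128))
ll, lh, hl, hh = haar_dwt2(eeg)
print(ll.shape)   # (4, 64): halved in both the electrode and time dimensions
```

Because the transform acts on both axes at once, the sub-bands mix information across neighboring electrodes and time samples, which is why the randomized electrode ordering in Table 5 degrades performance: shuffling rows destroys the spatial neighborhoods the transform exploits.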
A Sparse-Reconstruction-Based Fast Localization Algorithm for Mixed Far-Field and Near-Field Sources
FU Shijian, QIU Longhao, LIANG Guolong
Available online, doi: 10.11999/JEIT250165
Abstract:
  Objective  Source localization is a key research topic in array signal processing, with applications in radar, sonar, and wireless communications. Conventional localization methods based solely on far-field or near-field models face clear limitations when separating and localizing mixed far-field and near-field sources. Existing approaches, such as subspace-based methods, often show high computational complexity, limited localization accuracy, and degraded performance under low Signal-to-Noise Ratio (SNR) conditions. In addition, many methods assume that near-field sources lie strictly within the Fresnel region, which leads to localization errors and a reduced effective array aperture. Improved algorithms, such as Multiple Sparse Bayesian Learning for Far- and Near-Field Sources (FN-MSBL), overcome part of these limitations and achieve higher localization accuracy. However, their reliance on iterative matrix inversion leads to high computational cost and restricts real-time applicability. Therefore, this study aims to address these issues by proposing a novel algorithm that develops a sparse representation model for mixed far-field and near-field sources in the covariance domain and integrates sparse reconstruction with the Generalized Approximate Message Passing (GAMP) and Variational Bayesian Inference (VBI) frameworks. The objective is to achieve high-precision localization of mixed sources while substantially reducing computational cost.  Methods  Two algorithms, termed Covariance-Based VBI for Far- and Near-Field Sources (FN-CVBI) and Covariance-Based GAMP-VBI for Far- and Near-Field Sources (FN-GAMP-CVBI), are developed. First, a unified sparse representation model for mixed far-field and near-field sources is constructed based on the covariance vector. This representation benefits from the improved SNR of the covariance vector relative to the original array output, which improves far-field Direction of Arrival (DOA) estimation. 
Second, to reduce estimation errors in the sample covariance matrix, a pre-whitening operation is applied to the covariance vector to minimize inter-element correlation and improve robustness. Third, a hierarchical Bayesian model is established to impose sparsity, and VBI is employed to estimate model parameters through iterative posterior updates. Fourth, to reduce the computational burden associated with conventional VBI, GAMP is embedded into the VBI framework to replace matrix inversion operations. The detailed implementation of GAMP is given in Table 1. By combining sparse reconstruction, VBI, and GAMP, the proposed approach improves localization accuracy while markedly reducing computational complexity.  Results and Discussions  The proposed FN-GAMP-CVBI algorithm shows clear improvements in both localization accuracy and computational efficiency. Complexity analysis indicates a substantial reduction in computational cost (Table 2). In terms of localization performance, FN-CVBI and FN-GAMP-CVBI outperform comparative methods, including LOFNS and FN-MSBL (Fig. 3, Fig. 4), particularly under low SNR conditions and with sufficient snapshots (Fig. 5, Fig. 6). The proposed methods also show strong capability in resolving closely spaced far-field sources (Fig. 7). Experimental validation using lake trial data confirms these findings, as reflected by sharper spectral peaks and fewer false peaks in the background noise of the Bearing Time Recording (BTR) results (Fig. 9). FN-CVBI achieves the highest accuracy in far-field DOA estimation and near-field localization. The computational time of FN-GAMP-CVBI is reduced by up to 95% compared with FN-MSBL (Table 4), demonstrating its suitability for real-time applications.  Conclusions  A sparse-reconstruction-based approach for mixed far-field and near-field source localization is presented by integrating sparse reconstruction with the GAMP-VBI framework. 
The proposed FN-GAMP-CVBI algorithm addresses the limitations of existing methods and achieves a balanced trade-off between localization accuracy and computational efficiency. Simulation results confirm superior performance, especially under low SNR conditions with sufficient snapshots, and experimental results further support the effectiveness of the approach. The low computational complexity and ability to handle mixed-source scenarios indicate that the proposed algorithm is well suited for real-time localization in complex environments.
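The covariance-domain model underlying FN-CVBI can be sketched in a few lines of numpy (a toy scenario with hypothetical parameter values; the pre-whitening step, the hierarchical sparse prior, and the GAMP iterations of the actual algorithm are omitted):

```python
import numpy as np

rng = np.random.default_rng(1)
M, T = 8, 200                     # sensors and snapshots (illustrative values)

# Toy array output: one narrowband far-field plane wave from 20° in white noise
steer = np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(20)))
s = rng.standard_normal(T) + 1j * rng.standard_normal(T)
X = np.outer(steer, s) + 0.1 * (rng.standard_normal((M, T))
                                + 1j * rng.standard_normal((M, T)))

R = X @ X.conj().T / T            # sample covariance matrix (Hermitian, M x M)
r = R.flatten(order="F")          # vectorize: the "covariance vector"
print(r.shape)                    # (64,) = M^2 measurements for sparse recovery
```

Averaging over T snapshots is what raises the effective SNR of the covariance vector relative to the raw array output; the sparse reconstruction then runs on `r` against a dictionary of far-field and near-field steering terms rather than on `X` directly.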
Group-based Sparse Vector Codes for Short-Packet Communications
ZHANG Xuewan, ZHANG Di, GU Bo
Available online, doi: 10.11999/JEIT251143
Abstract:
  Objective  Sparse Vector Codes (SVC) aim to construct sparse underdetermined linear systems and have attracted wide interest for short-packet Ultra-Reliable and Low-Latency Communications (URLLC) because of their simple implementation and reliable transmission. To guarantee system performance, short sparse vectors that can be transmitted using small-size random spreading codebooks are required. However, most existing sparse transformation schemes based on index modulation adopt a global selection strategy, where nonzero positions, to which transmission bits are mapped, are selected directly from the entire set of available positional resources in the sparse vector. Under high coding efficiency requirements, this strategy often leads to excessively long sparse vectors and a sharp degradation in transmission performance. To address this issue, a Group-based Sparse Vector Code (GSVC) scheme is proposed. Unlike the conventional global sparse mapping approach, GSVC divides index bits into groups and sequentially determines the nonzero positions for each group within a predefined sparse vector. This design enables positional resource sharing among all groups and generates compressed sparse vectors with higher positional resource utilization, thereby achieving better Block Error Rate (BLER) performance than conventional SVC schemes.  Methods  The proposed GSVC scheme partitions the total number of nonzero positions N into V groups. Within a single predefined sparse vector, each group sequentially selects its N/V nonzero positions through index modulation. To prevent position selection conflicts among groups, a resource supplementation and elimination mechanism is applied. This mechanism ensures that the selected positions are mutually exclusive and that each group maintains the same number of available positional resources throughout the selection process.
Given the sparsity of the constructed vector, a low-complexity sparse recovery algorithm is employed at the receiver. Accordingly, a GSVC decoder based on the Multipath Matching Pursuit (MMP) algorithm is designed. To enable accurate identification of the group affiliation associated with each nonzero position, GSVC adopts a multi-constellation mapping strategy for the nonzero elements. The receiver performs constellation matching by exploiting the unique characteristics of each constellation, thereby determining group affiliation and ensuring a high probability of successful decoding.  Results and Discussions  By enabling different groups to share positional resources through group-based nonzero position selection, GSVC effectively compresses the sparse vector and improves transmission reliability. Simulation results show that the GSVC decoder based on MMP significantly outperforms the decoder based on the Orthogonal Approximate Message Passing (OAMP) algorithm (Fig. 3). At lower modulation orders, GSVC achieves better BLER performance than existing schemes, including enhanced SVC, multi-rotation constellation-based SVC, and index-redefined SVC (Fig. 4 and Fig. 5). When the number of Orthogonal Frequency Division Multiplexing (OFDM) subcarriers is large, GSVC provides the best BLER performance among all compared schemes (Fig. 6). In addition, for a fixed number of nonzero entries per group, the BLER performance advantage of GSVC increases as the number of groups increases. A performance gain exceeding 1 dB over the second-best SVC scheme is observed at a BLER of 10⁻⁵ (Fig. 7). Compared with polar codes (Fig. 8), GSVC achieves better BLER performance without Cyclic Redundancy Check (CRC) assistance and even outperforms CRC-aided polar codes.  Conclusions  This paper proposes a GSVC scheme to address the excessive sparse vector length encountered in conventional index modulation-based SVC systems.
The central feature of GSVC is a grouped nonzero position selection mechanism that enables multiple groups to share positional resources within a predefined sparse vector, thereby reducing the overall vector length. A dedicated multi-constellation mapping design, together with well-defined resource allocation rules, ensures conflict-free and efficient utilization of positional resources. Simulation results demonstrate that (1) the GSVC decoder implemented using MMP significantly outperforms decoders based on the OAMP algorithm; (2) GSVC achieves superior BLER performance compared with enhanced SVC, multi-rotation constellation-based SVC, and index-redefined SVC schemes, particularly at lower modulation orders and with a large number of OFDM subcarriers; and (3) GSVC surpasses the BLER performance of CRC-aided polar codes without requiring CRC. Future work will focus on optimizing the grouping strategy and examining the transmission performance of SVC under imperfect channel estimation to improve robustness in practical communication systems.
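One way the grouped nonzero position selection with resource elimination could look in code is sketched below. This is an illustration under assumptions: the abstract does not specify the paper's exact supplementation rule or its bit-to-combination mapping, so the index decoding used here is a generic combinatorial unranking, and all parameter values are toy choices.

```python
def grouped_select(index_bits, L, V, n_per_group):
    """Sketch: split index bits into V groups; each group picks n_per_group
    positions from the positions still unused, so all groups share one
    length-L sparse vector without collisions."""
    free = list(range(L))
    chosen = []
    bits_per_group = len(index_bits) // V
    for g in range(V):
        chunk = index_bits[g * bits_per_group:(g + 1) * bits_per_group]
        idx = int("".join(map(str, chunk)), 2)
        # decode idx into n_per_group distinct positions drawn from `free`
        picks, avail = [], free[:]
        for _ in range(n_per_group):
            pos = idx % len(avail)
            idx //= len(avail)
            picks.append(avail.pop(pos))
        chosen.append(sorted(picks))
        for p in picks:
            free.remove(p)            # eliminate used positions for later groups
    return chosen

groups = grouped_select([1, 0, 1, 1, 0, 0, 1, 0], L=16, V=2, n_per_group=2)
print(groups)                         # two disjoint position pairs in [0, 16)
```

Because later groups draw from the positions the earlier groups left free, the V groups pack N = V·n_per_group nonzeros into a single length-L vector, which is the compression effect the scheme relies on.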
Research on GFRA Preamble Design and Active Device Detection Technology for Short-Packet Communication in LEO Satellite IoT
DAI Jianmei, ZHANG Mengchen, LI Keying, SU Qi, CHENG Ying, WANG Xianpeng, XU Rong
Available online  , doi: 10.11999/JEIT250609
Abstract:
  Objective  This study addresses preamble collision and high detection complexity in massive device random access for Low-Earth Orbit Satellite Internet of Things (LEO-IoT) short-packet communication, and overcomes the limitations of traditional random access schemes in preamble pool capacity and detection efficiency, thereby enabling highly reliable access for massive devices.  Methods  A Grant-Free Random Access (GFRA) scheme is adopted, and a three-pilot superimposed preamble structure with a cyclic prefix is constructed. The proposed preamble structure preserves time–frequency resource efficiency and further expands the pilot code pool capacity. To satisfy the detection requirements of superimposed preambles, a dynamic detection algorithm based on idle preamble search is proposed. This algorithm reduces computational complexity and improves detection accuracy.  Results and Discussions  Under the GFRA mode, a three-pilot superimposed preamble structure with a cyclic prefix is constructed (Fig. 3). The pilot code pool capacity is increased to 3.2 times that of traditional schemes, whereas time–frequency resource efficiency is maintained (Fig. 4, Fig. 5, Fig. 6). For superimposed preamble detection, a dynamic detection algorithm based on idle preamble search is proposed (Table 1). Compared with the traditional exhaustive search method, the proposed algorithm reduces computational complexity to 18.7% of that of the exhaustive scheme while maintaining a detection accuracy of 99.5% (Fig. 7). Theoretical analysis shows that the proposed scheme achieves a Signal-to-Interference-plus-Noise Ratio (SINR) gain of 3.8 dB at a Bit Error Rate (BER) of 10^-5. Simulation results indicate that the miss detection rate remains below 2% when the device activation rate exceeds 80% (Fig. 10). Compared with compressed sensing methods, the proposed algorithm provides a more favorable balance between detection accuracy and computational complexity. 
Its polynomial-level complexity improves practicality for real LEO-IoT systems (Fig. 13, Fig. 14).  Conclusions  The proposed superimposed preamble structure and dynamic detection algorithm effectively mitigate preamble collision, significantly reduce detection complexity, and achieve a clear SINR gain with a low miss detection rate. The scheme shows strong performance and robustness under high-load and asynchronous LEO-IoT access conditions, supporting its suitability for practical deployment.
Short-packet Covert Communication Design for Minimizing Age of Information under Non-ideal Channel Conditions
ZHU Kaiji, MA Ruiqian, LIN Zhi, MA Yue, WANG Yong, GUAN Xinrong, CAI Yueming
Available online  , doi: 10.11999/JEIT250836
Abstract:
  Objective  With the rapid development of mobile communication technologies and the widespread adoption of smart devices, the security and timeliness of information transmission are critical. Most existing studies on covert communication assume ideal channel conditions and long packet lengths, which are impractical for delay-sensitive applications. This paper addresses the problem of minimizing the average Covert Age of Information (CAoI) under non-ideal channel conditions caused by limited pilot symbols. The objective is to improve both timeliness and security in short-packet covert communication systems.  Methods  A system model is considered in which a transmitter sends short packets to a legitimate receiver under the surveillance of a warden. The effects of pilot length and transmit power on channel estimation error are characterized. Based on this analysis, closed-form expressions for the detection error probability and the average CAoI are derived. A joint optimization problem is then formulated to determine the optimal transmit power, total blocklength, and pilot-to-data ratio. This problem is solved using a golden-section search algorithm.  Results and Discussions  Numerical results show that an optimal total packet length and an optimal pilot-to-data ratio exist for minimizing the average CAoI (Fig. 3). The proposed joint optimization strategy significantly outperforms fixed-ratio schemes (Fig. 4). As the covertness constraint becomes stricter, the transmit power decreases, which requires longer pilot sequences to preserve channel estimation accuracy (Fig. 6(a)). The optimal total packet length is also shown to decrease as the covertness constraint is relaxed (Fig. 6(b)). Additionally, increasing the distance between Alice and Bob degrades the average CAoI performance due to poorer channel conditions (Fig. 5).  Conclusions  This study optimizes the average CAoI in short-packet covert communication systems with imperfect channel estimation. 
Closed-form expressions for covertness and CAoI are obtained, and a golden-section search method is applied to dynamically adjust the packet structure to minimize the average CAoI. Numerical results confirm that the optimized design outperforms fixed-allocation methods. The results further show that stricter covertness constraints require longer pilot sequences to compensate for reduced transmit power, providing useful design guidance for latency-sensitive covert wireless systems.
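The golden-section search named above is a standard derivative-free method for minimizing a unimodal objective such as the average CAoI over a packet-structure parameter. A minimal generic sketch (the true objective comes from the paper's closed-form CAoI expressions, which are not reproduced here):

```python
import math

def golden_section_min(f, a, b, tol=1e-6):
    """Minimize a unimodal function f on [a, b] by golden-section search,
    shrinking the bracket by the inverse golden ratio each iteration."""
    invphi = (math.sqrt(5) - 1) / 2          # 1/phi ~ 0.618
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # minimum lies in [a, d]; reuse c as the new d
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            # minimum lies in [c, b]; reuse d as the new c
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2
```

In the paper's setting, `f` would map a candidate blocklength or pilot-to-data ratio to the resulting average CAoI, and the returned minimizer gives the optimized packet structure.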
Photosensing Model and Circuit Design of Rod Cells Based on Memristors
SUN Jingru, MA Wenjing, WANG Chunhua, XUE Xiaoyong
Available online  , doi: 10.11999/JEIT250901
Abstract:
  Objective   Visual perception plays a critical role in artificial intelligence, robotics, and the Internet of Things. Although existing visual perception devices have achieved substantial progress, the widespread use of conventional CMOS circuit architectures still results in limitations such as slow sensing speed, complex structures, and high power consumption. In contrast, biological visual perception systems exhibit high response speed, low power consumption, and strong stability. Therefore, designing optical perception circuits inspired by biological visual systems has become an active research direction. Existing biologically inspired optical perception circuits are mainly based on the Leaky Integrate-and-Fire (LIF) model, which enables rapid and low-cost conversion of light intensity signals into spike signals. However, the LIF model only supports basic signal conversion and cannot adequately reproduce the working mechanisms and computational characteristics of biological visual neurons. Therefore, practical applications suffer from limited imaging quality, slow response, and weak adaptability. To address these issues, the structure and operating mechanism of human visual perception cells are investigated, a corresponding photosensing circuit is designed, and spiking camera schemes are proposed to achieve high-speed, low-power, and stable imaging.  Methods   The biological visual system provides valuable inspiration for bionic photosensing circuits due to its fast response, low power consumption, high stability, and strong adaptability. The biological mechanism of photoreceptor cells in the human visual system is analyzed from the perspective of ionic flow, and a mathematical photosensitivity model of rod cells is derived following the construction approach of the Hodgkin–Huxley (HH) model. Based on the closed states of ionic channels in rod cells, a memristor model is designed. 
Using the proposed memristor model and the mathematical model of photoreceptor cells, a rod-cell photosensing circuit is developed. Its adaptability, conversion speed, stability, and dynamic range are evaluated through simulation to verify effectiveness and bionic characteristics, and the results are compared with those of a photosensing circuit based on the LIF model. To further demonstrate practicality, the proposed rod-cell photosensing circuit is applied to a spiking camera, and its adaptability, speed, power consumption, error, and dynamic range are analyzed and compared with a spiking camera based on a simplified neuron photosensing circuit.  Results and Discussions   Based on the operating principles of photoreceptor cells in the human visual system, a photoreceptor cell model is proposed. Sodium-ion memristors and calcium-ion memristors are introduced to simulate sodium and calcium ion channels in photoreceptor cells, respectively, where the sodium-ion memristor is implemented as a tri-valued memristor. Using the proposed memristor model, a rod-cell photosensing circuit is designed. Under strong illumination, the circuit adapts to light intensity through resistance transitions of the sodium-ion memristor, reducing sensitivity and suppressing the influence of extreme illumination on normal lighting conditions, while maintaining fast conversion speed and a wide dynamic range. The rod-cell photosensing circuit is further combined with the signal conversion circuit to implement a spiking camera. Simulation results show that, compared with spiking cameras based on simplified neuron photosensing circuits and CMOS circuits, the imaging speed increases by 20% and 150%, respectively, while automatic adaptation to extreme illumination, low power consumption, high accuracy, and strong stability are achieved.  
Conclusions   Inspired by the operating mechanisms of photoreceptor cells in the visual system, a mathematical model of rod cells and a corresponding memristor model are proposed, and a rod-cell photosensing circuit based on memristors is designed. The circuit reproduces the hyperpolarization and adaptive processes observed in rod-cell photosensing. Through capacitor charge–discharge behavior and memristor resistance transitions, optical signals are converted into voltage signals whose amplitudes vary with light intensity, with higher illumination producing higher voltage amplitudes. Automatic amplitude regulation under strong illumination is achieved, thereby suppressing the influence of extreme light conditions. Compared with simplified neuron photosensing circuits, the proposed rod-cell photosensing circuit provides faster conversion speed, a wide dynamic range from 50 to 5 000 lx, self-adaptation, and improved stability. An intelligent optical sensor array is further constructed, and a spiking camera is implemented by combining the photosensing circuit with a signal conversion circuit and a time-window function. Simulation results confirm clearer imaging under strong background illumination and effective high-speed imaging for both stationary and rapidly moving objects. Compared with spiking cameras based on simplified neuron photosensing circuits and CMOS circuits, imaging speed is improved by 20% and 150%, respectively, while low power consumption, small error, and strong anti-interference capability are maintained.
Joint Mask and Multi-Frequency Dual Attention GAN Network for CT-to-DWI Image Synthesis in Acute Ischemic Stroke
ZHANG Zehua, ZHAO Ning, WANG Shuai, WANG Xuan, ZHENG Qiang
Available online  , doi: 10.11999/JEIT250643
Abstract:
  Objective  In the clinical management of Acute Ischemic Stroke (AIS), Computed Tomography (CT) and Diffusion-Weighted Imaging (DWI) serve complementary roles at different stages. CT is widely applied for initial evaluation due to its rapid acquisition and accessibility, but it has limited sensitivity in detecting early ischemic changes, which can result in diagnostic uncertainty. In contrast, DWI demonstrates high sensitivity to early ischemic lesions, enabling visualization of diffusion-restricted regions soon after symptom onset. However, DWI acquisition requires a longer time, is susceptible to motion artifacts, and depends on scanner availability and patient cooperation, thereby reducing its clinical accessibility. The limited availability of multimodal imaging data remains a major challenge for timely and accurate AIS diagnosis. Therefore, developing a method capable of rapidly and accurately generating DWI images from CT scans has important clinical significance for improving diagnostic precision and guiding treatment planning. Existing medical image translation approaches primarily rely on statistical image features and overlook anatomical structures, which leads to blurred lesion regions and reduced structural fidelity.  Methods  This study proposes a Joint Mask and Multi-Frequency Dual Attention Generative Adversarial Network (JMMDA-GAN) for CT-to-DWI image synthesis to assist in the diagnosis and treatment of ischemic stroke. The approach incorporates anatomical priors from brain masks and adaptive multi-frequency feature fusion to improve image translation accuracy. JMMDA-GAN comprises three principal modules: a mask-guided feature fusion module, a multi-frequency attention encoder, and an adaptive fusion weighting module. The mask-guided feature fusion module integrates CT images with anatomical masks through convolution, embedding spatial priors to enhance feature representation and texture detail within brain regions and ischemic lesions. 
The multi-frequency attention encoder applies Discrete Wavelet Transform (DWT) to decompose images into low-frequency global components and high-frequency edge components. A dual-path attention mechanism facilitates cross-scale feature fusion, reducing high-frequency information loss and improving structural detail reconstruction. The adaptive fusion weighting module combines convolutional neural networks and attention mechanisms to dynamically learn the relative importance of input features. By assigning adaptive weights to multi-scale features, the module selectively enhances informative regions and suppresses redundant or noisy information. This process enables effective integration of low- and high-frequency features, thereby improving both global contextual consistency and local structural precision.  Results and Discussions  Extensive experiments were performed on two independent clinical datasets collected from different hospitals to assess the effectiveness of the proposed method. JMMDA-GAN achieved Mean Squared Error (MSE) values of 0.0097 and 0.0059 on Clinical Dataset 1 and Clinical Dataset 2, respectively, outperforming state-of-the-art models and reducing MSE by 35.8% and 35.2% compared with ARGAN. The proposed network achieved Peak Signal-to-Noise Ratio (PSNR) values of 26.75 and 28.12, showing improvements of 30.7% and 7.9% over the best existing methods. For Structural Similarity Index (SSIM), JMMDA-GAN achieved 0.753 and 0.844, indicating superior structural preservation and perceptual quality. Visual analysis further demonstrates that JMMDA-GAN restores lesion morphology and fine texture features with higher fidelity, producing sharper lesion boundaries and improved structural consistency compared with other methods. Cross-center generalization and multi-center mixed experiments confirm that the model maintains stable performance across institutions, highlighting its robustness and adaptability in clinical settings. 
Parameter sensitivity analysis shows that the combination of Haar wavelet and four attention heads achieves an optimal balance between global structural retention and local detail reconstruction. Moreover, superpixel-based gray-level correlation experiments demonstrate that JMMDA-GAN exceeds existing models in both local consistency and global image quality, confirming its capacity to generate realistic and diagnostically reliable DWI images from CT inputs.  Conclusions  This study proposes a novel JMMDA-GAN designed to enhance lesion and texture detail generation by incorporating anatomical structural information. The method achieves this through three principal modules. (1) The mask-guided feature fusion module effectively integrates anatomical structure information, with particular optimization of the lesion region. The mask-guided network focuses on critical lesion features, ensuring accurate restoration of lesion morphology and boundaries. By combining mask and image data, the method preserves the overall anatomical structure while enhancing lesion areas, preventing boundary blurring and texture loss commonly observed in traditional approaches, thereby improving diagnostic reliability. (2) The multi-frequency feature fusion module jointly optimizes low- and high-frequency features to enhance image detail. This integration preserves global structural integrity while refining local features, producing visually realistic and high-fidelity images. (3) The adaptive fusion weighting module dynamically adjusts the learning strategy for frequency-domain features according to image content, enabling the network to manage texture variations and complex anatomical structures effectively, thereby improving overall image quality. Through the coordinated function of these modules, the proposed method enhances image realism and diagnostic precision. 
Experimental results demonstrate that JMMDA-GAN exceeds existing advanced models across multiple clinical datasets, highlighting its potential to support clinicians in the diagnosis and management of AIS.
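The Haar wavelet singled out in the parameter sensitivity analysis splits a signal into pairwise averages (low-frequency, global) and pairwise differences (high-frequency, edge) components, which is the decomposition the multi-frequency attention encoder feeds to its two attention paths. A one-dimensional illustrative sketch (the model operates on 2-D images; this 1-D version only shows the split and its perfect reconstruction):

```python
import math

def haar_dwt_1d(x):
    """One-level 1-D Haar transform: low-pass (scaled pairwise sums)
    and high-pass (scaled pairwise differences) subbands."""
    assert len(x) % 2 == 0, "Haar transform needs even length"
    s = 1 / math.sqrt(2)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def haar_idwt_1d(low, high):
    """Inverse one-level Haar transform: perfect reconstruction
    up to floating-point error."""
    s = 1 / math.sqrt(2)
    out = []
    for l, h in zip(low, high):
        out += [s * (l + h), s * (l - h)]
    return out
```

Applying the forward transform along rows and then columns yields the 2-D subbands that such encoders typically process.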
Modeling, Detection, and Defense Theories and Methods for Cyber-Physical Fusion Attacks in Smart Grid
WANG Wenting, TIAN Boyan, WU Fazong, HE Yunpeng, WANG Xin, YANG Ming, FENG Dongqin
Available online  , doi: 10.11999/JEIT250659
Abstract:
  Significance   Smart Grid (SG), the core of modern power systems, enables efficient energy management and dynamic regulation through cyber–physical integration. However, its high interconnectivity makes it a prime target for cyberattacks, including False Data Injection Attacks (FDIAs) and Denial-of-Service (DoS) attacks. These threats jeopardize the stability of power grids and may trigger severe consequences such as large-scale blackouts. Therefore, advancing research on the modeling, detection, and defense of cyber–physical attacks is essential to ensure the safe and reliable operation of SGs.  Progress   Significant progress has been achieved in cyber–physical security research for SGs. In attack modeling, discrete linear time-invariant system models effectively capture diverse attack patterns. Detection technologies are advancing rapidly, with physical-based methods (e.g., physical watermarking and moving target defense) complementing intelligent algorithms (e.g., deep learning and reinforcement learning). Defense systems are also being strengthened: lightweight encryption and blockchain technologies are applied to prevention, security-optimized Phasor Measurement Unit (PMU) deployment enhances equipment protection, and response mechanisms are being continuously refined.  Conclusions  Current research still requires improvement in attack modeling accuracy and real-time detection algorithms. Future work should focus on developing collaborative protection mechanisms between the cyber and physical layers, designing solutions that balance security with cost-effectiveness, and validating defense effectiveness through high-fidelity simulation platforms. This study establishes a systematic theoretical framework and technical roadmap for SG security, providing essential insights for safeguarding critical infrastructure.  
Prospects   Future research should advance in several directions: (1) deepening synergistic defense mechanisms between the information and physical layers; (2) prioritizing the development of cost-effective security solutions; (3) constructing high-fidelity information–physical simulation platforms to support research; and (4) exploring the application of emerging technologies such as digital twins and interpretable Artificial Intelligence (AI).
Physical Layer Key Generation Method for Integrated Sensing and Communication Systems
LIU Kexin, HUANG Kaizhi, PEI Xinglong, JIN Liang, CHEN Yajun
Available online  , doi: 10.11999/JEIT251034
Abstract:
  Objective  Integrated Sensing And Communication (ISAC) has become a central technology in Sixth-Generation (6G) wireless networks, enabling simultaneous data transmission and environmental sensing. However, the characteristics of ISAC systems, including highly directional sensing signals and the risk of sensitive information leakage to malicious sensing targets, create specific security challenges. Physical layer security provides lightweight methods to enhance confidentiality. In secure transmission, approaches such as artificial noise injection and beamforming can partially improve secrecy, although they may reduce sensing accuracy or communication efficiency. Their effect also depends on the quality advantage of legitimate channels over eavesdropping channels. For Physical Layer Key Generation (PLKG), existing work has only demonstrated basic feasibility. Most current schemes adopt a radar-centric design, which limits compatibility with communication protocols and restricts key generation rates. This paper proposes a PLKG method tailored for ISAC systems. It aims to maximize the Sum Key Generation Rate (SKGR) under sensing accuracy constraints through a Twin Delayed Deep Deterministic policy gradient (TD3)-based joint communication and sensing beamforming algorithm, thereby improving the security performance of ISAC systems.  Methods  A Multiple-Input Multiple-Output (MIMO) ISAC system is considered, where a base station (Alice) equipped with multiple antennas communicates with single-antenna users (Bobs) and senses a malicious target (Eve). The system operates under a Time-Division Duplex (TDD) protocol to leverage channel reciprocity. A PLKG protocol designed for ISAC systems is developed, including channel estimation, joint communication and sensing beamforming, and key generation. The SKGR is derived in closed form, and sensing accuracy is evaluated using the Cramér-Rao Bound (CRB). 
To maximize the SKGR under CRB constraints, a non-convex optimization problem for the joint design of communication and sensing beamforming matrices is formulated. Given its NP-hardness, an algorithm based on TD3 is proposed. TD3 employs dual critic networks to reduce overestimation, delayed policy updates to enhance stability, and target policy smoothing to improve robustness. The state includes channel state information, the actions correspond to beamforming matrices, and the reward function combines SKGR, CRB, and power constraints.  Results and Discussions  Simulation results confirm the effectiveness of the proposed design. The TD3-based algorithm achieves a stable SKGR of 18.5 bits/channel use after training (Fig. 4), outperforming benchmark schemes such as Deep Deterministic Policy Gradient (DDPG), greedy search, and random algorithms. The SKGR increases monotonically with transmit power because of reduced noise interference (Fig. 5). Increasing the number of antennas also improves SKGR, although the gain diminishes as power per antenna decreases. The scheme maintains stable SKGR across different distances to the eavesdropper (Fig. 6), demonstrating the robustness of PLKG against eavesdropping attacks. The proposed algorithm manages the complex optimization problem effectively and adapts to dynamic system conditions, offering a practical approach for secure ISAC systems.  Conclusions  This paper presents a PLKG method for ISAC systems. The proposed protocol generates consistent keys between the base station and communication users. The SKGR maximization problem with sensing constraints is solved using a TD3-based algorithm that jointly optimizes communication and sensing beamforming matrices. Simulation results show that the method outperforms benchmark schemes, with significant gains in SKGR and adaptability to system conditions. The study establishes a basis for integrating PLKG into ISAC to strengthen security without reducing sensing performance. 
Future work will examine real-time implementation and scalability in large networks.
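The reward described for the TD3 agent combines SKGR with the CRB and power constraints. A heavily hedged sketch of one common way to encode such a constrained objective as a penalized reward (the function name, penalty form, and weights are assumptions, not the paper's exact design):

```python
def beamforming_reward(skgr, crb, power, crb_max, p_max,
                       lam_crb=10.0, lam_pow=10.0):
    """Illustrative penalized reward for a constrained RL agent:
    maximize SKGR while penalizing violations of the Cramer-Rao
    bound (sensing accuracy) and the transmit-power budget.
    Penalty weights lam_crb and lam_pow are assumed values."""
    reward = skgr
    if crb > crb_max:
        # sensing-accuracy constraint violated: subtract a penalty
        reward -= lam_crb * (crb - crb_max)
    if power > p_max:
        # power budget exceeded: subtract a penalty
        reward -= lam_pow * (power - p_max)
    return reward
```

With this shaping, feasible beamforming actions are rewarded purely by their SKGR, while infeasible ones are pushed back toward the constraint set.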
Multi-Scale Region of Interest Feature Fusion for Palmprint Recognition
MA Yuxuan, ZHANG Feifei, LI Guanghui, TANG Xin, DONG Zhengyang
Available online  , doi: 10.11999/JEIT250940
Abstract:
  Objective  Accurate localization of the Region Of Interest (ROI) is a prerequisite for high-precision palmprint recognition. In contactless and uncontrolled application scenarios, complex background illumination and diverse hand postures frequently cause ROI localization offsets. Most existing deep learning-based recognition methods rely on a single fixed-size ROI as input. Although some approaches adopt multi-scale convolution kernels, fusion at the ROI level is not performed, which makes these methods highly sensitive to localization errors. Therefore, small deviations in ROI extraction often result in severe performance degradation, which restricts practical deployment. To overcome this limitation, a Multi-scale ROI Feature Fusion Mechanism is proposed, and a corresponding model, termed ROI3Net, is designed. The objective is to construct a recognition system that is inherently robust to localization errors by integrating complementary information from multiple ROI scales. This strategy reinforces shared intrinsic texture features while suppressing scale-specific noise introduced by positioning inaccuracies.  Methods  The proposed ROI3Net adopts a dual-branch architecture consisting of a Feature Extraction Network and a lightweight Weight Prediction Network (Fig. 4). The Feature Extraction Network employs a sequence of Multi-Scale Residual Blocks (MSRBs) to process ROIs at three progressive scales (1.00×, 1.25×, and 1.50×) in parallel. Within each MSRB, dense connections are applied to promote feature reuse and reduce information loss (Eq. 3). Convolutional Block Attention Modules (CBAMs) are incorporated to adaptively refine features in both the channel and spatial dimensions. The Weight Prediction Network is implemented as an end-to-end lightweight module. 
It takes raw ROI images as input and processes them using a serialized convolutional structure (Conv2d-BN-GELU-MaxPool), followed by a Multi-Layer Perceptron (MLP) head, to predict a dynamic weight vector for each scale. This subnetwork is optimized for efficiency, containing 2.38 million parameters, which accounts for approximately 6.2% of the total model parameters, and requiring 103.2 MFLOPs, which corresponds to approximately 2.1% of the total computational cost. The final feature representation is obtained through a weighted summation of multi-scale features (Eq. 1 and Eq. 2), which mathematically maximizes the information entropy of the fused feature vector.  Results and Discussions  Experiments are conducted on six public palmprint datasets: IITD, MPD, NTU-CP, REST, CASIA, and BMPD. Under ideal conditions with accurate ROI localization, ROI3Net demonstrates superior performance compared with state-of-the-art single-scale models. For instance, a Rank-1 accuracy of 99.90% is achieved on the NTU-CP dataset, and a Rank-1 accuracy of 90.17% is achieved on the challenging REST dataset (Table 1). Model robustness is further evaluated by introducing a random 10% localization offset. Under this condition, conventional models exhibit substantial performance degradation. For example, the Equal Error Rate (EER) of the CO3Net model on NTU-CP increases from 2.54% to 15.66%. In contrast, ROI3Net maintains stable performance, with the EER increasing only from 1.96% to 5.01% (Fig. 7, Table 2). The effect of affine transformations, including rotation (±30°) and scaling (0.85~1.15×), is also analyzed. Rotation causes feature distortion because standard convolution operations lack rotation invariance, whereas the proposed multi-scale mechanism effectively compensates for translation errors by expanding the receptive field (Table 3). 
Generalization experiments further confirm that embedding this mechanism into existing models, including CCNet, CO3Net, and RLANN, significantly improves robustness (Table 6). In terms of efficiency, although the theoretical computational load increases by approximately 150%, the actual GPU inference time increases by only about 20% (6.48 ms) because the multi-scale branches are processed independently and in parallel (Table 7).  Conclusions  A Multi-scale ROI Feature Fusion Mechanism is presented to reduce the sensitivity of palmprint recognition systems to localization errors. By employing a lightweight Weight Prediction Network to adaptively fuse features extracted from different ROI scales, the proposed ROI3Net effectively combines fine-grained texture details with global semantic information. Experimental results confirm that this approach significantly improves robustness to translation errors by recovering truncated texture information, whereas the efficient design of the Weight Prediction Network limits computational overhead. The proposed mechanism also exhibits strong generalization ability when integrated into different backbone networks. This study provides a practical and resilient solution for palmprint recognition in unconstrained environments. Future work will explore non-linear fusion strategies, such as graph neural networks, to further exploit cross-scale feature interactions.
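The adaptive fusion step described above, in which a lightweight head predicts one weight per ROI scale and the final descriptor is their weighted sum, can be sketched as follows. This is an illustrative stand-in for the abstract's Eq. 1 and Eq. 2, not the paper's exact formulation; the softmax normalization is an assumed detail.

```python
import math

def fuse_multiscale(features, logits):
    """Illustrative adaptive fusion: the weight-prediction head emits
    one logit per ROI scale (e.g. 1.00x, 1.25x, 1.50x), softmax turns
    the logits into fusion weights, and the output is the weighted sum
    of the per-scale feature vectors."""
    m = max(logits)
    exp = [math.exp(z - m) for z in logits]          # numerically stable softmax
    total = sum(exp)
    w = [e / total for e in exp]
    dim = len(features[0])
    return [sum(w[s] * features[s][i] for s in range(len(features)))
            for i in range(dim)]
```

With equal logits the three scales contribute equally; in the trained model the weights shift toward the scale least affected by the localization offset.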
Tri-Frequency Wearable Antenna Loaded with Artificial Magnetic Conductors
JIN Bin, ZHANG Jialin, DU Chengzhu, CHU Jun
Available online  , doi: 10.11999/JEIT251050
Abstract:
A tri-band wearable antenna based on an Artificial Magnetic Conductor (AMC) is designed for on-body wireless applications. The design objective is to achieve multi-band operation with enhanced radiation characteristics and reduced electromagnetic exposure under wearable conditions. The antenna adopts a tri-frequency monopole with a trident structure, while the AMC unit employs a three-layer square-ring configuration. Both the antenna and the AMC are fabricated on a semiflexible Rogers 4003 substrate. A 4 × 5 AMC array is positioned on the back of the antenna, forming an integrated structure that improves radiation directionality and suppresses backward radiation. The integrated antenna exhibits measured operating bandwidths of 2.38~2.52 GHz, 3.30~3.86 GHz, and 5.54~7.86 GHz. These frequency ranges cover the ISM band (2.400~2.4835 GHz), the 5G n78 band (3.30~3.80 GHz), and the 5G/WiFi 5.8 GHz band (5.725~5.875 GHz). The measured gains at 2.4 GHz, 3.5 GHz, and 5.8 GHz correspond to improvements of 5.3 dB, 4.6 dB, and 2.2 dB compared with the unloaded antenna. The front-to-back ratio improves by 19.8 dB, 16.7 dB, and 12.4 dB relative to the antenna without the AMC. The AMC reflector effectively reduces the Specific Absorption Rate (SAR), with the maximum value maintained below 0.025 W/kg, which is lower than the limits specified by the U.S. Federal Communications Commission and the European Telecommunications Standards Institute. Antenna performance is further evaluated when attached to the human chest, back, and thigh, and the measured results indicate stable operation, supporting safe and flexible wearable applications.
An EEG Emotion Recognition Model Integrating Memory and Self-attention Mechanisms
LIU Shanrui, BI Yingzhou, HUO Leigang, GAN Qiujing, ZHOU Shuheng
Available online  , doi: 10.11999/JEIT250737
Abstract:
  Objective  ElectroEncephaloGraphy (EEG) is a noninvasive technique for recording neural signals and provides rich emotional and cognitive information for brain science research and affective computing. Although Transformer-based models demonstrate strong global modeling capability in EEG emotion recognition, their multi-head self-attention mechanisms do not reflect the characteristics of brain-generated signals that exhibit a forgetting effect. In human cognition, emotional or cognitive states from distant time points gradually decay, whereas existing Transformer-based approaches emphasize temporal relevance only and neglect this forgetting behavior. This limitation reduces recognition performance. Therefore, a model is designed to account for both temporal relevance and the intrinsic forgetting effect of brain activity.  Methods  A novel EEG emotion recognition model, termed Memory Self-Attention (MSA), is proposed by embedding a memory-based forgetting mechanism into the standard self-attention framework. The MSA mechanism integrates global semantic modeling with a biologically inspired memory decay component. For each attention head, a memory forgetting score is learned through two independent linear decay curves to represent natural attenuation over time. These scores are combined with conventional attention weights so that temporal relationships are adjusted by distance-aware forgetting behavior. This design improves performance with a negligible increase in model parameters and computational cost. An Aggregated Convolutional Neural Network (ACNN) is first applied to extract spatiotemporal features across EEG channels. The MSA module then captures global dependencies and memory-aware interactions. The refined representations are finally passed to a classification head to generate predictions.  Results and Discussions  The proposed model is evaluated on several benchmark EEG emotion recognition datasets. 
On the DEAP binary classification task, classification accuracies of 98.87% for valence and 98.30% for arousal are achieved. On the SEED three-class task, an accuracy of 97.64% is obtained, and on the SEED-IV four-class task, the accuracy reaches 95.90%. These results (Figs. 3–5, Tables 3–5) exceed those of most mainstream methods, indicating the effectiveness and robustness of the proposed approach across different datasets and emotion classification settings.  Conclusions  An effective and biologically informed method for EEG-based emotion recognition is presented by incorporating a memory forgetting mechanism into a Transformer architecture. The proposed MSA model captures both temporal correlations and forgetting characteristics of brain signals, providing a lightweight and accurate solution for multi-class emotion recognition. Experimental results confirm its strong performance and generalizability.
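The forgetting mechanism summarized above, in which a learned decay score modulates standard attention weights by temporal distance, can be sketched as follows. This is an illustrative NumPy sketch with a single hand-set linear decay rate, not the paper's two learned decay curves per head:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def memory_self_attention(Q, K, V, decay_rate=0.1):
    """Scaled dot-product attention with a distance-aware forgetting score.

    A linear decay over temporal distance (a stand-in for the paper's
    learned decay curves) down-weights attention to distant time steps.
    """
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                          # (T, T) temporal relevance
    dist = np.abs(np.arange(T)[:, None] - np.arange(T)[None, :])
    forget = np.clip(1.0 - decay_rate * dist, 0.0, 1.0)    # linear memory decay
    weights = softmax(scores, axis=-1) * forget            # combine relevance and forgetting
    weights = weights / weights.sum(axis=-1, keepdims=True)  # renormalize rows
    return weights @ V

rng = np.random.default_rng(0)
T, d = 6, 4
Q, K, V = rng.standard_normal((3, T, d))
out = memory_self_attention(Q, K, V)
print(out.shape)  # (6, 4)
```

Note that the decay adds no extra parameters beyond the rate itself, consistent with the abstract's claim of negligible overhead.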
Cross-modal Retrieval Enhanced Energy-efficient Multimodal Federated Learning in Wireless Networks
LIU Jingyuan, MA Ke, XU Runchen, CHANG Zheng
Available online  , doi: 10.11999/JEIT251221
Abstract:
  Objective  Multimodal Federated Learning (MFL) uses complementary information from multiple modalities, yet in wireless edge networks it is restricted by limited energy and frequent missing modalities because many clients store only images or only reports. This study presents Cross-modal Retrieval Enhanced Energy-efficient Multimodal Federated Learning (CREEMFL), which applies selective completion and joint communication–computation optimization to reduce training energy under latency and wireless constraints.  Methods  CREEMFL completes part of the incomplete samples by querying a public multimodal subset, and processes the remaining samples through zero padding. Each selected user downloads the global model, performs image-to-text or text-to-image retrieval, conducts local multimodal training, and uploads model updates for aggregation. An energy–delay model couples local computation and wireless communication and treats the required number of global rounds as a function of retrieval ratios. Based on this model, an energy minimization problem is formulated and solved using a two-layer algorithm with an outer search over retrieval ratios and an inner optimization of transmission time, Central Processing Unit (CPU) frequency, and transmit power.  Results and Discussions  Simulations on a single-cell wireless MFL system show that increasing the ratio of completing text from images improves test accuracy and reduces total energy. In contrast, a large ratio of completing images from text provides limited accuracy gain but increases energy consumption (Fig. 3, Fig. 4). Compared with four representative baselines, CREEMFL achieves shorter completion time and lower total energy across a wide range of maximum average transmit powers (Fig. 5, Fig. 6). For CREEMFL, increased system bandwidth further reduces completion time and energy consumption (Fig. 7, Fig. 8). 
Under different user modality compositions, CREEMFL also attains higher test accuracy than local training, zero padding, and cross-modal retrieval without energy optimization (Fig. 9).  Conclusions  CREEMFL integrates selective cross-modal retrieval and joint communication–computation optimization for energy-efficient MFL. By treating retrieval ratios as variables and modeling their effect on global convergence rounds, it captures the coupling between per-round costs and global training progress. Simulations verify that CREEMFL reduces training completion time and total energy while preserving classification accuracy in resource-constrained wireless edge networks.
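The two-layer structure of the optimization above (outer search over retrieval ratios, inner resource allocation per candidate ratio) can be illustrated with a toy model. All cost functions and constants below (`required_rounds`, `inner_energy`, the deadline, and the cycle counts) are invented placeholders, not the paper's derived expressions:

```python
import numpy as np

def required_rounds(rho, base=100.0):
    """Toy convergence model: required global rounds shrink as more
    incomplete samples are retrieved and completed (ratio rho)."""
    return base / (1.0 + 2.0 * rho)

def inner_energy(rho, deadline=2.0, cycles=1e9, kappa=1e-28, p=0.5, bits=1e6, rate=1e6):
    """Inner problem, simplified: the per-round energy at the slowest CPU
    frequency that still meets the per-round latency deadline."""
    t_tx = bits / rate                                 # fixed-rate upload time
    f_min = cycles * (1.0 + rho) / (deadline - t_tx)   # slowest feasible CPU frequency
    e_cmp = kappa * f_min**2 * cycles * (1.0 + rho)    # computation energy at f_min
    return e_cmp + p * t_tx                            # plus transmission energy

def creemfl_search(rhos=np.linspace(0, 1, 21)):
    """Outer grid search over the retrieval ratio."""
    totals = [required_rounds(r) * inner_energy(r) for r in rhos]
    i = int(np.argmin(totals))
    return rhos[i], totals[i]

rho_star, e_star = creemfl_search()
print(rho_star, e_star)
```

The toy model reproduces the qualitative trade-off: a larger retrieval ratio cuts the number of rounds but raises the per-round cost, so the total energy is minimized at an interior ratio.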
One-Pass Architectural Synthesis for Continuous-Flow Microfluidic Biochips Based on Deep Reinforcement Learning
LIU Genggeng, JIAO Xinyue, PAN Youlin, HUANG Xing
Available online  , doi: 10.11999/JEIT251058
Abstract:
Continuous-Flow Microfluidic Biochips (CFMBs) are widely applied in biomedical research because of miniaturization, high reliability, and low sample consumption. As integration density increases, design complexity significantly rises. Conventional stepwise design methods treat binding, scheduling, layout, and routing as separate stages, with limited information exchange across stages, which leads to reduced solution quality and extended design cycles. To address this limitation, a one-pass architectural synthesis method for CFMBs is proposed based on Deep Reinforcement Learning (DRL). Graph Convolutional Networks (GCNs) are used to extract state features, capturing structural characteristics of operations and their relationships. Proximal Policy Optimization (PPO), combined with the A* algorithm and list scheduling, ensures rational layout and routing while providing accurate information for operation scheduling. A multiobjective reward function is constructed by normalizing and weighting biochemical reaction time, total channel length, and valve count, enabling efficient exploration of the decision space through policy gradient updates. Experimental results show that the proposed method achieves a 2.1% reduction in biochemical reaction time, a 21.3% reduction in total channel length, and a 65.0% reduction in valve count on benchmark test cases, while maintaining feasibility for larger-scale chips.  Objective  CFMBs have gained sustained attention in biomedical applications because of miniaturization, high reliability, and low sample consumption. With increasing integration density, design complexity escalates substantially. Traditional stepwise design methods often yield suboptimal solutions, extended design cycles, and feasibility limitations for large-scale chips. To address these challenges, a one-pass architectural synthesis framework is proposed that integrates DRL to achieve coordinated optimization of binding, scheduling, layout, and routing.  
Methods  All CFMB design tasks are integrated into a unified optimization framework formulated as a Markov decision process. The state space includes device binding information, device locations, operation priorities, and related parameters, whereas the action space adjusts device placement, operation-to-device binding, and operation priority. High-dimensional state features are extracted using GCNs. PPO is applied to iteratively update policies. The reward function accounts for biochemical reaction time, total flow-channel length, and the number of additional valves. These metrics are evaluated using the A* algorithm and list scheduling, normalized, and weighted to balance trade-offs among objectives.  Results and Discussions  Based on the current state and candidate actions, architectural solutions are generated iteratively through PPO-guided policy updates combined with the A* algorithm and list scheduling. The defined reward function enables the generation of CFMB architectures with improved overall quality. Experimental results show an average reduction of 2.1% in biochemical reaction time, an average reduction of 21.3% in total flow-channel length, with a maximum reduction of 57.1% in the ProteinSplit benchmark, and an average reduction of 65.0% in additional valve count compared with existing methods. These improvements reduce manufacturing cost and operational risk.  Conclusions  A one-pass architectural synthesis method for CFMBs based on DRL is proposed to address flow-layer design challenges. By applying GCN-based state feature extraction and PPO-based policy optimization, the multiobjective design problem is transformed into a sequential decision-making process that enables joint optimization of binding, scheduling, layout, and routing. 
Experimental results obtained from multiple benchmark test cases confirm improved performance in biochemical reaction completion time, total channel length, and valve count, while preserving scalability for larger chip designs.
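The multiobjective reward described above, built by normalizing and weighting reaction time, channel length, and valve count, might look like the following sketch; the weights and reference scales are assumptions for illustration, not the paper's values:

```python
def cfmb_reward(time_s, channel_len, valves, weights=(0.4, 0.3, 0.3),
                ref=(100.0, 500.0, 40.0)):
    """Multiobjective reward sketch: normalize each metric by an
    illustrative reference scale, then take the weighted negative sum,
    so a lower-cost architecture earns a higher reward."""
    metrics = (time_s, channel_len, valves)
    cost = sum(w * m / r for w, m, r in zip(weights, metrics, ref))
    return -cost

better = cfmb_reward(90, 400, 26)   # dominates on all three metrics
worse = cfmb_reward(100, 500, 40)
print(better > worse)  # True: the dominating architecture earns a higher reward
```

Normalization keeps the three metrics, which live on very different scales (seconds, millimeters, counts), from drowning each other out in the policy-gradient signal.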
High-Efficiency Side-Channel Analysis: From Collaborative Denoising to Adaptive B-Spline Dimension Reduction
LUO Yuling, XU Haiyang, OUYANG Xue, FU Qiang, QIN Sheng, LIU Junxiu
Available online  , doi: 10.11999/JEIT251047
Abstract:
  Objective  The performance of side-channel attacks is often constrained by the low signal-to-noise ratio of raw power traces, the masking of local leakage by redundant high-dimensional data, and the reliance on empirically chosen preprocessing parameters. Existing studies typically optimize individual stages, such as denoising or dimensionality reduction, in isolation, lack a unified framework, and fail to balance signal-to-noise ratio enhancement with the preservation of local leakage features. A unified analysis framework is therefore proposed to integrate denoising, adaptive parameter selection, and dimensionality reduction while preserving local leakage characteristics. Through coordinated optimization of these components, both the efficiency and robustness of side-channel attacks are improved.  Methods  Based on the similarity of power traces corresponding to identical plaintexts and the local approximation properties of B-splines, a side-channel analysis method combining collaborative denoising and Adaptive B-Spline Dimension Reduction (ABDR) is presented. First, a Collaborative Denoising Framework (CDF) is constructed, in which high-quality traces are selected using a plaintext-mean template, and targeted denoising is performed via singular value decomposition guided by a singular-value template. Second, a Neighbourhood Asymmetry Clustering (NAC) method is applied to adaptively determine key thresholds within the CDF. Finally, an ABDR algorithm is proposed, which allocates knots non-uniformly according to the variance distribution of power traces, thereby enabling efficient data compression while preserving critical local leakage features.  Results and Discussions  Experiments conducted on two datasets based on 8-bit AVR (OSR2560) and 32-bit ARM Cortex-M4 (OSR407) architectures demonstrate that the CDF significantly enhances the signal-to-noise ratio, with improvements of 60% on OSR2560 (Fig. 2) and 150% on OSR407 (Fig. 4). 
The number of power traces required for successful key recovery is reduced from 3 000/2 400 to 1 200/1 500 for the two datasets, respectively (Figs. 3 and 5). Through adaptive threshold selection in the CDF, NAC achieves faster and more stable guessing-entropy convergence than fixed-threshold and K-means-based strategies, which enhances overall robustness (Fig. 6). The ABDR algorithm places knots densely in high-variance leakage regions and sparsely in low-variance regions. While maintaining a high attack success rate, it reduces the data dimensionality from 5 000 and 5 500 to 1 000 and 500, respectively, corresponding to a compression rate of approximately 80%. At the optimal dimensionality (Fig. 7), the correlation coefficients of the correct key reach 0.186 0 on OSR2560 and 0.360 5 on OSR407, both exceeding those obtained using other dimensionality reduction methods. These results indicate superior local information retention and attack efficiency (Tables 3 and 4).  Conclusions  The results confirm that the proposed CDF substantially improves the signal-to-noise ratio of power traces, while NAC enables adaptive parameter selection and enhances robustness. Through accurate local modeling, ABDR effectively alleviates the trade-off between high-dimensional data reduction and the preservation of critical leakage information. Comprehensive experimental validation shows that the integrated framework addresses key challenges in side-channel analysis, including low signal-to-noise ratio, redundancy-induced information masking, and dependence on empirical parameters, and provides a practical and scalable solution for real-world attack scenarios.
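The variance-driven knot allocation behind ABDR can be sketched as follows: knots are placed at equal quantiles of the cumulative across-trace variance, so high-variance (likely leakage) regions receive dense knots and flat regions receive few. The quantile rule is one plausible realization of non-uniform allocation, not necessarily the paper's exact scheme:

```python
import numpy as np

def abdr_knots(traces, n_knots=20):
    """Allocate B-spline knots non-uniformly: dense where the across-trace
    variance of the power samples is high, sparse elsewhere. Knots sit at
    equally spaced quantiles of the cumulative variance distribution."""
    var = traces.var(axis=0)
    cdf = np.cumsum(var) / var.sum()
    targets = np.linspace(0, 1, n_knots, endpoint=False) + 0.5 / n_knots
    return np.searchsorted(cdf, targets)     # sample indices chosen as knots

rng = np.random.default_rng(1)
traces = rng.normal(0, 0.1, (50, 1000))
traces[:, 400:420] += rng.normal(0, 2.0, (50, 20))  # synthetic leakage window
knots = abdr_knots(traces, n_knots=20)
in_window = int(np.sum((knots >= 400) & (knots < 420)))
print(in_window)  # most of the 20 knots land inside the high-variance window
```

On this synthetic example the 20-sample leakage window carries almost all of the variance, so nearly all knots concentrate there, which is exactly the local-feature preservation the abstract describes.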
Coalition Formation Game based User and Networking Method for Status Update Satellite Internet of Things
GAO Zhixiang, LIU Aijun, HAN Chen, ZHANG Senbai, LIN Xin
Available online  , doi: 10.11999/JEIT250838
Abstract:
  Objective  Satellite communication has become a major focus in the development of next-generation wireless networks due to its advantages of wide coverage, long communication distance, and high flexibility in networking. Short-packet communication represents a critical scenario in the Satellite Internet of Things (S-IoT). However, research on the status update problem for massive users remains limited. It is necessary to design reasonable user-networking schemes to reconcile massive user access demands with limited communication resources. In addition, under the condition of large-scale user access, the design of user-networking schemes with low complexity remains a key research challenge. This study presents a solution for status updates in S-IoT based on dynamic orthogonal access for massive users.  Methods  In the S-IoT, a state update model for user orthogonal dual-layer access is established. A dual-layer networking scheme is proposed in which users dynamically allocate bandwidth to access the base station, and the base station adopts time-slot polling to access the satellite. The closed-form expression of the average Age of Information (aAoI) for users is derived based on short-packet communication theory, and a simplified approximate expression is further obtained under high signal-to-noise ratio conditions. Subsequently, a distributed Dual-layer Coalition Formation Game User-base Station-Satellite Networking (DCFGUSSN) algorithm is proposed based on the coalition formation game framework.  Results and Discussions  The approximate aAoI expression effectively reduces computational complexity. An exact potential game formulation is used to demonstrate that the proposed DCFGUSSN algorithm achieves stable networking formation. Simulation results verify the correctness of the theoretical analysis of user aAoI in the proposed state update model (Fig. 5). 
The results further indicate that with an increasing number of iterations, the user aAoI gradually decreases and eventually converges (Fig. 6). Compared with other access schemes, the proposed dual-layer access scheme achieves a lower aAoI (Figs. 7–9).  Conclusions  This study investigates the networking problem of massive users assisted by base stations in the status update S-IoT. A dynamic dual-layer user access framework and the corresponding status update model are first established. Based on this framework, the DCFGUSSN algorithm is proposed to reduce user aAoI. Theoretical and simulation results show strong consistency, and the proposed algorithm demonstrates significant performance improvement compared with traditional algorithms.
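A coalition formation game of this kind typically iterates a switch rule until no user can unilaterally improve its own utility. The toy sketch below uses coalition size as a crude stand-in for the aAoI cost of time-slot polling (a larger coalition means a longer polling cycle); the actual DCFGUSSN utility is the derived aAoI expression:

```python
def coalition_formation(n_users=12, n_stations=3):
    """Toy switch-rule coalition formation: each user's age-of-information
    proxy grows with its coalition size, so a user switches base stations
    whenever that strictly lowers its own cost; the process terminates at
    a stable partition where no unilateral switch helps."""
    assign = [0] * n_users                 # all users start at station 0
    changed = True
    while changed:
        changed = False
        for u in range(n_users):
            sizes = [assign.count(s) for s in range(n_stations)]
            best = min(range(n_stations),
                       key=lambda s: sizes[s] + (0 if assign[u] == s else 1))
            if best != assign[u] and sizes[best] + 1 < sizes[assign[u]]:
                assign[u] = best           # strict improvement: switch coalition
                changed = True
    return [assign.count(s) for s in range(n_stations)]

print(coalition_formation())  # [4, 4, 4]: a balanced, stable partition
```

Because every switch strictly lowers the switching user's cost (and the sum of squared coalition sizes), the loop is guaranteed to terminate, which mirrors the exact-potential-game argument for stability in the abstract.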
Age of Information for Energy Harvesting-Driven LoRa Short-Packet Communication Networks
XIAO Shuyu, SUN Xinghua, YUAN Anshan, ZHAN Wen, CHEN Xiang
Available online  , doi: 10.11999/JEIT250814
Abstract:
  Objective  In short-packet communication scenarios for the Industrial Internet of Things (IIoT), devices operate under stringent energy constraints, whereas certain applications require timely data delivery, which makes real-time performance difficult to guarantee. To address this issue, this study analyzes information freshness in Energy Harvesting (EH) networks and examines the effects of energy storage capacity, random access strategies, and packet block length on the Age of Information (AoI). The objective is to provide effective optimization guidelines for the design of practical IIoT communication systems.  Methods  An accurate system model is established based on short-packet communication theory, random access mechanisms, and EH models. The charging and discharging processes of the energy queue are characterized as a Markov chain, from which the steady-state distribution of energy states is derived, followed by a general expression for the average AoI. A mathematical optimization problem is then formulated to minimize the average AoI. To improve practical applicability, two extreme battery-capacity scenarios are considered. For the minimum battery capacity case, a closed-form analytical solution for the optimal packet generation probability is obtained. For the ideal infinite battery capacity case, the packet generation probability and packet block length are jointly optimized, yielding closed-form optimal solutions for both parameters. Extensive simulations are conducted to evaluate the average AoI under different network parameter settings and to verify the effectiveness of the proposed optimization strategies.  Results and Discussions  An analytical expression for the average AoI is derived, and its optimization is investigated under two extreme battery-capacity conditions. For the minimum battery capacity case, the optimal packet generation probability balances update frequency and channel collision (Fig. 5). 
As the network size increases, the optimal packet generation probability decreases, which significantly improves the average AoI (Theorem 1; Fig. 6). For the ideal infinite battery capacity case, both packet block length and packet generation probability affect the average AoI (Fig. 7). With a fixed packet generation probability, optimizing the packet block length reduces the AoI, which indicates the existence of an optimal block length that balances transmission reliability and energy consumption. When the packet block length is fixed, a low packet generation probability leads to infrequent updates and increased delay, whereas a high probability increases collision in the Energy-Sufficient Regime (ESR) but enables more efficient utilization of energy and channel resources in the Energy-Limited Regime (ELR). Joint optimization of the packet block length and packet generation probability is consistent with the solution obtained via exhaustive search (Theorem 2; Fig. 8). The optimal packet block length increases with network size. In the ELR, the optimal packet generation probability remains equal to one, whereas it decreases with network size to balance update frequency and collision risk (Fig. 9, Fig. 10). In addition, the average AoI varies with the energy arrival rate, which reveals the effects of battery capacity and packet generation probability on overall system performance (Fig. 11).  Conclusions  For the minimum battery capacity case, the average AoI is minimized when the packet generation probability is set to its theoretical optimal value. Under ideal infinite battery capacity, both the packet generation probability and the packet block length must be jointly configured to their respective theoretical optimal values to achieve the minimum average AoI. Theoretical analysis shows that the selection of the optimal packet block length requires a trade-off between decoding error probability and energy consumption. 
In the ELR, when the packet block length is preconfigured to its optimal value, an energy buffer supporting a single transmission is sufficient, which allows network nodes to adapt effectively to external energy supply limitations. Network nodes should actively access the channel to fully utilize harvested energy and maintain timely information updates, thereby achieving the optimal average AoI. In contrast, under abundant energy conditions or in large-scale networks, network nodes should adjust the packet generation probability to balance channel collision and update frequency. Simulation results further confirm the proposed optimization strategy and demonstrate that the optimized LoRa network significantly improves information timeliness, which provides theoretical guidance for the design of low-power short-packet communication systems.
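The energy-queue dynamics described in the Methods, a battery charging and discharging as a Markov chain, can be checked with a short slotted simulation for the minimum-battery case. All probabilities below are illustrative placeholders, not the paper's parameter settings:

```python
import random

def average_aoi(p_gen=0.3, p_harvest=0.5, p_success=0.9, slots=200_000, seed=7):
    """Slotted simulation of one EH node with a unit-capacity battery:
    an energy unit arrives with probability p_harvest, a status update is
    generated with probability p_gen whenever energy is stored, and a
    successful delivery resets the instantaneous age."""
    random.seed(seed)
    energy, age, total = 0, 1, 0
    for _ in range(slots):
        if energy == 0 and random.random() < p_harvest:
            energy = 1                      # charge the unit battery
        if energy == 1 and random.random() < p_gen:
            energy = 0                      # a transmission drains the battery
            if random.random() < p_success:
                age = 0                     # fresh update received
        age += 1
        total += age
    return total / slots

aoi = average_aoi()
print(round(aoi, 2))
```

Sweeping `p_gen` in this sketch reproduces the trade-off the abstract analyzes: too small a generation probability means stale updates, too large a value wastes energy that has not yet been harvested.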
A Review of Compressed Sensing Technology for Efficient Receiving and Processing of Communication Signal
CHENG Yiting, DONG Tao, SU Yuwei, WEN Xiaojie, YANG Taojun, LI Yibo
Available online  , doi: 10.11999/JEIT250855
Abstract:
  Significance  (1) Lower data acquisition and storage costs: By exploiting signal sparsity and designing effective dictionary and measurement matrices, compressed sensing enables reconstruction below the Nyquist sampling rate, making it suitable for resource-constrained environments; (2) Smaller pilot overhead: With sparse prior information and optimized observation design, compressed sensing reduces pilot overhead compared with traditional schemes. This saving releases spectrum resources and improves transmission efficiency; (3) Higher signal processing efficiency: Compressed sensing enhances channel estimation performance by approximately 3–5 dB under the same data volume and achieves linear computational complexity, which is markedly lower than that of conventional super-linear approaches.  Progress  Between 2006 and 2009, compressed sensing progressed rapidly. Candès established the theoretical basis by converting zero-norm sparsity into a convex one-norm formulation under the Restricted Isometry Property (RIP). Aharon et al. then introduced dictionary matrices to strengthen sparse representation, and Needell et al. applied greedy algorithms to speed up reconstruction. From 2010 to 2020, research shifted toward engineering application and algorithm refinement. Wu et al. proposed more robust recovery strategies to improve adaptability, and Zayyani et al. later advanced AI-based dictionary learning. Since 2020, compressed sensing has integrated with deep learning for data-driven sparse modelling and reconstruction. Liu’s work in Integrated Sensing-And-Communication (ISAC) systems demonstrates this trend and supports deployment in next-generation communication networks.  Conclusion  This paper reviews compressed sensing for efficient receiving and processing of communication signals across three dimensions: current progress, key technical challenges, and future directions. 
It highlights three main research pathways, including dictionary matrix design, measurement matrix development, and reconstruction strategies. The review also shows that compressed sensing is moving toward greater adaptiveness, lightweight design, and intelligence. Current challenges are also summarized, including high computational cost, limited adaptability, and reduced performance under non-ideal conditions. These observations provide guidance for further study.   Prospects   (1) Research on relaxed sparsity conditions: Existing sparsity assumptions remain strict and constrain the use of compressed sensing in high-dimensional or non-stationary scenarios where ideal sparse representations are difficult to obtain. Relaxing sparsity requirements is therefore essential. Present work explores adaptive dictionary learning, structured sparse priors, and neural-network-driven relaxation, yet issues persist, such as dependence on prior assumptions, insufficient interpretability, and lack of theoretical convergence. Future work may refine optimization objectives, develop neural models with clear mathematical interpretation, and establish sparse representation methods that do not rely on rigid sparsity priors. (2) Research on algorithm complexity: Further complexity reduction is required in non-stationary time-varying channels, high-dimensional processing, and long-sequence reconstruction. Promising directions include pre-trained dictionary models, deep-learning-based structured measurement matrices, and robust deep reconstruction networks. (3) Research on algorithm adaptability: Practical systems face noise, spectrum fragmentation, fading, and multipath propagation, with stronger effects in cognitive radio and integrated sensing applications. Adaptive strategies should therefore be prioritized. 
Possible solutions include dynamic sliding-window modelling or optimized regularization for adaptive dictionaries, structured measurement matrices with tunable parameters, and semi-supervised reconstruction algorithms. (4) Research on non-cooperative user detection: Spectrum scarcity heightens the need for efficient sensing to manage uncoordinated users and prevent high-frequency occupancy. Future research may integrate deep learning with statistical models or embed time-frequency information in online dictionary learning to enhance generalization. Multi-objective design of adaptive measurement matrices may further support reliable detection of non-cooperative users.
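As a concrete instance of the greedy reconstruction family the review attributes to Needell et al., a minimal Orthogonal Matching Pursuit recovers a 3-sparse signal from half as many measurements as samples:

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal Matching Pursuit: repeatedly pick the dictionary column
    most correlated with the residual, then least-squares re-fit the
    coefficients on the selected support."""
    residual, support = y.copy(), []
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
m, n, k = 50, 100, 3                      # m < n: sub-Nyquist measurements
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[5, 40, 77]] = [3.0, -2.5, 2.0]
x_hat = omp(A, A @ x_true, k)
print(np.allclose(x_hat, x_true, atol=1e-6))  # exact recovery is expected here: noiseless, k small
```

Each iteration costs one matrix-vector product and a small least-squares solve, which is the "linear computational complexity" advantage the Significance section points to, in contrast to convex one-norm solvers.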
Bimodal Emotion Recognition Method Based on Dual-stream Attention and Adversarial Mutual Reconstruction
LIU Jia, ZHANG Yangrui, CHEN Dapeng, MAO Die, LU Guorui
Available online  , doi: 10.11999/JEIT250424
Abstract:
  Objective  This paper proposes a bimodal emotion recognition method that integrates ElectroEncephaloGraphy (EEG) and speech signals to address noise sensitivity and inter-subject variability that limit single-modality emotion recognition systems. Although substantial progress has been achieved in emotion recognition research, cross-subject recognition accuracy remains limited, and performance is strongly affected by noise. For EEG signals, physiological differences among subjects lead to large variations in emotion classification performance. Speech signals are likewise sensitive to environmental noise and data loss. This study aims to develop a dual-modality recognition framework that combines EEG and speech signals to improve robustness, stability, and generalization performance.  Methods  The proposed method utilizes two independent feature extractors for EEG and speech signals. For EEG, a dual feature extractor integrating time–frame–channel joint attention and state-space modeling is designed to capture salient temporal and spectral features. For speech, a Bidirectional Long Short-Term Memory (Bi-LSTM) network with a frame-level random masking strategy is adopted to improve robustness to missing or noisy speech segments. A modality refinement fusion module is constructed using gradient reversal and orthogonal projection to enhance feature alignment and discriminability. In addition, an adversarial mutual reconstruction mechanism is applied to enforce consistent emotion feature reconstruction across subjects within a shared latent space.  Results and Discussions  The proposed method is evaluated on multiple benchmark datasets, including MAHNOB-HCI, EAV, and SEED. Under cross-subject validation on the MAHNOB-HCI dataset, the model achieves accuracies of 81.09% for valence and 80.11% for arousal, outperforming several existing approaches. 
In five-fold cross-validation, accuracies increase to 98.14% for valence and 98.37% for arousal, demonstrating strong generalization and stability. On the EAV dataset, the proposed model attains an accuracy of 73.29%, which exceeds the 60.85% achieved by conventional Convolutional Neural Network (CNN)-based methods. In single-modality experiments on the SEED dataset, an accuracy of 89.33% is obtained, confirming the effectiveness of the dual-stream attention mechanism and adversarial mutual reconstruction for improving cross-subject generalization.  Conclusions  The proposed dual-stream attention and adversarial mutual reconstruction framework effectively addresses challenges in cross-subject emotion recognition and multimodal fusion for affective computing. The method demonstrates strong robustness to individual differences and noise, supporting its applicability in real-world human–computer interaction systems.
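The frame-level random masking strategy mentioned for the speech branch can be sketched in a few lines; the masking probability and feature shape below are illustrative, not the paper's settings:

```python
import numpy as np

def frame_mask(features, mask_prob=0.15, rng=None):
    """Frame-level random masking as a robustness augmentation: each time
    frame of the speech feature matrix is zeroed with probability
    mask_prob, simulating dropped or noisy segments during training."""
    rng = rng or np.random.default_rng()
    keep = rng.random(features.shape[0]) >= mask_prob   # one decision per frame
    return features * keep[:, None]

rng = np.random.default_rng(3)
feats = rng.standard_normal((200, 40))                  # 200 frames x 40 mel bands
masked = frame_mask(feats, 0.15, rng)
zeroed = int(np.sum(~masked.any(axis=1)))
print(zeroed)  # roughly 15% of the 200 frames end up zeroed
```

Training the Bi-LSTM on such partially masked inputs forces it to interpolate emotional cues across gaps, which is why the abstract credits this strategy with robustness to missing or noisy speech segments.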
Battery Pack Multi-Fault Diagnosis Algorithm Based on Dual-Perspective Spectral Attention Fusion
LIU Mingjun, GU Shenyu, YIN Jingde, ZHANG Yifan, DONG Zhekang, JI Xiaoyue
Available online  , doi: 10.11999/JEIT251156
Abstract:
  Objective  With the rapid growth of electric vehicles and their widespread deployment, battery pack faults have become more frequent, creating an urgent need for efficient fault diagnosis methods. Although deep learning-based approaches have achieved notable progress, existing studies remain limited in addressing multiple fault types, such as Internal Short Circuit (ISC), sensor noise, sensor drift, and State-Of-Charge (SOC) inconsistency, and in modeling the coupling relationships among these faults. To address these limitations, a multi-fault diagnosis algorithm for battery packs based on dual-perspective spectral attention is proposed. A dual-perspective tokenization module is designed to extract spatiotemporal features from battery data, whereas a spectral attention mechanism addresses non-stationary time-series characteristics and captures long-term dependencies, thereby improving diagnostic performance. Experimental results under the Federal Urban Driving Schedule (FUDS), Urban Dynamometer Driving Schedule (UDDS), and Supplemental Federal Test Procedure (US06) demonstrate that the proposed method achieves average improvements of 10.98% in precision, 12.64% in recall, 13.84% in F1 score, and 13.45% in accuracy compared with existing multi-fault diagnosis methods. Furthermore, systematic ablation studies and robustness analyses are conducted to examine the contribution of core modules to overall model performance and to validate the anti-interference capability and robustness of the proposed method under complex noise conditions. Overall, the dual-perspective spectral attention framework improves multi-fault diagnosis performance and provides a new approach for modeling complex spatiotemporal features, thereby supporting enhanced vehicle safety.  Methods  To improve spatiotemporal feature extraction and fault diagnosis performance, a dual-perspective spectral attention fusion algorithm for battery pack multi-fault diagnosis is proposed. 
The overall architecture consists of four core modules (Fig. 3): a dual-perspective tokenization module, a spectral attention module, a feature fusion module, and an output module. The dual-perspective tokenization module applies positional encoding to jointly model temporal and spatial dimensions, enabling comprehensive spatiotemporal feature representation. When combined with the spectral attention mechanism, the capability of the model to handle non-stationary characteristics is strengthened, leading to improved diagnostic performance. In addition, to address the lack of comprehensive publicly available datasets for battery pack fault diagnosis, a new dataset is constructed, covering ISC, sensor noise, sensor drift, and SOC inconsistency faults. The dataset includes three operating conditions, FUDS, UDDS, and US06, which alleviates data scarcity in this research field.  Results and Discussions  Experimental results indicate that the proposed method improves average precision, recall, F1 score, and accuracy by 10.98%, 12.64%, 13.84%, and 13.45%, respectively, compared with existing optimal fault diagnosis methods. Comparison experiments under different operating conditions (Table 7) support this conclusion. Conventional convolutional neural network methods perform well in local feature extraction; however, fixed-size convolution kernels are not well suited to time features with varying frequencies, which limits long-term temporal dependency modeling and global feature capture. Recurrent neural network-based methods show reduced computational efficiency when large-scale datasets are processed. Transformer-based models face constraints in spatial feature extraction and in representing temporal variations. By contrast, the proposed algorithm addresses these limitations through an integrated architectural design. 
Ablation experiments demonstrate the contribution of each module to overall performance (Table 8), and the complete framework improves average F1 score and accuracy by 9.30% and 9.26%, respectively, compared with ablation variants. Robustness analysis under simulated noise conditions (Table 9) shows that the proposed method achieves accuracy improvements ranging from 49.95% to 124.34% over baseline methods at noise levels from –2 dB to –8 dB, indicating strong noise resistance.  Conclusions  A multi-fault diagnosis algorithm for battery packs is presented that integrates dual-perspective tokenization and spectral attention to combine spatiotemporal and spectral information. The dual-perspective tokenization module performs tokenization and positional encoding along temporal and spatial axes, which improves spatiotemporal representation. The spectral attention mechanism strengthens modeling of non-stationary signals and long-term dependencies. Experiments under FUDS, UDDS, and US06 driving cycles show that the proposed method outperforms existing multi-fault diagnosis approaches, with average gains of 13.84% in F1 score and 13.45% in accuracy. Ablation studies confirm that both modules contribute substantially and that their combination enables effective handling of complex time-series features. Under high-noise conditions (–2 dB, –4 dB, –6 dB, and –8 dB), the method also shows improved robustness, with accuracy gains of 49.95%, 90.39%, 112.01%, and 124.34%, respectively, compared with baseline methods. Several limitations remain. First, the data are mainly derived from laboratory simulations, and further validation under real-world operating conditions is required. Second, the effect of fault severity on battery management system hierarchical decision making has not been fully addressed, and future work will focus on establishing a fault severity grading strategy. 
Third, physical interpretability requires further improvement, and subsequent studies will explore the integration of equivalent circuit models or electrochemical mechanism models to balance diagnostic accuracy and interpretability.
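A minimal sketch of the dual-perspective tokenization idea described above, assuming sinusoidal positional encoding and random linear projections (the function name, projection weights, and token dimension are illustrative stand-ins, not the paper's implementation):

```python
import numpy as np

def dual_perspective_tokens(x, d_model=16):
    """Tokenize a battery-pack signal matrix x of shape (n_cells, n_steps)
    along both the temporal axis (one token per time step) and the spatial
    axis (one token per cell), adding sinusoidal positional encodings."""
    def pos_enc(n, d):
        pos = np.arange(n)[:, None]
        i = np.arange(d)[None, :]
        angle = pos / np.power(10000.0, (2 * (i // 2)) / d)
        return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

    rng = np.random.default_rng(0)
    # Hypothetical linear projections; a trained model would learn these.
    w_t = rng.standard_normal((x.shape[0], d_model)) * 0.1
    w_s = rng.standard_normal((x.shape[1], d_model)) * 0.1

    temporal_tokens = x.T @ w_t + pos_enc(x.shape[1], d_model)  # (n_steps, d_model)
    spatial_tokens = x @ w_s + pos_enc(x.shape[0], d_model)     # (n_cells, d_model)
    return temporal_tokens, spatial_tokens

t_tok, s_tok = dual_perspective_tokens(np.random.rand(8, 100))
print(t_tok.shape, s_tok.shape)  # (100, 16) (8, 16)
```

The two token sequences give the downstream attention layers complementary views: the temporal tokens expose how the whole pack evolves over time, and the spatial tokens expose how cells differ at a given state, which is what the abstract means by joint spatiotemporal modeling.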
Survey on Intelligent Methods for Large-scale Remote Sensing Satellite Scheduling
DU Yonghao, ZHANG Benkui, WU Jian, CHEN Yingguo, YAN Donglei, YU Haiyan, XING Lining, BAI Baocun
Available online, doi: 10.11999/JEIT251038
Abstract:
  Significance   Satellite task scheduling is an operational optimization technique. It constructs combinatorial optimization models for space–ground resources and applies operations research and computational intelligence methods to generate task plans, resolve task conflicts and constraints, and maximize satellite utilization efficiency. With the development of large-scale constellations, satellite task scheduling faces several new challenges. (1) The rapid increase in the number of satellites and tasks leads to a combinatorial explosion of the solution space. (2) Satellite applications are shifting from planned operations to on-demand services, which require response times to be reduced from hours to minutes or even seconds. (3) Advances in satellite payload capabilities enable onboard autonomous decision making and in-orbit collaboration, which support interactive and swarm-intelligence-based management of large-scale remote sensing constellations.  Progress   To address large-scale complexity, constellation collaboration, and on-demand service requirements in satellite task scheduling, recent research developments are reviewed from the perspectives of task scheduling frameworks, task scheduling models, and task scheduling algorithms, following a top-down approach. First, centralized scheduling frameworks, distributed scheduling frameworks, and hybrid centralized–distributed scheduling frameworks are described, and their control paradigms and application scenarios are clarified. Second, task scheduling models are examined according to their theoretical foundations and applicable solution methods, including classical operations research models, constraint satisfaction optimization models, and artificial neural network-based decision-making models. Their modeling approaches and application scopes are discussed in detail. 
Subsequently, three major classes of task scheduling algorithms are summarized, including exact algorithms, metaheuristic algorithms, and machine learning-based algorithms. Their decision-making mechanisms, advantages, and limitations are analyzed. Finally, future research directions are identified, including the reconstruction of large-scale and order-oriented task scheduling frameworks, the development of novel task scheduling models, and the innovative integration of different task scheduling algorithms.  Conclusions and prospects   At the framework level, task scheduling frameworks for constellations consisting of more than one thousand satellites have not yet been reported. Existing task scheduling frameworks mainly address problems with fewer than 100 satellites, which remains insufficient for large-scale remote sensing constellations with thousands or even tens of thousands of satellites. The hybrid centralized–distributed task scheduling framework combines the advantages of centralized scheduling frameworks and distributed scheduling frameworks and is consistent with the hierarchical construction and management characteristics of satellite constellations. It can further adapt to satellite scale expansion and order-based process mechanisms. At the model level, constraint satisfaction optimization models focus on detailed representations of optimization attributes and elements and are suitable for small-scale and medium-scale satellite task scheduling problems. In contrast, artificial neural network-based decision-making models emphasize classification and decision-making characteristics and support online and on-demand scheduling, which makes them suitable for large-scale satellite task scheduling scenarios. These two types of task scheduling models can therefore be coordinated to characterize different stages of large-scale constellation task scheduling. 
At the algorithm level, the integration of metaheuristic algorithms and machine learning-based algorithms has become an important technical approach for solving satellite task scheduling problems. This integrated approach supports hybrid centralized–distributed task scheduling frameworks and complements both constraint satisfaction optimization models and artificial neural network-based decision-making models.
A Spatio-Temporal Feature Fusion LSTM Relaxation Measurement Method for LEO Satellites
YANG Mengxin, ZHANG Qingting, ZENG Lingxin, GU Yixiao, ZENG Dan, XIA Bin
Available online, doi: 10.11999/JEIT251146
Abstract:
  Objective  The high dynamics of Low Earth Orbit (LEO) satellite communication systems cause frequent link measurements. Existing schemes mainly adopt threshold-based or standard spatio-temporal prediction-based relaxation measurement strategies to mitigate this issue. However, these approaches do not effectively capture the dynamic evolution of the importance of historical data and multiple measurement metrics induced by satellite mobility. Therefore, adaptation to highly time-varying satellite-ground link environments remains limited. To address this problem, a spatio-temporal feature fusion relaxation measurement method based on a Long Short-Term Memory (LSTM) network is proposed for LEO satellite communication. An LSTM recurrent neural network integrated with a dual-attention mechanism is constructed. The LSTM extracts correlations among historical measurement data, whereas temporal attention and variable attention focus on key time instants and significant features, respectively. On this basis, the measurement frequency point set and the number of relaxation periods are jointly predicted. Intelligent link measurement is then performed using the selected frequency point set and relaxation period, enabling adaptive and energy-efficient link monitoring in LEO satellite systems.  Methods  The proposed spatio-temporal feature fusion LSTM-based relaxation measurement method employs a Dual-Attention LSTM (DA-LSTM) model to reduce measurement overhead while maintaining reliable link monitoring. Historical link quality indicators, including Reference Signal Receiving Power (RSRP), Reference Signal Receiving Quality (RSRQ), and Doppler shift, together with satellite ephemeris information, are used as model inputs. These features capture temporal and spatial variations and support the joint prediction of a subset of measurement frequency points and their corresponding relaxation periods. 
Based on the predicted results, the terminal performs adaptive frequency point selection and dynamic relaxation period adjustment or executes full-band measurements with a fixed measurement period. This process enables adaptive and energy-efficient link monitoring while preserving communication performance in LEO satellite systems.  Results and Discussions  The proposed relaxation measurement method applies the DA-LSTM model to predict measurement frequency points and the number of relaxation periods using historical link quality information. Simulation results show higher convergence efficiency, higher training accuracy, and lower loss for both frequency point selection and relaxation period selection compared with baseline methods (Fig. 4 and Fig. 5). The proposed measurement algorithm achieves an average measurement frequency below 30% with minimal performance degradation (Table 3). This result is attributed to the adaptive selection of high-quality frequency points and dynamic adjustment of the measurement period. The trade-off between measurement frequency and communication performance is further examined (Fig. 6 and Fig. 7), indicating that the proposed method achieves a better balance than baseline methods under different terminal speeds. Additional simulations under different terminal speeds (Fig. 8) and different maximum relaxation periods (Fig. 9) further confirm that high energy efficiency and communication performance are maintained under diverse operational conditions.  Conclusions  This work addresses the challenge of dynamic spatio-temporal importance evolution caused by satellite mobility, which limits the effectiveness of existing relaxation measurement strategies. A DA-LSTM–based relaxation measurement algorithm is proposed to predict both the measurement frequency point set and the number of relaxation periods by extracting spatio-temporal correlations from historical link quality data. 
Simulation results under various scenarios show that: (1) the proposed algorithm achieves higher convergence efficiency and training accuracy than baseline methods; (2) adaptive selection of high-quality frequency points and dynamic adjustment of relaxation periods maintain a favorable balance between measurement frequency and communication reliability; and (3) the method remains effective across different terminal speeds and maximum relaxation periods, indicating good scalability and robustness in dynamic operational environments. The current study is limited to simulations and does not consider hardware constraints, atmospheric effects, or real-time processing requirements. These factors should be investigated in future work.
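The dual-attention mechanism described above can be sketched as follows. This is an illustrative simplification: real temporal and variable attention scores come from learned projections of the LSTM hidden states, whereas here simple feature energies stand in for them.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dual_attention(h):
    """h: (T, F) matrix of hidden features -- T time steps of F link
    metrics (e.g., RSRP, RSRQ, Doppler shift). Temporal attention
    weights key instants, variable attention weights key metrics, and
    the doubly reweighted context feeds the joint prediction of the
    frequency point set and the relaxation period."""
    a_t = softmax(h.sum(axis=1))  # (T,) importance of each time instant
    a_v = softmax(h.sum(axis=0))  # (F,) importance of each metric
    context = (a_t[:, None] * h * a_v[None, :]).sum(axis=0)  # (F,) fused context
    return context, a_t, a_v
```

Both weight vectors are softmax-normalized, so the model can shift emphasis between recent and older measurements, and between metrics, as the satellite-ground geometry changes.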
A Chain-of-Experts Construction and Optimization Method for Satellite Mission Planning
XIA Wei, WEI Hongtu, CHENG Ying, WANG Junting, HU Xiaoxuan
Available online, doi: 10.11999/JEIT251018
Abstract:
  Objective  Satellite mission planning is a core optimization problem in space resource scheduling. Existing workflows exhibit a semantic gap between business-level natural language requirements and the mathematical models used for planning. In dynamic operational scenarios, model updates, such as constraint modification, parameter recalculation, or task attribute adjustment, rely heavily on human experts. This dependence leads to slow responses, limited adaptability, and high operational costs. To address these limitations, this paper proposes a Large Language Model (LLM)–driven inference framework based on a Chain of Experts (CoE) and a Dynamic Knowledge Enhancement (DKE) mechanism. The framework enables accurate, efficient, and robust modification of satellite mission planning models from natural language instructions.  Methods  The proposed framework decomposes natural language–driven model modification into a collaborative workflow comprising requirement parsing, task routing, and code generation experts. The requirement parsing expert converts natural language requests into structured modification instructions. The task routing expert assesses task difficulty and dispatches instructions accordingly. The code generation expert produces executable modification scripts for complex, large-scale, or batch operations. To improve accuracy and reduce reliance on manual expert intervention, a DKE mechanism is incorporated. This mechanism adopts a tiered LLM strategy, using a lightweight general model for rapid processing and a stronger reasoning model for complex cases, and constructs a dynamic knowledge base of validated modification cases. Through retrieval-augmented few-shot prompting, historical successful cases are integrated into the reasoning process, enabling continuous self-improvement without model fine-tuning. 
A sandbox environment performs mathematical consistency checks, including constraint completeness, parameter validity, and solution feasibility, before final acceptance of model updates.  Results and Discussions  Experiments are conducted on a simulated satellite mission planning dataset comprising 100 heterogeneous satellites and 1,000 point targets with different payload types, resolution requirements, and operational constraints. A test set of 100 natural language modification requests with varying complexity is constructed to represent dynamic real-world adjustment scenarios (Table 1). The proposed CoE with DKE framework is evaluated against three baselines: standard prompting with DeepSeek R1, Chain-of-Thought prompting with DeepSeek R1, and standard prompting with GPT-4o. The proposed method achieves an accuracy of 82% with an average response time of 81.28 s, outperforming all baselines in both correctness and efficiency (Table 2). Accuracy increases by 35 percentage points relative to the best-performing baseline, whereas response time decreases by 53.3% (Table 2). Scalability experiments show that the CoE with DKE framework maintains stable response times across small, medium, and large problem instances, whereas baseline methods exhibit significant delays as problem size increases (Table 3). Ablation studies indicate that DKE substantially reduces reliance on high-cost reasoning models, improves the general model’s ability to resolve complex modifications independently, and increases accuracy without sacrificing efficiency (Table 5).  Conclusions  This paper presents an LLM-powered reasoning framework that integrates a Chain of Experts workflow with a DKE mechanism to bridge the semantic gap between natural language requirements and formal optimization models in satellite mission planning. 
Through layered model collaboration, retrieval-augmented prompting, and sandbox-based mathematical verification, the proposed method achieves high accuracy, fast processing, and strong adaptability to dynamic and complex planning scenarios. Experimental results demonstrate its effectiveness in supporting precise model modification and improving operational intelligence. Future work will extend the framework to multimodal inputs and real-world mission environments to further improve robustness and engineering applicability.
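The task-routing and dynamic-knowledge ideas above can be sketched in a few lines. Every heuristic and name here is a hypothetical stand-in (the paper's difficulty assessment and retrieval are LLM-based, not keyword-based): score the request, retrieve the most similar validated case as a few-shot example, and pick a model tier.

```python
from difflib import SequenceMatcher

def route_request(request, knowledge_base, hard_threshold=0.5):
    """Toy router: complex/batch requests go to the stronger reasoning
    model with code generation; simple edits go to the lightweight model.
    The most similar validated case is retrieved for few-shot prompting."""
    # Hypothetical difficulty heuristic: batch/scripted edits are "hard".
    hard_kw = ("batch", "all satellites", "recompute", "script")
    difficulty = sum(kw in request.lower() for kw in hard_kw) / len(hard_kw)

    def sim(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    few_shot = max(knowledge_base, key=lambda c: sim(request, c)) if knowledge_base else None
    tier = "reasoning-model-with-codegen" if difficulty >= hard_threshold else "lightweight-model"
    return tier, few_shot

kb = ["raise the revisit-rate constraint for satellite S3",
      "batch-update imaging durations for all satellites"]
print(route_request("recompute the batch of duration parameters via script", kb))
```

Because validated cases are appended to the knowledge base over time, the retrieval step lets the lightweight tier handle a growing share of requests without fine-tuning, which is the self-improvement loop the abstract describes.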
NAS4CIM: Tailored Neural Network Architecture Search for RRAM-Based Compute-in-Memory Chips
LI Yuankun, WANG Ze, ZHANG Qingtian, GAO Bin, WU Huaqiang
Available online, doi: 10.11999/JEIT250978
Abstract:
  Objective  With the growing demand for on-orbit information processing in satellite missions, efficient deployment of neural networks under strict power and latency constraints remains a major challenge. Resistive Random Access Memory (RRAM)-based Compute-in-Memory (CIM) architectures provide a promising solution for low power consumption and high throughput at the edge. To bridge the gap between conventional neural architectures and CIM hardware, this paper proposes NAS4CIM, a Neural Architecture Search (NAS) framework tailored for RRAM-based CIM chips. The framework introduces a decoupled distillation-enhanced training strategy and a Top-k-based operator selection method, enabling balanced optimization of task accuracy and hardware efficiency. This study presents a practical approach for algorithm–architecture co-optimization in CIM systems with potential application in satellite edge intelligence.  Methods  NAS4CIM is designed as a multi-stage architecture search framework that explicitly considers task performance and CIM hardware characteristics. The search process consists of three stages: task-driven operator evaluation, hardware-driven operator evaluation, and final architecture selection with retraining. In the task-driven stage, NAS4CIM employs the Decoupled Distillation-Enhanced Gradient-based Significance Coefficient Supernet Training (DDE-GSCST) method. Rather than jointly training all candidate operators in a fully coupled supernet, DDE-GSCST applies a semi-decoupled training strategy across different network stages. A high-accuracy teacher network is used to guide training. For each stage, the teacher network provides stable feature representations, whereas the remaining stages remain fixed, which reduces interference among candidate operators. Knowledge distillation is critical under CIM constraints. 
RRAM-based CIM systems typically rely on low-bit quantization and are affected by device-level noise, under which conventional weight-sharing NAS methods show unstable convergence. Feature distillation from a strong teacher network ensures clear optimization signals for candidate operators and supports reliable convergence. After training, each operator is assigned a task significance coefficient that quantitatively reflects its contribution to task accuracy. Following the task-driven stage, a hardware-driven search stage is performed. Candidate network structures are constructed by combining operators according to task significance rankings and are evaluated using an RRAM-based CIM hardware simulator. System-level hardware metrics, including inference latency and energy consumption, are measured. Complete network structures are evaluated directly, capturing realistic effects such as array partitioning, inter-array communication, and Analog-to-Digital Converter (ADC) overhead. From hardware-efficient networks with superior performance, the selection frequency of each operator is analyzed. Operators that appear more frequently in low-latency and low-energy designs are assigned higher hardware significance coefficients. This data-driven evaluation avoids inaccurate operator-level hardware modeling and reflects system-level behavior. In the final stage, task significance and hardware significance matrices are integrated. By adjusting weighting factors, the framework prioritizes accuracy, efficiency, or a balanced trade-off. Based on the combined evaluation, an optimal operator set is selected to construct the final network architecture, which is then retrained from scratch to refine weights and further improve accuracy while maintaining high hardware efficiency on CIM platforms.  Results and Discussions  NAS4CIM is evaluated on FashionMNIST, CIFAR-10, and ImageNet to demonstrate effectiveness across tasks of different scales. 
On FashionMNIST, the framework achieves 89.8% Top-1 accuracy in the accuracy-oriented search and an Energy–Delay Product (EDP) of 0.16 in the efficiency-oriented search (Fig. 4). Real-chip experiments on fabricated RRAM macros show close agreement between measured accuracy and simulation results, confirming practical feasibility. On CIFAR-10, NAS4CIM reaches 90.5% Top-1 accuracy in the accuracy-oriented mode and an EDP of 0.16 in the efficiency-oriented mode, exceeding state-of-the-art methods under the same hardware configuration. Under a balanced accuracy–efficiency setting, the framework produces a network with 89.3% accuracy and an EDP of 0.97 (Fig. 3). On ImageNet, which represents a large-scale and more complex classification task, NAS4CIM achieves 70.0% Top-1 accuracy in the accuracy-oriented mode, whereas the efficiency-oriented search yields an EDP of 504.74 (Fig. 5). These results indicate effective scalability from simple to complex datasets while maintaining a favorable balance between accuracy and energy efficiency across optimization settings.  Conclusions  This study proposes NAS4CIM, a NAS framework for RRAM-based CIM chips. Through a decoupled distillation-enhanced training method and a Top-k-based operator selection strategy, the framework addresses instability in random sampling approaches and inaccuracies in operator-level performance modeling. NAS4CIM provides a unified strategy to balance task accuracy and hardware efficiency and demonstrates generality across tasks of different complexity. Simulation and real-chip experiments confirm stable performance and consistency between algorithmic and hardware evaluations. NAS4CIM presents a practical pathway for algorithm–hardware co-optimization in CIM systems and supports energy-efficient, real-time information processing for satellite edge intelligence.
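The final selection stage, where task and hardware significance are integrated with an adjustable weighting, can be sketched as follows (the per-stage normalization and the scalar trade-off parameter are illustrative assumptions, not NAS4CIM's exact formulation):

```python
import numpy as np

def select_operators(task_sig, hw_sig, alpha=0.5, k=1):
    """task_sig / hw_sig: (stages, operators) significance matrices.
    alpha weights accuracy vs. hardware efficiency (alpha=1 is the
    accuracy-oriented search, alpha=0 the efficiency-oriented one).
    Returns the Top-k operator indices per stage, best first."""
    def rownorm(m):
        # Min-max normalize each stage so the two scales are comparable.
        m = np.asarray(m, float)
        lo = m.min(axis=1, keepdims=True)
        hi = m.max(axis=1, keepdims=True)
        return (m - lo) / np.where(hi - lo == 0, 1.0, hi - lo)

    combined = alpha * rownorm(task_sig) + (1 - alpha) * rownorm(hw_sig)
    return np.argsort(-combined, axis=1)[:, :k]
```

Sweeping `alpha` reproduces the three operating modes reported above: accuracy-oriented, efficiency-oriented, and the balanced setting in between.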
Satellite Test Tasks Autonomous Orchestration Based on Task-Coupling Constraints and Time-Bounded Windows
LI Zhen, YU Zhigang, ZHANG Yang, ZHU Xuetian, XIE Ningyu, YANG Fan
Available online, doi: 10.11999/JEIT250878
Abstract:
  Objective  In recent years, the scale of on-orbit space assets has continued to expand, satellite constellation deployment has accelerated significantly, and the number of satellite launches has increased rapidly. As a result, the demand for on-orbit testing has grown sharply. However, limited ground station availability and scarce visibility arcs severely constrain testing opportunities, giving rise to an increasingly acute “many satellites, few ground stations” imbalance under highly limited visibility resources. Traditional satellite mission planning approaches, which rely primarily on manual pre-scheduling, suffer from long decision cycles, low planning efficiency, and high susceptibility to scheduling errors. These limitations make them inadequate for large-scale, multi-task, and highly coupled testing scenarios. Consequently, there is an urgent need to develop efficient automated on-orbit test mission planning technologies to improve the utilization efficiency of satellite–ground visibility arcs.   Methods  To address these limitations, this paper proposes an automated satellite task orchestration framework to ensure effectiveness and reliability in integrated space–ground systems throughout their lifecycle of construction and operation. A task slider model and a time window model are established, and both general and task-specific orchestration constraints are designed to form a unified constraint paradigm for satellite tasks. A non-convex constraint transformation scheme is further proposed. Using satellite-to-ground link testing as a representative application scenario, an automated task orchestration model is constructed to maximize the number of schedulable tasks under stringent visibility arc constraints while improving the efficiency of visibility arc utilization.  
Results and Discussions  Using satellite-to-ground link testing as a representative on-orbit testing scenario, the proposed autonomous orchestration framework is evaluated through simulations with multiple low Earth orbit satellites and limited visibility arcs. The results show that the proposed method schedules testing tasks effectively while strictly satisfying all operational constraints. Compared with traditional heuristic-based algorithms, including genetic algorithms, tabu search, and particle swarm optimization, the proposed approach achieves a significant performance improvement, increasing the total number of scheduled satellite–ground link testing tasks by a factor of approximately 1.9 to 2.3. The results also indicate that, under highly constrained time windows, the proposed model fully exploits available visibility arcs and avoids resource conflicts, which substantially improves the utilization efficiency of satellite–ground links.  Conclusions  This paper proposes an autonomous orchestration framework for satellite on-orbit testing tasks under complex coupling constraints and time-bounded visibility windows. By modeling testing subtasks and visibility arcs using task slider and time window abstractions, and by integrating general and task-specific constraints into a unified mixed-integer programming formulation, the proposed method provides an effective solution for large-scale testing task scheduling. Simulation results confirm that the framework outperforms traditional heuristic-based methods in terms of the number of executable testing tasks and visibility arc utilization. The proposed approach provides a practical and extensible scheduling paradigm for future large-scale satellite constellation testing scenarios. Future work will consider additional resource-layer constraints and uncertainty factors to further improve robustness in real-world testing environments.
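To make the time-window constraint concrete, here is a toy greedy packer. It stands in for the paper's mixed-integer formulation and illustrates only the visibility-arc and non-overlap constraints, not the full task-coupling constraint set or the slider model:

```python
def schedule_tasks(tasks, windows):
    """tasks: (task_id, duration) pairs; windows: (start, end) visibility
    arcs. Greedily packs tasks (shortest first) into the earliest arc
    with enough remaining time; tasks that fit nowhere are dropped."""
    plan = []
    cursor = [start for start, _ in windows]  # next free instant per arc
    for task_id, dur in sorted(tasks, key=lambda t: t[1]):
        for i, (_, end) in enumerate(windows):
            if cursor[i] + dur <= end:  # task fits in the remaining arc
                plan.append((task_id, i, cursor[i], cursor[i] + dur))
                cursor[i] += dur
                break
    return plan
```

A mixed-integer solver, as used in the paper, would instead decide all task-to-arc assignments jointly and can therefore also honor coupling constraints between tasks, which is where the reported gain over greedy and heuristic baselines comes from.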
LightMamba: A Lightweight Mamba Network for the Joint Classification of HSI and LiDAR Data
LIAO Diling, LAI Tao, HUANG Haifeng, WANG Qingsong
Available online, doi: 10.11999/JEIT250981
Abstract:
  Objective  The joint classification of HyperSpectral Imagery (HSI) and Light Detection And Ranging (LiDAR) data is a critical task in remote sensing, where complementary spectral and spatial information is exploited to improve land cover recognition accuracy. However, mainstream deep learning approaches, particularly those based on Convolutional Neural Networks (CNNs) and Transformers, are constrained by high computational cost and limited efficiency in modeling long-range dependencies. CNN-based methods are effective for local feature extraction but suffer from limited receptive fields and increased parameter counts when scaled. Transformer architectures provide global context modeling but incur quadratic computational complexity due to self-attention mechanisms, which leads to prohibitive costs when processing high-dimensional remote sensing data. To address these limitations, a lightweight network architecture named LightMamba is proposed. The model leverages an advanced State Space Model (SSM) to achieve efficient and accurate joint classification of HSI and LiDAR data. The objective is to maintain linear computational complexity while effectively fusing multi-source features and capturing global contextual relationships, thereby supporting resource-constrained applications without accuracy degradation.  Methods  The proposed LightMamba framework consists of three core components. First, a MultiSource Alignment Module (MSAM) is designed to address heterogeneity between HSI and LiDAR data. A dual-branch network with shared weights projects both modalities into a unified feature space, which ensures consistent spatial-spectral representation. This shared-weight strategy reduces the parameter count and strengthens inter-modal correlation through the learning of common foundational features. Second, the Multi-Source Lightweight Mamba Module (MSLMM) forms the core of the framework. 
Aligned HSI and LiDAR feature sequences are processed using a parameter-efficient Mamba architecture. A hybrid parameter-sharing strategy is adopted by combining shared matrices with modality-specific parameters, which preserves discriminative capability while reducing redundancy. LiDAR elevation information is used as a positional guide to enhance spatial awareness during feature fusion. The selective scanning mechanism of the SSM enables efficient modeling of long-range dependencies with linear complexity, thereby avoiding the quadratic cost associated with Transformers. Spectral bands are processed sequentially to preserve joint spectral spatial characteristics. Finally, a MultiLayer Perceptron (MLP)-based classifier maps fused high-level features to class probabilities with low computational overhead. The model is trained end to end using cross-entropy loss. Evaluations are conducted on two public benchmarks, namely the Houston and Augsburg datasets. Comparisons are performed against representative methods, including CoupledCNN, GAMF, HCT, MFT, Cross-HL, and S2CrossMamba, using Overall Accuracy (OA), Average Accuracy (AA), and the Kappa coefficient. Ablation experiments analyze the contribution of each module, and parameter count and FLoating-Point Operations (FLOPs) are reported.  Results and Discussions  Experimental results demonstrate that LightMamba achieves superior performance and efficiency. On the Houston dataset, an OA of 94.30%, an AA of 95.25%, and a Kappa coefficient of 93.83% are obtained, which exceed those of all comparison methods. Perfect classification accuracy is achieved for several classes, including Soil and Water. Classification maps exhibit improved spatial continuity and internal consistency, with reduced speckle noise, particularly in heterogeneous regions such as commercial areas. 
On the Augsburg dataset, LightMamba achieves the highest OA of 87.41% and a Kappa coefficient of 82.30%, which confirms strong generalization across different scenes. Although the AA is slightly lower than that of S2CrossMamba, the higher OA and Kappa values indicate better overall performance. Complexity analysis shows that LightMamba attains high accuracy with a lightweight structure containing only 69.93 k parameters, which is substantially fewer than GAMF and comparable to S2CrossMamba, while maintaining moderate FLOPs. Experiments on input patch size indicate adaptability to scene characteristics, with optimal performance observed at 17×17 for the Houston dataset and 9×9 for the Augsburg dataset.  Conclusions  A lightweight network architecture, LightMamba, is presented for joint HSI and LiDAR classification. By combining a shared-weight MSAM with a lightweight Mamba module that adopts hybrid parameterization and elevation-guided fusion, modal heterogeneity is effectively addressed and long-range contextual dependencies are captured with linear computational complexity. Experimental results on public benchmarks demonstrate state-of-the-art classification accuracy with a reduced parameter count and computational cost compared with existing methods. These findings confirm the potential of Mamba-based architectures for efficient multi-source remote sensing data fusion. Future research will explore optimized two-dimensional scanning mechanisms and adaptive scanning strategies to further improve feature capture efficiency and classification performance. The LightMamba code is available at https://www.scidb.cn/detail?dataSetId=064dc4ac5350418e87a8b82dd324737b&version=V1&code=j00173.
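The linear-complexity selective scanning that underlies Mamba-style models reduces to a simple state-space recurrence. The sketch below uses fixed scalar parameters for clarity; the selective SSM in architectures like LightMamba makes them input-dependent at each step:

```python
import numpy as np

def ssm_scan(u, a=0.9, b=0.1, c=1.0):
    """Minimal state-space recurrence: h_t = a*h_{t-1} + b*u_t,
    y_t = c*h_t. One pass over the sequence gives O(T) cost in the
    length T, versus the O(T^2) pairwise cost of self-attention."""
    h, ys = 0.0, []
    for u_t in u:
        h = a * h + b * u_t  # the state carries long-range context forward
        ys.append(c * h)
    return np.array(ys)

print(ssm_scan([1.0, 0.0, 0.0]))  # impulse response decays geometrically
```

Processing spectral bands sequentially through such a scan, as the MSLMM does, is what lets the model capture long-range spectral dependencies without the quadratic cost that makes Transformers expensive on high-dimensional HSI data.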
Convolutional Mixed Multi-Attention Encoder-Decoder Network for Radar Signal Sorting
CHANG Huaizhao, GU Yingyan, HAN Yunzhi, JIN Benzhou
Available online, doi: 10.11999/JEIT251031
Abstract:
  Objective  Radar signal sorting is a fundamental technology for electromagnetic environment awareness and electronic warfare systems. The objective of this study is to develop an effective radar signal sorting method that accurately separates intercepted pulse sequences and assigns them to different radiation sources in complex electromagnetic environments. With the increasing complexity of modern radar systems, intercepted pulse sequences are severely affected by pulse overlap, pulse loss, false pulses, and pulse arrival time measurement errors, which substantially reduce the performance of conventional sorting approaches. Therefore, a robust signal sorting framework that maintains high accuracy under non-ideal conditions is required.  Methods  Radar signal sorting in complex electromagnetic environments is formulated as a pulse-level time-series semantic segmentation problem, where each pulse is treated as the minimum processing unit and classified in an end-to-end manner. Under this formulation, sorting is achieved through unified sequence modeling and label prediction without explicit pulse subsequence extraction or iterative stripping procedures, which reduces error accumulation. To address this task, a convolutional mixed multi-attention encoder-decoder network is proposed (Fig. 1). The network consists of an encoder-decoder backbone, a local attention module, and a feature selection module. The encoder-decoder backbone adopts a symmetric structure with progressive downsampling and upsampling to aggregate contextual information while restoring pulse-level temporal resolution. Its core component is a dual-branch dilated bottleneck module (Fig. 2), in which a 1×1 temporal convolution is applied for channel projection. 
Two parallel dilated convolution branches with different dilation rates are then employed to construct multi-scale receptive fields, which enable simultaneous modeling of short-term local variations and long-term modulation patterns across multiple pulses and ensure robust temporal representation under pulse time shifts and missing pulses. To enhance long-range dependency modeling beyond convolutional operations, a local Transformer module is inserted between the encoder and the decoder. By applying local self-attention to temporally downsampled feature maps, temporal dependencies among pulses are captured with reduced computational complexity, whereas the influence of false and missing pulses is suppressed during feature aggregation. In addition, a feature selection module is integrated into skip connections to reduce feature redundancy and interference (Fig. 3). Through hybrid attention across temporal and channel dimensions, multi-level features are adaptively filtered and fused to emphasize discriminative information for radiation source identification. During training, focal loss is applied to alleviate class imbalance and improve the discrimination of difficult and boundary pulses.  Results and Discussions  Experimental results demonstrate that the proposed network achieves pulse-level fine-grained classification for radar signal sorting and outperforms mainstream baseline methods across various complex scenarios. Compared with existing approaches, an average sorting accuracy improvement of more than 6% is obtained under moderate interference conditions. In MultiFunctional Radar (MFR) overlapping scenarios, recall rates of 88.30%, 85.48%, 86.89%, and 86.48% are achieved for four different MFRs, respectively, with an overall average accuracy of 86.82%. For different pulse repetition interval modulation types, recall rates exceed 90% for fixed patterns and remain above 85% for jittered, staggered, and group-varying modes. 
In staggered and group-varying cases, performance improvements exceeding 3.5% relative to baseline methods are observed. Generalization experiments indicate that high accuracy is maintained under parameter distribution shifts of 5% and 15%, which demonstrates strong robustness to distribution perturbations (Fig. 8). Ablation studies confirm the effectiveness of each proposed module in improving overall performance (Table 7).  Conclusions  A convolutional mixed multi-attention encoder-decoder network is proposed for radar signal sorting in complex electromagnetic environments. By modeling radar signal sorting as a pulse-level time-series semantic segmentation task and integrating multi-scale dilated convolutions, local attention modeling, and adaptive feature selection, high sorting accuracy, robustness, and generalization capability are achieved under severe interference conditions. The experimental results indicate that the proposed approach provides an effective and practical solution for radar signal sorting in complex electromagnetic environments.
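The focal-loss weighting used during training can be sketched concisely. The NumPy toy below uses the common default modulating factor γ = 2 and balancing factor α = 0.25 (illustrative choices, not taken from the paper) to show how confidently classified pulses are down-weighted relative to hard boundary pulses:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t),
    where p_t is the probability assigned to the true class."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# A confidently correct pulse (p_t = 0.9) contributes far less loss than
# a hard boundary pulse (p_t = 0.6), steering training toward hard cases.
easy = focal_loss(np.array([0.9]), np.array([1]))
hard = focal_loss(np.array([0.6]), np.array([1]))
```

The `(1 - p_t)**gamma` factor is what suppresses the gradient contribution of abundant, easily sorted pulses, which is why it helps with the class imbalance described above.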
Robust Adaptive Beamforming Algorithm Based on Dominant Eigenvector Extraction and Orthogonal Projection
LIU Yiyuan, ZHANG Xiaokai, XU Yuhua, ZHENG Xueqiang, YANG Weiwei
Available online  , doi: 10.11999/JEIT251282
Abstract:
  Objective  In practical applications, the spatial anti-jamming performance of adaptive beamformers is often degraded by mismatches in the Directions Of Arrival (DOAs) of signals. Some robust adaptive beamforming algorithms reduce the error between the estimated signal steering vector and the actual steering vector by solving a Quadratically Constrained Quadratic Programming (QCQP) problem. This strategy significantly increases computational cost. In addition, traditional adaptive beamforming algorithms often exhibit beampattern distortion under non-ideal conditions, such as DOA mismatch. The objective of this paper is to design a robust adaptive beamformer that effectively suppresses jamming signals under different mismatch scenarios.  Methods  A robust adaptive beamforming algorithm for spatial anti-jamming is proposed. First, the actual output Signal-to-Jamming-plus-Noise Ratio (SJNR) in the presence of DOA mismatch is analyzed. An ideal beamformer based on orthogonal projection is then proposed to achieve accurate beampattern control and maximize the practical output SJNR. To improve anti-jamming robustness in mismatch environments, the signal steering vector is estimated through covariance matrix construction and dominant eigenvector extraction. The beamforming weight vector is obtained by constructing an orthogonal projection matrix.  Results and Discussions  The proposed adaptive beamforming algorithm effectively suppresses jamming signals in mismatch environments. Numerical results show that the algorithm achieves good spatial anti-jamming performance in an ideal scenario without mismatch (Fig. 3) and in a scenario with steering vector mismatch (Fig. 4). In DOA mismatch scenarios, the proposed algorithm demonstrates superior beampattern performance (Fig. 5, Fig. 6) and output SJNR performance (Fig. 7, Fig. 8, Fig. 9). The results also indicate stronger robustness to DOA mismatch (Fig. 10, Fig. 11). 
Effective jamming suppression is maintained even when the incoming directions of the jamming signals are closely spaced (Fig. 12).  Conclusions  This paper proposes a robust adaptive beamforming algorithm for suppressing power-suppressive jamming signals. An ideal beamformer is first developed to achieve precise beampattern control and maximize the actual output SJNR. A robust adaptive beamforming algorithm is then constructed through covariance matrix construction, dominant eigenvector extraction, and orthogonal projection. Numerical results show that the proposed algorithm provides strong spatial anti-jamming performance in ideal scenarios without mismatch and in scenarios with DOA mismatch or steering vector mismatch.
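The two key steps above — estimating the signal steering vector as the dominant eigenvector of a sector-integrated covariance matrix, then projecting onto the orthogonal complement of the jammer subspace — can be sketched in NumPy. The 8-element half-wavelength ULA, the angular sector, and the jammer directions are hypothetical; the paper's covariance construction and SJNR analysis are not reproduced:

```python
import numpy as np

def steering(theta_deg, n=8):
    """ULA steering vector, half-wavelength element spacing (illustrative)."""
    k = np.arange(n)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(theta_deg)))

n = 8
# Dominant-eigenvector estimate: integrate a(θ)a(θ)^H over a small angular
# sector around the presumed DOA, then take the principal eigenvector.
sector = np.linspace(-5.0, 5.0, 41)
C = sum(np.outer(steering(t, n), steering(t, n).conj()) for t in sector)
a_hat = np.linalg.eigh(C)[1][:, -1] * np.sqrt(n)   # eigh: ascending order

# Orthogonal projection onto the complement of the jammer subspace:
# P = I - U (U^H U)^{-1} U^H, then w = P a_hat.
U = np.column_stack([steering(40.0, n), steering(-30.0, n)])
P = np.eye(n) - U @ np.linalg.inv(U.conj().T @ U) @ U.conj().T
w = P @ a_hat

gain_sig = abs(w.conj() @ steering(0.0, n))    # desired direction preserved
gain_jam = abs(w.conj() @ steering(40.0, n))   # jammer direction nulled
```

Because P annihilates the jammer steering vectors exactly, the projected weight vector keeps the desired-direction gain while placing nulls on both jammers, which is the beampattern-control behavior described above.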
Residual Subspace Prototype Constraint for SAR Target Class-Incremental Recognition
XU Yanjie, SUN Hao, LIN Qinjie, JI Kefeng, KUANG Gangyao
Available online  , doi: 10.11999/JEIT251007
Abstract:
Synthetic Aperture Radar (SAR) target recognition systems deployed in open environments frequently encounter continuously emerging categories. This paper proposes a SAR target class-incremental recognition method named Residual Subspace Prototype Constraint (RSPC). RSPC constructs lightweight, task-specific adapters to expand the feature subspace, enabling effective learning of new classes and alleviating catastrophic forgetting. First, self-supervised learning is used to pretrain the backbone network to extract generic feature representations from SAR data. During incremental learning, the backbone network is frozen, and residual adapters are trained to focus on changes in discriminative features. To address old-class prototype invalidation caused by feature space expansion, a structured constraint-based prototype completion mechanism is proposed to synthesize prototypes of old classes in the new subspace without replaying historical data. During inference, predictions are made based on the similarity between the input target and the integrated prototypes from all subspaces. Experiments on the MSTAR, SAMPLE, and SAR-ACD datasets validate the effectiveness of RSPC.  Objective  SAR target recognition in dynamic environments must learn new classes while preserving previously acquired knowledge. Rehearsal-based methods are often impractical because of data privacy and storage constraints in real-world applications. Moreover, conventional pretraining suffers from high interclass scattering similarity and ambiguous decision boundaries, which represents a challenge different from typical catastrophic forgetting. A rehearsal-free framework is proposed to model discriminative feature evolution and reconstruct old-class prototypes in expanded subspaces. This framework enables robust, efficient, and scalable SAR target recognition without rehearsal.  
  Methods  An RSPC framework is proposed for SAR target class-incremental recognition and is built on a pretrained Vision Transformer backbone. During the incremental phase, the backbone is frozen, and a lightweight residual adapter is trained for each new task to learn the residual feature difference between the current task and the historical average, thereby forming a task-specific discriminative subspace. To address prototype decay in expanded subspaces, a structured prototype completion mechanism is introduced. This mechanism synthesizes the prototype of a historical class in the current subspace by aggregating its observed prototypes from all prior subspaces in which it is learned, weighted by a confidence score derived from three geometric consistency metrics: norm ratio, angular similarity, and Euclidean distance between the historical class and all current new classes within each prior subspace. Optimization of the residual adapter is guided by a dual-constraint loss, including a prototype contrastive loss that enforces intraclass compactness and interclass separation, and a subspace orthogonality loss that maximizes the angular distance between the residual features of a sample across consecutive subspaces, thereby preventing feature reuse and promoting task-specific learning.  Results and Discussions  RSPC achieves the highest Average Incremental Accuracy (AIA) and the lowest Precision Drop (PD) among all rehearsal-free methods across all three datasets (Table 4\begin{document}$ \sim $\end{document}6). On MSTAR, RSPC achieves an AIA of 95.23% (N=1) and 94.83% (N=2), outperforming the best baseline EASE by 0.58% and 0.38%, respectively, while reducing PD by 1.90% and 1.21%. On SAMPLE, RSPC achieves an AIA of 93.30% (N=1) and 93.23% (N=2), exceeding EASE by 1.15 and 2.31 percentage points with substantially lower PD. 
On the more challenging SAR-ACD dataset, RSPC achieves an AIA of 58.69% (N=1) and 60.35% (N=2), demonstrating superior performance over EASE and SimpleCIL and approaching the performance of rehearsal-based methods ILFL and HLFCC. The t-SNE visualizations (Fig. 2\begin{document}$ \sim $\end{document}4) show that RSPC produces more compact and well-separated class clusters than EASE and MEMO and provides improved interclass boundary discrimination compared with DualPrompt and APER_SSF. The ablation study (Table 7\begin{document}$ \sim $\end{document}9) confirms that both the prototype contrastive loss and the subspace orthogonality loss are essential. Their joint use usually yields the highest AIA and the lowest PD across all datasets, demonstrating complementary effects on discriminability and feature disentanglement. Under low-data conditions (Fig. 5), RSPC maintains superior performance and achieves higher accuracy than EASE when only 20% of new-class training samples are available, indicating strong data efficiency.  Conclusions  A rehearsal-free incremental learning framework, RSPC, is presented for SAR target recognition to mitigate catastrophic forgetting caused by high interclass scattering similarity. RSPC employs a residual subspace mechanism to capture discriminative feature increments and a structured prototype completion strategy to reconstruct stable prototypes without historical data. Experiments on three benchmarks show that RSPC substantially outperforms existing rehearsal-free methods and rivals rehearsal-based approaches, establishing a state-of-the-art solution for scalable and privacy-preserving recognition. Robust performance in low-data regimes further supports its suitability for deployment in resource-constrained and privacy-sensitive scenarios.
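The inference rule described above — similarity between the input target and the integrated prototypes — reduces to nearest-prototype classification. A toy NumPy sketch with hypothetical feature dimensions (the subspace integration and confidence weighting are omitted):

```python
import numpy as np

def predict(query, prototypes):
    """Nearest-prototype inference: assign the query to the class whose
    prototype has the highest cosine similarity with its feature vector."""
    q = query / np.linalg.norm(query)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return int(np.argmax(P @ q))

rng = np.random.default_rng(0)
protos = rng.normal(size=(5, 16))                 # 5 classes, 16-D toy features
sample = protos[3] + 0.05 * rng.normal(size=16)   # noisy sample of class 3
pred = predict(sample, protos)
```

Keeping inference prototype-based is what lets the method stay rehearsal-free: only per-class vectors, not historical images, must be carried across tasks.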
Few-Shot Remote Sensing Image Classification Based on Parameter-Efficient Vision Transformer and Multimodal Guidance
WEN Hongli, HU Qinghao, HUANG Liwei, WANG Peisong, CHENG Jian
Available online  , doi: 10.11999/JEIT250996
Abstract:
  Objective  Remote sensing image classification is a core task in Earth observation. Its development is limited by the scarcity of high-quality labeled data. Few-shot learning provides a feasible solution. However, existing methods often suffer from limited feature representation, weak generalization to unseen classes, and high computational cost when adapting large models. These issues restrict their application in time-sensitive and resource-constrained scenarios. To address these challenges, this study proposes an Efficient Few-Shot Vision Transformer with Multimodal Guidance (EFS-ViT-MM). The objective is to construct an efficient and accurate classification framework by combining the strong representation capability of a pre-trained Vision Transformer with parameter-efficient fine-tuning. Discriminative capability is further enhanced by incorporating semantic information from textual descriptions to guide prediction.  Methods  The proposed EFS-ViT-MM framework is formulated as a metric-based learning system composed of three coordinated components. First, an Efficient Low-Rank Vision Transformer (ELR-ViT) is adopted as the visual backbone. A pre-trained Vision Transformer is used for feature extraction, whereas a low-rank adaptation strategy is applied for fine-tuning. The pre-trained parameters are frozen, and only a small number of injected low-rank matrices are optimized. This design reduces the number of trainable parameters and mitigates overfitting while preserving generalization capability. Second, a multimodal guidance mechanism is introduced to enrich visual features with semantic context. A Multimodal Large Language Model generates descriptive text for each support image. The text is embedded into a semantic vector and injected into the visual features through Feature-wise Linear Modulation, which adaptively recalibrates visual representations. Third, a cross-attention metric module is designed to replace fixed distance functions. 
The module learns similarity between query images and multimodally enhanced support samples by adaptively weighting feature correlations, leading to more precise matching in complex remote sensing scenes.  Results and Discussions  The proposed method is evaluated on multiple public remote sensing datasets, including NWPU-RESISC45, WHU-RS19, UC-Merced, and AID. The results demonstrate consistent performance gains over baseline methods. Under the 5-way 1-shot and 5-way 5-shot settings, classification accuracy increases by 4.7% and 7.0%, respectively. These improvements are achieved with a substantially reduced number of trainable parameters, indicating high computational efficiency. The results confirm that combining large pre-trained models with parameter-efficient fine-tuning is effective for few-shot classification. Performance gains are primarily attributed to multimodal guidance and the cross-attention-based metric, which improve feature discrimination and similarity measurement.  Conclusions  The EFS-ViT-MM framework effectively addresses limited feature representation, poor generalization, and high computational cost in few-shot remote sensing image classification. The integration of a pre-trained Vision Transformer with parameter-efficient fine-tuning enables effective utilization of large models with reduced computational burden. Multimodal guidance introduces semantic context that enhances visual understanding, whereas the cross-attention metric provides adaptive and accurate similarity estimation. Extensive experiments demonstrate state-of-the-art performance across multiple datasets. The proposed framework offers an efficient and generalizable solution for data-scarce remote sensing applications and provides a foundation for future research on multimodal and efficient deep learning methods for Earth observation.
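The two efficiency mechanisms above — low-rank adaptation of a frozen backbone and FiLM-style semantic conditioning — can be sketched as follows. The dimensions, initialization scales, and the `film` helper are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 768, 768, 8            # hypothetical projection size and low rank
W = rng.normal(size=(d, k))      # pre-trained weight, kept frozen
B = rng.normal(size=(d, r)) * 0.01
A = np.zeros((r, k))             # zero init: adaptation starts as a no-op

def forward(x):
    # Only the low-rank factors A and B would receive gradients; W is fixed.
    return x @ (W + B @ A).T

def film(v, gamma, beta):
    # FiLM-style recalibration of visual features by text-derived (gamma, beta)
    return gamma * v + beta

x = rng.normal(size=(2, k))
trainable = A.size + B.size      # low-rank parameters vs the full matrix W
```

With rank 8 the trainable parameter count is roughly 1/48 of the full matrix, which is the source of the reduced fine-tuning cost and overfitting risk claimed above.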
Research on Proximal Policy Optimization for Autonomous Long-Distance Rapid Rendezvous of Spacecraft
LIN Zheng, HU Haiying, DI Peng, ZHU Yongsheng, ZHOU Meijiang
Available online  , doi: 10.11999/JEIT250844
Abstract:
  Objective   With increasing demands from deep-space exploration, on-orbit servicing, and space debris removal missions, autonomous long-distance rapid rendezvous capabilities are required for future space operations. Traditional trajectory planning approaches based on analytical methods or heuristic optimization show limitations when complex dynamics, strong disturbances, and uncertainties are present, which makes it difficult to balance efficiency and robustness. Deep Reinforcement Learning (DRL) combines the approximation capability of deep neural networks with reinforcement learning-based decision-making, which supports adaptive learning and real-time decisions in high-dimensional continuous state and action spaces. In particular, Proximal Policy Optimization (PPO) is a representative policy gradient method because of its training stability, sample efficiency, and ease of implementation. Integration of DRL with PPO for spacecraft long-distance rapid rendezvous is therefore expected to overcome the limits of conventional methods and provide an intelligent, efficient, and robust solution for autonomous guidance in complex orbital environments.   Methods   A spacecraft orbital dynamics model is established by incorporating J2 perturbation, together with uncertainties arising from position and velocity measurement errors and actuator deviations during on-orbit operations. The long-distance rapid rendezvous problem is formulated as a Markov Decision Process, in which the state space includes position, velocity, and relative distance, and the action space is defined by impulse duration and direction. Fuel consumption and terminal position and velocity constraints are integrated into the model. On this basis, a DRL framework based on PPO is constructed. The policy network outputs maneuver command distributions, whereas the value network estimates state values to improve training stability. 
To address convergence difficulties caused by sparse rewards, an enhanced dense reward function is designed by combining a position potential function with a velocity guidance function. This design guides the agent toward the target while enabling gradual deceleration and improved fuel efficiency. The optimal maneuver strategy is obtained through simulation-based training, and robustness is evaluated under different uncertainty conditions.   Results and Discussions   Based on the proposed DRL framework, comprehensive simulations are conducted to assess effectiveness and robustness. In Case 1, three reward structures are examined: sparse reward, traditional dense reward, and an improved dense reward that integrates a relative position potential function with a velocity guidance term. The results show that reward design strongly affects convergence behavior and policy stability. Under sparse rewards, insufficient process feedback limits exploration of feasible actions. Traditional dense rewards provide continuous feedback and enable gradual convergence, but terminal velocity deviations are not fully corrected at later stages, which leads to suboptimal convergence and incomplete satisfaction of terminal constraints. In contrast, the improved dense reward guides the agent toward favorable behaviors from early training stages while penalizing undesirable actions at each step, which accelerates convergence and improves robustness. The velocity guidance term allows anticipatory adjustments during mid-to-late approach phases rather than delaying corrections to the terminal stage, resulting in improved fuel efficiency. Simulation results show that the maneuvering spacecraft performs 10 impulsive maneuvers, achieving a terminal relative distance of 21.326 km, a relative velocity of 0.005 0 km/s, and a total fuel consumption of 111.2123 kg. To evaluate robustness under realistic uncertainties, 1,000 Monte Carlo simulations are performed. 
As summarized in Table 6, the mission success rate reaches 63.40%, and fuel consumption in all trials remains within acceptable bounds. In Case 2, PPO performance is compared with that of Deep Deterministic Policy Gradient (DDPG) for a multi-impulse fast-approach rendezvous mission. PPO results show five impulsive maneuvers, a terminal separation of 2.281 8 km, a relative velocity of 0.003 8 km/s, and a total fuel consumption of 4.148 6 kg. DDPG results show a fuel consumption of 4.322 5 kg, a final separation of 4.273 1 km, and a relative velocity of 0.002 0 km/s. Both methods satisfy mission requirements with comparable fuel use. However, DDPG requires a training time of 9 h 23 min, whereas PPO converges within 6 h 4 min, indicating lower computational cost. Overall, the improved PPO framework provides better learning efficiency, policy stability, and robustness.  Conclusions   The problem of autonomous long-distance rapid rendezvous under J2 perturbation and uncertainties is investigated, and a PPO-based trajectory optimization method is proposed. The results demonstrate that feasible maneuver trajectories satisfying terminal constraints can be generated under limited fuel and transfer time, with improved convergence speed, fuel efficiency, and robustness. The main contributions include: (1) development of an orbital dynamics framework that incorporates J2 perturbation and uncertainty modeling, with formulation of the rendezvous problem as a Markov Decision Process; (2) design of an enhanced dense reward function that combines position potential and velocity guidance, which improves training stability and convergence efficiency; and (3) simulation-based validation of PPO robustness in complex orbital environments. Future work will address sensor noise, environmental disturbances, and multi-spacecraft cooperative rendezvous in more complex mission scenarios to further improve practical applicability and generalization.
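The reward-shaping idea above can be sketched as a position potential plus a velocity-guidance term; the functional forms and weights below are illustrative only, not the paper's actual reward design:

```python
import numpy as np

def dense_reward(pos, vel, target, w_p=1.0, w_v=0.5):
    """Sketch of a dense reward: a position potential that increases as the
    chaser closes on the target, plus a velocity-guidance term that rewards
    closing speed along the line of sight. Weights are illustrative."""
    rel = target - pos
    dist = np.linalg.norm(rel)
    los = rel / dist                      # unit line-of-sight vector
    return -w_p * dist + w_v * float(vel @ los)

target = np.zeros(3)
far = dense_reward(np.array([100.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]), target)
near = dense_reward(np.array([10.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]), target)
receding = dense_reward(np.array([10.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), target)
```

Per-step feedback of this kind is what replaces the sparse terminal reward: the agent is rewarded both for being closer and for moving toward the target, so velocity corrections happen during the approach rather than at the terminal stage.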
Auxiliary Screening for Hypertrophic Cardiomyopathy With Heart Failure with Preserved Ejection Fraction Utilizing Smartphone-Acquired Heart Sound Analysis
DONG Xianpeng, MENG Xiangbin, ZHANG Kuo, FANG Guanchen, GAI Weihao, WANG Wenyao, WANG Jingjia, GAO Jun, PAN Junjun, TANG Zhenchao, SONG Zhen
Available online  , doi: 10.11999/JEIT250830
Abstract:
  Objective  Heart Failure with preserved Ejection Fraction (HFpEF) is highly prevalent among patients with Hypertrophic CardioMyopathy (HCM), and early identification is critical for improving disease management. However, early screening for HFpEF remains challenging because symptoms are non-specific, diagnostic procedures are complex, and follow-up costs are high. Smartphones, owing to their wide accessibility, low cost, and portability, provide a feasible means to support heart sound-based screening. In this study, smartphone-acquired heart sounds from patients with HCM are used to develop and train an ensemble learning classification model for early detection and dynamic self-monitoring of HFpEF in the HCM population.  Methods  The proposed HFpEF screening framework consists of three components: preprocessing, feature extraction, and model training and fusion based on ensemble learning (Fig. 1). During preprocessing, smartphone-acquired heart sounds are subjected to bandpass filtering and wavelet denoising to improve signal quality, followed by segmentation into individual cardiac cycles. For feature extraction, Mel-Frequency Cepstral Coefficients (MFCCs) and Short-Time Fourier Transform (STFT) time-frequency spectra are calculated (Fig. 3). For classification, a stacking ensemble strategy is applied. Base learners, including a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN), are trained, and their predicted probabilities are combined to construct a new feature space. A Logistic Regression (LR) meta-learner is then trained on this feature space to identify HFpEF in patients with HCM.  Results and Discussions  The classification performance of the three models is evaluated using the same patient-level independent test set. The SVM base learner achieves an Area Under the Curve (AUC) of 0.800, with an accuracy of 0.766, sensitivity of 0.659, and specificity of 0.865 (Table 5). 
The CNN base learner attains an AUC of 0.850, with an accuracy of 0.789, sensitivity of 0.622, and specificity of 0.944 (Table 5). By comparison, the ensemble-based LR classifier demonstrates superior performance, reaching an AUC of 0.900, with an accuracy of 0.813, sensitivity of 0.768, and specificity of 0.854 (Table 5). Relative to the base learners, the ensemble model exhibits a significant overall performance improvement after probability-based feature fusion (Fig. 5). Compared with existing clinical HFpEF risk scores, the proposed method shows higher predictive performance and stronger dynamic monitoring capability, supporting its suitability for risk stratification and follow-up warning in home settings. Compared with professional heart sound acquisition devices, the smartphone-acquired approach provides greater accessibility and cost efficiency, supporting its application in auxiliary HFpEF screening for high-risk HCM populations.  Conclusions  The challenges of clinical HFpEF screening in patients with HCM are addressed by proposing a smartphone-acquired heart sound analysis approach combined with an ensemble learning prediction model, resulting in an accessible and easily implemented auxiliary screening pipeline. The effectiveness of smartphone-based heart sound analysis for initial HFpEF screening in patients with HCM is validated, demonstrating its feasibility as an economical auxiliary tool for early HFpEF detection. This approach provides a non-invasive, convenient, and efficient screening strategy for patients with HCM complicated by HFpEF.
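The stacking strategy above can be illustrated end-to-end in a few lines: base-learner probabilities form a new feature space on which a logistic-regression meta-learner is fitted. Synthetic data and hand-built probabilistic classifiers stand in for the paper's SVM and CNN base learners:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy two-class data standing in for per-cycle heart-sound feature vectors
X = np.vstack([rng.normal(-1.0, 1.0, (100, 4)), rng.normal(1.0, 1.0, (100, 4))])
y = np.r_[np.zeros(100), np.ones(100)]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two stand-in base learners; each maps a sample to a positive-class probability.
p1 = sigmoid(3.0 * X[:, :2].mean(axis=1))
p2 = sigmoid(3.0 * X[:, 2:].mean(axis=1))

# Stacking: base probabilities (plus a bias column) become the meta-features,
# and a logistic-regression meta-learner is fitted by plain gradient descent.
Z = np.column_stack([p1, p2, np.ones(len(y))])
w = np.zeros(3)
for _ in range(2000):
    w -= 0.5 * Z.T @ (sigmoid(Z @ w) - y) / len(y)

acc = np.mean((sigmoid(Z @ w) > 0.5) == y)
```

In a real pipeline the base probabilities fed to the meta-learner would come from out-of-fold predictions to avoid leakage; that detail is elided here.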
A Review of Research on Voiceprint Fault Diagnosis of Transformers
GONG Wenjie, LIN Guosong, WEI Xiaoguang
Available online  , doi: 10.11999/JEIT251076
Abstract:
  Significance   Voiceprint fault diagnosis of transformers has become an active research area for ensuring the safe and reliable operation of power systems. Traditional monitoring methods, such as dissolved gas analysis, infrared temperature measurement, and online partial discharge monitoring, exhibit limited real-time capability and rely heavily on expert experience. These limitations hinder effective detection of early-stage faults. Voiceprint fault diagnosis captures operational voiceprint signals from transformers and enables non-contact monitoring for early anomaly warning. This approach offers advantages in real-time performance, sensitivity, and fault coverage. This review systematically traces the technological evolution from traditional signal analysis to deep learning and compares the advantages, limitations, and application scenarios of different models across multiple dimensions. Key challenges are identified, including limited robustness to noise and imbalanced datasets. Potential research directions are proposed, including integration of physical mechanisms with data-driven methods and improvement of diagnostic transparency and interpretability. These analyses provide theoretical support and practical guidance for promoting the transition of voiceprint fault diagnosis from laboratory research to engineering applications.  Progress   Research on voiceprint fault diagnosis of transformers has progressed from traditional signal analysis to an intelligent recognition paradigm based on deep learning, reflecting a clear technological evolution. A bibliometric analysis of 188 papers from the CNKI and Web of Science databases shows that annual publications remained at 1–10 papers between 1997 and 2020, corresponding to an exploratory stage. Studies during this period focused mainly on fundamental voiceprint signal processing methods, including acoustic wave detection, wavelet transform, and Empirical Mode Decomposition (EMD). 
After 2020, Variational Mode Decomposition (VMD), Mel spectrum, and Mel Frequency Cepstral Coefficient (MFCC) were gradually applied to voiceprint feature extraction. Since 2021, publication output has increased rapidly and reached a historical peak in 2023. This growth was driven by advances in image and speech processing technologies. Early studies emphasized time-domain and frequency-domain analysis of voiceprint signals. Recent research increasingly converts voiceprint signals into two-dimensional time–frequency spectrogram representations. Model architectures have evolved from single-channel feature inputs with single-model outputs to complex frameworks with multi-channel feature extraction and multi-model fusion. Classical machine learning models, including Gaussian Mixture Model (GMM), Support Vector Machine (SVM), Random Forest (RF), and Back Propagation Neural Network (BPNN), form the foundation of voiceprint fault diagnosis but are limited in handling high-dimensional features. Deep learning models, such as Convolutional Neural Network (CNN), Residual Neural Network (ResNet), Recurrent Neural Network (RNN), and Transformer, demonstrate advantages in automatic feature extraction and complex pattern recognition, although they require substantial computational resources.  Conclusions  This review summarizes the technological development of transformer voiceprint fault diagnosis from machine learning to deep learning. Although deep learning methods achieve high recognition accuracy for complex voiceprint signals, five major challenges remain. These challenges include limited robustness to noise in non-stationary environments, severe data imbalance caused by scarce fault samples, the black-box nature of deep learning models, fragmented evaluation systems resulting from inconsistent data acquisition standards, and insufficient cross-modal fusion of multi-source data. 
Sensitivity to environmental noise limits diagnostic performance under varying operating conditions. Data imbalance reduces recognition accuracy for rare fault types. Limited interpretability restricts fault mechanism analysis and diagnostic credibility. Inconsistent sensor placement and sampling parameters lead to poor comparability across datasets. Single-modal voiceprint analysis restricts effective utilization of complementary information from other data sources. Addressing these challenges is essential for advancing voiceprint fault diagnosis from laboratory validation to field deployment.  Prospects   Future research should focus on five directions. First, noise-robust voiceprint feature extraction methods based on physical mechanisms should be developed to address non-stationary interference in complex operating environments. Second, the lack of real-world fault data should be alleviated by constructing electromagnetic field–structural mechanics–acoustic coupling models of transformers to generate high-fidelity voiceprint fault samples, while unsupervised clustering methods should be applied to improve annotation efficiency and quality. Third, explainable deep learning architectures for voiceprint fault diagnosis that incorporate physical mechanisms should be designed. Attention mechanisms combined with SHapley Additive exPlanations, Grad-CAM, and physical equations can support process-level and post hoc interpretation of diagnostic results. Fourth, industry-wide collaboration is required to establish standardized voiceprint data acquisition protocols, benchmark datasets, and unified evaluation systems. Fifth, cross-modal fusion models based on multi-channel and multi-feature analysis should be developed to enable integrated transformer fault diagnosis through comprehensive utilization of multi-source information.
T3FRNet: A Cloth-Changing Person Re-identification via Texture-aware Transformer Tuning Fine-grained Reconstruction Method
ZHUANG Jianjun, WANG Nan
Available online  , doi: 10.11999/JEIT250476
Abstract:
  Objective  Compared with conventional person re-identification, Cloth-Changing Person Re-Identification (CC Re-ID) requires moving beyond reliance on the temporal stability of appearance features and instead demands models with stronger robustness and generalization to meet real-world application requirements. Existing deep feature representation methods leverage salient regions or attribute information to obtain discriminative features and mitigate the effect of clothing variations; however, their performance often degrades under changing environments. To address the challenges of effective feature extraction and limited training samples in CC Re-ID tasks, a Texture-Aware Transformer Tuning Fine-Grained Reconstruction Network (T3FRNet) is proposed. The method aims to exploit fine-grained information in person images, enhance the robustness of feature learning, and reduce the adverse effect of clothing changes on recognition performance, thereby alleviating performance bottlenecks under scene variations.  Methods  To compensate for the limitations of local receptive fields, a Transformer-based attention mechanism is integrated into a ResNet50 backbone, forming a hybrid architecture referred to as ResFormer50. This design enables spatial relationship modeling on top of local features and improves perceptual capacity for feature extraction while maintaining a balance between efficiency and performance. A fine-grained Texture-Aware (TA) module concatenates processed texture features with deep semantic features, improving recognition capability under clothing variations. An Adaptive Hybrid Pooling (AHP) module performs channel-wise autonomous aggregation, allowing deeper mining of feature representations and balancing global representation consistency with robustness to clothing changes. An Adaptive Fine-Grained Reconstruction (AFR) strategy introduces adversarial perturbations and selective reconstruction at the fine-grained level. 
Without explicit supervision, this strategy enhances robustness and generalization against clothing changes and local detail perturbations. In addition, a Joint Perception Loss (JP-Loss) is constructed by integrating fine-grained identity robustness loss, texture feature loss, identity classification loss, and triplet loss. This composite loss jointly supervises the learning of robust fine-grained identity features under cloth-changing conditions.  Results and Discussions  Extensive evaluations are conducted on LTCC, PRCC, Celeb-reID, and the large-scale DeepChange dataset (Table 1). Under cloth-changing scenarios, the proposed method achieves Rank-1/mAP scores of 45.6%/19.8% on LTCC, 70.6%/69.1% on PRCC (Table 2), 64.6%/18.4% on Celeb-reID (Table 3), and 58.0%/20.8% on DeepChange (Table 4), outperforming existing state-of-the-art approaches. The TA module effectively captures latent local texture details and, when combined with the AFR strategy, enables fine-grained adversarial perturbation and selective reconstruction. This improves fine-grained feature representation and allows the method to achieve 96.2% Rank-1 and 89.3% mAP on the clothing-consistent Market-1501 dataset (Table 5). The JP-Loss further supports the TA module and AFR strategy by enabling fine-grained adaptive regulation and clustering of texture-sensitive identity features (Table 6). When the Transformer-based attention mechanism is inserted after stage 2 of ResNet50, improved local structural perception and global context modeling are obtained with only a slight increase in computational overhead (Table 7). Setting the \begin{document}$ \beta $\end{document} parameter to 0.5 (Fig. 5) enables effective balancing of global texture consistency and local fine-grained discriminability. Visualization results on PRCC (Fig. 6) and top-10 retrieval comparisons (Fig. 7) provide intuitive evidence of improved stability and accuracy in cloth-changing scenarios.  
Conclusions  A CC Re-ID method based on T3FRNet is proposed, consisting of the ResFormer50 backbone, TA module, AHP module, AFR strategy, and JP-Loss. Experimental results on four cloth-changing benchmarks and one clothing-consistent dataset confirm the effectiveness of the proposed approach. Under long-term scenarios, Rank-1/mAP improvements of 16.8%/8.3% on LTCC and 30.4%/32.9% on PRCC are achieved. The ResFormer50 backbone supports spatial relationship modeling over local fine-grained features, while the TA module and AFR strategy enhance feature expressiveness. The AHP module balances sensitivity to local textures and stability of global features, and JP-Loss strengthens adaptive regulation of fine-grained representations. Future work will focus on simplifying the architecture to reduce computational complexity and latency while maintaining high recognition accuracy.
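The channel-wise aggregation performed by the AHP module can be sketched as a learnable blend of global average and global max pooling. The sigmoid-gated mix below is an assumption for illustration (the abstract does not specify the exact gating), and `adaptive_hybrid_pool` and `alpha` are hypothetical names:

```python
import numpy as np

def adaptive_hybrid_pool(feat, alpha):
    """Blend global average and global max pooling per channel.

    feat  : (C, H, W) feature map
    alpha : (C,) learnable logits; sigmoid(alpha) weights avg vs. max
    """
    w = 1.0 / (1.0 + np.exp(-alpha))      # per-channel mixing weight in (0, 1)
    avg = feat.mean(axis=(1, 2))          # global average pooling, shape (C,)
    mx = feat.max(axis=(1, 2))            # global max pooling, shape (C,)
    return w * avg + (1.0 - w) * mx       # channel-wise autonomous aggregation

feat = np.random.rand(4, 8, 8)
pooled = adaptive_hybrid_pool(feat, np.zeros(4))  # alpha = 0 gives an equal blend
```

In this sketch the average branch favors global representation consistency while the max branch stays sensitive to local texture peaks, and the learned per-channel weight trades the two off.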
Power Allocation for Downlink Short Packet Transmission with Superimposed Pilots in Cell-free Massive MIMO
SHEN Luyao, ZHOU Xingguang, XU Zile, WANG Yihang, XIA Wenchao, ZHU Hongbo
Available online  , doi: 10.11999/JEIT250655
Abstract:
  Objective  With the advancement of 5th Generation mobile communication, the volume of communication service interactions increases rapidly. To meet this growth in demand, Cell-Free Massive Multiple-Input Multiple-Output (CF-mMIMO) is regarded as a key technology. Multi-user access in CF-mMIMO systems creates complexity in channel estimation. Conventional methods based on Regular Pilots (RP) generate high overhead, which reduces the number of symbols available for data transmission. This reduction lowers the transmission rate, and the effect is stronger in short packet transmission. This study examines a downlink short packet transmission scheme based on Superimposed Pilots (SP) in CF-mMIMO systems to improve short packet transmission performance.  Methods  This study examines an SP-based downlink short packet transmission scenario in CF-mMIMO systems and proposes a power allocation algorithm. Considering energy consumption and resource constraints in practical settings, a User-Centric (UC) approach is used. Based on the Maximum Ratio Transmission (MRT) precoding scheme, a closed-form expression for the downlink achievable rate is derived under imperfect Channel State Information (CSI). Because pilot signals and data signals create cross-interference, an iterative optimization algorithm based on Geometric Programming (GP) and Successive Convex Approximation (SCA) is developed. The objective is to optimize the power allocation between pilot signals and data signals under the minimum data rate requirement and uplink and downlink power constraints. Using logarithmic function approximation and SCA, the non-convex optimization problem is converted into a GP problem, then an iterative algorithm is designed to obtain the solution. This study also compares the SP scheme with the RP scheme to show the superiority of the SP scheme and the proposed algorithm.  
Results and Discussions  Simulation results confirm the accuracy of the closed-form expressions for the downlink sum rate under both SP and RP schemes (Fig. 2). To assess the effectiveness of the proposed algorithm, a comparative analysis of weighted sum rate is conducted. The comparison considers the proposed power allocation algorithm under both the SP and RP schemes, as well as fixed power allocation under the SP scheme. The number of AP antennas (Fig. 3), the number of UEs (Fig. 4), block length (Fig. 5), and decoding error probability (Fig. 6) are treated as variables. The results show that the weighted sum rate achieved with the proposed power allocation algorithm under the SP scheme is higher than that achieved with the RP scheme and the fixed power allocation scheme.  Conclusions  This paper investigates the downlink power allocation problem under the SP scheme in CF-mMIMO systems for short packet transmission. The UC scheme is adopted to derive a closed-form expression for the lower bound of the downlink transmission rate under imperfect CSI and MRT precoding. The downlink weighted sum-rate maximization problem for the SP scheme is then formulated, and the non-convex problem is converted into a solvable GP problem through the SCA method. An iterative algorithm is employed to obtain the solution. Simulation results confirm the correctness of the closed-form expression for the transmission rate and show the superiority of the proposed power allocation algorithm.
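The logarithmic-approximation step that converts the non-convex rate expression into GP form can be illustrated with the standard SCA condensation of 1 + SINR by a monomial. This is a generic sketch of the technique, not the paper's exact derivation; `monomial_bound` is a hypothetical helper:

```python
import math

def monomial_bound(x0):
    """Local monomial approximation m(x) = c * x**a of f(x) = 1 + x at x0.

    Classical GP condensation step used in SCA: a and c are chosen so that
    m(x0) = 1 + x0 and m'(x0) = 1, which makes m(x) <= 1 + x for all x > 0
    (a lower bound that convexifies log(1 + SINR) in the GP iteration).
    """
    a = x0 / (1.0 + x0)
    c = (1.0 + x0) / (x0 ** a)
    return a, c

a, c = monomial_bound(2.0)            # linearization point SINR = 2
approx = lambda x: c * x ** a
```

Each SCA iteration re-condenses the rate around the previous solution, solves the resulting GP, and repeats until the power allocation converges.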
Multi-code Deep Fusion Attention Generative Adversarial Networks for Text-to-Image Synthesis
GU Guanghua, SUN Wenxing, YI Boyu
Available online  , doi: 10.11999/JEIT250516
Abstract:
  Objective  Text-to-image synthesis is a core task in multimodal artificial intelligence and aims to generate photorealistic images that accurately correspond to natural language descriptions. This capability supports a wide range of applications, including creative design, education, data augmentation, and human–computer interaction. However, simultaneously achieving high visual fidelity and precise semantic alignment remains challenging. Most existing Generative Adversarial Network (GAN) based methods condition image generation on a single latent noise vector, which limits the representation of diverse visual attributes described in text. Therefore, generated images often lack fine textures, subtle color variations, or detailed structural characteristics. In addition, although attention mechanisms enhance semantic correspondence, many approaches rely on single-focus attention, which is insufficient to capture the complex many-to-many relationships between linguistic expressions and visual regions. These limitations result in an observable discrepancy between textual descriptions and synthesized images. To address these issues, a novel GAN architecture, termed Multi-code Deep Feature Fusion Attention Generative Adversarial Network (mDFA-GAN), is proposed. The objective is to enhance text-to-image synthesis by enriching latent visual representations through multiple noise codes and strengthening semantic reasoning through a multi-head attention mechanism, thereby improving detail accuracy and textual faithfulness.  Methods  In the proposed mDFA-GAN, the generator incorporates three main components. First, a multi-noise input strategy is adopted, in which multiple independent noise vectors are used instead of a single latent noise vector, allowing different noise codes to capture different visual attributes such as structure, texture, and color. Second, a Multi-code Prior Fusion Module is designed to integrate these latent representations.
This module operates on intermediate feature maps and applies learnable channel-wise weights to perform adaptive weighted summation, producing a unified and detail-rich feature representation. Third, a Multi-head Attention Module is embedded in the later stages of the generator. This module computes attention between visual features and word embeddings across multiple attention heads, enabling each image region to attend to multiple semantically relevant words and improving fine-grained cross-modal alignment. Training is conducted using a unidirectional discriminator with a conditional hinge loss combined with a Matching-Aware zero-centered Gradient Penalty (MA-GP) to enhance training stability and enforce text–image consistency. In addition, a multi-code fusion loss is introduced to reduce variance among features derived from different noise codes, thereby promoting spatial and semantic coherence.  Results and Discussions  The proposed mDFA-GAN is evaluated on the CUB-200-2011 and MS COCO datasets. Qualitative results, as illustrated in Fig. 7 and Fig. 8, indicate that the proposed method generates images with accurate colors, fine-grained details, and coherent complex scenes. Subtle textual attributes, such as specific plumage patterns and object shapes, are effectively captured. Quantitative evaluation demonstrates state-of-the-art performance. An Inception Score (IS) of 4.82 is achieved on the CUB-200-2011 dataset (Table 2), reflecting improved perceptual quality and semantic consistency. Moreover, the lowest Fréchet Inception Distance (FID) values of 13.45 on CUB-200-2011 and 16.50 on MS COCO are obtained (Table 3), indicating that the generated images are statistically closer to real samples. Ablation experiments confirm the contribution of each component.
Performance degrades when either the Multi-code Prior Fusion Module or the Multi-head Attention Module is removed (Table 4), and the multi-code fusion loss is shown to be critical for training stability and synthesis quality (Table 5). Further analysis identifies three noise codes as the optimal configuration (Table 6). In terms of efficiency, the model achieves an inference time of 0.8 seconds per image (Table 7), maintaining the efficiency advantage of GAN-based methods.  Conclusions  A novel text-to-image synthesis framework, mDFA-GAN, is proposed to address limited fine-grained detail representation and insufficient semantic alignment in existing GAN-based methods. By decomposing the latent space into multiple noise codes and adaptively fusing them, the model enhances its capacity to generate detailed visual content. The integration of multi-head cross-modal attention enables more accurate and context-aware semantic grounding. Experimental results on benchmark datasets demonstrate that mDFA-GAN achieves state-of-the-art performance, as evidenced by improved IS and FID scores and high-quality visual results. Ablation studies further validate the necessity and complementary effects of the proposed components. The framework provides both an effective solution for text-to-image synthesis and useful architectural insights for future research in multimodal representation learning.
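The multi-head cross-attention between image regions and word embeddings can be sketched as below. Learned query/key/value projections are omitted for brevity (values here simply reuse the word slice), so this is a simplified stand-in for the module rather than the paper's implementation:

```python
import numpy as np

def multihead_cross_attention(regions, words, heads):
    """Scaled dot-product cross-attention: image regions attend to words.

    regions : (R, D) region features used as queries
    words   : (W, D) word embeddings used as keys and values
    heads   : number of attention heads; D must be divisible by heads
    """
    R, D = regions.shape
    dh = D // heads
    out = np.empty_like(regions)
    for h in range(heads):                        # each head sees a D/heads slice
        q = regions[:, h * dh:(h + 1) * dh]
        k = words[:, h * dh:(h + 1) * dh]
        v = k                                     # no value projection in this sketch
        scores = q @ k.T / np.sqrt(dh)            # (R, W) region-word similarity
        scores -= scores.max(axis=1, keepdims=True)
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)   # softmax over words
        out[:, h * dh:(h + 1) * dh] = attn @ v    # word context gathered per region
    return out

ctx = multihead_cross_attention(np.random.rand(6, 8), np.random.rand(5, 8), heads=2)
```

Because each head attends over the words independently, a single image region can align with several semantically relevant words at once, which is the many-to-many behavior single-focus attention lacks.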
A Family of Linear Codes and Their Subfield Codes
CHAI Ye, ZHU Shixin, KAI Xiaoshan
Available online  , doi: 10.11999/JEIT250775
Abstract:
  Objective  The study of weight distributions of linear codes is fundamental in both theory and applications. Weight distributions indicate the error-correcting capability of a code and allow the calculation of error probabilities for detection and correction. Linear codes with few weights also find applications in secret sharing, strongly regular graphs, association schemes, and authentication codes. Therefore, the construction of linear codes with few weights has attracted sustained attention. Subfield codes of linear codes over finite fields have recently received considerable interest because they can yield optimal codes with potential applications in data storage systems and communication systems. In recent years, subfield codes of linear codes over finite fields with good parameters have been widely studied. Motivated by these constructions, a different defining set is selected to extend existing results. The objectives of this paper are to study the weight distributions and dual codes of this class of linear codes and their punctured codes, and to investigate their subfield codes to obtain linear codes with few weights.  Methods  The selection of the defining set is a key step in the analysis. The calculation of weight distributions relies on decomposing elements of finite fields into their subfields and applying the first four Pless power moments. Using known results on Kloosterman sums over finite fields, the lengths and weight distributions of this class of linear codes admit closed-form expressions and are completely determined in the binary case. The parameters of their dual codes are also determined and are optimal or almost optimal in the binary case. Trace representations of the subfield codes of this class of codes and their punctured codes are derived. Properties of characters over finite fields are then used to determine the parameters, weight distributions, and dualities of these subfield codes.  
Results and Discussions  By selecting an appropriate defining set and using Kloosterman sums over finite fields, the parameters and weight distributions of a family of q-ary linear codes with few weights and their punctured codes are completely determined. Their dual codes and subfield codes are also examined and are shown to be length-optimal and dimension-optimal with respect to the Sphere-packing bound. A class of eight-weight linear codes and their punctured codes is constructed. The corresponding dual codes are all AMDS linear codes, and they are length-optimal and dimension-optimal linear codes with respect to the Sphere-packing bound (see Theorems 1 and 2, and Tables 1 and 2). The parameters and weight distributions of their subfield codes and the corresponding dual codes are provided (see Theorem 3 and Table 3). In addition, the subfield codes of the punctured codes are studied, and the weight distributions and duality of these codes are determined (see Theorem 4 and Table 4). All results are verified using Magma through two examples.  Conclusions  A family of q-ary linear codes with few weights and their punctured codes is studied. Based on Kloosterman sums over finite fields, the weight distributions and parameters of the codes and their dual codes are determined, yielding optimal linear codes with respect to the Sphere-packing bound. The weight distributions of their subfield codes and the parameters of the corresponding dual codes are also determined, resulting in few-weight binary linear codes.
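A Kloosterman sum can be evaluated directly from its definition. A minimal sketch over a prime field F_p (the paper works over general finite fields; this demo restricts to p prime so the inverse is a simple modular power):

```python
import cmath

def kloosterman(p, a, b):
    """Kloosterman sum K(a, b; p) = sum over x in F_p^* of e((a*x + b*x^{-1}) / p),
    where e(t) = exp(2*pi*i*t) and p is prime."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, p - 2, p)   # modular inverse via Fermat's little theorem
        total += cmath.exp(2j * cmath.pi * (a * x + b * x_inv) / p)
    return total

K = kloosterman(7, 1, 1)
```

Two standard sanity checks apply: the sum is real (pairing x with -x conjugates each term), and the Weil bound gives |K(a, b; p)| <= 2*sqrt(p); both are used implicitly when such sums determine weight distributions.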
Intelligent Unmanned Aerial Vehicles for Low-altitude Economy: A Review of the Technology Framework and Future Prospects
QIAN Zhihong, WANG Yijun
Available online  , doi: 10.11999/JEIT251246
Abstract:
  Significance  The deep integration of new quality productive forces with the digital economy accelerates the development of the low-altitude economy and positions it as an emerging driver of global economic growth. Operating in airspace typically below 3 000 m, this industrial system supports diverse applications, including Unmanned Aerial Vehicle (UAV) logistics, Urban Air Mobility (UAM), industrial inspection, and public safety. Intelligent UAVs, characterized by cost efficiency, scalability, and autonomous capability, function as the core technical enabler of this ecosystem. Their deployment promotes a transition in aviation from centralized and isolated operation modes toward distributed, intelligent, and service-oriented aerial utilization. From a strategic perspective, intelligent UAVs contribute to industrial upgrading, urban infrastructure improvement, airspace security assurance, and regional economic development. Therefore, a systematic review and structured construction of an intelligent UAV technology framework is necessary to support future research, clarify key challenges, and promote sustained development of the low-altitude economy.   Progress   A holistic technology framework for intelligent UAVs is constructed, organized hierarchically from foundational technologies to application-oriented systems. The framework integrates four interrelated domains. Intelligent perception and navigation emphasize stable operation in complex environments through tightly coupled multi-sensor fusion and advanced state estimation methods, such as visual-inertial odometry, supported by multi-source adaptive positioning in Global Navigation Satellite System (GNSS)-denied scenarios. Wireless communication networks focus on reliable Beyond-Visual-Line-Of-Sight (BVLOS) connectivity by combining cellular network access, self-organizing flying ad hoc networks (FANETs) with intelligent topology control, and UAV-assisted edge computing for efficient resource scheduling. 
Autonomous decision-making and cooperative control evolve from classical rule-based approaches toward learning-based paradigms, where multi-agent reinforcement learning enables coordinated swarm behavior and adaptive task execution. Low-altitude security and airspace management provide essential system support through integrated detection and countermeasure technologies, supplemented by UAV cloud platforms and Unmanned aircraft system Traffic Management (UTM) for coordinated airspace operation.   Conclusions   The review indicates that UAVs are transitioning from isolated platforms to interconnected intelligent nodes embedded within the low-altitude economy system. Although substantial progress has been achieved across multiple technological domains, several critical challenges remain. Major technical constraints include maintaining communication reliability in complex low-altitude channels, addressing perception degradation in cluttered or deceptive environments, achieving robust autonomous cooperation under uncertainty, and overcoming the inherent limitations of existing energy and power technologies. These technical issues coexist with non-technical barriers, such as the establishment of adaptive regulatory and airspace governance frameworks, the formation of scalable and sustainable business models, and the enhancement of public acceptance. The analysis suggests that addressing these challenges requires deep integration of enabling technologies. A closed-loop evolution paradigm of “challenge-driven → technology fusion → system construction → feedback iteration” is proposed to describe the intrinsic iterative logic of technological development and to provide methodological guidance for future research and engineering practice.   Prospects   Future intelligent UAV development is expected to concentrate on several strongly coupled directions. 
Intelligent holistic communication will advance through deep integration of air-ground-space networks and Integrated Sensing And Communication (ISAC), forming a proactive data environment that supports predictive resource management and resilient connectivity. Cognitive swarm intelligence will promote the transformation of UAV clusters into cooperative cognitive systems by combining large language models for task comprehension with multi-agent reinforcement learning for decentralized decision-making, enabling emergent collective intelligence. High-assurance autonomous security will rely on formal verification of artificial intelligence models, explainable decision mechanisms, and extensive application of digital twins for virtual validation and certification, thereby strengthening operational trust. In parallel, green and sustainable technologies will influence the full lifecycle of UAV systems, encouraging advances in high-energy-density power solutions, including solid-state batteries and hydrogen fuel cells, the use of environmentally friendly materials, and artificial intelligence-based optimization of energy consumption and acoustic performance, which together support the long-term sustainability of the low-altitude economy.
Research on Snow Depth Measurement Technology Based on Dual-Band Microwave Open Resonant Cavity
LI Mengyao, ZHANG Pengfei, FENG Hao, MA Zhongfa
Available online  , doi: 10.11999/JEIT250724
Abstract:
  Objective  Large-scale winter snowfall poses a significant threat to the safety of outdoor infrastructure, including power transmission and communication systems. Real-time monitoring of snow depth within the range of 1~30 mm is required for accurate early warning and effective snow removal scheduling. Satellite- and radar-based techniques are mainly applied to snow depths exceeding 10 cm, but their large size and limited spatial resolution restrict their applicability to near-surface measurements. Although recently developed planar resonant sensors based on the resonance principle improve measurement accuracy, their effective measurement range remains limited. To resolve the trade-off between measurement range and accuracy, a rectangular microwave open resonant cavity featuring a dual-cavity, dual-feed, and dual-frequency-band configuration is proposed in combination with a data inversion algorithm. This scheme achieves a wide dynamic range of 1~30 mm while maintaining a measurement accuracy of 1 mm. The proposed device meets the monitoring requirements for snow depth corresponding to six snowfall intensity grades, ranging from light snow to heavy snowstorms.  Methods  The research methodology consists of four main stages. First, the phase-matching condition of the resonator formed by the open-ended waveguide and the snow layer is used to derive an analytical relationship between resonant frequency and snow depth, thereby verifying the feasibility of the measurement principle. Subsequently, a single-cavity model with coaxial feed is designed and simulated to evaluate its sensitivity to snow depths from 1 to 25 mm and to determine the corresponding operating frequency band. To further extend the measurement range, a dual-cavity, dual-feed model is constructed using either a metal plate or a Frequency Selective Surface (FSS) as a separator. 
A segmented measurement strategy is adopted, in which the large cavity and small cavity are responsible for different snow thickness intervals, enabling stable measurements with a precision of 1 mm over the full 1~30 mm range under different snow conditions. Finally, an optimal data inversion scheme is selected and implemented to further improve measurement accuracy.  Results and Discussions  A snow depth measurement technique based on a dual-band open-ended microwave resonant cavity is demonstrated. The dynamic measurement range is extended from 1~25 mm (Fig. 4) for the single-cavity configuration to 1~30 mm (Fig. 9) for the dual-cavity configuration. Simulation results show that the dual-cavity model maintains stable performance under variations in snow physical properties (Figs. 10-13). As snow depth increases, the resonant frequency exhibits a regular shift toward lower frequencies (Fig. 9(a)), whereas the attenuation remains below –10 dB (Fig. 9(b)), achieving a measurement precision of 1 mm. Experimental results show trends consistent with the simulations (Fig. 15). When combined with the data inversion scheme, the inversion error is less than 0.16 mm (Table 5), satisfying the requirements for both wide dynamic range and high measurement accuracy.  Conclusions  A dual-cavity, dual-feed, and dual-frequency snow depth measurement method employing either a metal plate or an FSS plate as a cavity separator is proposed. The limited dynamic range of conventional single-cavity designs is addressed through the constructed dual-cavity architecture. Measurement resolution is improved by assigning different snow thickness ranges to the two frequency bands and applying a data inversion algorithm. Experimental results demonstrate that the proposed method enables segmented measurement of snow depths from 1~30 mm, with an inversion accuracy of 0.16 mm and a measured precision better than 1 mm.
The effects of variations in snow density and snow moisture content on resonant frequency and attenuation are analyzed. For future research, machine learning methods are suggested to associate measurement parameters with meteorological parameters, thereby improving measurement accuracy and extending the early-warning capability of the system.
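The phase-matching idea behind the measurement can be illustrated with a simplified one-dimensional model. The formula below is an assumption for illustration only (round-trip phase through an air cavity of length L plus a snow layer of depth d and relative permittivity eps_r equals an integer multiple of 2*pi), not the paper's exact derivation:

```python
import math

C = 3e8  # speed of light in vacuum, m/s

def resonant_freq(m, cavity_len, snow_depth, eps_r):
    """Illustrative phase-matching model: resonance of mode m occurs when
        2 * (2*pi*f/C) * (L + sqrt(eps_r) * d) = 2*pi*m,
    giving f = m * C / (2 * (L + sqrt(eps_r) * d)).
    A thicker or denser snow layer lengthens the electrical path, so the
    resonant frequency shifts downward, matching the reported trend.
    """
    return m * C / (2.0 * (cavity_len + math.sqrt(eps_r) * snow_depth))

f_dry = resonant_freq(3, 0.05, 0.0, 1.5)    # 50 mm cavity, no snow
f_snow = resonant_freq(3, 0.05, 0.02, 1.5)  # same cavity, 20 mm snow layer
```

Inverting this monotonic frequency-depth relationship is what the data inversion algorithm exploits to recover snow depth from a measured resonance.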
AoI-prioritized Multi-UAV Deployment and Resource Allocation Method in Scenarios with Differentiated User Requirements
JIN Feihong, ZHANG Jing, XIE Yaqin
Available online  , doi: 10.11999/JEIT251062
Abstract:
  Objective  In emergency scenarios such as natural disasters, ground-based fixed base stations are often damaged and may not be restored promptly. Because Unmanned Aerial Vehicles (UAVs) provide flexibility and low cost, UAV-assisted emergency communication has gained growing attention from academia and industry. However, existing studies on bandwidth and power allocation often overlook the heterogeneity of traffic demands among different Ground Users (GUs). They also do not fully address the effect of Age of Information (AoI) on the timeliness of emergency decision-making. Given differentiated traffic requirements and the direct effect of AoI on emergency response, this study proposes an AoI-based joint UAV deployment and resource allocation method for emergency communication. The objectives are: (1) to determine the minimum number of UAVs required while meeting the total GU traffic demand, and (2) to jointly optimize bandwidth, power, and Three-Dimensional (3D) UAV positions to minimize the system’s average AoI.  Methods  A two-stage approach that combines the Multiple UAV Deployment (MUD) algorithm and the Bandwidth, Power, and 3D Location (BPL) algorithm is proposed. For UAV quantity determination, the Particle Swarm Optimization (PSO) algorithm calculates the traffic density of each uncovered GU. The GU with the highest traffic density is selected as the core, and its adjacent GUs form a cluster. PSO optimizes the cluster position to maximize covered traffic volume while meeting UAV service constraints and determines the minimum number of UAVs required. For joint resource and position optimization, the BPL algorithm allocates bandwidth, power, and 3D locations. Bandwidth allocation uses an improved relaxation adjustment method in which weights are assigned based on GU data transmission time, and subchannels are allocated dynamically to balance transmission time. Power allocation follows the same structure. 
For 3D position optimization, the Whale Optimization Algorithm (WOA) is applied. After fixing the UAV’s horizontal position, the minimum height needed for coverage is derived using ellipse characteristics to reduce energy consumption. This converts the 3D search into a 2D search for the optimal position.  Results and Discussions  Simulation results confirm the effectiveness of the method. In a scenario with 100 GUs distributed randomly in a 1 km × 1 km area, 7 UAVs are required to achieve a 90% coverage rate (Fig. 2). The system’s average AoI under this deployment meets basic real-time communication requirements. Compared with benchmark algorithms such as Weighted K-Means (WKM) and Minimum Degree Prior (MDP), the MUD algorithm consistently uses fewer UAVs under different conditions of area size, GU quantity, and UAV service capability (Fig. 3). As the maximum GU traffic demand increases, data transmission time increases, which raises the required UAV count, whereas UAV climbing time decreases because cluster radii are smaller. Therefore, the average AoI shows a slight decrease (Fig. 4). The improved allocation method yields better performance than average allocation. It reduces the maximum GU data transmission time by 26.35% (Fig. 5(a)) and assigns 16.7% more bandwidth and power to high-traffic GUs (Fig. 5(b)). This leads to more balanced transmission times and higher resource use efficiency. When compared with NBPL (no Bandwidth-Power and Location optimization), OL (Only Location optimization), and OBP (Only Bandwidth-Power optimization), the full BPL (Bandwidth-Power and Location optimization) algorithm achieves the lowest average AoI under different GU quantities. When the GU count is large, the BPL algorithm reduces the average AoI by about 21.1% compared with NBPL (Fig. 6(a)). The method also reaches the lowest total energy consumption per UAV among all compared schemes (Fig. 6(b)). Its computational complexity remains suitable for practical emergency deployment.
Conclusions  This study proposes an AoI-prioritized multi-UAV deployment and resource allocation method for emergency communication scenarios characterized by differentiated user traffic demands. The method integrates a PSO-enhanced MUD algorithm to determine the minimum UAV quantity and a BPL algorithm that jointly optimizes bandwidth, power, and 3D UAV positions using WOA and an improved allocation method. It meets three objectives: reducing UAV use, minimizing average AoI to maintain information freshness, and lowering energy consumption. Simulation results confirm advantages in deployment efficiency, AoI performance, and energy efficiency. Future work includes extending the method to non-LoS channel conditions, designing lower-complexity heuristic methods for larger-scale tasks, developing distributed optimization frameworks, and studying online joint trajectory and resource optimization methods for dynamic environments.
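The minimum-height step can be illustrated under a simplified coverage-cone assumption, a stand-in for the paper's ellipse-based derivation; `min_uav_height` and the half-angle parameter are hypothetical names:

```python
import math

def min_uav_height(gu_xy, uav_xy, half_angle_deg):
    """Minimum UAV altitude so every ground user lies inside a coverage cone
    of the given half-angle measured from nadir (simplified model): a GU at
    horizontal distance r is covered iff r <= h * tan(theta), hence
    h_min = r_max / tan(theta). Flying no higher than h_min saves climbing
    energy while keeping the cluster covered.
    """
    r_max = max(math.dist(g, uav_xy) for g in gu_xy)
    return r_max / math.tan(math.radians(half_angle_deg))

gus = [(0.0, 0.0), (300.0, 400.0), (100.0, 50.0)]   # GU positions, metres
h = min_uav_height(gus, (0.0, 0.0), 45.0)           # farthest GU at 500 m
```

With the height fixed analytically in this way, the WOA only needs to search the 2D horizontal plane, which is the dimensionality reduction described above.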
Optimization of Energy Consumption in Semantic Communication Networks for Image Recovery Tasks
CHEN Yang, MA Huan, JI Zhi, LI Yingqi, LIANG Jiayu, GUO Lan
Available online  , doi: 10.11999/JEIT250915
Abstract:
  Objective  With the rapid development of semantic communication and the increasing demand for high-fidelity image recovery, high computational and transmission energy consumption remains a key factor limiting network deployment. Existing resource management strategies are largely static and show limited adaptability to dynamic wireless environments and user mobility. To address these issues, a robust energy optimization strategy driven by a modified Multi-Agent Proximal Policy Optimization (MAPPO) algorithm is proposed. By jointly optimizing communication and computing resources, the total network energy consumption is minimized while strictly satisfying multi-dimensional constraints, including latency and image recovery quality.  Methods  First, a theoretical model of the semantic communication network is constructed, and a closed-form expression for the user Symbol Error Rate (SER) is derived through asymptotic analysis of the uplink Signal-to-Interference-plus-Noise Ratio (SINR). Subsequently, the coupling relationships among semantic extraction rate, transmit power, computing resources, and network energy consumption are quantified. On this basis, a joint optimization model is formulated to minimize total energy consumption under constraints of delay, accuracy, and reliability. To solve this mixed-integer nonlinear programming problem, a modified MAPPO algorithm is designed. The algorithm integrates Long Short-Term Memory (LSTM) networks to capture temporal dynamics of user positions and channel states, and introduces a noise mechanism into the global state and advantage function to improve policy exploration and robustness.  Results and Discussions  Simulation results show that the proposed algorithm consistently outperforms baseline methods, including standard MAPPO, NOISE-MAPPO, LSTM-MAPPO, MADDPG, and greedy algorithms. The proposed strategy accelerates training convergence by 66.7%–80% relative to the benchmarks. 
In dynamic environments, network energy consumption stability is improved by approximately 50%, and user latency stability is enhanced by more than 96%. Additionally, the average SER is reduced by 4%–16.33% without degrading final image recovery performance, demonstrating an effective balance between energy efficiency and task reliability.  Conclusions   This study addresses energy optimization in semantic communication networks by combining theoretical modeling with a modified deep reinforcement learning framework. The proposed decision-making approach enhances the standard MAPPO algorithm through LSTM-based temporal feature extraction and noise-assisted robust exploration. Simulation results in dynamic single-cell and multi-cell scenarios show that the method improves convergence efficiency and system stability, and achieves a favorable trade-off between energy consumption and service quality. These results provide a theoretical basis and an efficient resource management framework for future energy-constrained semantic communication systems.
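The link between SINR and SER can be illustrated with the textbook Gaussian Q-function. The BPSK expression below is a generic stand-in for intuition, not the paper's closed-form multi-cell result derived from the asymptotic SINR analysis:

```python
import math

def qfunc(x):
    """Gaussian Q-function, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bpsk_ser(sinr_linear):
    """Textbook BPSK symbol error rate over AWGN: SER = Q(sqrt(2 * SINR)).
    Raising transmit power raises SINR and drives SER down, which is the
    coupling the joint optimization trades against energy consumption."""
    return qfunc(math.sqrt(2.0 * sinr_linear))

ser = bpsk_ser(10 ** (10 / 10))   # SER at 10 dB SINR
```

Any closed-form SER of this shape is monotone in SINR, so the reliability constraint in the optimization model translates directly into a minimum-SINR (and hence power/bandwidth) requirement.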
Band-Limited Signal Compression Enabled Computationally Efficient Software-Defined Radio for Two-Way Satellite Time and Frequency Transfer
CHENG Long, DONG Shaowu, WU Wenjun, GONG Jianjun, WANG Weixiong, GAO Zhe
Available online  , doi: 10.11999/JEIT250705
Abstract:
  Objective  This study addresses key challenges in Two-Way Satellite Time and Frequency Transfer (TWSTFT) systems, with emphasis on the computational inefficiency and high resource consumption of Software-Defined Radio (SDR) receivers. Although TWSTFT provides excellent long-term stability and time-transfer precision, conventional hardware implementations exhibit significant diurnal effects. Existing mitigation approaches, such as fusion with GPS Precise Point Positioning, depend on auxiliary link quality and lack unified algorithms across international networks. SDR receivers reduce diurnal effects and improve accuracy; however, high sampling rates and multi-correlator processing impose excessive computational burdens that limit real-time multi-station operation. The objective is to develop a band-limited signal compression approach that preserves measurement resolution while substantially improving computational efficiency, thereby enabling scalable and high-performance time transfer across international timing laboratories.  Methods  A band-limited signal compression method tailored to TWSTFT is proposed by accounting for the distortion of Pseudo-Random Noise (PRN) code square-wave characteristics under bandwidth constraints. Bandwidth-matched filtering is first applied to the local PRN code replica to align its spectrum with the effective bandwidth of the received signal and suppress out-of-band noise. For received signals with different bandwidths, n groups (e.g., n = 1, 2, or 20) of phase-diversified, equally spaced PRN code subsequences are generated. The number of subsequence groups n satisfies n × Rchip ≥ 2 × Bandsignal, where Rchip denotes the sampling rate of the subsequences and Bandsignal represents the signal bandwidth. After bandpass filtering, the received signal undergoes parallel correlation with the phase-diversified PRN subsequences. 
The full correlation function is reconstructed by a linear combination of the n independent correlation outputs, each scaled by Nchip/n, where Nchip is the number of samples per PRN chip. Adaptive sampling-rate adjustment and resource-allocation strategies are applied to achieve efficient processing with preserved accuracy.  Results and Discussions  Experimental validation is performed on a TWSTFT platform at the National Time Service Center using TWSTFT links (NTSC–NIM, NTSC–SU, NTSC–PTB) and SATRE local-loop tests. Data from MJD 60 742 to MJD 60 749 are collected in accordance with ITU-R TF.1153.4. In local-loop tests, the proposed method provides the most stable Time of Arrival measurements while maintaining a high signal-to-noise ratio (Table 2). The time deviation of the proposed method is lower than that of traditional multi-correlator and conventional compression methods over all averaging times (Fig. 9). For operational links, superior short-term stability is observed across different baseline lengths (Fig. 10 and Fig. 11). With n = 1 and n = 2, processing speed increases by 795% and 707%, respectively, while GPU memory usage decreases by 89.77% and 84.65% (Table 4). The method supports up to 102 concurrent channels (n = 1), exceeding the 11-channel capacity of conventional approaches (Table 5). Increasing n beyond these values yields no further precision improvement but increases resource consumption, confirming an optimal trade-off between accuracy and efficiency.  Conclusions  A band-limited signal compression method is presented to address the computational constraints of TWSTFT SDR receivers. Parallel short-correlation processing combined with bandwidth-aware sampling achieves substantial gains in precision and efficiency. Experimental results confirm improved short-term stability across signal bandwidths and baseline lengths relative to conventional multi-correlator methods. 
The approach delivers large efficiency gains, with processing speed increases of 795% (n = 1) and 707% (n = 2) and GPU memory reductions of 89.77% and 84.65%, respectively. System scalability is markedly enhanced, supporting up to 102 concurrent channels. These results demonstrate an effective balance between performance and resource utilization for TWSTFT applications.
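The compression principle in the abstract above, reconstructing the full correlation function from n phase-diversified partial correlations, can be illustrated with a short NumPy sketch (an illustrative toy, not the authors' SDR implementation; the PRN length, delay, and noise level are assumed values, and because the subsequences here are zero-padded to full length their partial correlations sum exactly, so the paper's Nchip/n scaling does not appear in this variant):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy PRN code (+/-1 chips) and a received signal: delayed code plus noise.
n_chips = 1024
code = rng.choice([-1.0, 1.0], size=n_chips)
rx = np.roll(code, 37) + 0.1 * rng.standard_normal(n_chips)

def full_correlation(rx, code):
    """Circular cross-correlation via FFT (full-rate reference)."""
    return np.fft.ifft(np.fft.fft(rx) * np.conj(np.fft.fft(code))).real

def compressed_correlation(rx, code, n):
    """Sum of n partial correlations, each using one phase-diversified,
    equally spaced subsequence of the local code (zeros elsewhere)."""
    acc = np.zeros(len(code))
    for r in range(n):
        sub = np.zeros_like(code)
        sub[r::n] = code[r::n]          # every n-th chip, phase offset r
        acc += full_correlation(rx, sub)
    return acc

ref = full_correlation(rx, code)
rec = compressed_correlation(rx, code, n=4)

# The linear combination of partial correlations reproduces the full
# correlation function, so the code-phase estimate is unchanged.
assert np.allclose(ref, rec)
print(int(np.argmax(rec)))   # peak index = recovered code delay
```

The sum is exact because the n subsequences partition the code's chips; the efficiency gain reported in the paper comes from correlating each shorter subsequence at a reduced sampling rate, which this zero-padded sketch omits.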
A Frequency-Aware and Spatially Constrained Network for Ship Instance Segmentation in SAR Images
ZHANG Boya, WANG Yong
Available online  , doi: 10.11999/JEIT250938
Abstract:
  Objective  With the development of Synthetic Aperture Radar (SAR) imaging technology, ship instance segmentation in SAR images has become an important research direction in radar signal processing. Unlike traditional optical image segmentation tasks, SAR images reflect target backscatter intensity and usually contain objects with diverse scales and irregular spatial distributions, which poses significant challenges for ship instance segmentation. Although recent studies have achieved notable progress, existing networks do not fully exploit frequency features and spatial information of targets, resulting in classification and localization errors. To address this limitation, a frequency-aware and spatially constrained network is proposed to extract frequency features and spatial information from multiscale representations, thereby improving feature representation and instance segmentation accuracy in SAR images.  Methods  For input SAR images, a frequency-aware backbone network is first applied to extract frequency features at different scales. Features from the first four stages of the backbone network are then processed by a selective feature pyramid network to guide the model to focus on the most informative regions and to fuse multiscale features effectively. After enhanced multiscale features are obtained, a region proposal network is employed to generate candidate target proposals. These features and proposals are subsequently fed into a segmentation head with spatial information constraints to produce final instance segmentation results. The frequency-aware backbone network encodes multiscale features in the frequency domain, which strengthens feature extraction for ship targets. Based on image semantic information, the selective feature pyramid network enables effective attention to informative regions and integration of features across scales. 
In addition, a spatially constrained mask loss function is designed to update model parameters under constraints of centroid distance and directional deviation between predicted masks and ground-truth targets.  Results and Discussions  The effectiveness and robustness of the proposed network are validated on two public datasets, SSDD and HRSID. For the SSDD dataset, P, R, F1, AP0.5, AP0.75, and AP0.5–0.95 metrics are used for evaluation. Quantitative and qualitative comparisons (Figures 6 and 7, Table 1) indicate that the proposed network improves feature extraction and feature integration for SAR images, which enables more accurate segmentation of ships with different scales in complex backgrounds. For the HRSID dataset, AP0.5, AP0.75, and AP0.5–0.95 are reported for quantitative comparison. The results (Table 3) demonstrate strong adaptability and generalization capability across different datasets and application scenarios in ship instance segmentation tasks. Additionally, ablation experiments (Table 2) confirm the contribution of each module of the proposed network to segmentation performance improvement in SAR images.  Conclusions  A frequency-aware and spatially constrained network for ship instance segmentation in SAR images is proposed. The frequency-aware backbone network enhances feature perception for SAR imagery, whereas the selective feature pyramid network guides attention toward informative regions and improves segmentation of ship targets at different scales. The segmentation head incorporates spatial information constraints into the mask loss function, which yields more accurate instance segmentation results. Experimental results on the SSDD and HRSID datasets show that the proposed method outperforms existing approaches and achieves improved effectiveness and generalization capability for ship instance segmentation in SAR images.
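The spatially constrained mask loss described above combines a centroid-distance term with a directional-deviation term between predicted and ground-truth masks. A minimal NumPy sketch of such a penalty (illustrative only; the weights w_c and w_a, the moment-based orientation estimate, and the toy masks are assumptions, not the paper's exact loss):

```python
import numpy as np

def centroid(mask):
    """Centroid (row, col) of a binary mask."""
    ys, xs = np.nonzero(mask)
    return np.array([ys.mean(), xs.mean()])

def orientation(mask):
    """Principal-axis angle of a binary mask from second-order moments."""
    ys, xs = np.nonzero(mask)
    y, x = ys - ys.mean(), xs - xs.mean()
    # 0.5 * atan2(2*mu11, mu20 - mu02) gives the major-axis direction.
    return 0.5 * np.arctan2(2 * (x * y).mean(), (x * x).mean() - (y * y).mean())

def spatial_penalty(pred, gt, w_c=1.0, w_a=1.0):
    """Centroid-distance plus directional-deviation penalty; a sketch of
    the kind of spatial constraint added to a mask loss (weights are
    illustrative)."""
    d = np.linalg.norm(centroid(pred) - centroid(gt))
    da = abs(orientation(pred) - orientation(gt))
    return w_c * d + w_a * da

# An axis-aligned rectangular "ship" and a shifted prediction.
gt = np.zeros((64, 64)); gt[20:24, 10:40] = 1
pred = np.roll(gt, (3, 5), axis=(0, 1))

assert spatial_penalty(gt, gt) == 0.0      # identical masks: no penalty
assert spatial_penalty(pred, gt) > 0.0     # shifted mask is penalized
```

In training, such a penalty would be added to the usual per-pixel mask loss so that gradients also discourage masks that are well-shaped but displaced or misoriented.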
A Class of Double-twisted Generalized Reed-Solomon Codes and Their Extended Codes
CHENG Hongli, ZHU Shixin
Available online  , doi: 10.11999/JEIT251045
Abstract:
  Objective   In the field of coding theory, Twisted Generalized Reed-Solomon (TGRS) codes have attracted considerable research interest for their flexible structural properties. However, investigations into their extended codes remain scarce, leaving significant gaps in the understanding of their error-correcting capabilities, duality properties, and practical applications. Meanwhile, the foundational parity-check matrix forms for TGRS codes presented in earlier research lack sufficient clarity and exhibit restricted parameter coverage. Specifically, previous studies fail to accommodate scenarios involving h=0, which constrains their utility in broader coding scenarios where diverse parameter configurations are required. Furthermore, constructing non-GRS codes is an intriguing and critical research topic because of their ability to resist Sidelnikov-Shestakov and Wieschebrink attacks, to which GRS codes are vulnerable. Additionally, Maximum Distance Separable (MDS) codes, self-orthogonal codes, and almost self-dual codes are highly valued for their efficient error-correcting capabilities and structural advantages. MDS codes, achieving the Singleton bound, are essential for distributed storage systems where data integrity under node failures is critical; self-orthogonal and almost self-dual codes, with their inherent duality properties, play key roles in quantum coding, secret sharing schemes, and secure multi-party computation, where structural regularity and cryptographic security are vital. 
Accordingly, this paper aims to achieve the following goals: (1) characterize the MDS and Almost MDS (AMDS) properties of double-twisted GRS codes \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} and their extended codes \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v},\mathrm{\infty }) $\end{document}; (2) derive explicit and unified parity-check matrices applicable to all valid parameter ranges, including h=0; (3) establish non-GRS properties of these codes under specific parameter conditions; (4) provide rigorous necessary and sufficient conditions for the extended codes to be self-orthogonal and for the original codes to be almost self-dual; and (5) construct a class of almost self-dual double-twisted GRS codes with flexible parameters to meet diverse application requirements in secure and reliable communication systems.  Methods   The research adopts a comprehensive framework rooted in algebraic coding theory and finite field mathematics. Algebraic analysis serves as a foundational tool: explicit parity-check matrices are derived using properties of polynomial rings over finite fields \begin{document}$ {F}_{q} $\end{document}, Vandermonde matrix structures, and polynomial interpolation techniques. The Schur product method is used to determine non-GRS properties by evaluating the dimension of the Schur square of codes and their duals, distinguishing them from GRS codes. Linear algebra and combinatorics are used to characterize MDS and AMDS properties. By examining the non-singularity of generator matrix submatrices and solving systems of equations involving symmetric sums of finite field elements, the conditions for MDS and AMDS codes are derived. 
These conditions rely on the sets \begin{document}$ {S}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}, \begin{document}$ {L}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}, and \begin{document}$ {D}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}, which are defined based on sums of products of finite field elements. Duality theory forms the foundation for analyzing orthogonality. For self-orthogonal codes \begin{document}$ C\subseteq {C}^{\bot } $\end{document}, the generator matrix must satisfy \begin{document}$ G{G}^{\rm T}=\boldsymbol{O} $\end{document}. For almost self-dual codes (self-orthogonal codes of odd length n and dimension (n-1)/2), this condition is combined with structural properties of dual codes and symmetric sum relations of \begin{document}$ {\alpha }_{i} $\end{document} to derive necessary and sufficient conditions.  Results and Discussions   For MDS and AMDS properties, critical findings are established: The extended double-twisted GRS code \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v},\mathrm{\infty }) $\end{document} is MDS if and only if \begin{document}$ 1\notin {S}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document} and \begin{document}$ 1\notin {L}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}; the double-twisted GRS code \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} is AMDS if and only if \begin{document}$ 1\in {S}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document} and \begin{document}$ (0,1)\notin {D}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}; and \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} is neither MDS nor AMDS if and only if \begin{document}$ (0,1)\in {D}_{k}(\boldsymbol{\alpha },\boldsymbol{\eta }) $\end{document}. 
Unified parity-check matrices of \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} and \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v},\mathrm{\infty }) $\end{document} for all \begin{document}$ 0\leq h\leq k-1 $\end{document} are derived, resolving prior limitations that excluded h=0 by removing restrictive submatrix structure constraints. For non-GRS properties, when \begin{document}$ k\geq 4 $\end{document} and \begin{document}$ n-k\geq 4 $\end{document}, \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} and its extended codes \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v},\mathrm{\infty }) $\end{document} are non-GRS regardless of whether \begin{document}$ 2k\geq n $\end{document} or \begin{document}$ 2k \lt n $\end{document}, confirmed by the dimension of their Schur squares exceeding that of corresponding GRS codes. This ensures resistance to Sidelnikov-Shestakov and Wieschebrink attacks. Regarding self-orthogonality and almost self-duality, the extended code \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v},\mathrm{\infty }) $\end{document} with \begin{document}$ h=k-1 $\end{document} is self-orthogonal under specific algebraic conditions; \begin{document}$ {C}_{k,\boldsymbol{h},\boldsymbol{\eta }}(\boldsymbol{\alpha },\boldsymbol{v}) $\end{document} with \begin{document}$ h=k-1 $\end{document} and \begin{document}$ n=2k+1 $\end{document} is almost self-dual if and only if there exists \begin{document}$ \lambda \in F_{q}^{*} $\end{document} such that \begin{document}$ \lambda {u}_{j}=v_{j}^{2} (j=1,\cdots ,2k+1) $\end{document} and a symmetric sum constraint on \begin{document}$ {\alpha }_{i} $\end{document} involving \begin{document}$ {\eta }_{1} $\end{document} and \begin{document}$ {\eta }_{2} $\end{document} holds. 
For odd prime power\begin{document}$ q $\end{document}, a flexible almost self-dual code with parameters\begin{document}$ [q-t-1,(q-t-2)/2,\geq (q-t-2)/2] $\end{document}is constructed using roots of \begin{document}$ m(x)=({x}^{q}-x)/f(x) $\end{document} where \begin{document}$ f(x)={x}^{t+1}-x $\end{document}, with an example over\begin{document}$ {F}_{11} $\end{document}yielding a\begin{document}$ [5,2,\geq 2] $\end{document}code.  Conclusions   This work advances the study of double-twisted GRS codes and their extensions through key contributions: (1) complete characterization of MDS and AMDS properties via explicit combinatorial sets\begin{document}$ {S}_{k} $\end{document},\begin{document}$ {L}_{k} $\end{document},\begin{document}$ {D}_{k} $\end{document}, enabling precise error-correcting capability assessment; (2) derivation of unified, explicit parity-check matrices for all\begin{document}$ 0\leq h\leq k-1 $\end{document}, overcoming prior parameter restrictions and enhancing practical utility; (3) proof of non-GRS properties for\begin{document}$ k\geq 4 $\end{document}, ensuring security against specific attacks; (4) rigorous conditions for self-orthogonal extended codes and almost self-dual original codes, deepening structural insights; (5) a flexible construction of almost self-dual codes, meeting diverse needs in secure communication and distributed storage. These results enrich coding theory and provide practical tools for robust, secure coding system design.
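The closing example over F_11 can be checked directly: with t = 5, the evaluation points are the roots of m(x) = (x^11 - x)/f(x), i.e., the field elements that are not roots of f(x) = x^6 - x (a short verification script; the variable names are illustrative):

```python
q, t = 11, 5

# f(x) = x^(t+1) - x; its roots in F_q are 0 and the t-th roots of unity.
f_roots = {a for a in range(q) if (pow(a, t + 1, q) - a) % q == 0}

# m(x) = (x^q - x) / f(x) vanishes exactly on the remaining field elements,
# which serve as evaluation points of the almost self-dual code.
m_roots = sorted(set(range(q)) - f_roots)

n = q - t - 1           # code length
k = (q - t - 2) // 2    # code dimension
print(m_roots, n, k)    # [2, 6, 7, 8, 10] 5 2
```

The five surviving points match the stated [5, 2, >=2] parameters: length q - t - 1 = 5 and dimension (q - t - 2)/2 = 2.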
Multiscale Fractional Information Potential Field and Dynamic Gradient-Guided Energy Modeling for SAR and Multispectral Image Fusion
SONG Jiawen, WANG Qingsong
Available online  , doi: 10.11999/JEIT250976
Abstract:
  Objective   In remote sensing, fusion of Synthetic Aperture Radar (SAR) and MultiSpectral (MS) images is essential for comprehensive Earth observation. SAR sensors provide all-weather imaging capability and capture dielectric and geometric surface characteristics, although they are inherently affected by multiplicative speckle noise. In contrast, MS sensors provide rich spectral information that supports visual interpretation, although their performance is constrained by atmospheric conditions. The objective of SAR-MS image fusion is to integrate the structural details and scattering characteristics of SAR imagery with the spectral content of MS imagery, thereby improving performance in applications such as land-cover classification and target detection. However, existing fusion approaches, ranging from component substitution and multiscale transformation to Deep Learning (DL), face persistent limitations. Many methods fail to achieve an effective balance between noise suppression and texture preservation, which leads to spectral distortion or residual speckle, particularly in highly heterogeneous regions. DL-based methods, although effective in specific scenarios, exhibit strong dependence on training data and limited generalization across sensors. To address these issues, a robust unsupervised fusion framework is developed that explicitly models modality-specific noise characteristics and structural differences. Fractional calculus and dynamic energy modeling are combined to improve structural preservation and spectral fidelity without relying on large-scale training datasets.  Methods   The proposed framework adopts a multistage fusion strategy based on Relative Total Variation filtering for image decomposition and consists of four core components. First, a MultiScale Fractional Information Potential Field (MS-FIPF) method (Fig. 2) is proposed to extract robust detail layers. 
A fractional-order kernel is constructed in the Fourier domain to achieve nonlinear frequency weighting, and a local entropy-driven adaptive scale mechanism is applied to enhance edge information while suppressing noise. Second, to address the different noise distributions observed in SAR and MS detail layers, a Bayesian adaptive fusion model based on the minimum mean square error criterion is constructed. A dynamic regularization term is incorporated to adaptively balance structural preservation and noise suppression. Third, for base layers containing low-frequency geometric information, a Dynamic Gradient-Guided Multiresolution Local Energy (DGMLE) method (Fig. 3) is proposed. This method constructs a global entropy-driven multiresolution pyramid and applies a gradient-variance-controlled enhancement factor combined with adaptive Gaussian smoothing to emphasize significant geometric structures. Finally, a Scattering Intensity Adaptive Modulation (SIAM) strategy is applied through a nonlinear mapping regulated by joint entropy and root mean square error, enabling adaptive adjustment of SAR scattering contributions to maintain visual and spectral consistency.  Results and Discussions   The proposed framework is evaluated on the WHU, YYX, and HQ datasets, which represent different spatial resolutions and scene complexities, and is compared with seven state-of-the-art fusion methods. Qualitative comparisons (Figs. 5\begin{document}$ \sim $\end{document}7) show that several existing approaches, including hybrid multiscale decomposition and image fusion convolutional neural networks, exhibit limited noise modeling capability. This limitation results in spectral distortion and detail blurring caused by SAR speckle interference. Methods based on infrared feature extraction and visual information preservation also show image whitening and contrast degradation due to excessive scattering feature injection. 
In contrast, the proposed method effectively filters redundant SAR noise through multiscale fractional denoising and adaptive scattering modulation, while preserving MS spectral consistency and salient SAR geometric structures. Improved visual clarity and detail preservation are observed, exceeding the performance of competitive approaches such as visual saliency feature fusion, which still presents residual noise. Quantitative results (Tables 1\begin{document}$ \sim $\end{document}3) demonstrate consistent superiority across six evaluation metrics. On the WHU dataset, optimal ERGAS (3.737 0) and PSNR (24.798 3 dB) values are achieved. Performance improvements are more evident on the high-resolution YYX dataset and the structurally complex HQ dataset, where the proposed method ranks first for all indices. The mutual information on the YYX dataset reaches 3.353 5, which is nearly twice that of the second-ranked method, confirming strong multimodal information preservation. On average, the proposed framework achieves a performance improvement of 29.11% compared with the second-best baseline. Mechanism validation and efficiency analysis (Tables 4, 5) further support these results. Ablation experiments demonstrate that SIAM plays a critical role in maintaining the balance between spectral information and scattering characteristics, whereas DGMLE contributes substantially to structural fidelity. With an average runtime of 1.303 3 s, the proposed method achieves an effective trade-off between computational efficiency and fusion quality and remains significantly faster than complex transform-domain approaches such as multiscale non-subsampled shearlet transform combined with parallel convolutional neural networks.  Conclusions   A robust and unsupervised framework for SAR and MS image fusion is proposed. 
By integrating MS-FIPF-based fractional-order saliency extraction with DGMLE-based gradient-guided energy modeling, the proposed method addresses the long-standing trade-off between noise suppression and detail preservation. Bayesian adaptive fusion and scattering intensity modulation further improve robustness to modality differences. Experimental results confirm that the proposed framework outperforms seven representative fusion algorithms, achieving an average improvement of 29.11% across comprehensive evaluation metrics. Significant gains are observed in noise suppression, structural fidelity, and spectral preservation, demonstrating strong potential for multisource remote sensing data processing.
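The fractional-order Fourier-domain weighting at the heart of MS-FIPF can be sketched as a nonlinear frequency weight of the form |omega|**alpha (a simplified stand-in: the local entropy-driven adaptive scale selection and the multiscale combination described above are omitted, and the exact kernel form is an assumption):

```python
import numpy as np

def fractional_enhance(img, alpha):
    """Fractional-order frequency weighting |omega|**alpha in the Fourier
    domain: alpha = 0 leaves the image unchanged, while 0 < alpha < 1
    emphasizes high frequencies (edges) relative to low frequencies more
    gently than an integer-order derivative would."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    kernel = np.hypot(fy, fx) ** alpha      # nonlinear frequency weight
    return np.fft.ifft2(np.fft.fft2(img) * kernel).real

rng = np.random.default_rng(1)
img = rng.standard_normal((32, 32))

# alpha = 0 reduces the kernel to all ones, recovering the input exactly.
assert np.allclose(fractional_enhance(img, 0.0), img)
```

For alpha > 0 the DC weight is zero, so the output is mean-free, which is the fractional-differentiation behavior that suppresses smooth background while retaining edge structure.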
Joint Suppression of Range-Ambiguous Clutter and Mainlobe Deceptive Jammer with Subarray FDA-MIMO Radar
ZHANG Mengdi, LU Jiahao, XU Jingwei, LI Shiyin, WANG Ning, LIU Zhixin
Available online  , doi: 10.11999/JEIT251116
Abstract:
  Objective  In the downward-looking mode, airborne radar systems face the dual challenge of mitigating strong clutter and mainlobe deceptive jammers in increasingly complex electromagnetic environments. Clutter exhibiting both range ambiguity and range dependence constrains Moving Target Detection (MTD) in high Pulse Repetition Frequency (PRF) radars with non-side-looking configurations. Mainlobe deceptive jammers further increase the difficulty of detecting the true target. By exploiting controllable range Degrees Of Freedom (DOFs), Waveform Diverse Array (WDA) radars, such as Frequency Diverse Array Multiple-Input Multiple-Output (FDA-MIMO) radar and Element Pulse Coding Multiple-Input Multiple-Output (EPC-MIMO) radar, show clear advantages in suppressing mainlobe deceptive jammers. However, existing WDA-based techniques are limited to suppressing false targets whose delays exceed one Pulse Repetition Interval (PRI) relative to the true target, referred to as cross-pulse repeater jammers. With advances in Digital Radio Frequency Memory (DRFM) technology, the delay of false targets is reduced, enabling the generation of false targets that share the same number of delayed pulses as the true target, referred to as intra-PRI rapid repeater jammers. Furthermore, most anti-jamming methods are developed under Gaussian white noise assumptions and do not consider practical clutter environments. Therefore, a joint suppression framework is required to simultaneously handle range-ambiguous clutter and multiple types of mainlobe deceptive jammers.  Methods  A joint suppression framework based on a subarray FDA-MIMO radar is proposed for scenarios with coexisting range-ambiguous clutter, cross-pulse repeater jammers, and intra-PRI rapid repeater jammers. 
Compared with conventional FDA-MIMO radar, the subarray FDA-MIMO configuration employs small frequency increments within transmit subarrays and large frequency increments across subarrays, which provides two-level range DOFs at the intra-subarray and inter-subarray scales. First, a Range-Dependent Compensation (RDC) technique is applied to separate the true target from echoes contaminated by clutter and jammers in the joint intra-subarray and inter-subarray transmit spatial frequency domain. Next, a pre-Space-Time Adaptive Processing (STAP) filter is designed by exploiting range DOFs in the intra-subarray transmit dimension to suppress range-ambiguous clutter and cross-pulse repeater jammers. Finally, subspace projection-based three-dimensional (3-D) STAP is applied to suppress local clutter and intra-PRI rapid repeater jammers.  Results and Discussions  After RDC, the true target is effectively separated from ambiguous clutter and jammers in the joint intra-subarray and inter-subarray transmit spatial frequency domain (Fig. 3). By exploiting range DOFs in the intra-subarray transmit dimension, the pre-STAP filter achieves effective suppression of range-ambiguous clutter and cross-pulse repeater jammers (Fig. 4). Local clutter in the inter-subarray transmit spatial frequency domain is suppressed by using clutter distribution characteristics in the receive-Doppler domain combined with subspace projection (Fig. 5). This enables accurate estimation of the Jammer Covariance Matrix (JCM) for intra-PRI-inner-bin rapid repeater jammers. Subsequently, 3-D STAP suppresses local clutter and intra-PRI-inner-bin rapid repeater jammers (Fig. 6, Fig. 7). Comparative simulation results show that the proposed framework achieves significantly improved suppression performance under the considered complex scenario (Fig. 8).  
Conclusions  The problem of MTD in scenarios with simultaneous range-ambiguous clutter, cross-pulse repeater jammers, and intra-PRI-inner-bin rapid repeater jammers is addressed. A joint suppression framework based on subarray FDA-MIMO radar is proposed, in which small frequency increments are used within transmit subarrays and large increments across subarrays to enable flexible utilization of range DOFs. RDC achieves effective separation of the target from ambiguous clutter and jammers in the joint transmit spatial frequency domain. By exploiting intra-subarray range DOFs, a pre-STAP filter suppresses range-ambiguous clutter and cross-pulse repeater jammers. To mitigate the Inner-Bin Range Dependence (IRD) effect of clutter, a subspace projection method is developed to recover the JCM for intra-PRI-inner-bin rapid repeater jammers from clutter-contaminated data. Finally, 3-D STAP in the inter-subarray transmit-receive-Doppler domain suppresses local clutter and intra-PRI-inner-bin rapid repeater jammers. Numerical simulations verify the effectiveness of the proposed joint suppression framework.
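The RDC step can be illustrated for a single transmit subarray: the FDA frequency increment maps round-trip delay into a phase ramp across transmit elements, and compensating for the range of the cell under test aligns the true target while a cross-pulse false target retains a residual transmit phase (a toy sketch with assumed parameter values; clutter, the two-level subarray structure, and the STAP stages are omitted):

```python
import numpy as np

c = 3e8            # speed of light (m/s)
df = 3e3           # transmit frequency increment (Hz), illustrative value
M = 8              # transmit elements (single-subarray sketch)
PRI = 1e-4         # pulse repetition interval (s), illustrative value

def tx_phase(R):
    """FDA transmit steering: element m radiates at carrier offset m*df,
    so the round-trip delay 2R/c leaves a range-dependent phase ramp
    across the transmit elements."""
    m = np.arange(M)
    return np.exp(-1j * 4 * np.pi * df * m * R / c)

R_true = 120e3                    # range of the cell under test
R_false = R_true + c * PRI / 2    # cross-pulse false target, one PRI later

# Range-Dependent Compensation: conjugate phase for the cell under test.
rdc = np.conj(tx_phase(R_true))

assert np.allclose(tx_phase(R_true) * rdc, np.ones(M))       # target aligned
assert not np.allclose(tx_phase(R_false) * rdc, np.ones(M))  # jammer keeps
# a residual transmit phase, so it can be filtered in that domain.
```

The residual phase of the false target after compensation is exp(-j*2*pi*df*PRI*m), which vanishes only when df*PRI is an integer; choosing the increments to avoid that condition is what makes the transmit spatial frequency domain usable for separation.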
ISAR Sequence Motion Modeling and Fuzzy Attitude Classification Method for Small Sample Space Target
YE Juhang, DUAN Jia, ZHANG Lei
Available online  , doi: 10.11999/JEIT250689
Abstract:
  Objective  Space activities continue to expand, and Space Situational Awareness (SSA) is required to support collision avoidance and national security. A core task is attitude classification of space targets to interpret states and predict possible behavior. Current classification strategies mainly depend on Ground-Based Inverse Synthetic Aperture Radar (GBISAR). Model-driven methods require accurate prior modeling and have high computational cost, whereas data-driven methods such as deep learning require large annotated datasets, which are difficult to obtain for space targets and therefore perform poorly in small-sample conditions. To address this limitation, a Fuzzy Attitude Classification (FAC) method is proposed that integrates temporal motion modeling with fuzzy set theory. The method is designed as a training-free, real-time classifier for rapid deployment under data-constrained scenarios.  Methods  The method establishes a mapping between Three-dimensional (3D) attitude dynamics and Two-dimensional (2D) ISAR features through a framework combining the Horizon Coordinate System (HCS), the UNW orbital system, and the Body-Fixed Reference Frame (BFRF). Attitude evolution is represented as Euler rotations of the BFRF relative to the UNW system. The periodic 3D rotation is projected onto the 2D Range-Doppler plane as circular keypoint trajectories. Fourier series analysis is then applied to convert the motion into One-dimensional (1D) cosine features, where phase represents angular velocity and amplitude reflects motion magnitude. A 10-point annotation model is employed to describe targets, and dimensionless roll, pitch, and yaw feature vectors are constructed. For classification, magnitude- and angle-based decision rules are defined and processed using a softmax membership function, which incorporates feature variance to compute fuzzy membership degrees. 
The algorithm operates directly on keypoint sequences, requires no training, and maintains linear computational complexity O(n), enabling real-time execution.  Results and Discussions  The FAC method is evaluated using a Ku-band GBISAR simulated dataset of a spinning target. The dataset contains 36 sequences, each composed of 36 frames with a resolution of 512×512 pixels and is partitioned into a reference set and a testing set. Although raw keypoint trajectories appear disordered (Fig. 4(a)), the engineered features form clear clusters (Fig. 4(b)), and the variance of the defined criteria reflects motion significance (Fig. 4(c)). Robustness is confirmed: across nine imaging angles, classification consistency remains 100% within a 0.0015 tolerance (Fig. 5(a)). Under noise conditions, consistency is maintained from 10 dB to 1 dB signal-to-noise ratio (Fig. 5(b)). When frames are removed, 90% consistency is retained at a 0.03 threshold, and six frames are identified as the minimum number required for effective classification (Fig. 5(c)). Benchmark comparisons indicate that FAC outperforms Hidden Markov Models (HMM) and Convolutional Neural Networks (CNN), preserving accuracy under noise (Fig. 6(a)), sustaining stability under frame loss where HMM degrade to random behavior (Fig. 6(b)), and achieving significantly lower processing time than both benchmarks (Fig. 6(c)).  Conclusions  A FAC method that integrates motion modeling with fuzzy reasoning is presented for small-sample space target recognition. By mapping multi-coordinate kinematics into interpretable cosine features, the method reduces dependence on prior models and large datasets while achieving training-free, linear-time processing. Simulation tests confirm robustness across observation angles, Signal-to-Noise Ratios (SNR), and frame availability. Benchmark comparisons demonstrate higher accuracy, stability, and computational efficiency relative to HMM and CNN. 
The FAC method provides a feasible solution for real-time attitude classification in data-constrained scenarios. Future work will extend the approach to multi-axis tumbling and validation using measured data, with potential integration of multimodal observations to improve adaptability.
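The conversion of projected keypoint motion into 1D cosine features, with amplitude encoding motion magnitude and phase encoding angular timing, can be sketched as a linear least-squares fit (illustrative code; the rotation rate, amplitude, and offset are assumed values, not dataset parameters):

```python
import numpy as np

def cosine_features(y, t, omega):
    """Fit y ~ A*cos(omega*t + phi) + c by linear least squares using
    A*cos(omega*t + phi) = a*cos(omega*t) + b*sin(omega*t):
    amplitude A reflects motion magnitude, phase phi the angular timing."""
    X = np.column_stack([np.cos(omega * t), np.sin(omega * t), np.ones_like(t)])
    (a, b, c), *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.hypot(a, b), np.arctan2(-b, a)   # amplitude, phase

# Synthetic keypoint coordinate from a spinning target's projected circle.
t = np.linspace(0, 2, 200)
omega = 2 * np.pi * 1.5                        # 1.5 rev/s, illustrative
y = 3.0 * np.cos(omega * t + 0.7) + 0.05

A, phi = cosine_features(y, t, omega)
assert np.isclose(A, 3.0) and np.isclose(phi, 0.7)
```

Because the fit is linear once omega is fixed, it runs in O(n) per keypoint sequence, consistent with the training-free, linear-complexity operation claimed for the classifier.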
Research on Model-Driven Integrated Simulation Technology for Space-Based Support
REN Yuli, YOU Lingfei, CHANG Chuangye, GUO Zhiqi
Available online  , doi: 10.11999/JEIT251004
Abstract:
  Objective  Space-based information support is a core component of modern operational systems. It acquires, transmits, and processes information through space-based platforms to provide full-process, round-the-clock information support for remote precision strikes. Therefore, digital simulation and verification of space-based information support systems have become key means for combat concept design, scheme demonstration, and rapid capability iteration. This study examines the application of Model-Based Systems Engineering (MBSE) to integrated simulation of space-based support operations. The objective is to address key challenges in information representation, system interoperability, and integrated simulation in complex combat systems. To overcome the limitations of traditional simulation approaches in cross-platform collaboration, dynamic extensibility, and efficient integration of functional logic with spatiotemporal information, a multi-perspective modeling and simulation method based on the Discrete Event System Specification (DEVS) is proposed. A hybrid integrated simulation framework is constructed.  Methods  The proposed framework enables abstract interconnection of weapon and equipment models, plug-and-play integration of simulation resources, precise global time synchronization, and high-performance real-time communication. These capabilities are achieved through four core modules: data integration management, heterogeneous software adapters, time management control, and publish–subscribe mechanisms. The framework supports interoperability and reusability of heterogeneous simulation software. On this basis, a joint simulation system is designed by integrating system architecture development and verification software with spatiotemporal simulation software for visualization and reasoning. 
Message middleware supports bidirectional synchronous interaction between state machines and spatiotemporal models, enabling closed-loop verification from combat concepts to digital inference. The core contribution of this research is the removal of the long-standing separation between the discrete event logic describing combat functions, information flows, and state machines (typically modeled using the Systems Modeling Language, SysML) and the continuous physical scenes, such as spatiotemporal motion and sensor coverage, constructed on visualization and deduction platforms. Through real-time bidirectional data exchange enabled by message middleware, discrete command decisions drive continuous platform behavior, whereas dynamic changes in battlefield conditions trigger corresponding combat responses.  Results and Discussions  A complete closed-loop simulation of a “maritime island and reef reconnaissance support denial operation” scenario is conducted using the joint simulation system, producing effective verification results. The simulation reproduces the full process from space-based target detection to coordinated regional denial by multi-domain forces. First, a capability–activity–equipment analysis model for the combat mission is developed using the Unified Architecture Framework (UAF), generating equipment interaction relationships and corresponding state machines. In parallel, continuous construction of the combat scenario is implemented on the visualization and deduction platform. The entire deduction process is precisely synchronized with physical motion through the state machine model deployed in the system architecture development and verification platform. Each state transition, such as “target detection,” “strike initiation,” and “effect evaluation,” triggers corresponding spatiotemporal simulation activities. Platform states and environmental data fed back from the visualization and deduction platform then drive subsequent state machine evolution. 
Through joint simulation, the rationality and feasibility of the operational concept, in which multi-domain unmanned forces conduct reconnaissance, deterrence, and denial under space-based information support, are verified. The results provide an intuitive and high-confidence basis for decision-making in system scheme optimization.  Conclusions  This study investigates the application of model-driven technology to the design and validation of space-based support joint operation systems. A joint simulation framework enabling deep integration of functional logic and physical scenarios is constructed and validated. Unlike conventional simulation approaches that focus on static structures or isolated functions, the proposed framework couples SysML-based discrete event logic models with continuous spatiotemporal dynamic models through a distributed architecture consisting of one core platform and multiple component adapters. This approach resolves the long-standing separation between functional modeling and scene simulation. Discrete behaviors, such as command decision-making and state transitions, directly drive platform movement and interaction within realistic spatiotemporal environments. Conversely, dynamic battlefield changes provide real-time feedback that affects higher-level logical decisions, forming a bidirectional closed loop. The framework integrates the precision of functional logic with the intuitiveness of scene simulation and enables realistic reproduction of multi-domain collaborative operations in digital space. It provides effective support for system design, operational deduction, and high-confidence verification of space-based support joint operation systems.
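The bidirectional loop the framework describes (discrete state transitions driving continuous platform motion, with position feedback triggering the next transition, all over publish–subscribe middleware) can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the `Bus`, `Platform`, and `MissionFSM` classes, the waypoint, and the state names are all assumptions.

```python
# Toy sketch of the closed loop described above: a discrete state machine
# drives a continuous platform model through a publish-subscribe bus, and
# position feedback from the continuous model triggers state transitions.
# All names and values here are illustrative, not from the paper.

class Bus:
    """Minimal publish-subscribe message middleware."""
    def __init__(self):
        self.subs = {}
    def subscribe(self, topic, fn):
        self.subs.setdefault(topic, []).append(fn)
    def publish(self, topic, msg):
        for fn in self.subs.get(topic, []):
            fn(msg)

class Platform:
    """Continuous spatiotemporal model: moves toward a commanded waypoint."""
    def __init__(self, bus):
        self.pos, self.goal, self.bus = 0.0, None, bus
        bus.subscribe("command", lambda m: setattr(self, "goal", m["waypoint"]))
    def step(self, dt=1.0, speed=2.0):
        if self.goal is not None and self.pos < self.goal:
            self.pos = min(self.goal, self.pos + speed * dt)
        self.bus.publish("state", {"pos": self.pos})

class MissionFSM:
    """Discrete event logic: detect -> strike -> evaluate."""
    def __init__(self, bus):
        self.state, self.bus = "detect", bus
        bus.subscribe("state", self.on_state)
    def on_state(self, msg):
        if self.state == "detect":            # detection complete: issue strike
            self.state = "strike"
            self.bus.publish("command", {"waypoint": 10.0})
        elif self.state == "strike" and msg["pos"] >= 10.0:
            self.state = "evaluate"           # platform arrived: assess effect

bus = Bus()
platform, fsm = Platform(bus), MissionFSM(bus)
for _ in range(8):                            # synchronized simulation steps
    platform.step()
print(fsm.state, platform.pos)
```

The command decision (a discrete transition) changes the continuous model's goal, and the continuous model's state feed drives the next transition, mirroring the closed loop between the architecture platform and the deduction platform.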
A Distributed Multi-Satellite Collaborative Framework for Remote Sensing Scene Classification
JIN Jing, WANG Feng
Available online  , doi: 10.11999/JEIT250866
Abstract:
  Objective  With the rapid development of space technologies, satellites generate large volumes of Remote Sensing (RS) data. Scene classification, a fundamental task in RS interpretation, is essential for earth observation applications. Although Deep Learning (DL) improves classification accuracy, most existing methods rely on centralized architectures. This design allows unified management but faces limited bandwidth, high latency, and privacy risks, which restrict scalability in multi-satellite settings. With increasing demand for distributed computation, Federated Learning (FL) has received growing attention in RS. Research on FL for RS scene classification, however, remains at an early stage. This study proposes a distributed collaborative framework for multi-satellite scene classification that applies efficient parameter aggregation to reduce communication overhead while preserving accuracy.  Methods  An FL-based framework is proposed for multi-satellite RS scene classification. Each satellite conducts local training while raw data remain stored locally to preserve privacy. Only updated model parameters are transmitted to a central server for global aggregation. The optimized global model is then broadcast to satellites to enable joint modeling and inference. To reduce the high communication cost of space-to-ground links, an inter-satellite communication mechanism is added. This design lowers communication overhead and strengthens scalability. The effect of parameter consensus on global convergence is theoretically analyzed, and an upper bound of convergence error is derived to provide a rigorous convergence guarantee and support practical applicability.  Results and Discussions  Comparative experiments are conducted on the UC-Merced and NWPU-RESISC45 datasets (Table 2, Table 3) to evaluate the proposed framework. The method consistently shows higher accuracy than centralized training, FedAvg, and FedProx under different client numbers and training ratios. 
On UC-Merced, Overall Accuracy (OA) reaches 96.68% at a 50% training ratio with 2 clients and rises to 97.49% at 80% with 10 clients. On NWPU-RESISC45, OA reaches 83.64% at 10% with 5 clients and 88.41% at 20% with 10 clients, both exceeding baseline methods. Confusion matrices (Fig. 4, Fig. 5) show clear diagonal dominance and only minor confusions, and t-SNE visualizations (Fig. 6) show compact intra-class clusters and well-separated inter-class distributions, indicating strong generalization even under lower training ratios. Communication energy analysis (Table 4) shows high efficiency. On UC-Merced with a 50% training ratio, the communication cost is 1.30 kJ, more than 60% lower than FedAvg and FedProx. On NWPU-RESISC45, substantial savings are also observed across all ratios.  Conclusions  This study proposes an FL-based framework for multi-satellite RS scene classification and addresses limitations of centralized training, including restricted bandwidth, high latency, and privacy concerns. By allowing satellites to conduct local training and applying central aggregation with inter-satellite consensus, the framework achieves collaborative modeling with high communication efficiency. Evaluations on UC-Merced and NWPU-RESISC45 verify the effectiveness of the method. On UC-Merced with an 80% training ratio and 10 clients, OA reaches 97.49%, higher than centralized training, FedAvg, and FedProx by 1.85%, 0.60%, and 0.81%, respectively. On NWPU-RESISC45 with a 20% training ratio, the communication energy cost is 5.88 kJ, showing reductions of 57.45% and 58.18% compared with FedAvg and FedProx. These results indicate strong generalization and efficiency across different data scales and training ratios. The framework is suited for bandwidth-limited and dynamic space environments and offers a promising direction for distributed RS applications. 
Future work will examine cross-task transfer learning to improve adaptability and generalization under multi-task and heterogeneous data conditions.
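The server-side aggregation step the framework relies on can be illustrated with a FedAvg-style weighted average; this is a minimal sketch under the assumption of sample-size weighting, with synthetic parameter vectors standing in for the satellites' local models.

```python
# Hedged sketch of the aggregation step: each satellite trains locally, only
# parameter vectors reach the server, and the server returns a sample-size
# weighted average (FedAvg-style). The clients and sizes below are synthetic.
import numpy as np

def aggregate(client_params, client_sizes):
    """Sample-size-weighted average of client parameter vectors."""
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()              # normalize to sum to 1
    stacked = np.stack(client_params)          # shape: (clients, n_params)
    return (weights[:, None] * stacked).sum(axis=0)

# Three satellites with different local data volumes.
params = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [100, 100, 200]
global_params = aggregate(params, sizes)
print(global_params)                           # -> [3.5 4.5]
```

The aggregated vector is then broadcast back to the satellites for the next local round; the inter-satellite consensus mechanism in the paper further reduces how often this space-to-ground exchange is needed.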
Lightweight Dual Convolutional Finger Vein Recognition Network Based on Attention Mechanism
ZHAO Bingyan, LIANG Yihuai, ZHANG Zhongxia, ZHANG Wenzheng
Available online  , doi: 10.11999/JEIT250380
Abstract:
  Objective  Finger vein recognition, an emerging biometric authentication technology, has garnered considerable research attention due to its unique physiological characteristics and advantages in in vivo (liveness) detection. However, existing mainstream recognition frameworks based on deep learning still face significant challenges. On the one hand, high-precision recognition often relies on complex network structures, resulting in a sharp increase in model parameters, which makes deployment difficult on memory-constrained embedded devices and in computing-resource-scarce edge scenarios. On the other hand, although model compression can reduce computational cost, it is often accompanied by weakened feature expression ability, creating an inherent tension between recognition accuracy and efficiency. To address these challenges, a lightweight dual convolutional model integrating an attention mechanism is proposed. By designing a parallel heterogeneous convolution module and an attention guidance mechanism, diversified image features are deeply mined, and recognition accuracy is improved while the lightweight characteristics of the model are maintained.  Methods  The proposed network architecture adopts a three-level collaborative mechanism involving "feature extraction, dynamic calibration, and decision fusion". First, a dual convolution feature extraction module is constructed based on normalized ROI images. This module employs a strategy combining heterogeneous convolutional kernels: rectangular convolutional branches with varying shapes and sizes capture vein topological structure characteristics and track vein diameter directions, while square convolution branches employ stacked square convolutions to extract local texture details and background intensity distribution characteristics. 
These two branches operate in parallel with a reduced channel count, forming complementary feature responses through kernel shape differences, which compresses the parameter count while enhancing feature representation differentiation. Second, a parallel dual attention mechanism is designed to achieve two-dimensional calibration through joint optimization of channel attention and spatial attention. The channel dimension adaptively assigns weights to strengthen key discriminative features of vein textures; the spatial dimension constructs pixel-level dependency models and dynamically focuses on effective discriminative regions. This mechanism adopts a parallel feature concatenation fusion strategy, preserving input structural information while avoiding additional parameter burdens and improving model sensitivity to critical features. Finally, a three-level progressive feature optimization structure is constructed. Initially, multi-scale receptive field nesting is implemented through a convolutional compression module with a stride of 2, gradually purifying primary features during dimensionality reduction; subsequently, dual fully connected layers are employed for feature space transformation. The first layer utilizes ReLU activation to construct sparse feature representations, while the final layer employs Softmax for probability calibration. This structure effectively balances the risks of shallow underfitting and deep overfitting while maintaining forward inference efficiency.  Results and Discussions  The effectiveness and robustness of the proposed network are verified on three publicly available datasets: USM, HKPU and SDUMLA. The Acc metric is employed to evaluate detection accuracy. Experimental results on network recognition performance (Table 1) demonstrate that the proposed method achieves favorable outcomes in finger vein image recognition tasks. The feature visualization heatmaps (Fig. 4, Fig. 
6) prove that the model can extract complete vein discrimination features. Visualization results (Fig. 7, Fig. 8) indicate that model loss and accuracy exhibit normal trends during training, with 100% classification performance achieved, thereby validating the reliability and robustness of the proposed approach. Quantitative comparisons (Tables 2 and 3) reveal that the proposed method effectively addresses the imbalance between model complexity and classification performance, demonstrating superior performance across three datasets. Furthermore, ablation studies (Table 4) confirm the efficacy of the proposed module, showing significant improvements in finger vein image recognition performance.  Conclusions  This paper proposes a lightweight dual-channel convolutional neural network architecture incorporating an attention mechanism, comprising three core innovative modules: a dual-convolution feature extraction module, a parallel dual-attention module, and a feature optimization classification module. During feature extraction, long-range venous features and background information are collaboratively encoded through a low-channel parallel architecture, significantly reducing parameter quantities while enhancing inter-individual discriminability. The attention module efficiently captures critical venous features, maintaining feature sensitivity while overcoming the parameter expansion bottleneck of traditional attention mechanisms. The feature optimization classification module employs a progressive feature recalibration mechanism, effectively mitigating the conflicts between underfitting and overfitting during stacked dimensionality reduction. Quantitative experiments demonstrate that the proposed method achieves recognition accuracies of 99.70%, 98.33% and 98.27% on the USM, HKPU and SDUMLA datasets respectively, representing an average improvement of 2.05% compared to existing state-of-the-art methods. 
When benchmarked against comparable lightweight finger vein recognition approaches, the proposed method reduces the parameter scale by 11.35%–60.19%, successfully balancing model compactness with performance enhancement.
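The parallel channel–spatial calibration described in the Methods can be sketched in a few lines. This is an illustrative numpy toy, not the paper's network: the pooling choices, sigmoid gating, and concatenation fusion are stated assumptions about one plausible realization of a parallel dual-attention block.

```python
# Illustrative sketch (assumptions, not the paper's exact design) of parallel
# dual attention: channel attention reweights whole feature maps, spatial
# attention reweights pixel positions, and the two calibrated tensors are
# fused by concatenation along the channel axis.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention(feat):
    """feat: (C, H, W) feature tensor from the dual-convolution branches."""
    # Channel attention: one weight per channel from its global average.
    ch_w = sigmoid(feat.mean(axis=(1, 2)))            # shape (C,)
    ch_out = feat * ch_w[:, None, None]
    # Spatial attention: one weight per pixel from the cross-channel average.
    sp_w = sigmoid(feat.mean(axis=0))                 # shape (H, W)
    sp_out = feat * sp_w[None, :, :]
    # Parallel concatenation fusion preserves both calibrated views.
    return np.concatenate([ch_out, sp_out], axis=0)   # shape (2C, H, W)

feat = np.random.default_rng(0).normal(size=(8, 16, 16))
out = dual_attention(feat)
print(out.shape)                                       # -> (16, 16, 16)
```

Because the two attention paths run in parallel and are fused by concatenation rather than extra learned layers, the calibration adds no trainable parameters in this sketch, consistent with the lightweight goal described above.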
Multimodal Pedestrian Trajectory Prediction with Multi-Scale Spatio-Temporal Group Modeling and Diffusion
KONG Xiangyan, GAO YuLong, WANG Gang
Available online  , doi: 10.11999/JEIT250900
Abstract:
  Objective  With the rapid advancement of autonomous driving and social robotics, accurate pedestrian trajectory prediction has become pivotal for ensuring system safety and enhancing interaction efficiency. Existing group-based modeling approaches predominantly focus on local spatial interaction, often overlooking latent grouping characteristics across the temporal dimension. To address these challenges, this research proposes a multi-scale spatiotemporal feature construction method that decouples trajectory shape from absolute spatiotemporal coordinates, enabling the model to accurately capture latent group associations over different time intervals. Simultaneously, a spatiotemporal interaction three-element encoding mechanism is introduced to deeply extract the dynamic relationships between individuals and groups. By integrating the stepwise reverse process of diffusion models, the proposed approach incrementally mitigates prediction uncertainty. This research not only offers an intelligent solution for multi-modal trajectory prediction in complex, crowded environments but also provides robust theoretical support for improving the accuracy and robustness of long-range trajectory forecasting.  Methods  The proposed algorithm performs deep modeling of pedestrian trajectories through multi-scale spatiotemporal group modeling. The system is designed across three key dimensions: group construction, interaction modeling, and trajectory generation. First, to address the limitations of traditional methods that focus on local spatiotemporal relationships while overlooking cross-dimensional latent characteristics, a multi-scale trajectory grouping model is designed. Its core innovation lies in extracting trajectory offsets to represent trajectory shapes, successfully decoupling motion features from absolute positions. 
This enables the model to accurately capture latent group associations among agents following similar paths over different periods. Second, an encoding method based on a spatiotemporal interaction three-element format is proposed. By defining neural interaction strength, interaction categories, and category functions, this method deeply analyzes the complex associations between agents and groups. This not only captures fine-grained individual interactions but also effectively reveals the global dynamic evolution of collective behavior. Finally, a diffusion model is introduced for multimodal prediction. Through the stepwise reverse process of the diffusion model, predictions converge progressively, effectively eliminating uncertainty during the prediction process and transforming a fuzzy prediction space into clear and plausible future trajectories.  Results and Discussions  In this study, the proposed model was evaluated against 11 state-of-the-art baseline algorithms on the NBA dataset (Table 1). Experimental results indicate that the model achieves a significant advantage in minADE20. Notably, it demonstrates a substantial performance leap over GroupNet+CVAE in long-term prediction tasks, with minADE20 and minFDE20 improvements of 0.18 and 0.36, respectively, at the 4-second prediction horizon. Although the model slightly underperforms MID in long-term trends, likely due to the frequent and intense shifts in group dynamics within NBA scenarios, it exhibits exceptional precision in instantaneous prediction. This provides strong empirical evidence for the effectiveness of the multi-scale grouping strategy, based on historical trajectories, in capturing complex dynamic interactions. On the ETH/UCY datasets (Table 2), the MSGD method achieved consistent performance gains across all five sub-scenarios. 
Particularly in the pedestrian-dense and interaction-heavy UNIV scene, the proposed method surpassed all baseline models by leveraging the advantages of multi-scale modeling. While MSGD is slightly behind PPT in terms of long-distance endpoint constraints, it maintains a lead in minADE20. Furthermore, it outperforms Trajectron++ in velocity smoothness and directional coherence (std dev: 0.7012) (Table 3). These results suggest that, while fitting the geometric shape of trajectories, the method generates naturally smooth paths that align more closely with the physical laws of human motion. Ablation studies systematically verified the independent contributions of the diffusion model, spatiotemporal feature extraction, and multi-scale grouping modules to the overall accuracy (Table 4). Grouping sensitivity analysis on the NBA dataset revealed that a full-court grouping strategy (group size of 11) significantly enhances long-term stability, resulting in a further reduction of minFDE20 by 0.026–0.03 at the 4-second horizon (Table 5). Simultaneously, configurations with group sizes of 5 or 2 validate the significance of team formations and “one-on-one” local offensive/defensive dynamics in trajectory prediction (Table 6). Additionally, sensitivity analysis of diffusion steps and training epochs revealed a “complementary” relationship: moderately increasing the number of steps (e.g., 30–40) refines the denoising process and significantly improves accuracy, whereas excessive iterations may lead to overfitting (Table 7). Finally, qualitative visualization intuitively demonstrates that the multimodal trajectories generated by MSGD have a high degree of overlap with ground-truth data (Fig. 2).  
Conclusions  This study proposes a novel trajectory prediction algorithm that enhances performance primarily in two aspects: (1) It effectively captures pedestrian interactions by extracting spatiotemporal features; (2) It strengthens the modeling of collective behavior by grouping pedestrians across multiple scales. Experimental results demonstrate that the algorithm achieves state-of-the-art (SOTA) performance on both the NBA and ETH/UCY datasets. Furthermore, ablation studies verify the effectiveness of each constituent module. Despite its superior performance and adaptability, the proposed algorithm has two primary limitations: first, the current model does not account for explicit environmental information (such as maps or obstacles); second, the diffusion model involves high computational overhead during inference. Future work will focus on improvements and research in these two directions.
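The progressive denoising idea, where each reverse step removes part of the predicted noise so a fuzzy prediction sharpens into a concrete trajectory, can be shown with a standard DDPM-style reverse loop. This is a schematic sketch only: the noise predictor is an analytic stand-in for the learned network, and the target waypoint, schedule, and step count are assumed values, not the paper's model.

```python
# Schematic DDPM-style reverse process: starting from Gaussian noise, each
# step subtracts predicted noise, progressively converging on a trajectory
# point. The "predict_noise" oracle stands in for a trained network; the
# target and noise schedule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
T = 50
betas = np.linspace(1e-4, 0.05, T)      # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

target = np.array([3.0, -1.0])          # toy "true" future waypoint

def predict_noise(x_t, t):
    # Oracle epsilon for a known target: eps = (x_t - sqrt(ab)*x0)/sqrt(1-ab)
    ab = alpha_bars[t]
    return (x_t - np.sqrt(ab) * target) / np.sqrt(1.0 - ab)

x = rng.normal(size=2)                  # start from pure noise
for t in range(T - 1, -1, -1):          # reverse denoising loop
    eps = predict_noise(x, t)
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    x = (x - coef * eps) / np.sqrt(alphas[t])
    if t > 0:                           # inject noise except at the last step
        x += np.sqrt(betas[t]) * rng.normal(size=2)

print(np.round(x, 2))
```

Running the chain with different initial noise samples yields different plausible endpoints, which is exactly the multimodality the abstract describes; here the oracle collapses them onto one target for clarity.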
Hybrid PUF Tag Generation Technology for Battery Anti-counterfeiting
HE Zhangqing, LUO Siyu, ZHANG Junming, ZHANG Yin, WAN Meilin
Available online  , doi: 10.11999/JEIT250967
Abstract:
  Objective  With the global transition towards a low-carbon economy, power batteries have become crucial energy storage carriers. The traceability and security of their entire life cycle are foundational to industrial governance. In 2023, the Global Battery Alliance (GBA) introduced the "Battery Passport" system, requiring each battery to have a unique, tamper-proof, and verifiable digital identity. However, traditional digital tag solutions, such as QR codes and RFID, rely on pre-written static storage, making them vulnerable to physical cloning, data extraction, and environmental degradation. To address these issues, this paper proposes a battery anti-counterfeiting tag generation technology based on a hybrid Physical Unclonable Function (PUF). The technology leverages a triple physical coupling mechanism among the battery, PCB, and IC to generate a unique battery ID, ensuring strong physical binding and anti-counterfeiting capabilities at the system level.  Methods  The proposed battery anti-counterfeiting tag consists of four core modules: an off-chip RC battery fingerprint extraction circuit, an on-chip Arbiter PUF module, an on-chip delay compensation module, and a reliability enhancement module. The off-chip RC circuit utilizes the physical coupling between the battery negative tab and the PCB's copper-clad area to form a capacitor structure, which introduces inherent manufacturing variations as an entropy source. The on-chip Arbiter PUF converts manufacturing deviations into a unique digital signature. To mitigate systemic biases caused by asymmetrical routing and off-circuit delays, a programmable delay compensation module with coarse- and fine-tuning units is integrated. A reliability enhancement module is also embedded to automatically filter out unreliable response bits by monitoring delay deviations, thereby improving the reliability of the generated responses without complex error-correcting codes.  
  Results and Discussions  The proposed structure was implemented and tested using a Spartan-6 FPGA, a custom PCB, and 100 Ah blade batteries. Experimental results demonstrate excellent performance: the randomness of the tag reached 48.85%, and the uniqueness averaged 49.15% under normal conditions (Fig. 11). The stability (RA) was as high as 99.98% at room temperature and normal voltage, and remained above 98% even under extreme conditions (100 °C, 1.05 V) (Fig. 12). To evaluate anti-desoldering capability, three physical tampering scenarios were tested: battery replacement, PCB replacement, and IC replacement. The average response change rates were 14.86%, 24.58%, and 41.66%, respectively (Fig. 13), confirming the strong physical binding among the battery, PCB, and chip. These results validate that the proposed triple physical coupling mechanism effectively resists counterfeiting and tampering.  Conclusions  This paper presents a battery anti-counterfeiting tag generation technology based on a triple physical coupling mechanism. By binding the battery tab, PCB, and chip into a unified physical structure and extracting unique fingerprints from manufacturing variations, the proposed method achieves high randomness, uniqueness, and stability. The tag is highly sensitive to physical tampering, providing a reliable foundation for battery authentication throughout its life cycle. Future work will focus on validating the structure with more advanced chip fabrication processes and different PCB manufacturers, as well as further optimizing the design for broader application.
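The Arbiter PUF component can be illustrated with the standard additive-delay model: each challenge bit optionally crosses two racing signal paths, and the arbiter outputs 1 if the top path wins. This is a textbook toy model only; the paper's entropy source additionally couples the battery tab and PCB, which is not modeled here, and the Gaussian stage delays are an assumption.

```python
# Toy additive-delay model of an Arbiter PUF (illustrative; the paper's
# hybrid design also couples battery and PCB variations, omitted here).
# Each challenge bit swaps the two racing paths, negating the running delay
# gap; random stage delays stand in for manufacturing variation.
import random

def arbiter_puf_response(stage_delays, challenge):
    """stage_delays: list of (d_straight, d_crossed) per stage; returns 0/1."""
    diff = 0.0                       # accumulated top-minus-bottom delay gap
    for (d_s, d_x), c in zip(stage_delays, challenge):
        # A challenge bit of 1 crosses the paths, negating the running gap,
        # then adds that stage's delay contribution.
        diff = (-diff if c else diff) + (d_x if c else d_s)
    return 1 if diff > 0 else 0

rng = random.Random(42)
stages = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(64)]  # "this chip"
challenge = [rng.randint(0, 1) for _ in range(64)]
r = arbiter_puf_response(stages, challenge)
print(r)
```

The same physical delays always reproduce the same response (the basis of the tag's stability metric), while a chip with different delays generally yields a different response to the same challenge (the basis of uniqueness).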
CaRS-Align: Channel Relation Spectra Alignment for Cross-Modal Vehicle Re-identification
SA Baihui, ZHUANG Jingyi, ZHENG Jinjie, ZHU Jianqing
Available online  , doi: 10.11999/JEIT250917
Abstract:
  Objective  Visible and infrared images, as two commonly used modalities in intelligent transportation scenarios, play an important role in vehicle re-identification. However, due to differences in imaging mechanisms and spectral responses, the visual characteristics of these two modalities are inconsistent, which limits cross-modality vehicle re-identification. To address this issue, this paper proposes a Channel Relation Spectra Alignment (CaRS-Align) method, which uses channel relation spectra instead of channel-wise features as the alignment target, thereby mitigating the interference caused by imaging style differences at the relational structure level. Specifically, within each modality, the channel relation spectrum is constructed to capture stable, semantically collaborative channel-to-channel relationships through robust correlation modeling. Then, at the cross-modal level, the correlation between the corresponding channel relation spectra of the two modalities is maximized, achieving consistent alignment of channel relation structures. Experiments show that on the public MSVR310 and RGBN300 datasets, the proposed CaRS-Align method outperforms existing state-of-the-art methods. For example, on the MSVR310 dataset, under infrared-to-visible retrieval, CaRS-Align achieves a Rank-1 accuracy of 64.35%, which is 2.58% higher than existing advanced methods.  
  Methods  CaRS-Align follows a hierarchical optimization paradigm: (i) for each modality, it constructs a channel–channel relation spectrum by mining inter-channel dependencies, yielding a semantically coordinated relation matrix that preserves the organizational structure of semantic cues; (ii) it aligns cross-modal consistency by maximizing the correlation between the modalities’ relation spectra, enabling progressive optimization from intra-modal construction to cross-modal alignment; and (iii) it integrates relation-spectrum alignment with standard classification and retrieval objectives commonly used in re-identification to supervise backbone training for the vehicle re-identification model.  Results and Discussions  Compared with several state-of-the-art cross-modal re-identification methods on the public RGBN300 and MSVR310 datasets, CaRS-Align exhibits strong performance, achieving best or second-best results in both retrieval modes. As shown in Table 1, on RGBN300 it attains 75.09% Rank-1 and 55.45% mean Average Precision (mAP) in the infrared-to-visible mode, and 76.60% Rank-1 and 56.12% mAP in the visible-to-infrared mode. As shown in Table 2, similar advantages are observed on MSVR310, with 64.54% Rank-1 and 41.25% mAP in the visible-to-infrared mode, and 64.35% Rank-1 and 40.99% mAP in the infrared-to-visible mode. Fig. 4 presents Top-10 retrievals, where CaRS-Align notably reduces identity mismatches in both directions, and Fig. 5 shows feature distance distributions: intra- and inter-class distances overlap substantially without CaRS-Align (Fig. 5(a)) but separate clearly with CaRS-Align (Fig. 5(b)), confirming more discriminative features. Collectively, these results demonstrate that modeling channel-level relational structure improves both cross-modal retrieval modes, enhances adaptability to modality shifts, and effectively mitigates mismatches arising from cross-modal differences.  
Conclusions  This paper proposes a visible-infrared cross-modality vehicle re-identification method based on Channel Relation Spectra Alignment (CaRS-Align). Concretely, the channel relation spectrum is constructed within each modality to preserve the semantic co-occurrence structures in the data. Then, a channel relation spectra alignment function is designed to maximize the correlation between the spectra of different modalities, thereby achieving consistent alignment and enhancing cross-modality performance. Experiments conducted on two public datasets, MSVR310 and RGBN300, demonstrate that CaRS-Align outperforms existing state-of-the-art methods in key metrics such as Rank-1 accuracy and mAP.
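The core idea, aligning channel-to-channel relation structures rather than channel features themselves, can be sketched numerically. This is a minimal illustration under stated assumptions: the relation spectrum is taken as a plain correlation matrix, the alignment score as a Pearson correlation between the two spectra's upper triangles, and the feature tensors are synthetic; the paper's actual relation modeling and loss may differ.

```python
# Minimal sketch (assumptions, not the paper's exact formulation) of the
# channel relation spectrum idea: each modality is summarized by its
# channel-to-channel correlation matrix, and alignment maximizes the
# correlation between the two matrices, which survives per-channel style
# shifts (e.g., the 0.5x intensity scaling applied to "ir" below).
import numpy as np

def relation_spectrum(feat):
    """feat: (C, N) channel features; returns the C x C correlation matrix."""
    return np.corrcoef(feat)

def alignment_score(spec_a, spec_b):
    """Pearson correlation between the two spectra's upper triangles."""
    iu = np.triu_indices_from(spec_a, k=1)
    return np.corrcoef(spec_a[iu], spec_b[iu])[0, 1]

rng = np.random.default_rng(1)
mix = rng.normal(size=(8, 3))                 # channels share latent factors
latent = rng.normal(size=(3, 256))
clean = mix @ latent                          # shared semantic structure
vis = clean + 0.1 * rng.normal(size=clean.shape)        # visible modality
ir = 0.5 * clean + 0.1 * rng.normal(size=clean.shape)   # infrared: style shift

score = alignment_score(relation_spectrum(vis), relation_spectrum(ir))
print(round(score, 3))    # near 1: relation structure survives the style gap
```

Because correlation is invariant to per-channel scaling, the two spectra stay nearly identical despite the imaging-style difference, which is the motivation for aligning relations instead of raw channel features.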
Adversarial Attacks on 3D Target Recognition Driven by Gradient Adaptive Adjustment
LIU Weiquan, SHEN Xiaoying, LIU Dunqiang, SUN Yanwen, CAI Guorong, ZANG Yu, SHEN Siqi, WANG Cheng
Available online  , doi: 10.11999/JEIT251264
Abstract:
In recent years, the deep integration of artificial intelligence and optoelectronic perception systems has significantly propelled the advancement of intelligent driving technologies. LiDAR, a core sensing modality, acquires high-precision, high-resolution three-dimensional point cloud data, making it an indispensable information source for environmental perception in intelligent driving systems. However, deep learning-based 3D point cloud recognition models exhibit marked vulnerability to meticulously crafted adversarial perturbations, leading to a sharp degradation in recognition performance and posing a serious security challenge to these optoelectronic perception systems. Research on adversarial attack methods for 3D point clouds is therefore crucial, both for enhancing model robustness and for ensuring the safe and reliable operation of intelligent driving systems. While existing attack methods have improved in effectiveness, their generated perturbations often lack concealment, produce outliers, and demonstrate poor imperceptibility, limiting their practical application in real-world scenarios. To address these issues, this paper proposes a Gradient Adaptive Adjustment (GAA) driven point cloud adversarial attack method. This approach begins by analyzing the decision-level vulnerabilities of 3D point cloud classifiers to identify key points that significantly influence the model’s output. It then adaptively adjusts gradient weights by incorporating local curvature information and optimizes perturbation generation under geometric constraints aligned with principal curvature directions, thereby ensuring a high attack success rate while maintaining the geometric consistency and visual naturalness of the adversarial point cloud. 
Experimental results on multiple public datasets demonstrate that the proposed method achieves a high attack success rate while significantly reducing perturbation intensity; for instance, on the ModelNet40 dataset against the PointNet model, it attains a 97.69% attack success rate by perturbing only 28 points on average, substantially outperforming existing comparative methods and providing an effective tool for evaluating and enhancing the security of intelligent driving optoelectronic perception systems.
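The key-point attack loop (score the gradient influence of every point, perturb only the most influential few under a norm constraint) can be sketched conceptually. This is not GAA itself: the linear "classifier" is a stand-in for a trained point cloud network, and the curvature-based gradient reweighting is simplified to a plain influence heuristic, both stated assumptions.

```python
# Conceptual sketch of a sparse key-point attack (a simplification, not the
# paper's GAA method): gradients identify the points that most influence the
# classifier score, and only those top-k points receive small bounded steps.
# The linear score below stands in for a trained point cloud network.
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))      # toy point cloud
w = rng.normal(size=3)                   # stand-in classifier weights

def score(pts):
    """Stand-in class score: mean projection of the cloud onto w."""
    return float((pts @ w).mean())       # per-point gradient is w / N

def attack(pts, k=28, step=0.05, iters=10):
    pts = pts.copy()
    grad = np.tile(w / len(pts), (len(pts), 1))     # analytic per-point grad
    # Perturb only the most influential points; influence here is gradient
    # norm times distance from the origin (a crude stand-in for the paper's
    # curvature-weighted selection).
    influence = np.linalg.norm(grad, axis=1) * np.linalg.norm(pts, axis=1)
    idx = np.argsort(influence)[-k:]
    for _ in range(iters):               # unit-norm gradient descent steps
        d = grad[idx]
        pts[idx] -= step * d / np.linalg.norm(d, axis=1, keepdims=True)
    return pts, idx

adv, idx = attack(points)
moved = int(np.abs(adv - points).any(axis=1).sum())
print(moved, round(score(points) - score(adv), 4))
```

Only 28 of 1024 points move, yet the class score drops, mirroring the abstract's observation that sparse, well-chosen perturbations suffice; GAA additionally constrains the step directions along principal curvatures to keep the perturbed surface natural.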
Resilient Average Consensus for Second-Order Multi-Agent Systems: Algorithms and Application
FANG Chongrong, HUAN Yuehui, ZHENG Wenzhe, BAO Xianchen, LI Zheng
Available online  , doi: 10.11999/JEIT251155
Abstract:
  Objective  Multi-agent systems (MASs) are pivotal for collaborative tasks in dynamic environments, with consensus algorithms serving as a cornerstone for applications such as formation control. However, MASs are vulnerable to misbehaviors (e.g., malicious attacks or accidental faults), which can disrupt consensus and compromise system performance. While resilient consensus methods exist for first-order systems, they are inadequate for second-order MASs, where agents’ dynamics involve both position and velocity. This work addresses the gap by developing a resilient average consensus framework for second-order MASs that ensures accurate collaboration under misbehaviors. The primary challenges include distributed error detection and compensating two-dimensional state errors (position and velocity) using one-dimensional acceleration inputs.  Methods  The study first derives sufficient conditions for second-order average consensus under misbehaviors, leveraging graph theory and Lyapunov stability analysis. The system is modeled as an undirected graph \begin{document}$ \mathcal{G}=(\mathcal{V},\mathcal{E}) $\end{document}, where agents follow double-integrator dynamics. Two algorithms are then proposed. Finite Input-Errors Detection-Compensation (FIDC) handles finite control input errors: Detection Strategies 1 and 2 use two-hop communication information to identify discrepancies in neighbors’ states or control inputs, and Compensation Scheme 1 designs input sequences that satisfy the consensus conditions (Corollary 1). Infinite Attack Detection-Compensation (IADC) handles infinite errors in control input, velocity, and position: the detection strategies are extended to identify falsified data, Compensation Schemes 2 and 3 mitigate the errors, and an exponentially decaying error bound isolates persistent attackers. The algorithms are distributed and require no global knowledge.  Results and Discussions  Simulations on a 10-agent network validate the algorithms’ efficacy. 
Under FIDC, agents achieve exact average consensus despite finite input errors from malicious and faulty agents (Fig. 5). IADC ensures consensus among normal agents after isolating malicious ones exceeding the error bound (Fig. 6). Experimental evaluations on a multi-robot platform demonstrate resilience against real-world faults (e.g., actuator failures) and attacks (e.g., false data injection). In fault scenarios, FIDC reduces formation center deviation from 180 mm to 34 mm (Fig. 8). For attacks, IADC isolates malicious robots, allowing normal agents to converge correctly (Fig. 9). Discussions on relaxing Assumption 1 (non-adjacent misbehaving agents) reveal that Detection Strategy 3 and majority voting can handle certain connected malicious topologies (Fig. 3 and Fig. 4), though complex cases require further study.  Conclusions  This work proposes a novel resilient average consensus framework for second-order MASs. Theoretically, sufficient conditions ensure consensus under misbehaviors, while FIDC and IADC algorithms enable distributed detection, compensation, and isolation of errors. Simulations and physical experiments confirm that the methods achieve accurate average consensus against both finite and infinite errors. Future work will explore extensions to directed networks, time-varying topologies, and higher-dimensional systems.
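The double-integrator consensus dynamics underlying this framework can be sketched in a few lines. The snippet below is a minimal nominal protocol without the FIDC/IADC detection and compensation layers; the gain `gamma`, step size, and path topology are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def double_integrator_consensus(adj, x, v, gamma=2.0, dt=0.01, steps=30000):
    """Nominal double-integrator consensus on an undirected graph:
    u_i = -sum_j a_ij [ (x_i - x_j) + gamma * (v_i - v_j) ].
    With zero initial velocities, positions converge to the average of the
    initial positions (no misbehaving agents are modeled in this sketch)."""
    L = np.diag(adj.sum(axis=1)) - adj   # graph Laplacian
    x = x.astype(float)
    v = v.astype(float)
    for _ in range(steps):
        u = -(L @ x) - gamma * (L @ v)   # acceleration input of each agent
        x, v = x + dt * v, v + dt * u    # forward-Euler double integrator
    return x, v

# Path graph of 4 agents; the average of the initial positions is 3.0
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x0 = np.array([0.0, 4.0, 2.0, 6.0])
v0 = np.zeros(4)
xf, vf = double_integrator_consensus(A, x0, v0)
```

Because the graph is undirected, the control inputs sum to zero across agents, so the state average is invariant and the agents settle exactly at it; the resilient algorithms above are designed to preserve this property under misbehaviors.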
AutoPenGPT: Drift-Resistant Penetration Testing Driven by Search-Space Convergence and Dependency Modeling
HUANG Weigang, FU Lirong, LIU Peiyu, DU Linkang, YE Tong, XIA Yifan, WANG Wenhai
Available online  , doi: 10.11999/JEIT250873
Abstract:
  Objective  Industrial Control Systems (ICS) are widely deployed in critical sectors and often remain exposed to long-standing vulnerabilities due to strict availability requirements and limited patching opportunities. The increasing exposure of externally facing management and access infrastructure has significantly expanded the attack surface, enabling attackers to pivot from boundary components into fragile production networks. Continuous penetration testing of such exposed components is therefore essential but remains costly and difficult to scale when relying on manual efforts. Recent work explores Large Language Models (LLMs) for automated penetration testing; however, existing systems often suffer from strategy drift and intention drift, leading to incoherent testing behaviors and ineffective exploitation chains.  Methods  To address these challenges, we propose AutoPenGPT, an intelligent multi-agent framework for automated Web security testing. AutoPenGPT introduces an adaptive exploration-space convergence mechanism that predicts potential vulnerability types from target semantics and constrains LLM-driven testing via a dynamically updated payload knowledge base. To mitigate intention drift in multi-step exploitation, we further design a dependency-driven strategy generation mechanism that semantically rewrites historical feedback, models step dependencies, and produces coherent, executable testing strategies in a closed-loop manner. In addition, a semi-structured prompt embedding framework is developed to support heterogeneous penetration testing tasks while preserving semantic integrity.  Results and Discussions  We evaluate AutoPenGPT on both CTF benchmarks and real-world ICS and web platforms. On CTF test sets, AutoPenGPT achieves 97.62% vulnerability type detection accuracy and 80.95% requirement completion rate, outperforming state-of-the-art tools by a significant margin. 
In real-world environments, it reaches approximately 70% requirement completion and uncovers six previously undisclosed vulnerabilities, demonstrating practical effectiveness.  Conclusions   The contributions are threefold: (1) We identify and systematically address strategy drift and intention drift in LLM-driven penetration testing, and propose adaptive exploration and dependency-aware strategy mechanisms to stabilize long-horizon testing behaviors. (2) We design and implement AutoPenGPT, a multi-agent penetration testing system that integrates semantic vulnerability prediction, closed-loop strategy generation, and semi-structured prompt embedding. (3) We demonstrate the effectiveness and practicality of AutoPenGPT through extensive evaluation on CTF and real-world ICS and web platforms, including the discovery of previously unknown vulnerabilities.
Multi-UAV RF Signals CNN|Triplet-DNN Heterogeneous Network Feature Extraction and Type Recognition
ZHAO Shen, LI Guangxuan, ZHOU Xiancheng, HUANG Wendi, YANG Lingling, GAO Liping
Available online  , doi: 10.11999/JEIT250757
Abstract:
  Objective  To address the detection requirements for multiple types of unmanned aerial vehicles (UAVs) operating simultaneously, the pivotal strategy involves extracting model-specific information features from the radio frequency (RF) time-frequency spectrum. Consequently, an innovative CNN|Triplet-DNN heterogeneous network architecture has been developed to optimize feature extraction and classification methodologies. This solution effectively resolves the challenge of identifying individual models within the coexisting signals of multiple UAVs, thereby laying the groundwork for efficient management and control of multiple UAVs in complex operational environments.  Methods  The CNN|Triplet-DNN heterogeneous network architecture adopts a parallel-branch structure that integrates convolutional neural network (CNN) and Triplet Convolutional Neural Network (Triplet-CNN) components. Specifically, Branch 1 employs a lightweight CNN architecture to extract global features from RF time-frequency diagrams while minimizing computational complexity. Branch 2 incorporates an enhanced center loss function to improve the discriminative capability of global features, thereby effectively resolving the ambiguity in feature boundaries of time-frequency diagrams under complex scenarios. Branch 3, built on the Triplet-CNN framework, utilizes Triplet Loss to simultaneously capture both local and global features of RF time-frequency diagrams. The complementary features from each branch are subsequently integrated and processed via a DNN fully connected layer combined with the Softmax activation function, generating probability distributions for drone signal classification. This approach significantly enhances the performance of aircraft type recognition and classification.  
  Results and Discussions  RF signals from the open-source DroneRFa dataset were superimposed to simulate multi-drone coexistence signals, while real-world drone signals were collected through controlled flight experiments to construct a comprehensive drone signal database. (1) Based on the single-drone RF time-frequency diagrams from the open-source dataset, ablation experiments (Fig. 7) were conducted on the three-branch structure of the CNN|Triplet-DNN model to demonstrate the soundness and rationality of its design, and each model was trained. (2) The simulated multi-drone coexistence signal dataset was employed for identification tasks to evaluate the recognition performance of each model under multi-drone coexistence scenarios. Experimental results (Fig. 10) demonstrate that the recognition accuracy for four or fewer drone types ranges from 83% to 100%, thereby validating the efficacy of the CNN|Triplet-DNN model. (3) Each model was trained using the flight dataset and then applied to identify actual multi-drone coexistence signals. The CNN|Triplet-DNN model achieved recognition accuracies of 86%, 57%, and 73% for two, three, and four drone types, respectively (Fig. 14). Comparative analysis with the CNN, Triplet-CNN, and Transformer models reveals that the CNN|Triplet-DNN exhibits superior generalizability. Notably, all models experienced performance degradation when tested against real-world data compared to the open-source dataset, primarily due to the dynamic adjustment of drone communication frequency bands, which adversely affects multi-drone recognition performance.  Conclusions  To tackle the challenge of coexistence identification for RF signals emitted by multiple UAVs, a novel heterogeneous network architecture integrating CNN|Triplet-DNN is proposed. This model, leveraging a three-branch structural framework and backpropagation algorithm, demonstrates superior capability in extracting discriminative features of aircraft models. 
The incorporation of DNN significantly enhances the model's generalization capacity. The efficacy and practical applicability of the proposed approach have been validated through comprehensive experiments utilizing open-source datasets and real-world flight scenarios. Future research directions will focus on dataset expansion, model optimization for dynamic communication frequency band adaptation, and enhancement of recognition performance in complex coexistence environments.
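The Triplet Loss driving Branch 3 can be written in its standard margin form. The sketch below is a minimal batch implementation; the margin value, embedding dimensionality, and triplet examples are illustrative assumptions, and the paper's triplet-mining strategy is not shown.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss over a batch of embeddings: pulls
    anchor/positive pairs (same drone type) together and pushes
    anchor/negative pairs (different drone types) apart by at least `margin`."""
    d_ap = np.linalg.norm(anchor - positive, axis=1)  # same-class distance
    d_an = np.linalg.norm(anchor - negative, axis=1)  # cross-class distance
    return float(np.maximum(d_ap - d_an + margin, 0.0).mean())

# A well-separated triplet incurs zero loss; a hard negative does not
a = np.array([[0.0, 0.0]])       # anchor embedding
p = np.array([[0.0, 1.0]])       # positive: same drone type
n_far = np.array([[3.0, 0.0]])   # easy negative: already far away
n_near = np.array([[0.0, 0.5]])  # hard negative: closer than the positive
loss_easy = triplet_margin_loss(a, p, n_far)
loss_hard = triplet_margin_loss(a, p, n_near)
```

Minimizing this loss shapes the embedding space so that local and global features of the same aircraft type cluster together, which is what the subsequent DNN fusion stage exploits.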
Automating Algorithm Design for Agile Satellite Task Assignment with Large Language Models and Reinforcement Learning
CHEN Yingguo, WANG Feiran, HU Yunpeng, YANG Bin, YAN Bing
Available online  , doi: 10.11999/JEIT250991
Abstract:
  Objective  The Multi-Agile Earth Observation Satellite Mission Scheduling Problem (MAEOSMSP) is an NP-hard problem. Algorithm design for this problem has long been constrained by reliance on expert experience and limited adaptability across diverse scenarios. To address this limitation, an Adaptive Algorithm Design (AAD) framework is proposed. The framework integrates a Large Language Model (LLM) and Reinforcement Learning (RL) to enable automated generation and intelligent application of scheduling algorithms. It is built on a novel offline evolution-online decision-making architecture. The objective is to discover heuristic algorithms that outperform human-designed methods and to provide an efficient and adaptive solution methodology for the MAEOSMSP.  Methods  The AAD framework adopts a two-stage mechanism. In the offline evolution stage, LLM-driven evolutionary computation is used to automatically generate a diverse and high-quality library of task assignment algorithms, thereby alleviating the limitations of manual design. In the online decision-making stage, an RL agent is trained to dynamically select the most suitable algorithm from the library based on the real-time solving state (e.g., solution improvement and stagnation). This process is formulated as a Markov decision process, which allows the agent to learn a policy that adapts to problem-solving dynamics.  Results and Discussions  The effectiveness of the AAD framework is evaluated through comprehensive experiments on 15 standard test scenarios. The framework is compared with several state-of-the-art methods, including expert-designed heuristics, an advanced deep learning approach, and ablation variants of the proposed framework. The results show that the dynamic strategies generated by AAD consistently outperform the baselines, with performance improvements of up to 9.8% in complex scenarios. 
Statistical analysis further indicates that AAD achieves superior solution quality and demonstrates strong robustness across different problem instances.  Conclusions  A novel AAD framework is presented to automate algorithm design for the MAEOSMSP by decoupling algorithm generation from algorithm application. The combination of LLM-based generation and RL-based decision making is validated empirically. Compared with traditional hyper-heuristics and existing LLM-based methods, the proposed architecture enables both the creation of new algorithms and their dynamic application. The framework provides a new paradigm for solving complex combinatorial optimization problems and shows potential for extension to other domains.
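The online decision-making stage, in which an agent selects an algorithm from the library according to the solving state, can be illustrated with a tabular Q-learning sketch. The two-state solving-state encoding (improving vs. stagnating), the reward, and the transition rule below are toy assumptions for illustration, not the paper's MDP formulation.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step for the algorithm-selection MDP:
    states are coarse solving-state codes, actions are library indices."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

rng = np.random.default_rng(0)
Q = np.zeros((2, 3))  # states: 0 = improving, 1 = stagnating; 3 library algorithms
s = 0
for _ in range(5000):
    # epsilon-greedy selection over the algorithm library
    a = int(rng.integers(3)) if rng.random() < 0.3 else int(np.argmax(Q[s]))
    # Toy dynamics: the search stagnates after each improving step, and only
    # algorithm 2 (a hypothetical "diversifying" heuristic) escapes stagnation.
    r = 1.0 if (s == 1 and a == 2) else 0.0
    s_next = 1 if s == 0 else (0 if a == 2 else 1)
    q_update(Q, s, a, r, s_next)
    s = s_next
```

After training, the learned policy selects the diversifying algorithm whenever the solving state signals stagnation, mirroring the adaptive switching behavior described above.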
A Morphology-guided Decoupled Framework for Oriented SAR Ship Detection
WANG Zeyu, WANG Qingsong
Available online  , doi: 10.11999/JEIT250979
Abstract:
  Objective  Synthetic Aperture Radar (SAR) is an indispensable remote sensing technology; however, accurate ship detection in SAR imagery remains challenging. Most deep learning-based detection approaches rely on Horizontal Bounding Box (HBB) annotations, which do not provide sufficient geometric information to estimate ship orientation and scale. Although Oriented Bounding Box (OBB) annotation contains such information, reliable OBB labeling for SAR imagery is costly and frequently inaccurate because of speckle noise and geometric distortions intrinsic to SAR imaging. Weakly supervised object detection provides a potential alternative, yet approaches designed for optical imagery exhibit limited generalization capability in the SAR domain. To address these limitations, a simulation-driven decoupled framework is proposed. The objective is to enable standard HBB-based detectors to produce accurate OBB predictions without structural modification by training a dedicated orientation estimation module using a fully supervised synthetic dataset that captures essential SAR ship morphology.  Methods  The proposed framework decomposes oriented ship detection into two sequential sub-tasks: coarse localization and fine-grained orientation estimation (Fig. 1). First, an axis-aligned localization module based on a standard HBB detector, such as YOLOX, is trained using available HBB annotations to identify candidate regions of interest. This stage exploits the high-recall capability of mature detection networks and outputs image patches that potentially contain ship targets. Second, to learn orientation information without real OBB annotations, a large-scale morphological simulation dataset composed of binary images is constructed. The dataset generation begins with simple binary rectangles of randomized aspect ratios and known ground-truth orientations. 
To approximate the appearance of binarized SAR ship targets, morphological operations, including edge-level and region-level erosion and dilation, are applied to introduce boundary ambiguity. Structured strong scattering cross noise is further injected to simulate SAR-specific artifacts. This process yields a synthetic dataset with precise orientation labels. Third, an orientation estimation module based on a lightweight ResNet-18 architecture is trained exclusively on the synthetic dataset. This module predicts object orientation and refines aspect ratio using only shape and contour information. During inference, candidate patches produced by the localization module are binarized and processed by the orientation estimation module. Final OBBs are generated by fusing the spatial coordinates derived from the initial HBBs with the predicted orientation and refined dimensions.  Results and Discussions  The proposed method is evaluated on two public SAR ship detection benchmarks, HRSID and SSDD. Training is conducted using only HBB annotations, whereas performance is assessed against ground-truth OBBs using Average Precision at 0.5 intersection over union (AP50) and Recall (R). The method demonstrates superior performance relative to existing weakly supervised approaches and remains competitive with fully supervised methods (Table 1 and Table 2). On the HRSID dataset, an AP50 of 84.3% and a recall of 91.9% are achieved. These results exceed those of weakly supervised methods such as H2Rbox-v2 (56.2% AP50) and the approach reported by Yue et al.[14] (81.5% AP50), and also outperform several fully supervised detectors, such as R-RetinaNet (72.7% AP50) and S2ANet (80.8% AP50). A similar advantage is observed on the SSDD dataset, where an AP50 of 89.4% is obtained, representing a significant improvement over the best reported weakly supervised result of 87.3%. Qualitative inspection of detection outputs supports these quantitative results (Fig. 3). 
The proposed method shows a lower missed-detection rate, particularly for small and densely clustered ships, relative to other weakly supervised approaches. This robustness is attributed to the high-recall property of the first-stage localization network combined with reliable orientation cues learned from the morphological dataset. To examine key methodological aspects, additional experiments are conducted. Analysis of the domain gap between synthetic and real data using UMAP-based visualization of high-dimensional features (Fig. 5) reveals substantial overlap and similar manifold structures across domains, indicating strong morphological consistency. An ablation study of the morphological components (Fig. 4) further shows that each simulation element contributes incrementally to performance improvement, supporting the design of the high-fidelity simulation process.  Conclusions  A morphology-guided decoupled framework for oriented ship detection in SAR imagery is presented. By separating localization and orientation estimation, standard HBB-based detectors are enabled to perform accurate oriented detection without retraining. The central contribution is a fully supervised morphological simulation dataset that allows a dedicated module to learn robust orientation features from structural contours, thereby mitigating the annotation challenges associated with real SAR data. Experimental results demonstrate that the proposed approach substantially outperforms existing HBB-supervised methods and remains competitive with fully supervised alternatives. The plug-and-play design highlights its practical applicability.
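The morphological simulation pipeline described above (rotated binary rectangles with known orientation labels, boundary roughening, and cross-shaped strong-scattering artifacts) can be sketched in a few lines. All sizes, probabilities, and the noise model below are illustrative assumptions rather than the paper's exact generation procedure.

```python
import numpy as np

def synth_ship(size=64, angle_deg=30.0, length=40, width=12,
               roughen=True, cross=True, seed=0):
    """One synthetic sample: a rotated binary rectangle with a known
    orientation label, optionally roughened at the boundary and overlaid
    with a bright cross artifact mimicking strong scattering."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    c, t = size / 2.0, np.deg2rad(angle_deg)
    u = (xx - c) * np.cos(t) + (yy - c) * np.sin(t)    # long-axis coordinate
    v = -(xx - c) * np.sin(t) + (yy - c) * np.cos(t)   # short-axis coordinate
    img = ((np.abs(u) <= length / 2) & (np.abs(v) <= width / 2)).astype(np.uint8)
    if roughen:  # boundary ambiguity: randomly flip pixels on the contour
        p = np.pad(img, 1)
        nbrs = np.stack([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])
        boundary = (nbrs.min(axis=0) != img) | (nbrs.max(axis=0) != img)
        img = np.where(boundary & (rng.random(img.shape) < 0.5), 1 - img, img)
    if cross:    # structured strong-scattering cross through a random point
        r, col = rng.integers(0, size, size=2)
        img[r, :] = 1
        img[:, col] = 1
    return img.astype(np.uint8), angle_deg
```

On the clean rectangle (noise disabled), the labeled orientation is recoverable from the principal axis of the foreground pixels, which is what allows the orientation module to learn from shape and contour alone.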
A Landmark Matching Method Considering Gray–Gradient Dual-Channel Features and Deformation Parameter Optimization
XU Changding, LIU Shijie, XIAO Changjiang
Available online  , doi: 10.11999/JEIT250953
Abstract:
  Objective  High-precision optical autonomous navigation is a key technology for deep-space exploration and planetary landing missions. During the descent phase of a lunar lander, communication delays and accumulated errors in the Inertial Navigation System (INS) lead to significant positioning deviations, which pose serious risks to safe landing. Optical images acquired by the lander are matched with pre-stored lunar landmark databases to establish correspondences between image coordinates and three-dimensional coordinates of lunar surface features, thereby enabling precise position estimation. This process is challenged by dynamic illumination variation on the lunar surface, noise in prior pose information, and limited onboard computational resources. Traditional template matching methods exhibit high computational cost and sensitivity to rotation and scale variation. Keypoint-based methods, such as Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF), suffer from uneven keypoint distribution and sensitivity to illumination variation, which results in reduced robustness. Deep learning-based approaches, including SuperPoint, SuperGlue, and LF-Net, improve feature detection accuracy but require substantial computational resources, which restricts real-time onboard deployment. To address these limitations, a landmark matching algorithm is proposed that integrates gray–gradient dual-channel features with deformation parameter optimization, enabling high-precision and real-time matching for lunar optical autonomous navigation.  Methods  Dual-channel image features are constructed by combining gray-level intensity and gradient magnitude representations. Gradient features are computed using Sobel operators in the horizontal and vertical directions, and the gradient magnitude is calculated as the Euclidean norm of the two components. 
To reduce the effect of local brightness variation and ensure inter-region comparability, zero-mean normalization is applied independently to each feature channel. An adaptive weighting strategy is employed, in which weights are assigned according to local gradient saliency, and a bias term is introduced to retain weak texture information, thereby improving robustness under noisy conditions. Landmark matching is formulated as a nonlinear least-squares optimization problem. A deformation parameter vector is defined, which includes incremental rotation, scale, and translation relative to the prior pose. The objective function minimizes the weighted sum of squared differences between dual-channel landmark features and image features, with Tikhonov regularization applied to constrain parameter magnitude and improve numerical stability. The Levenberg–Marquardt (LM) algorithm is adopted to iteratively estimate the optimal deformation parameters. Its adaptive damping strategy enables switching between gradient descent and Gauss–Newton updates, ensuring stable convergence under large prior pose errors. Iteration terminates when the error norm falls below a predefined threshold or when the maximum iteration number is reached, yielding the optimal landmark transformation parameters.  Results and Discussions  Experiments are conducted using simulated lunar landing images generated from 60 m-resolution SLDEM (Digital Elevation Model Coregistered with SELENE Data) data, with high-fidelity illumination rendering applied to ensure realistic lighting conditions (Fig. 2). To evaluate matching performance under different scenarios, 143 landmarks are synthesized with systematically controlled perturbations in rotation, scale, and translation. 
Four representative methods are selected for comparison, including convolution-accelerated Normalized Cross-Correlation (NCC), SURF-based feature matching with image enhancement, globally and locally optimized NCC, and the proposed algorithm (Fig. 4). The results indicate clear performance differences among the methods. Convolution-accelerated NCC achieves sub-second runtime and demonstrates high computational efficiency, although its accuracy degrades under gray-level variation and geometric deformation, with mean absolute errors of 2.41 px along the x-axis and 4.47 px along the y-axis, and a success rate of 89.51% (Table 1). SURF-based matching achieves sub-pixel accuracy, with mean absolute errors of 0.56 px along the x-axis and 0.54 px along the y-axis, although its success rate is limited to 48.95% and its runtime exceeds one second, which restricts onboard applicability. The globally and locally optimized NCC method exhibits the lowest accuracy, with errors of 4.54 px along the x-axis and 4.92 px along the y-axis, and the longest runtime of 4.41 s, despite achieving a 100% success rate. In contrast, the proposed algorithm consistently achieves sub-pixel accuracy comparable to SURF, maintains a 100% success rate, and sustains a stable runtime of approximately 0.5 s across all test cases. Its robustness to landmark deformation and illumination variation demonstrates suitability for complex operational conditions. Overall, the results show that the proposed algorithm achieves a favorable balance among accuracy, robustness, and computational efficiency.  Conclusions  A landmark matching algorithm is presented that integrates gray–gradient dual-channel features with deformation parameter optimization. Gray-level intensity and gradient magnitude information from both landmark templates and lander images are jointly exploited to construct a dual-channel matching model that minimizes feature differences. 
Deformation parameters, including rotation, scale, and translation, are iteratively optimized using the LM algorithm, enabling rapid estimation of the optimal landmark position in the lander image. Experimental results show stable convergence within sub-second runtime, with an average matching error of 1.03 pixels under disturbances in attitude, scale, and position. The proposed method outperforms single-channel gray-level cross-correlation and SURF-based matching approaches in accuracy, robustness, and efficiency. These results provide practical support for the design and implementation of future autonomous lunar optical navigation systems.
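The dual-channel feature construction described in the Methods (gray intensity plus Sobel gradient magnitude, each channel independently zero-mean normalized) can be sketched as follows; the edge-replication padding and the absence of a variance-scaling step are assumptions of this sketch.

```python
import numpy as np

def dual_channel_features(img):
    """Gray + gradient-magnitude channels, each zero-mean normalized,
    as a sketch of the dual-channel feature construction."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # Sobel, horizontal
    ky = kx.T                                                         # Sobel, vertical
    h, w = img.shape
    pad = np.pad(img.astype(float), 1, mode='edge')
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):            # cross-correlation with the 3x3 kernels
        for j in range(3):
            win = pad[i:i + h, j:j + w]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    grad = np.hypot(gx, gy)       # Euclidean norm of the two gradient components
    gray = img.astype(float)
    return np.stack([gray - gray.mean(), grad - grad.mean()])

# A vertical step edge: the gradient channel peaks at the edge columns
step = np.zeros((8, 8))
step[:, 4:] = 1.0
feats = dual_channel_features(step)
```

In the full algorithm, the weighted squared difference between these landmark and image features forms the objective that the LM iteration minimizes over the deformation parameters.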
Autonomous Radar Scan-Mode Recognition Method Based on High-Dimensional Features and Random Forest
WU Kanghui, GUO Zixun, FAN Yifei, XIE Jian, TAO Mingliang
Available online  , doi: 10.11999/JEIT250985
Abstract:
  Objective  Rapid and robust recognition of radar scanning modes under noncooperative electronic reconnaissance conditions is a prerequisite for threat assessment, resource scheduling, and countermeasure design. Mechanical Scanning (MST) and phased-array Electronic Scanning (EST) leave different physical imprints in the Time-Of-Arrival-Pulse Amplitude (TOA-PA) stream. However, their separability degrades under low Signal-to-Noise Ratio (SNR), nonstationary dwell scheduling, and jittered timing typical of dense electromagnetic environments. In this study, a physics-grounded, multi-domain feature framework coupled with a Random Forest (RF) classifier is developed to discriminate MST from EST using only intercepted TOA-PA sequences, without synchronization or prior emitter knowledge.  Methods  The reconnaissance reception chain is modeled, and Pulse Amplitude (PA) formation is formalized to clarify the association between antenna-pattern traversal and amplitude texture. From this physical perspective, seven complementary features are extracted across time, frequency, and graph structure: Coefficient of Variation (CV), Total Variation (TV), Gaussian Fitting Degree (GFD), Relative Width (RW), Spectral Flatness Measure (SFM), Global Clustering Coefficient (GCC) on a Horizontal Visibility Graph (HVG), and Normalized Degree Entropy (NDE). HVG construction preserves temporal order and reveals global structure induced by sequence shape. Features are computed per frame and concatenated into a seven-dimensional vector. The RF classifier is trained using bootstrap sampling and random-subspace splits, and inference is performed by majority voting over leaf-level posteriors. The full pipeline is summarized in Fig. 10. Computational complexity remains near linear: CV, TV, and RW scale as O(N); SFM is dominated by a single fast Fourier transform with O(Nlog2N); and HVG-based features scale as O(Nlog2N), satisfying low-latency constraints.  
Results and Discussions  The dataset is constructed using paired MST and EST frames with time-of-arrival jitter of approximately 0.2% of the pulse repetition interval, additive white Gaussian noise across SNR levels, and realistic beam patterns that include sidelobes for both scanning schemes. Training spans 0\begin{document}$ \sim $\end{document}30 dB, and testing covers –5.5\begin{document}$ \sim $\end{document}29.5 dB in 5 dB steps. Using the proposed seven-feature vector, the RF classifier achieves an average accuracy of 97.59% across all SNRs and exceeds a support vector machine baseline with identical inputs at 96.01%. The largest margins are observed at low to mid SNR, as shown in Fig. 11. Single-feature analysis shows clear heterogeneity and complementarity. SFM provides the best single-feature performance at 0.916 1, followed by TV and NDE at 0.822 0 and 0.806 5, respectively. CV and GFD show intermediate performance at approximately 0.66, whereas RW and graph-based similarity measures are lower at approximately 0.56\begin{document}$ \sim $\end{document}0.57. Joint multi-feature inputs increase accuracy to 0.975 9, yielding an absolute gain of 5.98 percentage points over the best single feature and reducing the error rate from 8.39% to 2.41%, corresponding to a relative reduction of approximately 71%. These improvements are summarized in Table 1 and Fig. 12. Runtime evaluation indicates that the dominant computational cost arises from the fast Fourier transform and HVG construction. A per-frame computation time of approximately 0.515 ms keeps the method suitable for on-orbit and embedded processing. The performance gains arise from the joint capture of four factors: smooth versus stepwise amplitude evolution represented by CV and TV; main-lobe morphology and time scale represented by GFD and RW, as illustrated in Fig. 4; spectral concentration versus dispersion represented by SFM, as illustrated in Fig. 
5; and topology induced by alternating highs and lows under dwell switching represented by HVG clustering and entropy, as detailed in Fig. 7. Together, these factors stabilize the decision boundary against noise and dwell nonstationarity.  Conclusions  A physics-grounded, multi-domain feature framework combined with an RF discriminator is presented for radar scanning mode recognition under noncooperative conditions. The method is derived from intrinsic contrasts between MST, characterized by continuous, smooth, and quasi-periodic behavior, and phased-array EST, characterized by dwell-based, jumping, and nonstationary behavior. A TOA-PA signal model consistent with engineering practice is constructed, and complementary features are designed across time (CV, TV), main-lobe morphology (GFD, RW), frequency (SFM), and graph structure (GCC, NDE). The RF classifier applies bootstrap sampling and random subspaces to reduce variance and mitigate overfitting, enabling robust decisions. Across detection scenarios from –5 dB to 30 dB, an average accuracy of 97.59% is obtained. Compared with schemes based on single-domain features or a limited feature set, the proposed framework provides higher recognition stability under low-SNR and other challenging disturbance conditions.
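Three of the seven features (CV, TV, and SFM) can be computed directly from an intercepted pulse-amplitude frame. The sketch below applies them to a synthetic MST-like (smooth, quasi-periodic sweep) and EST-like (stepwise random dwell) sequence; the signal parameters are illustrative assumptions.

```python
import numpy as np

def pa_features(pa):
    """CV, TV, and SFM of one pulse-amplitude frame (the GFD, RW, and
    HVG-based features of the full framework are omitted from this sketch)."""
    pa = np.asarray(pa, dtype=float)
    cv = pa.std() / pa.mean()                        # Coefficient of Variation
    tv = float(np.abs(np.diff(pa)).sum())            # Total Variation
    spec = np.abs(np.fft.rfft(pa - pa.mean())) ** 2  # power spectrum, DC removed
    # Spectral Flatness Measure: geometric mean / arithmetic mean of the spectrum
    sfm = np.exp(np.log(spec + 1e-12).mean()) / (spec.mean() + 1e-12)
    return cv, tv, sfm

n = np.arange(256)
mst = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * n / 256)            # smooth main-lobe sweep
est = np.random.default_rng(0).choice([0.5, 1.0, 1.5], 256)  # stepwise random dwells
cv_m, tv_m, sfm_m = pa_features(mst)
cv_e, tv_e, sfm_e = pa_features(est)
```

The stepwise EST-like sequence yields a much larger total variation and a far flatter spectrum than the smooth MST-like sweep, which is exactly the separability the RF classifier exploits.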
A Focused Attention and Feature Compact Fusion Transformer for Semantic Segmentation of Urban Remote Sensing Images
ZHOU Guoyu, ZHANG Jing, YAN Yi, ZHUO Li
Available online  , doi: 10.11999/JEIT250812
Abstract:
  Objective  Driven by the growing integration of remote sensing data acquisition and intelligent interpretation technologies within aerospace information intelligent processing, semantic segmentation of Urban Remote Sensing Image (URSI) has emerged as a key research area connecting aerospace information and urban computing. However, compared with general Remote Sensing Image (RSI), URSI exhibits highly diverse and complex geo-objects, characterized by fine-grained intra-class variations, confusable inter-class similarities, and blurred, irregular object boundaries. These factors present difficulties for fine-grained segmentation. Despite their success in RSI semantic segmentation, Transformer-based methods applied to URSI must balance the capture of detailed features and boundaries against the computational cost of self-attention. To address these issues, this paper introduces a focused attention mechanism in the encoder to efficiently capture discriminative intra- and inter-class features, while performing compact edge feature fusion in the decoder.  Methods  This paper proposes a Focused attention and Feature compact Fusion Transformer (F3Former). The encoder incorporates a dedicated Feature-Focused Encoding Block (FFEB). By leveraging the focused attention mechanism, it adjusts the directions of Query and Key features such that features of the same class are pulled closer while those of different classes are repelled, thereby enhancing intra-class consistency and inter-class separability during feature representation. This process yields a compact and highly discriminative attention distribution, which amplifies semantically critical features while curbing computational overhead. To complement this design, the decoder employs a Compact Feature Fusion Module (CFFM), where Depth-Wise Convolution (DW Conv) is utilized to minimize redundant cross-channel computations. 
This design strengthens the discriminative power of edge representations, improves inference efficiency and deployment adaptability, and maintains segmentation accuracy.  Results and Discussions  F3Former demonstrates favorable performance on several benchmark datasets, alongside lower computational complexity. On the Potsdam and Vaihingen benchmarks, it attained mIoU scores of 88.33% and 81.32%, respectively, ranking second only to TEFormer with marginal differences in accuracy (Table 1). Compared with other lightweight models including CMTFNet, ESST, and FSegNet, F3Former consistently delivered superior results in mIoU, mF1, and PA, demonstrating the efficacy of the proposed FFEB and CFFM modules in capturing complex URSI features. On the LoveDA dataset, it reached 53.16% mIoU and outperformed D2SFormer in several critical categories (Fig. 4). Moreover, F3Former strikes a favorable balance between accuracy and efficiency, reducing parameter count and FLOPs by over 30% compared to TEFormer, with only negligible degradation in accuracy (Table 2). Qualitative results further indicate clearer boundary delineation and improved recognition of small or occluded objects relative to other lightweight approaches (Fig. 5 and Fig. 6). Ablation studies validate the critical role of both the Focused Attention (FA) mechanism and the Compact Feature Fusion Head (CFFHead) in achieving accuracy and efficiency gains (Table 3 and Table 4).  Conclusions  This work tackles key challenges in URSI semantic segmentation, including intra-class variability, inter-class ambiguity, and complex boundaries, by proposing F3Former. In the encoder, the FFEB improves intra-class aggregation and inter-class discrimination through directional feature modeling. In the decoder, the CFFM employs DW Conv to minimize redundancy and enhance boundary representations. With linear complexity, F3Former attains higher accuracy and stronger representational capacity while remaining efficient and deployment-friendly. 
Extensive experiments across multiple URSI benchmarks confirm its superior performance, highlighting its practicality for large-scale URSI applications. However, compared to existing State-Of-The-Art (SOTA) lightweight methods, the computational efficiency of the FFEB still has room for improvement. Future work is directed towards replacing Softmax with a more efficient operator to accelerate attention computation, maintaining accuracy while advancing efficient URSI semantic segmentation. Additionally, as the decoder’s channel interaction mechanism remains relatively limited, we plan to incorporate lightweight attention or pointwise convolution designs to further strengthen feature fusion.
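The role of DW Conv in cutting redundant cross-channel computation can be seen in a minimal NumPy sketch (an illustration of the general technique, not the authors' implementation): a depth-wise convolution applies one k×k filter per channel and never mixes channels, so its weight count drops from C_in·C_out·k² for a standard convolution to C·k².

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """Depth-wise 2D convolution (valid padding): one k x k filter per channel.

    x: (C, H, W) input; kernels: (C, k, k). Channels are never mixed,
    which is what removes the redundant cross-channel computation.
    """
    C, H, W = x.shape
    _, k, _ = kernels.shape
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = np.sum(x[c, i:i+k, j:j+k] * kernels[c])
    return out

# Weight-count comparison for C_in = C_out = 64 channels, 3x3 kernels:
C, k = 64, 3
standard_params = C * C * k * k   # a full convolution mixes every channel pair
depthwise_params = C * k * k      # one filter per channel
print(standard_params, depthwise_params)  # 36864 576
```

The 64× reduction in weights here is generic to the depth-wise design; the actual CFFM layer sizes are not given in the abstract.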
A Review of Ground-to-Aerial Cross-View Localization Research
HU Di, YUAN Xia, XU Xiaoqiang, ZHAO Chunxia
Available online  , doi: 10.11999/JEIT250167
Abstract:
  Significance   This paper presents a comprehensive review of ground-to-aerial cross-view localization, systematically organizing representative methods, benchmark datasets, and evaluation metrics. Notably, it is the first review to systematically organize ground-to-aerial cross-view localization algorithms that integrate range sensors, such as Light Detection and Ranging (LiDAR) and millimeter-wave radar, thereby providing new perspectives for subsequent research. Ground-to-aerial cross-view localization has emerged as a key topic in computer vision, aiming to determine the precise pose of ground-based sensors by referencing aerial imagery. This technology is increasingly applied in autonomous driving, unmanned aerial vehicle navigation, intelligent transportation systems, and urban management. Despite substantial progress, ground-to-aerial cross-view localization continues to face major challenges arising from temporal and spatial variations, including seasonal changes, day-night transitions, weather conditions, viewpoint differences, and scene layout changes. These factors require more robust and accurate algorithms to reduce localization errors. This review summarizes the state of the art and provides a forward-looking discussion of challenges and research directions.  Progress   Ground-to-aerial cross-view localization has advanced rapidly, particularly through the integration of range sensors such as LiDAR and millimeter-wave radar, which has opened new research directions and application scenarios. The development of this field can be divided into several stages. Early studies rely on manually designed features, marking a transition from same-view localization to cross-view geographic localization. With the emergence of deep learning, metric learning, image transformation, and image generation methods are adopted to learn correspondences between images captured from different viewpoints. 
However, many deep learning models exhibit limited robustness to temporal and spatial variations, especially in long-term cross-season scenarios in which visual appearances at the same location differ markedly across seasons. Additionally, the large-scale nature of urban environments presents difficulties for efficient image retrieval and matching. Range sensors provide accurate distance measurements and three-dimensional structural information, which support reliable localization in scenes where visual cues are weak or absent. Nevertheless, effective fusion of range-sensor data and visual data remains challenging because of discrepancies in spatial resolution, sampling frequency, and sensing coverage.  Conclusions  This paper reviews the evolution of ground-to-aerial cross-view localization technologies and analyzes major technical advances and their driving factors at different stages. From an algorithmic perspective, the main categories of ground-to-aerial cross-view localization methods are systematically discussed to provide a clear theoretical framework and technical overview. The role of benchmark datasets in promoting progress in this field is highlighted by comparing the performance of representative models across datasets, thereby clarifying differences and relative advantages among methods. Although notable progress has been achieved, several challenges persist, including cross-region localization accuracy, precise localization over large-scale aerial imagery, and sensitivity to temporal changes in geographic features. Further research is required to improve the robustness, accuracy, and efficiency of localization systems.  Prospects   Future research on ground-to-aerial cross-view localization is expected to concentrate on several directions. Greater attention should be paid to transforming range-sensor data into feature representations that align effectively with image features, enabling efficient cross-modal information fusion. 
Multi-branch network architectures, in which different modalities are processed separately and then fused, may support richer feature extraction. Graph-based models may also be explored to capture shared semantics between ground and aerial views and to support information propagation across modalities. In addition, algorithms that adapt to seasonal variation, day-night cycles, and changing weather conditions are required to enhance robustness and localization accuracy. The integration of multi-scale and multi-temporal data may further improve adaptability to spatio-temporal variation, for example through the combination of images with different spatial resolutions or acquisition times. For large-scale urban environments, efficient search and matching strategies remain essential. Parallel computing frameworks may be applied to manage large datasets and accelerate retrieval, whereas algorithmic strategies such as pruning can reduce computational redundancy and improve matching efficiency. Overall, although ground-to-aerial cross-view localization continues to face challenges, it shows substantial potential for further methodological development and practical deployment.
DetDiffRS: A Detail-Enhanced Diffusion Model for Remote Sensing Image Super-Resolution
SONG Miao, CHEN Zhiqiang, WANG Peisong, XING Xiangwei, HUANG Liwei, CHENG Jian
Available online  , doi: 10.11999/JEIT250995
Abstract:
  Objective  This study aims to enhance the reconstruction of fine structural details in High-Resolution (HR) Remote Sensing image Super-Resolution (RSSR) by leveraging Diffusion Models (DM). Although diffusion-based approaches achieve strong performance in natural image restoration, their direct application to remote sensing imagery remains suboptimal because of the pronounced imbalance between extensive low-frequency homogeneous regions, such as water bodies and farmland, and localized high-frequency regions with complex structures, such as buildings, ports, and aircraft. This imbalance leads to insufficient learning of critical high-frequency details, resulting in reconstructions that appear globally smooth but lack sharpness and realism. To address this limitation, DetDiffRS, a detail-enhanced diffusion-based framework, is proposed to explicitly increase sensitivity to high-frequency information during data sampling and optimization, thereby improving perceptual quality and structural fidelity.  Methods  The DetDiffRS framework introduces improvements at both the data input and loss-function levels to mitigate the high–low frequency imbalance in remote sensing imagery. First, a Multi-Scale Patch Sampling (MSPS) strategy is proposed to increase the probability of selecting patches containing high-frequency structures during training. This is achieved by constructing a multi-scale patch pool and applying weighted sampling to prioritize structurally complex regions. Second, a composite perceptual loss is designed to provide supervision beyond conventional denoising objectives. This loss integrates a High-Dimensional Perceptual Loss (HDPL) to enforce structural consistency in deep feature space and a High-Frequency-Aware Loss (HFAL) to constrain high-frequency components in the frequency domain. 
The combination of MSPS and the composite perceptual loss enables the DM to capture and reconstruct fine details more effectively, improving both objective quality metrics and visual realism.  Results and Discussions  Extensive experiments are conducted on three publicly available remote sensing datasets, AID, DOTA, and DIOR, and comparisons are performed against representative state-of-the-art super-resolution methods, including CNN-based approaches (EDSR and RCAN), Transformer-based approaches (HAT-L and TTST), GAN-based approaches (MSRGAN, ESRGAN, and SPSR), and diffusion-based approaches (SR3 and IRSDE). Quantitative evaluation using Fréchet Inception Distance (FID) on the AID dataset shows that DetDiffRS achieves the best performance in 21 of 30 scene categories, with an average FID of 48.37, exceeding the second-best method by 1.14. The improvements are most evident in texture-rich and structurally complex categories such as Dense Residential, Meadow, and River, where FID reductions exceed 3.0 relative to competing diffusion-based methods (Table 1). Although PSNR-oriented methods such as RCAN achieve the highest PSNR and SSIM values in some cases, they generate overly smooth reconstructions with limited fine detail. In contrast, DetDiffRS, supported by HDPL and HFAL, achieves a balanced improvement in objective metrics and perceptual quality, improving PSNR by 1.06 dB over SR3 on AID and SSIM by 0.0846 on DOTA (Table 2). Visual comparisons further indicate that DetDiffRS consistently produces sharper edges, clearer structures, and more realistic textures, reducing over-smoothing in PSNR-focused methods and artifacts commonly observed in GAN-based approaches (Fig. 5 and Fig. 6).  Conclusions  This study presents DetDiffRS, a detail-enhanced diffusion-based super-resolution framework tailored to the frequency distribution characteristics of remote sensing imagery. 
Through integration of the MSPS strategy and a composite perceptual loss that combines HDPL and HFAL, the proposed method addresses the underrepresentation of high-frequency regions during training and achieves substantial improvements in detail preservation and perceptual fidelity. Experimental results across multiple datasets and scene types demonstrate that DetDiffRS outperforms existing CNN-, Transformer-, GAN-, and diffusion-based methods in FID while maintaining a competitive balance between PSNR, SSIM, and visual realism. These results indicate that DetDiffRS provides a robust and generalizable solution for high-quality RSSR in applications requiring structural accuracy and fine detail reconstruction.
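The abstract does not give the exact weighting rule used by MSPS; the sketch below shows one plausible realization, assuming patches are scored by simple gradient energy and drawn with probability proportional to that score (function names and the scoring choice are hypothetical):

```python
import numpy as np

def patch_scores(img, patch=8):
    """Score each non-overlapping patch by its high-frequency energy,
    approximated here by the mean absolute finite difference."""
    H, W = img.shape
    scores = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            p = img[i:i+patch, j:j+patch]
            gx = np.abs(np.diff(p, axis=1)).mean()
            gy = np.abs(np.diff(p, axis=0)).mean()
            scores.append(gx + gy)
    return np.array(scores)

def weighted_patch_sampler(img, patch=8, n=4, rng=None):
    """Sample n patch indices with probability proportional to their
    high-frequency score, so structurally complex regions (buildings,
    ports) are selected more often than homogeneous ones (water, farmland)."""
    rng = rng or np.random.default_rng(0)
    s = patch_scores(img, patch)
    p = s / s.sum()
    return rng.choice(len(s), size=n, replace=False, p=p)
```

On an image whose right half is textured and left half flat, all probability mass falls on the textured patches, which is the imbalance correction MSPS aims for.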
Enhanced Super-Resolution-based Dual-Path Short-Term Dense Concatenate Metric Change Detection Network for Heterogeneous Remote Sensing Images
LI Xi, ZENG Huaien, WEI Pengcheng
Available online  , doi: 10.11999/JEIT250328
Abstract:
  Objective  In sudden-onset natural disasters such as landslides and floods, homologous pre-event and post-event remote sensing images are often unavailable in a timely manner, which restricts accurate assessment of disaster-induced changes and subsequent disaster relief planning. Optical heterogeneous remote sensing images differ in sensor type, imaging angle, imaging altitude, and acquisition time. These differences lead to challenges in cross time–space–spectrum change detection, particularly due to spatial resolution inconsistency, spectral discrepancies, and the complexity and diversity of change types for identical ground objects. To address these issues, an Enhanced Super-Resolution-Based Dual-Path Short-Term Dense Concatenate Metric Change Detection Network (ESR-DSMNet) is proposed to achieve accurate and efficient change detection in optical heterogeneous remote sensing images.  Methods  The ESR-DSMNet consists of an Enhanced Super-Resolution-Based Heterogeneous Remote Sensing Image Quality Optimization Network (ESRNet) and a Dual-Path Short-Term Dense Concatenate Metric Change Detection Network (DSMNet). ESRNet first establishes mapping relationships between remote sensing images with different spatial resolutions using an enhanced super-resolution network. Based on this mapping, low-resolution images are reconstructed to enhance high-frequency edge information and fine texture details, thereby unifying the spatial resolution of heterogeneous remote sensing images at the image level. DSMNet comprises a semantic branch, a spatial-detail branch, a dual-branch feature fusion module, and a metric module based on a batch-balanced contrast loss function. This architecture addresses spectral discrepancies at the feature level and enables accurate and efficient change detection in heterogeneous remote sensing images. 
Three loss functions are used to optimize the proposed network, which is evaluated and compared with twelve deep learning-based change detection benchmark methods on four datasets, including homologous and heterogeneous remote sensing image datasets.  Results and Discussions  Comparative analysis on the SYSU dataset (Table 2) shows that DSMNet outperforms the other twelve change detection methods, achieving the highest recall and F1 values of 82.98% and 79.69%, respectively. The method exhibits strong internal consistency for large-area objects and the best visual performance (Fig. 5). On the CLCD dataset (Table 2), DSMNet ranks first in accuracy among the twelve methods, with recall and F1 values of 73.98% and 71.01%, respectively, and demonstrates superior performance in detecting small-object changes (Fig. 5). On the heterogeneous remote sensing image dataset WXCD (Table 3), ESR-DSMNet achieves the highest F1 value of 95.87% compared with the other methods, with more consistent internal regions and finer building edges (Fig. 6). On the heterogeneous remote sensing image dataset SACD (Table 3), ESR-DSMNet attains the highest recall and F1 values of 92.63% and 90.55%, respectively, and produces refined edges in both dense and sparse building change detection scenarios (Fig. 6). Compared with low-resolution images, the reconstructed images present sharper edges without distortion, which improves change detection accuracy (Fig. 6). Comparisons of reconstructed image quality using different super-resolution methods (Table 4 and Fig. 7), ablation experiments on the DSMNet core modules (Table 5 and Fig. 8), and model efficiency evaluations (Table 6 and Fig. 9) further verify the effectiveness and generalization performance of the proposed method.  Conclusions  The ESR-DSMNet is proposed to address spatial resolution inconsistency, spectral discrepancies, and the complexity and diversity of change types in heterogeneous remote sensing image change detection. 
The ESRNet unifies spatial resolution at the image level, whereas the DSMNet mitigates spectral differences at the feature level and improves detection accuracy and efficiency. The proposed network is optimized using three loss functions and validated on two homologous and two heterogeneous remote sensing image datasets. Experimental results demonstrate that ESR-DSMNet achieves superior generalization performance and higher accuracy and efficiency than twelve advanced deep learning-based remote sensing image change detection methods. Additional experiments on reconstructed image quality, DSMNet module ablation, and model efficiency comparisons further confirm the effectiveness of the proposed approach.
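The batch-balanced contrast loss in DSMNet's metric module is not specified in the abstract; the sketch below shows one common balanced contrastive formulation, written as an assumption rather than the authors' exact loss:

```python
import numpy as np

def balanced_contrastive_loss(dist, label, margin=2.0):
    """Contrastive loss with batch balancing (illustrative assumption,
    not the paper's exact formulation).

    dist:  per-pixel feature distance between the two co-registered images
    label: 1 for changed pixels, 0 for unchanged
    Unchanged pixels are pulled together (small dist); changed pixels are
    pushed beyond the margin. Each class is weighted by its inverse count
    in the batch so the dominant unchanged class does not swamp the loss.
    """
    pos = label == 1
    neg = ~pos
    w_pos = 1.0 / max(pos.sum(), 1)
    w_neg = 1.0 / max(neg.sum(), 1)
    l_neg = (dist[neg] ** 2).sum() * w_neg
    l_pos = (np.clip(margin - dist[pos], 0, None) ** 2).sum() * w_pos
    return 0.5 * (l_neg + l_pos)
```

When unchanged pixels have zero distance and changed pixels exceed the margin, the loss vanishes, which is the separation the metric module trains toward.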
Research on Segmentation Algorithm of Oral and Maxillofacial Panoramic X-ray Images under Dual-domain Multiscale State Space Network
LI Bing, HU Weijie, LIU Xia
Available online  , doi: 10.11999/JEIT250639
Abstract:
  Objective  To address significant morphological variability, blurred boundaries between teeth and gingival tissues, and overlapping grayscale distributions in periodontal regions of oral and maxillofacial panoramic X-ray images, a state space model based on Mamba, a recently proposed neural network architecture, is adopted. The model preserves the advantage of Convolutional Neural Networks (CNNs) in local feature extraction while avoiding the high computational cost associated with Transformer-based methods. On this basis, a Dual-Domain Multiscale State Space Network (DMSS-Net)-based segmentation algorithm for oral and maxillofacial panoramic X-ray images is proposed, resulting in notable improvements in segmentation accuracy and computational efficiency.  Methods  An encoder–decoder architecture is adopted. The encoder consists of dual branches to capture global contextual information and local structural features, whereas the decoder progressively restores spatial resolution. Skip connections are used to transmit fused feature maps from the encoding path to the decoding path. During decoding, fused features gradually recover spatial resolution and reduce channel dimensionality through deconvolution combined with upsampling modules, finally producing a two-channel segmentation map.  Results and Discussions  Ablation experiments are conducted to validate the contribution of each module to overall performance, as shown in Table 1. The proposed model demonstrates clear performance gains. The Dice score increases by 5.69 percentage points to 93.86%, and the 95th percentile Hausdorff distance (HD95) decreases by 2.97 mm to 18.73 mm, with an overall accuracy of 94.57%. In terms of efficiency, the model size is 81.23 MB with 90.1 million parameters, which is substantially smaller than that of the baseline model, enabling simultaneous improvement in segmentation accuracy and reduction in parameter count. 
Comparative experiments with seven representative medical image segmentation models under identical conditions, as reported in Table 2, show that the DMSS-Net achieves superior segmentation accuracy while maintaining a model size comparable to, or smaller than, Transformer-based models of similar scale.  Conclusions  A DMSS-Net-based segmentation algorithm for oral and maxillofacial panoramic X-ray images is proposed. The algorithm is built on a dual-domain fusion framework that strengthens long-range dependency modeling in dental images and improves segmentation performance in regions with indistinct boundaries. The spatial-domain design effectively supports long-range contextual representation under dynamically varying dental arch morphology. Moreover, enhancement in the feature domain improves sensitivity to low-contrast structures and increases robustness against image interference.
Edge-Cloud Collaborative Searchable Attribute-Based Signcryption Approach for Internet of Vehicles
YU Huifang, WANG Qinggui, WANG Zihao
Available online  , doi: 10.11999/JEIT250750
Abstract:
  Objective   The dynamic and open environment of the Internet of Vehicles (IoV) poses substantial challenges to data security and real-time performance. Large-scale data interactions are vulnerable to eavesdropping, tampering, forgery, and replay attacks. Conventional cloud computing architectures exhibit inherent latency and cannot satisfy millisecond-level real-time requirements in IoV applications, which results in inefficient data transmission and an increased risk of traffic accidents. Therefore, balancing data security and real-time performance represents a critical bottleneck for large-scale IoV deployment.  Methods   An edge-cloud collaborative searchable attribute-based signcryption method is proposed for IoV applications. A multi-layer architecture is constructed, consisting of cloud servers, edge servers, and in-vehicle terminal devices. Access control is enforced through a hybrid key-policy and ciphertext-policy mechanism derived from attribute-based signcryption and a Linear Secret Sharing Scheme (LSSS). To reduce local decryption overhead, bilinear pairing operations are outsourced to edge nodes. SM9 is adopted for trapdoor generation and signature authentication. The proposed method provides data confidentiality, signature unforgeability, and trapdoor unforgeability.  Results and Discussions   The proposed method demonstrates superior performance in an IoV edge-cloud collaborative architecture for searchable attribute-based signcryption (Tables 1 and 4). Functional characteristics are summarized in Table 1. Fig. 2 illustrates the variation in total computation time as the number of attributes increases. Although the total time increases slightly, the growth rate remains low. By offloading computation-intensive tasks to edge nodes, the local computational burden on user terminals is substantially reduced. This optimization is quantified by an outsourcing efficiency exceeding 96% (Table 4, Fig. 5). 
Instantaneous retrieval is achieved by reducing the search complexity to O(1) through a hash-based index (Fig. 4). End-to-end search latency is maintained within an acceptable range for IoV applications (Table 6), which confirms suitability for real-time data access. As shown in Fig. 3, with an increasing number of attributes, the ciphertext size variation of the proposed method remains the smallest among the compared schemes.  Conclusions   The proposed method achieves fine-grained access control, data confidentiality, data integrity, and unforgeability, while maintaining advantages in computational and communication efficiency. Through a computation offloading mechanism, the method effectively addresses resource constraints of on-board devices in dynamic, resource-sensitive, and real-time IoV environments.
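The O(1) average search cost reported above follows from the hash-based index structure; a minimal sketch of such an index is shown below, with a keyed SHA-256 hash standing in for SM9 trapdoor generation (all identifiers are hypothetical, and this is not the paper's construction):

```python
import hashlib

class SearchableIndex:
    """Minimal sketch of a hash-based searchable index (illustrative only).

    The server stores H(trapdoor) -> list of ciphertext ids, so a keyword
    search is a single dictionary lookup: O(1) on average, independent of
    the number of indexed keywords.
    """

    def __init__(self):
        self._index = {}

    @staticmethod
    def trapdoor(keyword, key):
        # Stand-in for SM9-based trapdoor generation; only the key holder
        # can produce a valid trapdoor for a keyword.
        return hashlib.sha256(key + keyword.encode()).hexdigest()

    def add(self, keyword, ciphertext_id, key):
        self._index.setdefault(self.trapdoor(keyword, key), []).append(ciphertext_id)

    def search(self, trapdoor):
        # The cloud/edge server learns only the opaque trapdoor value.
        return self._index.get(trapdoor, [])
```

The server never sees the keyword itself, only the trapdoor digest, which is the searchable-encryption property the scheme relies on.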
Modeling and Dynamic Analysis of Controllable Multi-double Scroll Memristor Hopfield Neural Network
LIU Song, LI Zihan, QIU Da, LUO Min, LAI Qiang
Available online  , doi: 10.11999/JEIT250972
Abstract:
  Objective  The human brain is a complex neural system capable of integrated information storage, computation, and parallel processing. The collective activity of neuronal populations processes and coordinates sensory inputs, producing highly nonlinear dynamics. Developing artificial neural network models and analyzing them with nonlinear dynamics theory is therefore of considerable scientific and practical interest. As a brain-inspired model, the Hopfield Neural Network (HNN) exhibits more diverse dynamics when a Memristor Hopfield Neural Network (MHNN) is formed by introducing a memristor into its structure. Among such systems, networks that generate Multi-Double Scroll (MDS) attractors are advantageous because their richer dynamical behavior and more complex topological structure offer strong potential for applications such as image encryption.  Methods  A memristor model based on an arctangent-function series is proposed and introduced into a fully connected HNN. This forms an MHNN that incorporates electromagnetic radiation effects and memristive synaptic weights. The mechanism responsible for generating MDS chaotic attractors is examined through equilibrium-point analysis. Dynamical characteristics, including the effects of memristive synaptic coupling strength and initial offset boosting, are evaluated using bifurcation diagrams, Lyapunov-exponent spectra, and attraction basins. The system is then implemented on an FPGA platform.  Results and Discussions  The MHNN generates an arbitrary number of multi-directional MDS chaotic attractors (Figs. 4, 5, 6). Adjusting the memristive synaptic coupling strength yields distinct coexisting attractor types (Figs. 7, 8). Multiple coexisting MDS chaotic attractors also emerge from modifications of the initial values (Figs. 9, 10, 11, 12). Hardware implementation on an FPGA (Figs. 13, 14) confirms the correctness and feasibility of the system.  Conclusions  The proposed MHNN generates unidirectional, bidirectional, and tridirectional MDS chaotic attractors in phase space. The number of scrolls is tuned by the memristor control parameter. The system also shows initial offset boosting, and the number of coexisting attractors is regulated by this parameter. Higher-dimensional networks can be constructed by increasing the number of memristive synapses, demonstrating the broad generality of the model. Owing to its complex topology and rich dynamics, the network offers promising potential for engineering applications.
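The abstract does not state the arctangent-series memristor explicitly; a standard multi-scroll-generating nonlinearity of this family, shown here purely as an assumption, is a sum of shifted arctangent terms whose count N controls how many plateau levels the characteristic has, which serves as an illustrative proxy for the tunable number of equilibria and hence scrolls:

```python
import numpy as np

def arctan_series(x, N, a=1.0, b=50.0):
    """Sum of shifted arctangent terms with steps at x = 2k, k = -N..N.

    Illustrative stand-in for an arctangent-series memristor characteristic:
    increasing N adds steps, producing a staircase-like curve."""
    return sum(a * np.arctan(b * (x - 2 * k)) for k in range(-N, N + 1))

def count_plateaus(N, a=1.0, b=50.0):
    """Count the flat plateau segments of the staircase curve by sampling
    its slope; 2N+1 steps separate 2N+2 plateaus."""
    x = np.linspace(-2 * N - 1, 2 * N + 1, 20001)
    y = arctan_series(x, N, a, b)
    flat = np.abs(np.diff(y)) < 1e-3
    runs, prev = 0, False
    for f in flat:
        if f and not prev:
            runs += 1
        prev = f
    return runs
```

Raising N from 1 to 2 adds two plateaus (4 to 6), mirroring how the control parameter of the reported memristor tunes the scroll count; the actual HNN coupling equations are not reproduced here.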
A Fault Diagnosis Method for Flight Control Systems Combining Pose-Invariant Features and a Semi-Supervised RDC-GAN Model
ZHANG Jingsen, HOU Biao, LI Zhijie, BI Wenping, WU Zitong
Available online  , doi: 10.11999/JEIT250964
Abstract:
  Objective   In recent years, China has actively promoted the development of the low-altitude economy, leading to the increasingly widespread application of drones across multiple industries. As highly complex aerial systems, Unmanned Aerial Vehicles (UAVs) are susceptible to various failures during operation. The flight control system, which serves as the core of UAV flight operations, may develop faults that are less evident than physical damage to components such as motors or propellers. However, such faults can directly cause flight instability or complete loss of control. Fault diagnosis of UAV flight control systems faces two major challenges. First, as an emerging aerial platform, UAVs have far fewer effectively accumulated training samples than traditional diagnostic targets such as bearings, resulting in data scarcity. Second, owing to strong maneuverability, UAVs exhibit substantial variations in data distribution under different flight attitudes, which limits the diagnostic accuracy of most existing models under rapidly changing operating conditions. Therefore, the development of an effective fault diagnosis method for UAV flight control systems is of both academic interest and practical engineering value.  Methods   A fault diagnosis method for flight control systems based on pose-invariant features and a semi-supervised Reloaded Dense Generative Adversarial Classification Network (RDC-GAN) is proposed. The overall framework is illustrated in Fig. 1. Flight logs collected from the UAV are used as raw diagnostic data. After data cleaning, a differential flatness-based data selection method is applied to separate the flight data into pose-dependent data and pose-independent data. For pose-dependent data, Empirical Mode Decomposition-Squeeze Excitation Network (EMD-SENet) is adopted to extract pose-invariant features, as shown in Fig. 3. 
An adaptive feature fusion module is then used to perform weighted fusion of pose-independent data, pose-invariant features, and pose-dependent data, as illustrated in Fig. 4. The fused features are subsequently input into a semi-supervised RDC-GAN diagnostic model, whose architecture is presented in Fig. 4. Model training is conducted in two stages. In the first stage, unsupervised training is performed to initialize the network parameters using a large set of unlabeled samples. In the second stage, supervised training is carried out with a small number of labeled samples, enabling accurate fault diagnosis under limited labeling conditions.  Results and Discussions   The proposed method is first validated on the publicly available RflyMad dataset, which contains magnetometer fault, accelerometer fault, gyroscope fault, Global Navigation Satellite System (GNSS) fault, and no-fault data under five flight attitude modes. Fig. 5 and Fig. 6 illustrate the pose-invariant features extracted by EMD-SENet and the synthetic samples generated by the RDC-GAN generator, respectively. Diagnostic performance is evaluated using Overall Accuracy (OA), Average Accuracy (AA), and the Kappa coefficient, in addition to class-wise accuracy for each fault category. The results on the RflyMad dataset are summarized in Table 3. The proposed method achieves 95.71% OA, 95.32% AA, and a Kappa coefficient of 95.41%, exceeding the second-best comparative method by 2.17%, 2.42%, and 2.40%, respectively. For real-flight experiments, a fault injection approach based on a redundant positioning system is designed. A motion capture system and an Ultra-WideBand (UWB) four-base-station positioning system are employed to ensure experimental reliability and operational safety. The experimental setup is shown in Fig. 12. Online real-flight diagnostic results are presented in Fig. 13, with an OA of 92.78%. Fault diagnosis time is reported in Table 5, and false alarm statistics are provided in Table 6.  
Conclusions   A fault diagnosis method for flight control systems that integrates pose-invariant features with a semi-supervised RDC-GAN model is presented to address data scarcity and flight attitude-induced distribution variation in UAV diagnostics. Differential flatness-based data selection is used to distinguish pose-dependent data from pose-independent data, and pose-invariant features are extracted using EMD-SENet. An adaptive feature fusion strategy is applied to balance heterogeneous features, and phased semi-supervised training of the RDC-GAN model enables high diagnostic accuracy with a limited number of labeled samples. Experimental validation on the RflyMad dataset and real UAV flight scenarios confirms the effectiveness of the proposed method.
A Multi-step Channel Prediction Method Based on Pseudo-3D Convolutional Neural Network with Attention Mechanism
TAO Jing, HOU Meng, PENG Wei, ZHANG Guoyan, DAI Jiaming, LIU Weiming, WANG Haidong, WANG Zhen
Available online  , doi: 10.11999/JEIT251090
Abstract:
  Objective  With the rapid growth in connections and data traffic in Fifth Generation (5G) mobile networks, massive Multiple-Input Multiple-Output (MIMO) has become a key technology for improving network performance. The spectral efficiency and energy efficiency of massive MIMO transmission depend on accurate Channel State Information (CSI). However, the non-stationary characteristics of wireless channels, terminal processing delay, and the use of ultra-high-frequency bands intensify CSI aging, which necessitates channel prediction. Most mainstream prediction schemes are designed for generalized stationary channels and rely on single-step prediction. In non-stationary environments, CSI obtained through single-step prediction is likely to become outdated, and frequent single-step prediction greatly increases pilot overhead. To address these challenges, a multi-step channel prediction method based on a Pseudo-Three-Dimensional Convolutional Neural Network (P3D-CNN) and an attention mechanism is proposed. The method learns the joint time-frequency characteristics of CSI, leverages high frequency-domain correlation to mitigate the effect of lower time-domain correlation in multi-step prediction, and improves prediction performance.  Methods  In this study, the uplink model of a massive MIMO system is constructed (Fig. 1). CSI is obtained through channel estimation, using an Inverse Fast Fourier Transform (IFFT) at the transmitter and a Fast Fourier Transform (FFT) at the receiver. Actual channel measurements provide a CSI dataset with time–frequency dimensions, and autocorrelation analyses are performed in both domains. A multi-step channel prediction network, termed P3D-CNN with the Convolutional Block Attention Module (CBAM) (Fig. 10), is designed. 
The P3D-CNN structure replaces the traditional Three-Dimensional Convolutional Neural Network (3D-CNN) by decomposing the three-dimensional convolution into a two-dimensional convolution in the frequency domain and a one-dimensional convolution in the time domain, which greatly reduces computational complexity. The CBAM-based hybrid attention mechanism is incorporated to extract global information in the frequency and channel domains, further improving channel prediction accuracy.  Results and Discussions  Based on the measured CSI dataset, the prediction method using an AutoRegressive (AR) model, the prediction method using Fully Connected Long Short-Term Memory (FC-LSTM), and the prediction method using P3D-CNN-CBAM are compared under different prediction steps. Simulation results show that the average Normalized Mean Square Error (NMSE) of the proposed P3D-CNN-CBAM method is lower than that of the other two methods (Fig. 15). As the prediction step increases from 1 to 10, prediction error rises sharply because the AR model and FC-LSTM rely solely on time-domain correlation. When the prediction step is 10, the average NMSE of these two methods reaches 0.5868 and 0.7648, respectively. The P3D-CNN-CBAM method yields an average NMSE of only 0.3078, maintaining strong prediction performance. The improvement brought by integrating CBAM into the P3D-CNN network is also verified (Fig. 17). Finally, through transfer learning, the proposed method is extended from single-day datasets to multi-day scenarios.  Conclusions  Based on the measured CSI dataset, a multi-step prediction method addressing CSI aging in massive MIMO systems is proposed. The method applies P3D-CNN with CBAM to improve multi-step prediction accuracy. By replacing full three-dimensional convolution with pseudo-three-dimensional convolution, time-frequency CSI information is effectively extracted, and the CBAM mechanism enhances the learning of global features. 
Experimental results show that: (1) the proposed method achieves clear performance advantages over AR- and FC-LSTM-based approaches; and (2) through transfer learning, multi-step prediction is extended from single-antenna to multi-antenna scenarios.
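As a rough illustration of why the pseudo-3D factorization is cheaper than a full 3D convolution, the sketch below compares per-layer weight counts for the two designs. The kernel sizes and channel widths are arbitrary assumptions for illustration, not the configuration used in the paper.

```python
# Illustrative parameter-count comparison: full 3D convolution vs. a
# pseudo-3D factorization (2D convolution over the frequency plane,
# then 1D convolution over time). Channel widths and kernel sizes
# below are placeholder assumptions.

def conv3d_params(c_in, c_out, k_t, k_f, k_s):
    """Weights of one full 3D conv layer (bias ignored)."""
    return c_in * c_out * k_t * k_f * k_s

def pseudo3d_params(c_in, c_mid, c_out, k_t, k_f, k_s):
    """2D conv over the frequency plane, then 1D conv over time."""
    spatial = c_in * c_mid * 1 * k_f * k_s   # time kernel collapsed to 1
    temporal = c_mid * c_out * k_t * 1 * 1   # frequency kernels collapsed to 1
    return spatial + temporal

full = conv3d_params(64, 64, 3, 3, 3)        # 110592 weights
p3d = pseudo3d_params(64, 64, 64, 3, 3, 3)   # 36864 + 12288 = 49152 weights
print(full, p3d, round(full / p3d, 2))       # 110592 49152 2.25
```

The same factorization applied across every layer is what yields the large complexity reduction the abstract describes.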
Bionic Behavior Modeling Method for Unmanned Aerial Vehicle Swarms Empowered by Deep Reinforcement Learning
HE Ming, WU Jingjing, HAN Wei, LIU Sicong, PAN Pan, XIA Hengyu
Available online  , doi: 10.11999/JEIT251103
Abstract:
  Significance   Unmanned Aerial Vehicle (UAV) swarm technology is a core driver of low-altitude economic development and intelligent unmanned system evolution, yielding cooperative effects greater than the sum of individual UAV capabilities in disaster response, environmental monitoring, and logistics distribution. As mission scenarios shift toward dynamic heterogeneity, strong interference, and large-scale deployment, traditional centralized control architectures, although theoretically feasible, have proven difficult to implement in practice, and this gap remains a major constraint on engineering application. Bionic Swarm Intelligence (BSI), a distributed intelligent paradigm that simulates the self-organization, elastic reconfiguration, and cooperative behavior of biological swarms, offers a path to overcoming these limitations. The integration of Deep Reinforcement Learning (DRL) enables a transition from static behavior simulation to adaptive autonomous learning and decision-making. The combined BSI-DRL framework allows UAV swarms to optimize cooperative strategies through data-driven interaction, addressing the limited adaptability of manually designed bionic rules. Clarifying the progress and challenges of UAV swarm modeling based on BSI-DRL is essential for supporting engineering transformation and improving practical system performance.   Progress   The progress of BSI-DRL-driven UAV swarm behavior modeling is summarized from four aspects. (1) BSI’s concept and core characteristics: BSI, a biology-oriented subset of Swarm Intelligence (SI), is defined by four characteristics: distributed control without dependence on a central command, self-organization through spontaneous disorder-to-order transition, robustness through functional maintenance under disturbances, and adaptability through dynamic strategy optimization in complex environments.
(2) Three-stage paradigm transition of BSI: (a) Before 2010 (rule transplantation stage): work centered on applying fixed bionic algorithms such as particle swarm optimization and biological models (e.g., Boids, Vicsek) to UAV path planning, with SI dependent on preset rules (Fig. 2). (b) From 2010 to 2020 (systematic decentralized control stage): studies shifted toward systematic design and decentralized control theory, enabling a transition from simulation to physical verification but showing limited adaptability under dynamic conditions (Fig. 2). (c) Since 2020 (AI-enhanced autonomous learning stage): integration of DRL enabled a transition to autonomous learning and decision-making, allowing UAV swarms to develop advanced cooperative strategies when facing unknown environments (Fig. 2). (3) Typical biological swarm mechanisms and bionic mapping: Four representative biological mechanisms provide bionic prototypes. (a) Pigeon flock hierarchy, characterized by a three-tier coupled structure, supports formation control and cooperation under interference. (b) Wolf pack hunting, structured as four-stage dynamic collaboration, enables efficient task division. (c) Fish school self-repair through decentralized topology adjustment enhances swarm robustness. (d) Honeybee colony division of labor, based on decentralized decision-making and dynamic role assignment, improves task efficiency. Bionic mapping proceeds through three steps: decomposition of the biological prototype and extraction of behavioral features using dynamic mode decomposition, social interaction filtering, and group state classification (Fig. 5); abstraction of behavior rules and mathematical modeling using approaches such as differential equations and graph theory; and algorithmic adaptation and intelligent enhancement by converting mathematical models into executable rules and integrating DRL. (4) Core BSI-DRL modeling directions: Three main technical paths are summarized with horizontal comparison (Table 1).
(a) Bionic-rule parameterization with DRL optimization (shallow fusion): DRL is used to optimize key parameters of bionic models, such as attraction-repulsion weights in Boids, preserving biological robustness but exhibiting instability during large-swarm training. (b) Generative bionic-rule multi-agent reinforcement learning (middle fusion): bio-inspired reward functions guide the autonomous emergence of cooperative rules, improving adaptability but reducing interpretability due to “black-box” characteristics. (c) Dynamic role assignment with hierarchical DRL (deep fusion): a three-tier architecture comprising global planning, group role assignment, and individual execution reduces decision-making complexity in heterogeneous swarms and strengthens multi-task adaptability, although multi-level coordination remains challenging. A scenario-adaptation logic based on swarm scale, environmental dynamics, and task heterogeneity, together with a multi-method fusion strategy, is also proposed.  Conclusions   This study clarifies the theoretical framework and research progress of BSI-DRL-based UAV swarm behavior modeling. BSI addresses limitations of traditional centralized control, including scalability, dynamic adaptability, and system credibility, by simulating biological swarm mechanisms. DRL further enables a shift toward autonomous learning. Horizontal comparison indicates complementary strengths across the three core directions: parameterization optimization maintains basic robustness, generative methods enhance dynamic adaptability, and hierarchical collaboration improves performance in heterogeneous multi-task settings. The proposed scenario-adaptation logic, which applies parameterization to small-to-medium and static scenarios, generative methods to medium-to-large and dynamic scenarios, and hierarchical collaboration to heterogeneous multi-task missions, together with the multi-method fusion strategy, offers feasible engineering pathways.
Key engineering bottlenecks are also identified, including inconsistent environmental perception, unbalanced multi-objective decision-making, and limited system interpretability, providing a basis for targeted technical advancement.  Prospects  Future work focuses on five directions to enhance the capacity of BSI-DRL for complex UAV swarm tasks. (1) Cross-species biological mechanism integration: combining advantages of different biological prototypes to construct adaptive hybrid systems. (2) BSI-DRL closed-loop collaborative evolution: establishing a bidirectional interaction framework in which BSI provides initial strategies and safety boundaries, while DRL refines bionic rules online. (3) Bird-swarm-like phase-transition control and DRL fusion: using phase-transition order parameters as DRL observation indicators to improve parameter interpretability. (4) Digital-twin and hardware-in-the-loop training and verification: building high-fidelity digital-twin environments to narrow simulation–reality gaps. (5) Real-scenario performance evaluation and field deployment: conducting field tests to assess algorithm effectiveness and guide theoretical refinement.
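The "shallow fusion" direction above, in which DRL tunes the weights of a fixed bionic rule set, can be pictured with a minimal Boids-style velocity update whose cohesion, separation, and alignment weights are the tunable parameters. All constants here are placeholder assumptions, not values from any study surveyed.

```python
import numpy as np

# Minimal Boids-style step: the three behaviour weights (w_coh, w_sep,
# w_ali) are left as free parameters that an outer DRL loop would tune.
# Neighbourhood radius, time step, and swarm size are illustrative.

def boids_step(pos, vel, w_coh, w_sep, w_ali, r=2.0, dt=0.1):
    n = len(pos)
    new_vel = vel.copy()
    for i in range(n):
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbr = (d < r) & (d > 0)                       # neighbours within radius r
        if not nbr.any():
            continue
        coh = pos[nbr].mean(axis=0) - pos[i]          # steer toward centroid
        sep = (pos[i] - pos[nbr]).sum(axis=0)         # steer away from neighbours
        ali = vel[nbr].mean(axis=0) - vel[i]          # match neighbour velocity
        new_vel[i] += dt * (w_coh * coh + w_sep * sep + w_ali * ali)
    return pos + dt * new_vel, new_vel

rng = np.random.default_rng(0)
pos = rng.uniform(0, 4, size=(10, 2))
vel = rng.normal(0, 0.1, size=(10, 2))
pos, vel = boids_step(pos, vel, w_coh=0.5, w_sep=0.3, w_ali=0.4)
print(pos.shape, vel.shape)   # (10, 2) (10, 2)
```

In the shallow-fusion setting, a DRL agent would observe swarm state, output `(w_coh, w_sep, w_ali)`, and receive a task reward, leaving the rule structure itself fixed.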
Research on Load Modulation Enhancement of Quasi-Ideal Doherty Power Amplifier with Equivalent Transconductance Compensation
HUA Jun, XU Gaoming, CHEN Jinghao, LU Siyang, YOU Leiyuan, LÜ Yan, LI Gang, SHI Weimin, LIU Taijun
Available online  , doi: 10.11999/JEIT250789
Abstract:
  Objective  Modern wireless communication systems require RF power amplifiers that remain efficient over a wide dynamic range. The Doherty Power Amplifier (DPA), which uses dynamic load modulation between the main and auxiliary paths, achieves high efficiency at power backoff. It is widely applied in multi-carrier 4G and 5G macro base stations. Research on DPAs generally focuses on improving backoff efficiency, backoff range, and bandwidth. However, the architecture has a structural limitation because the auxiliary amplifier, biased in Class C, exhibits weak current output compared with the main amplifier biased in Class AB. The low conduction level and short turn-on period of the auxiliary path create nonlinear imbalance and reduce overall performance.  Methods  The study addresses insufficient load modulation caused by the weak current output capability of the auxiliary amplifier. An equivalent transconductance compensation theory is proposed. It compensates the current of the auxiliary amplifier under Class C bias by injecting a compensatory current into the branch. A load-modulation-enhanced quasi-ideal high-performance DPA is developed to resolve the inherent current deficiency in the auxiliary path of traditional configurations.  Results and Discussions  A load-modulation-enhanced DPA was designed and fabricated using the GaN HEMT device CG2H40010F for the 1.3\begin{document}$ \sim $\end{document}1.8 GHz band. Measurements show that the saturated output power ranges from 43.7 to 44.5 dBm and that the Drain Efficiency (DE) exceeds 69.1%. At a 6 dB backoff, the DE remains between 62.9% and 69.4% and the gain ranges from 9.7 to 10.5 dB. At a 9 dB backoff, the DE ranges from 49.5% to 57% and the gain ranges from 10.3 to 11.5 dB. The equivalent transconductance compensation theory resolves the load modulation bottleneck of traditional DPA structures through the current-injection mechanism.
It provides meaningful guidance for broadband RF power-amplifier design with high backoff efficiency.  Conclusions  The study proposes an equivalent transconductance compensation method by adding a third compensation branch to the traditional DPA structure. This mechanism corrects the weak auxiliary-amplifier current caused by Class C bias and its short turn-on period, thereby achieving a quasi-ideal load-modulation-enhanced DPA. A device operating from 1.3 to 1.8 GHz was designed to validate the method. The measured saturated DE exceeds 69.1%. The DE ranges from 62.9% to 69.4% at a 6 dB backoff and from 49.5% to 57% at a 9 dB backoff. The linearized Adjacent Channel Leakage Ratio (ACLR) is lower than –49 dBc. These results verify the feasibility of the method and show strong application potential.
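The load-modulation deficit that the compensation targets can be seen in the textbook ideal Doherty model: with active load-pull at the combining node and a quarter-wave inverter, the load seen by the main device depends directly on the auxiliary current. The sketch below uses this generic model, not the paper's specific circuit; an auxiliary current that never reaches the main current at saturation leaves the main load under-modulated.

```python
# Ideal (textbook) Doherty load modulation sketch. Active load-pull at
# the combining node gives Z_node = R_L * (1 + I_aux / I_main); the
# quarter-wave inverter transforms it to Z0**2 / Z_node as seen by the
# main device. Values are illustrative, not the fabricated amplifier's.

def doherty_main_load(i_main, i_aux, r_load, z0):
    z_node = r_load * (1 + i_aux / i_main)
    return z0 ** 2 / z_node

R_OPT = 50.0
for frac in (0.0, 0.5, 1.0):   # auxiliary current as a fraction of main current
    z = doherty_main_load(1.0, frac, R_OPT / 2, R_OPT)
    print(frac, z)
# frac 0.0 -> 100.0 ohm (2*R_opt at backoff); frac 1.0 -> 50.0 ohm (R_opt)
```

A weak Class C auxiliary path (effective `frac` well below 1 at saturation) never drives the main load down to `R_opt`, which is the under-modulation the injected compensatory current is meant to correct.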
Neighboring Mutual-Coupling Channel Model and Tunable-Impedance Optimization Method for Reconfigurable-Intelligent-Surface Aided Communications
WU Wei, WANG Wennai
Available online  , doi: 10.11999/JEIT251109
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RIS) attract increasing attention due to their ability to controllably manipulate electromagnetic wave propagation. A typical RIS consists of a dense array of Reflecting Elements (REs) with inter-element spacing no greater than half a wavelength, a spacing at which electromagnetic mutual coupling inevitably occurs between adjacent REs. This effect becomes more pronounced when the element spacing is smaller than half a wavelength and can significantly affect the performance and efficiency of RIS-assisted systems. Accurate modeling of mutual coupling is therefore essential for RIS optimization. However, existing mutual-coupling-aware channel models usually suffer from high computational complexity because of the large dimensionality of the mutual-impedance matrix, which restricts their practical use. To address this limitation, a simplified mutual-coupling-aware channel model based on a sparse neighboring mutual-coupling matrix is proposed, together with an efficient optimization method for configuring RIS tunable impedances.  Methods  First, a simplified mutual-coupling-aware channel model is established through two main steps. (1) A neighboring mutual-coupling matrix is constructed by exploiting the exponential decay of mutual impedance with inter-element distance. (2) A closed-form approximation of the mutual impedance between the transmitter or receiver and the REs is derived under far-field conditions. By taking advantage of the rapid attenuation of mutual impedance as spacing increases, only eight or three mutual-coupling parameters, together with one self-impedance parameter, are retained. These parameters are arranged into a neighboring mutual-coupling matrix using predefined support matrices.
To further reduce computational burden, the distance term in the mutual-impedance expression is approximated by a central value under far-field assumptions, which allows the original integral formulation to be simplified into a compact analytical expression. Based on the resulting channel model, an efficient optimization method for RIS tunable impedances is developed. Through impedance decomposition, a closed-form expression for the optimal tunable-impedance matrix is derived, enabling low-complexity RIS configuration with computational cost independent of the number of REs.  Results and Discussions  The accuracy and computational efficiency of the proposed simplified models, as well as the effectiveness of the proposed impedance optimization method, are validated through numerical simulations. First, the two simplified models are evaluated against a reference model. The first simplified model accounts for mutual coupling among elements separated by at most one intermediate unit, whereas the second model considers only immediately adjacent elements. Results indicate that channel gain increases as element spacing decreases, with faster growth observed at smaller spacings (Fig. 4). The modeling error between the simplified models and the reference model remains below 0.1 when the spacing does not exceed λ/4, but increases noticeably at larger spacings. Error curves further show that the modeling errors of both simplified models become negligible when the spacing is below λ/4, indicating that the second model can be adopted to further reduce complexity (Fig. 6). Second, the computational complexity of the proposed models is compared with that of the reference model. When the number of REs exceeds four, the complexity of computing the mutual-coupling matrix in the reference model exceeds that of the proposed neighboring mutual-coupling model. 
As the number of REs increases, the complexity of the reference model grows rapidly, whereas that of the proposed model remains constant (Fig. 5). Finally, the proposed impedance optimization method is compared with two benchmark methods (Fig. 7, Fig. 8). When the element spacing is no greater than λ/4, the channel gain achieved by the proposed method approaches that of the benchmark method. As the spacing increases beyond this range, a clear performance gap emerges. In all cases, the proposed method yields higher channel gain than the coherent phase-shift optimization method.  Conclusions  The integration of a large number of densely arranged REs in an RIS introduces notable mutual coupling effects, which can substantially influence system performance and therefore must be considered in channel modeling and impedance optimization. A simplified mutual-coupling-aware channel model based on a neighboring mutual-coupling matrix has been proposed, together with an efficient tunable-impedance optimization method. By combining the neighboring mutual-coupling matrix with a simplified mutual-impedance expression derived under far-field assumptions, a low-complexity channel model is obtained. Based on this model, a closed-form solution for the optimal RIS tunable impedances is derived using impedance decomposition. Simulation results confirm that the proposed channel model and optimization method maintain satisfactory accuracy and effectiveness when the element spacing does not exceed λ/4. The proposed framework provides practical theoretical support and useful design guidance for analyzing and optimizing RIS-assisted systems under mutual coupling effects.
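A hypothetical construction of such a neighboring mutual-coupling matrix is sketched below: one self-impedance plus one coupling value per retained neighbour offset, and zeros elsewhere. The specific retained offsets, parameter values, and support matrices in the paper differ; everything here is a placeholder to show the sparsity pattern.

```python
import numpy as np

# Hypothetical neighboring mutual-coupling matrix for an n x n RIS grid:
# z_self on the diagonal, one coupling value per distinct (squared) grid
# distance within a cutoff, zeros elsewhere. All impedance values are
# placeholders, not the paper's parameters.

def neighbor_coupling_matrix(n, z_self, coupling, max_d):
    """coupling: dict mapping squared grid distance -> mutual impedance."""
    N = n * n
    Z = np.zeros((N, N), dtype=complex)
    for a in range(N):
        ra, ca = divmod(a, n)
        for b in range(N):
            rb, cb = divmod(b, n)
            d2 = (ra - rb) ** 2 + (ca - cb) ** 2
            if a == b:
                Z[a, b] = z_self
            elif d2 in coupling and d2 <= max_d ** 2:
                Z[a, b] = coupling[d2]
    return Z

# Immediately adjacent elements only: side (d^2 = 1) and diagonal (d^2 = 2).
Z = neighbor_coupling_matrix(4, 50 + 10j, {1: 5 - 2j, 2: 1 - 1j}, max_d=1.5)
print(Z.shape, np.count_nonzero(Z))   # (16, 16) 100
```

The matrix stays sparse and symmetric, and its per-element fill does not grow with the array size, which is the source of the constant-complexity behaviour reported for the proposed model.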
A Review of Joint EEG-fMRI Methods for Visual Evoked Response Studies
WEI Zhiwei, XIAO Xiaolin, XU Minpeng, MING Dong
Available online  , doi: 10.11999/JEIT250781
Abstract:
  Significance   The study of visual evoked responses (VERs) using non-invasive neuroimaging techniques is a cornerstone of neuroscience, providing critical insights into the mechanisms of human visual information processing. Among the available modalities, electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) are paramount. EEG captures neural electrical activity with millisecond-level temporal resolution but is fundamentally limited by its poor spatial localization capabilities. Conversely, fMRI provides millimeter-level spatial precision by measuring the blood-oxygen-level-dependent (BOLD) signal, yet its temporal resolution is inherently constrained by the sluggish nature of hemodynamic responses. This intrinsic trade-off between temporal and spatial resolution significantly hampers the ability of any single modality to fully elucidate complex visual processes such as attentional modulation, motion perception, and multi-sensory integration. To overcome this bottleneck, the joint application of EEG and fMRI has emerged as a powerful multimodal approach. By synchronously acquiring both datasets, this integrated technique synergistically combines the distinct strengths of each modality, offering a comprehensive spatiotemporal perspective on the complex dynamics of visual neural networks. Despite its growing adoption, existing literature often lacks a focused, systematic review that specifically details the core methodologies, illustrates key applications, and outlines the persistent challenges and future trends of joint EEG-fMRI in VER research. This review aims to fill this gap by providing a comprehensive and structured overview of the field, serving as a foundational reference for researchers seeking to leverage this advanced technique to explore the visual system.  Progress   This review first elaborates on the foundational technologies that enable joint EEG-fMRI studies, starting with the synchronous acquisition of data. 
This is addressed through MR-compatible EEG systems and dedicated synchronization hardware. The core of the review then systematically analyzes data fusion methodologies, which are categorized into asymmetric and symmetric approaches. Asymmetric fusion uses one modality to constrain the analysis of the other, exemplified by EEG-informed fMRI analysis, which uses single-trial EEG features to model fMRI data, and fMRI-informed EEG source imaging, which uses fMRI activation maps as spatial priors to enhance source localization accuracy. In contrast, symmetric fusion treats both modalities equally, with data-driven techniques like joint independent component analysis (joint ICA) being widely adopted to reveal shared underlying neural sources without strong biophysical assumptions. The application of these methodologies has yielded significant breakthroughs across multiple domains. In visual mechanism analysis, the technique has been instrumental in dissecting the complex feedforward and feedback dynamics of cortical areas involved in vision. In clinical diagnosis and evaluation, joint EEG-fMRI provides objective neurophysiological biomarkers for visual disorders like amblyopia and epilepsy by identifying distinct patterns of cortical activation deficits and network dysfunctions. In the field of brain-computer interfaces (BCIs), the fusion of multimodal features has significantly improved the accuracy and robustness of decoding visual intentions.  Conclusions  This review critically examines the joint EEG-fMRI landscape for VER studies, systematically classifying the key data acquisition and fusion methodologies and highlighting their representative applications. The analysis reveals that the choice of an optimal fusion strategy—be it asymmetric or symmetric, data-driven or model-driven—is highly dependent on the specific research question, available data quality, and underlying assumptions. 
While the technique has proven useful in advancing basic neuroscience, clinical diagnostics, and BCI development, its broader adoption is still hindered by persistent challenges. At the system level, hardware-induced artifacts, particularly the severe electromagnetic interference in ultra-high-field MRI environments, remain a major technical obstacle that compromises data quality. At the algorithmic level, the inherent mismatch in spatiotemporal scales between the fast, transient EEG signals and the slow, delayed BOLD response continues to pose a core fusion challenge. This is further complicated by high inter-subject variability in neural responses, which limits the generalizability of analytical models and decoding algorithms across individuals. These limitations underscore the need for continued innovation in both hardware engineering and computational methods to unlock the full potential of this powerful multimodal technique.  Prospects   Looking ahead, the research landscape for joint EEG-fMRI methods in VER studies is poised for significant evolution, though this will be a long-term and complex process. With the integration of emerging technologies such as artificial intelligence, the methodological frameworks in this domain will evolve toward greater intelligence and automation. System-level trends point toward the development of next-generation hardware, including ultra-high-field MRI systems combined with artifact-immune EEG sensors and real-time artifact correction algorithms. Furthermore, the establishment of open-access, multi-center EEG-fMRI databases (following standards like BIDS) and standardized analysis pipelines will be crucial for improving the reproducibility and comparability of research findings, fostering a collaborative ecosystem. Algorithm-level trends are increasingly centered on the integration of artificial intelligence and deep learning.
End-to-end neural network architectures, such as those incorporating spatiotemporal attention mechanisms, hold the promise of learning the complex, non-linear transformations between EEG and fMRI data directly, thus overcoming the limitations of traditional linear models. Moreover, leveraging transfer learning and personalized modeling frameworks can address the challenge of inter-subject variability, leading to the development of adaptive and robust models for visual decoding and clinical applications. Concurrently, as clinical and BCI applications accelerate, the critical challenge of balancing model complexity with interpretive clarity and computational efficiency warrants in-depth investigation. Ultimately, these synergistic advancements in hardware and algorithms will deepen our understanding of the visual system’s computational principles, refine the diagnosis and treatment of visual disorders, and propel the development of more intuitive and powerful brain-computer interfaces.
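The EEG-informed fMRI approach described in this review can be sketched in a few lines: a single-trial EEG feature series is convolved with a canonical double-gamma HRF and resampled at the scanner TR to form a GLM regressor. The parameters below (HRF shape, sampling rates, TR) are standard textbook choices, not the settings of any particular study reviewed.

```python
import numpy as np
from math import gamma

# Toy EEG-informed fMRI regressor: convolve an EEG feature trace with a
# canonical double-gamma HRF, then sample at the scanner TR. HRF shape
# parameters (6, 16, 1/6) are the common textbook defaults.

def double_gamma_hrf(t, a1=6.0, a2=16.0, ratio=1 / 6):
    g = lambda a: t ** (a - 1) * np.exp(-t) / gamma(a)
    return g(a1) - ratio * g(a2)

fs, tr, dur = 10.0, 2.0, 60.0            # feature rate (Hz), TR (s), run length (s)
t = np.arange(0, 32, 1 / fs)             # HRF support in seconds
hrf = double_gamma_hrf(t)

rng = np.random.default_rng(1)
eeg_feature = rng.random(int(dur * fs))  # e.g. a single-trial amplitude trace
bold_pred = np.convolve(eeg_feature, hrf)[: len(eeg_feature)]
regressor = bold_pred[:: int(tr * fs)]   # sample at the scanner TR
print(regressor.shape)                   # (30,)
```

The resulting column would enter the fMRI design matrix alongside standard nuisance regressors, which is the asymmetric-fusion step the review describes.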
UMM-Det: A Unified Object Detection Framework for Heterogeneous Multi-Modal Remote Sensing Imagery
ZOU Minrui, LI Yuxuan, DAI Yimian, LI Xiang, CHENG Mingming
Available online  , doi: 10.11999/JEIT250933
Abstract:
  Objective  With the increasing demand for space-based situational awareness, object detection across multiple modalities has become a fundamental yet challenging task. Current large-scale multimodal detection models for space-based remote sensing primarily operate on single-frame images from visible light, synthetic aperture radar (SAR), and infrared modalities. Although these models achieve acceptable performance in conventional detection, they severely neglect the crucial role of infrared video sequences in enhancing the accuracy of weak and small target detection. Temporal information inherent in sequential infrared data provides discriminative cues for separating dynamic targets from complex clutter, which cannot be captured by single-frame detectors. To address this limitation, this study proposes UMM-Det, a novel unified detection model tailored for infrared sequences. The proposed model not only extends the capability of existing space-based multimodal frameworks to sequential data but also demonstrates that exploiting temporal dynamics is indispensable for next-generation high-precision space-based sensing systems.  Methods  UMM-Det builds upon the unified multimodal detection framework SM3Det but introduces three key innovations. First, the ConvNeXt backbone is replaced with InternImage, a state-of-the-art architecture featuring dynamic sampling and large receptive field modeling. This modification is intended to improve feature extraction robustness against multi-scale variations and low-contrast appearances that are typical of weak and small targets. Second, a novel spatiotemporal visual prompting module is specifically designed for the infrared branch. This module generates high-contrast motion features by applying a refined frame-difference enhancement strategy. The resulting temporal priors guide the backbone network to focus attention on dynamic target regions, thereby mitigating the confusion introduced by static background noise. 
In addition, to overcome the imbalance between positive and negative samples during training, the probabilistic anchor assignment (PAA) strategy is incorporated into the infrared detection head. This improves the reliability of anchor selection and enhances the precision of small target detection under highly skewed data distributions. The overall pipeline is illustrated in Fig. 1, and the schematic of the spatiotemporal visual prompting module is shown in Fig. 2.  Results and Discussions  Extensive experiments are conducted on three public benchmarks: SatVideoIRSTD for infrared sequence detection, SARDet-50K for SAR-based target detection, and DOTA for visible light remote sensing detection. Results demonstrate that UMM-Det consistently outperforms the baseline SM3Det model across all modalities. Specifically, in infrared sequence small target detection, UMM-Det improves detection accuracy by 2.54% over SM3Det (Table 2, Fig. 5), validating the effectiveness of incorporating temporal priors. In SAR target detection (Table 2, Fig. 3), the model achieves an improvement of 2.40% mAP@0.5:0.95, while in visible light detection (Table 2, Fig. 4), a gain of 1.77% is observed. These improvements highlight the generalizability of the proposed framework across heterogeneous modalities. Furthermore, despite performance gains, UMM-Det reduces the number of parameters by more than 50% compared with SM3Det (Table 2), thereby ensuring high efficiency and lightweight deployment suitability for space-based systems. Qualitative comparisons shown in Fig. 4 and Fig. 5 indicate that UMM-Det detects low-contrast and dynamic weak targets missed by the baseline. The discussions emphasize three major findings. First, the spatiotemporal visual prompting strategy effectively transforms frame-to-frame variations into salient motion-aware cues, which are critical for distinguishing small dynamic targets from clutter in complex infrared environments.
Second, the integration of InternImage as the backbone substantially strengthens multi-scale representation capability, ensuring robustness in detecting targets of varying sizes and contrast levels. Third, the probabilistic anchor assignment strategy significantly alleviates the training imbalance problem, leading to more stable optimization and higher detection reliability. Taken together, these components demonstrate a synergistic effect, yielding superior performance not only in sequential infrared data but also in static SAR and visible modalities.  Conclusions  This study proposes UMM-Det, the first space-based multimodal detection model explicitly designed to incorporate infrared sequence information into a unified detection framework. By leveraging InternImage for advanced feature extraction, a spatiotemporal visual prompting module for motion-aware enhancement, and probabilistic anchor assignment for balanced training, UMM-Det delivers significant gains in detection accuracy while reducing computational cost by more than half. The experimental results on SatVideoIRSTD, SARDet-50K, and DOTA collectively demonstrate that the model achieves state-of-the-art performance across infrared, SAR, and visible light modalities, with improvements of 2.54%, 2.40%, and 1.77%, respectively. Beyond its technical contributions, UMM-Det provides an effective pathway toward the construction of next-generation high-performance space-based situational awareness systems, where precision, efficiency, and lightweight design are simultaneously critical. Future research may extend this framework to multi-satellite collaborative sensing and real-time onboard deployment.
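The core operation behind frame-difference-based motion prompting can be sketched as follows. The paper's spatiotemporal visual prompting module refines this with additional enhancement steps and feeds the result into the backbone; the sketch shows only the bare idea of turning frame-to-frame change into a motion-aware map.

```python
import numpy as np

# Minimal frame-difference motion prompt: accumulate absolute differences
# of consecutive infrared frames and normalise to [0, 1]. The map is high
# along moving-target trajectories and near zero on static background.

def motion_prompt(frames):
    """frames: (T, H, W) float array -> (H, W) map in [0, 1]."""
    diff = np.abs(np.diff(frames, axis=0)).sum(axis=0)
    rng = diff.max() - diff.min()
    return (diff - diff.min()) / rng if rng > 0 else diff

T, H, W = 5, 32, 32
frames = np.zeros((T, H, W))
for k in range(T):                 # a small bright target moving right
    frames[k, 16, 8 + k] = 1.0
prompt = motion_prompt(frames)
print(prompt.shape)                # (32, 32); bright trail along row 16
```

Used as a prior, such a map biases the detector's attention toward dynamic regions, which is exactly the confusion-with-static-clutter problem the module addresses.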
Security Protection for Vessel Positioning in Smart Waterway Systems Based on Extended Kalman Filter–Based Dynamic Encoding
TANG Fengjian, YAN Xia, SUN Zeyi, ZHU Zhaowei, YANG Wen
Available online  , doi: 10.11999/JEIT250846
Abstract:
  Objective  With the rapid development of intelligent shipping systems, vessel positioning data face severe privacy leakage risks during wireless transmission. Traditional privacy-preserving methods, such as differential privacy and homomorphic encryption, suffer from data distortion, high computational overhead, or reliance on costly communication links, making it difficult to achieve both data integrity and efficient protection. This study addresses the characteristics of vessel stabilization systems and proposes a dynamic encoding scheme enhanced by time-varying perturbations. By integrating the Extended Kalman Filter (EKF) and introducing unstable temporal perturbations during encoding, the scheme uses receiver-side acknowledgments (ACK feedback) to achieve reference-time synchronization and independently generates synchronized perturbations through a shared random seed. Theoretical analysis and simulations show that the proposed method achieves nearly zero precision loss in state estimation for legitimate receivers, whereas decoding errors of eavesdroppers grow exponentially after a single packet loss, effectively countering both single- and multi-channel eavesdropping attacks. The shared-seed synchronization mechanism avoids complex key management and reduces communication and computational costs, making the scheme suitable for resource-constrained maritime wireless sensor networks.  Methods  The proposed dynamic encoding scheme introduces a time-varying perturbation term into the encoding process. The perturbation is governed by an unstable matrix to induce exponential error growth for eavesdroppers. The encoded signal is constructed from the difference between the current state estimate and a time-scaled reference state, combined with the perturbation term. A shared random seed between legitimate parties enables deterministic and synchronized generation of the perturbation sequence without online key exchange. 
At the legitimate receiver, the perturbation is canceled during decoding, enabling accurate state recovery. Local state estimation at each sensor node is performed using EKF, and the overall communication process is reinforced by acknowledgment-based synchronization to maintain consistency between the sender and receiver.  Results and Discussions  Simulations are conducted in a wireless sensor network with four sensors tracking vessel states, including position, velocity, and heading. The results indicate that legitimate receivers achieve nearly zero estimation error (Fig. 3), whereas eavesdroppers exhibit exponentially increasing errors after a single packet loss (Fig. 4). The error growth rate depends on the instability of the perturbation matrix, confirming the theoretical divergence. In multi-channel scenarios, independent perturbation sequences for each channel prevent cross-channel correlation attacks (Fig. 5). The scheme maintains low communication and computational overhead, making it practical for maritime environments. Furthermore, the method shows strong robustness to packet loss and channel variations, satisfying SOLAS requirements for data integrity and reliability.  Conclusions  A dynamic encoding scheme with time-varying perturbations is proposed for privacy-preserving vessel state estimation. By integrating EKF with an unstable perturbation mechanism, the method ensures high estimation precision for legitimate users and exponential error growth for eavesdroppers. The main contributions are as follows: (1) an encoding framework that achieves zero precision loss for legitimate receivers; (2) a lightweight synchronization mechanism based on shared random seeds, which removes complex key management; and (3) theoretical guarantees of exponential error divergence for eavesdroppers under single- or multi-channel attacks. 
The scheme is robust to packet loss and channel asynchrony, complies with SOLAS data integrity requirements, and is suitable for resource-limited maritime networks. Future work will extend the method to nonlinear vessel dynamics, adaptive perturbation optimization, and validation in real maritime communication environments.
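The shared-seed masking idea described above can be illustrated with a minimal Python sketch. Here each transmitted state is masked by a perturbation generated from an unstable recursion seeded identically at sender and receiver; the gain, state dimension, and function names are illustrative assumptions, not the paper's actual encoder:

```python
import numpy as np

def perturbation_sequence(seed, steps, a=1.5, dim=2):
    """Deterministically generate a perturbation sequence from a shared seed.

    The recursion w[k+1] = a * w[k] uses an unstable gain (|a| > 1), so any
    decoder that loses synchronization sees exponentially growing error.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dim)        # initial perturbation from the seed
    seq = [w.copy()]
    for _ in range(steps - 1):
        w = a * w                       # unstable dynamics
        seq.append(w.copy())
    return seq

def encode(state, ref, w):
    """Sender: mask the state estimate with the current perturbation."""
    return (state - ref) + w

def decode(packet, ref, w):
    """Legitimate receiver: cancel the perturbation regenerated locally."""
    return packet - w + ref

# demo with toy states
seed, steps = 42, 10
ws = perturbation_sequence(seed, steps)
states = [np.array([100.0 + k, 50.0 - k]) for k in range(steps)]
ref = np.zeros(2)

packets = [encode(x, ref, w) for x, w in zip(states, ws)]
recovered = [decode(p, ref, w) for p, w in zip(packets, ws)]
legit_err = max(np.linalg.norm(x - r) for x, r in zip(states, recovered))

# an eavesdropper that missed packet 0 is off by one perturbation index
eve = [p - w for p, w in zip(packets[1:], ws[:-1])]
eve_err = [np.linalg.norm(e - x) for e, x in zip(eve, states[1:])]
```

In this toy run the legitimate decoder recovers states essentially exactly, while the desynchronized decoder's error grows geometrically with the unstable gain, mirroring the exponential divergence claimed for eavesdroppers.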
Unsupervised 3D Medical Image Segmentation With Sparse Radiation Measurement
YU Xiaofan, ZOU Lanlan, GU Wenqi, CAI Jun, KANG Bin, DING Kang
Available online  , doi: 10.11999/JEIT250841
Abstract:
  Objective  Three-dimensional medical image segmentation is a central task in medical image analysis. Compared with two-dimensional imaging, it captures organ and lesion morphology more completely and provides detailed structural information, supporting early disease screening, personalized surgical planning, and treatment assessment. With advances in artificial intelligence, three-dimensional segmentation is viewed as a key technique for diagnostic support, precision therapy, and intraoperative navigation. However, methods such as SwinUNETR-v2 and UNETR++ depend on extensive voxel-level annotations, which create high annotation costs and restrict clinical use. High-quality segmentation also often requires multi-view projections to recover full volumetric information, increasing radiation exposure and patient burden. Segmentation under sparse radiation measurements is therefore an important challenge. Neural Attenuation Fields (NAF) have recently been introduced for low-dose reconstruction by recovering linear attenuation coefficient fields from sparse views, yet their suitability for three-dimensional segmentation remains insufficiently examined. To address this limitation, a unified framework termed NA-SAM3D is proposed, integrating NAF-based reconstruction with interactive segmentation to enable unsupervised three-dimensional segmentation under sparse-view conditions, reduce annotation dependence, and improve boundary perception.  Methods  The framework is designed in two stages. In the first stage, sparse-view reconstruction is performed with NAF to generate a continuous three-dimensional attenuation coefficient tensor from sparse X-ray projections. Ray sampling and positional encoding are applied to arbitrary three-dimensional points, and the encoded features are forwarded to a Multi-Layer Perceptron (MLP) to predict linear attenuation coefficients that serve as input for segmentation. In the second stage, interactive segmentation is performed. 
A three-dimensional image encoder extracts high-dimensional features from the attenuation coefficient tensor, and clinician-provided point prompts specify regions of interest. These prompts are embedded into semantic features by an interactive user module and fused with image features to guide the mask decoder in producing initial masks. Because point prompts provide only local positional cues, boundary ambiguity and mask expansion may occur. To address these issues, a Density-Guided Module (DGM) is introduced at the decoder output stage. NAF-derived attenuation coefficients are transformed into a density-aware attention map, which is fused with the initial masks to strengthen tissue-boundary perception and improve segmentation accuracy in complex anatomical regions.  Results and Discussions  NA-SAM3D is evaluated on a self-constructed colorectal cancer dataset comprising 299 patient cases (collected in collaboration with Nanjing Hospital of Traditional Chinese Medicine) and on two public benchmarks: the Lung CT Segmentation Challenge (LCTSC) and the Liver Tumor Segmentation Challenge (LiTS). The results show that NA-SAM3D achieves overall better performance than mainstream unsupervised three-dimensional segmentation methods based on full radiation observation (SAM-MED series) and reaches accuracy comparable to, or in some cases higher than, the fully supervised SwinUNETR-v2. Compared with SAM-MED3D, NA-SAM3D increases the Dice on the LCTSC dataset by more than 3%, while HD95 and ASD decrease by 5.29 mm and 1.32 mm, respectively, indicating improved boundary localization and surface consistency. Compared with the sparse-field-based method SA3D, NA-SAM3D achieves higher Dice scores on all three datasets (Table 1). Compared with the fully supervised SwinUNETR-v2, NA-SAM3D reduces HD95 by 1.28 mm, and the average Dice is only 0.3% lower. 
Compared with SA3D, NA-SAM3D increases the average Dice by about 6.6% and reduces HD95 by about 11 mm, further confirming its capacity to restore structural details and boundary information under sparse-view conditions (Table 2). Although the overall performance remains slightly lower than that of the fully supervised UNETR++ model, NA-SAM3D still shows strong competitiveness and good generalization under label-free inference. Qualitative analysis shows that in complex pelvic and intestinal regions, NA-SAM3D produces clearer boundaries and higher contour consistency (Fig. 3). On public datasets, segmentation of the lung and liver also shows superior boundary localization and contour integrity (Fig. 4). Three-dimensional visualization further confirms that in colorectal, lung, and liver regions, NA-SAM3D achieves stronger structural continuity and boundary preservation than SAM-MED2D and SAM-MED3D (Fig. 5). The DGM further enhances boundary sensitivity, increasing Dice and mIoU by 1.20% and 3.31% on the self-constructed dataset, and by 4.49 and 2.39 percentage points on the LiTS dataset (Fig. 6).  Conclusions  An unsupervised three-dimensional medical image segmentation framework, NA-SAM3D, is presented, integrating NAF-based reconstruction with interactive segmentation to achieve high-precision segmentation under sparse radiation measurements. The DGM effectively uses attenuation coefficient priors to enhance boundary recognition in complex lesion regions. Experimental results show that the framework approaches the performance of fully supervised methods under unsupervised inference and yields an average Dice improvement of 2.0%, indicating strong practical value and clinical potential for low-dose imaging and complex anatomical segmentation. Future work will refine the model for additional anatomical regions and assess its practical use in preoperative planning.
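The ray-sampling and positional-encoding step of the NAF stage can be sketched as a NeRF-style frequency expansion. This minimal Python example (the frequency count and scaling are illustrative assumptions, not the paper's exact settings) shows how 3D sample points are lifted to the feature vectors fed to the MLP:

```python
import numpy as np

def positional_encoding(points, num_freqs=4):
    """NeRF-style positional encoding of 3D sample points.

    Each coordinate p is expanded into [sin(2^i * pi * p), cos(2^i * pi * p)]
    for i = 0..num_freqs-1, letting a small MLP represent high-frequency
    attenuation detail.
    """
    points = np.asarray(points, dtype=float)     # shape (N, 3)
    feats = []
    for i in range(num_freqs):
        feats.append(np.sin((2.0 ** i) * np.pi * points))
        feats.append(np.cos((2.0 ** i) * np.pi * points))
    return np.concatenate(feats, axis=-1)        # shape (N, 3 * 2 * num_freqs)

pts = np.array([[0.1, 0.2, 0.3],
                [0.4, 0.5, 0.6]])
enc = positional_encoding(pts, num_freqs=4)      # (2, 24) feature matrix
```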
Multi-Matrix Representative Ordered Statistics Decoding
WANG Yiwen, WANG Qianfan, LIANG Jifan, SONG Linqi, MA Xiao
Available online  , doi: 10.11999/JEIT250854
Abstract:
  Objective   Representative ordered statistics decoding (ROSD) is a class of efficient decoding algorithms originally proposed for staircase matrix codes, which support parallel Gaussian elimination (GE) to enable low-latency implementations. In this paper, ROSD is extended to general linear block codes by utilizing the minimum-weight staircase generator matrix (MWSGM) construction, which generates staircase-structured matrices for arbitrary linear codes. Building upon this, we propose a multi-matrix representative OSD (MM-ROSD) framework that exploits the diversity of multiple candidate staircase matrices to enhance decoding performance and reduce complexity. For performance analysis, a saddlepoint-approximation-based analytical framework is developed to predict the upper bound of the frame error rate (FER) and estimate the required average number of searches.  Methods  The proposed MM-ROSD algorithm consists of two main components: (1) Multi-matrix construction and selection strategy: In the construction phase, the first \begin{document}$ M $\end{document} minimum-weight candidate codewords are retained in the first row (i.e., the first staircase), and for each candidate, the remaining rows are searched independently, yielding \begin{document}$ M $\end{document} staircase generator matrices with improved basis diversity. In the decoding phase, the optimal staircase matrix is selected according to the sum of reliabilities of the available re-encoding bases within each candidate matrix, and ROSD is then performed on the selected matrix. (2) Saddlepoint-based performance analysis: A saddlepoint approximation method is introduced to theoretically estimate both the FER upper bound and the required average number of searches, providing valuable guidelines for complexity-performance trade-offs and parameter tuning.  
Results and Discussions  Extensive simulations are conducted over BPSK-modulated AWGN channels using 5G CA-polar codes \begin{document}$ \mathcal{C}[128{,}64] $\end{document} concatenated with an 11-bit CRC. Key findings include: (1) Saddlepoint approximation accuracy: The predicted FER upper bound matches simulation results closely across the entire SNR range and tightly approaches both the maximum-likelihood (ML) lower bound and the random coding union (RCU) bound. Similarly, the estimated average number of searches aligns closely with simulations in both mid and high SNR regions, validating the accuracy of the analytical framework. (2) Impact of multi-matrix diversity: Increasing the number of pre-stored staircase matrices \begin{document}$ M $\end{document} effectively improves basis quality and decoding performance. For example, with \begin{document}$ M\in \{1,2,8\} $\end{document} and a limited maximum number of searches \begin{document}$ {\ell}_{\max }\in \{{10}^{4},{10}^{5},{10}^{6}\} $\end{document}, FER performance significantly improves and approaches finite-length capacity and ML lower bounds (Fig. 3(a)). Under a limited search list (e.g., \begin{document}$ {\ell}_{\max }={10}^{4} $\end{document}), the FER and average number of searches are substantially reduced by increasing \begin{document}$ M $\end{document}. This effect is mainly due to the improved quality of the re-encoding basis afforded by the multi-matrix strategy. Under larger budgets (e.g., \begin{document}$ {\ell}_{\max }={10}^{6} $\end{document}), increasing \begin{document}$ M $\end{document} primarily reduces the average number of searches.  Conclusions  This work extends ROSD to general linear block codes and introduces an efficient MM-ROSD framework based on MWSGM construction. 
By leveraging the diversity of multiple candidate staircase matrices and the low-latency advantage of parallel GE, the proposed approach significantly improves decoding performance while reducing the average number of searches. Furthermore, the saddlepoint-based analytical framework accurately predicts both FER and the average number of searches, providing theoretical guidance for practical system design. Simulation results demonstrate that, under identical maximum search constraints, MM-ROSD achieves substantial FER gains and significant reductions in average number of searches compared with the baseline single-matrix ROSD, making it a promising decoding framework for short-block codes in ultra-reliable low-latency communication (URLLC) and hyper-reliable low-latency communication (HRLLC) scenarios.
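The matrix-selection rule described in the Methods, choosing the candidate staircase matrix whose re-encoding basis carries the largest total reliability, can be sketched in a few lines of Python. The basis positions and LLR values below are toy assumptions for illustration only:

```python
import numpy as np

def select_matrix(basis_sets, reliabilities):
    """Pick the candidate staircase matrix whose re-encoding basis positions
    carry the largest total reliability (e.g., |LLR| magnitudes)."""
    scores = [sum(reliabilities[i] for i in basis) for basis in basis_sets]
    return int(np.argmax(scores))

llr = np.array([2.1, 0.3, 1.7, 0.9, 3.2, 0.1])   # toy channel LLRs
reliab = np.abs(llr)
# hypothetical re-encoding basis positions for M = 3 candidate matrices
candidates = [[0, 1, 2], [0, 2, 4], [1, 3, 5]]
best = select_matrix(candidates, reliab)          # index of chosen matrix
```

ROSD would then be run only on the selected matrix, which is how the multi-matrix diversity improves basis quality without multiplying the search cost.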
Design of Dynamic Resource Awareness and Task Offloading Schemes in Multi-Access Edge Computing Networks
ZHANG Bingxue, LI Xisheng, YOU Jia
Available online  , doi: 10.11999/JEIT250640
Abstract:
  Objective  With the development of the industrial Internet of Things and the widespread use of multi-mode terminal equipment, multi-access edge computing has become a key technology for supporting low-delay and energy-efficient industrial applications. The task offloading mechanism of edge computing is the core method for handling the large-scale and complex task processing requirements of multi-mode terminals. In a multi-access edge computing system, the network selection of end users has a great impact on the offloading mechanism and resource allocation. However, existing network selection mechanisms focus on the user's selection decision and ignore the impact of users' task execution and of task data offloading, transmission, and processing on network performance. Likewise, existing research on task offloading mechanisms focuses on offloading delay, energy consumption optimization, and resource allocation, ignoring the impact of multi-access heterogeneous network collaborative computing on resource costs and the dynamic resource balance between heterogeneous networks. To address these challenges, this paper considers the impact of users' diverse needs and heterogeneous resource providers' differentiated capabilities on offloading decisions in a complex computing environment, and jointly optimizes user task execution cost and the rational allocation of dynamic resources in multi-access heterogeneous networks, so as to reduce system operation cost, improve quality of service, and utilize heterogeneous resources efficiently and cooperatively.  Methods  Based on the multi-access edge computing network model, this paper establishes a cost model covering task execution time, energy consumption, and communication resource consumption for the networks available to end users. 
Based on auction theory, a cost-effectiveness model of computing-task evaluation and bidding is established for the interaction between users and edge servers, and the objective optimization problem is formulated according to combinatorial two-way auction theory. A dynamic resource sensing and task offloading algorithm based on the auction mechanism is then proposed. Through two-way broadcasting of the task information to be accessed and the required resources, network selection and dynamic resource allocation are carried out. Only when its available resources meet the user's resource constraints can a server submit an effective bid. Edge servers with effective bids compete for the opportunity to execute the user's task, until the user obtains an optimal bid and the corresponding server, completing the auction matching process.  Results and Discussions  The auction-based dynamic resource allocation and task offloading algorithm considers the heterogeneous network status and resource usage, and selects the task offloading location according to the resource allocation. By setting the simulation system parameters, an edge computing model of heterogeneous wireless network cooperation is constructed, and the impact of network size on task offloading cost and task offloading data volume is analyzed. The simulation results show that the auction-based dynamic resource allocation and task offloading algorithm reduces the system cost by at least 5% compared with other benchmark algorithms (Fig. 3), and the advantage is more obvious when there are more end users. Changes in the number of servers in heterogeneous networks have a certain impact on users' selection of a network for task offloading (Fig. 4, 5, 6). The proposed algorithm also improves the amount of offloaded task data by 10% compared with the benchmark algorithms (Fig. 7, 8). 
Finally, the impact of the communication resource cost parameter on users' choice of the 5G public network for task offloading is studied. The larger the communication cost parameter, the smaller the amount of task data that end users offload to the 5G public network (Fig. 9).  Conclusions  Aiming at the complex data processing requirements of multi-mode terminals, this paper constructs a multi-access edge computing cooperation network architecture for multi-mode terminals. The flexible and intelligent selection of wireless communication networks by multi-mode terminals provides more resources for end-user task offloading. A server bidding and user target bidding model is established based on the auction model, and a dynamic resource perception and task offloading algorithm based on the auction mechanism is proposed for multi-mode terminal task offloading, network selection, and resource allocation. The algorithm first dynamically selects the offloading network and allocates computing and communication resources according to the access tasks, and then selects the task offloading location with the minimum execution cost according to the bidding competition among the edge servers. The results show that the proposed algorithm effectively reduces the system cost compared with the benchmark algorithms, increases the amount of task data offloaded from end users to multiple edge servers, makes full use of edge computing resources, and improves system energy efficiency and operation efficiency.
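The constrained bidding step, in which only servers whose free resources cover a task's demand may bid and the user accepts the best bid, can be sketched as follows. The server names, resource fields, and bid values are hypothetical, intended only to illustrate the matching rule:

```python
def match_task(task, servers):
    """Auction-style matching sketch: only servers whose free resources cover
    the task's demand submit bids; the user accepts the lowest-cost bid."""
    valid = [s for s in servers
             if s["cpu_free"] >= task["cpu"] and s["bw_free"] >= task["bw"]]
    if not valid:
        return None                      # no server can serve this task
    return min(valid, key=lambda s: s["bid"])["name"]

task = {"cpu": 4, "bw": 10}
servers = [
    {"name": "wifi_edge", "cpu_free": 8, "bw_free": 20, "bid": 3.0},
    {"name": "5g_edge",   "cpu_free": 6, "bw_free": 15, "bid": 2.5},
    {"name": "busy_edge", "cpu_free": 2, "bw_free": 30, "bid": 1.0},
]
winner = match_task(task, servers)
```

Note that the cheapest server (`busy_edge`) is excluded because its free CPU cannot satisfy the task constraint, so the task is matched to the lowest effective bid instead.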
Research on Low Leakage Current Voltage Sampling Method for Multi-cell Series Battery Packs
GUO Zhongjie, GAO Yuyang, DONG Jianfeng, BAI Ruokai
Available online  , doi: 10.11999/JEIT250733
Abstract:
  Objective  The battery voltage sampling circuit is one of the key components of the Battery Management Integrated Circuit (BMIC). It is responsible for real-time monitoring of the battery's voltage status, and its performance directly determines the safety of the series battery pack. The traditional resistive voltage sampling circuit suffers from channel leakage current, which degrades battery voltage consistency and sampling accuracy. Meanwhile, the level-shifting circuit in the high-voltage domain includes high-voltage operational amplifiers, and the large number of high-voltage MOSFETs results in additional area overhead.  Methods  This paper proposes a low leakage current battery voltage sampling circuit for 14-series lithium batteries. Building on the traditional resistive voltage sampling circuit, an operational-amplifier-isolated active drive technique reduces the channel leakage current to the pA level. Different voltage conversion methods are adopted for the different voltage domains of the series battery pack. The first cell is isolated using a unity-gain buffer, and its voltage is then converted through resistive voltage division. Cells 2 to 13 adopt operational-amplifier-isolated active driving to synchronously follow the voltage across each cell, and the followed voltage is then converted into a ground-referenced voltage through a level-shifting circuit. The voltage sampling of the highest cell draws power from the entire series battery pack and therefore does not affect the consistency of the pack, so the highest cell directly uses the level-shifting circuit for voltage conversion.  Results and Discussions  This paper presents a detailed design and complete performance verification of the circuit based on a 0.35 μm high-voltage BCD process. 
The overall layout area of the designed battery voltage sampling circuit is 3105 μm × 638 μm (Fig. 10). The verification results show that, across process corners and temperatures, the maximum channel leakage current with the proposed operational-amplifier-isolated active drive technique is only 48.9 pA, whereas the minimum channel leakage current of the traditional voltage sampling circuit is 1.169×10⁶ pA (Fig. 12, Fig. 13). The impact of the sampling process on battery inconsistency is reduced from 18.56% to 2.122 ppm (Fig. 14). In addition, under comprehensive PVT verification conditions, the maximum measurement error of the designed battery voltage sampling circuit is 0.9 mV (Fig. 15, Fig. 16, Fig. 17).  Conclusions  This paper proposes an operational-amplifier-isolated active drive technique to mitigate the problem in traditional resistive voltage sampling circuits whereby channel leakage current degrades battery voltage consistency and sampling accuracy. With the designed battery voltage sampling circuit, the maximum channel leakage current is 48.9 pA, the battery voltage inconsistency is 2.122 ppm, and the maximum measurement error is 1.25 mV. The circuit achieves extremely low channel leakage current while ensuring sampling accuracy, and the proposed low-leakage-current battery voltage sampling circuit can be applied to 14-series lithium battery management chips.
Federated Semi-Supervised Image Segmentation with Dynamic Client Selection
LIU Zhenbing, LI Huanlan, WANG Baoyuan, LU Haoxiang, PAN Xipeng
Available online  , doi: 10.11999/JEIT250834
Abstract:
  Objective  Multicenter validation is an inevitable trend in clinical research, yet strict privacy regulations, heterogeneous cross-institutional data distributions and scarce pixel-level annotations limit the applicability of conventional centralized medical image segmentation models. This study aims to develop a federated semi-supervised framework that jointly exploits labelled and unlabeled prostate MRI data from multiple hospitals, explicitly considering dynamic client participation and non-independent and identically distributed (Non-IID) data, so as to improve segmentation accuracy and robustness under real-world constraints.  Methods  A cross-silo federated semi-supervised learning paradigm is adopted, in which clients with pixel-wise annotations act as labeled clients and those without annotations act as unlabeled clients. Each client maintains a local student network for prostate segmentation. On unlabeled clients, a teacher network with the same architecture is updated by the exponential moving average of student parameters and generates perturbed pseudo-labels to supervise the student through a hybrid consistency loss that combines Dice and binary cross-entropy terms. To mitigate the negative influence of heterogeneous and low-quality updates, a performance-driven dynamic client selection and aggregation strategy is introduced. At each communication round, clients are evaluated on their local validation sets, and only those whose Dice scores exceed a threshold are retained; then a Top-K subset is aggregated with normalized contribution weights derived from validation Dice, with bounds to avoid gradient vanishing and single-client dominance. For unlabeled clients, a penalty factor is applied to down-weight unreliable pseudo-labeled updates. As the segmentation backbone, a Multi-scale Feature Fusion U-Net (MFF-UNet) is constructed. 
Starting from a standard encoder–decoder U-Net, an FPN-like pyramid is inserted into the encoder, where multi-level feature maps are channel-aligned by 1×1 convolutions, fused in a top–down pathway by upsampling and element-wise addition, and refined using 3×3 convolutions. The decoder progressively upsamples these fused features and combines them with encoder features via skip connections, enabling joint modelling of global semantics and fine-grained boundaries. The framework is evaluated on T2-weighted prostate MRI from six centers: three labeled clients and three unlabeled clients. All 3D volumes are resampled, sliced into 2D axial images, resized and augmented. Dice coefficient and 95th percentile Hausdorff distance (HD95) are used as evaluation metrics.  Results and Discussions  On the six-center dataset, the proposed method achieves average Dice scores of 0.8405 on labeled clients and 0.7868 on unlabeled clients, with corresponding HD95 values of 8.04 and 8.67 pixels, respectively. These results are consistently superior to or on par with several representative federated semi-supervised or mixed-supervision methods, and the improvements are most pronounced on distribution-shifted unlabeled centers. Qualitative visualization shows that the proposed method produces more complete and smoother prostate contours with fewer false positives in challenging low-contrast or small-volume cases, compared with the baselines. Attention heatmaps extracted from the final decoder layer demonstrate that U-Net suffers from attention drift, SegMamba displays diffuse responses and nnU-Net exhibits weak activations for small lesions, whereas MFF-UNet focuses more precisely on the prostate region with stable high responses, indicating enhanced discriminative capability and interpretability.  
Conclusions  A federated semi-supervised prostate MRI segmentation framework that integrates teacher–student consistency learning, multi-scale feature fusion and performance-driven dynamic client selection is presented. The method preserves patient privacy by keeping data local, alleviates annotation scarcity by exploiting unlabeled clients and explicitly addresses client heterogeneity through reliability-aware aggregation. Experiments on a six-center dataset demonstrate that the proposed framework achieves competitive or superior overlap and boundary accuracy compared with state-of-the-art federated semi-supervised methods, particularly on distribution-shifted unlabeled centers. The framework is model-agnostic and can be extended to other organs, imaging modalities and cross-institutional segmentation tasks under stringent privacy and regulatory constraints.
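The performance-driven aggregation described in the Methods (validation-Dice thresholding, Top-K selection, and bounded contribution weights) can be sketched in Python. The threshold, K, and weight bounds below are illustrative assumptions, not the paper's tuned values:

```python
import numpy as np

def aggregate(updates, dice_scores, threshold=0.6, k=2,
              w_min=0.1, w_max=0.8):
    """Performance-driven aggregation sketch: keep clients whose validation
    Dice exceeds a threshold, take the Top-K, and average their updates with
    Dice-normalized weights clipped to [w_min, w_max]."""
    keep = [i for i, d in enumerate(dice_scores) if d > threshold]
    keep = sorted(keep, key=lambda i: dice_scores[i], reverse=True)[:k]
    w = np.array([dice_scores[i] for i in keep], dtype=float)
    w = w / w.sum()                     # Dice-normalized contributions
    w = np.clip(w, w_min, w_max)        # avoid vanishing / dominant clients
    w = w / w.sum()                     # renormalize after clipping
    agg = sum(wi * updates[i] for wi, i in zip(w, keep))
    return agg, keep

# toy flat parameter vectors for four clients
updates = [np.full(3, v) for v in (1.0, 2.0, 3.0, 4.0)]
dice = [0.55, 0.70, 0.90, 0.80]
agg, selected = aggregate(updates, dice)
```

Here the client with Dice 0.55 is filtered out at the threshold, and only the two most reliable clients contribute to the aggregated model, weighted by their validation Dice.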
Progress in Modeling Cardiac Myocyte Calcium Cycling and Investigating Arrhythmia Mechanisms: A Study Focused on the Ryanodine Receptor
GAO Ying, ZHANG Yucheng, WANG Wenyao, SU Xuanyi, SONG Zhen
Available online  , doi: 10.11999/JEIT250957
Abstract:
  Significance   Ryanodine receptor (RyR) is an essential regulator of cardiac intracellular calcium homeostasis by controlling the release of Ca2+ from the sarcoplasmic reticulum (SR). Its functional abnormalities, such as overactivation or impaired activity, are critical mechanisms underlying early and delayed afterdepolarizations, significantly increasing the risk of arrhythmias. The dynamic coupling between electrical activity and calcium cycling in cardiomyocytes involves highly dynamic and spatially organized processes that are challenging to fully capture experimentally. Conventional experimental techniques, such as animal models and pharmacological studies, are limited by high costs and difficulties in controlling variables. As a result, developing mathematical models and computer simulations of the RyR has become a crucial approach for investigating RyR function regulation under physiological and pathological conditions, as well as its arrhythmogenic mechanisms. This review provides a systematic overview of RyR biology and modeling. It begins by synthesizing RyR structural features and fundamental functional properties to establish a mechanistic basis for gating and regulation. Next, it evaluates contemporary and emerging modeling techniques, outlining the merits and limitations of various computational approaches. The review then summarizes the integration of RyR models into cardiac Ca2+ cycling frameworks and their applications across cardiomyocyte subtypes. Furthermore, the review covers arrhythmogenic mechanisms arising from RyR dysfunction and examines targeted drug therapies designed to normalize channel activity. Finally, it highlights artificial intelligence and cardiac digital twins as emerging paradigms for advancing RyR modeling and therapeutic applications.  Progress   The accumulation of RyR structural data has driven continuous innovation in modeling strategies. 
Early models often used phenomenological strategies that were practical but mechanistically limited. Markov models now represent the dominant computational framework for simulating RyR gating behavior, enabling detailed replication of calcium sparks and other key events through discrete state transitions. A key advantage of deterministic integration over other numerical methods for solving Markov models is its superior computational efficiency and remarkable flexibility in adapting to diverse cardiomyocyte types. However, it ignores the stochastic nature of RyR opening and fails to reproduce stochastic fluctuations in intracellular calcium concentration, potentially leading to discrepancies between simulations and physiological reality. In contrast, stochastic Markov models can capture these random behaviors, which are critical for investigating arrhythmogenic phenomena like calcium waves. However, they necessitate substantial experimental data and considerable computational resources, consequently hindering their broader-scale application. The development of artificial intelligence methods, including the use of deep neural networks to compress Markov models into single equations, has substantially improved computational efficiency. Meanwhile, structural biology advances have clarified the conformational dynamics of RyRs and subunit cooperativity in gating, especially in diastolic calcium leak, prompting more detailed models like those incorporating subunit interactions or molecular dynamics. Additionally, various RyR models have been successfully integrated into cardiac action potential frameworks, serving as powerful tools for investigating arrhythmogenic mechanisms like delayed afterdepolarizations (DADs) and early afterdepolarizations (EADs). These models not only enhance the understanding of electrical disturbances caused by RyR dysfunction but also provide a valuable platform for antiarrhythmic drug screening and mechanistic research.  
  Conclusion  Several RyR models have been developed that accurately simulate essential physiological processes such as calcium sparks, enabling broad application in cardiomyocyte calcium dynamics studies. However, current modeling efforts face considerable challenges: (1) Lack of a unified modeling framework. There is still no unified RyR model capable of accurately simulating calcium dynamics across the wide spectrum of physiological and pathological conditions. To select an appropriate model for intracellular calcium handling, the specific effects of the different models must be carefully evaluated. (2) Computational burden restricts multiscale integration. While multiscale models are essential to bridge arrhythmic mechanisms from cellular calcium dynamics to tissue-level propagation by incorporating heterogeneity, their high computational cost presents a formidable barrier to scaling for clinically relevant applications. (3) Underdeveloped pacemaker cell models. Existing research focuses largely on ventricular and atrial myocytes, while pacemaker cell models are relatively underdeveloped and often employ “common pool” approximations that fail to capture spatial calcium gradients. Future research should therefore prioritize the development of detailed pacemaker cell models that represent calcium release unit (CRU) networks and incorporate realistic RyR dynamics. Although still in the early stages of development for RyR modeling, emerging approaches such as artificial intelligence and cardiac digital twins offer substantial potential to advance both mechanistic understanding and applications in precision medicine.  Prospects   The future of RyR research will increasingly rely on combining multidisciplinary advances across structural biology, biophysics, and computational science. 
Integrative efforts are essential to bridge molecular-scale conformational changes of RyR to organ-level cardiac function, which will enable the creation of scalable and clinically actionable models that not only deepen mechanistic insight but also accelerate translational innovation in precision cardiology. Emerging tools like AI and cardiac digital twins offer a pathway toward clinically relevant, multi-scale cardiac models that incorporate patient-specific electrophysiology and calcium handling. Such models could profoundly improve our understanding of arrhythmia mechanisms and heart failure pathophysiology, while also serving as predictive platforms for mechanism-based personalized antiarrhythmic therapy development.
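The stochastic Markov approach surveyed above can be illustrated with a minimal two-state (closed/open) gating sketch. The Ca2+-squared activation and the rate constants below are common textbook simplifications chosen for illustration, not a specific published RyR model:

```python
import numpy as np

def simulate_ryr(ca, steps=50000, dt=1e-4, kplus=100.0, kminus=5.0, seed=0):
    """Stochastic two-state (closed <-> open) Markov sketch of RyR gating.

    The opening rate scales with cytosolic Ca2+ (k_open = kplus * ca**2),
    a simplified cooperative-activation assumption; returns the fraction
    of time the channel spends open.
    """
    rng = np.random.default_rng(seed)
    open_state = False
    open_time = 0
    k_open = kplus * ca ** 2
    for _ in range(steps):
        if open_state:
            if rng.random() < kminus * dt:   # stochastic closing
                open_state = False
        else:
            if rng.random() < k_open * dt:   # Ca2+-dependent stochastic opening
                open_state = True
        open_time += open_state
    return open_time / steps

p_low = simulate_ryr(ca=0.1)    # low, diastolic-like Ca2+
p_high = simulate_ryr(ca=1.0)   # elevated Ca2+
```

Even this toy chain reproduces the qualitative behavior the review emphasizes: open probability rises steeply with Ca2+, and individual trajectories fluctuate stochastically, which deterministic rate-equation integration cannot capture.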
A Neural Network-Based Robust Direction Finding Algorithm for Mixed Circular and Non-Circular Signals Under Array Imperfections
YU Qi, YIN Jiexin, LIU Zhengwu, WANG Ding
Available online  , doi: 10.11999/JEIT250884
Abstract:
  Objective   Direction Of Arrival (DOA) estimation is affected by low Signal-to-Noise Ratios (SNR), the coexistence of Circular Signals (CSs) and Non-Circular Signals (NCSs), and multiple forms of array imperfections. Conventional subspace-based estimators exhibit model mismatch in such environments and show reduced accuracy. Although neural-network methods provide data-driven alternatives, the effective use of the distinctive statistical properties of NCSs and the maintenance of robustness against diverse array errors remain insufficiently addressed. The objective is to design a DOA estimation algorithm that operates reliably for mixed CSs and NCSs in the presence of array imperfections and provides improved estimation accuracy in challenging operating conditions.  Methods   A robust DOA estimation algorithm is proposed based on an improved Vision Transformer (ViT) model. A six-channel image-like input is first constructed by fusing features derived from the covariance matrix and pseudo-covariance matrix of the received signal. These channels include the real component, imaginary component, magnitude, phase, magnitude ratio reflecting the NCS characteristic, and the phase of the pseudo-covariance matrix. A gradient-masking mechanism is introduced to adaptively fuse core and auxiliary features. The ViT architecture is then modified: the standard patch-embedding module is replaced with a convolutional layer to extract local information, and a dual-class-token attention mechanism, placed at the sequence head and tail, is designed to enhance feature representation. A standard Transformer encoder is used for deep feature learning, and DOA estimation is performed through a multi-label classification head.  Results and Discussions   Extensive simulations are carried out to assess the proposed algorithm (6C-ViT) against MUSIC, NC-MUSIC, a Convolutional Neural Network (6C-CNN), a Residual Network (6C-ResNet), and a MultiLayer Perceptron (6C-MLP). 
Performance is evaluated using Root Mean Square Error (RMSE) and angular estimation error under different operating conditions. Under single-source scenarios with low SNR and no array errors, 6C-ViT achieves near-zero RMSE across most angles and shows minor edge deviations (Fig. 2). It maintains the lowest RMSE across the SNR range from –20 dB to 15 dB (Fig. 3), indicating good generalization to unseen SNR levels. In dual-source scenarios containing mixed CSs and NCSs under array errors, 6C-ViT shows clear advantages. Its estimation errors fluctuate slightly around zero, whereas competing techniques present larger errors and pronounced instabilities, especially near array edges (Fig. 4). Its RMSE decreases steadily as SNR increases and reaches below 0.1° at high SNR, while traditional approaches saturate around 0.4° (Fig. 5). Robust behavior is further observed across different numbers of signal sources (K = 1, 2, 3) and snapshot counts (100 to 2 000). 6C-ViT preserves high accuracy and stability under these variations, whereas other methods show marked degradation or instability, most evident at low snapshot counts or with multiple sources (Fig. 6). When evaluated using unknown modulation types, including UQPSK with a non-circularity rate of 0.6 and 64QAM, under array errors, 6C-ViT continues to produce the lowest RMSE across most angles (Fig. 7), demonstrating strong generalization capability. Ablation studies (Fig. 8) confirm the contributions of the six-channel input, the gradient-masking module, the convolutional embedding, and the dual-class-token mechanism. The complete configuration yields the highest accuracy and the most stable performance.  Conclusions   Strong robustness is demonstrated in complex scenarios that contain mixed CSs and NCSs, multiple array imperfections, low SNR, and closely spaced sources. 
By fusing multi-dimensional features of the received signal and using an enhanced Transformer architecture, the algorithm attains higher estimation accuracy and improved generalization across different signal types, error conditions, snapshot counts, and noise levels compared with subspace- and neural-network-based baselines. The method provides a reliable DOA estimation solution for demanding practical environments.
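The six-channel construction described above can be sketched from the sample covariance and pseudo-covariance matrices of the array snapshots; the channel ordering, the ratio normalization, and the function name below are illustrative assumptions rather than the paper's exact implementation:

```python
import numpy as np

def six_channel_input(X):
    """Build a six-channel image-like feature from M-sensor snapshots X (M x N complex).

    Channels follow the abstract's description: real part, imaginary part,
    magnitude, and phase of the covariance matrix; a magnitude ratio that
    reflects non-circularity; and the phase of the pseudo-covariance matrix.
    """
    N = X.shape[1]
    R = X @ X.conj().T / N              # sample covariance matrix
    C = X @ X.T / N                     # pseudo-covariance (nonzero only for NCSs)
    eps = 1e-12                         # guard against division by zero
    return np.stack([
        R.real,                         # channel 1: real component
        R.imag,                         # channel 2: imaginary component
        np.abs(R),                      # channel 3: magnitude
        np.angle(R),                    # channel 4: phase
        np.abs(C) / (np.abs(R) + eps),  # channel 5: non-circularity magnitude ratio
        np.angle(C),                    # channel 6: pseudo-covariance phase
    ])                                  # shape (6, M, M)
```

For circular signals the pseudo-covariance vanishes in expectation, so channels 5 and 6 carry the statistics that distinguish NCSs from CSs.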
Dynamic State Estimation of Distribution Network by Integrating High-degree Cubature Kalman Filter and Long Short-Term Memory Under False Data Injection Attack
XU Daxing, SU Lei, HAN Heqiao, WANG Hailun, ZHANG Heng, CHEN Bo
Available online  , doi: 10.11999/JEIT250805
Abstract:
  Objective  Dynamic state estimation of distribution networks is presented as a core technique for maintaining secure and stable operation in cyber-physical power systems. Its practical performance is limited by strong system nonlinearity, high-dimensional state characteristics, and the threat posed by False Data Injection Attack (FDIA). A method that integrates High-degree Cubature Kalman Filter (HCKF) with Long Short-Term Memory network (LSTM) is proposed. HCKF is applied to enhance estimation precision in nonlinear high-dimensional scenarios. The estimation outputs from HCKF and Weighted Least Squares (WLS) are combined for rapid FDIA identification using residual-based analysis. The LSTM model is then employed to reconstruct measurement data of compromised nodes and refine state estimation results. The approach is validated on the IEEE 33-bus distribution system, demonstrating reliable accuracy enhancement and effective attack resilience.  Methods   The strong nonlinearity of distribution networks limits the estimation accuracy of dynamic methods based on the Cubature Kalman Filter (CKF). A hybrid measurement state estimation model that combines data from Phasor Measurement Unit (PMU) and Supervisory Control And Data Acquisition (SCADA) is established. HCKF is applied to enhance estimation performance in nonlinear, high-dimensional scenarios by generating higher-order cubature points. Under FDIA, the estimation outputs from WLS and HCKF are jointly assessed, allowing rapid intrusion detection through residual evaluation and state consistency checking. Once an attack is identified, an LSTM model performs time-series prediction to reconstruct the measurement data of compromised nodes. The reconstructed data replace abnormal values, enabling correction of the final state estimation.  
Results and Discussions  Experiments on the IEEE 33-bus distribution system show that without FDIA, HCKF achieves higher estimation accuracy for voltage magnitude and phase angle than CKF. The Average Relative Error (ARE) of the voltage magnitude decreases by 57.9%, and the corresponding phase-angle error decreases by 28.9%, confirming the superiority of the method for strongly nonlinear and high-dimensional state estimation. Under FDIA, residual-based detection effectively identifies cyber attacks and avoids false alarms and missed detections. The prediction error of LSTM for the measurement data of compromised nodes and their associated branches remains on the order of 10⁻⁶, indicating high reconstruction fidelity. The combined HCKF and LSTM method maintains stable state tracking after intrusion, and its performance exceeds that of WLS and the adaptive Unscented Kalman Filter.  Conclusions  The dynamic state estimation method that integrates HCKF and LSTM enhances adaptability to strong nonlinearity and high-dimensional characteristics of distribution networks. Rapid and accurate FDIA identification is achieved through residual evaluation, and LSTM reconstructs the measurement data of compromised nodes with high reliability. The method maintains high estimation accuracy under normal operation and preserves stability and precision under cyber intrusion. It offers technical support for secure and stable operation of distribution networks in the presence of malicious attacks.
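The residual evaluation used for rapid FDIA identification can be illustrated as a chi-square test on the filter innovation; the interface and the example 95% threshold for a three-dimensional measurement are assumptions for this sketch, not the paper's exact detector:

```python
import numpy as np

def fdia_detect(z, z_pred, S, threshold):
    """Flag a measurement vector z as attacked when the normalized innovation
    r^T S^{-1} r exceeds a chi-square threshold; z_pred and the innovation
    covariance S come from the filter's prediction step."""
    r = z - z_pred
    stat = float(r @ np.linalg.solve(S, r))
    return stat > threshold
```

On detection, the pipeline described above replaces the flagged measurements with LSTM-reconstructed values before correcting the state estimate.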
A Fake Attention Map-Driven Multi-Task Deepfake Video Detection Model
LIU Pengyu, ZHENG Tianyang, DONG Min
Available online  , doi: 10.11999/JEIT250926
Abstract:
  Objective  Deepfake detection is a major challenge in multimedia forensics and information security as synthetic media generation advances. Most high-quality detection methods rely on supervised binary classification models with implicit attention mechanisms. Although these models learn discriminative features and reveal manipulation traces, their performance decreases when confronted with unseen forgery techniques. The absence of explicit guidance during feature fusion reduces sensitivity to subtle artifacts and weakens cross-domain generalization. To address these issues, a detection framework named F-BiFPN-MTLNet is proposed. The framework is designed to achieve high detection accuracy and strong generalization by introducing an explicit forgery-attention-guided multi-scale feature fusion mechanism and a multi-task learning strategy. This research strengthens the interpretability and robustness of deepfake detection models, particularly in real-world settings where forgery methods are diverse and continuously changing.  Methods  The proposed F-BiFPN-MTLNet contains two components: a Forgery-attention-guided Bidirectional Feature Pyramid Network (F-BiFPN) and a Multi-Task Learning Network (MTLNet). The F-BiFPN (Fig. 1) is designed to provide explicit guidance for fusing multi-scale feature representations from different backbone layers. Instead of using simple top-down and bottom-up fusion, a forgery-attention map is applied to supervise the fusion process. This map highlights potential manipulation regions and assigns adaptive weights to each feature level, ensuring that both semantic and spatial details are retained and redundant information is reduced. This attention-guided fusion strengthens the sensitivity of the network to fine-grained forged traces and improves the quality of the resulting representations.  Results and Discussions  Experiments are conducted on multiple benchmark datasets, including FaceForensics++, DFDC, and Celeb-DF (Table 1). 
The proposed F-BiFPN-MTLNet shows consistent gains over state-of-the-art methods in both Area Under the Curve (AUC) and Average Precision (AP) metrics (Table 2). The findings show that attention-guided fusion strengthens the detection of subtle manipulations, and the multi-task learning structure stabilizes performance across different forgery types. Ablation analyses (Table 3) confirm the complementary effects of the two modules. Removing F-BiFPN reduces sensitivity to local artifacts, whereas omitting the self-consistency branch reduces robustness under cross-dataset evaluation. Visualization results (Fig. 3) show that F-BiFPN-MTLNet consistently focuses on forged regions and produces interpretable attention maps that align with actual manipulation areas. The framework achieves a balanced improvement in accuracy, generalization, and transparency, while maintaining computational efficiency suitable for practical forensic applications.  Conclusions  In this study, a forgery-attention-guided weighted bidirectional feature pyramid network combined with a multi-task learning framework is proposed for robust and interpretable deepfake detection. The F-BiFPN provides explicit supervision for multi-scale feature fusion through forgery-attention maps, reducing redundancy and emphasizing informative regions. The MTLNet introduces a learnable mask branch and a self-consistency branch, jointly strengthening localization accuracy and cross-domain robustness. Experimental results show that the proposed model exceeds existing baselines in AUC and AP metrics while retaining strong interpretability through visualized attention maps. Overall, F-BiFPN-MTLNet achieves a balanced improvement in fine-grained localization, detection reliability, and generalization ability. Its explicit attention and multi-task strategies offer a new direction for developing interpretable and resilient deepfake detection systems. 
Future work will examine extending the framework to weakly supervised and unsupervised settings, reducing dependence on pixel-level annotations, and exploring adversarial training strategies to strengthen adaptability against evolving forgery methods.
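As a rough illustration of attention-guided multi-scale fusion, the sketch below weights each feature level by its (already resized) forgery-attention map and combines levels with softmax-normalized scalar weights; this weighting scheme is a simplified stand-in, not the paper's exact F-BiFPN design:

```python
import numpy as np

def attention_weighted_fusion(features, attn_maps):
    """Fuse same-shaped feature levels under attention-map guidance.

    Each level is masked by its attention map, and a scalar weight per level
    (softmax over the maps' mean activation) controls the combination, so
    levels whose maps flag more suspicious area contribute more.
    """
    scores = np.array([a.mean() for a in attn_maps])
    w = np.exp(scores - scores.max())
    w = w / w.sum()                              # softmax over levels
    return sum(wi * (f * a) for wi, f, a in zip(w, features, attn_maps))
```

The explicit map supplies the supervision signal that implicit attention lacks: fusion is steered toward regions the forgery-attention branch marks as manipulated.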
Continuous Federation of Noise-resistant Heterogeneous Medical Dialogue Using the Trustworthiness-based Evaluation
LIU Yupeng, ZHANG Jiang, TANG Shichen, MENG Xin, MENG Qingfeng
Available online  , doi: 10.11999/JEIT250057
Abstract:
  Objective   To address the key challenges of client model heterogeneity, data distribution heterogeneity, and text noise in medical dialogue federated learning, this paper proposes a trustworthiness-based, noise-resistant heterogeneous medical dialogue federated learning method, termed FedRH. FedRH enhances robustness by improving the objective function, aggregation strategy, and local update process, among other components, based on credibility evaluation.  Methods   Model training is divided into a local training stage and a heterogeneous federated learning stage. During local training, text noise is mitigated using a symmetric cross-entropy loss function, which reduces the risk of overfitting to noisy text. In the heterogeneous federated learning stage, an adaptive aggregation mechanism incorporates clean, noisy, and heterogeneous client texts by evaluating their quality. Local parameter updates consider both local and global parameters simultaneously, enabling continuous adaptive updates that improve resistance to both random and structured (syntax/semantic) noise and model heterogeneity. The main contributions are threefold: (1) A local noise-resistant training strategy that uses symmetric cross-entropy loss to prevent overfitting to noisy text during local training; (2) A heterogeneous federated learning approach based on client trustworthiness, which evaluates each client’s text quality and learning effectiveness to compute trust scores. These scores are used to adaptively weight clients during model aggregation, thereby reducing the influence of low-quality data while accounting for text heterogeneity; (3) A local continuous adaptive aggregation mechanism, which allows the local model to integrate fine-grained global model information. This approach reduces the adverse effects of global model bias caused by heterogeneous and noisy text on local updates.  
  Results and Discussions   The effectiveness of the proposed model is systematically validated through extensive, multi-dimensional experiments. The results indicate that FedRH achieves substantial improvements over existing methods in noisy and heterogeneous federated learning scenarios (Table 2, Table 3). The study also presents training process curves for both heterogeneous models (Fig. 3) and homogeneous models (Fig. 6), supplemented by parameter sensitivity analysis, ablation experiments, and a case study.  Conclusions   The proposed FedRH framework significantly enhances the robustness of federated learning for medical dialogue tasks in the presence of heterogeneous and noisy text. The main conclusions are as follows: (1) Compared with baseline methods, FedRH achieves superior performance in client-side models under heterogeneous and noisy text conditions. It demonstrates improvements across multiple metrics, including precision, recall, and factual consistency, and converges more rapidly during training. (2) Ablation experiments confirm that both the symmetric cross-entropy-based local training strategy and the credibility-weighted heterogeneous aggregation approach contribute to performance gains.
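The symmetric cross-entropy objective used in the local noise-resistant training stage combines standard cross-entropy with a reverse term whose log(0) is clipped; the alpha/beta weights and the clipping constant below are illustrative defaults, not values reported in the paper:

```python
import numpy as np

def symmetric_cross_entropy(probs, onehot, alpha=1.0, beta=1.0, clip=1e-4):
    """SCE = alpha * CE(labels, preds) + beta * reverse CE(preds, labels).

    The reverse term bounds the penalty contributed by (possibly mislabeled)
    noisy targets, which is what curbs overfitting to noisy text.
    """
    p = np.clip(probs, clip, 1.0)    # predicted class probabilities
    y = np.clip(onehot, clip, 1.0)   # one-hot labels, with log(0) clipped
    ce = -(onehot * np.log(p)).sum(axis=-1).mean()
    rce = -(probs * np.log(y)).sum(axis=-1).mean()
    return alpha * ce + beta * rce
```

Because the reverse term is bounded by the clip value, a confidently wrong label cannot drive the loss (and hence the gradient) arbitrarily large.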
Research on an EEG-based Neurofeedback System for the Auxiliary Intervention of Post-Traumatic Stress Disorder
TAN Lize, DING Peng, WANG Fan, LI Na, GONG Anmin, NAN Wenya, LI Tianwen, ZHAO Lei, FU Yunfa
Available online  , doi: 10.11999/JEIT250093
Abstract:
  Objective  The ElectroEncephaloGram (EEG)-based Neurofeedback Regulation (ENR) system is designed for real-time modulation of dysregulated stress responses to reduce symptoms of Post-Traumatic Stress Disorder (PTSD) and anxiety. This study evaluates the system’s effectiveness and applicability using a series of neurofeedback paradigms tailored for both PTSD patients and healthy participants.  Methods  Employing real-time EEG monitoring and feedback, the ENR system targets the regulation of alpha wave activity to alleviate mental health symptoms associated with dysregulated stress responses. The system integrates MATLAB and Unity3D to support a complete workflow for EEG data acquisition, processing, storage, and visual feedback. Experimental validation includes both PTSD patients and healthy participants to assess the system’s effects on neuroplasticity and emotional regulation. Primary assessment indices include changes in alpha wave dynamics and self-reported reductions in stress and anxiety.  Results and Discussions  Compared with conventional therapeutic methods, the ENR system shows significant potential in reducing symptoms of PTSD and anxiety. During functionality tests, the system effectively captures and regulates alpha wave activity, enabling real-time and efficient neurofeedback. Dynamic adjustment of feedback thresholds and task paradigms allows participants to improve stress responses and emotional states following training. Quantitative data indicate clear enhancements in EEG pattern modulation, while qualitative assessments reflect improvements in participants’ self-reported stress and anxiety levels.  Conclusions  This study presents an effective and practical EEG-based neurofeedback regulation system that proves applicable and beneficial for both individuals with PTSD and healthy participants. 
The successful implementation of the system provides a new technological approach for mental health interventions and supports ongoing personalized neuroregulation strategies. Future research should explore broader applications of the system across neurological conditions to fully assess its efficacy and scalability.
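The feedback index targeted by the system, alpha-wave activity, can be illustrated with a minimal relative band-power estimate; the 8–12 Hz band edges and the single-channel FFT estimator are assumptions for this sketch rather than the system's exact MATLAB pipeline:

```python
import numpy as np

def alpha_band_power(eeg, fs, band=(8.0, 12.0)):
    """Relative alpha-band power of one EEG channel sampled at fs Hz.

    Computes a periodogram via the real FFT and returns the fraction of
    total power falling inside the alpha band; a rising value would drive
    the visual feedback toward 'success'.
    """
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(eeg)) ** 2          # periodogram (unnormalized)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].sum() / psd.sum()
```

In a closed-loop session this index would be recomputed on short sliding windows and compared against the dynamically adjusted feedback threshold mentioned above.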
Design of a CNN Accelerator Based on Systolic Array Collaboration with Inter-Layer Fusion
LU Di, WANG Zhen Fa
Available online  , doi: 10.11999/JEIT250867
Abstract:
  Objective  With the rapid deployment of deep learning in edge computing, the demand for efficient Convolutional Neural Network (CNN) accelerators has become increasingly urgent. Although traditional CPUs and GPUs provide strong computational power, they suffer from high power consumption, large latency, and limited scalability in real-time embedded scenarios. FPGA-based accelerators, owing to their reconfigurability and parallelism, present a promising alternative. However, existing implementations often face challenges such as low resource utilization, memory access bottlenecks, and difficulties in balancing throughput with energy efficiency. To address these issues, this paper proposes a systolic array–based CNN accelerator with layer-fusion optimization, combined with an enhanced memory hierarchy and computation scheduling strategy. By designing hardware-oriented convolution mapping methods and employing lightweight quantization schemes, the proposed accelerator achieves improved computational efficiency and reduced resource consumption while meeting real-time inference requirements, making it suitable for complex application scenarios such as intelligent surveillance and autonomous driving.  Methods  This paper addresses the critical challenges commonly observed in FPGA-based CNN accelerators, including data transfer bottlenecks, insufficient resource utilization, and low processing unit efficiency. A hybrid CNN accelerator architecture based on systolic array–assisted layer fusion is proposed, in which computation-intensive adjacent layers are deeply bound and executed consecutively within the same systolic array. This design reduces frequent off-chip memory access of intermediate results, decreases data transfer overhead and power consumption, and improves both computation speed and overall energy efficiency. 
A dynamically reconfigurable systolic array method is further developed to provide hardware-level adaptability for multi-dimensional matrix multiplications, thereby avoiding the resource waste of deploying dedicated hardware for different computation scales, reducing overall FPGA logic resource consumption, and enhancing adaptability and flexibility of hardware resources. In addition, a streaming systolic array computation scheme is introduced through carefully orchestrated computation flow and control logic, ensuring that processing elements within the systolic array remain in a high-efficiency working state. Data continuously flows through the computation engine in a highly pipelined and parallelized manner, improving the utilization of internal processing units, reducing idle cycles, and ultimately enhancing overall throughput.  Results and Discussions  To explore the optimal quantization precision of neural network models, experiments were conducted on the MNIST dataset using two representative architectures, VGG16 and ResNet50, under fixed-point quantization with 12-bit, 10-bit, 8-bit, and 6-bit precision. The results, as shown in Table 1, indicate that when the quantization bit width falls below 8 bits, model inference accuracy drops significantly, suggesting that excessively low precision severely compromises the representational capacity of the model. On the proposed accelerator architecture, VGG16, ResNet50, and YOLOv8n achieved peak computational performances of 390.25 GOPS, 360.27 GOPS, and 348.08 GOPS, respectively. To comprehensively evaluate the performance advantages of the proposed accelerator, comparisons were made with FPGA accelerator designs reported in existing literature, as summarized in Table 4. Table 5 further presents a comparison of the proposed accelerator with conventional CPU and GPU platforms in terms of performance and energy efficiency. 
During the acceleration of VGG16, ResNet50, and YOLOv8n, the proposed accelerator achieved computational throughput that was 1.76×, 3.99×, and 2.61× higher than that of the corresponding CPU platforms, demonstrating significant performance improvements unattainable by general-purpose processors. Moreover, in terms of energy efficiency, the proposed accelerator achieved improvements of 3.1× (VGG16), 2.64× (ResNet50), and 2.96× (YOLOv8n) compared with GPU platforms, highlighting its superior energy utilization efficiency.  Conclusions  This paper proposes a systolic array–assisted layer-fusion CNN accelerator architecture. First, a theoretical analysis of the accelerator’s computational density is conducted, demonstrating the performance advantages of the proposed design. Second, to address the design challenge arising from the variability in local convolution window sizes of the second layer, a novel dynamically reconfigurable systolic array method is introduced. Furthermore, to enhance the overall computational efficiency, a streaming systolic array scheme is developed, in which data continuously flows through the computation engine in a highly pipelined and parallelized manner. This design reduces idle cycles within the systolic array and improves the overall throughput of the accelerator. Experimental results show that the proposed accelerator achieves high throughput with minimal loss in inference accuracy. Specifically, peak performance levels of 390.25 GOPS, 360.27 GOPS, and 348.08 GOPS were attained for VGG16, ResNet50, and YOLOv8n, respectively. Compared with traditional CPU and GPU platforms, the proposed design exhibits superior energy efficiency, demonstrating that the accelerator architecture is particularly well-suited for resource-constrained and energy-sensitive application scenarios such as edge computing.
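The fixed-point quantization study above (12/10/8/6 bits) can be illustrated with a symmetric per-tensor scheme; the max-abs scale choice and function name are assumptions for this sketch, not the paper's exact quantizer:

```python
import numpy as np

def quantize_fixed_point(w, bits):
    """Quantize weights w to signed fixed point with the given bit width.

    Returns the dequantized tensor plus the scale; within range, rounding
    error is bounded by half a quantization step (scale / 2).
    """
    qmax = 2 ** (bits - 1) - 1               # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax           # max-abs per-tensor scale
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale, scale
```

Halving the bit width doubles the step size twice over, which is consistent with the reported behavior that accuracy drops sharply below 8 bits.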
Research on A Miniaturized Wide Stopband Folded Substrate Integrated Waveguide Filter
KE Rongjie, WANG Hongbin, CHENG Yujian
Available online  , doi: 10.11999/JEIT250869
Abstract:
To meet the requirements of 5G/6G communication systems for miniaturization, high integration, and wide stopband, a fourth-order bandpass filter based on the eighth-mode folded substrate integrated waveguide (FSIW) is proposed in this paper using high-temperature co-fired ceramic (HTCC) technology. The miniaturization advantages of folded SIW are combined with the three-dimensional integration characteristics of HTCC in this filter. Size reduction is achieved through the eighth-mode FSIW cavity structure, with a size of 0.29λg × 0.29λg, where λg is the waveguide wavelength corresponding to its center operating frequency (f0). High-order mode coupling is suppressed using metal vias, transmission zeros are introduced by loading a bent microstrip line, and the high-frequency response is optimized through the addition of an L-shaped stub. As a result, three controllable transmission zeros are formed in the upper stopband, with a wide stopband characteristic of 20 dB@3.73f0 achieved. The measured results show that the filter has a center frequency of 6.4 GHz. Although it exhibits a certain amount of frequency deviation and insertion loss, compared with existing research, it demonstrates distinct advantages in terms of miniaturization, stopband width, and the number of transmission zeros, holding promising potential for applications in high-density integrated communication systems.  Objective  With the rapid advancement of 5G/6G communication systems, there is an urgent demand for radio frequency (RF) microwave devices that combine miniaturization, high integration, and wide stopband performance. As a core component of RF transceiver front-ends, bandpass filters play a critical role in transmitting useful signals and suppressing interference. 
Traditional substrate integrated waveguide (SIW) filters suffer from limitations such as large size, restricted stopband extension, and insufficient controllability of transmission zeros, which hinder their application in high-density integrated communication systems. To address these challenges, this paper proposes a miniaturized wide stopband fourth-order bandpass filter based on eighth-mode folded substrate integrated waveguide (FSIW) and high-temperature co-fired ceramic (HTCC) technology, which aims to achieve a balance between compact size and broad stopband.  Methods  The proposed filter integrates the miniaturization advantage of folded SIW with the three-dimensional integration capability of HTCC technology. First, an eighth-mode FSIW cavity structure is designed by modifying the quarter-mode FSIW cavity. The square patch is replaced with a triangular patch (i.e., eighth-mode cavity I), and slots are further etched in the triangular patch (i.e., eighth-mode cavity Ⅱ). Second, a fourth-order bandpass filter is constructed by symmetrically designing two triangular metal patches for each cavity type and stacking them vertically, with a common metal layer (fifth layer) featuring coupling windows to enable coupling between upper and lower cavities. To optimize performance, three key techniques are employed: metal vias to suppress high-order mode coupling, bent microstrip lines to introduce transmission zeros, and an L-shaped stub to improve high-frequency response. Finally, parameter scanning analysis is conducted on critical dimensions (d2, s4, s6) to verify the controllability of transmission zeros, and the filter is fabricated using HTCC technology, employing the Al2O3 substrate with relative permittivity 9.8 and loss tangent 0.0002.  Results and Discussions  The measured results indicate that the proposed filter achieves a center frequency of 6.4 GHz. 
Although processing and assembly errors lead to slight frequency deviation and increased insertion loss, the filter exhibits outstanding performance compared with existing designs (Table 2). In terms of miniaturization, the filter achieves a size of 0.29λg × 0.29λg, significantly smaller than that of most reported SIW filters. Regarding stopband performance, the upper stopband extends to 20 dB@3.73f0, which is superior to that of filters with comparable sizes. Three controllable transmission zeros are generated in the upper stopband, with parameter scanning verifying their flexibility (Fig. 13).  Conclusions  A miniaturized wide stopband fourth-order bandpass filter based on eighth-mode FSIW is successfully designed in this paper. The key achievements are summarized as follows. First, the eighth-mode FSIW cavity structure combined with HTCC technology delivers a compact size of 0.29λg × 0.29λg, meeting the high integration demands of 5G/6G systems. Second, the integration of metal vias, bent microstrip lines, and L-shaped stubs realizes a wide stopband of 20 dB@3.73f0 along with three controllable transmission zeros, which enhances interference suppression capability. Additionally, parameter adjustment allows for flexible tuning of transmission zero positions without influencing the passband, improving design flexibility for different interference conditions. Overall, these innovations collectively overcome the challenges of miniaturization, stopband performance, and design flexibility in SIW filters, offering a practical and competitive solution for the RF front-ends of next-generation high-density integrated communication systems.
A Multi-scale Spatiotemporal Correlation Attention and State Space Modeling-based Approach for Precipitation Nowcasting
ZHENG Hui, CHEN Fu, HE Shuping, QIU Xuexing, ZHU Hongfang, WANG Shaohua
Available online  , doi: 10.11999/JEIT250786
Abstract:
  Objective  Precipitation nowcasting, one of the most representative tasks in meteorological forecasting, uses radar echoes or precipitation sequences to predict the precipitation distribution over the next 0-2 hours, supporting disaster warning and key decision-making and helping to protect lives and property. Current mainstream methods suffer from loss of local details, inadequate representation of conditional information, and insufficient adaptability to complex terrain. Therefore, this paper proposes the PredUMamba model based on the diffusion model. On the one hand, a Mamba block with an adaptive zigzag scanning mechanism is introduced, which fully mines key local detail information while effectively reducing computational complexity. On the other hand, a multi-scale spatio-temporal correlation attention module is designed to enhance the interaction of spatio-temporal hierarchical features while achieving a comprehensive representation of conditional information. In addition, a radar echo dataset from the southern Anhui mountainous area is constructed specifically for precipitation nowcasting in complex terrain, to validate the model’s ability to accurately predict sudden, extreme rainfall events in such areas. This research provides a new intelligent solution and theoretical support for precipitation nowcasting.  Methods  The PredUMamba model adopts a two-stage diffusion network. In the first stage, a frame-by-frame Variational AutoEncoder (VAE) is trained to map precipitation data from pixel space to a low-dimensional latent space. In the second stage, a diffusion network is constructed on the VAE-encoded latent space. 
In the diffusion network, this paper proposes an adaptive zigzag Mamba module, which adopts a spatio-temporally alternating adaptive zigzag scanning strategy: sequential scanning is performed within the rows of the data block and turn-back scanning between rows, effectively capturing the detailed features of the precipitation field while maintaining low computational complexity. In addition, this paper designs a multi-scale spatio-temporal correlation attention module on both temporal and spatial scales. On the temporal scale, adaptive convolution kernels and convolution layers containing attention mechanisms are used to capture local and global information. On the spatial scale, a lightweight correlation attention is designed to aggregate spatial information, thus enhancing the ability to mine historical conditional information. Finally, this paper constructs a radar dataset for the southern Anhui mountainous area for the precipitation nowcasting task in complex terrain, which helps to verify the adaptability of the PredUMamba model and other models in the field to complex terrain areas.  Results and Discussions  By designing the adaptive zigzag Mamba module and the multi-scale spatio-temporal correlation attention module, the PredUMamba model better mines the intrinsic joint spatio-temporal structure of the data, captures the characteristics of the conditional information more accurately, and produces predictions that better match real observations. Experimental results show that the PredUMamba model achieves the best performance on all indicators on the southern Anhui mountainous area and Shanghai radar datasets. On the SEVIR dataset, FVD, CSI_pool4, and CSI_pool16 are all superior to other methods, while the CSI and CRPS also achieve very competitive results. In addition, visualization of the prediction results shows that PredUMamba’s predictions do not blur over time (Fig. 
4), indicating higher stability; the model also shows clear advantages in detail generation and overall motion trend capture, generating edge details aligned with real precipitation conditions while maintaining accurate motion pattern predictions.  Conclusions  This paper proposes an innovative PredUMamba model based on a diffusion network architecture. The model significantly improves performance by introducing the Mamba module with an adaptive zigzag scanning mechanism and the multi-scale spatio-temporal correlation attention module. The adaptive zigzag scanning Mamba module effectively captures the fine-grained spatio-temporal characteristics of precipitation data through a scanning strategy that alternates between time and space, while reducing computational complexity. The multi-scale spatio-temporal correlation attention module enhances the mining of historical conditional information through a dual-branch network in the time dimension and a lightweight correlation attention mechanism in the spatial dimension, realizing a joint representation of local and global features. To verify the applicability of the model in complex terrain areas, this paper also constructs a radar dataset for the southern Anhui mountainous area. This dataset covers precipitation information under various terrain conditions and provides important support for extreme precipitation prediction in complex terrain areas. In addition, this study conducts comparative experiments on the constructed dataset and several public datasets in the field. The experimental results show that the PredUMamba model achieves the best results on all indicators on the southern Anhui mountainous area and Shanghai radar datasets. On the SEVIR dataset, FVD, CSI_pool4, and CSI_pool16 all outperform other methods, and the CRPS and CSI also achieve very competitive results. 
However, this study is designed around a purely data-driven intelligent forecasting method; future work will focus on combining physical constraint information to improve the interpretability of the model and to further optimize the prediction accuracy of small- and medium-scale convective systems.
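As an aside, the row-wise turn-back ("zigzag") scan order applied within a data block can be sketched in a few lines. This is an illustrative reconstruction of the scanning pattern only, not the paper's Mamba implementation, and the function name is ours:

```python
# Illustrative reconstruction of a row-wise turn-back ("zigzag") scan:
# cells are visited left-to-right on even rows and right-to-left on odd
# rows, so consecutive visits are always spatially adjacent.
def zigzag_scan_order(rows: int, cols: int) -> list[tuple[int, int]]:
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order
```

Flattening a block in this order keeps neighboring cells next to each other in the resulting sequence, which is what allows a sequence model such as Mamba to see local detail without extra passes.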
Radio Map Enabled Path Planning for Multiple Cellular-Connected Unmanned Aerial Vehicles
ZHOU Decheng, WANG Wei, SHAO Xiang, CHEN Mei, XIAO Jianghao
Available online  , doi: 10.11999/JEIT250821
Abstract:
  Objective  In collaborative operation scenarios of cellular-connected Unmanned Aerial Vehicles (UAVs), conflict avoidance strategies often result in unbalanced service quality. Traditional schemes typically focus on minimizing the total task completion time, failing to ensure service fairness. To address this, a radio map-assisted cooperative path planning scheme is proposed. The primary objective is to minimize the maximum weighted sum of task completion time and communication disconnection time across all UAVs, thereby ensuring balanced service quality in multi-UAV scenarios.  Methods  A Signal-to-Interference-Plus-Noise Ratio (SINR) map is constructed to evaluate communication quality. The 2D airspace is discretized into grids, and link gain maps are generated via ray-tracing and Axis-Aligned Bounding Box detection to determine Line-of-Sight (LoS) or Non-Line-of-Sight (NLoS) link conditions. The SINR map is then derived by selecting the base station with the maximum expected SINR for each grid. To solve the optimization problem, an Improved Conflict-Based Search (ICBS) algorithm with a hierarchical structure is developed. At the high-level stage, proximity conflicts are managed to maintain safety distances, and the cost function is reconstructed to prioritize fairness by minimizing the maximum weighted time. The low-level stage employs a bidirectional A* algorithm for single UAV path planning, utilizing parallel search mechanisms to accelerate the process while adhering to the constraints from the high-level stage.  Results and Discussions  The effectiveness of the proposed scheme is verified through simulations across various scenarios. The building height and position distribution are illustrated, where base station locations are marked by red stars and building heights are represented by color gradients from light to dark indicating increasing height (Fig. 2). The complex wireless propagation environment between UAVs and ground base stations is revealed by the constructed SINR map at an altitude of 60 m (Fig. 3), showing significant SINR degradation in specific areas caused by building blockage and co-channel interference, which leads to the formation of communication blind zones. Trajectory planning results for four UAVs at an altitude of 60 m with a SINR threshold of 2 dB demonstrate that all UAVs effectively avoid signal blind zones and complete tasks without collision risks under the proposed scheme (Fig. 4). A trade-off between task completion time and communication disconnection time is demonstrated by the weight coefficient (Fig. 5). A monotonically increasing trend is observed in the maximum weighted time as the weight coefficient increases, whereas the maximum disconnection time decreases significantly. Superior computational efficiency is exhibited by the bidirectional A* algorithm compared to Dijkstra’s and traditional A* algorithms, while optimal solution quality is maintained (Table 1). Identical weighted times are achieved by all three algorithms, confirming the optimality of the bidirectional A* approach, yet the runtime is significantly reduced due to the bidirectional parallel search mechanism. The proposed scheme is compared with three different benchmark schemes, achieving the lowest maximum weighted time across various SINR thresholds (Fig. 6). Performance analysis across different UAV altitudes indicates that a stable maximum weighted time is maintained by the proposed scheme below 75 m, while sharp increases are observed above this height due to intensified interference from non-serving base stations (Fig. 7). Furthermore, the scalability analysis demonstrates significant improvements of the proposed scheme over benchmark schemes, particularly when conflicts become more frequent (Fig. 8).
Conclusions  To address the fairness issue in cellular-connected multi-UAV systems, a radio map-assisted path planning scheme is proposed to minimize the maximum weighted time. Based on a constructed discretized SINR map, an Improved Conflict-Based Search (ICBS) algorithm is developed. At the high-level stage, proximity conflicts and a reconstructed cost function are introduced to ensure safety and fairness, while a bidirectional A* algorithm is employed at the low-level stage to accelerate search efficiency. Simulation results indicate that the proposed scheme effectively reduces the maximum weighted time compared to benchmark schemes, significantly enhancing the fairness and overall performance of multi-UAV collaboration.
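The per-grid serving-station rule from the Methods (pick the base station with the maximum expected SINR and treat the rest as co-channel interferers) can be sketched as follows. Gains, transmit power, and noise here are toy values, not the paper's ray-traced link gain maps:

```python
# Hedged sketch: given linear link gains from every base station to every
# grid cell, compute each cell's best achievable SINR when the strongest
# candidate serves and all other stations act as co-channel interferers.
import numpy as np

def sinr_map(gains: np.ndarray, tx_power: float, noise: float) -> np.ndarray:
    """gains: (n_cells, n_bs) linear link gains. Returns per-cell max SINR (linear)."""
    rx = tx_power * gains                               # received power per station
    interference = rx.sum(axis=1, keepdims=True) - rx   # every non-serving station
    return (rx / (interference + noise)).max(axis=1)    # serve from the best station
```

Cells whose maximum SINR falls below the operating threshold (e.g. the 2 dB used in the simulations) would then be marked as blind zones for the path planner.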
Robust Adaptive Beamforming for Sparse Arrays
FAN Xuhui, WANG Yuyi, WANG Anyi, XU Yanhong, CUI Can
Available online  , doi: 10.11999/JEIT250952
Abstract:
  Objective  The rapid development of modern communication technologies, such as 5G networks and Internet of Things (IoT) applications, increases the complexity of signal processing in wireless communication and radar systems. Adaptive beamforming is widely used because it extracts the signal of interest in the presence of interference and noise. Traditional robust adaptive beamforming methods address steering vector mismatch, which may result from environmental nonstationarity, Direction-Of-Arrival (DOA) estimation errors, imperfect array calibration, antenna deformation, and local scattering. However, they do not leverage the advantages of the Sparse Array (SA), which reduces hardware complexity and system cost. They also often fail to suppress SideLobe Levels (SLLs) under interference conditions, limiting their effectiveness in complex electromagnetic environments. To address these issues, a robust adaptive beamforming algorithm is proposed that incorporates SA and low-SLL constraints.  Methods  Unlike conventional sparse approaches that place the l0-norm penalty in the objective function, the proposed method introduces the l0 norm into the constraint. This formulation ensures that the optimized array configuration meets the pre-specified number of active sensors and avoids the uncertainty associated with sparse-weight tuning in multi-objective optimization models. In addition to the sparsity constraint, an SLL suppression constraint is incorporated to impose an upper bound on the array response in interference and clutter directions. By integrating these constraints into a unified optimization framework, the method achieves a robust Minimum Variance Distortionless Response (MVDR) beamforming scheme that exhibits sparsity, adaptivity, and robustness. To address the nonconvexity of the formulated optimization problem, a convex relaxation strategy is used to convert the nonconvex constraint into a convex one.
Based on this formulation, robust adaptive beamforming methods are developed to generate a sparse weight solution from a Uniform Linear Array (ULA). Although the method is derived from a ULA, the sparse weight solution provides practical advantages. By assigning zero weights to selected sensors, the number of active elements is reduced, lowering hardware cost and computational burden while preserving desirable beamforming performance. The main contribution of this work lies in establishing a unified framework that enables collaborative optimization of robustness, beam performance, SLL, and array sparsity.  Results and Discussions  A series of simulation experiments were conducted to evaluate the performance of the proposed sparse robust beamforming algorithm under multiple scenarios, including multi-interference environments, steering vector mismatch, Angle-Of-Arrival (AOA) mismatch, low Signal-to-Noise Ratio (SNR) conditions, and complex electromagnetic environments based on practical antenna arrays. The results show that the algorithm maintains stable mainlobe gain in the desired signal direction while forming deep nulls in interference directions. First, in the presence of steering vector mismatch, conventional MVDR beamformers often exhibit reduced mainlobe gain or beam pointing deviation, which compromises desired-signal reception. By contrast, the proposed method maintains a stable, distortionless mainlobe direction under mismatch conditions, ensuring high gain in the desired signal direction (Fig. 2(a), Fig. 3(a)). Second, with the introduction of an SLL constraint, clutter is suppressed effectively and peak SLLs are reduced markedly (Fig. 2(b)). Third, under low-SNR conditions, the method shows strong noise resistance. 
Even in heavily noise-contaminated scenarios, it maintains effective interference suppression and achieves high output Signal-to-Interference-plus-Noise Ratio (SINR), demonstrating adaptability to weak-target detection and cluttered environments. Moreover, the optimized SA configuration achieves beamforming performance close to that of a ULA while activating only part of the sensors (Fig. 2). Finally, experimental validation using real antenna arrays further confirms the method’s effectiveness (Fig. 3). Stable performance is maintained, and high gain is achieved in the desired direction even under AOA estimation mismatch (Fig. 4). Overall, the results indicate that the proposed method enhances robustness and hardware efficiency and provides reliable performance in complex electromagnetic environments.  Conclusions  A robust adaptive beamforming algorithm for sparse arrays is proposed. The central innovation is the construction of a joint optimization model that integrates array sparsity, robustness to steering vector mismatch, and low SLL constraints within a unified framework. Compared with approaches such as MVDR, which emphasizes interference suppression, Covariance Matrix Reconstruction (CMR), which enhances robustness, and Non-Adjacent Constrained Sparsity (NACS), which achieves array sparsity, the proposed method attains a balanced improvement across these dimensions. Simulation results show that in scenarios featuring steering vector errors, AOA estimation mismatches, and low-SNR conditions, the method maintains satisfactory beamforming performance with reduced hardware cost, demonstrating strong practical engineering utility and application potential.
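For reference, the MVDR beamformer that the proposed constraints are built around has the standard closed form w = R^{-1} a / (a^H R^{-1} a). The sketch below shows only this textbook core, without the paper's sparsity or sidelobe machinery:

```python
# Textbook MVDR weights: minimize output power w^H R w subject to the
# distortionless constraint w^H a = 1 toward steering vector a.
import numpy as np

def mvdr_weights(R: np.ndarray, a: np.ndarray) -> np.ndarray:
    Ra = np.linalg.solve(R, a)      # R^{-1} a without forming an explicit inverse
    return Ra / (a.conj() @ Ra)     # normalize so the look direction is undistorted
```

In the proposed framework this core is augmented with an l0 constraint on the number of active sensors and upper bounds on the sidelobe response, which is what yields the sparse weight solutions reported above.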
Physical Layer Authentication for Large Language Models in Maritime Communications
CHEN Qiaoxin, XIAO Liang, WANG Pengcheng, LI Jieling, YAO Jinqing, XU Xiaoyu
Available online  , doi: 10.11999/JEIT250804
Abstract:
  Objective  PHYsical (PHY)-layer authentication exploits channel state information to detect spoofing attacks. However, for smart ocean applications supported by Large Language Models (LLMs), authentication accuracy and speed remain limited because of insufficient channel estimation and rapidly time-varying channels in short-packet communications with constrained preamble length. An environment perception-aware PHY-layer authentication scheme is therefore proposed for LLM edge inference in maritime applications. A hypothesis-testing-based multi-mode authentication framework is designed to evaluate channel state information and packet arrival interval. Application types and environmental indicators inferred by the LLM are used in reinforcement learning to optimize the authentication mode and test threshold, thereby improving authentication accuracy and speed.  Methods  An environment perception-aware PHY-layer authentication scheme is developed for LLM edge inference in maritime wireless networks. Hypothesis-testing-based multi-mode authentication is used to jointly evaluate channel state information and packet arrival interval for spoofing detection. Reinforcement learning is adopted to optimize the authentication mode and test threshold according to application types and environmental indicators inferred by a multimodal LLM fed with images and prompts. A multi-level policy risk function is formulated to quantify miss-detection risk and to reduce exploration probability for unsafe policies. A Benna-Fusi synapse-based continual learning mechanism is proposed to obtain multi-scale optimization experience across multiple maritime scenarios, such as deck and cabin environments, and to replay identical cases to accelerate policy optimization.  Results and Discussions  Simulations are conducted using four legal devices and a shipborne server with maritime channel data collected in the Xiamen Pearl Harbor area. 
A spoofing attacker moving at 1.5 m/s transmits false data packets to the server with a maximum power of 100 mW. The results demonstrate clear performance gains over benchmark methods. Compared with RLPA, the proposed scheme achieves an 84.2% reduction in false alarm rate and an 82.3% reduction in miss-detection rate. These gains are attributed to the use of LLM-derived environmental indicators and a safe exploration mechanism that avoids high-risk authentication policies leading to increased miss detection.  Conclusions  A PHY-layer authentication scheme is proposed for LLM-enabled intelligent maritime wireless networks, in which both the authentication mode and test threshold are optimized to counter spoofing attacks. By jointly using LLM-derived environmental indicators, channel state information, and packet arrival interval, a safe exploration mechanism is applied to improve authentication accuracy and efficiency. Simulation results confirm that the proposed scheme reduces the false alarm rate by 84.2% and the miss-detection rate by 82.3% compared with the benchmark RLPA.
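As a heavily simplified illustration of the multi-mode test described above, a packet can be flagged when the relative deviation of the channel gain, the packet arrival interval, or either (depending on the mode) from the legitimate reference exceeds the test threshold. The features, modes, and threshold below are placeholders; in the paper the mode and threshold are selected by reinforcement learning:

```python
# Toy stand-in for the hypothesis-testing-based multi-mode authentication:
# compare relative deviations of channel gain and arrival interval against
# a threshold tau. Values and mode names here are illustrative only.
def is_spoofed(h_obs: float, h_ref: float, dt_obs: float, dt_ref: float,
               mode: str, tau: float) -> bool:
    ch = abs(h_obs - h_ref) / max(abs(h_ref), 1e-12)   # channel-gain deviation
    iv = abs(dt_obs - dt_ref) / max(dt_ref, 1e-12)     # arrival-interval deviation
    stat = {"channel": ch, "interval": iv}.get(mode, max(ch, iv))
    return stat > tau
```

Raising tau lowers the false alarm rate at the cost of miss detection, which is the trade-off the learned policy navigates.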
An Interpretable Vulnerability Detection Method Based on Graph and Code Slicing
GAO Wenchao, SUO Jianhua, ZHANG Ao
Available online  , doi: 10.11999/JEIT250363
Abstract:
  Objective   Deep learning technology is widely applied to source code vulnerability detection. Existing approaches are mainly sequence-based or graph-based. Sequence-based models convert structured code into linear sequences, which causes loss of syntactic and structural information and often results in a high false-positive rate. Graph-based models capture structural features but cannot represent execution order, and their detection granularity is usually limited to the function level. Both types of methods lack interpretability, which limits the ability of developers to locate vulnerability sources. Large Language Models (LLMs) show progress in code understanding, yet they still present high computational cost, hallucination risk in security analysis, and insufficient modeling of complex program logic. To address these issues, an interpretable vulnerability detection method based on graphs and code slicing (GSVD) is proposed. The method integrates structural semantics and sequential information and provides fine-grained, line-level explanations for prediction results.  Methods   The proposed method contains four modules: code graph feature extraction, code sequence feature extraction, feature fusion, and an interpreter module (Fig. 1). First, the source code is normalized, and the Joern static analysis tool is used to generate multiple code graphs, including the Abstract Syntax Tree (AST), Data Dependency Graph (DDG), and Control Dependency Graph (CDG). These graphs represent program structure, data flow, and control flow. Node features are initialized using CodeBERT embeddings combined with one-hot encodings of node types. With the adjacency matrix of each graph, a Gated Graph Convolutional Network (GGCN) with a self-attention pooling layer is applied to extract deep structural semantic features. A code slicing algorithm based on taint analysis (Algorithm 1) is then designed.
In this algorithm, taint sources are identified, and taints are propagated according to data and control dependencies to generate concise slices related to potential vulnerabilities. These slices remove unrelated code and are processed using a Bidirectional Long Short-Term Memory (BiLSTM) network to capture long-range sequential dependencies. After extracting graph and sequence features, a gating mechanism is introduced for feature fusion. The fused feature vectors are passed through a Gated Recurrent Unit (GRU), which learns dependencies between structural and sequential representations through its dynamic state updates. To support both vulnerability detection and localization, a VDExplainer is designed. Inspired by the HITS algorithm, it iteratively computes node “authority” and “hub” scores under an edge-mask constraint to estimate node importance and provide node-level interpretability for vulnerability explanation.  Results and Discussions  The effectiveness of the GSVD model is evaluated through comparative experiments on the Devign dataset (FFmpeg + Qemu), as shown in (Table 2). GSVD is benchmarked against several baseline models and achieves the highest accuracy and F1-score, reaching 64.57% and 61.89%, respectively. The recall improves to 62.63%, indicating that the method enhances vulnerability detection and reduces missed reports. To assess the effectiveness of the GRU-based fusion module, three fusion strategies were compared: feature concatenation, weighted sum, and attention mechanism (Table 3). GSVD delivers the best overall performance. Although its precision (61.17%) is slightly lower than the weighted sum method (63.33%), its combined accuracy, recall, and F1-score show more consistent performance. Ablation studies (Tables 4–5) further highlight the contribution of the slicing algorithm. 
The taint propagation–based slicing method reduces the average number of code lines from 51.98 to 17.30, a 66.72% reduction, and lowers the data redundancy rate to 6.42%. In contrast, VulDeePecker and SySeVR report 19.58% and 22.10%, respectively. This reduction in noise leads to a 1.53% gain in F1-score, confirming that the slicing module enhances focus on critical code segments. The interpretability of GSVD is validated on the Big-Vul dataset using the VDExplainer module (Table 6). Compared with the standard GNNExplainer, the proposed method achieves higher localization accuracy at all evaluation thresholds. When 50% of the nodes are selected, localization accuracy improves by 7.65%, highlighting the advantage of VDExplainer in node-level vulnerability explanation. In summary, GSVD demonstrates strong performance in detection accuracy and interpretability, offering practical support for both vulnerability identification and localization.  Conclusions   The GSVD model addresses the limitations of single-modal methods by integrating graph structures with taint-based code slices. It improves both detection accuracy and interpretability. The VDExplainer enables node-level and line-level localization, enhancing the practical applicability of the model. Experimental findings support the advantages of the proposed method in detection performance and interpretability.
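The forward taint-propagation step that drives the slicing (Algorithm 1 in the paper, which is not reproduced in this abstract) amounts to a reachability walk over data- and control-dependency edges. A minimal sketch with hypothetical line-level dependencies:

```python
# Hedged sketch of taint-based slicing: starting from taint-source lines,
# propagate taint forward along dependency edges and keep only the lines
# reached. The dependency graph here is a toy example, not Joern output.
from collections import deque

def taint_slice(deps: dict[int, set[int]], sources: set[int]) -> set[int]:
    """deps maps a line number to the lines that depend on it (data or control)."""
    tainted, queue = set(sources), deque(sources)
    while queue:
        line = queue.popleft()
        for succ in deps.get(line, ()):
            if succ not in tainted:
                tainted.add(succ)
                queue.append(succ)
    return tainted
```

Lines never reached from any taint source are dropped from the slice, which is what produces the reported reduction from 51.98 to 17.30 lines on average.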
Complete Coverage Path Planning Algorithm Based on Rulkov-like Chaotic Mapping
LIU Sicong, HE Ming, LI Chunbiao, HAN Wei, LIU Chengzhuo, XIA Hengyu
Available online  , doi: 10.11999/JEIT250887
Abstract:
  Objective  This study proposes a Complete Coverage Path Planning (CCPP) algorithm based on a Sine-constrained Rulkov-Like Hyper-Chaotic (SRHC) mapping. The work addresses key challenges in robotic path planning and focuses on improving coverage efficiency, path unpredictability, and obstacle adaptability for mobile robots in complex environments, including disaster rescue, firefighting, and unknown-terrain exploration. Traditional methods often exhibit predictable movement patterns, fall into local optima, and backtrack inefficiently, which motivates an approach that uses chaotic dynamics to strengthen exploration capability.  Methods  The SRHC-CCPP algorithm integrates three components. (1) SRHC mapping: a hyper-chaotic system with nonlinear coupling (Eq. 1) generates highly unpredictable trajectories; Lyapunov exponent analysis (Fig. 3(a), 3(b)), phase-space diagrams (Fig. 1), and parameter-sensitivity studies (Table 1) confirm chaotic behavior under conditions such as a = 0.01 and b = 1.3. (2) Memory-driven exploration: a dynamic visitation grid prioritizes uncovered regions and reduces redundancy (Algorithm 1). (3) Collision detection combined with normal-vector reflection reduces oscillations in cluttered environments (Fig. 4). Simulations employ a Mecanum-wheel robot model (Eq. 2) to provide omnidirectional mobility.  Results and Discussions  (1) Efficiency: SRHC-CCPP achieves faster coverage and improved uniformity in both obstacle-free and obstructed scenarios (Figs. 5 and 6); the chaotic driver increases path diversity by 37% compared with rule-based methods. (2) Robustness: the algorithm demonstrates initial-value sensitivity and adaptability to environmental noise (Table 2). (3) Scalability: its low computational overhead supports deployment in large-scale grids (>10^4 cells).  Conclusions  The SRHC-CCPP algorithm advances robotic path planning by (1) merging hyper-chaotic unpredictability with memory-guided efficiency, which reduces repetitive loops; (2) offering real-time obstacle negotiation through adaptive reflection mechanics; and (3) providing a versatile framework suited to applications that require high coverage reliability and dynamic responsiveness. Future work may examine multi-agent extensions and three-dimensional environments.
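The abstract does not reproduce the SRHC map itself (Eq. 1), so as background the sketch below iterates the classic Rulkov map that the proposed variant constrains: x_{n+1} = alpha/(1 + x_n^2) + y_n, y_{n+1} = y_n - mu*(x_n - sigma). The parameter values are common textbook choices, not the paper's a = 0.01 and b = 1.3:

```python
# Classic Rulkov map iteration (illustrative background only; the paper's
# sine-constrained variant is different). Tuple assignment ensures both
# updates use the previous (x, y) pair.
def rulkov_orbit(x0: float, y0: float, n: int,
                 alpha: float = 4.1, mu: float = 0.001,
                 sigma: float = -1.0) -> list[tuple[float, float]]:
    orbit, x, y = [(x0, y0)], x0, y0
    for _ in range(n):
        x, y = alpha / (1.0 + x * x) + y, y - mu * (x - sigma)
        orbit.append((x, y))
    return orbit
```

In a CCPP setting, successive map outputs would be quantized to grid moves, with the memory grid and reflection rules steering the chaotic trajectory toward uncovered, obstacle-free cells.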
Speaker Verification Based on Tide-Ripple Convolution Neural Network
CHEN Chen, YI Zhixin, LI Dongyuan, CHEN Deyun
Available online  , doi: 10.11999/JEIT250713
Abstract:
  Objective  State-of-the-art speaker verification models typically rely on fixed receptive fields, which limits their ability to represent multi-scale acoustic patterns while increasing parameter counts and computational loads. Speech contains layered temporal–spectral structures, yet the use of dynamic receptive fields to characterize these structures is still not well explored. The design principles for effective dynamic receptive field mechanisms also remain unclear.  Methods  Inspired by the non-linear coupling behavior of tidal surges, a Tide-Ripple Convolution (TR-Conv) layer is proposed to form a more effective receptive field. TR-Conv constructs primary and auxiliary receptive fields within a window by applying power-of-two interpolation. It then employs a scan-pooling mechanism to capture salient information outside the window and an operator mechanism to perceive fine-grained variations within it. The fusion of these components produces a variable receptive field that is multi-scale and dynamic. A Tide-Ripple Convolutional Neural Network (TR-CNN) is developed to validate this design. To mitigate label noise in training datasets, a total loss function is introduced by combining a NoneTarget with Dynamic Normalization (NTDN) loss and a weighted Sub-center AAM Loss variant, improving model robustness and performance.  Results and Discussions  The TR-CNN is evaluated on the VoxCeleb1-O/E/H benchmarks. The results show that TR-CNN achieves a competitive balance of accuracy, computation, and parameter efficiency (Table 1). Compared with the strong ECAPA-TDNN baseline, the TR-CNN (C=512, n=1) model attains relative EER reductions of 4.95%, 4.03%, and 6.03%, and MinDCF reductions of 31.55%, 17.14%, and 17.42% across the three test sets, while requiring 32.7% fewer parameters and 23.5% less computation (Table 2). The optimal TR-CNN (C=1024, n=1) model further improves performance, achieving EERs of 0.85%, 1.10%, and 2.05%. 
Robustness is strengthened by the proposed total loss function, which yields consistent improvements in EER and MinDCF during fine-tuning (Table 3). Additional evaluations, including ablation studies (Tables 5 and 6), component analyses (Fig. 3 and Table 4), and t-SNE visualizations (Fig. 4), confirm the effectiveness and robustness of each module in the TR-CNN architecture.  Conclusions  This research proposes a simple and effective TR-Conv layer built on the T-RRF mechanism. Experimental results show that TR-Conv forms a more expressive and effective receptive field, reducing parameter count and computational cost while exceeding conventional one-dimensional convolution in speech feature modeling. It also exhibits strong lightweight characteristics and scalability. Furthermore, a total loss function combining the NTDN loss and a Sub-center AAM loss variant is proposed to enhance the discriminability and robustness of speaker embeddings, particularly under label noise. TR-Conv shows potential as a general-purpose module for integration into deeper and more complex network architectures.
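For readers unfamiliar with the headline metric, the Equal Error Rate (EER) reported above is the operating point where the false-rejection and false-acceptance rates meet. A minimal threshold-sweep estimate over toy score lists (not VoxCeleb data):

```python
# Simple EER estimate: sweep thresholds over all observed scores and take
# the point where the larger of FRR and FAR is smallest. Toy inputs only.
def equal_error_rate(genuine: list[float], impostor: list[float]) -> float:
    best = 1.0
    for t in sorted(genuine + impostor):
        frr = sum(s < t for s in genuine) / len(genuine)     # rejected targets
        far = sum(s >= t for s in impostor) / len(impostor)  # accepted impostors
        best = min(best, max(frr, far))
    return best
```

Production toolkits interpolate the ROC curve for the exact crossing point; this discrete sweep is enough to convey the idea.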
Architecture and Operational Dynamics for Enabling Symbiosis and Evolution of Network Modalities
ZHANG Huifeng, HU Yuxiang, ZHU Jun, ZOU Tao, HUANGFU Wei, LONG Keping
Available online  , doi: 10.11999/JEIT250949
Abstract:
  Objective  The paradigm shift toward polymorphic networks enables dynamic deployment of diverse network modalities on shared infrastructure but introduces two fundamental challenges. First, symbiosis complexity arises from the absence of formal mechanisms to orchestrate coexistence conditions, intermodal collaboration, and resource efficiency gains among heterogeneous network modalities, which results in inefficient resource use and performance degradation. Second, evolutionary uncertainty stems from the lack of lifecycle-oriented frameworks to govern triggering conditions (e.g., abrupt traffic surges), optimization objectives (service-level agreement compliance and energy efficiency), and transition paths (e.g., seamless migration from IPv6 to GEO-based modalities) during network modality evolution, which constrains adaptive responses to vertical industry demands such as vehicular networks and smart manufacturing. This study aims to establish a theoretical and architectural foundation to address these gaps by proposing a three-plane architecture that supports dynamic coexistence and evolution of polymorphic networks with deterministic service-level agreement guarantees.  Methods  The architecture decouples network operation into four domains: (1) The business domain dynamically clusters services using machine learning according to quality-of-service requirements. (2) The modal domain generates specialized network modalities through software-defined interfaces. (3) The function domain enables baseline capability pooling by atomizing network functions into reusable components. (4) The resource domain supports fine-grained resource scheduling through elementization techniques. The core innovation lies in three synergistic planes: (1) The evolutionary decision plane applies predictive analytics for adaptive selection and optimization of network modalities. (2) The intelligent generation plane orchestrates modality deployment with global resource awareness. 
(3) The symbiosis platform plane dynamically composes baseline capabilities to support modality coexistence.  Results and Discussions  The proposed architecture advances beyond conventional approaches by avoiding virtualization overhead through native deployment of network modalities directly on polymorphic network elements. Resource elementization and capability pooling jointly support efficient cross-modality resource sharing. Closed-loop interactions among the decision, generation, and symbiosis planes enable autonomous network evolution that adapts to time-varying service demands under unified control objectives.  Conclusions  A theoretically grounded framework is presented to support dynamic symbiosis of heterogeneous network modalities on shared infrastructure through business-driven decision mechanisms and autonomous evolution. The architecture provides a scalable foundation for future systems that integrate artificial intelligence. Future work will extend this paradigm to integrated 6G satellite-terrestrial scenarios, where spatial-temporal resource complementarity is expected to play a central role.
Optimization of Short Packet Communication Resources for UAV Assisted Power Inspection
CHU Hang, DONG Zhihao, CAO Jie, SHI Huaifeng, ZENG Haiyong, ZHU Xu
Available online  , doi: 10.11999/JEIT250852
Abstract:
  Objective  In Unmanned Aerial Vehicle (UAV)–assisted power grid inspection, the real-time acquisition and transmission of multi-modal data (key parameters, images, and videos) are essential for secure grid operation. These tasks require heterogeneous communication conditions, including ultra-reliable low-latency transmission and high-bandwidth data delivery. The limited wireless communication resources and UAV energy constraints restrict the ability to meet these conditions and reduce data timeliness and task performance. The present study is designed to establish a collaborative optimization framework for transmission scheduling and communication resource allocation, ensuring minimal system overhead while meeting task performance and reliability requirements.  Methods  To address the challenges mentioned above, a collaborative optimization framework is established for data transmission scheduling and communication resource allocation. Data transmission scheduling is formulated as a Markov Decision Process (MDP), in which communication consumption is incorporated into the decision cost. At the resource allocation level, Non-Orthogonal Multiple Access (NOMA) technology is applied to increase spectral efficiency. This approach reduces communication cost, maintains transmission reliability, and supports heterogeneous data transmission requirements in UAV-assisted power inspection.  Results and Discussions  The effectiveness of the proposed framework is verified through comprehensive simulations. A scenario is established in which the UAV is required to collect data from multiple distributed power towers within a designated area. A trade-off is observed between reliability and transmission speed (Fig. 3). At the same transmission rate, the bit error rate is reduced by approximately one order of magnitude. 
When a minimum long-packet signal-to-noise ratio threshold of 7 dB is applied, the optimized transmission system reduces the bit error rate from the 10^(-3) level to the 10^(-5) level while requiring only about a 0.4 Mbps decrease in transmission rate. After algorithm optimization, a lower effective signal-to-noise ratio is needed to achieve the same bit error rate; under the same signal-to-noise ratio, the short-packet error performance is improved, indicating more stable system behavior and higher transmission efficiency (Fig. 4).  Conclusions  This study presents a collaborative optimization framework that addresses the challenges posed by limited communication resources and heterogeneous data transmission requirements in UAV power inspection. By integrating MDP-based adaptive scheduling with NOMA-based joint resource allocation, the framework maintains an appropriate balance between communication performance and system overhead. The findings provide a theoretical and practical foundation for efficient, low-cost, and reliable data transmission in future intelligent autonomous aerial systems.
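For background on the short-packet regime these results concern, the maximal coding rate at finite blocklength is commonly estimated with the normal approximation R ≈ C - sqrt(V/n)·Q^{-1}(eps)·log2(e) (Polyanskiy et al.), where C = log2(1 + SNR) and the channel dispersion is V = 1 - (1 + SNR)^{-2}. The sketch below uses this standard formula with illustrative numbers, not the paper's system model:

```python
# Normal-approximation rate for short-packet transmission over an AWGN-like
# link: capacity minus a blocklength- and reliability-dependent backoff.
import math
from statistics import NormalDist

def short_packet_rate(snr: float, n: int, eps: float) -> float:
    """Bits per channel use at linear SNR, blocklength n, block error prob eps."""
    capacity = math.log2(1.0 + snr)
    dispersion = 1.0 - 1.0 / (1.0 + snr) ** 2
    q_inv = NormalDist().inv_cdf(1.0 - eps)    # Gaussian Q^{-1}(eps)
    return capacity - math.sqrt(dispersion / n) * q_inv * math.log2(math.e)
```

The backoff term shrinks as the blocklength grows, which mirrors the rate-versus-reliability trade-off observed in the simulations.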
A Reliable Service Chain Option for Global Migration of Intelligent Twins in Vehicular Metaverses
QIU Xianyi, WEN Jinbo, KANG Jiawen, ZHANG Tao, CAI Chengjun, LIU Jiqiang, XIAO Ming
Available online  , doi: 10.11999/JEIT250612
Abstract:
  Objective   As an emerging paradigm that integrates metaverses with intelligent transportation systems, vehicular metaverses are becoming a driving force in the transformation of the automotive industry. Within this context, intelligent twins act as digital counterparts of vehicles, covering their entire lifecycle and managing vehicular applications to provide immersive services. However, seamless migration of intelligent twins across RoadSide Units (RSUs) faces challenges such as excessive transmission delays and data leakage, particularly under cybersecurity threats like Distributed Denial of Service (DDoS) attacks. To address these issues, this paper proposes a globally optimized scheme for secure and dynamic intelligent twin migration based on RSU chains. The proposed approach mitigates transmission latency and enhances network security, ensuring that intelligent twins can be migrated reliably and securely through RSU chains even in the presence of multiple types of DDoS attacks.  Methods   A set of reliable RSU chains is first constructed using a communication interruption–free mechanism, which enables the rational deployment of intelligent twins for seamless RSU connectivity. This mechanism ensures continuous communication by dynamically reconfiguring RSU chains according to real-time network conditions and vehicle mobility. The secure migration of intelligent twins along these RSU chains is then formulated as a Partially Observable Markov Decision Process (POMDP). The POMDP framework incorporates dynamic network state variables, including RSU load, available bandwidth, computational capacity, and attack type. These variables are continuously monitored to support decision-making. Migration efficiency and security are evaluated based on total migration delay and the number of DDoS attacks encountered; these metrics serve as reward functions for optimization. 
Deep Reinforcement Learning (DRL) agents iteratively learn from their interactions with the environment, refining RSU chain selection strategies to maximize both security and efficiency. Through this algorithm, the proposed scheme mitigates excessive transmission delays caused by network attacks in vehicular metaverses, ensuring reliable and secure intelligent twin migration even under diverse DDoS attack scenarios.  Results and Discussions   The proposed secure dynamic intelligent twin migration scheme employs a Multi-Agent Deep Reinforcement Learning (MADRL) framework to select efficient and secure RSU chains within the POMDP. By defining a suitable reward function, the efficiency and security of intelligent twin migration are evaluated under varying RSU chain lengths and different attack scenarios. Simulation results confirm that the scheme enhances migration security in vehicular metaverses. Shorter RSU chains yield lower migration delays than longer ones, owing to reduced handovers and lower communication overhead (Fig. 2). Additionally, the total reward reaches its maximum when the RSU chain length is 6 (Fig. 3). The Multi-Agent Deep Q-Network (MADQN) approach exhibits strong defense capabilities against DDoS attacks. Under direct attacks, MADQN achieves final rewards that are 65.3% and 51.8% higher than those obtained by random and greedy strategies, respectively. Against indirect attacks, MADQN improves performance by 9.3%. Under hybrid attack conditions, MADQN increases the final reward by 29% and 30.9% compared with the random and greedy strategies, respectively (Fig. 4), demonstrating the effectiveness of the DRL-based defense strategy in handling complex and dynamic threats. Experimental comparisons with other DRL algorithms, including PPO, A2C, and QR-DQN, further highlight the superiority of MADQN under direct, indirect, and hybrid DDoS attacks (Figs. 5–7). 
Overall, the proposed scheme ensures reliable and efficient intelligent twin migration across RSUs even under diverse security threats, thereby supporting high-quality interactions in vehicular metaverses.  Conclusions   This study addresses the challenge of secure and efficient global migration of intelligent twins in vehicular metaverses by integrating RSU chains with a POMDP-based optimization framework. Using the MADQN algorithm, the proposed scheme improves both the efficiency and security of intelligent twin migration under diverse network conditions and attack scenarios. Simulation results confirm significant gains in performance. Along identical driving routes, shorter RSU chains provide higher migration efficiency and stronger defense capabilities. Under various types of DDoS attacks, MADQN consistently outperforms baseline strategies, achieving higher final rewards than random and greedy approaches across all scenarios. Compared with other DRL algorithms, MADQN increases the final reward by up to 50.1%, demonstrating superior adaptability in complex attack environments. Future work will focus on enhancing the communication security of RSU chains, including the development of authentication mechanisms to ensure that only authorized vehicles can access RSU edge communication networks.
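The reward trade-off described above, penalizing total migration delay and the number of DDoS attacks encountered along an RSU chain, can be sketched in a minimal form. This is an illustrative stand-in, not the paper's reward function: the weights `alpha` and `beta` and the helper `chain_reward` are hypothetical.

```python
# Hypothetical sketch of the reward trade-off: an RSU chain's reward penalizes
# total migration delay and the number of DDoS attacks encountered on the chain.
# Weights alpha and beta are illustrative, not the paper's values.

def chain_reward(hop_delays, attacks_on_chain, alpha=1.0, beta=5.0):
    """Return a scalar reward for migrating a twin along one RSU chain."""
    total_delay = sum(hop_delays)  # per-hop handover + transmission delay
    return -(alpha * total_delay + beta * attacks_on_chain)

# A shorter chain with fewer attacked RSUs scores higher, matching the reported
# trend that shorter chains yield lower migration delays and higher rewards.
short_chain = chain_reward([0.8, 0.9], attacks_on_chain=0)
long_chain = chain_reward([0.8, 0.9, 1.1, 1.0], attacks_on_chain=2)
assert short_chain > long_chain
```

A MADRL agent would then select, at each decision step, the chain whose estimated cumulative reward under this kind of objective is largest.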
Quality Map-guided Fidelity Compression Method for High-energy Regions of Spectral Data
LIU Xiangli, LI Zan, CHEN Yifeng, CHEN Le
Available online  , doi: 10.11999/JEIT250650
Abstract:
  Objective  In the context of intelligent evolution in communication and radar technologies, inefficiency in Radio Frequency (RF) data compression represents a critical bottleneck that restricts transmission bandwidth expansion and system energy efficiency improvement. Conventional compression methods fail to balance compression ratio and reconstruction accuracy in complex scenarios characterized by non-uniform energy distribution. This study aims to address fidelity compression of spectral data with non-uniform energy distribution by developing a quality map-guided method that preserves high-energy regions and improves the adaptability of RF signal processing in complex environments.  Methods  A quality map-guided fidelity compression method is proposed. A three-dimensional energy mask is constructed to dynamically guide the encoder and enhance features in high-energy regions. Multi-level complex convolution and inverted residual connections are adopted for efficient feature extraction and reconstruction. The quality map is derived from local energy and amplitude variations of RF signals by fusing energy proportion and variation as structured prior information. A rate–distortion joint optimization loss function is designed by integrating weighted mean squared error, complex correlation loss, and phase difference loss, with learnable parameters used to balance competing objectives (Fig. 1). The compression network follows an encoder–decoder framework that incorporates quality map extractors, deep encoders and decoders, and entropy coding. Complex convolution, residual spatial feature transformation for multi-scale and high-frequency feature preservation, and gated normalization for low-energy noise suppression are employed (Figs. 2–6).  Results and Discussions  Experiments conducted on the public dataset RML2018.01a demonstrate the superiority of the proposed method. 
Reconstruction accuracy: Visual comparisons of real and imaginary components and amplitude spectra show strong overlap between reconstructed and original signals (Figs. 7–8), with reconstruction errors mainly concentrated in low-energy regions. The Peak Signal-to-Noise Ratio (PSNR) remains ≥35 dB across the tested –4 to 20 dB Signal-to-Noise Ratio (SNR) range, confirming robust performance even under extremely low signal-to-noise conditions (Fig. 9). Ablation experiments: Removal of the quality map guidance mechanism results in significant reconstruction errors in high-energy regions, reflected by lower PSNR, higher Mean Relative Error (MRE), and reduced correlation coefficients compared with the complete method (Fig. 9). These results confirm the critical role of the quality map in preserving high-energy features. Comparative analysis: Relative to conventional methods, including LFZip and CORAD, the proposed method achieves superior performance at –4 dB SNR, with higher PSNR (35.75 dB vs. ≤29.45 dB), lower MRE (6.91% vs. ≥8.45%), and stronger correlation coefficients (0.898 vs. ≤0.832), at the expense of a slightly lower compression ratio (Table 1). Self-built dataset validation: To evaluate adaptability to practical complex scenarios, supplementary experiments are performed using a MATLAB-simulated dataset (Table 3) comprising five modulation schemes (BPSK, QPSK, 8PSK, 16QAM, and 64QAM), an additive white Gaussian noise plus Rayleigh fading channel, SNRs from –4 to 20 dB with a 6 dB step, 25,000 samples, and an 8:1:1 data split. Under fading channels, the proposed method continues to outperform baseline methods at –4 dB SNR, achieving a PSNR of 34.61 dB (vs. 28.46/27.88 dB), MRE of 7.53% (vs. 9.00%/9.38%), and a correlation coefficient of 0.885 (vs. 0.821/0.808; Table 3), with optimal rate–distortion performance observed across all compression ratios (Fig. 11). 
The slight performance degradation relative to RML2018.01a is attributed to Rayleigh fading-induced energy dispersion. Consistent superiority across datasets confirms strong robustness to non-uniform energy distribution and complex channel characteristics in practical applications.  Conclusions  A quality map-guided fidelity compression method for frequency-domain RF data is presented to address challenges caused by non-uniform energy distribution. High-energy region features are effectively preserved through dynamic feature enhancement and multi-dimensional loss optimization. Experimental results demonstrate advantages in reconstruction accuracy and noise resistance, providing a viable framework for high-fidelity compression of complex RF signals in communication and radar systems. Future work will extend the method to real-time processing scenarios and incorporate physical-layer constraints to further enhance practical applicability.
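The three distortion terms named in the Methods section (quality-map-weighted mean squared error, complex correlation loss, and phase difference loss) can be sketched as follows. This is an illustrative guess at their shapes, not the authors' exact loss: the fixed weight tuple `w`, the rate penalty `lam`, and the function names are all hypothetical, and the paper learns its balancing weights rather than fixing them.

```python
import numpy as np

# Illustrative sketch (not the paper's exact formulation) of a rate-distortion
# joint loss for complex spectral data: x is the original, y the reconstruction,
# and q a quality map in [0, 1] that emphasizes high-energy regions.

def weighted_mse(x, y, q):
    return np.mean(q * np.abs(x - y) ** 2)

def complex_correlation_loss(x, y):
    num = np.abs(np.vdot(x, y))
    den = np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
    return 1.0 - num / den  # approaches 0 when x and y are fully correlated

def phase_difference_loss(x, y):
    return np.mean(np.abs(np.angle(x) - np.angle(y)))

def rd_loss(x, y, q, w=(1.0, 0.1, 0.1), rate=0.0, lam=0.01):
    # The paper uses learnable weights; fixed values are used here for clarity.
    distortion = (w[0] * weighted_mse(x, y, q)
                  + w[1] * complex_correlation_loss(x, y)
                  + w[2] * phase_difference_loss(x, y))
    return distortion + lam * rate  # joint rate-distortion objective
```

A perfect reconstruction drives all three distortion terms to (near) zero, while errors in high-energy regions are amplified by the quality map `q`, which is the mechanism the method relies on to preserve those regions.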
Performance Analysis of Spatial-Reference-Signal-Based Digital Interference Cancellation Systems
XIN Yedi, HE Fangmin, GE Songhu, XING Jinling, GUO Yu, CUI Zhongpu
Available online  , doi: 10.11999/JEIT250679
Abstract:
  Objective  With the rapid development of wireless communications, an increasing number of transceivers are deployed on platforms with limited spatial and spectral resources. Restrictions in frequency and spatial isolation cause high-power local transmitters to couple signals into nearby high-sensitivity receivers, resulting in co-site interference. Interference cancellation serves as an effective mitigation technique, whose performance depends on precise acquisition of a reference signal representing the interference waveform. Compared with digital sampling, Radio Frequency (RF) sampling enables simpler implementation. However, existing RF-based approaches are generally restricted to low-power communication systems. In high-power RF systems, RF sampling faces critical challenges, including excessive sampling power loss and high integration complexity. Therefore, developing new sampling methods and cancellation architectures suitable for high-power RF systems is of substantial theoretical and practical value.  Methods  To overcome the limitations of conventional high-power RF interference sampling methods based on couplers, a spatial-reference-based digital cancellation architecture is proposed. A directional sampling antenna and its associated link are positioned near the transmitter to acquire the reference signal. This configuration, however, introduces spatial noise, link noise, and possible multipath effects, which can degrade cancellation performance. A system model is developed, and closed-form expressions for the cancellation ratio under multipath conditions are derived. The validity of these expressions is verified through Monte Carlo simulations using three representative modulated signals. Furthermore, a systematic analysis is conducted to evaluate the effects of key system parameters on cancellation performance.  
  Results and Discussions  Based on the proposed spatial-reference-based digital cancellation architecture, analytical expressions for the cancellation ratio are derived and validated through extensive simulations. These expressions enable systematic evaluation of the key performance factors. For three representative modulation schemes, the cancellation ratio shows excellent consistency between theoretical predictions and simulation results under various conditions, including receiver and sampling channel Interference-to-Noise Ratios (INRs), time-delay mismatch errors, and filter tap numbers (Figs. 2–4). The established theoretical framework is further applied to analyze the effects of system parameters. Simulations quantitatively assess (1) the influence of filter tap number, multipath delay spread, and the number of multipaths on cancellation performance in multipath environments (Figs. 5–7), and (2) the upper performance bounds and contour characteristics under different INR combinations in the receiver and sampling channels (Figs. 8–9).  Conclusions  To reduce the high deployment complexity and substantial insertion loss associated with coupler-based RF interference sampling in high-power systems, a digital interference cancellation architecture based on spatial reference signals is proposed. Closed-form expressions and performance bounds for the cancellation ratio of rectangular band-limited interference under multipath conditions are derived. Simulation results demonstrate that the proposed expressions provide high accuracy in representative scenarios. Based on the analytical findings, the effects of key parameters are examined, including INRs in receiver and sampling channels, filter tap length, multipath delay spread, number of paths, and time-delay mismatch. The results provide practical insights that support the design and optimization of spatial reference–based digital interference cancellation systems.
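The cancellation ratio studied above, interference power before cancellation over residual power after, can be illustrated with a minimal numerical sketch. The setup is assumed, not taken from the paper: a noiseless reference, a three-path channel, and least-squares tap estimation stand in for the paper's analytical model and its derived closed-form expressions.

```python
import numpy as np

# Minimal sketch (assumed setup, not the paper's derivation): the sampled
# reference signal is filtered by n_taps least-squares taps to cancel the
# multipath interference at the receiver; the Cancellation Ratio (CR) is the
# interference power before cancellation over the residual power after.

def cancellation_ratio_db(interf, ref, n_taps=8):
    # Build a tapped-delay-line regressor matrix from the reference signal.
    R = np.column_stack([np.roll(ref, k) for k in range(n_taps)])
    w, *_ = np.linalg.lstsq(R, interf, rcond=None)  # optimal (Wiener-like) taps
    residual = interf - R @ w
    return 10 * np.log10(np.mean(interf**2) / np.mean(residual**2))

rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)                       # noiseless reference
interf = np.convolve(ref, [1.0, 0.4, 0.1])[:len(ref)]  # 3-path interference
print(f"CR = {cancellation_ratio_db(interf, ref):.1f} dB")
```

With a clean reference the taps absorb the multipath channel almost exactly, so the CR is high; adding noise to `ref` or truncating `n_taps` below the channel length degrades it, mirroring the INR and tap-number dependencies analyzed in the paper.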
Flexible Network Modal Packet Processing Pipeline Construction Mechanism for Cloud-Network Convergence Environment
ZHU Jun, XU Qi, ZHANG Fujun, WANG Yongjie, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250806
Abstract:
  Objective  With the deep integration of information network technologies and vertical application domains, the demand for cloud–network convergence infrastructure becomes increasingly significant, and the boundaries between cloud computing and network technologies are gradually fading. The advancement of cloud–network convergence technologies gives rise to diverse network service requirements, creating new challenges for the flexible processing of multimodal network packets. A device-level mechanism for constructing flexible network modal packet processing pipelines is essential for realizing an integrated environment that supports multiple network technologies. This mechanism establishes a flexible protocol packet processing pipeline architecture that customizes a sequence of operations such as packet parsing, packet editing, and packet forwarding according to different network modalities and service demands. By enabling dynamic configuration and adjustment of the processing flow, the proposed design enhances network adaptability and meets both functional and performance requirements across heterogeneous transmission scenarios.  Methods  Constructing a device-level flexible pipeline faces two primary challenges: (1) it must flexibly process diverse network modal packet protocols across polymorphic network element devices. This requires coordination of heterogeneous resources to enable rapid identification, accurate parsing, and correct handling of packets in various formats; (2) the pipeline construction must remain flexible, offering a mechanism to dynamically generate and configure pipeline structures that can adjust not only the number of stages but also the specific functions of each stage. To address these challenges, this study proposes a polymorphic network element abstraction model that integrates heterogeneous resources. 
The model adopts a hyper-converged hardware architecture that combines high-performance switching ASIC chips with more programmable but less computationally powerful FPGA and CPU devices. The coordinated operation of hardware and software ensures unified and flexible support for custom network protocols. Building upon the abstraction model, a protocol packet flexible processing compilation mechanism is designed to construct a configurable pipeline architecture that meets diverse network service transmission requirements. This mechanism adopts a three-stage compilation structure consisting of front-end, mid-end, and back-end processes. In response to adaptation issues between heterogeneous resources and differentiated network modal demands, a flexible pipeline technology based on Intermediate Representation (IR) slicing is further proposed. This technology decomposes and reconstructs the integrated IR of multiple network modalities into several IR subsets according to specific optimization methods, preserving original functionality and semantics. By applying the IR slicing algorithm, the mechanism decomposes and maps the hybrid processing logic of multimodal networks onto heterogeneous hardware resources, including ASICs, FPGAs, and CPUs. This process enables flexible customization of network modal processing pipelines and supports adaptive pipeline construction for different transmission scenarios.  Results and Discussions  To demonstrate the construction effectiveness of the proposed flexible pipeline, a prototype verification system for polymorphic network elements is developed. As shown in Fig. 6, the system is equipped with Centec CTC8180 switch chips, multiple domestic FPGA chips, and domestic multi-core CPU chips. On this polymorphic network element prototype platform, protocol processing pipelines for IPv4, GEO, and MF network modalities are constructed, compiled, and deployed. As illustrated in Fig. 
7, packet capture tests verify that different network modalities operate through distinct packet processing pipelines. To further validate the core mechanism of network modal flexible pipeline construction, the IR code size before and after slicing is compared across the three network modalities and allocation strategies described in Section 6.2. The integrated P4 code for the three modalities, after front-end compilation, produces an unsliced intermediate code containing 32,717 lines. During middle-end compilation, slicing is performed according to the modal allocation scheme, generating IR subsets for ASIC, CPU, and FPGA with code sizes of 23,164, 23,282, and 22,772 lines, respectively. The performance of multimodal protocol packet processing is then assessed, focusing on the effects of different traffic allocation strategies on network protocol processing performance. As shown in Fig. 9, the average packet processing delay in Scheme 1 is significantly higher than in the other schemes, reaching 4.237 milliseconds. In contrast, the average forwarding delays in Schemes 2, 3, and 4 decrease to 54.16 microseconds, 32.63 microseconds, and 15.48 microseconds, respectively. These results demonstrate that adjusting the traffic allocation strategy, particularly the distribution of CPU resources for GEO and MF modalities, effectively mitigates processing bottlenecks and markedly improves the efficiency of multimodal network communication.  Conclusions  Experimental evaluations verify the superiority of the proposed flexible pipeline in construction effectiveness and functional capability. The results indicate that the method effectively addresses complex network environments and diverse service demands, demonstrating stable and high performance. Future work focuses on further optimizing the architecture and expanding its applicability to provide more robust and flexible technical support for protocol packet processing in hyper-converged cloud–network environments.
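The core idea of IR slicing described above, decomposing a combined intermediate representation into per-target subsets according to a modality-to-hardware allocation, can be sketched in a few lines. The modality names, instruction tuples, and allocation map below are hypothetical examples, not the paper's actual IR format or its slicing algorithm.

```python
# Illustrative sketch of IR slicing: a combined Intermediate Representation (IR)
# is split into per-target subsets according to a modality-to-hardware
# allocation, preserving each instruction's modality tag and ordering.
# The instruction format and allocation map are hypothetical examples.

def slice_ir(ir_instructions, allocation):
    """ir_instructions: list of (modality, op) pairs; allocation: modality -> target."""
    subsets = {"ASIC": [], "FPGA": [], "CPU": []}
    for modality, op in ir_instructions:
        subsets[allocation[modality]].append((modality, op))
    return subsets

ir = [("IPv4", "parse_eth"), ("IPv4", "lpm_lookup"),
      ("GEO", "parse_geo_hdr"), ("MF", "mf_table_match")]
allocation = {"IPv4": "ASIC", "GEO": "CPU", "MF": "FPGA"}
subsets = slice_ir(ir, allocation)
assert len(subsets["ASIC"]) == 2 and len(subsets["CPU"]) == 1
```

Changing the allocation map corresponds to the traffic allocation schemes compared in the experiments: shifting a modality from the CPU to the ASIC or FPGA shortens its per-packet processing path, which is why the measured forwarding delay varies so strongly across Schemes 1 through 4.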
Ultra-Low-Power IM3 Backscatter Passive Sensing System for IoT Applications
HUANG Ruiyang, WU Pengde
Available online  , doi: 10.11999/JEIT250787
Abstract:
  Objective  With advances in wireless communication and electronic manufacturing, the Internet of Things (IoT) continues to expand across healthcare, agriculture, logistics, and other sectors. The rapid increase in IoT devices creates significant energy challenges, as billions of units generate substantial cumulative consumption, and battery-powered nodes require recurrent charging that raises operating costs and contributes to electronic waste. Energy-efficient strategies are therefore needed to support sustainable IoT deployment. Current approaches focus on improving energy availability and lowering device power demand. Energy Harvesting (EH) technology enables the collection and storage of solar, thermal, kinetic, and Radio Frequency (RF) energy for Ambient IoT (AmIoT) applications. However, conventional IoT devices, particularly those containing active RF components, often require high power, and limited EH efficiency can constrain real-time sensing transmission. To address these constraints, this work proposes a Third-Order Intermodulation (IM3) backscatter passive sensing system that enables direct analog sensing transmission while maintaining RF EH efficiency.  Methods  The IM3 signal is a nonlinear distortion product generated when two fundamental tones pass through nonlinear devices such as transistors and diodes, producing components at 2f1 − f2 and 2f2 − f1. The central contribution of this work is the establishment of a controllable functional relationship between sensor information and IM3 signal frequencies, enabling information encoding through IM3 frequency characteristics. The regulatory element is an embedded impedance module designed as a parallel resonant tank composed of resistors, inductors, and capacitors and integrated into the rectifier circuit. 
Adjusting the tank’s resonant frequency regulates the conversion efficiency from the fundamental tones to IM3 components: when the resonant frequency approaches a target IM3 frequency, a high-impedance load is produced, lowering the conversion efficiency of that specific IM3 component while leaving other IM3 components unchanged. Sensor information modulates the resonant frequency by generating a DC voltage applied to a voltage-controlled varactor. By mapping sensor information to impedance states, impedance states to IM3 conversion efficiency, and IM3 frequency features back to sensor information, passive sensing is achieved.  Results and Discussions  A rectifying transmitter operating in the UHF 900 MHz band is designed and fabricated (Fig. 8). One signal source is fixed at 910.5 MHz, and the other scans 917~920 MHz, generating IM3 components in the 923.5~929.5 MHz range. Both sources provide an output power of 0 dBm, and the transmitted sensor information is expressed as a DC voltage. Experimental measurements show a power trough in the backscattered IM3 spectrum; as the DC voltage varies from 0 to 5 V, the trough position shifts accordingly (Fig. 9), with more than 10 dB attenuation across the range, giving adequate resolution determined by the varactor diode’s capacitance ratio. The embedded impedance module shows minimal effect on RF-to-DC efficiency (Fig. 10): at a fixed DC voltage, efficiency decreases by approximately 5 percentage points at the modulation frequency, independent of input power, and under fixed input power, different sampled voltages cause about 5 percentage points of efficiency reduction at different frequencies. These results confirm that the rectifier circuit maintains stable efficiency and meets low-power data transmission requirements.  Conclusions  This paper proposes a passive sensing system based on backscattered IM3 signals that enables simultaneous efficient RF EH and sensing readout. 
The regulation mechanism between the difference-frequency embedded impedance module and backscattered IM3 intensity is demonstrated. Driven by sensing information, the module links the sensed quantity to IM3 intensity to realize passive readout. Experimental results show that the embedded impedance reduces the target-frequency IM3 component by more than 10 dB, and the RF-to-DC efficiency decreases by only 5 percentage points during readout. Tests in a microwave anechoic chamber indicate that the error between the IM3-derived bias voltage and the measured value remains within 5%, confirming stable operation. The system addresses the energy-information transmission constraint and supports battery-free communication for passive sensor nodes. It extends device lifespan and reduces maintenance costs in Ultra-Low-Power scenarios such as wireless sensor networks and implantable medical devices, offering strong engineering relevance.
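The reported IM3 band follows directly from the upper intermodulation relation 2f2 − f1. A quick arithmetic check (the helper name `im3_upper` is just for illustration):

```python
# Check of the IM3 frequency relation used above: with f1 fixed at 910.5 MHz
# and f2 swept over 917-920 MHz, the upper third-order intermodulation product
# 2*f2 - f1 spans 923.5-929.5 MHz, matching the band reported in the experiment.

def im3_upper(f1_mhz, f2_mhz):
    return 2 * f2_mhz - f1_mhz  # upper IM3 product, in MHz

f1 = 910.5
assert im3_upper(f1, 917.0) == 923.5  # lower edge of the reported band
assert im3_upper(f1, 920.0) == 929.5  # upper edge of the reported band
```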
MCL-PhishNet: A Multi-Modal Contrastive Learning Network for Phishing URL Detection
DONG Qingwei, FU Xueting, ZHANG Benkui
Available online  , doi: 10.11999/JEIT250758
Abstract:
  Objective  The growing complexity and rapid evolution of phishing attacks present challenges to traditional detection methods, including feature redundancy, multi-modal mismatch, and limited robustness to adversarial samples.  Methods  MCL-PhishNet is proposed as a Multi-Modal Contrastive Learning framework that achieves precise phishing URL detection through a hierarchical syntactic encoder, bidirectional cross-modal attention mechanisms, and curriculum contrastive learning strategies. In this framework, multi-scale residual convolutions and Transformers jointly model local grammatical patterns and global dependency relationships of URLs, whereas a 17-dimensional statistical feature set improves robustness to adversarial samples. The dynamic contrastive learning mechanism optimizes the feature-space distribution through online spectral-clustering-based semantic subspace partitioning and boundary-margin constraints.  Results and Discussions  This study demonstrates consistent performance across different datasets (EBUU17 accuracy 99.41%, PhishStorm 99.41%, Kaggle 99.30%), validating the generalization capability of MCL-PhishNet. The three datasets differ significantly in sample distribution, attack types, and feature dimensions, yet the method in this study maintains stable high performance, indicating that the multimodal contrastive learning framework has strong cross-scenario adaptability. Compared to methods optimized for specific datasets, this approach avoids overfitting to particular dataset distributions through end-to-end learning and an adaptive feature fusion mechanism.  Conclusions  This paper addresses the core challenges in phishing URL detection, such as the difficulty of dynamic syntax pattern modeling, multimodal feature mismatches, and insufficient adversarial robustness, and proposes a multimodal contrastive learning framework, MCL-PhishNet. 
Through a collaborative mechanism of hierarchical syntax encoding, dynamic semantic distillation, and curriculum optimization, it achieves 99.41% accuracy and a 99.65% F1 score on datasets such as EBUU17 and PhishStorm, improving on existing state-of-the-art methods by 0.27%~3.76%. Experiments show that this approach effectively captures local variation patterns in URLs (such as numeric substitution attacks in ‘payp41-log1n.com’) through a residual convolution-Transformer collaborative architecture and reduces the false detection rate of path-sensitive parameters to 0.07% via a bidirectional cross-modal attention mechanism. However, the proposed framework has relatively high complexity. Although the hierarchical encoding module of MCL-PhishNet (including multi-scale CNNs, Transformers, and gated networks) improves detection accuracy, it also increases the number of model parameters. Moreover, the current model is trained primarily on English-based public datasets, resulting in significantly reduced detection accuracy for non-Latin characters (such as Cyrillic domain confusions) and regional phishing strategies (such as ‘fake’ URLs targeting local payment platforms).
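Statistical URL features of the kind the 17-dimensional set above might contain can be sketched briefly. The five features below are illustrative guesses at typical phishing indicators, not the paper's actual feature list, and the function name is hypothetical.

```python
from urllib.parse import urlparse

# Hedged sketch of URL statistical features of the kind a 17-dimensional set
# might contain; these five are illustrative guesses, not the paper's list.

def url_features(url):
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc
    return {
        "length": len(url),
        "num_digits": sum(c.isdigit() for c in url),
        "num_hyphens": url.count("-"),
        "num_subdomains": max(host.count(".") - 1, 0),
        "digit_ratio": sum(c.isdigit() for c in url) / max(len(url), 1),
    }

# The numeric-substitution example from the abstract scores high on digit count.
feats = url_features("payp41-log1n.com/secure")
assert feats["num_digits"] == 3 and feats["num_hyphens"] == 1
```

Features like these are robust to character-level obfuscation, which is why the abstract credits the statistical branch with improving resistance to adversarial samples that evade purely lexical models.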
Research on Collaborative Reasoning Framework and Algorithms of Cloud-Edge Large Models for Intelligent Auxiliary Diagnosis Systems
HE Qian, ZHU Lei, LI Gong, YOU Zhengpeng, YUAN Lei, JIA Fei
Available online  , doi: 10.11999/JEIT250828
Abstract:
  Objective  The deployment of Large Language Models (LLMs) in intelligent auxiliary diagnosis is constrained by limited computing resources for local hospital deployment and by privacy risks related to the transmission and storage of medical data in cloud environments. Low-parameter local LLMs show 20%–30% lower accuracy in medical knowledge question answering and 15%–25% reduced medical knowledge coverage compared with full-parameter cloud LLMs, whereas cloud-based systems face inherent data security concerns. To address these issues, a cloud-edge LLM collaborative reasoning framework and related algorithms are proposed for intelligent auxiliary diagnosis systems. The objective is to design a cloud-edge collaborative reasoning agent equipped with intelligent routing and dynamic semantic desensitization to enable adaptive task allocation between the edge (hospital side) and cloud (regional cloud). The framework is intended to achieve a balanced result across diagnostic accuracy, data privacy protection, and resource use efficiency, providing a practical technical path for the development of medical artificial intelligence systems.  Methods  The proposed framework adopts a layered architectural design composed of a four-tier progressive architecture on the edge side and a four-tier service-oriented architecture on the cloud side (Fig. 1). The edge side consists of resource, data, model, and application layers, with the model layer hosting lightweight medical LLMs and the cloud-edge collaborative agent. The cloud side comprises AI IaaS, AI PaaS, AI MaaS, and AI SaaS layers, functioning as a center for computing power and advanced models. The collaborative reasoning process follows a structured workflow (Fig. 2), beginning with user input parsed by the agent to extract key clinical features, followed by reasoning node decision-making. 
Two core technologies support the agent: 1) Intelligent routing: This mechanism defaults to edge-side processing and dynamically selects the reasoning path (edge or cloud) through a dual-driven weight update strategy. It integrates semantic feature similarity computed through Chinese word segmentation and pre-trained medical language models and incorporates historical decision data, with an exponential moving average used to update feature libraries for adaptive optimization. 2) Dynamic semantic desensitization: Employing a three-stage architecture (sensitive entity recognition, semantic correlation analysis, and hierarchical desensitization decision-making), this technology identifies sensitive entities through a domain-enhanced Named Entity Recognition (NER) model, calculates entity sensitivity and desensitization priority, and applies a semantic similarity constraint to prevent excessive desensitization. Three desensitization strategies (complete deletion, general replacement, partial masking) are used based on entity sensitivity. Experimental validation is conducted with two open-source Chinese medical knowledge graphs (CMeKG and CPubMedKG) containing more than 2.7 million medical entities. The experimental environment (Fig. 3) deploys a qwen3:1.7b model on the edge and the Jiutian LLM on the cloud, with a 5,000-sample evaluation dataset divided into entity-level, relation-level, and subgraph-level questions. Performance is assessed with three metrics: answer accuracy, average token consumption, and average response time.  Results and Discussions  Experimental results show that the proposed framework achieves strong performance across the main evaluation dimensions. For answer accuracy, the intelligent routing mechanism attains 72.44% on CMeKG (Fig. 4) and 66.20% on CPubMedKG (Fig. 5), which are higher than the edge-side LLM alone (60.73% and 54.18%) and close to the cloud LLM (72.68% and 66.49%). 
These results indicate that the framework maintains diagnostic consistency with cloud-based systems while taking advantage of edge-side capabilities. For resource use, the intelligent routing model reduces average token consumption to 61.27, representing 45.63% of the cloud LLM’s token usage (131.68) (Fig. 6), which supports substantial cost reduction. For response time, the edge-side LLM shows latency greater than 6 s because of limited computing power, whereas the cloud LLM reaches 0.44 s latency through dedicated line access (8% of the 5.46 s latency under internet access). The intelligent routing model produces average latency values between those of the edge and cloud LLMs under both access modes (Fig. 7), consistent with expected trade-offs. The framework also shows applicability across common medical scenarios (Table 1), including outpatient triage, chronic disease management, medical image analysis, intensive care, and health consultation, by combining local real-time processing with cloud-based deep reasoning. Limitations appear in emergency rescue settings with weak network conditions because of latency constraints and in rare disease diagnosis because of limited edge-side training samples and potential loss of specific features during desensitization. Overall, the results verify that the cloud-edge collaborative reasoning mechanism reduces computing resource overhead while preserving consistency in diagnostic results.  Conclusions  This study constructs a cloud-edge LLM collaborative reasoning framework for intelligent auxiliary diagnosis systems, addressing the challenges of limited local computing power and cloud data privacy risks. Through the integration of intelligent routing, prompt engineering adaptation, and dynamic semantic desensitization, the framework achieves balanced optimization of diagnostic accuracy, data security, and resource economy. 
Experimental validation shows that its accuracy is comparable to cloud-only LLMs while resource consumption is substantially reduced, providing a feasible technical path for medical intelligence development. Future work focuses on three directions: intelligent on-demand scheduling of computing and network resources to mitigate latency caused by edge-side computing constraints; collaborative deployment of localized LLMs with Retrieval-Augmented Generation (RAG) to raise edge-side standalone accuracy above 90%; and expansion of diagnostic evaluation indicators to form a three-dimensional scenario–node–indicator system incorporating sensitivity, specificity, and AUC for clinical-oriented assessment.
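The dual-driven routing rule and the exponential-moving-average feature-library update described in the Methods can be sketched minimally as follows. This is an illustrative Python sketch, not the authors' implementation: the blend weight `w_sim`, the routing threshold, and the EMA rate `alpha` are assumed placeholder values.

```python
import numpy as np

def route_query(sim_score: float, hist_score: float,
                w_sim: float = 0.6, threshold: float = 0.5) -> str:
    """Dual-driven routing: blend the semantic-similarity score with the
    historical-decision score; keep the query on the edge by default and
    escalate to the cloud only when combined confidence is low."""
    confidence = w_sim * sim_score + (1.0 - w_sim) * hist_score
    return "edge" if confidence >= threshold else "cloud"

def ema_update(stored: np.ndarray, new: np.ndarray,
               alpha: float = 0.1) -> np.ndarray:
    """Exponential moving average update of one feature-library entry."""
    return (1.0 - alpha) * stored + alpha * new
```

Under these assumed parameters, a query resembling previously well-handled cases (`route_query(0.9, 0.8)`) stays on the edge model, while unfamiliar queries fall through to the cloud LLM; each decision then feeds back into the feature library via `ema_update`.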
Crosstalk-Free Frequency-Spin Multiplexed Multifunctional Device Realized by Nested Meta-Atoms
ZHANG Ming, DONG Peng, TAO En, YANG Lin, HAN Qi, HE Yuhang, HOU Weimin, LI Kang
Available online  , doi: 10.11999/JEIT251202
Abstract:
  Objective  To address the challenges of high manufacturing costs and signal crosstalk in existing multi-dimensional multiplexed metasurfaces, this study proposes a crosstalk-free, frequency-spin multiplexed single-layer metasurface based on nested bi-spectral meta-atoms. By physically superimposing two C-shaped split-ring resonators targeting the Ku-band (12.5 GHz) and K-band (22 GHz), the design achieves four fully independent information channels (two frequencies and two spin states) without relying on spatial division or multi-layer stacking. The objective is to demonstrate independent, high-performance vortex beam generation and holographic imaging, offering a simplified, low-cost solution for advanced 6G communication and sensing systems.  Methods  The metasurface employs a reflective metal-dielectric-metal structure where each unit cell nests an outer (OCSRR) and inner (ICSRR) resonator. Through parameter sweeps using CST Microwave Studio, specific structures were selected to ensure high cross-polarization conversion at target frequencies while maintaining negligible response at non-target bands. Independent spin multiplexing is realized by combining transmission phase and geometric phase via controlled resonator rotation. Two prototypes were fabricated using PCB technology: MS1 for generating focused vortex beams (l = +1, +2, +3, +4) and MS2 for holographic imaging (“H”, “B”, “K”, “D”). Performance was validated via near-field scanning measurements under oblique incidence using a vector network analyzer.  Results and Discussions  Simulations and experimental measurements confirm the excellent frequency selectivity and spin decoupling of the nested design. The OCSRR and ICSRR dictate responses at 12.5 GHz and 22 GHz, respectively, behaving as a linear superposition with minimal crosstalk. MS1 successfully generated four focused vortex beams with distinct topological charges, achieving an average mode purity of 88.25%. 
MS2 reconstructed four independent, clear holographic images with high channel isolation. The close agreement between measured results and simulations verifies the device's robustness and the effectiveness of the crosstalk-free design strategy under practical illumination conditions.  Conclusions  This work demonstrates a reliable method for constructing crosstalk-free frequency-spin multiplexed metasurfaces using nested meta-atoms. By enabling simultaneous, independent manipulation of electromagnetic waves across four channels on a single layer, the proposed approach significantly reduces design complexity and fabrication costs. The successful realization of multi-channel vortex beams and holography highlights the potential of this technology for integrated, multi-functional applications in next-generation wireless communications and optical systems.
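The spin-multiplexing principle above (combining transmission phase with geometric phase via resonator rotation) can be illustrated with the standard Pancharatnam-Berry phase-synthesis relations. This is a generic sketch in Python, not the authors' design code; the degree-based conventions and function names are assumptions.

```python
def atom_phase(phi_res: float, theta: float, spin: int) -> float:
    """Cross-polarized reflection phase of one meta-atom, in degrees:
    resonator (transmission) phase plus geometric phase 2*spin*theta,
    with spin = +1 for one circular polarization and -1 for the other."""
    return (phi_res + 2 * spin * theta) % 360.0

def design_atom(phase_lcp: float, phase_rcp: float):
    """Invert the relation above: solve for the resonator phase and the
    rotation angle that realize independent phases for the two spins."""
    phi_res = (phase_lcp + phase_rcp) / 2.0
    theta = (phase_lcp - phase_rcp) / 4.0
    return phi_res % 360.0, theta % 180.0
```

Because `phi_res` and `theta` are two independent knobs, any pair of spin-channel phases can be met at each frequency; in the nested design, each resonator applies this kind of relation within its own band, which underlies the four-channel independence.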
Dynamic Wavelet Multi-Directional Perception and Geometry Axis-Solution Guided 3D CT Fracture Image Segmentation
ZHANG Yinhui, LIU Kai, HE Zifen, ZHANG Jinkai, CHEN Guangchen, MA Zhijian
Available online  , doi: 10.11999/JEIT250732
Abstract:
  Objective  Accurate segmentation of fracture surfaces in 3D CT images is critical for orthopedic surgical planning, particularly in determining optimal nail insertion angles perpendicular to fracture planes. However, existing methods exhibit three key limitations: insufficient capture of deep global volumetric context, directional texture ambiguity in low-contrast fracture regions, and underutilization of geometric features in decoding. To overcome these challenges, we propose DWAG-Net, a novel framework that integrates Dynamic Wavelet Perception and Geometry-Axis Guidance, to significantly enhance segmentation precision for complex tibial fractures and provide reliable 3D digital guidance for preoperative planning.  Methods  The architecture extends 3D nnU-Netv2 with three core innovations. First, the Dynamic Multi-View Aggregation (DMVA) module dynamically fuses tri-planar (axial/sagittal/coronal) and full-volume features via learnable parameter interpolation (optimized kernel size: 2×2×2) and channel-wise Hadamard product, enhancing global context modeling. Second, the Wavelet Direction Perception Enhancement (WDPE) module decomposes inputs using 3D Symlets discrete wavelet transform and applies direction-specific enhancement to eight subbands: adaptive convolutional kernels (e.g., [5,3,3] for depth-dominant fractures) amplify texture details in high-frequency subbands, while cross-subband fusion strategies integrate complementary features. Third, the Geometry Axis-Solution Guided (GASG) module is embedded in the decoders to enforce anatomical consistency by computing axis-level affinity maps (depth/height/width) that combine geometric similarity with spatial distance decay, and by optimizing boundary delineation through rotational positional encoding and multi-axis attention. 
The model was trained on the YN-TFS dataset (110 tibial fracture CT scans, resolution 0.39–1.00 mm) using SGD optimizer (lr=0.01, momentum=0.99) and a class-weighted loss (background:0.5, bone:1, fracture:5) to mitigate severe pixel imbalance.  Results and Discussions  DWAG-Net achieved state-of-the-art performance, with a mean Dice score of 71.20% (Table 1), surpassing nnU-Netv2 by 5.06% (fracture surface Dice: 69.48%, +7.12%). Boundary precision improved significantly, yielding a mean HD95 of 1.38 mm (fracture surface: 1.54 mm, –3.70 mm). Ablation studies (Table 2) confirmed each module’s contribution: DMVA improved Dice by 2.40% through adaptive multi-view fusion; WDPE resolved directional ambiguity, adding a 5.84% fracture-surface Dice gain; GASG provided a further 1.20% gain by enforcing geometric consistency. Key configurations included optimal DMVA parameters (2×2×2), Symlets wavelets, and sequential axis processing (D→H→W). Qualitatively, DWAG-Net preserved fracture integrity where U-Mamba/nnWNet failed and reduced over-segmentation compared with nnFormer/UNETR++ (Fig. 4).  Conclusions  DWAG-Net establishes a new state-of-the-art for 3D fracture segmentation by synergizing multi-directional wavelet perception and geometric guidance. Its key innovations—DMVA for volumetric context fusion, WDPE for directional texture enhancement, and GASG for anatomically consistent decoding—deliver clinically critical precision (71.20% Dice, 1.38 mm HD95) and enable data-driven surgical planning. Future work will focus on optimizing loss functions for severe class imbalance.
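The class-weighted loss used to counter pixel imbalance can be sketched as a weighted cross-entropy over voxels. This is a minimal NumPy sketch using the class weights quoted above; the exact functional form of the authors' loss is not specified beyond those weights, so this should be read as an illustration of the weighting idea only.

```python
import numpy as np

def class_weighted_ce(probs: np.ndarray, labels: np.ndarray,
                      weights=(0.5, 1.0, 5.0)) -> float:
    """Cross-entropy where each voxel is weighted by its true class
    (background: 0.5, bone: 1, fracture: 5) to counter severe imbalance.
    probs: (n_voxels, n_classes) softmax outputs; labels: (n_voxels,)."""
    w = np.asarray(weights)[labels]                 # per-voxel class weight
    p_true = probs[np.arange(labels.size), labels]  # prob. of the true class
    return float(np.mean(-w * np.log(p_true + 1e-12)))
```

With these weights, misclassifying a fracture voxel is penalized ten times as strongly as a background voxel, which steers the optimizer toward the thin fracture surface that dominates the clinical objective.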
Kepler’s Laws Inspired Single Image Detail Enhancement Algorithm
JIANG He, SUN Mang, ZHENG Zhou, WU Peilin, CHENG Deqiang, ZHOU Chen
Available online  , doi: 10.11999/JEIT250455
Abstract:
  Objective  Single-image detail enhancement based on residual learning has received extensive attention in recent years. In these methods, the residual layer is updated by using the similarity between the residual layer and the detail layer, and it is then combined linearly with the original image to enhance image detail. This update process is a greedy algorithm, which tends to trap the system in local optima and limits overall performance. Inspired by Kepler’s laws, the residual update is treated as the dynamic adjustment of planetary positions. By applying Kepler’s laws and computing the global optimal position of the planets, precise updates of the residual layer are achieved.  Methods  The input image is partitioned into multiple blocks. For each block, its candidate blocks are treated as “planets”, and the best matching block is treated as a “star”. The positions of the “planets” and the “star” are updated by computing the differences between each “planet” and the original image block until the positions converge, which determines the location of the global optimal matching block.  Results and Discussions  In this study, 16 algorithms are tested on three datasets at two magnification levels (Table 1). The test results show that the proposed algorithm achieves strong performance in both PSNR and SSIM evaluations. During detail enhancement, compared with other algorithms, the proposed algorithm shows stronger edge preservation capability (Fig. 7). However, it is not robust to noise (Fig. 8–Fig. 10), and the performance of the enhanced images continues to decline as noise intensity increases (Fig. 11). Performance varies non-monotonically with both the initial gravitational constant and the gravitational attenuation rate constant, first increasing and then decreasing (Fig. 12). When the gradient loss and texture loss weights are set to 0.001, the KLDE system achieves its best performance (Fig. 13).  
Conclusions  This study proposes a single-image detail enhancement algorithm inspired by Kepler’s laws. By treating the residual update process as the dynamic adjustment of planetary positions, the algorithm applies Kepler’s laws to optimize residual layer updates, reduces the tendency of greedy search to reach local optima, and achieves more precise image detail enhancement. Experimental results show that the algorithm performs better than existing methods in visual effects and quantitative metrics and produces natural enhancement results. The running time remains relatively long because the iterative update of candidate blocks and the calculation of parameters such as gravity form the main computational bottleneck. Future work will focus on optimizing the algorithm structure to reduce unnecessary searches and improve system efficiency. The algorithm does not require training and achieves strong performance, which indicates potential value in high-precision offline image enhancement scenarios.
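The planets-and-star position update described in the Methods can be loosely sketched as a gravitational pull of each candidate block toward the current best match. This is an illustrative sketch under assumed dynamics: the constant `G`, the unit-step magnitude, and the fitness function are placeholders, not the paper's actual update equations.

```python
import numpy as np

def gravity_step(positions: np.ndarray, fitness: np.ndarray,
                 G: float = 1.0) -> np.ndarray:
    """One iteration: every candidate block ('planet') moves toward the
    current best-matching block ('star'). positions holds block
    coordinates; fitness is each block's difference from the original
    image block (lower is better)."""
    star = positions[np.argmin(fitness)]        # best match so far
    disp = star - positions
    dist = np.linalg.norm(disp, axis=-1, keepdims=True) + 1e-9
    return positions + G * disp / dist          # pull toward the star
```

Iterating such a step until the positions converge replaces the greedy nearest-block update with a search that can escape local optima, which is the paper's central idea; it also hints at the runtime cost noted in the Conclusions, since every candidate is re-evaluated per iteration.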
A Test-Time Adaptive Method for Nighttime Image-Aided Beam Prediction
SUN Kunyang, YAO Rui, ZHU Hancheng, ZHAO Jiaqi, LI Xixi, HU Dianlin, HUANG Wei
Available online  , doi: 10.11999/JEIT250530
Abstract:
The latency of traditional beam management in dynamic scenarios and the severe degradation of vision-aided beam prediction under adverse environmental conditions in millimeter-wave (mmWave) systems are addressed by a nighttime image-assisted beam prediction method based on Test-Time Adaptation (TTA). mmWave communications rely on massive Multiple-Input Multiple-Output (MIMO) technology to achieve high-gain narrow beam alignment. However, conventional beam scanning suffers from exponential complexity and latency, limiting applicability in high-mobility settings such as vehicular networks. Vision-assisted schemes that employ deep learning to map image features to beam parameters experience sharp performance loss in low-light, rainy, or foggy environments because of distribution shifts between training data and real-time inputs. In the proposed framework, a TTA mechanism is introduced to overcome the limitations of static inference by performing a single gradient backpropagation step over the model parameters during inference on degraded images. This adaptation dynamically aligns cross-domain feature distributions without the need for adverse-condition data collection or annotation. An entropy minimization-based consistency strategy is further designed to enforce agreement between original and augmented views, guiding parameter updates toward higher confidence and lower uncertainty. Experiments on real nighttime scenarios demonstrate that the framework achieves a top-3 beam prediction accuracy of 93.01%, improving performance by nearly 20% over static inference and outperforming conventional low-light enhancement. By leveraging the semantic consistency of fixed-base-station deployments, this lightweight online adaptation improves robustness, providing a promising solution for efficient beam management in mmWave systems operating in complex open environments.  
Objective   mmWave communication, a cornerstone of 5G and beyond, relies on massive MIMO architectures to counter severe path loss through high-gain narrow beam alignment. Traditional beam management schemes, based on exhaustive beam scanning and channel measurement, incur exponential complexity and latency on the order of hundreds of milliseconds, making them unsuitable for high-mobility scenarios such as vehicular networks. Vision-aided beam prediction has recently emerged as a promising alternative, using deep learning to map visual features (e.g., user location and motion) to optimal beam parameters. Although this approach achieves high accuracy under daytime conditions (>90%), it experiences sharp performance degradation in low-light, rainy, or foggy environments because of domain shifts between training data (typically daylight images) and real-time degraded inputs. Existing countermeasures depend on offline data augmentation, which is costly and provides limited generalization to unseen adverse environments. To overcome these limitations, this work proposes a lightweight online adaptation framework that dynamically aligns cross-domain features during inference, eliminating the need for pre-collected adverse-condition data. The objective is to enable robust mmWave communications in unpredictable environments, a necessary step toward practical deployment in autonomous driving and industrial IoT.  Methods   The proposed TTA method operates in three stages. First, a pre-trained beam prediction model with a ResNet-18 backbone is initialized using daylight images and labeled beam indices. During inference, real-time low-quality nighttime images are processed through two parallel pipelines: (1) the original view and (2) a data-augmented view incorporating Gaussian noise. A consistency loss is applied to minimize the prediction distance between the two views, enforcing robustness against local feature perturbations. 
In parallel, an entropy minimization loss sharpens the output probability distribution by penalizing high prediction uncertainty. These combined losses drive a single-step gradient backpropagation that updates all model parameters. Through this mechanism, feature distributions between the training (daylight) and testing (nighttime) domains are aligned without altering global semantic representations, as illustrated in Fig. 2. The system architecture consists of a roadside base station equipped with an RGB camera and a 32-element antenna array, which captures environmental data and executes real-time beam prediction.  Results and Discussions   Experiments on a real-world dataset demonstrate the effectiveness of the proposed method. Under nighttime conditions, the TTA framework achieves a top-3 beam prediction accuracy of 93.01%, exceeding static inference (71.25%) and traditional low-light enhancement methods (85.27%) (Table 3). Ablation studies further validate the contributions of each component: the online feature alignment mechanism, optimized for small-batch data, significantly improves accuracy (Table 4), and the entropy minimization strategy with multi-view consistency learning provides additional gains (Table 5). As shown in Fig. 4, the framework exhibits rapid convergence during online testing, enabling base stations to promptly recover performance when faced with new environmental disturbances.  Conclusions   This study addresses the limited robustness of existing vision-aided beam prediction methods in dynamically changing environments by introducing a TTA framework for nighttime image-assisted beam prediction. A small-batch adaptive feature alignment strategy is developed to mitigate feature mismatches in unseen domains while satisfying real-time communication constraints. 
In addition, a joint optimization framework integrates classical low-light image enhancement with multi-view consistency learning, thereby improving feature discrimination under complex lighting conditions. Experiments conducted on real-world data confirm the effectiveness of the proposed algorithm, achieving more than 20% higher top-3 beam prediction accuracy compared with direct testing. These results demonstrate the framework’s robustness in dynamic environments and its potential to optimize vision-aided communication systems under non-ideal conditions. Future work will extend this approach to beam prediction under rain and fog, as well as to multi-modal perception-assisted communication systems.
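The two losses that drive the single-step adaptation can be sketched jointly as follows. This is a NumPy sketch: the squared-distance consistency term and the weight `lam` are assumed forms, since the abstracts specify entropy minimization plus original/augmented-view agreement but not the exact distance used.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def tta_loss(logits_orig: np.ndarray, logits_aug: np.ndarray,
             lam: float = 1.0) -> float:
    """Entropy of the original-view prediction (confidence term) plus a
    consistency term penalizing disagreement between the original view
    and the noise-augmented view."""
    p, q = softmax(logits_orig), softmax(logits_aug)
    entropy = -np.sum(p * np.log(p + 1e-12), axis=-1).mean()
    consistency = np.sum((p - q) ** 2, axis=-1).mean()
    return float(entropy + lam * consistency)
```

A confident prediction that agrees across views yields a small loss, so the single backpropagation step mainly updates the model when the nighttime input makes the two views disagree or the prediction becomes uncertain.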
Minimax Robust Kalman Filtering under Multistep Random Measurement Delays and Packet Dropouts
YANG Chunshan, ZHAO Ying, LIU Zheng, QIU Yuan, JING Benqin
Available online  , doi: 10.11999/JEIT250741
Abstract:
  Objective  Networked Control Systems (NCSs) provide advantages such as flexible installation, convenient maintenance, and reduced cost, but they also present challenges arising from random measurement delays and packet dropouts caused by communication network unreliability and limited bandwidth. Moreover, system noise variance may fluctuate significantly under strong electromagnetic interference. In NCSs, time delays are random and uncertain. When a set of Bernoulli-distributed random variables is used to describe multistep random measurement delays and packet dropouts, the fictitious noise method in existing studies introduces autocorrelation among different components, which complicates the computation of fictitious noise variances and makes it difficult to establish robustness. This study presents a solution for minimax robust Kalman filtering in systems characterized by uncertain noise variance, multistep random measurement delays, and packet dropouts.  Methods  The main challenges lie in model transformation and robustness verification. When a set of Bernoulli-distributed random variables is employed to represent multistep random measurement delays and packet dropouts, a series of strategies are applied to address the minimax robust Kalman filtering problem. First, a new model transformation method is proposed based on the flexibility of the Hadamard product in multidimensional data processing, after which a robust time-varying Kalman estimator is designed in a unified framework following the minimax robust filtering principle. Second, the robustness proof is established using matrix elementary transformation, strictly diagonally dominant matrices, the Geršgorin circle theorem, and the Hadamard product theorem within the framework of the generalized Lyapunov equation method. 
Additionally, by converting the Hadamard product into a matrix product through matrix factorization, a sufficient condition for the existence of a steady-state estimator is derived, and the robust steady-state Kalman estimator is subsequently designed.  Results and Discussions  The proposed minimax robust Kalman filter extends the robust Kalman filtering framework and provides new theoretical support for addressing the robust fusion filtering problem in complex NCSs. The curves (Fig. 5) present the actual accuracy \begin{document}${\text{tr}}{{\mathbf{\bar P}}^l}(N)$\end{document}, \begin{document}$l = a,b,c,d$\end{document} as a function of \begin{document}$ 0.1 \le {\alpha _0} $\end{document}, \begin{document}${\alpha _1} $\end{document}, \begin{document}${\alpha _2} \le 1 $\end{document}. It is observed that situation (1) achieves the highest robust accuracy, followed by situations (2) and (3), whereas situation (4) exhibits poorer accuracy. This difference arises because the estimators in situation (1) receive measurements with one-step random delay, whereas situation (4) experiences a higher packet loss rate. The curves (Fig. 5) confirm the validity and effectiveness of the proposed method. Another simulation is conducted for a mass-spring-damper system. The comparison between the proposed approach and the optimal robust filtering method (Table 2, Fig. 7) indicates that although the proposed method ensures that the actual prediction error variance attains the minimum upper bound, its actual accuracy is slightly lower than the optimal prediction accuracy.  Conclusions  The minimax robust Kalman filtering problem is investigated for systems characterized by uncertain noise variance, multistep random measurement delays, and packet dropouts. 
The system noise variance is uncertain but bounded by known conservative upper limits, and a set of Bernoulli-distributed random variables with known probabilities is used to represent the multistep random measurement delays and packet dropouts between the sensor and the estimator. The Hadamard product is used to enhance the model transformation method, followed by the design of a minimax robust time-varying Kalman estimator. Robustness is demonstrated through matrix elementary transformation, the Geršgorin circle theorem, the Hadamard product theorem, matrix factorization, and the Lyapunov equation method. A sufficient condition is established for the time-varying generalized Lyapunov equation to possess a unique steady-state positive semidefinite solution, based on which a robust steady-state estimator is constructed. The convergence between the time-varying and steady-state estimators is also proven. Two simulation examples verify the effectiveness of the proposed approach. The presented methods overcome the limitations of existing techniques and provide theoretical support for solving the robust fusion filtering problem in complex NCSs.
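The measurement model with Bernoulli-distributed delays and dropouts can be sketched as a sampler over which past measurement, if any, reaches the estimator at each step. This is an illustrative sketch: the arrival probabilities and the at-most-one-arrival convention follow the description above, while the names and interface are placeholders.

```python
import numpy as np

def delayed_measurement(z_hist, probs, rng):
    """z_hist = (z_k, z_{k-1}, z_{k-2}): current and delayed measurements.
    probs = (alpha_0, alpha_1, alpha_2): probabilities that the measurement
    arrives with 0-, 1-, or 2-step delay; the remaining probability mass
    1 - sum(probs) corresponds to a packet dropout (nothing arrives)."""
    p = list(probs) + [1.0 - sum(probs)]
    choice = rng.choice(len(p), p=p)
    return None if choice == len(probs) else z_hist[choice]
```

With `probs = (0.5, 0.3, 0.1)`, for example, 10% of packets are dropped; the minimax filter is then designed against the worst admissible noise variance under this kind of arrival model.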
Power Grid Data Recovery Method Driven by Temporal Composite Diffusion Networks
YAN Yandong, LI Chenxi, LI Shijie, YANG Yang, GE Yuhao, HUANG Yu
Available online  , doi: 10.11999/JEIT250435
Abstract:
  Objective  Smart grid construction drives modern power systems, and distribution networks serve as the key interface between the main grid and end users. Their stability, power quality, and efficiency depend on accurate data management and analysis. Distribution networks generate large volumes of multi-source heterogeneous data that contain user consumption records, real-time meteorology, equipment status, and marketing information. These data streams often become incomplete during collection or transmission due to noise, sensor failures, equipment aging, or adverse weather. Missing data reduces the reliability of real-time monitoring and affects essential tasks such as load forecasting, fault diagnosis, health assessment, and operational decision making. Conventional approaches such as mean or regression imputation lack the capacity to maintain temporal dependencies. Generative models such as Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) do not represent the complex statistical characteristics of grid data with sufficient accuracy. This study proposes a diffusion model based data recovery method for distribution networks. The method is designed to reconstruct missing data, preserve semantic and statistical integrity, and enhance data utility to support smart grid stability and efficiency.  Methods  This paper proposes a power grid data augmentation method based on diffusion models. The core of the method is that input Gaussian noise is mapped to the target distribution space of the missing data so that the recovered data follows its original distribution characteristics. To reduce semantic discrepancy between the reconstructed data and the actual data, the method uses time series sequence embeddings as conditional information. This conditional input guides and improves the diffusion generation process so that the imputation remains consistent with the surrounding temporal context.  
Results and Discussions  Experimental results show that the proposed diffusion model based data augmentation method achieves higher accuracy in recovering missing power grid data than conventional approaches. The performance demonstrates that the method improves the completeness and reliability of datasets that support analytical tasks and operational decision making in smart grids.  Conclusions  This study proposes and validates a diffusion model based data augmentation method designed to address data missingness in power distribution networks. Traditional restoration methods and generative models have difficulty capturing the temporal dependencies and complex distribution characteristics of grid data. The method presented here uses temporal sequence information as conditional guidance, which enables accurate imputation of missing values and preserves the semantic integrity and statistical consistency of the original data. By improving the accuracy of distribution network data recovery, the method provides a reliable approach for strengthening data quality and supports the stability and efficiency of smart grid operations.
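The core mechanism above (noising toward a Gaussian, then generating only the missing entries while conditioning on the observed ones) can be sketched minimally in NumPy. The `betas` schedule and function names are assumptions, and the learned conditional reverse network that the paper relies on is omitted entirely.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Closed-form forward step of a DDPM-style process:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
    with alpha_bar_t the cumulative product of (1 - beta)."""
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    eps = rng.standard_normal(np.shape(x0))
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def merge_observed(x_gen, x_obs, missing_mask):
    """Imputation step: keep observed grid readings and take generated
    values only where the data are missing (missing_mask == True)."""
    return np.where(missing_mask, x_gen, x_obs)
```

In a conditional imputation loop, each reverse-diffusion sample would be passed through `merge_observed` so the observed time-series context keeps anchoring the generation, which is how the semantic and statistical consistency described above is preserved.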
Research on Directional Modulation Multi-carrier Waveform Design for Integrated Sensing and Communication
HUANG Gaojian, ZHANG Shengzhuang, DING Yuan, LIAO Kefei, JIN Shuanggen, LI Xingwang, OUYANG Shan
Available online  , doi: 10.11999/JEIT250680
Abstract:
  Objective  With the concurrent evolution of wireless communication and radar technologies, spectrum congestion has become increasingly severe. Integrated Sensing and Communication (ISAC) has emerged as an effective approach that unifies sensing and communication functionalities to achieve efficient spectrum and hardware sharing. Orthogonal Frequency Division Multiplexing (OFDM) signals are regarded as a key candidate waveform due to their high flexibility. However, estimating target azimuth angles and suppressing interference from non-target directions remain computationally demanding, and confidential information transmitted in these directions is vulnerable to eavesdropping. To address these challenges, the combination of Directional Modulation (DM) and OFDM, termed OFDM-DM, provides a promising solution. This approach enables secure communication toward the desired direction, suppresses interference in other directions, and reduces radar signal processing complexity. The potential of OFDM-DM for interference suppression and secure waveform design is investigated in this study.  Methods  As a physical-layer security technique, DM is used to preserve signal integrity in the intended direction while deliberately distorting signals in other directions. Based on this principle, an OFDM-DM ISAC waveform is developed to enable secure communication toward the target direction while simultaneously estimating distance, velocity, and azimuth angle. The proposed waveform has two main advantages: the Bit Error Rate (BER) at the radar receiver is employed for simple and adjustable azimuth estimation, and interference from non-target directions is suppressed without additional computational cost. The waveform maintains the OFDM constellation in the target direction while distorting constellation points elsewhere, which reduces correlation with the original signal and enhances target detection through time-domain correlation. 
Moreover, because element-wise complex division in the Two-Dimensional Fast Fourier Transform (2-D FFT) depends on signal integrity, phase distortion in signals from non-target directions disrupts phase relationships and further diminishes the positional information of interference sources.  Results and Discussions  In the OFDM-DM ISAC system, the transmitted signal retains its communication structure within the target beam, whereas constellation distortion occurs in other directions. Therefore, the BER at the radar receiver exhibits a pronounced main lobe in the target direction, enabling accurate azimuth estimation (Fig. 5). In the time-domain correlation algorithm, the target distance is precisely determined, while correlation in non-target directions deteriorates markedly due to DM, thereby achieving effective interference suppression (Fig. 6). Additionally, during 2-D FFT processing, signal distortion disrupts the linear phase relationship among modulation symbols in non-target directions, causing conventional two-dimensional spectral estimation to fail and further suppressing positional information of interference sources (Fig. 7). Additional simulations yield one-dimensional range and velocity profiles (Fig. 8). The results demonstrate that the OFDM-DM ISAC waveform provides structural flexibility, physical-layer security, and low computational complexity, making it particularly suitable for environments requiring high security or operating under strong interference conditions.  Conclusions  This study proposes an OFDM-DM ISAC waveform and systematically analyzes its advantages in both sensing and communication. The proposed waveform inherently suppresses interference from non-target directions, eliminating target ambiguity commonly encountered in traditional ISAC systems and thereby enhancing sensing accuracy. 
Owing to the spatial selectivity of DM, only legitimate directions can correctly demodulate information, whereas unintended directions fail to recover valid data, achieving intrinsic physical-layer security. Compared with existing methods, the proposed waveform simultaneously attains secure communication and interference suppression without additional computational burden, offering a lightweight and high-performance solution suitable for resource-constrained platforms. Therefore, the OFDM-DM ISAC waveform enables high-precision sensing while maintaining communication security and hardware feasibility, providing new insights for multi-carrier ISAC waveform design.
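The element-wise-division stage whose sensitivity to DM distortion is exploited above can be sketched for a standard OFDM radar receiver. This is a generic Python sketch, not the authors' implementation; it shows why non-target directions lose their range-Doppler peak: DM-distorted symbols break the pure phase ramp that the two transforms rely on.

```python
import numpy as np

def range_doppler_map(tx_syms, rx_syms):
    """Classical OFDM radar processing: element-wise division removes the
    communication payload, leaving the channel's linear phase ramps;
    IFFT over subcarriers gives range, FFT over OFDM symbols gives
    Doppler (velocity)."""
    quotient = rx_syms / tx_syms                  # (n_subcarriers, n_symbols)
    range_profile = np.fft.ifft(quotient, axis=0)
    return np.fft.fft(range_profile, axis=1)      # range-Doppler map
```

For a single target at range bin d and Doppler bin m, the map peaks at (d, m); when the received constellation is DM-distorted, `quotient` no longer carries a pure phase ramp and the two-dimensional spectrum collapses, which is the interference-suppression effect described above.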
Low-Complexity Joint Estimation Algorithm for Carrier Frequency Offset and Sampling Frequency Offset in 5G-NTN Low Earth Orbit Satellite Communications
GONG Xianfeng, LI Ying, LIU Mingyang, ZHAI Shenghua
Available online  , doi: 10.11999/JEIT251086
Abstract:
  Objective   The Doppler effect presents a major impairment in Low Earth Orbit (LEO) satellite communications within 5G Non-Terrestrial Networks (5G-NTN), introducing Carrier Frequency Offset (CFO), Sampling Frequency Offset (SFO), and Inter-Subcarrier Frequency Offset (ISFO) across subcarriers. Although existing estimation algorithms focus mainly on CFO and SFO, the effect of ISFO remains inadequately addressed. ISFO becomes particularly detrimental to receiver performance when OFDM systems utilize a large number of subcarriers and high-order modulation. Moreover, under joint CFO and SFO conditions, conventional maximum likelihood estimation (MLE) methods often involve one- or two-dimensional grid searches, incurring high computational complexity. To mitigate these issues, this paper proposes two novel joint estimation algorithms for CFO and SFO.  Methods   This paper analyzes the influence of non-ideal factors at the transmitter, receiver, and channel, such as local oscillator offset, sampling frequency offset in Digital-to-Analog and Analog-to-Digital converters, and Doppler effect. A mathematical model for the received OFDM signal is developed, and the mechanism through which SFO and ISFO distort the phase of frequency-domain subcarriers is derived. Leveraging the pilot structure of 5G-NTN, two joint CFO and SFO estimation algorithms are introduced: (1) Algorithm 1 exploits the sequence correlation between the two received frequency-domain DMRS signal vectors. After phase pre-compensation, the normalized cross-correlation vector is computed. An objective function is constructed based on this vector, and its unimodal property within the main lobe is utilized to efficiently estimate the parameters via a bisection search. (2) Algorithm 2 treats the estimation parameters as analogous to carrier frequency offsets in single-carrier systems and adopts an L&R-based autocorrelation approach to derive approximate closed-form expressions.  
Results and Discussions   A computational complexity analysis is performed comparing the proposed algorithms with conventional one-dimensional (1D-ML) and two-dimensional (2D-ML) grid-search MLE methods. Numerical results demonstrate that Algorithm 1 achieves substantial complexity reduction. Specifically, the number of complex multiplications—the dominant computational cost—is only 4% of that of the 2D-ML method, 8% of that of Algorithm 2, and 44% of that of the 1D-ML method. Although Algorithm 2 is computationally heavier, it provides a closed-form estimation expression. The performance of each algorithm is evaluated in terms of the mean square error (MSE) of the estimated parameters. Simulations show that for a subcarrier number of 3072, the 1D-ML algorithm slightly outperforms others at SNRs below 5 dB. However, since robust modulation schemes (e.g., BPSK, QPSK) typically used at low SNRs can tolerate larger offsets, the medium-to-high SNR regime is of greater practical interest, where all four algorithms exhibit comparable estimation performance.  Conclusions   This paper addresses the impact of Doppler effect in 5G-NTN LEO satellite communications by analyzing the mechanism and influence of ISFO and proposing two joint estimation algorithms for CFO and SFO. First, a mathematical model of the received signal is established considering non-ideal factors such as CFO, SFO, and ISFO. It is derived that the combined effect of SFO and ISFO on OFDM signals is equivalent to their linear superposition, effectively expanding the range of the equivalent SFO. Second, the objective function is defined using the cross-correlation vector of two DMRS sequences. Leveraging its unimodal characteristic within the main lobe, a binary search algorithm is employed to achieve rapid convergence. Subsequently, the parameter to be estimated—determined by SFO and ISFO—is analogized to the carrier frequency offset in single-carrier systems. 
An approximate closed-form solution for parameter estimation is derived using the L&R algorithm. Finally, complexity analysis and performance simulations are conducted. The results demonstrate that the proposed algorithms not only significantly reduce computational complexity but also exhibit excellent estimation performance. The outcomes of this research can be applied to the development of 5G-NTN LEO satellite payloads and terminal products, demonstrating promising potential for widespread application.
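The L&R (Luise & Reggiannini) autocorrelation estimator referenced for Algorithm 2 has a standard textbook form; a minimal sketch on a clean complex exponential is shown below (the paper applies the same idea to its combined SFO/ISFO parameter, whose exact mapping is not reproduced here):

```python
import numpy as np

def lr_freq_estimate(x, T, N=None):
    """Luise & Reggiannini (L&R) estimator:
    f_hat = arg( sum_{m=1..N} R(m) ) / (pi * T * (N + 1)),
    where R(m) is the sample autocorrelation of x at lag m."""
    L = len(x)
    if N is None:
        N = L // 2
    R = [np.mean(x[m:] * np.conj(x[:L - m])) for m in range(1, N + 1)]
    return np.angle(np.sum(R)) / (np.pi * T * (N + 1))

# toy example: noiseless complex exponential at 10 Hz, 1 kHz sampling
T = 1e-3
t = np.arange(64) * T
f_hat = lr_freq_estimate(np.exp(2j * np.pi * 10.0 * t), T)
```

Note the unambiguous estimation range is limited to |f| < 1/(T(N+1)); the 10 Hz tone above stays well inside it.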
Hybrid Vibration Isolation Design Based on Piezoelectric Actuator and Quasi-zero Stiffness System
YANG Liu, ZHAO Haiyang, ZHAO Kun, CHENG Jiajia, LI Dongjie
Available online  , doi: 10.11999/JEIT250310
Abstract:
  Objective  Precision instruments now operate under increasingly demanding vibration conditions, and conventional passive isolation methods are insufficient for maintaining stable laboratory environments. Vibrations generated by personnel movement, machinery operation, and vehicle transit can travel long distances and penetrate structural materials, reaching instrument platforms and reducing measurement accuracy, stability, and reliability. Passive isolation units such as rubber elements and springs show limited performance when dealing with low-frequency and small-amplitude excitation. Quasi-Zero Stiffness (QZS) systems improve low-frequency isolation but their performance depends on amplitude and requires strict installation accuracy. Active vibration isolation uses controlled actuators between the vibration source and the support structure to reduce disturbances. Piezoelectric ceramics offer high precision and rapid response, and are widely applied in such systems. Purely active isolation, however, may perform poorly at high frequencies due to sensor sampling limitations and actuator response bandwidth. High-frequency or large-amplitude excitation also results in high actuator energy demand, while the hysteresis characteristics of piezoelectric ceramics reduce control precision. Combining active and passive approaches is therefore an effective strategy for ensuring vibration stability in precision laboratory applications.  Methods  A hybrid vibration isolation strategy is developed by integrating a piezoelectric actuator with a QZS mechanism. A stacked piezoelectric ceramic actuator is designed to generate the required output force and displacement, and elastic spacers are used to apply a preload that improves operational stability and linearity. The QZS system is formed by combining positive and negative stiffness components to achieve high static stiffness with low dynamic stiffness. 
To address hysteresis in the piezoelectric actuator, an improved Bouc-Wen (B-W) model is adopted and an inverse model is constructed to enable hysteresis compensation. The actuator is then coupled with the QZS structure, and the vibration isolation performance of the hybrid system is assessed through numerical simulation.  Results and Discussions  An active-passive vibration isolation device is developed, comprising a QZS system formed by linear springs and an active piezoelectric stack actuator (Fig. 9a). Because the traditional B-W algorithm does not accurately describe the dynamic relationship between acceleration and voltage, a voltage-derivative term (Equation 13) is introduced to improve the conventional model. This modification refines the force-voltage representation, enhances model adaptability, and enables accurate description of the acceleration-voltage response over a broader operating range. Forward model parameters are identified using the differential evolution algorithm (Table 1), and an inverse model is constructed through direct inversion with parameters obtained using the same optimization method (Table 2). The forward and inverse modules are then cascaded to compensate for hysteresis (Fig. 8). Dynamic equations for the QZS system and the linearized piezoelectric actuator are derived (Equation 16). An adaptive sliding-mode controller incorporating a Luenberger sliding-mode observer is subsequently designed to regulate vibration signals, and active isolation performance is verified.  Conclusions  The proposed hybrid vibration isolation design integrates the passive low-frequency isolation capability of the QZS system with the active control potential of the piezoelectric actuator, offering a feasible approach for vibration suppression in precision instruments. The hysteresis behavior of piezoelectric ceramics is characterized and fitted effectively, and an inverse model is established to compensate for the nonlinear voltage-acceleration response. 
A dynamic model of the combined passive-active configuration is derived, and vibration signals are regulated using adaptive sliding-mode control with a Luenberger sliding-mode observer. The resulting system demonstrates stable vibration reduction, indicating strong applicability and research value.
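The hysteresis behavior addressed above can be illustrated with the classic Bouc-Wen operator, here integrated by forward Euler with illustrative parameters; the paper's improved model additionally includes a voltage-derivative term (Equation 13), which is omitted in this sketch:

```python
import numpy as np

def bouc_wen(v, dt, A=1.0, beta=0.5, gamma=0.3, n=1):
    """Classic Bouc-Wen hysteretic state driven by voltage v(t):
    dh/dt = A*dv - beta*|dv|*|h|^(n-1)*h - gamma*dv*|h|^n,
    integrated with forward Euler."""
    h = np.zeros_like(v)
    for k in range(1, len(v)):
        dv = (v[k] - v[k - 1]) / dt
        hk = h[k - 1]
        dh = (A * dv
              - beta * abs(dv) * abs(hk) ** (n - 1) * hk
              - gamma * dv * abs(hk) ** n)
        h[k] = hk + dh * dt
    return h

t = np.linspace(0.0, 1.0, 2001)
v = np.sin(2 * np.pi * t)       # one period of sinusoidal drive
h = bouc_wen(v, t[1] - t[0])
# hysteresis shows up as h != 0 when v returns to zero at t = 0.5
residual = h[1000]
```

An inverse model for compensation, as in the paper, would invert this voltage-to-state map so that the cascaded forward and inverse modules approximate a linear response.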
Application of WAM Data Set and Classification Method of Electromagnetic Wave Absorbing Materials
YUAN Yuyang, ZHANG Junhan, LI Dandan, SHA Jianjun
Available online  , doi: 10.11999/JEIT250166
Abstract:
The performance of electromagnetic radiation shielding and absorbing materials depends primarily on thickness, maximum reflection loss, and effective absorption bandwidth. Current research focuses on Metal–Organic Frameworks (MOFs), carbon-based, and ceramic absorbing materials, analyzed using weak artificial intelligence techniques applied to the Wave-Absorbing Materials (WAM) dataset. After dividing the dataset into training and testing subsets, data augmentation, correlation analysis, and principal component analysis are performed. A decision tree algorithm is then applied to establish classification indicators, revealing that the reflection loss of MOF materials exceeds that of carbon-based materials. MOFs are more likely to achieve a maximum reflection loss below –45 dB. The random forest algorithm demonstrates stronger generalization ability than the decision tree algorithm, with a higher ROC–AUC value. Neural network classification shows that the self-organizing map neural network yields superior classification performance, whereas the probabilistic neural network performs poorly. When the binary classification problem is extended to a three-class problem, nonlinear classification, clustering, and Boosting algorithms indicate that maximum reflection loss serves as a key discriminative feature. Further analysis confirms that the WAM dataset is nonlinearly separable and that fuzzy clustering achieves better results. Artificial intelligence facilitates the identification of relationships between material properties and absorption performance, accelerates the development of new Wave-Absorbing Materials (WAM), and supports the construction of a knowledge graph and database for absorbing materials.  Objective   Computational materials science, high-throughput experimentation, and the Materials Genome Initiative (MGI) have emerged as key frontiers in modern materials research. 
The MGI provides a strategic framework and developmental roadmap for advancing materials discovery through artificial intelligence. Analogous to gene sequencing in bioinformatics, its central objective is to accelerate the identification of novel material compositions and structures. Extracting valuable information from large-scale datasets substantially reduces costs, enhances efficiency, fosters interdisciplinary integration, and promotes transformative progress in materials development. Big data analytics, high-performance computing, and advanced algorithms form the core pillars of this initiative, supplying essential support for new materials research and development. Nevertheless, the discovery of new compositions and structures depends on the effective screening of candidate materials to identify those exhibiting superior properties suitable for engineering applications. Achieving this goal requires the establishment of comprehensive datasets, the development of reliable classification algorithms, the improvement of model generalization performance, and the advancement of application-oriented software tools.  Methods   Pattern recognition techniques are employed in this study. A self-developed WAM dataset is first constructed, comprising a test set and a validation set. Data preprocessing is performed initially, including data augmentation, data integration, and principal component analysis. Decision tree and random forest algorithms are applied to establish classification indicators and define classification criteria. Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN) models are subsequently utilized for material classification. Finally, the accuracy of various clustering algorithms is evaluated, and the fuzzy clustering algorithm is found to achieve relatively superior performance and satisfactory classification results.  
Results and Discussions   It is found that the reflection loss of MOF materials is superior to that of carbon-based materials. Semantic segmentation algorithms are identified as unsuitable for classifying the WAM dataset. Among the neural network approaches, the SOM achieves higher classification accuracy than the PNN. The WAM dataset is determined to be nonlinearly separable, indicating that classification performance depends strongly on the intrinsic data distribution characteristics. The maximum reflection loss is identified as the key indicator for effective classification.  Conclusions   A self-developed WAM dataset is constructed to address the lack of publicly available datasets for applying pattern recognition methods to electromagnetic WAM. The performance of multiple algorithms is evaluated, and the optimal algorithm is identified according to the dataset characteristics. The conventional binary classification problem is extended to a three-class framework, providing the foundation for further research on multi-class classification. The application of artificial intelligence algorithms is found to enhance the credibility and reliability of the research, reduce time and labor costs, and facilitate the exploration of relationships between material properties and absorption performance. This approach shortens the research and development cycle, supports the screening of new materials, and contributes to the establishment of a knowledge base for absorbing materials. However, the knowledge extracted from the WAM dataset remains limited by data sparsity, which constrains the effectiveness of artificial intelligence methods.
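The way a decision tree turns maximum reflection loss into a classification indicator can be sketched with a single Gini-impurity split; the data below are hypothetical, not drawn from the WAM dataset:

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector."""
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(x, y):
    """Find the threshold on feature x minimizing weighted Gini impurity:
    the single-feature rule a decision tree node uses to turn a continuous
    indicator (e.g. maximum reflection loss) into a class boundary."""
    best_t, best_score = None, np.inf
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# hypothetical toy data: reflection loss (dB), class 1 = MOF, 0 = carbon-based
rl = np.array([-60, -55, -50, -48, -30, -28, -25, -20])
cls = np.array([1, 1, 1, 1, 0, 0, 0, 0])
threshold, score = best_split(rl, cls)
```

A random forest, as compared in the abstract, repeats this split search over bootstrapped samples and random feature subsets and averages the resulting trees.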
Multi-Channel Switching Array DOA Estimation Algorithm Based on FRIDA
CHEN Tao, XI Haolin, ZHAN Lei, YU Yuwei
Available online  , doi: 10.11999/JEIT250350
Abstract:
  Objective   With the increasing complexity of electromagnetic environments, the demand for higher estimation accuracy in practical direction-finding systems is rising. Enlarging the antenna array is an effective approach to improve estimation accuracy; however, it also significantly increases system complexity. This study aims to reduce the number of channels required while preserving the Direction-Of-Arrival (DOA) estimation performance achievable with full-channel data. By combining the channel compression algorithm, which reduces channel usage, with the time-modulated array structure that incorporates RF front-end switches, this paper proposes a multi-channel switching array DOA estimation algorithm based on FRIDA.  Methods   The algorithm introduces a selection matrix composed of switches between the antenna array and the channels. This matrix directs the signal received by a selected antenna into the corresponding channel, thereby enabling a specific subarray to capture the data. By switching across different subarrays, multiple reduced-channel received data covariance matrices are collected. To ensure phase consistency within these covariance matrices, common array elements are specified for each subarray. After weighted summation, these covariance matrices are combined to restore the dimensionality of the covariance matrix, producing the total covariance matrix. Next, the elements of the total covariance matrix that correspond to identical array-element spacings are weighted and summed, yielding the full-channel received data vector. Using this vector, an FRI reconstruction model is established. Finally, the incident angle is estimated through the combination of the proximal gradient descent algorithm and the parameter recovery algorithm.  
Results and Discussions   Simulation results of DOA estimation for SA-FRI under multiple source incidence demonstrate that the full-channel received data vectors reconstructed from multiple covariance matrices of reduced-channel data can successfully discriminate multi-source incident signals, achieving performance comparable to that of full-channel data (Fig. 2). Further simulations evaluating estimation accuracy with varying numbers of snapshots and Signal-to-Noise Ratios (SNRs) show that the accuracy of the proposed algorithm improves with increasing snapshots and SNR. Under identical conditions, the use of more channels yields higher DOA estimation accuracy (Figs. 3 and 4). Comparisons of four different algorithms under varying SNRs and snapshot numbers indicate that estimation accuracy increases with both parameters. The proposed algorithm consistently outperforms the other algorithms under the same conditions (Figs. 5 and 6). Finally, verification with measured data produces results consistent with the simulations (Fig. 9), further confirming the effectiveness of the proposed algorithm.   Conclusions   To address the challenge of reducing the number of channels in practical DOA estimation systems, this study proposes an array-switching DOA estimation method based on proximal gradient descent. The algorithm first reduces channel usage through a switching matrix, then generates multiple covariance matrices by sequentially switching different subarray access channels. These covariance matrices are combined to reconstruct the full-channel received data covariance matrix. Finally, the DOA parameters of incident signals are estimated using the proximal gradient descent algorithm. Simulation results confirm that the proposed algorithm achieves reduced channel usage while maintaining reliable estimation accuracy. 
Moreover, validation with measured data collected from an actual DOA estimation system demonstrates results consistent with the simulations, further verifying the algorithm’s effectiveness.
Lightweight Incremental Deployment for Computing-Network Converged AI Services
WANG Qinding, TAN Bin, HUANG Guangping, DUAN Wei, YANG Dong, ZHANG Hongke
Available online  , doi: 10.11999/JEIT250663
Abstract:
  Objective   The rapid expansion of Artificial Intelligence (AI) computing services has heightened the demand for flexible access and efficient utilization of computing resources. Traditional Domain Name System (DNS) and IP-based scheduling mechanisms are constrained in addressing the stringent requirements of low latency and high concurrency, highlighting the need for integrated computing-network resource management. To address these challenges, this study proposes a lightweight deployment framework that enhances network adaptability and resource scheduling efficiency for AI services.  Methods   The AI-oriented Service IDentifier (AISID) is designed to encode service attributes into four dimensions: Object, Function, Method, and Performance. Service requests are decoupled from physical resource locations, enabling dynamic resource matching. AISID is embedded within IPv6 packets (Fig. 5), consisting of a 64-bit prefix for identification and a 64-bit service-specific suffix (Fig. 4). A lightweight incremental deployment scheme is implemented through hierarchical routing, in which stable wide-area routing is managed by ingress gateways, and fine-grained local scheduling is handled by egress gateways (Fig. 6). Ingress and egress gateways are incrementally deployed under the coordination of an intelligent control system to optimize resource allocation. AISID-based paths are encapsulated at ingress gateways using Segment Routing over IPv6 (SRv6), whereas egress gateways select optimal service nodes according to real-time load data using a weighted least-connections strategy (Fig. 8). AISID lifecycle management includes registration, query, migration, and decommissioning phases (Table 2), with global synchronization maintained by the control system. Resource scheduling is dynamically adjusted according to real-time network topology and node utilization metrics (Fig. 7).  
Results and Discussions   Experimental results show marked improvements over traditional DNS/IP architectures. The AISID mechanism reduces service request initiation latency by 61.3% compared to DNS resolution (Fig. 9), as it eliminates the need for round-trip DNS queries. Under 500 concurrent requests, network bandwidth utilization variance decreases by 32.8% (Fig. 10), reflecting the ability of AISID-enabled scheduling to alleviate congestion hotspots. Computing resource variance improves by 12.3% (Fig. 11), demonstrating more balanced workload distribution across service nodes. These improvements arise from AISID’s precise semantic matching in combination with the hierarchical routing strategy, which together enhance resource allocation efficiency while maintaining compatibility with existing IPv6/DNS infrastructure (Fig. 23). The incremental deployment approach further reduces disruption to legacy networks, confirming the framework’s practicality and viability for real-world deployment.  Conclusions   This study establishes a computing-network convergence framework for AI services based on semantic-driven AISID and lightweight deployment. The key innovations include AISID’s semantic encoding, which enables dynamic resource scheduling and decoupled service access, together with incremental gateway deployment that optimizes routing without requiring major modifications to legacy networks. Experimental validation demonstrates significant improvements in latency reduction, bandwidth efficiency, and balanced resource utilization. Future research will explore AISID’s scalability across heterogeneous domains and its robustness under dynamic network conditions.
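The AISID layout (64-bit identification prefix plus a 64-bit service suffix spanning the Object, Function, Method, and Performance dimensions) can be sketched as an IPv6-embeddable identifier. The 16-bit-per-field split and the example prefix below are assumptions for illustration; the paper defines the dimensions but this exact bit layout is not taken from it:

```python
import ipaddress

# assumed field widths for the 64-bit service-specific suffix
FIELDS = [("object", 16), ("function", 16), ("method", 16), ("performance", 16)]

def make_aisid(prefix64, **values):
    """Pack the four service dimensions into the low 64 bits and combine
    them with a 64-bit identification prefix into one IPv6 address."""
    suffix, shift = 0, 64
    for name, width in FIELDS:
        shift -= width
        v = values[name]
        assert v < (1 << width), f"{name} overflows {width} bits"
        suffix |= v << shift
    return ipaddress.IPv6Address((prefix64 << 64) | suffix)

aisid = make_aisid(0x2001_0db8_0000_0001,
                   object=0x0001, function=0x0002,
                   method=0x0003, performance=0x0004)
```

Because the result is a plain IPv6 address, it can be carried in ordinary packets and routed by gateways that are unaware of the semantic encoding, which is what makes incremental deployment possible.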
Optimized Design of Non-Transparent Bridge for Heterogeneous Interconnects in Hyper-converged Infrastructure
ZHENG Rui, SHEN Jianliang, LV Ping, DONG Chunlei, SHAO Yu, ZHU Zhengbin
Available online  , doi: 10.11999/JEIT250272
Abstract:
  Objective  The integration of heterogeneous computing resource clusters into modern Hyper-Converged Infrastructure (HCI) systems imposes stricter performance requirements in latency, bandwidth, throughput, and cross-domain transmission stability. Traditional HCI systems primarily rely on the Ethernet TCP/IP protocol, which exhibits inherent limitations, including low bandwidth efficiency, high latency, and limited throughput. Existing PCIe Switch products typically employ Non-Transparent Bridges (NTBs) for conventional dual-system connections or intra-server communication; however, they do not meet the performance demands of heterogeneous cross-domain transmission within HCI environments. To address this limitation, a novel Dual-Mode Non-Transparent Bridge Architecture (D-MNTBA) is proposed to support dual transmission modes. D-MNTBA combines a fast transmission mode via a bypass mechanism with a stable transmission mode derived from the Traditional Data Path Architecture (TDPA), thereby aligning with the data characteristics and cross-domain streaming demands of HCI systems. Hardware-level enhancements in address and ID translation schemes enable D-MNTBA to support more complex mappings while minimizing translation latency. These improvements increase system stability and effectively support the cross-domain transmission of heterogeneous data in HCI systems.  Methods  To overcome the limitations of traditional single-pass architectures and the bypass optimizations of the TDPA, the proposed D-MNTBA incorporates both a fast transmission path and a stable transmission path. This dual-mode design enables the NTB to leverage the data characteristics of HCI systems for message-based streaming, thereby reducing dependence on intermediate protocols and data format conversions. 
The stable transmission mode ensures reliable message delivery, while the fast transmission mode—enhanced through hardware-level optimizations in address and ID translation—supports high-real-time cross-domain communication. This combination improves overall transmission performance by reducing both latency and system overhead. To meet the low-latency demands of the bypass transmission path, the architecture implements hardware-level enhancements to the address and ID conversion modules. The address translation module is expanded with a larger lookup table, allowing for more complex and flexible mapping schemes. This enhancement enables efficient utilization of non-contiguous and fragmented address spaces without compromising performance. Simultaneously, the ID conversion module is optimized through multiple conversion strategies and streamlined logic, significantly reducing the time required for ID translation.  Results and Discussions  Address translation in the proposed D-MNTBA is validated through emulation within a constructed HCI environment. The simulation log for indirect address translation shows no errors or deadlocks, and successful hits are observed on BAR2/3. During dual-host disk access, packet header addresses and payload content remain consistent, with no packet loss detected (Fig. 14), indicating that indirect address translation is accurately executed under D-MNTBA. ID conversion performance is evaluated by comparing the proposed architecture with the TDPA implemented in the PEX8748 chip. The switch based on D-MNTBA exhibits significantly shorter ID conversion times. A maximum reduction of approximately 34.9% is recorded, with an ID conversion time of 71 ns for a 512-byte payload (Fig. 15). These findings suggest that the ID function mapping method adopted in D-MNTBA effectively reduces conversion latency and enhances system performance. Throughput stability is assessed under sustained heavy traffic with payloads ranging from 256 to 2048 bytes. 
The maximum throughputs of D-MNTBA, the Ethernet card, and PEX8748 are measured at 1.36 GB/s, 0.97 GB/s, and 0.9 GB/s, respectively (Fig. 16). Compared to PEX8748 and the Ethernet architecture, D-MNTBA improves throughput by approximately 51.1% and 40.2%, respectively, and shows the slowest degradation trend, reflecting superior stability in heterogeneous cross-domain transmission. Bandwidth comparison reveals that D-MNTBA outperforms TDPA and the Ethernet card, with bandwidth improvements of approximately 27.1% and 19.0%, respectively (Fig. 17). These results highlight the significant enhancement in cross-domain transmission performance achieved by the proposed architecture in heterogeneous environments.  Conclusions  This study proposes a Dual-Mode D-MNTBA to address the challenges of heterogeneous interconnection in HCI systems. By integrating a fast transmission path enabled by a bypass architecture with the stable transmission path of the TDPA, D-MNTBA accommodates the specific data characteristics of cross-domain transmission in heterogeneous environments and enables efficient message routing. D-MNTBA enhances transmission stability while improving system-wide performance, offering robust support for high-real-time cross-domain transmission in HCI. It also reduces latency and overhead, thereby improving overall transmission efficiency. Compared with existing transmission schemes, D-MNTBA achieves notable gains in performance, making it a suitable solution for the demands of heterogeneous domain interconnects in HCI systems. However, the architectural enhancements, particularly the bypass design and associated optimizations, increase logic resource utilization and power consumption. Future work should focus on refining hardware design, layout, and wiring strategies to reduce logic complexity and resource consumption without compromising performance.
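The windowed address translation that the enlarged lookup table generalizes can be sketched as follows. This is a hypothetical software model of the mechanism, not the hardware table design; window bases and sizes are illustrative:

```python
class NTBAddressTable:
    """Sketch of an NTB address-translation table: each entry remaps a
    window of the local BAR address space into the remote host's space.
    A larger table lets windows cover non-contiguous, fragmented regions."""

    def __init__(self):
        self.entries = []                     # (local_base, size, remote_base)

    def add_window(self, local_base, size, remote_base):
        self.entries.append((local_base, size, remote_base))

    def translate(self, addr):
        """Return the remote address for a local one, or None on a miss."""
        for local_base, size, remote_base in self.entries:
            if local_base <= addr < local_base + size:
                return remote_base + (addr - local_base)
        return None

nt = NTBAddressTable()
nt.add_window(0x1000_0000, 0x1000, 0x8000_0000)   # windows need not be
nt.add_window(0x2000_0000, 0x2000, 0x9000_0000)   # contiguous with each other
remote = nt.translate(0x1000_0010)
```

In hardware the linear scan becomes a parallel range match, which is why a larger table can support more complex mappings without adding translation latency.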
Considering Workload Uncertainty in Policy Gradient-based Hyper-heuristic Scheduling for Software Projects
SHEN Xiaoning, SHI Jiangyi, MA Yanzhao, CHEN Wenyan, SHE Juan
Available online  , doi: 10.11999/JEIT250769
Abstract:
  Objective  The Software Project Scheduling Problem (SPSP) is essential for allocating resources and arranging tasks in software development, and it affects economic efficiency and competitiveness. Deterministic assumptions used in traditional models overlook common fluctuations in task effort caused by requirement changes or estimation deviation. These assumptions often reduce feasibility and weaken scheduling stability in dynamic development settings. This study develops a multi-objective model that integrates task effort uncertainty and represents it using asymmetric triangular interval type-2 fuzzy numbers to reflect real development conditions. The aim is to improve decision quality under uncertainty by designing an optimization method that shortens project duration and increases employee satisfaction, thereby strengthening robustness and adaptability in software project scheduling.  Methods  A Policy Gradient-based Hyper-Heuristic Algorithm (PGHHA) is developed to solve the formulated model. The framework contains a High-Level Strategy (HLS) and a set of Low-Level Heuristics (LLHs). The High-Level Strategy applies an Actor-Critic reinforcement learning structure. The Actor network selects appropriate LLHs based on real-time evolutionary indicators, including population convergence and diversity, and the Critic network evaluates the actions selected by the Actor. Eight LLHs are constructed by combining two global search operators, the matrix crossover operator and the Jaya operator with random jitter, with two local mining strategies, duration-based search and satisfaction-based search. Each LLH is configured with two neighborhood depths (V1=5 and V2=20), determined through Taguchi orthogonal experiments. Each candidate solution is encoded as a real-valued task-employee effort matrix. Constraints including skill coverage, maximum dedication, and maximum participant limits are applied during optimization. 
A prioritized experience replay mechanism is introduced to reuse historical trajectories, which accelerates convergence and improves network updating efficiency.  Results and Discussions  Experimental evaluation is performed on twelve synthetic cases and three real software projects. The algorithm is assessed against six representative methods to validate the proposed strategies. HyperVolume Ratio (HVR) and Inverted Generational Distance (IGD) are used as performance indicators, and statistical significance is examined using Wilcoxon rank-sum tests with a 0.05 threshold. The findings show that the PGHHA achieves better convergence and diversity than all comparison methods in most cases. The quantitative improvements are reflected in the summarized values (Table 5, Table 6). The visual distribution of Pareto fronts (Fig. 4, Fig. 5) shows that the obtained solutions lie below those of alternative algorithms and display more uniform coverage, indicating higher convergence precision and improved spread. The computational cost increases because of neural network training and the experience replay mechanism, as shown in Fig. 6. However, the improvement in solution quality is acceptable considering the longer planning period of software development. Modeling effort uncertainty with asymmetric triangular interval type-2 fuzzy numbers enhances system stability. The adaptive heuristic selection driven by the Actor-Critic mechanism and the prioritized experience replay strengthens performance under dynamic and uncertain conditions. Collectively, the evidence indicates that the PGHHA provides more reliable support for software project scheduling, maintaining diversity while optimizing conflicting objectives under uncertain workload environments.  Conclusions  A multi-objective software project scheduling model is developed in this study, where task effort uncertainty is represented using asymmetric triangular interval type-2 fuzzy numbers. 
A PGHHA is designed to solve the model. The algorithm applies an Actor-Critic reinforcement learning structure as the high-level strategy to adaptively select LLHs according to the evolutionary state. A prioritized experience replay mechanism is incorporated to enhance learning efficiency and accelerate convergence. Tests on synthetic and real cases show that: (1) The proposed algorithm delivers stronger convergence and diversity under uncertainty than six representative algorithms; (2) The combination of global search operators and local mining strategies maintains a suitable balance between exploration and exploitation; (3) The use of type-2 fuzzy representation offers a more stable characterization of effort uncertainty than type-1 fuzzy numbers. The current work focuses on a single-project context. Future work will extend the model to multi-project environments with shared resources and inter-project dependencies. Additional research will examine adaptive reward strategies and lightweight network designs to reduce computational demand while preserving solution quality.
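The prioritized experience replay mechanism mentioned above can be sketched in its standard proportional form; real implementations use a sum-tree for sampling, and the transitions and priorities below are illustrative, not the paper's trajectory format:

```python
import random

class PrioritizedReplay:
    """Minimal proportional prioritized replay buffer: transitions with
    larger TD error are sampled more often, so informative historical
    trajectories are reused when updating the networks."""

    def __init__(self, capacity=1000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], []

    def push(self, transition, td_error):
        if len(self.data) >= self.capacity:       # drop oldest on overflow
            self.data.pop(0)
            self.prios.pop(0)
        self.data.append(transition)
        self.prios.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, k):
        # probability of each transition is proportional to its priority
        return random.choices(self.data, weights=self.prios, k=k)

random.seed(0)
buf = PrioritizedReplay()
for i in range(10):
    buf.push(("state", i), td_error=float(i))     # larger error, higher priority
batch = buf.sample(100)
```

A full implementation would also apply importance-sampling weights to correct the bias this non-uniform sampling introduces into the gradient estimates.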
Detection of Underwater Acoustic Transient Signals under Alpha Stable Distribution Noise
CHEN Wen, ZOU Nan, ZHANG Guangpu, LI Yanhe
Available online  , doi: 10.11999/JEIT250500
Abstract:
  Objective  Transient signals are generated during changes in the state of underwater acoustic targets and are difficult to suppress or remove. Therefore, they serve as an important basis for covert detection of underwater targets. Practical marine noise exhibits non-Gaussian behavior with impulsive components, which degrade or disable conventional Gaussian-based detectors, including energy detection commonly used in engineering systems. Existing approaches apply nonlinear processing or fractional lower-order statistics to mitigate non-Gaussian noise, yet they face drawbacks such as signal distortion and increased computational cost. To address these issues, an Alpha-stable noise model is adopted. A Data-Preprocessing denoising Short-Time Cross-Correntropy Detection (DP-STCCD) method is proposed to enable passive detection and Time-of-Arrival (ToA) estimation for unknown deterministic transient signals in non-Gaussian underwater environments.  Methods  The method consists of two stages: data-preprocessing denoising and short-time cross-correntropy detection. In the preprocessing stage, an outlier detection approach based on the InterQuartile Range (IQR) is used. Upper and lower thresholds are calculated to remove impulsive spikes while retaining local signal structure. Multi-stage filtering is then applied to further reduce noise. Median filtering reconstructs the signal with limited detail loss, and modified mean filtering suppresses remaining spikes by discarding extreme values within local windows. In the detection stage, the denoised signal is divided into short frames. Short-time cross-correntropy with a Gaussian kernel is calculated between adjacent frames to form the detection statistic. A first-order recursive filter estimates background noise and determines adaptive thresholds. Detection outputs are generated using joint amplitude–width decision rules. ToA estimation is performed by locating peaks in the short-time cross-correntropy. 
The method does not require prior noise information and improves robustness in non-Gaussian environments through data cleaning and information-theoretic feature extraction.  Results and Discussions  Simulations under symmetric Alpha-stable noise verify the effectiveness of the method. The preprocessing stage removes impulsive spikes while preserving key temporal features (Fig. 3). After denoising, the performance of energy detection shows partial recovery, and the peak-to-average ratio of short-time cross-correntropy features increases by 10 dB (Fig. 4, Fig. 5). Experimental results show that DP-STCCD provides higher detection probability and improved ToA estimation accuracy compared with Data Preprocessing denoising-Energy Detection (DP-ED). Under conditions with characteristic index α=1.5 and a Generalized Signal-to-Noise Ratio (GSNR) of −11 dB, DP-STCCD yields a 30.2% improvement in detection probability and an 18.4% increase in ToA estimation precision relative to the comparison method (Fig. 6, Fig. 9(a)). These findings confirm the robustness and detection capability of the proposed approach in non-Gaussian underwater noise environments.  Conclusions  A joint detection method, DP-STCCD, combining data-preprocessing denoising and short-time cross-correntropy features is proposed for transient signal detection under Alpha-stable noise. Preprocessing approaches based on IQR outlier detection and multi-stage filtering suppress impulsive interference while preserving key time-domain characteristics. Short-time cross-correntropy improves detection sensitivity and ToA estimation accuracy. The results show that the proposed method provides better performance than traditional energy detection under low GSNR and maintains stable behavior across different characteristic indices. The method offers a feasible solution for covert underwater target detection in non-Gaussian environments.
Future work will optimize the algorithm for real marine noise and improve its engineering applicability.
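As a rough illustration of the two core primitives the DP-STCCD abstract describes (IQR-based spike removal in the preprocessing stage and Gaussian-kernel short-time cross-correntropy between adjacent frames in the detection stage), the following is a minimal NumPy sketch. The frame length, kernel bandwidth, and IQR fence factor are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def iqr_clip(x, k=1.5):
    """Suppress impulsive spikes: clip samples outside the IQR fences
    [Q1 - k*IQR, Q3 + k*IQR], preserving the local signal structure."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return np.clip(x, q1 - k * iqr, q3 + k * iqr)

def short_time_cross_correntropy(x, frame_len=256, sigma=1.0):
    """Gaussian-kernel cross-correntropy between adjacent frames.
    Returns one statistic per adjacent-frame pair; values lie in (0, 1]."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    stats = [np.mean(np.exp(-((a - b) ** 2) / (2 * sigma ** 2)))
             for a, b in zip(frames[:-1], frames[1:])]
    return np.array(stats)
```

In the paper's pipeline this statistic would then be compared against an adaptive threshold from a first-order recursive noise estimator; that stage is omitted here.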
Geospatial Identifier Network Modal Design and Scenario Applications for Vehicle-infrastructure Cooperative Networks
PAN Zhongxia, SHEN Congqi, LUO Hanguang, ZHU Jun, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250807
Abstract:
  Objective  Vehicle-infrastructure cooperative networks (V2X) are open and contain large numbers of nodes with high mobility, frequent topology changes, unstable wireless channels, and varied service requirements. These characteristics pose challenges to efficient data transmission. A flexible network that supports rapid reconfiguration to meet different service requirements is considered essential in Intelligent Transportation Systems (ITS). With the development of programmable network technologies, programmable data-plane techniques are shifting the architecture from rigid designs to adaptive and flexible systems. In this work, a protocol standard based on geospatial information is proposed and combined with a polymorphic network architecture to design a geospatial identifier network modal. In this modal, the traditional three-layer protocol structure is replaced by packet forwarding based on geospatial identifiers. Packets carry geographic location information, and forwarding is executed directly according to this information. Addressing and routing based on geospatial information are more efficient and convenient than traditional IP-based approaches. A vehicle-infrastructure cooperative traffic system based on geospatial identifiers is further designed for intelligent transportation scenarios. This system supports direct geographic forwarding for road safety message dissemination and traffic information exchange. It enhances safety and improves route-planning efficiency within V2X.  Methods  The geospatial identifier network modal is built on a protocol standard that uses geographic location information and a flexible polymorphic network architecture. In this design, the traditional IP addressing mechanism in the three-layer network is replaced by a geospatial identifier protocol, and addressing and routing are executed on programmable polymorphic network elements.
To support end-to-end transmission, a protocol stack for the geospatial identifier network modal is constructed, enabling unified transmission across different network modals. A dynamic geographic routing mechanism is further developed to meet the transmission requirements of the GEO modal. This mechanism functions in a multimodal network controller and uses the relatively stable coverage of roadside base stations to form a two-level mapping: “geographic region–base station/geographic coordinates–terminal.” This mapping supports precise path matching for GEO modal packets and enables flexible, centrally controlled geographic forwarding. To verify the feasibility of the geospatial identifier network modal, a vehicle-infrastructure cooperative intelligent transportation system supporting geospatial identifier addressing is developed. The system is designed to facilitate efficient dissemination of road safety and traffic information. The functional requirements of the system are analyzed, and the service processing flow and overall architecture are designed. Key hardware and software modules are also developed, including the geospatial representation data-plane code, traffic control center services, roadside base stations, and in-vehicle terminals, and their implementation logic is presented.  Results and Discussions  System evaluation is carried out from four aspects: evaluation environment, operational effectiveness, theoretical analysis, and performance testing. A prototype intelligent transportation system is deployed, as shown in Fig. 7 and Fig. 8. The prototype demonstrates correct message transmission based on the geospatial identifier modal. A typical vehicle-to-vehicle communication case is used to assess forwarding efficiency, where an onboard terminal (T3) sends a road-condition alert (M) to another terminal (T2). Sequence-based analysis is applied to compare forwarding performance between the GEO modal and a traditional IP protocol.
Theoretical analysis indicates that the GEO modal provides higher forwarding efficiency, as shown in Fig. 9. Additional performance tests are conducted by adjusting the number of terminals (Fig. 10), background traffic (Fig. 11), and transmission bandwidth (Fig. 12) to observe the transmission behavior of geospatial identifier packets. The results show that the intelligent transportation system maintains stable and efficient transmission performance under varying network conditions. System evaluation confirms its suitability for typical vehicle-infrastructure cooperative communication scenarios, supporting massive connectivity and elastic traffic loads.  Conclusions  By integrating a flexible polymorphic network architecture with a protocol standard based on geographic information, a geospatial identifier network modal is developed and implemented. The modal enables direct packet forwarding based on geospatial location. A prototype vehicle-infrastructure cooperative intelligent transportation system using geospatial identifier addressing is also designed for intelligent transportation scenarios. The system supports applications such as road-safety alerts and traffic information broadcasting, improves vehicle safety, and enhances route-planning efficiency. Experimental evaluation shows that the system maintains stable and efficient performance under typical traffic conditions, including massive connectivity, fluctuating background traffic, and elastic service loads. With the continued development of vehicular networking technologies, the proposed system is expected to support broader intelligent transportation applications and contribute to safer and more efficient mobility systems.
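The abstract's two-level mapping ("geographic region–base station / geographic coordinates–terminal") can be sketched as a small lookup structure. This is a hypothetical illustration of the idea only; the class and method names below are invented for the sketch and do not come from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class GeoController:
    """Toy model of the multimodal controller's two-level mapping:
    level 1: geographic region -> roadside base station,
    level 2: (base station, coordinates) -> terminal."""
    region_to_station: dict = field(default_factory=dict)
    station_terminals: dict = field(default_factory=dict)

    def register(self, region, station, coords, terminal):
        # A terminal registers its coordinates under the station that
        # covers its region.
        self.region_to_station[region] = station
        self.station_terminals.setdefault(station, {})[coords] = terminal

    def resolve(self, region, coords):
        """Match a GEO-modal packet to (next-hop station, destination
        terminal); returns None when the region is unknown."""
        station = self.region_to_station.get(region)
        if station is None:
            return None
        return station, self.station_terminals[station].get(coords)
```

Because base-station coverage is relatively stable, only the second-level coordinate-to-terminal entries need frequent updates as vehicles move, which is the property the paper's routing mechanism exploits.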
An Implicit Certificate-Based Lightweight Authentication Scheme for Power Industrial Internet of Things
WANG Sheng, ZHANG Linghao, TENG Yufei, LIU Hongli, HAO Junyang, WU Wenjuan
Available online  , doi: 10.11999/JEIT250457
Abstract:
  Objective  The rapid development of the Internet of Things, cloud computing, and edge computing drives the evolution of the Power Industrial Internet of Things (PIIoT) into core infrastructure for smart power systems. In this architecture, terminal devices collect operational data and send it to edge gateways for preliminary processing before transmission to cloud platforms for further analysis and control. This structure improves efficiency, reliability, and security in power systems. However, the integration of traditional industrial systems with open networks introduces cybersecurity risks. Resource-constrained devices in PIIoT are exposed to threats that may lead to data leakage, privacy exposure, or disruption of power services. Existing authentication mechanisms either impose high computational and communication overhead or lack sufficient protection, such as forward secrecy or resistance to replay and man-in-the-middle attacks. This study focuses on designing a lightweight and secure authentication method suitable for the PIIoT environment. The method is intended to meet the operational needs of power terminal devices with limited computing capability while ensuring strong security protection.  Methods  A secure and lightweight identity authentication scheme is designed to address these challenges. Implicit certificate technology is applied during device identity registration, embedding public key authentication information into the signature rather than transmitting a complete certificate during communication. Compared with explicit certificates, implicit certificates are shorter and allow faster verification, reducing transmission and validation overhead. Based on this design, a lightweight authentication protocol is constructed using only hash functions, XOR operations, and elliptic curve point multiplication. 
This protocol supports secure mutual authentication and session key agreement while remaining suitable for resource-constrained power terminal devices. A formal analysis is then performed to evaluate security performance. The results show that the scheme achieves secure mutual authentication, protects session key confidentiality, ensures forward secrecy, and resists replay and man-in-the-middle attacks. Finally, experimental comparisons with advanced authentication protocols are conducted. The results indicate that the proposed scheme requires significantly lower computational and communication overhead, supporting its feasibility for practical deployment.  Results and Discussions  The proposed scheme is evaluated through simulation and numerical comparison with existing methods. The implementation is performed on a virtual machine configured with 8 GB RAM, an Intel i7-12700H processor, and Ubuntu 22.04, using the Miracl-Python cryptographic library. The security level is set to 128 bits, with the ed25519 elliptic curve, SHA-256 hash function, and AES-128 symmetric encryption. Table 1 summarizes the performance of the cryptographic primitives. As shown in Table 2, the proposed scheme achieves the lowest computational cost, requiring three elliptic curve point multiplications on the device side and five on the gateway side. These values are substantially lower than those of traditional certificate-based authentication, which may require up to 14 and 12 operations, respectively. Compared with other representative authentication approaches, the proposed method further reduces the computational burden on devices, improving suitability for resource-limited environments. Table 3 shows that communication overhead is also minimized, with the smallest total message size (3 456 bits) and three communication rounds, attributed to the implicit certificate mechanism. As shown in Fig. 5, the authentication process exhibits the shortest execution time among all evaluated schemes. 
The runtime is 47.72 ms on devices and 82.88 ms on gateways, indicating lightweight performance and suitability for deployment in Industrial Internet of Things applications.  Conclusions  A lightweight and secure identity authentication scheme based on implicit certificates is presented for resource-constrained terminal devices in the PIIoT. Through the integration of a low-overhead authentication protocol and efficient certificate processing, the scheme maintains a balance between security and performance. It enables secure mutual authentication, protects session key confidentiality, and ensures forward secrecy while keeping computational and communication overhead minimal. Security analysis and experimental evaluation confirm that the scheme provides stronger protection and higher efficiency compared with existing approaches. It offers a practical and scalable solution for enhancing the security architecture of modern power systems.
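The abstract states that the protocol uses only hash functions, XOR operations, and elliptic curve point multiplication. The following sketch illustrates just the hash/XOR portion of such a design in a generic challenge-response shape; it is not the paper's protocol, and the message layout and key-derivation labels are assumptions made for illustration (the implicit-certificate and ECC steps are abstracted into a pre-shared registration secret):

```python
import hashlib
import hmac
import os

def h(*parts: bytes) -> bytes:
    """SHA-256 over the concatenation of the inputs."""
    digest = hashlib.sha256()
    for p in parts:
        digest.update(p)
    return digest.digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

# Registration secret k, standing in for the key material the paper
# establishes via implicit certificates during device registration.
k = os.urandom(32)

# Device -> gateway: a fresh nonce masked with the shared secret,
# plus a tag binding the nonce to the secret.
n_d = os.urandom(32)
msg = xor(n_d, h(k))   # hides the nonce from eavesdroppers
tag = h(k, n_d)        # lets the gateway authenticate the sender

# Gateway: unmask the nonce and verify the tag in constant time.
n_recovered = xor(msg, h(k))
assert hmac.compare_digest(tag, h(k, n_recovered))

# Both sides derive the same session key from the exchanged nonce.
sk_device = h(k, n_d, b"session")
sk_gateway = h(k, n_recovered, b"session")
assert sk_device == sk_gateway
```

A replayed `msg` would be rejected in a real protocol by also binding a timestamp or counter into the tag; that detail is omitted here.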
Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation
WU Ting, WEN Shulin, YAN Zhaoli, FU Gaoyuan, LI Linfeng, LIU Xudu, CHENG Xiaobin, YANG Jun
Available online  , doi: 10.11999/JEIT250934
Abstract:
  Objective  Hydro-Turbine Generator Units (HTGUs) require reliable early fault detection to maintain operational safety and reduce maintenance cost. Acoustic signals provide a non-intrusive and sensitive monitoring approach, but their use is limited by complex structural acoustics, strong background noise, and the scarcity of abnormal data. An unsupervised acoustic anomaly detection framework is presented, in which a large-scale pretrained audio model is integrated with density-based k-nearest neighbors estimation. This framework is designed to detect anomalies using only normal data and to maintain robustness and strong generalization across different operational conditions of HTGUs.  Methods  The framework performs unsupervised acoustic anomaly detection for HTGUs using only normal data. Time-domain signals are preprocessed with Z-score normalization and Fbank features, and random masking is applied to enhance robustness and generalization. A large-scale pretrained BEATs model is used as the feature encoder, and an Attentive Statistical Pooling module aggregates frame-level representations into discriminative segment-level embeddings by emphasizing informative frames. To improve class separability, an ArcFace loss replaces the conventional classification layer during training, and a warm-up learning rate strategy is adopted to ensure stable convergence. During inference, density-based k-nearest neighbors estimation is applied to the learned embeddings to detect acoustic anomalies.  Results and Discussions  The effectiveness of the proposed unsupervised acoustic anomaly detection framework for HTGUs is examined using data collected from eight real-world machines. As shown in Fig. 7 and Table 2, large-scale pretrained audio representations show superior capability compared with traditional features in distinguishing abnormal sounds. 
With the FED-KE algorithm, the framework attains high accuracy across six metrics, with Hmean reaching 98.7% in the wind tunnel and exceeding 99.9% in the slip-ring environment, indicating strong robustness under complex industrial conditions. As shown in Table 4, ablation studies confirm the complementary effects of feature enhancement, ASP-based representation refinement, and density-based k-NN inference. The framework requires only normal data for training, reducing dependence on scarce fault labels and enhancing practical applicability. Remaining challenges include computational cost introduced by the pretrained model and the absence of multimodal fusion, which will be addressed in future work.  Conclusions  An unsupervised acoustic anomaly detection framework is proposed for HTGUs, addressing the scarcity of fault samples and the complexity of industrial acoustic environments. A pretrained large-scale audio foundation model is adopted and fine-tuned with turbine-specific strategies to improve the modeling of normal operational acoustics. During inference, a density-estimation-based k-NN mechanism is applied to detect abnormal patterns using only normal data. Experiments conducted on real-world hydropower station recordings show high detection accuracy and strong generalization across different operating conditions, exceeding conventional supervised approaches. The framework introduces foundation-model-based audio representation learning into the hydro-turbine domain, provides an efficient adaptation strategy tailored to turbine acoustics, and integrates a robust density-based anomaly scoring mechanism. These components jointly reduce dependence on labeled anomalies and support practical deployment for intelligent condition monitoring. 
Future work will examine model compression, such as knowledge distillation, to enable on-device deployment, and explore semi-/self-supervised learning and multimodal fusion to enhance robustness, scalability, and cross-station adaptability.
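The inference stage described above scores embeddings by density-based k-nearest-neighbors estimation against the normal training set. A minimal sketch of that scoring rule (the value of k and the use of Euclidean distance are assumptions; the paper's embeddings come from the fine-tuned BEATs encoder, modeled here as plain vectors):

```python
import numpy as np

def knn_anomaly_scores(train_emb, test_emb, k=5):
    """Anomaly score = mean distance to the k nearest normal embeddings.
    Larger scores mean lower local density, i.e. more anomalous.
    train_emb: (n_train, d) normal-data embeddings.
    test_emb:  (n_test, d) embeddings to score."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    d = np.linalg.norm(test_emb[:, None, :] - train_emb[None, :, :], axis=-1)
    # Average over each test point's k smallest distances.
    return np.sort(d, axis=1)[:, :k].mean(axis=1)
```

Only normal data is needed to build the reference set, which is why this scoring rule pairs naturally with the label-scarce setting the paper targets.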
A Review on Phase Rotation and Beamforming Scheme for Intelligent Reflecting Surface Assisted Wireless Communication Systems
XING Zhitong, LI Yun, WU Guangfu, XIA Shichao
Available online  , doi: 10.11999/JEIT250790
Abstract:
  Objective  With the large-scale commercial deployment of 5G networks since 2020 and the continued development of 6G technology, modern communication systems must operate under increasingly complex channel conditions. These include ultra-high-density urban environments and remote areas such as oceanic regions, deserts, and forests. To meet these challenges, low-energy solutions capable of dynamically adjusting and reconfiguring wireless channels are required. Such solutions would improve transmission performance by lowering latency, increasing data rates, and strengthening signal reception, and would support more efficient deployment in demanding environments. The Intelligent Reflecting Surface (IRS) has gained attention as a promising approach for reshaping channel conditions. Unlike traditional active relays, an IRS operates passively and adds minimal energy consumption. When integrated with communication architectures such as Single Input Single Output (SISO), Multiple Input Single Output (MISO), and Multiple Input Multiple Output (MIMO), an IRS can improve transmission efficiency, reduce power consumption, and enhance adaptability in complex scenarios. This paper reviews IRS-assisted communication systems, with emphasis on signal transmission models, beamforming methods, and phase-shift optimization strategies.  Methods  This review examines IRS technology in modern communication systems by analyzing signal transmission models across three fundamental configurations. The discussion begins with IRS-assisted SISO systems, in which IRS control of incident signals through reflection and phase shifting improves single-antenna communication by mitigating traditional propagation constraints. The analysis then extends to MISO and MIMO architectures, where the relationship between IRS phase adjustments and MIMO precoding is assessed to determine strategies that support high spectral efficiency.
Based on these transmission models, this review surveys joint optimization and precoding methods tailored for IRS-enhanced MIMO systems. These algorithms can be grouped into four categories that meet different operational requirements. The first aims to minimize power consumption by reducing total energy use while maintaining acceptable communication quality, which is important for energy-sensitive applications such as IoT systems and green communication scenarios. The second seeks to maximize energy efficiency by optimizing the ratio of achievable data rate to power consumption rather than lowering energy use alone, thereby improving performance per unit of energy. The third focuses on maximizing the sum rate by increasing aggregated throughput across users to strengthen overall system capacity in high-density 5G and 6G environments. The fourth prioritizes fairness-aware rate maximization by applying resource allocation methods that ensure equitable bandwidth distribution among users while sustaining high Quality of Service (QoS). Together, these optimization approaches provide a framework for advancing IRS-assisted MIMO systems and allow engineers and researchers to balance performance, energy efficiency, and user fairness according to specific application needs in next-generation wireless networks.  Results and Discussions  This review shows that IRS-assisted communication systems provide important capabilities for next-generation wireless networks through four major advantages. First, IRS strengthens system performance by reconfiguring propagation environments and improving signal strength and coverage in non-line-of-sight conditions, including urban canyons, indoor environments, and remote regions, while also maintaining reliable connectivity in high-mobility cases such as vehicular communication. Second, the technology supports high energy efficiency because of its passive operation, which adds minimal power overhead yet improves spectral efficiency.
This characteristic is valuable for sustainable large-scale IoT deployments and green 6G systems that may incorporate energy-harvesting designs. Third, IRS shows strong adaptability when integrated with different communication architectures, including SISO for basic signal enhancement, MISO for improved beamforming, and MIMO for spatial multiplexing, enabling use across environments ranging from ultra-dense urban networks to remote or airborne communication platforms. Finally, recent progress in beamforming and phase-shift optimization strengthens system performance through coherent signal combining, interference suppression in multi-user settings, and low-latency operation for time-critical applications. Machine learning methods such as deep reinforcement learning are also being investigated for real-time optimization. Together, these capabilities position IRS as a key technology for future 6G networks with the potential to support smart radio environments and broad-area connectivity, although further study is required to address challenges in channel estimation, scalability, and standardization.  Conclusions  This review highlights the potential of IRS technology in next-generation wireless communication systems. By enabling dynamic channel reconfiguration with minimal energy overhead, IRS strengthens the performance of SISO, MISO, and MIMO systems and supports reliable operation in complex propagation environments. The surveyed signal transmission models and optimization methods form a technical basis for continued development of IRS-assisted communication frameworks. As research and industry move toward 6G, IRS is expected to support ultra-reliable, low-latency, and energy-efficient global connectivity. Future studies should address practical deployment challenges such as hardware design, real-time signal processing, and progress toward standardization.
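For the IRS-assisted SISO model discussed above, phase-shift optimization has a well-known closed form: each reflecting element rotates its cascaded channel onto the phase of the direct path so that all components add coherently. A short NumPy sketch (the Rayleigh channel draws and element count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                              # IRS elements
h_d = rng.normal() + 1j * rng.normal()              # direct BS -> user channel
g = rng.normal(size=N) + 1j * rng.normal(size=N)    # BS -> IRS channels
f = rng.normal(size=N) + 1j * rng.normal(size=N)    # IRS -> user channels

# Closed-form optimal phases for SISO: align every reflected path
# g_n * exp(j*theta_n) * f_n with the direct path h_d.
theta = np.angle(h_d) - np.angle(g * f)
h_eff = h_d + np.sum(g * np.exp(1j * theta) * f)

# With coherent combining, |h_eff| = |h_d| + sum_n |g_n f_n|,
# so it can never be worse than any other phase choice.
theta_rand = rng.uniform(0, 2 * np.pi, size=N)
h_rand = h_d + np.sum(g * np.exp(1j * theta_rand) * f)
assert abs(h_eff) >= abs(h_rand)
```

In MISO/MIMO settings this closed form no longer applies directly, which is why the surveyed methods resort to joint (alternating) optimization of precoders and phase shifts.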
Defeating Voice Conversion Forgery by Active Defense with Diffusion Reconstruction
TIAN Haoyuan, CHEN Yuxuan, CHEN Beijing, FU Zhangjie
Available online  , doi: 10.11999/JEIT250709
Abstract:
  Objective  Deep voice generation technology can produce perceptually realistic speech. Although it enriches entertainment and everyday applications, it is also exploited for voice forgery, creating risks to personal privacy and social security. Existing active defense techniques serve as a major line of protection against such forgery, yet their performance remains limited in balancing defensive strength with the imperceptibility of defensive speech examples, and in maintaining robustness.  Methods  An active defense method against voice conversion forgery is proposed on the basis of diffusion reconstruction. The diffusion vocoder PriorGrad is used as the generator, and the gradual denoising process is guided by the diffusion prior of the target speech so that the protected speech is reconstructed and defensive speech examples are obtained directly. A multi-scale auditory perceptual loss is further introduced to suppress perturbation amplitudes in frequency bands sensitive to the human auditory system, which improves the imperceptibility of the defensive examples.  Results and Discussions  Defense experiments conducted on four leading voice conversion models show that the proposed method maintains the imperceptibility of defensive speech examples and, when speaker verification accuracy is used as the evaluation metric, improves defense ability by about 32% on average in white-box scenarios and about 16% in black-box scenarios compared with the second-best method, achieving a stronger balance between defense ability and imperceptibility (Table 2). In robustness experiments, the proposed method yields an average improvement of about 29% in white-box scenarios and about 18% in black-box scenarios under three compression attacks (Table 3), and an average improvement of about 35% in the white-box scenario and about 17% in the black-box scenario under Gaussian filtering attack (Table 4).
Ablation experiments further show that the use of multi-scale auditory perceptual loss improves defense ability by 5% to 10% compared with the use of single-scale auditory perceptual loss (Table 5).  Conclusions  An active defense method against voice conversion forgery based on diffusion reconstruction is proposed. Defensive speech examples are reconstructed directly through a diffusion vocoder so that the generated audio better approximates the distribution of the original target speech, and a multi-scale auditory perceptual loss is integrated to improve the imperceptibility of the defensive speech. Experimental results show that the proposed method achieves stronger defense performance than existing approaches in both white-box and black-box scenarios and remains robust under compression coding and smoothing filtering. Although the method demonstrates clear advantages in defense performance and robustness, its computational efficiency requires further improvement. Future work is directed toward diffusion generators that operate with a single time step or fewer time steps to enhance computational efficiency while maintaining defense performance.
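The "multi-scale" idea behind the auditory perceptual loss can be illustrated with a generic multi-resolution spectral loss: the same signal pair is compared at several STFT resolutions and the distances are averaged. This is a stand-in for the paper's loss, not a reproduction of it; the window sizes, hop ratio, L1 distance, and equal scale weighting are all assumptions, and the auditory-band weighting the paper applies is omitted:

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude STFT via a Hann-windowed sliding FFT."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=-1))

def multi_scale_spectral_loss(x, y, scales=(256, 512, 1024)):
    """Average L1 distance between magnitude spectra at several
    resolutions; short windows catch transient detail, long windows
    catch fine frequency structure."""
    loss = 0.0
    for n_fft in scales:
        sx = stft_mag(x, n_fft, n_fft // 4)
        sy = stft_mag(y, n_fft, n_fft // 4)
        loss += np.mean(np.abs(sx - sy))
    return loss / len(scales)
```

During training, such a loss term penalizes perturbations that are spectrally audible at any resolution, which is the mechanism behind the imperceptibility gains the ablation attributes to the multi-scale variant.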
A Polymorphic Network Backend Compiler for Domestic Switching Chips
TU Huaqing, WANG Yuanhong, XU Qi, ZHU Jun, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250132
Abstract:
  Objective  The P4 language and programmable switching chips offer a feasible approach for deploying polymorphic networks. However, polymorphic network packets written in P4 cannot be directly executed on the domestically produced TsingMa.MX programmable switching chip developed by Centec, which necessitates the design of a specialized compiler to translate and deploy the P4 language on this chip. Existing backend compilers are mainly designed and optimized for software-programmable switches such as BMv2, FPGAs, and Intel Tofino series chips, rendering them unsuitable for compiling polymorphic network programs for the TsingMa.MX chip. To resolve this limitation, a backend compiler named p4c-TsingMa is proposed for the TsingMa.MX switching chip. This compiler enables the translation of high-level network programming languages into executable formats for the TsingMa.MX chip, thereby supporting the concurrent parsing and forwarding of multiple network modal packets.  Methods  p4c-TsingMa first employs a preorder traversal approach to extract key information, including protocol types, protocol fields, and actions, from the Intermediate Representation (IR). It then performs instruction translation to generate corresponding control commands for the TsingMa.MX chip. Additionally, p4c-TsingMa adopts a User Defined Field (UDF) entry merging method to consolidate matching instructions from different network modalities into a unified lookup table. This design enables the extraction of multiple modal matching entries in a single operation, thereby enhancing chip resource utilization.  Results and Discussions  The p4c-TsingMa compiler is implemented in C++, mapping network modal programs written in the P4 language into configuration instructions for the TsingMa.MX switching chip. A polymorphic network packet testing environment (Fig. 7) is established, where multiple types of network data packets are simultaneously transmitted to the same switch port. 
According to the configured flow tables, the chip successfully identifies polymorphic network data packets and forwards them to their corresponding ports (Fig. 8). Additionally, the table entry merging algorithm improves register resource utilization by 37.5% to 75%, enabling the chip to process more than two types of modal data packets concurrently.  Conclusions  A polymorphic network backend compiler, p4c-TsingMa, is designed specifically for domestic switching chips. By utilizing the FlexParser and FlexEdit functions of the TsingMa chip, the compiler translates polymorphic network programs into executable commands for the TsingMa.MX chip, enabling the chip to parse and modify polymorphic data packets. Experimental results demonstrate that p4c-TsingMa achieves high compilation efficiency and improves register resource utilization by 37.5% to 75%.
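The UDF entry merging step described above can be pictured as consolidating per-modality match tables into one unified table keyed by a modality tag, so one lookup serves all modalities. This is a conceptual sketch only; the data shapes are invented for illustration and do not reflect the TsingMa.MX chip's actual table format:

```python
def merge_udf_entries(modal_tables):
    """Consolidate matching entries from several network modalities
    into a single lookup table.
    modal_tables: {modality: {match_key: action}}
    returns:      {(modality, match_key): action}"""
    unified = {}
    for modality, table in modal_tables.items():
        for key, action in table.items():
            unified[(modality, key)] = action
    return unified

def lookup(unified, modality, key):
    """One table access resolves any modality's match entry."""
    return unified.get((modality, key))
```

Sharing one table across modalities is what lets the compiler extract multiple modal match entries in a single operation, which is the source of the register-utilization gains reported in the abstract.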
The Storage and Calculation of Biological-like Neural Networks for Locally Active Memristor Circuits
LI Fupeng, WANG Guangyi, LIU Jingbiao, YING Jiajie
Available online  , doi: 10.11999/JEIT250631
Abstract:
  Objective  At present, binaryBinary computing systems have encounteredreached bottlenecks in terms of power consumption, operation speed, and storage capacity. In contrast, the biological nervous system seems to have unlimited capacity. The biological nervous system has significant advantages indemonstrates remarkable low-power computing and dynamic storage capability, which is closely relatedattributed to the working mechanism of neurons transmitting neural signals through the directional secretion of neurotransmitters. After analyzing the Hodgkin-–Huxley model of the squid giant axon, Professor Leon Chua proposed that synapses could be composed of locally passive memristors, and neurons could be made up of locally active memristors. The two types of memristors share similar , both exhibiting electrical characteristics with similar to those of nerve fibers. Since the first experimental claim of memristors was claimed to be found, locally active memristive devices have been identified in the research of devices with layered structures. The, and circuits constructed from thosethese devices exhibit differentdisplay various types of neuromorphic _dynamics under different excitations,. However, ano single two-terminal device capable of achieving multi-state storage has not yet been reported. Locally active memristors have advantages inare advantageous for generating biologically -inspired neural signals. Various forms of locally active memristor, as their models can produce neural morphological signals based on spike pulses. The generation of neuralsuch signals involvesis achieved through the amplification and computation of stimulus signals, and its working mechanism can beexternal stimuli, a process realized using capacitance-controlled memristor oscillators. 
When a memristor operates in the locally active domiandomain, the output voltage of its third-order circuit undergoes a period-doubling bifurcation as the capacitance in the circuit changes regularly, forming a multi-state mapping between capacitance values and oscillating voltages. In this paper, the localwork, a locally active memristor-based third-order circuitiscircuit is used as a unit to generate neuromorphic signals, thereby forming a biologically -inspired neural operation unit, and an. An operation network can be formed based on the operation unitconstructed from such units, providing a framework for storage and computation in biological-like neural networks.  Methods  The mathematical model of the Chua Corsage Memristor proposed by Leon Chua wasis selected for analysis. The characteristics of the partial locallocally active domain wereare examined, and an appropriate operating point and together with external components were chosenis determined to establish a third-order memristor chaotic circuit. Circuit simulation and analysis were then conductedare performed on this circuit. When the memristor operates in the locally active domain, the oscillator formed by its third-order circuit can simultaneously perform the functions ofperforms signal amplification, computation, and storage. In this wayconfiguration, the third-order circuit can be performis regarded as thea nerve cell, and the variable capacitors are treated as cynapses. Enables the synapses. The electrical signal and the dielectric capacitor to workoperate in succession, allowingenabling the third-order oscillation circuit of the memristor to functionbehave like a neuron, withwhere alternating electricalelectric fields and neurotransmitters formingcapacitive dynamics mimic neurotransmitter-mediated processes to form a brain-like computing and storage system. 
The secretion of biological neurotransmitters is characterized by a threshold, with the membrane threshold voltage regulating the release of neurotransmitters to the postsynaptic membrane, thereby transmitting neural signals. Analogously, the step peak value of the oscillation circuit functions as the trigger voltage for the transfer of capacitive charge.  Results and Discussions  This study utilizes the third-order circuit of a locally active memristor to generate stable voltage oscillations exhibiting period-doubling bifurcation as the external capacitance varies. Changes in capacitance in the circuit lead to different forms of electrical signals being sequentially output at the terminals of the memristor, with the voltage amplitude of these signals exhibiting stable periodic variation. This establishes a stable multi-state mapping between capacitance values and output voltage signals, thereby forming the basis of a storage and computing unit and, subsequently, a storage and computing network. At present, it remains necessary to develop a structure that allows the dielectric to transfer to the next stage and the capacitance value to be adjusted under the control of a modulated voltage threshold, analogous to the function of neurotransmitter secretion in biological systems. These results demonstrate the feasibility of using the third-order oscillation circuit of a memristor as a storage and computing unit and highlight the potential for constructing a storage and computing architecture based on capacitance variation.  
Conclusions  When the Chua Corsage Memristor operates in its locally active domain, its third-order circuit, powered solely by a voltage-stabilized source, generates stable period-doubling bifurcation oscillations as the external capacitance varies. The sequentially output oscillating signals display stable voltage amplitudes and periods, with threshold characteristics. Variations in capacitance in the circuit induce different forms of electrical signals to be serially output at the terminals of the memristor, with voltage amplitudes changing periodically and stably. This establishes a stable multi-state mapping between capacitance values and output voltages, thereby forming a storage and computing unit and, by extension, a storage and computing network. A structure that enables dielectric transfer and capacitance adjustment of the next stage under the control of a modulated voltage threshold, similar to the function of neurotransmitter secretion, still needs to be developed. The findings demonstrate the feasibility of using the third-order oscillation circuit of a memristor as a storage and computing unit and describe a potential storage and computing architecture based on capacitance variation.
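The capacitance-to-output multi-state mapping described in this abstract can be illustrated with any classic period-doubling system. The following sketch uses the logistic map as a stand-in for the memristor's third-order oscillator (the map, its parameter values, and the role of the control parameter as a "capacitance setting" are illustrative assumptions, not the circuit's actual equations): each control value is mapped to the number of distinct stable output levels it produces.

```python
def steady_states(r, x0=0.4, n_transient=2000, n_sample=400, decimals=4):
    """Iterate the map x -> r*x*(1-x), discard the transient, and return
    the sorted distinct steady-state values (rounded for comparison)."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1 - x)
    states = set()
    for _ in range(n_sample):
        x = r * x * (1 - x)
        states.add(round(x, decimals))
    return sorted(states)

# Map each control value (standing in for a capacitance setting) to the
# number of distinct stable output levels: 1 state, a 2-cycle, a 4-cycle.
state_table = {r: len(steady_states(r)) for r in (2.5, 3.2, 3.5)}
```

As the control parameter sweeps through the period-doubling cascade, the number of distinct output levels grows 1, 2, 4, ..., which is the kind of multi-state lookup the abstract's capacitance-to-voltage mapping relies on.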
Research on UAV Swarm Radiation Source Localization Method Based on Dynamic Formation Optimization
WU Sujie, WU Binbin, YANG Ning, WANG Heng, GUO Daoxing, GU Chuan
Available online  , doi: 10.11999/JEIT251023
Abstract:
In dense and structurally complex urban environments, UAV swarm–based radiation source localization is often degraded by signal attenuation, multipath propagation, and building obstructions. To overcome these limitations, this paper proposes a dynamic formation–adjustment localization method for UAV swarms. By optimizing the geometric configuration of the swarm, the method reduces path loss and interference, thereby enhancing localization accuracy. Received signal strength is used to assess signal quality in real time, enabling adaptive formation adjustments that improve signal propagation. Moreover, geometric dilution of precision and root mean square error metrics are integrated to further refine swarm geometry and improve distance estimation reliability. Experimental results show that the proposed method converges faster and significantly improves localization accuracy in complex urban environments, reducing errors by over 80%. The method also adapts effectively to environmental variations, demonstrating strong robustness and practical applicability.  Objective  UAV swarm localization and formation control in urban environments are challenged by obstacles, signal attenuation, and rapidly changing conditions, which limit the reliability of traditional methods. To overcome these issues, this study proposes a radiation source localization approach that integrates Received Signal Strength Indicator–based sensing with dynamic formation adjustment, aiming to improve localization accuracy and enhance system robustness in complex urban scenarios.  Methods  The proposed method uses Received Signal Strength Indicator measurements to estimate the distance to the radiation source and dynamically adjusts the UAV swarm formation to reduce localization errors. These adjustments are driven by real-time feedback incorporating relative positions, signal strength, and environmental variations. 
Localization accuracy is further enhanced through a multi-sensor fusion strategy that integrates GPS, IMU, and depth camera data. A data-quality assessment mechanism evaluates signal reliability and triggers formation adaptation when the signal falls below a predefined threshold. Overall, the optimization process minimizes positioning errors and improves the robustness of the localization system.  Results and Discussions  Simulation experiments in a ROS-based environment were conducted to evaluate the proposed UAV swarm localization method under urban obstacles and multipath conditions. The swarm began in a hexagonal formation and dynamically adjusted its geometry according to environmental changes and localization confidence (Figs. 3 and 4). As shown in Fig. 5, localization errors fluctuated during initialization but quickly converged to below 1 m after 150 s. Formation comparisons (Fig. 6) showed that symmetric structures such as hexagonal and triangular formations maintained errors under 0.5 m, while asymmetric formations (T- and Y-shaped) produced deviations up to 4.9 m. Further comparisons (Fig. 7) indicated that traditional RSSI saturated near 15 m, DOA fluctuated between 5 and 14 m, and TDOA failed due to synchronization issues, whereas the proposed method achieved sub-meter accuracy within 60 s and remained robust throughout the mission. These results demonstrate that combining RSSI-based distance estimation with dynamic formation adjustment substantially improves localization accuracy, convergence speed, and adaptability in complex environments.  Conclusions  This paper tackles the challenge of UAV swarm localization in complex urban environments by integrating RSSI-based distance estimation with dynamic formation adjustment and multi-sensor fusion. 
ROS-based simulations validate the effectiveness of the proposed method, showing that: (1) localization errors converge rapidly to sub-meter levels, reaching below 1 m within 150 s even in NLoS conditions; (2) symmetric formations, such as hexagonal and triangular structures, outperform asymmetric ones, reducing errors by up to 67% compared with fixed Y-shaped formations; and (3) relative to traditional RSSI, DOA, and TDOA approaches, the proposed method achieves faster convergence, higher stability, and greater robustness.
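The RSSI-to-distance step and the position fix described in this abstract follow a standard pattern; a minimal sketch is given below, assuming the common log-distance path-loss model and a linearized least-squares multilateration over fixed 2-D anchor positions (the path-loss parameters and anchor layout are illustrative placeholders, not the paper's values).

```python
import math

def rssi_to_distance(rssi, rssi_d0=-40.0, d0=1.0, path_loss_exp=2.7):
    """Invert the log-distance path-loss model:
    RSSI(d) = RSSI(d0) - 10*n*log10(d/d0)  (illustrative parameters)."""
    return d0 * 10 ** ((rssi_d0 - rssi) / (10 * path_loss_exp))

def multilaterate(anchors, dists):
    """Linearized least-squares 2-D position fix from >= 3 anchor ranges.
    Subtracting the first range equation removes the quadratic terms."""
    (x0, y0), d_0 = anchors[0], dists[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], dists[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(xi**2 - x0**2 + yi**2 - y0**2 + d_0**2 - di**2)
    # Solve the 2x2 normal equations A^T A x = A^T b directly
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

In the paper's setting the anchors are the UAVs themselves, so reshaping the formation changes the geometry of this system of equations, which is exactly what the GDOP-driven adjustment exploits.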
Short Packet Secure Covert Communication Design and Optimization
TIAN Bo, YANG Weiwei, SHA Li, SHANG Zhihui, CAO Kuo, LIU Changming
Available online  , doi: 10.11999/JEIT250800
Abstract:
  Objective  The study addresses the dual security threats of eavesdropping and detection in Multiple-Input Single-Output (MISO) communication systems under short packet transmission conditions. An integrated secure and covert transmission scheme is proposed, combining physical layer security with covert communication techniques. The approach aims to overcome the limitations of conventional encryption in short packet scenarios, enhance communication concealment, and ensure information confidentiality. The optimization objective is to maximize the Average Effective Secrecy and Covert Rate (AESCR) through the joint optimization of packet length and transmit power, thereby providing robust security for low-latency Internet of Things (IoT) applications.  Methods  An MISO system model employing MRT beamforming is adopted to exploit spatial degrees of freedom for improved security. Through theoretical analysis, closed-form expressions are derived for the warden’s (Willie’s) optimal detection threshold and minimum detection error probability. A statistical covertness constraint based on Kullback–Leibler (KL) divergence is formulated to convert intractable instantaneous requirements into a tractable average constraint. A new performance metric, the AESCR, is proposed to comprehensively assess system performance in terms of covertness, secrecy, and reliability. The optimization strategy centers on the joint design of packet length and transmit power. By utilizing the inherent coupling between these variables, the original dual-variable maximization problem is reformulated into a tractable form solvable through an efficient one-dimensional search.  Results and Discussions   Simulation results confirm the theoretical analysis, showing close consistency between the derived expressions and Monte Carlo simulations for Willie’s detection error probability. 
The findings indicate that multi-antenna configurations markedly enhance the AESCR by directing signal energy toward the legitimate receiver and reducing eavesdropping risk. The proposed joint optimization of transmit power and packet length achieves a substantially higher AESCR than power-only optimization, particularly under stringent covertness constraints. The study further reveals key trade-offs: an optimal packet length exists that balances coding gain and exposure risk, while relaxed covertness constraints yield continuous improvements in AESCR. Moreover, multi-antenna technology is shown to be crucial for mitigating the inherent low-power limitations of covert communication.  Conclusions  This study presents an integrated framework for secure and covert communication in short packet MISO systems, achieving notable performance gains through the joint optimization of transmit power and packet length. The main contributions include: (1) a transmission architecture that combines security and covertness, supported by closed-form solutions for the warden’s detection threshold and error probability under a KL divergence-based constraint; (2) the introduction of the AESCR metric, which unifies the assessment of secrecy, covertness, and reliability; and (3) the formulation and efficient resolution of the AESCR maximization problem. Simulation results verify that the proposed joint optimization strategy outperforms power-only optimization, particularly under stringent covertness conditions. The AESCR increases monotonically with the number of transmit antennas, and an optimal packet length is identified that balances transmission efficiency and covertness.
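The one-dimensional search over packet length described in this abstract can be sketched with the standard normal approximation for short-packet decoding error (the 0.5*log2(n) correction term is omitted, and effective throughput (k/n)*P(success) stands in as a simplified surrogate for the paper's AESCR metric; k, the SNR, and the search range are illustrative assumptions).

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def block_error_prob(n, k, snr):
    """Normal approximation to the short-packet block error rate for k
    information bits in n channel uses (0.5*log2(n) term omitted)."""
    cap = math.log2(1 + snr)                              # bits/channel use
    disp = (1 - (1 + snr) ** -2) * math.log2(math.e) ** 2 # dispersion, bits^2
    return q_func((n * cap - k) / math.sqrt(n * disp))

def best_blocklength(k, snr, candidates):
    """One-dimensional search: pick the blocklength maximizing
    effective throughput = (k/n) * P(successful decoding)."""
    return max(candidates,
               key=lambda n: (k / n) * (1 - block_error_prob(n, k, snr)))
```

The interior optimum reflects the trade-off the abstract names: short packets suffer a coding-gain penalty, while long packets dilute the rate (and, in the covert setting, lengthen exposure).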
Adaptive Cache Deployment Based on Congestion Awareness and Content Value in LEO Satellite Networks
LIU Zhongyu, XIE Yaqin, ZHANG Yu, ZHU Jianyue
Available online  , doi: 10.11999/JEIT250670
Abstract:
  Objective  Low Earth Orbit (LEO) satellite networks are central to future space–air–ground integrated systems, offering global coverage and low-latency communication. However, their high-speed mobility leads to rapidly changing topologies, and strict onboard cache constraints hinder efficient content delivery. Existing caching strategies often overlook real-time network congestion and content attributes (e.g., freshness), which leads to inefficient resource use and degraded Quality of Service (QoS). To address these limitations, we propose an adaptive cache placement strategy based on congestion awareness. The strategy dynamically couples real-time network conditions, including link congestion and latency, with a content value assessment model that incorporates both popularity and freshness. This integrated approach enhances cache hit rates, reduces backhaul load, and improves user QoS in highly dynamic LEO satellite environments, enabling efficient content delivery even under fluctuating traffic demands and resource constraints.  Methods  The proposed strategy combines a dual-threshold congestion detection mechanism with a multi-dimensional content valuation model. It proceeds in three steps. First, satellite nodes monitor link congestion in real time using dual latency thresholds and relay congestion status to downstream nodes through data packets. Second, a two-dimensional content value model is constructed that integrates popularity and freshness. Popularity is updated dynamically using an Exponential Weighted Moving Average (EWMA), which balances historical and recent request patterns to capture temporal variations in demand. Freshness is evaluated according to the remaining data lifetime, ensuring that expired or near-expired content is deprioritized to maintain cache efficiency and relevance. Third, caching thresholds are adaptively adjusted according to congestion level, and a hop count control factor is introduced to guide caching decisions. 
This coordinated mechanism enables the system to prioritize high-value content while mitigating congestion, thereby improving overall responsiveness and user QoS.  Results and Discussions  Simulations conducted on ndnSIM demonstrate the superiority of the proposed strategy over PaCC (Popularity-Aware Closeness-based Caching), LCE (Leave Copy Everywhere), LCD (Leave Copy Down), and Prob (probability-based caching with probability = 0.5). The key findings are as follows. (1) Cache hit rate. The proposed strategy consistently outperforms conventional methods. As shown in Fig. 8, the cache hit rate rises markedly with increasing cache capacity and Zipf parameter, exceeding those of LCE, LCD, and Prob. Specifically, the proposed strategy achieves improvements of 43.7% over LCE, 25.3% over LCD, 17.6% over Prob, and 9.5% over PaCC. Under high content concentration (i.e., larger Zipf parameters), the improvement reaches 29.1% compared with LCE, highlighting the strong capability of the strategy in promoting high-value content distribution. (2) Average routing hop ratio. The proposed strategy also reduces routing hops compared with the baselines. As shown in Fig. 9, the average hop ratio decreases as cache capacity and Zipf parameter increase. Relative to PaCC, the proposed strategy lowers the average hop ratio by 2.24%, indicating that content is cached closer to users, thereby shortening request paths and improving routing efficiency. (3) Average request latency. The proposed strategy achieves consistently lower latency than all baseline methods. As summarized in Table 2 and Fig. 10, the reduction is more pronounced under larger cache capacities and higher Zipf parameters. For instance, with a cache capacity of 100 MB, latency decreases by approximately 2.9%, 5.8%, 9.0%, and 10.3% compared with PaCC, Prob, LCD, and LCE, respectively. When the Zipf parameter is 1.0, latency reductions reach 2.7%, 5.7%, 7.2%, and 8.8% relative to PaCC, Prob, LCD, and LCE, respectively. 
Concretely, under a cache capacity of 100 MB and Zipf parameter of 1.0, the average request latency of the proposed strategy is 212.37 ms, compared with 236.67 ms (LCE), 233.45 ms (LCD), 225.42 ms (Prob), and 218.62 ms (PaCC).  Conclusions  This paper presents a congestion-aware adaptive caching placement strategy for LEO satellite networks. By combining real-time congestion monitoring with multi-dimensional content valuation that considers both dynamic popularity and freshness, the strategy achieves balanced improvements in caching efficiency and network stability. Simulation results show that the proposed method markedly enhances cache hit rates, reduces average routing hops, and lowers request latency compared with existing schemes such as PaCC, Prob, LCD, and LCE. These benefits hold across different cache sizes and request distributions, particularly under resource-constrained or highly dynamic conditions, confirming the strategy’s adaptability to LEO environments. The main innovations include a closed-loop feedback mechanism for congestion status, dynamic adjustment of caching thresholds, and hop-aware content placement, which together improve resource utilization and user QoS. This work provides a lightweight and robust foundation for high-performance content delivery in satellite–terrestrial integrated networks. Future extensions will incorporate service-type differentiation (e.g., delay-sensitive vs. bandwidth-intensive services), and orbital prediction to proactively optimize cache migration and updates, further enhancing efficiency and adaptability in 6G-enabled LEO networks.
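The three-step valuation and admission logic described in this abstract can be sketched as follows. The EWMA update and the remaining-lifetime freshness term follow the abstract directly; the blending weights, smoothing factor, and threshold scaling are illustrative assumptions, not the paper's tuned values.

```python
def update_popularity(prev_pop, recent_requests, alpha=0.3):
    """EWMA popularity update: alpha weights recent demand against history
    (alpha = 0.3 is an illustrative choice)."""
    return alpha * recent_requests + (1 - alpha) * prev_pop

def content_value(popularity, remaining_lifetime, total_lifetime, w_pop=0.7):
    """Two-dimensional value: popularity blended with remaining-lifetime
    freshness, so near-expired content is deprioritized."""
    freshness = max(0.0, remaining_lifetime / total_lifetime)
    return w_pop * popularity + (1 - w_pop) * freshness

def should_cache(value, congestion_level, base_threshold=0.5):
    """Adaptive admission test: the caching threshold rises as the
    measured congestion level grows."""
    return value >= base_threshold * (1 + congestion_level)
```

Under congestion the admission bar rises, so only high-value (popular and fresh) content is cached, which matches the abstract's goal of prioritizing high-value content while mitigating congestion.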
Research on Non-cooperative Interference Suppression Technology for Dual Antennas without Channel Prior Information
YAN Cheng, LI Tong, PAN Wensheng, DUAN Baiyu, SHAO Shihai
Available online  , doi: 10.11999/JEIT250378
Abstract:
  Objective  In electronic countermeasures, friendly communication links are vulnerable to interference from adversaries. The auxiliary antenna scheme is employed to extract reference signals for interference cancellation, which improves communication quality. Although the auxiliary antenna is designed to capture interference signals, it often receives communication signals at the same time, and this reduces the suppression capability. Typical approaches for non-cooperative interference suppression include interference rejection combining and spatial domain adaptive filtering. These approaches rely on the uncorrelated nature of the interference and desired signals to achieve suppression. They also require channel information and interference noise information, which restricts their applicability in some scenarios.  Methods  This paper proposes the Fast ICA-based Simulated Annealing Algorithm for SINR Maximization (FSA) to address non-cooperative interference suppression in communication systems. Designed for scenarios without prior channel information, FSA applies a weighted reconstruction cancellation method implemented through a Finite Impulse Response (FIR) filter. The method operates in a dual-antenna system in which one antenna supports communication and the other provides an auxiliary reference for interference. Its central innovation is the optimization of weighted reconstruction coefficients using the Simulated Annealing algorithm, together with Fast Independent Component Analysis (Fast ICA) for SINR estimation. The FIR filter reconstructs interference from the auxiliary antenna signal using optimized coefficients and then subtracts this reconstructed interference from the main received signal to improve communication quality. Accurate SINR estimation in non-cooperative settings is difficult because the received signals contain mixed components. 
FSA addresses this through blind source separation based on Fast ICA, which extracts sample signals of both communication and interference components. SINR is then calculated from cross-correlation results between these separated signals and the signals after interference suppression. The Simulated Annealing algorithm functions as a probabilistic optimization process that adjusts reconstruction coefficients to maximize the output SINR. Starting from initial coefficients, the algorithm perturbs them and evaluates the resulting SINR. Using the Monte Carlo acceptance rule, it allows occasional acceptance of perturbations that do not yield immediate improvement, which supports escape from local optima and promotes convergence toward global solutions. This iterative process identifies optimal filter coefficients within the search range. The combined use of Fast ICA and Simulated Annealing enables interference suppression without prior channel information. By pairing blind estimation with robust optimization, the method provides reliable performance in dynamic interference environments. The FIR-based structure offers a practical basis for real-time interference cancellation. FSA is therefore suitable for electronic countermeasure applications where channel conditions are unknown and change rapidly. This approach advances beyond conventional techniques that require channel state information and offers improved adaptability in non-cooperative scenarios while maintaining computational efficiency through the combined use of blind source separation and intelligent optimization.  Results and Discussions  The performance of the proposed FSA is assessed through simulations and experiments. The output SINR is improved under varied conditions. In simulations, a maximum SINR improvement of 27.2 dB is achieved when the communication and auxiliary antennas have a large SINR difference and are placed farther apart (Fig. 5). 
The performance is reduced when the channel correlation between the antennas increases. Experimental results confirm these observations, and an SINR improvement of 19.6 dB is measured at a 2 m antenna separation (Fig. 7). The method is shown to be effective for non-cooperative interference suppression without prior channel information, although its performance is affected by antenna configuration and channel correlation.  Conclusions  The proposed FSA method provides an effective solution for non-cooperative interference suppression in communication systems. The method applies weighted reconstruction cancellation optimized by the Simulated Annealing algorithm and uses Fast ICA-based SINR estimation to improve communication quality without prior channel information. The results from simulations and experiments show that the method performs well across varied conditions and has potential for practical electronic warfare applications. The study finds that the performance of the FSA method depends on the SINR difference and the channel correlation between the communication and auxiliary antennas. Future research will focus on refining the algorithm for more complex scenarios and examining the effect of system parameters on its performance. These findings support the development of communication systems that operate reliably in challenging interference environments.
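The simulated-annealing loop over FIR reconstruction coefficients described in this abstract can be sketched as below. For scoring, this sketch assumes the desired signal is known; in the paper's non-cooperative setting that role is played by Fast ICA separation, which is omitted here. The tap count, step size, cooling schedule, and test signals are all illustrative assumptions.

```python
import math
import random

def cancel(main, aux, w):
    """FIR reconstruction-and-subtraction:
    y[k] = main[k] - sum_i w[i] * aux[k - i]."""
    return [m - sum(w[i] * aux[k - i] for i in range(len(w)) if k - i >= 0)
            for k, m in enumerate(main)]

def anneal_fir(main, aux, desired, taps=2, iters=3000, t0=1.0, seed=1):
    """Simulated annealing over the FIR taps, maximizing output SINR (dB)."""
    rng = random.Random(seed)

    def sinr_db(w):
        y = cancel(main, aux, w)
        p_sig = sum(s * s for s in desired)
        p_res = sum((yi - si) ** 2 for yi, si in zip(y, desired)) + 1e-12
        return 10 * math.log10(p_sig / p_res)

    w = [0.0] * taps
    cur = sinr_db(w)
    best_w, best = w[:], cur
    temp = t0
    for _ in range(iters):
        cand = [wi + rng.gauss(0.0, 0.1) for wi in w]  # perturb coefficients
        sc = sinr_db(cand)
        # Monte Carlo acceptance rule: occasionally accept worse moves
        if sc > cur or rng.random() < math.exp((sc - cur) / max(temp, 1e-9)):
            w, cur = cand, sc
            if sc > best:
                best_w, best = cand[:], sc
        temp *= 0.999
    return best_w, best
```

With a synthetic mixture main = s + 0.8*j + 0.3*j(delayed), the annealer recovers taps near (0.8, 0.3) and pushes the output SINR well above the uncancelled baseline.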
Design and Optimization for Orbital Angular Momentum-based Wireless-powered NOMA Communication System
CHEN Ruirui, CHEN Yu, RAN Jiale, SUN Yanjing, LI Song
Available online  , doi: 10.11999/JEIT250634
Abstract:
  Objective  The Internet of Things (IoT) requires not only interconnection among devices but also seamless connectivity among users, information, and things. Ensuring stable operation and extending the lifespan of IoT Devices (IDs) through continuous power supply have become urgent challenges in IoT-driven Sixth-Generation (6G) communications. Radio Frequency (RF) signals can simultaneously transmit information and energy, forming the basis for Simultaneous Wireless Information and Power Transfer (SWIPT). Non-Orthogonal Multiple Access (NOMA), a key technology in Fifth-Generation (5G) communications, enables multiple users to share the same time and frequency resources. Efficient wireless-powered NOMA communication requires a Line-of-Sight (LoS) channel. However, the strong correlation in LoS channels severely limits the degree of freedom, making it difficult for conventional spatial multiplexing to achieve capacity gains. To address this limitation, this study designs an Orbital Angular Momentum (OAM)-based wireless-powered NOMA communication system. By exploiting OAM mode multiplexing, multiple data streams can be transmitted independently through orthogonal OAM modes, thereby significantly enhancing communication capacity in LoS channels.  Methods  The OAM-based wireless-powered NOMA communication system is designed to enable simultaneous energy transfer and multi-channel information transmission for IDs under LoS conditions. Under the constraints of the communication capacity threshold and the harvested energy threshold, this study formulates a sum-capacity maximization problem by converting harvested energy into the achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal Power-Splitting (PS) factor is derived, and the optimal power allocation is obtained using the subgradient method. 
The transmitting Uniform Circular Array (UCA) employs the Movable Antenna (MA) technique to adjust both position and array angle. To maintain system performance under typical parallel misalignment conditions, a beam-steering method is investigated.  Results and Discussions  Simulation results demonstrate that the proposed OAM-based wireless-powered NOMA communication system effectively enhances capacity performance compared with conventional wireless communication systems. As the OAM mode increases, the sum capacity of the ID decreases. This occurs because higher OAM modes exhibit stronger hollow divergence characteristics, resulting in greater energy attenuation of the received OAM signals (Fig. 3). The sum capacity of the ID increases with the PS factor (Fig. 4). However, as the harvested energy threshold increases, the system’s sum capacity decreases (Fig. 5). When the communication capacity threshold increases, the sum capacity first rises and then gradually declines (Fig. 6). In power allocation optimization, allocating more power to the ID with the best channel condition further improves the total system capacity.  Conclusions  To enhance communication capacity under LoS conditions, this study designs an OAM-based wireless-powered NOMA communication system that employs mode multiplexing to achieve independent multi-channel information transmission. On this basis, a sum-capacity maximization problem is formulated under communication capacity and harvested energy threshold constraints by transforming harvested energy into achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal PS factor is derived, and the optimal power allocation is obtained using the subgradient method. 
In future work, the MA technique will be integrated into the proposed OAM-based wireless-powered NOMA system to further optimize sum-capacity performance based on the three-dimensional spatial configuration and adjustable array angle.
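The subgradient step described in this abstract can be sketched on a simplified version of the power-allocation subproblem: maximizing sum log2(1 + p_i*g_i) under a total-power budget (the paper's full problem also carries capacity and harvested-energy thresholds and the PS factor, which are omitted here; the gains, step size, and iteration count are illustrative assumptions).

```python
import math

def waterfill_subgradient(gains, p_total, iters=4000, step0=0.05):
    """Dual subgradient method for max sum log2(1 + p_i*g_i)
    subject to sum p_i <= p_total. For a fixed multiplier lam the inner
    problem has the closed form p_i = max(0, 1/(lam*ln 2) - 1/g_i);
    lam moves along the constraint-violation subgradient with a
    diminishing step, converging to the water-filling allocation."""
    lam = 1.0
    p = [0.0] * len(gains)
    for t in range(1, iters + 1):
        p = [max(0.0, 1.0 / (lam * math.log(2)) - 1.0 / g) for g in gains]
        lam = max(1e-9, lam + (step0 / math.sqrt(t)) * (sum(p) - p_total))
    return p
```

Consistent with the abstract's observation on power allocation, the stream with the best channel gain receives the larger share of the budget.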
Dynamic Target Localization Method Based on Optical Quantum Transmission Distance Matrix Constructing
ZHOU Mu, WANG Min, CAO Jingyang, HE Wei
Available online  , doi: 10.11999/JEIT250020
Abstract:
  Objective  Quantum information research has grown rapidly with the integration of quantum mechanics, information science, and computer science. Grounded in principles such as quantum superposition and quantum entanglement, quantum information technology can overcome the limitations of traditional approaches and address problems that classical information technologies and conventional computers cannot resolve. As a core technology, space-based quantum information technology has advanced quickly, offering new possibilities to overcome the performance bottlenecks of conventional positioning systems. However, existing quantum positioning methods mainly focus on stationary targets and have difficulty addressing the dynamic variations in the transmission channels of entangled photon pairs caused by particles, scatterers, and noise photons in the environment. These factors hinder the detection of moving targets and increase positioning errors because of reduced data acquisition at fixed points during target motion. Traditional wireless signal-based localization methods also face challenges in dynamic target tracking, including signal attenuation, multipath effects, and noise interference in complex environments. To address these limitations, a dynamic target localization method based on constructing an optical quantum transmission distance matrix is proposed. This method achieves high-precision and robust dynamic localization, meeting the requirements for moving target localization in practical scenarios. It provides centimeter-level positioning accuracy and significantly enhances the adaptability and stability of the system for moving targets, supporting the future practical application of quantum-based dynamic localization technology.  
Methods  To improve the accuracy of the dynamic target localization system, a dynamic threshold optical quantum detection model based on background noise estimation is proposed, utilizing the characteristics of optical quantum echo signals. A dynamic target localization optical path is established in which two entangled optical signals are generated through the Spontaneous Parametric Down-Conversion (SPDC) process. One signal is retained as a reference in a local Single-Photon Detector (SPD), and the other is transmitted toward the moving target as the signal light. The optical quantum echo signals are analyzed, and the background noise is estimated using a coincidence counting algorithm. The detection threshold is then dynamically adjusted and compared with the signals from the detection unit, enabling rapid detection of dynamic targets. To accommodate variations in quantum echo signals caused by target motion, an adaptive optical quantum grouping method based on velocity measurement is introduced. The time pulse sequence is initially coarsely grouped to calculate the rough velocity of the target. The grouping size is subsequently adjusted according to the target’s speed, updating the time grouping sequence and further optimizing the distance measurement accuracy to generate an updated velocity matrix. The photon transmission distance matrix is refined using the relative velocity error matrix. By constructing a system of equations involving the coordinates of the light source, the optical quantum transmission distance matrix, and the dynamic target coordinate sequence, the target position is estimated through the least squares method. This approach improves localization accuracy and effectively reduces errors arising from target motion.  Results and Discussions  The effectiveness of the proposed method is verified through both simulations and experimental validation on a practical measurement platform. 
The experimental results demonstrate that the dynamic threshold detection approach based on background noise estimation achieves high-sensitivity detection performance (Fig. 7). When a moving target enters the detection range, rapid identification is realized, enabling subsequent dynamic localization. The adaptive grouping method based on velocity measurement significantly improves the performance of the quantum dynamic target localization system. Through grouped coincidence counting, the problem of blurred coincidence counting peaks caused by target movement is effectively mitigated (Fig. 8), achieving high-precision velocity measurement (Table 1) and reducing localization errors associated with motion. Centimeter-level positioning accuracy is attained (Fig. 9). Furthermore, an entangled optical quantum experimental platform is established, with analyses focusing on measurement results under different velocities and localization performances across various methods. The findings confirm the reliability and adaptability of the proposed approach in improving distance measurement accuracy (Fig. 12).  Conclusions  A novel method for dynamic target localization in entangled optical quantum dynamics is proposed based on constructing an optical quantum transmission distance matrix. The method enhances distance measurement accuracy and optimizes the overall positioning accuracy of the localization system through a background noise estimation-based dynamic threshold detection model and a velocity measurement-based adaptive grouping approach. By integrating the optical quantum transmission distance matrix with the least squares optimization method, the proposed framework offers a promising direction for achieving more precise quantum localization systems and demonstrates strong potential for real-time dynamic target tracking. 
This approach not only improves the accuracy of dynamic quantum localization systems but also broadens the applicability of quantum localization technology in complex environments. It is expected to provide solid support for real-time quantum dynamic target localization and find applications in intelligent health monitoring, the Internet of Things, and autonomous driving.
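The final estimation step described in the Methods, solving for the target position from the node coordinates and the optical quantum transmission distance matrix, can be sketched with the standard trilateration linearization. This is a hypothetical Python illustration; the function name, array shapes, and noise handling are assumptions, not the authors' implementation.

```python
import numpy as np

def locate_least_squares(anchors, dists):
    """Estimate a target position from known node positions and measured
    photon transmission distances (hypothetical sketch of the final
    least-squares step; not the authors' code).

    anchors : (n, 3) known positions of the transceiver nodes
    dists   : (n,)   measured optical transmission distances
    """
    anchors = np.asarray(anchors, dtype=float)
    dists = np.asarray(dists, dtype=float)
    a0, d0 = anchors[0], dists[0]
    # Subtracting the first range equation cancels the quadratic |x|^2 term,
    # leaving the linear system 2 (a_i - a_0) . x = d_0^2 - d_i^2 + |a_i|^2 - |a_0|^2.
    A = 2.0 * (anchors[1:] - a0)
    b = (d0 ** 2 - dists[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2))
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

With exact distances the estimate recovers the true position; with noisy measurements the overdetermined system averages out the errors, which is the role of least squares in the localization framework.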
Performance Optimization of UAV-RIS-assisted Communication Networks Under No-Fly Zone Constraints
XU Junjie, LI Bin, YANG Jingsong
Available online  , doi: 10.11999/JEIT250681
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RIS) mounted on Unmanned Aerial Vehicles (UAVs) are considered an effective approach to enhance wireless communication coverage and adaptability in complex or constrained environments. However, two major challenges remain in practical deployment. The existence of No-Fly Zones (NFZs), such as airports, government facilities, and high-rise areas, restricts the UAV flight trajectory and may result in communication blind spots. In addition, the continuous attitude variation of UAVs during flight causes dynamic misalignment between the RIS and the desired reflection direction, which reduces signal strength and system throughput. To address these challenges, a UAV-RIS-assisted communication framework is proposed that simultaneously considers NFZ avoidance and UAV attitude adjustment. In this framework, a quadrotor UAV equipped with a bottom-mounted RIS operates in an environment containing multiple polygonal NFZs and a group of Ground Users (GUs). The aim is to jointly optimize the UAV trajectory, RIS phase shift, UAV attitude (represented by Euler angles), and Base Station (BS) beamforming to maximize the system sum rate while ensuring complete obstacle avoidance and stable, high-quality service for GUs located both inside and outside NFZs.  Methods  To achieve this objective, a multi-variable coupled non-convex optimization problem is formulated, jointly capturing UAV trajectory, RIS configuration, UAV attitude, and BS beamforming under NFZ constraints. The RIS phase shifts are dynamically adjusted according to the UAV orientation to maintain beam alignment, and UAV motion follows quadrotor dynamics while avoiding polygonal NFZs. Because of the high dimensionality and non-convexity of the problem, conventional optimization approaches are computationally intensive and lack real-time adaptability. 
To address this issue, the problem is reformulated as a Markov Decision Process (MDP), which enables policy learning through deep reinforcement learning. The Soft Actor-Critic (SAC) algorithm is employed, leveraging entropy regularization to improve exploration efficiency and convergence stability. The UAV-RIS agent interacts iteratively with the environment, updating actor-critic networks to determine UAV position, RIS phase configuration, and BS beamforming. Through continuous learning, the proposed framework achieves higher throughput and reliable NFZ avoidance, outperforming existing benchmarks.  Results and Discussions  As shown in Fig. 3, the proposed SAC algorithm achieves higher communication rates than PPO, DDPG, and TD3 during training, benefiting from entropy-regularized exploration that prevents premature convergence. Although DDPG converges faster, it exhibits instability and inferior long-term performance. As illustrated in Fig. 4, the UAV trajectories under different conditions demonstrate the proposed algorithm’s capability to achieve complete obstacle avoidance while maintaining reliable communication. Regardless of variations in initial UAV positions, BS locations, or NFZ configurations, the UAV consistently avoids all NFZs and dynamically adjusts its trajectory to serve users located both inside and outside restricted zones, indicating strong adaptability and scalability of the proposed model. As shown in Fig. 5, increasing the number of BS antennas enhances system performance. The proposed framework significantly outperforms fixed phase shift, random phase shift, and non-RIS schemes because of improved beamforming flexibility.  Conclusions  This paper investigates a UAV-RIS-assisted wireless communication system in which a quadrotor UAV carries an RIS to enhance signal reflection and ensure NFZ avoidance.
Unlike conventional approaches that emphasize avoidance alone, a path integral-based method is proposed to generate obstacle-free trajectories while maintaining reliable service for GUs both inside and outside NFZs. To improve generality, NFZs are represented as prismatic obstacles with regular n-sided polygonal cross-sections. The system jointly optimizes UAV trajectory, RIS phase shifts, UAV attitude, and BS beamforming. A Deep Reinforcement Learning (DRL) framework based on the SAC algorithm is developed to enhance system efficiency. Simulation results demonstrate that the proposed approach achieves reliable NFZ avoidance and a maximized sum rate, and outperforms benchmarks in communication performance, scalability, and stability.
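Since the NFZs are modeled as prisms with regular n-sided polygonal cross-sections, the avoidance constraint on the UAV's horizontal position can be checked with a standard ray-casting point-in-polygon test. The sketch below is a hypothetical helper for such a check, not part of the paper's SAC implementation:

```python
import math

def regular_ngon(cx, cy, r, n):
    """Vertices of a regular n-sided polygon with circumradius r centred at (cx, cy),
    i.e. the cross-section of one prismatic NFZ."""
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]

def inside_polygon(x, y, poly):
    """Ray-casting point-in-polygon test: count edge crossings of a
    horizontal ray from (x, y); an odd count means the point is inside."""
    inside = False
    j = len(poly) - 1
    for i, (xi, yi) in enumerate(poly):
        xj, yj = poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside
```

A trajectory point for which `inside_polygon` returns `True` for any NFZ cross-section (at an altitude within the prism) would violate the constraint.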
Vegetation Height Prediction Dataset Oriented to Mountainous Forest Areas
YU Cuilin, ZHONG Zixuan, PANG Hongyi, DING Yusheng, LAI Tao, HUANG Haifeng, WANG Qingsong
Available online  , doi: 10.11999/JEIT250941
Abstract:
  Objective   Vegetation height is a key ecological parameter that reflects forest vertical structure, biomass, ecosystem functions, and biodiversity. Existing open-source vegetation height datasets are often sparse, unstable, and poorly suited to mountainous forest regions, which limits their utility for large-scale modeling. This study constructs the Vegetation Height Prediction Dataset (VHP-Dataset) to provide a standardized large-scale training resource that integrates multi-source remote sensing features and supports supervised learning tasks for vegetation height estimation.  Methods   The VHP-Dataset is constructed by integrating Landsat 8 multispectral imagery, the digital elevation model AW3D30 (ALOS World 3D, 30 m), land cover data CGLS-LC100 (Copernicus Global Land Service, Land Cover 100 m), and tree canopy cover data GFCC30TC (Global Forest Canopy Cover 30 m Tree Canopy). Canopy height from GEDI L2A (Global Ecosystem Dynamics Investigation, Level 2A) footprints is used as the target variable. A total of 18 input features is extracted, covering spatial location, spectral reflectance, topographic structure, vegetation indices, and vegetation cover information (Table 4, Fig. 4). For model validation, five representative approaches are applied: Extremely Randomized Trees (ExtraTree), Random Forest (RF), Artificial Neural Network (ANN), Broad Learning System (BLS), and Transformer. Model performance is assessed using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Standard Deviation (SD), and Coefficient of Determination (R2).  Results and Discussions   The experimental results show that the VHP-Dataset supports stable vegetation height prediction across regions and terrain conditions, which reflects its scientific validity and practical applicability. 
Model comparisons indicate that ExtraTree achieves the best performance in most regions, and Transformer performs well in specific areas, which confirms that the dataset is compatible with different approaches (Table 6). Stratified analyses show that prediction errors increase under high canopy cover and steep slope conditions, and predictions remain more stable at higher elevations (Figs. 6–9). These findings indicate that the dataset captures the effects of complex topography and canopy structure on model accuracy. Feature importance analysis shows that spatial location, topographic factors, and canopy cover indices are the primary drivers of prediction accuracy, while spectral and land cover information provide complementary contributions (Fig. 10).  Conclusions   The results show that the VHP-Dataset supports vegetation height prediction across regions and terrain types, which reflects its scientific validity and applicability. The dataset enables robust predictions with traditional machine learning methods such as tree-based models, and it also provides a foundation for deep learning approaches such as Transformers, reflecting broad methodological compatibility. Stratified analyses based on vegetation cover and terrain show the effects of complex canopy structures and topographic factors on prediction accuracy, and feature importance analysis identifies spatial location, topographic attributes, and canopy cover indices as the primary drivers. Overall, the VHP-Dataset fills the gap in large-scale high-quality datasets for vegetation height prediction in mountainous forests and provides a standardized benchmark for cross-regional model evaluation and comparison. This offers value for research on vegetation height prediction and forest ecosystem monitoring.
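The evaluation metrics used to score the height models (RMSE, MAE, and R2) can be computed in a few lines of plain Python. This is a generic sketch of the standard definitions, not the dataset's official evaluation code:

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for predicted vs. reference heights,
    using the standard textbook definitions."""
    n = len(y_true)
    errs = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errs) / n)          # root mean square error
    mae = sum(abs(e) for e in errs) / n                     # mean absolute error
    mean_t = sum(y_true) / n
    ss_res = sum(e * e for e in errs)                       # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)         # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                              # coefficient of determination
    return rmse, mae, r2
```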
Comparison of DeepSeek-V3.1 and ChatGPT-5 in Multidisciplinary Team Decision-making for Colorectal Liver Metastases
ZHANG Yangzi, XU Ting, GAO Zhaoya, SI Zhenduo, XU Weiran
Available online  , doi: 10.11999/JEIT250849
Abstract:
  Objective   ColoRectal Cancer (CRC) is the third most commonly diagnosed malignancy worldwide. Approximately 25–50% of patients with CRC develop liver metastases during the course of their disease, which increases the disease burden. Although the MultiDisciplinary Team (MDT) model improves survival in ColoRectal Liver Metastases (CRLM), its broader implementation is limited by delayed knowledge updates and regional differences in medical standards. Large Language Models (LLMs) can integrate multimodal data, clinical guidelines, and recent research findings, and can generate structured diagnostic and therapeutic recommendations. These features suggest potential to support MDT-based care. However, the actual effectiveness of LLMs in MDT decision-making for CRLM has not been systematically evaluated. This study assesses the performance of DeepSeek-V3.1 and ChatGPT-5 in supporting MDT decisions for CRLM and examines the consistency of their recommendations with MDT expert consensus. The findings provide evidence-based guidance and identify directions for optimizing LLM applications in clinical practice.  Methods   Six representative virtual CRLM cases are designed to capture key clinical dimensions, including colorectal tumor recurrence risk, resectability of liver metastases, genetic mutation profiles (e.g., KRAS/BRAF mutations, HER2 amplification status, and microsatellite instability), and patient functional status. Using a structured prompt strategy, MDT treatment recommendations are generated separately by the DeepSeek-V3.1 and ChatGPT-5 models. Independent evaluations are conducted by four MDT specialists from gastrointestinal oncology, gastrointestinal surgery, hepatobiliary surgery, and radiation oncology. The model outputs are scored using a 5-point Likert scale across seven dimensions: accuracy, comprehensiveness, frontier relevance, clarity, individualization, hallucination risk, and ethical safety.
Statistical analysis is performed to compare the performance of DeepSeek-V3.1 and ChatGPT-5 across individual cases, evaluation dimensions, and clinical disciplines.  Results and Discussions   Both LLMs, DeepSeek-V3.1 and ChatGPT-5, show robust performance across all six virtual CRLM cases, with an average overall score of ≥ 4.0 on a 5-point scale. This performance indicates that clinically acceptable decision support is provided within a complex MDT framework. DeepSeek-V3.1 shows superior overall performance compared with ChatGPT-5 (4.27±0.77 vs. 4.08±0.86, P=0.03). Case-by-case analysis shows that DeepSeek-V3.1 performs significantly better in Cases 1, 4, and 6 (P=0.04, P<0.01, and P=0.01, respectively), whereas ChatGPT-5 receives higher scores in Case 2 (P<0.01). No significant differences are observed in Cases 3 and 5 (P=0.12 and P=1.00, respectively), suggesting complementary strengths across clinical scenarios (Table 3). In the multidimensional assessment, both models receive high scores (range: 4.12–4.87) in clarity, individualization, hallucination risk, and ethical safety, confirming that readable, patient-tailored, reliable, and ethically sound recommendations are generated. Improvements are still needed in accuracy, comprehensiveness, and frontier relevance (Fig. 1). DeepSeek-V3.1 shows a significant advantage in frontier relevance (3.90±0.65 vs. 3.24±0.72, P=0.03) and ethical safety (4.87±0.34 vs. 4.58±0.65, P=0.03) (Table 4), indicating more effective incorporation of recent evidence and more consistent delivery of ethically robust guidance. For the case with concomitant BRAF V600E and KRAS G12D mutations, DeepSeek-V3.1 accurately references a phase III randomized controlled study published in the New England Journal of Medicine in 2025 and recommends a triple regimen consisting of a BRAF inhibitor + EGFR monoclonal antibody + FOLFOX.
By contrast, ChatGPT-5 follows conventional recommendations for RAS/BRAF mutant populations (FOLFOXIRI + bevacizumab) without integrating recent evidence on targeted combination therapy. This difference shows the effect of timely knowledge updates on the clinical value of LLM-generated recommendations. For MSI-H CRLM, ChatGPT-5’s recommendation of “postoperative immunotherapy” is not supported by phase III evidence or existing guidelines. Direct use of such recommendations may lead to overtreatment or ineffective therapy, representing a clear ethical concern and illustrating hallucination risks in LLMs. Discipline-specific analysis shows notable variation. In radiation oncology, DeepSeek-V3.1 provides significantly more precise guidance on treatment timing, dosage, and techniques than ChatGPT-5 (4.55±0.67 vs. 3.38±0.91, P<0.01), demonstrating closer alignment with clinical guidelines. In contrast, ChatGPT-5 performs better in gastrointestinal surgery (4.48±0.67 vs. 4.17±0.85, P=0.02), with experts rating its recommendations on surgical timing and resectability as more concise and accurate. No significant differences are identified in gastrointestinal oncology and hepatobiliary surgery (P=0.89 and P=0.14, respectively), indicating comparable performance in these areas (Table 5). These findings show a performance bias across medical sub-specialties, demonstrating that LLM effectiveness depends on the distribution and quality of training data.  Conclusions   Both DeepSeek-V3.1 and ChatGPT-5 demonstrate strong capabilities in providing reliable recommendations for CRLM-MDT decision-making. Specifically, DeepSeek-V3.1 shows notable advantages in integrating cutting-edge knowledge, ensuring ethical safety, and performing in the field of radiation oncology, whereas ChatGPT-5 excels in gastrointestinal surgery, reflecting complementary strengths between the two models.
This study confirms the feasibility of leveraging LLMs as “MDT collaborators”, offering a readily applicable and robust technical solution to bridge regional disparities in clinical expertise and enhance the efficiency of decision-making. However, model hallucination and insufficient evidence grading remain key limitations. Moving forward, mechanisms such as real-world clinical validation, evidence traceability, and reinforcement learning from human feedback are expected to further advance LLMs into more powerful auxiliary tools for CRLM-MDT decision support.
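The mean ± SD summaries reported above (e.g., 4.27±0.77) follow directly from the pooled 5-point Likert ratings; a minimal sketch, assuming the sample standard deviation is used, is:

```python
from statistics import mean, stdev

def summarize_scores(scores):
    """Summarize a list of Likert ratings as (mean, sample SD), both
    rounded to two decimals, the form in which the paper reports results."""
    return round(mean(scores), 2), round(stdev(scores), 2)
```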
Differentiable Sparse Mask Guided Infrared Small Target Fast Detection Network
SHENG Weidong, WU Shuanglin, XIAO Chao, LONG Yunli, LI Xiaobin, ZHANG Yiming
Available online  , doi: 10.11999/JEIT250989
Abstract:
  Objective  Infrared small target detection holds significant and irreplaceable application value across various critical domains, including infrared guidance, environmental monitoring, and security surveillance. Its importance is underscored by tasks such as early warning systems, precision targeting, and pollution tracking, where timely and accurate detection is paramount. The core challenges in this domain stem from the inherent characteristics of infrared small targets: their extremely small size (typically less than 9×9 pixels), limited spatial features due to long imaging distances, and a high probability of being overwhelmed by complex and cluttered backgrounds, such as cloud cover, sea glint, or urban thermal noise. These factors make it difficult to distinguish genuine targets from background clutter using conventional methods. Existing approaches to infrared small target detection can be broadly categorized into traditional model-based methods and modern deep learning techniques. Traditional methods often rely on manually designed background suppression operators, such as morphological filters (e.g., Top-Hat) or low-rank matrix recovery (e.g., IPI). While these methods are interpretable in simple scenarios, they struggle to adapt to dynamic and complex real-world environments, leading to high false alarm rates and limited robustness. On the other hand, deep learning-based methods, particularly those employing dense Convolutional Neural Networks (CNNs), have shown improved detection performance by leveraging data-driven feature learning. However, these networks often fail to fully account for the extreme imbalance between target and background pixels, where targets typically constitute less than 1% of the entire image. This imbalance results in significant computational redundancy, as the network processes vast background regions that contribute little to the detection task, thereby hampering efficiency and real-time performance.
To address these challenges, exploiting the sparsity of infrared small targets offers a promising direction. By designing a sparse mask generation module that capitalizes on target sparsity, it becomes feasible to coarsely extract potential target regions while filtering out the majority of redundant background areas. This coarse target region can then be refined through subsequent processing stages to achieve satisfactory detection performance. This paper presents an intelligent solution that effectively balances high detection accuracy with computational efficiency, making it suitable for real-time applications.  Methods  This paper proposes an end-to-end infrared small target detection network guided by a differentiable sparse mask. First, an input infrared image is preprocessed with convolution to generate raw features. A differentiable sparse mask generation module then uses two convolution branches to produce a probability map and a threshold map, and outputs a binary mask via a differentiable binarization function to extract target candidate regions and filter background redundancy. Next, a target region sampling module converts dense raw features into sparse features based on the binary mask. A sparse feature extraction module with a U-shaped structure (composed of encoders, decoders, and skip connections) using Minkowski Engine sparse convolution performs refined processing only on non-zero target regions to reduce computation. Finally, a pyramid pooling module fuses multi-scale sparse features, and the fused features are fed into a target-background binary classifier to output detection results.  Results and Discussions  To fully validate the effectiveness of the proposed method, comprehensive experiments were conducted on two mainstream infrared small target datasets: NUAA-SIRST, which contains 427 real-world infrared images extracted from actual videos, and NUDT-SIRST, a large-scale synthetic dataset with 1327 diverse images. 
The method was compared against three representative traditional algorithms (e.g., Top-Hat, IPI) and six state-of-the-art deep learning methods (e.g., DNA-Net, ACM). Results demonstrate that the method achieves competitive detection performance: on NUAA-SIRST, it attains 74.38% IoU, 100% Pd, and 7.98×10⁻⁶ Fa; on NUDT-SIRST, it reaches 83.03% IoU, 97.67% Pd, and 9.81×10⁻⁶ Fa, matching the performance of leading deep learning methods. Notably, it excels in efficiency: with only 0.35M parameters, 11.10G Flops, and 215.06 FPS, its FPS is 4.8 times that of DNA-Net, significantly cutting computational redundancy. Ablation experiments (Fig. 6) confirm the differentiable sparse mask module effectively filters most backgrounds while preserving target regions. Visual results (Fig. 5) show fewer false alarms than traditional methods like PSTNN, as its "coarse-to-fine" mode reduces background interference, verifying balanced performance and efficiency.  Conclusions  This paper addresses the massive computational redundancy of existing dense computing methods in infrared small target detection, caused by the extremely unbalanced target-background proportion (the target usually occupies less than 1% of the whole image), by proposing a fast infrared small target detection network guided by a differentiable sparse mask. The network adaptively extracts candidate target regions and filters background redundancy via a differentiable sparse mask generation module, and constructs a feature extraction module based on Minkowski Engine sparse convolution to reduce computation, forming an end-to-end "coarse-to-fine" detection framework. Experiments on NUDT-SIRST and NUAA-SIRST datasets demonstrate that the proposed method achieves comparable detection performance to existing deep learning methods while significantly optimizing computational efficiency, balancing detection accuracy and speed.
It provides a new approach to sparsity-based redundancy reduction in infrared small target detection, is applicable to scenarios such as remote sensing, infrared guidance, and environmental monitoring that require both real-time performance and accuracy, and offers a useful reference for lightweight development in this field.
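The differentiable binarization step that turns the probability map and threshold map into a binary mask is commonly realized as a steep sigmoid of their difference (as in DBNet for text detection). The sketch below assumes that form and a slope constant k; the paper's exact function and constants are not reproduced here.

```python
import numpy as np

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binarization B = sigmoid(k * (P - T)).

    For large k this behaves like a hard threshold (B -> 1 where P > T,
    B -> 0 where P < T) while remaining differentiable, so the mask
    generation module can be trained end to end. k = 50 is an assumed
    slope, matching the DBNet convention rather than this paper.
    """
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))
```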
A Deception Jamming Discrimination Algorithm Based on Phase Fluctuation for Airborne Distributed Radar System
LV Zhuoyu, YANG Chao, SUO Chengyu, WEN Cai
Available online  , doi: 10.11999/JEIT240787
Abstract:
  Objective   Deception jamming in airborne distributed radar systems presents a crucial challenge, as false echoes generated by Digital Radio Frequency Memory (DRFM) devices tend to mimic true target returns in amplitude, delay, and Doppler characteristics. These similarities complicate target recognition and subsequently degrade tracking accuracy. To address this problem, attention is directed to phase fluctuation signatures, which differ inherently between authentic scattering responses and synthesized interference replicas. Leveraging this distinction is proposed as a means of improving discrimination reliability under complex electromagnetic confrontation conditions.  Methods   A signal-level fusion discrimination algorithm is proposed based on phase fluctuation variance. Five categories of synchronization errors that affect the phase of received echoes are analyzed and corrected, including filter mismatch, node position errors, and equivalent amplitude-phase deviations. Precise matched filters are constructed through a fine-grid iterative search to eliminate residual phase distortion caused by limited sampling resolution. Node position errors are estimated using a DRFM-based calibration array, and equivalent amplitude-phase deviations are corrected through an eigendecomposition-based procedure. After calibration, phase vectors associated with target returns are extracted, and the variance of these vectors is taken as the discrimination criterion. Authentic targets present large phase fluctuations due to complex scattering, whereas DRFM-generated replicas exhibit only small variations.  Results and Discussions   Simulation results show that the proposed method achieves reliable discrimination under typical airborne distributed radar conditions. When the signal-to-noise ratio is 25 dB and the jamming-to-noise ratio is 3 dB, the misjudgment rate for false targets approaches zero when more than five receiving nodes are used (Fig. 10, Fig. 11).
The method remains robust even when only a few false targets are present and performs better than previously reported approaches, where discrimination fails in single- or dual-false-target scenarios (Fig. 14). High recognition stability is maintained across different jamming-to-noise ratios and receiver quantities (Fig. 13). The importance of system-level error correction is confirmed, as discrimination accuracy declines significantly when synchronization errors are not compensated (Fig. 12).  Conclusions   A phase-fluctuation-based discrimination algorithm for airborne distributed radar systems is presented. By correcting system-level errors and exploiting the distinct fluctuation behavior of phase signatures from real and false echoes, the method achieves reliable deception-jamming discrimination in complex electromagnetic environments. Simulations indicate stable performance under varying numbers of false targets, demonstrating good applicability for distributed configurations. Future work will aim to enhance robustness under stronger environmental noise and clutter.
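The discrimination criterion, the variance of the calibrated phase vector across receiving nodes, can be illustrated with a short sketch. The decision threshold used here is an assumption for demonstration only; the paper's calibrated threshold and error-correction steps are omitted.

```python
import numpy as np

def phase_fluctuation_variance(echo):
    """Variance of the phase vector extracted from complex echo samples
    across receiving nodes. A real target's complex scattering yields
    large phase fluctuation, while a DRFM replica repeats a nearly
    constant phase (illustrative sketch, not the authors' code)."""
    phases = np.angle(np.asarray(echo))
    return float(np.var(phases))

def is_false_target(echo, threshold=0.1):
    """Declare a DRFM false target when the fluctuation falls below the
    threshold (the value 0.1 rad^2 is an assumed demonstration setting)."""
    return phase_fluctuation_variance(echo) < threshold
```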
Two-Channel Joint Coding Detection for Cyber-Physical Systems Against Integrity Attacks
MO Xiaolei, ZENG Weixin, FU Jiawei, DOU Keqin, WANG Yanwei, SUN Ximing, LIN Sida, SUI Tianju
Available online  , doi: 10.11999/JEIT250729
Abstract:
  Objective  Cyber-Physical Systems (CPS) are widely applied across infrastructure, aviation, energy, healthcare, manufacturing, and transportation, as computing, control, and sensing technologies advance. Due to the real-time interaction between information and physical processes, such systems are exposed to security risks during data exchange. Attacks on CPS can be grouped into availability, integrity, and reliability attacks based on information security properties. Integrity attacks manipulate data streams to disrupt the consistency between system inputs and outputs. Compared with the other two types, integrity attacks are more difficult to detect because of their covert and dynamic nature. Existing detection strategies generally modify control signals, sensing signals, or system models. Although these approaches can detect specific categories of attacks, they may reduce control performance and increase model complexity and response delay.  Methods  A joint additive and multiplicative coding detection scheme for the two-channel structure of control and output is proposed. Three representative integrity attacks are tested, including a control-channel bias attack, an output-channel replay attack, and a two-channel covert attack. These attacks remain stealthy by partially or fully obtaining system information and manipulating data so that the residual-based χ² detector output stays below the detection threshold. The proposed method introduces paired additive watermarking signals with positive and negative patterns, together with paired multiplicative coding and decoding matrices on both channels. These additional unknown signals and parameters introduce information uncertainty to the attacker and cause the residual statistics to deviate from the expected values constructed using known system information. The watermarking pairs and matrix pairs operate through different mechanisms. One uses opposite-sign injection, while the other uses a mutually inverse transformation.
Therefore, normal control performance is maintained when no attack is present. The time-varying structure also prevents attackers from reconstructing or bypassing the detection mechanism.  Results and Discussions  Simulation experiments on an aerial vehicle trajectory model are conducted to assess both the influence of integrity attacks on flight paths and the effectiveness of the proposed detection scheme. The trajectory is modeled using Newton’s equations of motion, and attitude dynamics and rotational motion are omitted to focus on positional behavior. Detection performance with and without the proposed method is compared under the three attack scenarios (Fig. 2, Fig. 3, Fig. 4). The results show that the proposed scheme enables effective identification of all attack types and maintains stable system behavior, demonstrating its practical applicability and improvement over existing approaches.  Conclusions  This study addresses the detection of integrity attacks in CPS. Three representative attack types (bias, replay, and covert attacks) are modeled, and the conditions required for their successful execution are analyzed. A detection approach combining additive watermarking and multiplicative encoding matrices is proposed and shown to detect all three attack types. The design uses paired positive–negative additive watermarks and paired encoding and decoding matrices to ensure accurate detection while maintaining normal control performance. A time-varying configuration is adopted to prevent attackers from reconstructing or bypassing the detection elements. Using an aerial vehicle trajectory simulation, the proposed approach is demonstrated to be effective and applicable to cyber-physical system security enhancement.
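The residual-based χ² detector that these attacks attempt to evade computes a quadratic statistic of the innovation sequence over a window and compares it with a chi-square threshold. A scalar-residual sketch follows (window length 3, with 7.81 taken as the 95% chi-square quantile for 3 degrees of freedom); the watermark injection and coding matrices from the paper are not modeled here.

```python
import numpy as np

def chi2_statistic(residuals, sigma2):
    """Chi-square test statistic g = sum_k r_k^2 / sigma^2 over a window
    of scalar residuals with noise variance sigma^2 (illustrative sketch
    of the standard residual-based detector)."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * r) / sigma2)

def alarm(residuals, sigma2, threshold):
    """Raise an alarm when the statistic exceeds the chi-square threshold."""
    return chi2_statistic(residuals, sigma2) > threshold
```

A stealthy integrity attack shapes the manipulated data so that `alarm` stays `False`; the paper's watermarking and coding pairs break this by making the attacker's expected residual statistics deviate from the true ones.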
Full Field-of-View Optical Calibration with Microradian-Level Accuracy for Space Laser Communication Terminals on Low-Earth-Orbit Constellation Applications
XIE Qingkun, XU Changzhi, BIAN Jingying, ZHENG Xiaosong, ZHANG Bo
Available online  , doi: 10.11999/JEIT250734
Abstract:
  Objective  The Coarse Pointing Assembly (CPA) is a core element in laser communication systems and supports wide-field scanning, active orbit–attitude compensation, and dynamic disturbance isolation. To address multi-source disturbances such as orbital perturbations and attitude maneuvers, a high-precision, high-bandwidth, and fast-response Pointing, Acquisition, and Tracking (PAT) algorithm is required. Establishing a full Field-Of-View (FOV) optical calibration model between the CPA and the detector is essential for suppressing image degradation caused by spatial pointing deviations. Conventional calibration methods often rely on ray tracing to simulate beam offsets and infer calibration relationships, yet they show several limitations. These limitations include high modeling complexity caused by non-coaxial paths, multi-reflective surfaces, and freeform optics; susceptibility to systematic errors generated by assembly tolerances, detector non-uniformity, and thermal drift; and restricted applicability across the full FOV due to spatial anisotropy. A high-precision calibration method that remains effective across the entire FOV is therefore needed to overcome these challenges and ensure stable and reliable laser communication links.  Methods  To achieve precise CPA–detector calibration and address the limitations of traditional approaches, this paper presents a full FOV optical calibration method with microradian-level accuracy. Based on the optical design characteristics of periscope-type laser terminals, an equivalent optical transmission model of the CPA is established and the mechanism of image rotation is examined. Leveraging the structural rigidity of the optical transceiver channel, the optical transmission matrix is simplified to a constant matrix, yielding a full-space calibration model that directly links CPA micro-perturbations to spot displacements. 
By correlating the CPA rotation angles between the calibration target points and the actual operating positions, the calibration task is further reduced to estimating the calibration matrix at the target points. Random micro-perturbations are applied to the CPA to induce corresponding micro-displacements of the detector spot. A calibration equation based on CPA motion and spot displacement is formulated, and the calibration matrix is obtained through least-squares regression. The full-space calibration relationship between the CPA and detector is then derived through matrix operations.  Results and Discussions  Using the proposed calibration method, an experimental platform (Fig. 4) is constructed for calibration and verification with a periscope laser terminal. Accurate measurements of the conjugate motion relationship between the CPA and the CCD detector spot are obtained (Table 1). To evaluate calibration accuracy and full-space applicability, systematic verification is conducted through single-step static pointing and continuous dynamic tracking. In the static pointing verification, the mechanical rotary table is moved to three extreme diagonal positions, and the CPA performs open-loop pointing based on the established CPA–detector calibration relationship. Experimental results show that the spot reaches the intended target position (Fig. 5), with a pointing accuracy below 12 μrad (RMS). In the dynamic tracking experiment, system control parameters are optimized to maintain stable tracking of the platform beam. During low-angular-velocity motion of the rotary table, the laser terminal sustains stable tracking (Fig. 6). The CPA trajectory shows a clear conjugate relationship with the rotary table motion (Fig. 6(a), Fig. 6(b)), and the tracking accuracy in both orthogonal directions is below 4 μrad (Fig. 6(c), Fig. 6(d)). The independence of the optical transmission matrix from the selection of calibration target points is also examined.
By increasing the spatial accessibility of calibration points, the method reduces operational complexity while maintaining calibration precision. Improved spatial distribution of calibration points further enhances calibration efficiency and accuracy.  Conclusions  This paper presents a full FOV optical calibration method with microradian-level accuracy based on single-target micro-perturbation measurement. To satisfy engineering requirements for rapid linking and stable tracking, a full-space optical matrix model for CPA–detector calibration is constructed using matrix optics. Random micro-perturbations applied to the CPA at a single target point generate a generalized transfer equation, from which the calibration matrix is obtained through least-squares estimation. Experimental results show that the model mitigates image rotation, mirroring, and tracking anomalies, suppresses calibration residuals to below 12 μrad across the full FOV, and limits the dynamic tracking error to within 5 μrad per axis. The method eliminates the need for additional hardware and complex alignment procedures, providing a high-precision and low-complexity solution that supports rapid deployment in the mass production of Low-Earth-Orbit (LEO) laser terminals.
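The least-squares step described in the Methods can be sketched numerically: small CPA angle perturbations produce spot displacements modeled linearly through a calibration matrix, which is then recovered by regression. The 2×2 matrix, perturbation scale, and noise level below are illustrative stand-ins, not the paper's measured values.

```python
import numpy as np

# Sketch of the micro-perturbation calibration step (illustrative model):
# small CPA angle perturbations dtheta (azimuth, elevation) produce detector
# spot displacements ds, assumed linear: ds = M @ dtheta, with M the 2x2
# calibration matrix to be estimated by least squares.

rng = np.random.default_rng(0)
M_true = np.array([[1.8, -0.3],
                   [0.2,  2.1]])                    # hypothetical ground truth

dtheta = rng.normal(scale=1e-4, size=(50, 2))       # random micro-perturbations (rad)
ds = dtheta @ M_true.T + rng.normal(scale=1e-7, size=(50, 2))  # noisy spot shifts

# Least squares: solve dtheta @ X ~= ds, where X = M.T
X, *_ = np.linalg.lstsq(dtheta, ds, rcond=None)
M_est = X.T
```

With a few tens of perturbation samples, the estimate converges to the assumed matrix well within the simulated noise floor, which is the property the single-target scheme relies on.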
Visible Figure Part of Multi-source Maritime Ship Dataset
CUI Yaqi, ZHOU Tian, XIONG Wei, XU Saifei, LIN Chuanqi, XIA Mutao, SUN Weiwei, TANG Tiantian, ZHANG Jie, GUO Hengguang, SONG Penghan, HUAN Yingchun, ZHANG Zhenjie
Available online  , doi: 10.11999/JEIT250138
Abstract:
  Objective  The increasing intensity of marine resource development and maritime operations has heightened the need for accurate vessel detection under complex marine conditions, which is essential for protecting maritime rights and interests. In recent years, object detection algorithms based on deep learning—such as YOLO and Faster R-CNN—have emerged as key methods for maritime target perception due to their strong feature extraction capabilities. However, their performance relies heavily on large-scale, high-quality training data. Existing general-purpose datasets, such as COCO and PASCAL VOC, offer limited vessel classes and predominantly feature static, urban, or terrestrial scenes, making them unsuitable for marine environments. Similarly, specialized datasets like SeaShips and the Singapore Marine Dataset (SMD) suffer from constraints such as limited data sources, simple scenes, small sample sizes, and incomplete coverage of marine target categories. These limitations significantly hinder further performance improvement of detection algorithms. Therefore, the development of large-scale, multimodal, and comprehensive marine-specific datasets represents a critical step toward resolving current application challenges. This effort is urgently needed to strengthen marine monitoring capabilities and ensure operational safety at sea.  Methods  To overcome the aforementioned challenges, a multi-sensor marine target acquisition system integrating radar, visible-light, infrared, laser, Automatic Identification System (AIS), and Global Positioning System (GPS) technologies is developed. A two-month shipborne observation campaign is conducted, yielding 200 hours of maritime monitoring and over 90 TB of multimodal raw data. To efficiently process this large volume of low-value-density data, a rapid annotation pipeline is designed, combining automated labeling with manual verification. 
Iterative training of intelligent annotation models, supplemented by extensive manual correction, enables the construction of the Visible Figure Part of the Multi-Source Maritime Ship Dataset (MSMS-VF). This dataset comprises 265,233 visible-light images with 1,097,268 bounding boxes across nine target categories: passenger ship, cargo vessel, speedboat, sailboat, fishing boat, buoy, floater, offshore platform, and others. Notably, 55.84% of targets are small, with pixel areas below 1,024. The dataset incorporates diverse environmental conditions including backlighting, haze, rain, and occlusion, and spans representative maritime settings such as harbor basins, open seas, and navigation channels. MSMS-VF offers a comprehensive data foundation for advancing maritime target detection, recognition, and tracking research.  Results and Discussions  The MSMS-VF dataset exhibits substantially greater diversity than existing datasets (Table 1, Table 2). Small targets, including buoys and floaters, occur frequently (Table 5), posing significant challenges for detection. Five object detection models—YOLO series, Real-Time Detection Transformer (RT-DETR), Faster R-CNN, Single Shot MultiBox Detector (SSD), and RetinaNet—are assessed, together with five multi-object tracking algorithms: Simple Online and Realtime Tracking (SORT), Observation-Centric SORT (OC-SORT), DeepSORT, ByteTrack, and MotionTrack. YOLO models exhibit the most favorable trade-off between speed and accuracy. YOLOv11 achieves a mAP50 of 0.838 on the test set and a processing speed of 34.43 FPS (Table 6). However, substantial performance gaps remain for small targets; for instance, YOLOv11 yields a mAP50 of 0.549 for speedboats, markedly lower than the 0.946 obtained for large targets such as cargo vessels (Table 7). RT-DETR shows moderate performance on small objects, achieving a mAP50 of 0.532 for floaters, whereas conventional models like Faster R-CNN perform poorly, with mAP50 values below 0.1. 
For tracking, MotionTrack performs best under low-frame-rate conditions, achieving a MOTA of 0.606, IDF1 of 0.750, and S of 0.681 using a Gaussian distance cascade-matching strategy (Table 8, Fig. 13).  Conclusions  This study presents the MSMS-VF dataset, which offers essential data support for maritime perception research through its integration of multi-source inputs, diverse environmental scenarios, and a high proportion of small targets. Experimental validation confirms the dataset’s utility in training and evaluating state-of-the-art algorithms, while also revealing persistent challenges in detecting and tracking small objects under dynamic maritime conditions. Nevertheless, the dataset has limitations. The current data are predominantly sourced from waters near Yantai, leading to imbalanced ship-type representation and the absence of certain vessel categories. Future efforts will focus on expanding data acquisition to additional maritime regions, broadening the scope of multi-source data collection, and incrementally releasing extended components of the dataset to support ongoing research.
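The small-target criterion quoted above (pixel area below 1,024, i.e. smaller than 32×32) is easy to audit on annotation data. The (width, height) box list below is an assumed, simplified annotation layout, not the MSMS-VF release format.

```python
# Hypothetical sketch: computing the small-target share of an annotation set
# under the paper's threshold (pixel area < 1024). The box format, a list of
# (width, height) pairs in pixels, is an assumption for illustration.

def small_target_ratio(box_sizes, area_threshold=1024):
    """Fraction of boxes whose pixel area falls below the threshold."""
    if not box_sizes:
        return 0.0
    small = sum(1 for w, h in box_sizes if w * h < area_threshold)
    return small / len(box_sizes)

# Toy annotation sample: three of the five boxes are below 1024 px^2
boxes = [(20, 30), (100, 80), (15, 15), (64, 64), (10, 90)]
ratio = small_target_ratio(boxes)
```

On the real dataset this statistic is reported as 55.84%; the toy sample above merely shows the computation.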
Performance Analysis for Self-Sustainable Intelligent Metasurface Based Reliable and Secure Communication Strategies
QU Yayun, CAO Kunrui, WANG Ji, XU Yongjun, CHEN Jingyu, DING Haiyang, JIN Liang
Available online  , doi: 10.11999/JEIT250637
Abstract:
  Objective  A Reconfigurable Intelligent Surface (RIS) is generally powered through a wired connection, and this power cable functions as a “tail” that restricts RIS maneuverability during outdoor deployment. A Self-Sustainable Intelligent Metasurface (SIM) that integrates RIS with energy harvesting is examined, and an amplified SIM architecture is presented. The reliability and security of SIM communication are analyzed, and the analysis provides a basis for its rational deployment in practical design.  Methods   The static wireless-powered and dynamic wireless-powered SIM communication strategies are proposed to address the energy and information outage challenges faced by SIM. The communication mechanisms of the un-amplified and amplified SIM (U-SIM and A-SIM) under these two strategies are examined. New integrated performance metrics of energy and information, termed joint outage probability and joint intercept probability, are proposed to evaluate the strategies from the perspectives of communication reliability and communication security.  Results and Discussions   The simulations evaluate the effect of several critical parameters on the communication reliability and security of each strategy. The results indicate that: (1) Compared to alternative schemes, at low base station transmit power, A-SIM achieves optimal reliability under the dynamic wireless-powered strategy and optimal security under the static wireless-powered strategy (Figs. 2 and 3). (2) Under the same strategy type, increasing the number of elements at SIM generally enhances reliability but reduces security. With a large number of elements, U-SIM maintains higher reliability than A-SIM, while A-SIM achieves higher security than U-SIM (Figs. 4 and 5). (3) An optimal amplification factor maximizes communication reliability for SIM systems (Fig. 6).  
Conclusions   The results show that the dynamic wireless-powered strategy can mitigate the reduction in the reliability of SIM communication caused by insufficient energy. Although the amplified noise of A-SIM decreases reliability, it can improve security. Under the same static or dynamic strategies, as the number of elements at SIM increases, A-SIM provides better security, whereas U-SIM provides better reliability.
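The joint outage probability, an outage in either energy or information, can be illustrated with a small Monte Carlo sketch. The Rayleigh-fading channels, linear energy harvester, and all parameter values below are simplifying assumptions for illustration, not the paper's analytical model.

```python
import math
import random

# Illustrative Monte Carlo sketch of a "joint outage" metric: an outage is
# declared when either the harvested energy misses an activation threshold or
# the achievable rate misses a target rate. Exponential channel gains model
# Rayleigh fading; the harvester is assumed linear. All values are stand-ins.

def joint_outage_probability(p_tx, eta=0.5, e_min=1e-6, r_min=1.0,
                             noise=1e-9, trials=20000, seed=1):
    random.seed(seed)
    outages = 0
    for _ in range(trials):
        g_h = random.expovariate(1.0)        # harvesting-link channel gain
        g_i = random.expovariate(1.0)        # information-link channel gain
        energy = eta * p_tx * g_h * 1e-3     # energy harvested in one slot (J)
        rate = math.log2(1 + p_tx * g_i / noise)  # achievable rate (bit/s/Hz)
        if energy < e_min or rate < r_min:
            outages += 1
    return outages / trials

p_low = joint_outage_probability(p_tx=1e-3)   # low transmit power
p_high = joint_outage_probability(p_tx=1e-1)  # higher transmit power
```

Under this toy model, raising the base-station transmit power lowers the joint outage probability, consistent with the qualitative trend the abstract describes for insufficient-energy regimes.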
Energy Consumption Optimization of Cooperative NOMA Secure Offload for Mobile Edge Computing
CHEN Jian, MA Tianrui, YANG Long, LÜ Lu, XU Yongjun
Available online  , doi: 10.11999/JEIT250606
Abstract:
  Objective  Mobile Edge Computing (MEC) is used to strengthen the computational capability and response speed of mobile devices by shifting computing and caching functions to the network edge. Non-Orthogonal Multiple Access (NOMA) further supports high spectral efficiency and large-scale connectivity. Because wireless channels are broadcast, the MEC offload transmission process is exposed to potential eavesdropping. To address this risk, physical-layer security is integrated into a NOMA-MEC system to safeguard secure offloading. Existing studies mainly optimize performance metrics such as energy use, latency, and throughput, or improve security through NOMA-based co-channel interference and cooperative interference. However, the combined effect of performance and security has not been fully examined. To reduce the energy required for secure offloading, a cooperative NOMA secure offload scheme is designed. The distinctive feature of the proposed scheme is that cooperative nodes provide forwarding and computational assistance at the same time. Through joint local computation between users and cooperative nodes, the scheme strengthens security in the offload process while reducing system energy consumption.  Methods  The joint design of computational and communication resource allocation for the nodes is examined by dividing the offloading procedure into two stages: NOMA offloading and cooperative offloading. Offloading strategies for different nodes in each stage are considered, and an optimization problem is formulated to minimize the weighted total system energy consumption under secrecy outage constraints. To handle the coupled multi-variable and non-convex structure, secrecy transmission rate constraints and secrecy outage probability constraints, originally expressed in probabilistic form, are first transformed. The main optimization problem is then separated into two subproblems: slot and task allocation, and power allocation. 
For the non-convex power allocation subproblem, the non-convex constraints are replaced with bilinear substitutions, and sequential convex approximations are applied. An alternating iterative resource allocation algorithm is ultimately proposed, allowing the load, power, and slot assignment between users and cooperative nodes to be adjusted according to channel conditions so that energy consumption is minimized while security requirements are satisfied.  Results and Discussions  Theoretical analysis and simulation results show that the proposed scheme converges quickly and maintains low computational complexity. Relative to existing NOMA full-offloading schemes, assisted computing schemes, and NOMA cooperative interference schemes, the proposed offloading design reduces system energy consumption and supports a higher load under identical secrecy constraints. The scheme also demonstrates strong robustness, as its performance is less affected by weak channel conditions or increased eavesdropping capability.  Conclusions  The study shows that system energy consumption and security constraints are closely coupled. In the MEC offloading process, communication, computation, and security are not independent. Performance and security can be improved at the same time through the effective use of cooperative nodes. When cooperative nodes are present, NOMA and forwarding cooperation can reduce the effects of weak channel conditions or high eavesdropping risks on secure and reliable transmission. Cooperative nodes can also share users’ local computational load to strengthen overall system performance. Joint local computation between users and cooperative nodes further reduces the security risks associated with long-distance wireless transmission. Thus, secure offloading in MEC is not only a physical-layer security issue in wireless transmission but also reflects the coupled relationship between communication and computation that is specific to MEC. 
By making full use of idle resources in the network, cooperative communication and computation among idle nodes can enhance system security while maintaining performance.
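The secrecy-rate quantity underlying the secrecy transmission rate constraints is the standard positive gap between the legitimate link's rate and the eavesdropper's rate. The SNR values below are illustrative only.

```python
import math

# Minimal sketch of the secrecy-rate quantity behind the secrecy constraints:
# Rs = [log2(1 + SNR_b) - log2(1 + SNR_e)]^+, i.e. the positive part of the
# gap between the legitimate and eavesdropper link rates. Values illustrative.

def secrecy_rate(snr_legit, snr_eve):
    """Secrecy rate in bit/s/Hz for given legitimate and eavesdropper SNRs."""
    return max(0.0, math.log2(1 + snr_legit) - math.log2(1 + snr_eve))

rs = secrecy_rate(snr_legit=15.0, snr_eve=3.0)
```

When the eavesdropper's SNR meets or exceeds the legitimate user's, the secrecy rate collapses to zero, which is why cooperative forwarding that strengthens the legitimate link matters for the offload stages described above.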
Single-Channel High-Precision Sparse DOA Estimation of GNSS Signals for Deception Suppression
KANG Weiquan, LU Zunkun, LI Baiyu, SONG Jie, XIAO Wei
Available online  , doi: 10.11999/JEIT250725
Abstract:
  Objective  The proliferation of spoofing attacks poses a significant threat to the reliability and security of Global Navigation Satellite Systems (GNSS), which are critical for navigation and positioning across civilian and military applications. Traditional anti-spoofing methods relying on multi-antenna arrays incur high hardware complexity and exhibit limited estimation accuracy under low signal-to-noise ratio (SNR) conditions, compromising their effectiveness in resource-constrained or adverse environments. This research proposes a novel single-channel high-precision sparse direction-of-arrival (DOA) estimation method aimed at suppressing spoofing signals in GNSS receivers. The primary goals are to substantially reduce the hardware complexity associated with spoofing detection and to achieve superior DOA estimation performance even in extremely low SNR scenarios. By exploiting the spatial sparsity of GNSS signals and integrating advanced signal processing techniques, this approach seeks to deliver a cost-effective, robust solution for enhancing GNSS security against deceptive interference.  Methods  The proposed method leverages a single-channel processing framework to estimate the DOA of GNSS signals with high precision, employing a multi-step strategy tailored for spoofing suppression. The process begins with reconstructing the digital intermediate frequency signal using tracking loop parameters—such as code phase and carrier Doppler—derived from a reference array element. This reconstruction capitalizes on the orthogonality of pseudo-random noise codes inherent to GNSS signals, enabling correlation between the reconstructed signal and the original array data to enhance the SNR prior to despreading. This step isolates a clean steering vector, minimizing noise and interference contributions. 
The method then harnesses the spatial sparsity of GNSS signals, which arises from the limited number of authentic satellites and potential spoofing sources in the angular domain. An overcomplete dictionary is constructed, comprising steering vectors corresponding to a grid of possible azimuth and elevation angles. The DOA estimation is reformulated as a sparse reconstruction problem, where the steering vector is represented as a sparse combination of dictionary elements. To solve this efficiently, the Alternating Direction Method of Multipliers (ADMM) is employed, iteratively optimizing a regularized objective that balances data fidelity with sparsity. A two-stage grid refinement approach—starting with a coarse search followed by a finer resolution—reduces computational demands while maintaining accuracy. Once DOA estimates are obtained, spoofing signals are identified by their angular proximity to authentic signals, and a Linearly Constrained Minimum Variance (LCMV) beamformer is applied to suppress these interferers while preserving legitimate signals.  Results and Discussions  Simulations are conducted to assess the proposed method’s performance across various low SNR conditions, using a 4×4 uniform planar array and Beidou B3I signals as a test case. The results reveal that the single-channel sparse DOA estimation method significantly outperforms traditional algorithms like Unitary ESPRIT and Cyclic MUSIC in both accuracy and resolution. In scenarios with an SNR as low as –35 dB, the proposed approach achieves root mean square errors (RMSE) for azimuth and elevation estimates below 1 degree (Fig. 2), compared to errors exceeding 30 degrees for the benchmark methods (Fig. 3(a), Fig. 3(b)). It also resolves signals separated by as little as 1 degree (Fig. 4(a), Fig. 4(b)), highlighting its superior resolution capability. Building upon the accurate DOA estimates obtained in the proposed method, LCMV beamforming successfully suppresses spoofing signals. 
As shown in Fig. 5(b), the proposed method's high-fidelity DOA estimates allow the beamformer to place deep nulls precisely at the estimated spoofing directions (e.g., (10°, 250°) and (20°, 250°)), effectively attenuating spoofers while preserving genuine signals. In contrast, the lower DOA estimation accuracy of Cyclic MUSIC (Fig. 5(a)) results in misaligned nulls and compromised suppression performance. This validates the practical utility of the high-precision DOA estimates for robust spoofing mitigation.  Conclusions  This study introduces a pioneering single-channel high-precision sparse DOA estimation method for GNSS spoofing suppression, addressing the limitations of traditional multi-antenna approaches in terms of complexity and low-SNR performance. By integrating signal reconstruction, sparse modeling, and ADMM-based optimization, the method achieves exceptional accuracy and resolution under challenging conditions, validated through simulations showing RMSE below 1 degree at –35 dB SNR. Coupled with LCMV beamforming, it effectively mitigates spoofing threats, enhancing GNSS reliability with minimal hardware requirements. This cost-effective solution is particularly valuable for resource-limited applications, reducing dependency on complex arrays while maintaining robust security. Future work could explore its adaptability to dynamic environments, such as moving spoofers or multipath scenarios, and its integration with complementary anti-spoofing techniques. Overall, this research provides a practical, high-performance framework for securing GNSS systems, with significant implications for navigation safety and operational efficiency.
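The LCMV suppression step follows directly from its closed form w = R⁻¹C(CᴴR⁻¹C)⁻¹f: unit gain toward the authentic direction and a null toward the estimated spoofer. For brevity, the sketch below uses a 4-element uniform linear array and an identity covariance in place of the paper's 4×4 planar array and measured covariance.

```python
import numpy as np

# Sketch of LCMV nulling given DOA estimates (illustrative 4-element ULA,
# half-wavelength spacing, white-noise covariance stand-in for brevity).

def steering(theta_deg, n=4, d=0.5):
    """Steering vector for a ULA with element spacing d (in wavelengths)."""
    phase = 2 * np.pi * d * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phase * np.arange(n))

a_sig = steering(10.0)       # estimated authentic-signal direction
a_spoof = steering(40.0)     # estimated spoofing direction
C = np.column_stack([a_sig, a_spoof])
f = np.array([1.0, 0.0])     # unit gain on signal, hard null on spoofer

R = np.eye(4)                # covariance placeholder (white noise)
Ri = np.linalg.inv(R)
w = Ri @ C @ np.linalg.inv(C.conj().T @ Ri @ C) @ f   # LCMV weights
```

The constraints hold exactly by construction: the beamformer response is 1 toward the authentic direction and 0 toward the spoofer, which is why null depth degrades when the DOA estimates feeding C are inaccurate, as the Cyclic MUSIC comparison illustrates.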
Multi-modal Joint Automatic Modulation Recognition Method Towards Low SNR Sequences
WANG Zhen, LIU Wei, LU Wanjie, NIU Chaoyang, LI Runsheng
Available online  , doi: 10.11999/JEIT250594
Abstract:
  Objective  The rapid evolution of data-driven intelligent algorithms and the rise of multi-modal data indicate that the future of Automatic Modulation Recognition (AMR) lies in joint approaches that integrate multiple domains, use multiple frameworks, and connect multiple scales. However, the embedding spaces of different modalities are heterogeneous, and existing models lack cross-modal adaptive representation, limiting their ability to achieve collaborative interpretation. To address this challenge, this study proposes a performance-interpretable two-stage deep learning–based AMR (DL-AMR) method that jointly models the signal in the time and transform domains. The approach explicitly and implicitly represents signals from multiple perspectives, including temporal, spatial, frequency, and intensity dimensions. This design provides theoretical support for multi-modal AMR and offers an intelligent solution for modeling low Signal-to-Noise Ratio (SNR) time sequences in open environments.  Methods  The proposed AMR network begins with a preprocessing stage, where the input signal is represented as an in-phase and quadrature (I–Q) sequence. After wavelet thresholding denoising, the signal is converted into a dual-channel representation, with one channel undergoing the Short-Time Fourier Transform (STFT). This preprocessing yields a dual-stream representation comprising both time-domain and transform-domain signals. The signal is then tokenized through time-domain and transform-domain encoders. In the first stage, explicit modal alignment is performed. The token sequences from the time and transform domains are input in parallel into a contrastive learning module, which explicitly captures and strengthens correlations between the two modalities in dimensions such as temporal structure and amplitude. The learned features are then passed into the feature fusion module. 
Bidirectional Long Short-Term Memory (BiLSTM) and local representation layers are employed to capture temporally sparse features, enabling subsequent feature decomposition and reconstruction. To refine feature extraction, a subspace attention mechanism is applied to the high-dimensional sparse feature space, allowing efficient capture of discriminative information contained in both high-frequency and low-frequency components. Finally, Convolutional Neural Network – Kolmogorov-Arnold Network (CNN-KAN) layers replace traditional multilayer perceptrons as classifiers, thereby enhancing classification performance under low SNR conditions.  Results and Discussions  The proposed method is experimentally validated on three datasets: RML2016.10a, RML2016.10b, and HisarMod2019.1. Under high SNR conditions (SNR > 0 dB), classification accuracies of 93.36%, 93.13%, and 93.37% are achieved on the three datasets, respectively. Under low SNR conditions, where signals are severely corrupted or blurred by noise, recognition performance decreases but remains robust. When the SNR ranges from –6 dB to 0 dB, overall accuracies of 78.36%, 80.72%, and 85.43% are maintained, respectively. Even at SNR levels below –6 dB, accuracies of 17.10%, 21.30%, and 29.85% are obtained. At particularly challenging low-SNR levels, the model still achieves 43.45%, 44.54%, and 60.02%. Compared with traditional approaches, and while maintaining a low parameter count (0.33–0.41 M), the proposed method improves average recognition accuracy by 2.12–7.89%, 0.45–4.64%, and 6.18–9.53% on the three datasets. The improvements under low SNR conditions are especially significant, reaching 4.89–12.70% (RML2016.10a), 2.62–8.72% (RML2016.10b), and 4.96–11.63% (HisarMod2019.1). 
The results indicate that explicit modeling of time–transform domain correlations through contrastive learning, combined with the hybrid architecture consisting of LSTM for temporal sequence modeling, CNN for local feature extraction, and KAN for nonlinear approximation, substantially enhances the noise robustness of the model.  Conclusions  This study proposes a two-stage AMR method based on time–transform domain multimodal fusion. Explicit multimodal alignment is achieved through contrastive learning, while temporal and local features are extracted using a combination of LSTM and CNN. The KAN is used to enhance nonlinear modeling, enabling implicit feature-level multimodal fusion. Experiments conducted on three benchmark datasets demonstrate that, compared with classical methods, the proposed approach improves recognition accuracy by 2.62–11.63% within the SNR range of –20 to 0 dB, while maintaining a similar number of parameters. The performance gains are particularly significant under low-SNR conditions, confirming the effectiveness of multimodal joint modeling for robust AMR.
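The dual-stream preprocessing can be approximated with a framed-FFT magnitude spectrogram standing in for the STFT channel. The frame and hop sizes, and the toy single-tone I-Q signal, are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

# Illustrative sketch of the dual-stream input: the raw I-Q sequence forms the
# time-domain channel, and a framed-FFT magnitude spectrogram stands in for
# the STFT transform-domain channel. Frame/hop sizes are arbitrary choices.

def frame_spectrogram(iq, frame=32, hop=16):
    """Magnitude spectrogram of a complex I-Q sequence via framed FFT."""
    n_frames = 1 + (len(iq) - frame) // hop
    frames = np.stack([iq[i * hop: i * hop + frame] for i in range(n_frames)])
    return np.abs(np.fft.fft(frames, axis=1))

t = np.arange(256)
iq = np.exp(2j * np.pi * 0.1 * t)          # toy single-tone I-Q sequence
spec = frame_spectrogram(iq)

# Dual-stream representation: (time-domain I/Q channels, transform-domain view)
streams = (np.stack([iq.real, iq.imag]), spec)
```

For the toy tone at 0.1 cycles/sample, the spectral energy concentrates near FFT bin 3 of each 32-point frame, showing how the transform-domain stream exposes frequency structure that the raw sequence carries only implicitly.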
A Vehicle-Infrastructure Cooperative 3D Object Detection Scheme Based on Adaptive Feature Selection
LIANG Yan, YANG Huilin, SHAO Kai
Available online  , doi: 10.11999/JEIT250601
Abstract:
  Objective  Vehicle-infrastructure cooperative Three-Dimensional (3D) object detection is viewed as a core technology for intelligent transportation systems. As autonomous driving advances, the fusion of roadside and vehicle-mounted LiDAR data provides beyond-line-of-sight perception for vehicles, offering clear potential for improving traffic safety and efficiency. Conventional cooperative perception, however, is constrained by limited communication bandwidth and insufficient aggregation of heterogeneous data, which restricts the balance between detection performance and bandwidth usage. These constraints hinder the practical deployment of cooperative perception in complex traffic environments. This study proposes an Adaptive Feature Selection-based Vehicle-Infrastructure Cooperative 3D Object Detection Scheme (AFS-VIC3D) to address these challenges. Spatial filtering theory is used to identify and transmit the critical features required for detection, improving 3D perception performance while reducing bandwidth consumption.  Methods  AFS-VIC3D uses a coordinated design for roadside and vehicle-mounted terminals. Incoming point clouds are encoded into Bird’s-Eye View (BEV) features through PointPillar encoders, and metadata synchronization ensures spatiotemporal alignment. At the roadside terminal, key features are selected using two parallel branches: a Graph Structure Feature Enhancement Module (GSFEM) and an Adaptive Communication Mask Generation Module (ACMGM). Multi-scale features are then extracted hierarchically with a ResNet backbone. The outputs of both branches are fused through elementwise multiplication to generate optimized features for transmission. At the vehicle-mounted terminal, BEV features are processed using homogeneous backbones and fused through a Multi-Scale Feature Aggregation (MSFA) module across scale, spatial, and channel dimensions, reducing sensor heterogeneity and improving detection robustness.  
  Results and Discussions  The effectiveness and robustness of AFS-VIC3D are validated on both the DAIR-V2X real-world dataset and the V2XSet simulation dataset. Comparative experiments (Table 1, Fig. 5) show that the model attains higher detection accuracy with lower communication overhead and exhibits slower degradation under low-bandwidth conditions. Ablation studies (Table 2) demonstrate that each module (GSFEM, ACMGM, and MSFA) contributes to performance. GSFEM improves the discriminability of target features, and ACMGM used with GSFEM further reduces communication cost. A comparison of feature transmission methods (Table 3) shows that adaptive sampling based on scene complexity and target density (C-DASFAN) yields higher accuracy and lower bandwidth usage, confirming the advantage of ACMGM. BEV visualizations (Fig. 6) indicate that predicted bounding boxes align closely with ground truth with minimal redundancy. Analysis of complex scenarios (Fig. 7) shows fewer missed detections and false positives, demonstrating robustness in high-density and complex road environments. Feature-level visualization (Fig. 8) further verifies that GSFEM and ACMGM enhance target features and suppress background noise, improving overall detection performance.  Conclusions  This study presents AFS-VIC3D, which addresses the key challenges of limited communication bandwidth and heterogeneous data aggregation through a coordinated design combining roadside dual-branch feature optimization and vehicle-mounted MSFA. The GSFEM module uses graph neural networks to enhance the discriminability of target features, the ACMGM module optimizes communication resources through communication mask generation, and the MSFA module improves heterogeneous data aggregation between vehicle and infrastructure terminals through joint spatial and channel aggregation. 
Experiments on the DAIR-V2X and V2XSet datasets show that AFS-VIC3D improves 3D detection accuracy while lowering communication overhead, with clear advantages in complex traffic scenarios. The framework offers a practical and effective solution for vehicle-infrastructure cooperative 3D perception and demonstrates strong potential for deployment in bandwidth-constrained intelligent transportation systems.
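A communication mask in the spirit of ACMGM can be sketched as a top-k selection over BEV cells, trading transmitted bandwidth against spatial coverage. The random confidence map and the value of k below are stand-ins for illustration, not the paper's learned mask generator.

```python
import numpy as np

# Illustrative confidence-driven communication mask: only the k most
# informative BEV cells are marked for transmission. The confidence map here
# is random stand-in data; in AFS-VIC3D the mask is generated adaptively.

def topk_mask(confidence, k):
    """Boolean mask keeping the k highest-confidence BEV cells."""
    flat = confidence.ravel()
    idx = np.argpartition(flat, -k)[-k:]       # indices of the k largest values
    mask = np.zeros(flat.size, dtype=bool)
    mask[idx] = True
    return mask.reshape(confidence.shape)

rng = np.random.default_rng(0)
conf_map = rng.random((64, 64))                # stand-in roadside confidence map
mask = topk_mask(conf_map, k=256)
kept_ratio = float(mask.mean())                # fraction of cells transmitted
```

Transmitting only the masked features, here 1/16 of the BEV grid, is the mechanism by which such schemes cut communication overhead while keeping the cells most likely to contain targets.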
High Area-efficiency Radix-4 Number Theoretic Transform Hardware Architecture with Conflict-free Memory Access Optimization for Lattice-based Cryptography
ZHENG Jiwen, ZHAO Shilei, ZHANG Ziyue, LIU Zhiwei, YU Bin, HUANG Hai
Available online  , doi: 10.11999/JEIT250687
Abstract:
  Objective  The advancement of Post-Quantum Cryptography (PQC) standardization increases the demand for efficient Number Theoretic Transform (NTT) hardware modules. Existing high-radix NTT studies primarily optimize in-place computation and configurability, yet the performance is constrained by complex memory access behavior and a lack of designs tailored to the parameter characteristics of lattice-based schemes. To address these limitations, a high area-efficiency radix-4 NTT design with a Constant-Geometry (CG) structure is proposed. The modular multiplication unit is optimized through an analysis of common modulus properties and the integration of multi-level operations, while memory allocation and address-generation strategies are refined to reduce capacity requirements and improve data-access efficiency. The design supports out-of-place storage and achieves conflict-free memory access, providing an effective hardware solution for radix-4 CG NTT implementation.  Methods  At the algorithmic level, the proposed radix-4 CG NTT/INTT employs a low-complexity design and removes the bit-reversal step to reduce multiplication count and computation cycles, with a redesigned twiddle-factor access scheme. For the modular multiplication step, which is the most time-consuming stage in the radix-4 butterfly, the critical path is shortened by integrating the multiplication with the first-stage K−RED reduction and simplifying the correction logic. To support three parameter configurations, a scalable modular-multiplication method is developed through an analysis of the shared properties of the moduli. At the architectural level, two coefficients are concatenated and stored at the same memory address. A data-decomposition and reorganization scheme is designed to coordinate memory interaction with the dual-butterfly units efficiently. 
To achieve conflict-free memory access, a cyclic memory-reuse strategy is employed, and read and write address-generation schemes using sequential and stepped access patterns are designed, which reduces required memory capacity and lowers control-logic complexity.  Results and Discussions  Experimental results on Field Programmable Gate Arrays demonstrate that the proposed NTT architecture achieves high operating frequency and low resource consumption under three parameter configurations, together with notable improvement in the Area–Time Product (ATP) compared with existing designs (Table 1). For the configuration with 256 terms and a modulus of 7 681, the design uses 2 397 slices, 4 BRAMs, and 16 DSPs, achieves an operating frequency of 363 MHz, and yields at least a 56.4% improvement in ATP. For the configuration with 256 terms and a modulus of 8 380 417, it uses 3 760 slices, 6 BRAMs, and 16 DSPs, achieves an operating frequency of 338 MHz, and yields at least a 69.8% improvement in ATP. For the configuration with 1 024 terms and a modulus of 12 289, it uses 2 379 slices, 4 BRAMs, and 16 DSPs, achieves an operating frequency of 357 MHz, and yields at least a 50.3% improvement in ATP.  Conclusions  A high area-efficiency radix-4 NTT hardware architecture for lattice-based PQC is proposed. The use of a low-complexity radix-4 CG NTT/INTT and the removal of the bit-reversal step reduce latency. Through an analysis of shared characteristics among three moduli and the merging of partial computations, a scalable modular-multiplication architecture based on K²−RED reduction is designed. The challenges of increased storage requirements and complex address-generation logic are addressed by reusing memory efficiently and designing sequential and stepped address-generation schemes. Experimental results show that the proposed design increases operating frequency and reduces resource consumption, yielding lower ATP under all three parameter configurations. 
As the present work focuses on a dual-butterfly architecture, future research may examine higher-parallelism designs to meet broader performance requirements.
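The arithmetic identity behind K-RED-style reduction, which the scalable modular multiplier builds on, can be checked in a few lines of Python. The sketch below is a software illustration only, not the paper's hardware pipeline; the decomposition of each modulus as q = k·2^m + 1 follows directly from the three moduli quoted in the results (7 681, 12 289, and 8 380 417).

```python
# Illustrative sketch: K-RED reduction for moduli of the form q = k*2^m + 1.
# It returns a value congruent to k*c (mod q) using only a bit split, a small
# multiply, and a subtraction -- no division by q. The extra factor k is what
# a hardware design folds into precomputed twiddle factors.

# The three supported moduli, each written as k * 2^m + 1.
MODULI = {
    7681:    (15, 9),     # 15   * 2^9  + 1
    12289:   (3, 12),     # 3    * 2^12 + 1
    8380417: (1023, 13),  # 1023 * 2^13 + 1 (the Dilithium modulus)
}

def kred(c: int, q: int) -> int:
    """Return a value congruent to k*c (mod q)."""
    k, m = MODULI[q]
    cl = c & ((1 << m) - 1)   # low m bits
    ch = c >> m               # high part
    # k*2^m = q - 1 = -1 (mod q), so k*c = k*cl + k*2^m*ch = k*cl - ch (mod q)
    return k * cl - ch

# Sanity check: the decomposition holds and kred(c) really equals k*c mod q.
for q, (k, m) in MODULI.items():
    assert q == k * (1 << m) + 1
    for c in (0, 1, q - 1, 123456, (q - 1) * (q - 1)):
        assert (kred(c, q) - k * c) % q == 0
```

Applying the reduction twice (K²-RED) keeps products of two reduced values within a narrow range, which is why the correction logic in such designs can stay simple.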
Tensor-Train Decomposition for Lightweight Liver Tumor Segmentation
MA Jinlin, YANG Jipeng
Available online  , doi: 10.11999/JEIT250293
Abstract:
  Objective  Convolutional Neural Networks (CNNs) have recently achieved notable progress in medical image segmentation. Their conventional convolution operations, however, remain constrained by locality, which reduces their ability to capture global contextual information. Researchers have pursued two main strategies to address this limitation. Hybrid CNN–Transformer architectures use self-attention to model long-range dependencies, and this markedly improves segmentation accuracy. State-space models such as the Mamba series reduce computational cost and retain global modeling capacity, and they also show favorable scalability. Even so, CNN–Transformer models remain computationally demanding for real-time use, and Mamba-based approaches still face challenges such as boundary blur and parameter redundancy when segmenting small targets and low-contrast regions. Lightweight network design has therefore become a research focus. Existing lightweight methods, however, still show limited segmentation accuracy for liver tumor targets with very small sizes and highly complex boundaries. This paper proposes an efficient lightweight method for liver tumor segmentation that aims to meet the combined requirements of high accuracy and real-time performance for small targets with complex boundaries.  Methods  The proposed method integrates three strategies. A Tensor-Train Multi-Scale Convolutional Attention (TT-MSCA) module is designed to improve segmentation accuracy for small targets and complex boundaries. This module optimizes multi-scale feature fusion through a TT_Layer and employs tensor decomposition to integrate feature information across scales, which supports more accurate identification and segmentation of tumor regions in challenging images. A feature extraction module with a multi-branch residual structure, termed the IncepRes Block, strengthens the model’s capacity to capture global contextual information. 
Its parallel multi-branch design processes features at several scales and enriches feature representation at a relatively low computational cost. All standard 3×3 convolutions are then decoupled into two consecutive strip convolutions. This reduces the number of parameters and computational cost while preserving feature extraction capacity. The combination of these modules allows the method to improve segmentation accuracy and maintain high efficiency, and it demonstrates strong performance for small targets and blurry boundary regions.  Results and Discussions  Experiments on the LiTS2017 and 3Dircadb datasets show that the proposed method reaches Dice coefficients of 98.54% and 97.95% for liver segmentation, and 94.11% and 94.35% for tumor segmentation. Ablation studies show that the TT-MSCA module and the IncepRes Block improve segmentation performance with only a modest computational cost, and the SC Block reduces computational cost while accuracy is preserved (Table 2). When the TT-MSCA module is inserted into the reduced U-Net on the LiTS2017 dataset, the tumor Dice and IoU reach 93.73% and 83.60%. These values are second only to the final model. On the 3Dircadb dataset, adding the SC Block after TT-MSCA produces a slight accuracy decrease but reduces GFLOPs by a factor of 4.15. Compared with the original U-Net, the present method improves liver IoU by 3.35% and tumor IoU by 5.89%. The TT-MSCA module also consistently exceeds the baseline MSCA module. It increases liver and tumor IoU by 2.59% and 1.95% on LiTS2017, and by 2.03% and 3.13% on 3Dircadb (Table 5). These results show that the TT_Layer strengthens global context perception and fine-detail representation through multi-scale feature fusion. The proposed network contains 0.79 M parameters and 1.43 GFLOPs, which represents a 74.9% reduction in parameters compared with CMUNeXt (3.15 M). 
Real-time performance evaluation records 156.62 FPS, more than three times the 50.23 FPS of the vanilla U-Net (Table 6). Although accuracy decreases slightly in a few isolated metrics, the overall accuracy–compression balance is improved, and the method demonstrates strong practical value for lightweight liver tumor segmentation.  Conclusions  This paper proposes an efficient liver tumor segmentation method that improves segmentation accuracy and meets real-time requirements. The TT-MSCA module enhances recognition of small targets and complex boundaries through the integration of spatial and channel attention. The IncepRes Block strengthens the network’s perception of liver tumors of different sizes. The decoupling of standard 3×3 convolutions into two consecutive strip convolutions reduces the parameter count and computational cost while preserving feature extraction capacity. Experimental evidence shows that the method reduces errors caused by complex boundaries and small tumor sizes and can satisfy real-time deployment needs. It offers a practical technical option for liver tumor segmentation. The method requires many training iterations to reach optimal data fitting, and future work will address improvements in convergence speed.
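The parameter saving from decoupling a 3×3 convolution into two consecutive strip convolutions can be illustrated directly. The single-channel NumPy sketch below is illustrative, not the paper's SC Block: it shows the per-channel 9 → 6 weight reduction, and the exact equivalence that holds when the 3×3 kernel is separable (an outer product of the two strips).

```python
# Illustrative sketch: a 1x3 strip convolution followed by a 3x1 strip
# convolution versus one 3x3 convolution, on a single channel.
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

col = rng.standard_normal((3, 1))   # 3x1 strip
row = rng.standard_normal((1, 3))   # 1x3 strip
full = col @ row                    # the separable 3x3 kernel they compose

# Two consecutive strip convolutions == one 3x3 convolution (separable case).
two_step = conv2d_valid(conv2d_valid(x, row), col)
one_step = conv2d_valid(x, full)
assert np.allclose(two_step, one_step)

# Per-channel weight count drops from 9 to 3 + 3 = 6 (a one-third saving).
print(full.size, col.size + row.size)   # 9 6
```

A general (non-separable) 3×3 kernel is not exactly reproducible this way; the design trades that expressiveness for the parameter and FLOP reduction.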
LLM-based Data Compliance Checking for Internet of Things Scenarios
LI Chaohao, WANG Haoran, ZHOU Shaopeng, YAN Haonan, ZHANG Feng, LU Tianyang, XI Ning, WANG Bin
Available online  , doi: 10.11999/JEIT250704
Abstract:
  Objective  The implementation of regulations such as the Data Security Law of the People’s Republic of China, the Personal Information Protection Law of the People’s Republic of China, and the European Union General Data Protection Regulation (GDPR) has established data compliance checking as a central mechanism for regulating data processing activities, ensuring data security, and protecting the legitimate rights and interests of individuals and organizations. However, the characteristics of the Internet of Things (IoT), defined by large numbers of heterogeneous devices and the dynamic, extensive, and variable nature of transmitted data, increase the difficulty of compliance checking. Logs and traffic data generated by IoT devices are long, unstructured, and often ambiguous, which results in a high false-positive rate when traditional rule-matching methods are applied. In addition, the dynamic business environments and user-defined compliance requirements further increase the complexity of rule design, maintenance, and decision-making.  Methods  A large language model-driven data compliance checking method for IoT scenarios is proposed to address the identified challenges. In the first stage, a fast regular expression matching algorithm is employed to efficiently screen potential non-compliant data based on a comprehensive rule database. This process produces structured preliminary checking results that include the original non-compliant content and the corresponding violation type. The rule database incorporates current legislation and regulations, standard requirements, enterprise norms, and customized business requirements, and it maintains flexibility and expandability. By relying on the efficiency of regular expression matching and generating structured preliminary results, this stage addresses the difficulty of reviewing large volumes of long IoT text data and enhances the accuracy of the subsequent large language model review. 
In the second stage, a Large Language Model (LLM) is employed to evaluate the precision of the initial detection results. For different categories of violations, the LLM adaptively selects different prompt words to perform differentiated classification detection.  Results and Discussions  Data are collected from 52 IoT devices operating in a real environment, including log and traffic data (Table 2). A compliance-checking rule library for IoT devices is established in accordance with the Cybersecurity Law, the Data Security Law, other relevant regulations, and internal enterprise information-security requirements. Based on this library, the collected data undergo a first-stage rule-matching process, yielding a false-positive rate of 64.3% and identifying 55 080 potential non-compliant data points. Three aspects are examined: benchmark models, prompt schemes, and role prompts. In the benchmark model comparison, eight mainstream large language models are used to evaluate detection performance (Table 5), including Qwen2.5-32B-Instruct, DeepSeek-R1-70B, and DeepSeek-R1-0528 with different parameter configurations. After review and testing by the large language model, the initial false-positive rate is reduced to 6.9%, which demonstrates a substantial improvement in the quality of compliance checking. The model’s own error rate remains below 0.01%. The prompt-engineering assessment shows that prompt design exerts a strong effect on review accuracy (Table 6). When general prompts are applied, the final false-positive rate remains high at 59%. When only chain-of-thought prompts or concise sample prompts are used, the false-positive rate is reduced to approximately 12% and 6%, respectively, and the model’s own error rate decreases to about 30% and 13%. Combining these strategies further reduces the error rate of the small-sample prompt approach to 0.01%. The effect of system-role prompt words on review accuracy is also evaluated (Table 7). 
Simple role prompts yield higher accuracy and F1 scores than the absence of role prompts, whereas detailed role prompts provide a clearer overall advantage than simple role prompts. Ablation experiments (Table 8) further examine the contribution of rule classification and prompt engineering to compliance checking. Knowledge supplementation is applied to reduce interference and misjudgment among rules, lower prompt redundancy, and decrease the false-alarm rate during large language model review.  Conclusions  A large language model-driven data compliance checking method for IoT scenarios is presented. The method is designed to address the challenge of assessing compliance in large-scale unstructured device data. Its feasibility is verified through rationality analysis experiments, and the results indicate that false-positive rates are effectively reduced during compliance checking. The initial rule-based method yields a false-positive rate of 64.3%, which is reduced to 6.9% after review by the large language model. Additionally, the error introduced by the model itself is maintained below 0.01%.
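The first stage of the two-stage pipeline can be sketched as a regex prescreen that emits structured findings for the LLM reviewer. The rule names, patterns, and sample logs below are invented placeholders for illustration, not the paper's actual rule database.

```python
# Illustrative sketch of stage 1: fast regex matching over raw IoT log lines,
# producing structured preliminary results (original content + violation type)
# that a second-stage LLM review would then confirm or reject.
import re

RULES = {
    # violation type -> compiled pattern (placeholders, not the real rule set)
    "plaintext_id_number": re.compile(r"\b\d{17}[\dXx]\b"),           # 18-char ID
    "plaintext_phone":     re.compile(r"\b1[3-9]\d{9}\b"),            # CN mobile
    "hardcoded_secret":    re.compile(r"(?i)(password|secret)\s*[:=]\s*\S+"),
}

def prescreen(lines):
    """Stage 1: regex screen -> structured preliminary findings."""
    findings = []
    for lineno, line in enumerate(lines, 1):
        for vtype, pat in RULES.items():
            m = pat.search(line)
            if m:
                findings.append({
                    "line": lineno,
                    "violation_type": vtype,
                    "matched": m.group(0),
                    "context": line.strip(),   # handed to the LLM reviewer
                })
    return findings

logs = [
    "2024-05-01 dev=cam-03 upload ok",
    "user register phone=13912345678 status=ok",
    "config loaded: password=hunter2",
]
for f in prescreen(logs):
    print(f["violation_type"], f["matched"])
# -> plaintext_phone 13912345678
# -> hardcoded_secret password=hunter2
```

The stage-2 reviewer would then receive each finding's `context` together with a violation-type-specific prompt, which is where the reported drop from 64.3% to 6.9% false positives is achieved.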
Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation
ZHAO Chuanbin, XU Weihua, LIN Bo, ZHANG Tengyu, FENG Yuan, GAO Feifei
Available online  , doi: 10.11999/JEIT250685
Abstract:
  Objective  Integrated Sensing And Communications (ISAC) is regarded as a key enabling technology for Sixth-Generation mobile communications (6G), as it simultaneously senses and monitors information in the physical world while maintaining communication with users. The technology supports emerging scenarios such as low-altitude economy, digital twin systems, and vehicle networking. Current ISAC research primarily concentrates on wireless devices that include base stations and terminals. Visual sensing, which provides strong visibility and detailed environmental information, has long been a major research direction in computer science. This study proposes the integration of visual sensing with wireless-device sensing to construct a multimodal ISAC system. In this system, visual sensing captures environmental information to assist wireless communications, and wireless signals help overcome limitations inherent to visual sensing.  Methods  The study first explores the correlation mechanism between environmental vision and wireless communications. Key algorithms for visual-sensing-assisted wireless communication are then discussed, including beam prediction, occlusion prediction, and resource scheduling and allocation methods for multiple base stations and users. These schemes demonstrate that visual sensing, used as prior information, enhances the communication performance of the multimodal ISAC system. The sensing gains provided by wireless devices combined with visual sensors are subsequently explored. A static-environment reconstruction scheme and a dynamic-target sensing scheme based on wireless–visual fusion are proposed to obtain global information about the physical world. In addition, a “vision–communication” simulation and measurement dataset is constructed, establishing a complete theoretical and technical framework for multimodal ISAC.  
Results and Discussions  For visual-sensing-assisted wireless communications, the hardware prototype system constructed in this study is shown in (Fig. 6) and (Fig. 7), and the corresponding hardware test results are presented in (Table 1). The results show that visual sensing assists millimetre-wave communications in performing beam alignment and beam prediction more effectively, thereby improving system communication performance. For wireless-communication-assisted sensing, the hardware prototype system is shown in (Fig. 8), and the experimental results are shown in (Fig. 9) and (Table 2). The static-environment reconstruction obtained through wireless–visual fusion shows improved robustness and higher accuracy. Depth estimation based on visual and communication fusion also presents strong robustness in rainy and snowy weather, with the RMSE reduced by approximately 50% compared with pure visual algorithms. These experimental results indicate that vision-enabled multimodal ISAC systems present strong potential for practical application.  Conclusions  A multimodal ISAC system that integrates visual sensing with wireless-device sensing is proposed. In this system, visual sensing captures environmental information to assist wireless communications, and wireless signals help overcome the inherent limitations of visual sensing. Key algorithms for visual-sensing-assisted wireless communication are examined, including beam prediction, occlusion prediction, and resource scheduling and allocation for multiple base stations and users. The sensing gains brought by wireless devices combined with visual sensors are also analysed. Static-environment reconstruction and dynamic-target sensing schemes based on wireless–visual fusion are proposed to obtain global information about the physical world. A “vision–communication” simulation and measurement dataset is further constructed, forming a coherent theoretical and technical framework for multimodal ISAC. 
Experimental results show that vision-enabled multimodal ISAC systems present strong potential for use in 6G networks.
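As a minimal illustration of how visual sensing can act as a communication prior, the sketch below picks the codebook beam nearest a camera-estimated user direction, assuming a half-wavelength ULA with a DFT codebook whose beams are uniformly spaced in sin-space. The learned beam-prediction models discussed above are far richer; this only shows the geometric shortcut that replaces an exhaustive beam sweep.

```python
# Illustrative sketch: vision-aided beam selection from a DFT codebook.
# N, the codebook layout, and the example angle are assumptions.
import numpy as np

N = 32                                          # codebook size (assumed)
# DFT beams on a half-wavelength ULA point uniformly in sin-space.
beam_dirs = np.arcsin(np.linspace(-1 + 1 / N, 1 - 1 / N, N))

def vision_aided_beam(theta_cam: float) -> int:
    """Index of the beam closest (in sin-space) to the camera-estimated
    user direction theta_cam (radians) -- a lookup instead of an
    over-the-air sweep of all N beams."""
    return int(np.argmin(np.abs(np.sin(beam_dirs) - np.sin(theta_cam))))

# A user the camera places at roughly 20 degrees azimuth:
idx = vision_aided_beam(np.deg2rad(20.0))
print(idx, float(np.rad2deg(beam_dirs[idx])))
```

In practice the camera supplies a noisy position estimate, so a learned predictor would output a short ranked list of candidate beams rather than a single index.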
IRS Deployment for Highly Time Sensitive Short Packet Communications: Distributed or Centralized Deployment?
ZHANG Yangyi, GUAN Xinrong, YANG Weiwei, CAO Kuo, WANG Meng, CAI Yueming
Available online  , doi: 10.11999/JEIT250720
Abstract:
  Objective  The rapid advancement of the Industrial Internet of Things (IIoT) creates latency-sensitive applications such as environmental monitoring and precision control, which depend on short-packet communications and require strict timeliness of information delivery. An Intelligent Reflecting Surface (IRS) is regarded as a feasible method to enhance the reliability and timeliness of these communications because its reflection coefficients can be dynamically adjusted. Previous work has mainly focused on optimizing the phase shifts of IRS elements, whereas the potential gains associated with flexible IRS deployment have not been fully examined. Adjusting the physical placement of IRS units provides additional degrees of freedom that can improve timeliness performance. Two representative deployment strategies, distributed IRS and centralized IRS, form different effective channels and result in different capacity characteristics. This study investigates and compares these deployment modes in IRS-assisted short-packet communication systems. By assessing their Age of Information (AoI) performance under practical channel estimation overheads, the analysis offers guidance on selecting deployment strategies that achieve superior timeliness under diverse system conditions.  Methods  The paper investigates an IRS-assisted short-packet communication system in which multiple terminal devices transmit short packets to an Access Point (AP) through IRS reflection. Two deployment strategies are considered: distributed and centralized IRS. In the distributed scheme, each device is supported by a dedicated IRS with M reflecting elements positioned nearby. In the centralized scheme, all IRS elements are placed near the AP. The average AoI is used as the performance metric to compare the timeliness of these strategies. The complex distribution of the composite channel gain makes closed-form average AoI analysis difficult. 
To address this issue, the Moment Matching (MM) approximation is employed to estimate the distribution of the composite channel gain. By incorporating pilot overhead into the analytical model, closed-form expressions for the average AoI of both deployment schemes are obtained, enabling a thorough performance comparison.  Results and Discussions  Simulation results show that the AoI performance of distributed and centralized IRS deployments differs under varying system conditions. When the IRS carries a large number of reflecting elements, the distributed configuration yields better AoI performance (Fig. 4). Under high transmission power, the centralized configuration presents improved AoI performance (Fig. 5). For scenarios with long AP–device distances, the distributed deployment produces more favorable AoI results (Fig. 6). As the system bandwidth increases, the centralized architecture shows a rapid decrease in AoI and eventually performs better than the distributed configuration (Fig. 7).  Conclusions  This study provides a comparative analysis of timeliness performance in IRS-assisted short-packet communication systems under distributed and centralized deployment strategies. The MM method is employed to approximate the composite channel gain with a gamma distribution, which supports the derivation of an approximate expression for the average packet error rate. A closed-form expression for the average AoI is then developed by accounting for channel estimation overhead. Simulation results show that the two deployment strategies exhibit different AoI advantages under varying operating conditions. The distributed configuration achieves better AoI performance when a large number of reflecting elements is used or when the AP–device distance is long. The centralized configuration provides improved AoI performance under high transmission power or wide system bandwidth.
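The MM step itself reduces to equating two empirical moments with those of a Gamma distribution: shape k = μ²/σ² and scale θ = σ²/μ. The cascaded-Rayleigh composite gain below is a simplified stand-in for the paper's channel model, used only to show the fit.

```python
# Illustrative sketch: moment-matching a composite IRS channel gain to a
# Gamma distribution. M, the hop model, and ideal phase alignment are
# simplifying assumptions, not the paper's exact system model.
import numpy as np

rng = np.random.default_rng(1)
M, trials = 16, 200_000        # reflecting elements (assumed), MC samples

def rayleigh_mag(shape):
    """Magnitude of a unit-power complex Gaussian channel coefficient."""
    re = rng.normal(0.0, np.sqrt(0.5), shape)
    im = rng.normal(0.0, np.sqrt(0.5), shape)
    return np.abs(re + 1j * im)

h = rayleigh_mag((trials, M))   # device -> IRS hop
g = rayleigh_mag((trials, M))   # IRS -> AP hop

# With ideal IRS phase alignment the cascaded amplitudes add coherently,
# and the composite power gain is the square of their sum.
gain = (h * g).sum(axis=1) ** 2

# Moment matching: choose Gamma(shape k, scale theta) with the same
# mean and variance as the empirical composite gain.
mu, var = gain.mean(), gain.var()
k_shape, theta = mu**2 / var, var / mu
print(f"Gamma fit: shape k = {k_shape:.1f}, scale theta = {theta:.3f}")
```

Once the gain is approximated as Gamma-distributed, the packet error rate and hence the average AoI admit closed-form treatment, which is the role MM plays in the analysis.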
Knowledge-guided Few-shot Earth Surface Anomalies Detection
JI Hong, GAO Zhi, CHEN Boan, AO Wei, CAO Min, WANG Qiao
Available online  , doi: 10.11999/JEIT251000
Abstract:
  Objective   Earth Surface Anomalies (ESAs), defined as sudden natural or human-generated disruptions on the Earth’s surface, present severe risks and widespread effects. Timely and accurate ESA detection is therefore essential for public security and sustainable development. Remote sensing offers an effective approach for this task. However, current deep learning models remain limited due to the scarcity of labeled data, the complexity of anomalous backgrounds, and distribution shifts across multi-source remote sensing imagery. To address these issues, this paper proposes a knowledge-guided few-shot learning method. Large language models generate abstract textual descriptions of normal and anomalous geospatial features. These descriptions are encoded and fused with visual prototypes to construct a cross-modal joint representation. The integrated representation improves prototype discriminability in few-shot settings and demonstrates that linguistic knowledge strengthens ESA detection. The findings suggest a feasible direction for reliable disaster monitoring when annotated data are limited.  Methods   The knowledge-guided few-shot learning method is constructed on a metric-based paradigm in which each episode contains support and query sets, and classification is achieved by comparing query features with class prototypes through distance-based similarity and cross-entropy optimization (Fig. 1). To supplement limited visual prototypes, class-level textual descriptions are generated with ChatGPT through carefully designed prompts, producing semantic sentences that characterize the appearance, attributes, and contextual relations of normal and anomalous categories (Fig. 2, 3). These descriptions encode domain-specific properties such as anomaly extent, morphology, and environmental effect, which are otherwise difficult to capture when only a few visual samples are available. 
The sentences are encoded with a Contrastive Language–Image Pre-training (CLIP) text encoder, and task-adaptive soft prompts are introduced by generating tokens from support features and concatenating them with static embeddings to form adaptive word embeddings. Encoded sentence vectors are processed with a lightweight self-attention module to model dependencies across multiple descriptions and to obtain a coherent paragraph-level semantic representation (Fig. 4). The resulting semantic prototypes are fused with the visual prototypes through weighted addition to produce cross-modal prototypes that integrate visual grounding and linguistic abstraction. During training, query samples are compared with the cross-modal prototypes, and optimization is guided by two objectives: a classification loss that enforces accurate query–prototype alignment, and a prototype regularization loss that ensures semantic prototypes are discriminative and well separated. The entire method is implemented in an episodic training framework (Algorithm 1).  Results and Discussions   The proposed method is evaluated under both cross-domain and in-domain few-shot settings. In the cross-domain case, models are trained on NWPU45 or AID and tested on ESAD to assess ESAs recognition. As shown in the comparisons (Table 2), traditional meta-learning methods such as MAML and Meta-SGD reach accuracies below 50%, whereas metric-based baselines such as ProtoNet and RelationNet demonstrate greater stability but remain limited. The proposed method reaches 61.99% on the NWPU45→ESAD and 59.79% on the AID→ESAD settings, outperforming ProtoNet by 4.72% and 2.67% respectively. In the in-domain setting, where training and testing are conducted on the same dataset, the method reaches 76.94% on NWPU45 and 72.98% on AID, and consistently exceeds state-of-the-art baselines such as S2M2 and IDLN (Table 3). Ablation experiments further support the contribution of each component. 
Using only visual prototypes produces accuracies of 57.74% and 72.16%, and progressively incorporating simple class names, task-oriented templates, and ChatGPT-generated descriptions improves performance. The best accuracy is achieved by combining ChatGPT descriptions, learnable tokens, and an attention-based mechanism, reaching 61.99% and 76.94% (Table 4). Parameter sensitivity analysis shows that an appropriate weight for language features (α = 0.2) and the use of two learnable tokens yield optimal performance (Fig. 5).  Conclusions   This paper addresses ESA detection in remote sensing imagery through a knowledge-guided few-shot learning method. The approach uses large language models to generate abstract textual descriptions for anomaly categories and conventional remote sensing scenes, thereby constructing multimodal training and testing resources. These descriptions are encoded into semantic feature vectors with a pretrained text encoder. To extract task-specific knowledge, a dynamic token learning strategy is developed in which a small number of learnable parameters are guided by visual samples within few-shot tasks to generate adaptive semantic vectors. An attention-based semantic knowledge module models dependencies among language features and produces cross-modal semantic vectors for each class. By fusing these vectors with visual prototypes, the method forms joint multimodal representations used for query–prototype matching and network optimization. Experimental evaluations show that the method effectively leverages prior knowledge contained in pretrained models, compensates for limited visual data, and improves feature discriminability for anomaly recognition. Both cross-domain and in-domain results confirm consistent gains over competitive baselines, highlighting the potential of the approach for reliable application in real-world remote sensing anomaly detection scenarios.
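The prototype-fusion step can be sketched in a few lines. The toy features, dimensions, and simple nearest-prototype rule below are illustrative; only the weighted addition with α = 0.2 (the weight the sensitivity analysis reports as best) mirrors the described design.

```python
# Illustrative sketch: fusing visual prototypes with language-derived
# semantic prototypes by weighted addition, then classifying a query by
# nearest prototype (metric-based few-shot paradigm).
import numpy as np

rng = np.random.default_rng(0)
D, n_classes, shots = 64, 5, 3            # embedding dim, 5-way, 3-shot (toy)

support = rng.standard_normal((n_classes, shots, D))   # stand-in visual feats
text_proto = rng.standard_normal((n_classes, D))       # stand-in encoded text

visual_proto = support.mean(axis=1)                    # per-class visual mean
alpha = 0.2                                            # language-feature weight
proto = (1 - alpha) * visual_proto + alpha * text_proto  # cross-modal prototype

def classify(query):
    """Nearest-prototype rule (Euclidean distance)."""
    d = np.linalg.norm(proto - query, axis=1)
    return int(np.argmin(d))

# A query drawn near class 2's support set lands in class 2.
query = support[2].mean(axis=0) + 0.05 * rng.standard_normal(D)
print(classify(query))
```

In training, a cross-entropy loss over the (negative) query–prototype distances replaces this hard argmin, and a regularization term keeps the semantic prototypes well separated.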
An Overview of Integrated Sensing and Communication for the Low Altitude Economy
ZHU Zhengyu, WEN Xinping, LI Xingwang, WEI Zhiqing, ZHANG Peichang, LIU Fan, FENG Zhiyong
Available online  , doi: 10.11999/JEIT250747
Abstract:
The Low-altitude Internet of Things (IoT) is developing rapidly, and the Low Altitude Economy is regarded as a national strategic emerging industry. Integrated Sensing and Communication (ISAC) for the Low Altitude Economy is expected to support complex tasks in demanding environments and provides a foundation for improved security, flexibility, and multi-application scenarios for drones. This paper presents an overview of ISAC for the Low Altitude Economy. The theoretical foundations of ISAC and the Low Altitude Economy are summarized, and the advantages of applying ISAC to the Low Altitude Economy are discussed. Potential applications of key 6G technologies, such as covert communication and Millimeter-Wave (mm-wave) systems in ISAC for the Low Altitude Economy, are examined. The key technical challenges of ISAC for the Low Altitude Economy in future development are also summarized.  Significance   The integration of UAVs with ISAC technology is expected to provide considerable advantages in future development. When ISAC is applied, the overall system payload can be reduced, which improves UAV maneuverability and operational freedom. This integration offers technical support for versatile UAV applications. With ISAC, low-altitude network systems can conduct complex tasks in challenging environments. UAV platforms equipped with a single function do not achieve the combined improvement in communication and sensing that ISAC enables. ISAC-equipped drones are therefore expected to be used more widely in aerial photography, agriculture, surveying, remote sensing, and telecommunications. This development will advance related theoretical and technical frameworks and broaden the application scope of ISAC.  Progress  ISAC networks for the low-altitude economy offer efficient and flexible solutions for military reconnaissance, emergency disaster relief, and smart city management. The open aerial environment and dynamic deployment requirements create several challenges. 
Limited stealth increases exposure to hostile interception, and complex terrains introduce signal obstruction. High bandwidth and low latency are also required. Academic and industrial communities have investigated technologies such as covert communication, intelligent reflecting surfaces, and mm-wave communication to enhance the reliability and intelligence of ISAC in low-altitude operational scenarios.  Conclusions  This paper presents an overview of current applications, critical technologies, and ongoing challenges associated with ISAC in low-altitude environments. It examines the integration of emerging 6G technologies, including covert communication, Reconfigurable Intelligent Surfaces (RIS), and mm-wave communication within ISAC frameworks. Given the dynamic and complex characteristics of low-altitude operations, recent advances in UAV swarm power control algorithms and covert trajectory optimization based on deep reinforcement learning are summarized. Key unresolved challenges are also identified, such as spatiotemporal synchronization, multi-UAV resource allocation, and privacy preservation, which provide reference directions for future research.  Prospects   ISAC technology provides precise and reliable support for drone logistics, urban air mobility, and large-scale environmental monitoring in the low-altitude economy. Large-scale deployment of ISAC systems in complex and dynamic low-altitude environments remains challenging. Major obstacles include limited coordination and resource allocation within UAV swarms, spatiotemporal synchronization across heterogeneous devices, competing requirements between sensing and communication functions, and rising concerns regarding privacy and security in open airspace. These issues restrict the high-quality development of the low-altitude economy.
Finite-time Adaptive Sliding Mode Control of Servo Motors Considering Frictional Nonlinearity and Unknown Loads
ZHANG Tianyu, GUO Qinxia, YANG Tingkai, GUO Xiangji, MING Ming
Available online  , doi: 10.11999/JEIT250521
Abstract:
  Objective  Ultra-fast laser processing with an infinite field of view requires servo motor systems with superior tracking accuracy and robustness. However, such systems are highly nonlinear and affected by coupled unknown load disturbances and complex friction, which constrain the performance of conventional controllers. Although Sliding Mode Control (SMC) exhibits inherent robustness, traditional SMC and observer designs cannot achieve accurate finite-time disturbance compensation under strong nonlinearities, thus limiting high-speed and high-precision trajectory tracking. To address this limitation, a novel finite-time adaptive SMC approach is proposed to ensure rapid and precise angular position tracking within a finite time, satisfying the stringent synchronization requirements of advanced laser processing systems.  Methods  A novel control strategy is developed by integrating an adaptive disturbance observer fused with a Radial Basis Function Neural Network (RBFNN) and finite-time Sliding Mode Control (SMC). First, the unknown load disturbance and complex frictional nonlinear dynamics are combined into a unified "lumped disturbance" term, improving model generality and the ability to represent real operating conditions. Second, a finite-time adaptive disturbance observer is constructed to estimate this lumped disturbance. The observer utilizes the universal approximation capability of the RBFNN to learn and approximate the dynamic characteristics of unknown disturbances online. Simultaneously, a finite-time adaptive law based on the error norm is introduced to update the neural network weights in real time, ensuring rapid and accurate finite-time estimation of the lumped disturbance while reducing dependence on precise model parameters. Based on this design, a finite-time SMC is developed. 
The controller uses the observer’s disturbance estimation as a feedforward compensation term, incorporates a carefully formulated finite-time sliding surface and equivalent control law, and introduces a saturation function to suppress control input chattering. A suitable Lyapunov function is then constructed, and the finite-time stability theory is rigorously applied to prove the practical finite-time convergence of both the adaptive observer and the closed-loop control system, guaranteeing that the system tracking error converges to a bounded neighborhood near the origin within finite time.  Results and Discussions  To verify the effectiveness and superiority of the proposed control strategy, a typical Permanent Magnet Synchronous Motor (PMSM) servo system model is constructed in the MATLAB environment, and a simulation scenario with desired trajectories of varying frequencies is established. The proposed method is comprehensively compared with the widely used Proportional–Integral (PI) control and the advanced method reported in reference [7]. Simulation results demonstrate the following: 1. Tracking performance: Under various reference trajectories, the proposed controller enables the system to accurately follow the target trajectory with a tracking error substantially smaller than that of the PI controller. Compared with the method in reference [7], it achieves smoother responses and smaller residual errors, effectively eliminating the chattering observed in some operating conditions of the latter. 2. Disturbance rejection and robustness: The adaptive disturbance observer based on the RBFNN rapidly and effectively learns and compensates for the lumped disturbance composed of unknown load variations and frictional nonlinearities. Even in the presence of these disturbances, the proposed controller maintains high-precision trajectory tracking, demonstrating strong disturbance rejection and robustness to system parameter variations. 3. 
Control input characteristics: Compared with the reference methods, the control signal of the proposed approach quickly stabilizes after the initial transient phase, effectively suppressing chattering caused by high-frequency switching. The amplitude range of the control input remains reasonable, facilitating practical actuator implementation. 4. Comprehensive evaluation: Based on multiple error performance indices, including Integral Squared Error (ISE), Integral Absolute Error (IAE), Time-weighted Integral Absolute Error (ITAE), and Time-weighted Integral Squared Error (ITSE), the proposed controller consistently outperforms both PI control and the method in reference [7]. It demonstrates comprehensive advantages in suppressing transient errors rapidly and reducing overall error accumulation. The method also improves steady-state accuracy and achieves a balanced response speed with effective noise attenuation. 5. Observer performance: The RBFNN weight norm estimation converges rapidly and stabilizes at a low level after initial adaptation, confirming the effectiveness of the proposed adaptive law and the learning efficiency of the observer.  Conclusions  A finite-time sliding mode control strategy with an adaptive disturbance observer is proposed for servo systems used in ultra-fast laser processing. The method models unknown load disturbances and frictional nonlinearities as a lumped disturbance term. An adaptive observer, integrating an RBF neural network with a finite-time mechanism, accurately estimates this disturbance for real-time compensation. Based on the observer, a finite-time SMC law is formulated, and the practical finite-time stability of the closed-loop system is theoretically proven. Simulations conducted on a permanent magnet synchronous motor platform confirm that the proposed approach achieves superior tracking accuracy, robustness, and control smoothness compared with conventional PI and existing advanced methods. 
This work offers an effective solution for achieving high-precision control in nonlinear systems subject to strong disturbances.
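The abstract describes the controller only at a high level; the sketch below is a minimal generic illustration of a boundary-layer sliding mode law with disturbance feedforward on a toy double-integrator plant, not the paper's PMSM design. The surface gain c, switching gain k, and boundary-layer width phi are illustrative values, and the observer is idealized here (d_hat is set to the true disturbance) in place of the paper's finite-time adaptive RBFNN observer.

```python
import numpy as np

def sat(s, phi):
    """Boundary-layer saturation used instead of sign() to suppress chattering."""
    return np.clip(s / phi, -1.0, 1.0)

def smc_control(e, de, d_hat, c=5.0, k=8.0, phi=0.05):
    """Sliding mode law on the surface s = de + c*e; the observer estimate
    d_hat is fed forward to cancel the lumped disturbance."""
    s = de + c * e
    return -c * de - k * sat(s, phi) - d_hat

# Toy closed loop: double integrator x'' = u + d, regulate x -> 0,
# assuming an ideal observer (d_hat equals the true disturbance).
dt, x, v = 1e-3, 1.0, 0.0
for i in range(5000):
    d = 0.5 * np.sin(2 * np.pi * i * dt)   # unknown lumped disturbance
    u = smc_control(x, v, d_hat=d)
    v += (u + d) * dt
    x += v * dt
```

With the disturbance canceled by the feedforward term, the error reaches the boundary layer of the sliding surface in finite time and then decays exponentially toward the origin, which mirrors the practical finite-time convergence argument in the abstract.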
3D Localization Method with Uniform Circular Array Driven by Complex Subspace Neural Network
JIANG Wei, ZHI Boxin, YANG Junjie, WANG Hui, DING Pengfei, ZHANG Zheng
Available online  , doi: 10.11999/JEIT250395
Abstract:
  Objective  High-precision indoor localization is increasingly required in intelligent service scenarios, yet existing techniques continue to face difficulties in complex environments where signal frequency offset, multipath propagation, and noise interfere with accuracy. To address these limitations, a 3D localization method using a Uniform Circular Array (UCA) driven by a Complex Subspace Neural Network (CSNN) is proposed to improve accuracy and robustness under challenging conditions.  Methods  The proposed method establishes a complete localization pipeline based on a hierarchical signal processing framework that includes frequency offset compensation, two-dimensional angle estimation, and spatial mapping (Fig. 2). A dual-estimation frequency compensation algorithm is first designed. The frequency offsets during the Constant Tone Extension (CTE) reference period and sample period are estimated separately, and the estimate obtained from the reference period is used to resolve ambiguity in the antenna sample period, which enables high-precision frequency compensation. The CSNN is then constructed to estimate the two-dimensional angle (Fig. 3). Within this framework, a Complex-Valued Convolutional Neural Network (CVCNN) (Fig. 4) is introduced to calibrate the covariance matrix of the received signals, which suppresses correlated noise and multipath interference. Based on the theory of mode-space transformation, the calibrated covariance matrix is projected onto a virtual Uniform Linear Array (ULA). The azimuth and elevation angles are jointly estimated by the ESPRIT algorithm. The estimated angles from three Access Points (APs) are subsequently fused to obtain the final position estimate.  Results and Discussions  Experiments are conducted to evaluate the performance of the proposed method. 
For frequency offset suppression, the dual-estimation frequency compensation algorithm markedly reduces the effect on angle estimation, improving estimation accuracy by 91.7% compared with uncorrected data and showing clear improvement over commonly used approaches (Fig. 6). For angle estimation, the CSNN achieves reductions of more than 40% in azimuth error and 25% in elevation error compared with the MUSIC algorithm under simulation conditions (Fig. 7), and verifies the capability of the CVCNN module to suppress various interferences. In practical experiments, the CSNN achieves an average azimuth error of 1.07° and an average elevation error of 1.28° in the training scenario (Table 1, Fig. 10). Generalization experiments conducted in three indoor environments (warehouse, corridor, and office) show that the average angular errors remain low at 2.78° for azimuth and 3.39° for elevation (Table 2, Fig. 11). The proposed method further maintains average positioning accuracies of 28.9 cm in 2D and 36.5 cm in 3D after cross-scene migration (Table 4, Fig. 13).  Conclusions  The proposed high-precision indoor localization method integrates dual-estimation frequency compensation, the CSNN angle estimation algorithm, and three-AP cooperative localization. It demonstrates strong performance in both simulation and real-environment experiments. The method also maintains stable cross-scene adaptability and accuracy that meet the requirements of high-precision indoor localization.
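As a hedged illustration of the final fusion step, the sketch below triangulates a 3D position from azimuth/elevation estimates at several APs by least-squares intersection of bearing rays. The function names and AP geometry are hypothetical; the paper's actual fusion rule is not specified in the abstract and may differ.

```python
import numpy as np

def bearing(az_deg, el_deg):
    """Unit direction vector from azimuth/elevation angles (degrees)."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])

def fuse_position(ap_positions, az_el_pairs):
    """Least-squares intersection of bearing rays from several APs.

    Each ray is p_i + t * d_i; the point x minimizing the summed squared
    orthogonal distance solves  sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) p_i.
    """
    A, b = np.zeros((3, 3)), np.zeros(3)
    for p, (az, el) in zip(ap_positions, az_el_pairs):
        d = bearing(az, el)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(p, float)
    return np.linalg.solve(A, b)
```

In the paper, the azimuth/elevation inputs would come from the CSNN + ESPRIT stage at each AP; here they can simply be synthesized from a known geometry to check the solver.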
Privacy-Preserving Federated Weakly-Supervised Learning for Cancer Subtyping on Histopathology Images
WANG Yumeng, LIU Zhenbing, LIU Zaiyi
Available online  , doi: 10.11999/JEIT250842
Abstract:
  Objective  Data-driven deep learning methods have demonstrated superior performance in histopathology image analysis. However, the development of robust and accurate models often relies on a large amount of training data with fine-grained annotations, which incurs high annotation costs for gigapixel whole slide images (WSI) in histopathology. Typically, healthcare data exists in “data silos”, and the complex data sharing process may raise privacy concerns. Federated Learning (FL) is a promising approach that enables training a global model from data spread across numerous medical centers without exchanging data. However, in traditional FL algorithms, the inherent data heterogeneity across medical centers significantly impacts the performance of the global model.  Methods  In response to these challenges, this work proposes a privacy-preserving FL method for gigapixel WSIs in computational pathology. The method integrates weakly supervised attention-based multiple instance learning (MIL) with differential privacy techniques. On each client, a multi-scale attention-based MIL method is employed for local training on histopathology WSIs, with only slide-level labels available. This effectively mitigates the high cost of pixel-level annotation for histopathology WSIs via a weakly supervised setting. In the federated model update phase, local differential privacy is used to further mitigate the risk of sensitive data leakage. Specifically, random noise that follows a Gaussian or Laplace distribution is added to the model parameters after local training on each client. Furthermore, a novel federated adaptive reweighting strategy is adopted to overcome challenges posed by the heterogeneity of pathological images across clients. This strategy dynamically balances the contributions of local data quantity and quality to each client's weight.  
Results and Discussions  The proposed FL framework is evaluated on two clinical diagnostic tasks: Non-small Cell Lung Cancer (NSCLC) histologic subtyping and Breast Invasive Carcinoma (BRCA) histologic subtyping. As shown in Table 1, Table 2, and Fig. 4, the proposed FL method (Ours with DP and Ours w/o DP) exhibits superior accuracy and generalization when compared with both localized models and other FL methods. Notably, even when compared to the centralized model, its classification performance remains competitive (Fig. 3). These results demonstrate that privacy-preserving FL not only serves as a feasible and effective method for multicenter histopathology images, but may also mitigate the performance degradation typically caused by data heterogeneity across centers. By controlling the intensity of added noise within a limited range, the model can also achieve stable classification (Table 3). The two key components (i.e., the multi-scale representation attention network and the federated adaptive reweighting strategy) are shown to provide consistent performance improvement (Table 4). In addition, the proposed FL method maintains stable classification performance across different hyperparameter settings (Table 5, Table 6). These results further demonstrate that the proposed FL method is robust.  Conclusions  In conclusion, the proposed FL method tackles two critical issues in multicenter computational pathology: data silos and privacy concerns. Moreover, it effectively alleviates the performance degradation induced by inter-center data heterogeneity. Given the challenges in balancing model accuracy and privacy protection, future work will explore new methods that preserve privacy while maintaining model performance.
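The local-differential-privacy update and the adaptive reweighting idea described in the Methods can be sketched as follows. This is a minimal illustration, not the paper's formulation: the noise scale sigma, the blending factor alpha, and the use of validation scores as the "quality" signal are all assumptions introduced here.

```python
import numpy as np

def add_ldp_noise(params, sigma=0.01, rng=None):
    """Local differential privacy: add Gaussian noise to parameters before upload."""
    if rng is None:
        rng = np.random.default_rng(0)
    return {k: v + rng.normal(0.0, sigma, v.shape) for k, v in params.items()}

def adaptive_weights(n_samples, val_scores, alpha=0.5):
    """Blend data-quantity and data-quality signals into per-client weights."""
    qn = np.asarray(n_samples, float); qn /= qn.sum()
    qv = np.asarray(val_scores, float); qv /= qv.sum()
    w = alpha * qn + (1 - alpha) * qv
    return w / w.sum()

def aggregate(client_params, weights):
    """Weighted parameter averaging on the server (FedAvg-style)."""
    keys = client_params[0].keys()
    return {k: sum(w * p[k] for w, p in zip(weights, client_params)) for k in keys}
```

In each round, clients would call add_ldp_noise on their locally trained parameters, and the server would combine the uploads with aggregate using adaptive_weights in place of the usual sample-count-only weighting.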
A Survey of Lightweight Techniques for Segment Anything Model
LUO Yichang, QI Xiyu, ZHANG Borui, SHI Hanru, ZHAO Yan, WANG Lei, LIU Shixiong
Available online  , doi: 10.11999/JEIT250894
Abstract:
  Objective  The Segment Anything Model (SAM) demonstrates strong zero-shot generalization in image segmentation and sets a new direction for visual foundation models. The original SAM, especially the ViT-Huge version with about 637 million parameters, requires high computational resources and substantial memory. This restricts deployment in resource-limited settings such as mobile devices, embedded systems, and real-time tasks. Growing demand for efficient and deployable vision models has encouraged research on lightweight variants of SAM. Existing reviews describe applications of SAM, yet a structured summary of lightweight strategies across model compression, architectural redesign, and knowledge distillation is still absent. This review addresses this need by providing a systematic analysis of current SAM lightweight research, classifying major techniques, assessing performance, and identifying challenges and future research directions for efficient visual foundation models.  Methods  This review examines recent studies on SAM lightweight methods published in leading conferences and journals. The techniques are grouped into three categories based on their technical focus. The first category, Model Compression and Acceleration, covers knowledge distillation, network pruning, and quantization. The second category, Efficient Architecture Design, replaces the ViT backbone with lightweight structures or adjusts attention mechanisms. The third category, Efficient Feature Extraction and Fusion, refines the interaction between the image encoder and prompt encoder. A comparative assessment is conducted for representative studies, considering model size, computational cost, inference speed, and segmentation accuracy on standard benchmarks (Table 3).  Results and Discussions  The reviewed models achieve clear gains in inference speed and parameter efficiency. 
MobileSAM reduces the model to 9.6 M parameters, and Lite-SAM reaches up to 16× acceleration while maintaining competitive segmentation accuracy. Approaches based on knowledge distillation and hybrid design support generalization across domains such as medical imaging, video segmentation, and embedded tasks. Accuracy and speed remain in tension, however, so the choice of lightweight strategy depends on the intended application. Challenges remain in prompt design, multi-scale feature fusion, and deployment on low-power hardware platforms.  Conclusions  This review provides an overview of the rapidly developing field of SAM lightweight research. The development of efficient SAM models is a multifaceted challenge that requires a combination of compression, architectural innovation, and optimization strategies. Current studies show that real-time performance on edge devices can be achieved with a small reduction in accuracy. Although progress is evident, challenges remain in handling complex scenarios, reducing the cost of distillation data, and establishing unified evaluation benchmarks. Future research is expected to emphasize more generalizable lightweight architectures, explore data-free or few-shot distillation approaches, and develop standardized evaluation protocols that consider both accuracy and efficiency.
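As a toy illustration of the encoder-only (decoupled) distillation idea behind MobileSAM-style methods, the sketch below trains a linear "student" to match a frozen linear "teacher" feature map by minimizing a feature-level MSE; in real systems the linear maps stand in for a ViT-H teacher encoder and a lightweight student encoder, and the original mask decoder is reused unchanged.

```python
import numpy as np

def feature_distill_loss(student_feat, teacher_feat):
    """Encoder-only distillation loss: student mimics the frozen teacher's
    image embeddings so the downstream mask decoder can be reused as-is."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

rng = np.random.default_rng(1)
T = rng.normal(size=(8, 8))   # stand-in for the frozen teacher encoder (linear here)
S = np.zeros((8, 8))          # lightweight student, trained to match the teacher
lr = 0.05
for _ in range(200):
    x = rng.normal(size=(32, 8))        # toy batch of inputs
    err = x @ S.T - x @ T.T             # student features minus teacher features
    S -= lr * (err.T @ x) / len(x)      # gradient step on the MSE objective
```

After training, the student's features closely track the teacher's on unseen inputs, which is the property the distillation-based lightweight SAM variants rely on.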
Key Technologies for Low-Altitude Internet Networks: Architecture, Security, and Optimization
WANG Yuntao, SU Zhou, GAO Yuan, BA Jianle
Available online  , doi: 10.11999/JEIT250947
Abstract:
Low-Altitude Intelligent Networks (LAINs) function as a core infrastructure for the emerging low-altitude digital economy by connecting humans, machines, and physical objects through the integration of manned and unmanned aircraft with ground networks and facilities. This paper provides a comprehensive review of recent research on LAINs from four perspectives: network architecture, resource optimization, security threats and protection, and large model-enabled applications. First, existing standards, general architecture, key characteristics, and networking modes of LAINs are investigated. Second, critical issues related to airspace resource management, spectrum allocation, computing resource scheduling, and energy optimization are discussed. Third, existing/emerging security threats across sensing, network, application, and system layers are assessed, and multi-layer defense strategies in LAINs are reviewed. Furthermore, the integration of large model technologies with LAINs is also analyzed, highlighting their potential in task optimization and security enhancement. Future research directions are discussed to provide theoretical foundations and technical guidance for the development of efficient, secure, and intelligent LAINs.  Significance   LAINs support the low-altitude economy by enabling the integration of manned and unmanned aircraft with ground communication, computing, and control networks. By providing real-time connectivity and collaborative intelligence across heterogeneous platforms, LAINs support applications such as precision agriculture, public safety, low-altitude logistics, and emergency response. However, LAINs continue to face challenges created by dynamic airspace conditions, heterogeneous platforms, and strict real-time operational requirements. 
The development of large models also presents opportunities for intelligent resource coordination, proactive defense, and adaptive network management, which signals a shift in the design and operation of low-altitude networks.  Progress  Recent studies on LAINs have reported progress in network architecture, resource optimization, security protection, and large model integration. Architecturally, hierarchical and modular designs are proposed to integrate sensing, communication, and computing resources across air, ground, and satellite networks, which enables scalable and interoperable operations. In system optimization research, attention is given to airspace resource management, spectrum allocation, computing offloading, and energy-efficient scheduling through distributed optimization and AI-driven orchestration methods. In security research, multi-layer defense frameworks are developed to address sensing-layer spoofing, network-layer intrusions, and application-layer attacks through cross-layer threat intelligence and proactive defense mechanisms. Large Language Models (LLMs), Vision-Language Models (VLMs), and Multimodal LLMs (MLLMs) also support intelligent task planning, anomaly detection, and autonomous decision-making in complex low-altitude environments, which enhances the resilience and operational efficiency of LAINs.  Conclusions  This survey provides a comprehensive review of the architecture, security mechanisms, optimization techniques, and large model applications in LAINs. The challenges in multi-dimensional resource coordination, cross-layer security protection, and real-time system adaptation are identified, and existing or potential approaches to address these challenges are analyzed. By synthesizing recent research on architectural design, system optimization, and security defense, this work offers a unified perspective for researchers and practitioners aiming to build secure, efficient, and scalable LAIN systems. 
The findings emphasize the need for integrated solutions that combine algorithmic intelligence, system engineering, and architectural innovation to meet future low-altitude network demands.  Prospects  Future research on LAINs is expected to advance the integration of architecture design, intelligent optimization, security defense, and privacy preservation technologies to meet the demands of rapidly evolving low-altitude ecosystems. Key directions include developing knowledge-driven architectures for cross-domain semantic fusion, service-oriented network slicing, and distributed autonomous decision-making. Furthermore, research should also focus on proactive cross-layer security mechanisms supported by large models and intelligent agents, efficient model deployment through AI-hardware co-design and hierarchical computing architectures, and improved multimodal perception and adaptive decision-making to strengthen system resilience and scalability. In addition, establishing standardized benchmarks, open-source frameworks, and realistic testbeds is essential to accelerate innovation and ensure secure, reliable, and intelligent deployment of LAIN systems in real-world environments.
A Learning-Based Security Control Method for Cyber-Physical Systems Based on False Data Detection
MIAO Jinzhao, LIU Jinliang, SUN Le, ZHA Lijuan, TIAN Engang
Available online  , doi: 10.11999/JEIT250537
Abstract:
  Objective  Cyber-Physical Systems (CPS) constitute the backbone of critical infrastructures and industrial applications, but the tight coupling of cyber and physical components renders them highly susceptible to cyberattacks. False data injection attacks are particularly dangerous because they compromise sensor integrity, mislead controllers, and can trigger severe system failures. Existing control strategies often assume reliable sensor data and lack resilience under adversarial conditions. Furthermore, most conventional approaches decouple attack detection from control adaptation, leading to delayed or ineffective responses to dynamic threats. To overcome these limitations, this study develops a unified secure learning control framework that integrates real-time attack detection with adaptive control policy learning. By enabling the dynamic identification and mitigation of false data injection attacks, the proposed method enhances both stability and performance of CPS under uncertain and adversarial environments.  Methods  To address false data injection attacks in CPS, this study proposes an integrated secure control framework that combines attack detection, state estimation, and adaptive control strategy learning. A sensor grouping-based security assessment index is first developed to detect anomalous sensor data in real time without requiring prior knowledge of attacks. Next, a multi-source sensor fusion estimation method is introduced to reconstruct the system’s true state, thereby improving accuracy and robustness under adversarial disturbances. Finally, an adaptive learning control algorithm is designed, in which dynamic weight updating via gradient descent approximates the optimal control policy online. This unified framework enhances both steady-state performance and resilience of CPS against sophisticated attack scenarios. 
Its effectiveness and security performance are validated through simulation studies under diverse false data injection attack settings.  Results and Discussions  Simulation results confirm the effectiveness of the proposed secure adaptive learning control framework under multiple false data injection attacks in CPS. As shown in Fig. 1, system states rapidly converge to steady values and maintain stability despite sensor attacks. Fig. 2 demonstrates that the fused state estimator tracks the true system state with greater accuracy than individual local estimators. In Fig. 3, the compensated observation outputs align closely with the original, uncorrupted measurements, indicating precise attack estimation. Fig. 4 shows that detection indicators for sensor groups 2–5 increase sharply during attack intervals, while unaffected sensors remain near zero, verifying timely and accurate detection. Fig. 5 further confirms that the estimated attack signals closely match the true injected values. Finally, Fig. 6 compares different control strategies, showing that the proposed method achieves faster stabilization and smaller state deviations. Together, these results demonstrate robust control, accurate state estimation, and real-time detection under unknown attack conditions.  Conclusions  This study addresses secure perception and control in CPS under false data injection attacks by developing an integrated adaptive learning control framework that unifies detection, estimation, and control. A sensor-level anomaly detection mechanism is introduced to identify and localize malicious data, substantially enhancing attack detection capability. The fusion-based state estimation method further improves reconstruction accuracy of true system states, even when observations are compromised. At the control level, an adaptive learning controller with online weight adjustment enables real-time approximation of the optimal control policy without requiring prior knowledge of the attack model. 
Future research will extend the proposed framework to broader application scenarios and evaluate its resilience under diverse attack environments.
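A minimal sketch of the detect-then-fuse idea follows, assuming a simple windowed residual energy as the per-group security index and a flag-and-exclude fusion rule; the paper's exact index, thresholding, and fusion estimator are not given in the abstract, so these are illustrative stand-ins.

```python
import numpy as np

def group_security_index(y_groups, y_pred_groups, window):
    """Per-group anomaly index: windowed mean squared measurement residual."""
    idx = {}
    for g in y_groups:
        r = np.asarray(y_groups[g]) - np.asarray(y_pred_groups[g])
        idx[g] = float(np.mean(r[-window:] ** 2))
    return idx

def flag_attacked(indices, threshold):
    """Mark a sensor group as attacked when its index exceeds the threshold."""
    return {g: v > threshold for g, v in indices.items()}

def fuse_state(estimates, flags):
    """Fuse only the local state estimates from groups not flagged as attacked."""
    clean = [np.asarray(estimates[g]) for g in estimates if not flags[g]]
    return np.mean(clean, axis=0)
```

A controller built on top of this would then use the fused state (rather than raw, possibly falsified measurements) when updating its control policy.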
Inverse Design of a Silicon-Based Compact Polarization Splitter-Rotator
HUI Zhanqiang, ZHANG Xinglong, HAN Dongdong, LI Tiantian, GONG Jiamin
Available online  , doi: 10.11999/JEIT250858
Abstract:
  Objective  The integrated polarization splitter-rotator (PSR), as one of the key photonic devices for manipulating the polarization state of light waves, has been widely used in various photonic integrated circuits (PICs). For PICs, device size becomes a major bottleneck limiting integration density. Compared to traditional design methods, which suffer from being time-consuming and producing larger device sizes, inverse design optimizes the best structural parameters of integrated photonic devices according to target performance parameters by employing specific optimization algorithms. This approach can significantly reduce device size while ensuring performance and is currently used to design various integrated photonic devices, such as wavelength/mode division multiplexers, all-optical logic gates, power splitters, etc. In this paper, the Momentum Optimization algorithm and the Adjoint Method are combined to inverse design a compact PSR. This can not only significantly improve the integration level of PICs but also offers a design approach for the miniaturization of other photonic devices.  Methods  First, based on a silicon-on-insulator (SOI) wafer with a thickness of 220 nm, the design region was discretized into 25×50 cylindrical elemental structures. Each structure has a radius of 50 nm and a height of 150 nm and is filled with an intermediate material possessing a relative permittivity of 6.55. Next, the adjoint method was employed for simulation to obtain gradient information over the design region. This gradient information was processed using the Momentum Optimization algorithm. Based on the processed gradient, the relative permittivity of each elemental structure was modified. During the optimization process, the momentum factor in the Momentum Optimization algorithm was dynamically adjusted according to the iteration number to accelerate the optimization. 
Meanwhile, a linear bias was introduced to artificially control the optimization direction of the relative permittivity. This bias gradually steered the permittivity values towards those of silicon and air as the iterations progressed. Upon completion of the optimization, the elemental structures were binarized based on their final relative permittivity values: structures with permittivity less than 6.55 were filled with air, while those greater than 6.55 were filled with silicon. At this stage, the design region consisted of multiple irregularly distributed air holes. To compensate for the performance loss incurred during binarization, the etching depth of air holes (whose pre-binarization permittivity was between 3 and 6.55) was optimized. Furthermore, adjacent air holes were merged to reduce manufacturing errors. This resulted in a final device structure composed of air holes with five distinct radii. Among these, three types of larger-radius air holes were selected. Their etching radii and depths were further optimized to compensate for the remaining performance loss. Finally, the device performance was evaluated through numerical analysis. Key parameters calculated include insertion loss (IL), crosstalk (CT), polarization extinction ratio (PER), and bandwidth. Additionally, tolerance analysis was performed to assess the robustness of the performance.  Results and Discussions   This paper presents the design of a compact PSR based on a 220-nm-thick SOI wafer, with dimensions of 5 µm in length and 2.5 µm in width. During the design process, the momentum factor within the Momentum Optimization algorithm was dynamically adjusted: a large momentum factor was selected in the initial optimization stages to leverage high momentum for accelerating escape from local maxima or plateau regions, while a smaller momentum factor was used in later stages to increase the weight of the current gradient. 
Compared to other optimization methods, the algorithm employed in this work required only 20%-33% of the iteration counts needed by other algorithms to achieve a Figure of Merit (FOM) value of 1.7, significantly enhancing optimization efficiency. Numerical analysis results demonstrate that this device achieves the following performance across the 1520-1575 nm wavelength band: low IL (TM0 < 1 dB, TE0 < 0.68 dB), low CT (TM0 < -23 dB, TE0 < -25.2 dB), and high PER (TM0 > 17 dB, TE0 > 28.5 dB). Process tolerance analysis indicates that the device exhibits robust fabrication tolerance: within the 1520-1540 nm bandwidth, performance shows no significant degradation under etching depth offsets of ±9 nm and etching radius offsets of ±5 nm. This demonstrates excellent manufacturability robustness.  Conclusions   Through numerical analysis and comparison with devices designed in other literature, this work clearly demonstrates the feasibility of combining the adjoint method with the Momentum Optimization algorithm for designing the integrated PSR. Its design principle involves manipulating light propagation to achieve the polarization splitting and rotation effect by adjusting the relative permittivity to control the positions of the air holes. Compared to traditional design methods, inverse design enables the efficient utilization of the design region, thereby achieving a more compact structure. The PSR proposed in this work is not only significantly smaller in size but also exhibits larger fabrication tolerance. It holds significant potential for application in future large-scale PIC chips, while also offering valuable design insights for the miniaturization of other photonic devices.
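The momentum update with an iteration-dependent momentum factor and the final binarization step can be sketched as follows. This is a schematic illustration only: the learning rate, the schedule endpoints, and the sign-gradient used below in place of a real adjoint-method gradient are assumptions, and the permittivity bounds simply mirror air, the intermediate material (6.55), and silicon.

```python
import numpy as np

def momentum_schedule(it, n_iter, beta_hi=0.9, beta_lo=0.5):
    """Large momentum early (to escape plateaus), small momentum late
    (to weight the current gradient more heavily)."""
    return beta_hi + (beta_lo - beta_hi) * it / max(n_iter - 1, 1)

def momentum_ascent(grad_fn, eps0, n_iter=100, lr=0.1,
                    eps_air=1.0, eps_si=12.11, eps_mid=6.55):
    """Momentum-based update of per-pixel permittivity, then hard binarization."""
    eps, v = eps0.astype(float), np.zeros_like(eps0, float)
    for it in range(n_iter):
        beta = momentum_schedule(it, n_iter)
        v = beta * v + lr * grad_fn(eps)          # accumulate the (adjoint) gradient
        eps = np.clip(eps + v, eps_air, eps_si)   # ascend the figure of merit
    return np.where(eps < eps_mid, eps_air, eps_si)  # binarize: air vs. silicon
```

In the actual design flow, grad_fn would return the adjoint-method gradient of the FOM over the 25×50 element grid, and the binarized result would then undergo the etching-depth and hole-merging compensation steps described above.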
A Two-Stage Framework for CAN Bus Attack Detection by Fusing Temporal and Deep Features
TAN Mingming, ZHANG Heng, WANG Xin, LI Ming, ZHANG Jian, YANG Ming
Available online  , doi: 10.11999/JEIT250651
Abstract:
  Objective  The Controller Area Network (CAN), the de facto standard for in-vehicle communication, is inherently vulnerable to cyberattacks. Existing Intrusion Detection Systems (IDSs) face a fundamental trade-off: achieving fine-grained classification of diverse attack types often requires computationally intensive models that exceed the resource limitations of on-board Electronic Control Units (ECUs). To address this problem, this study proposes a two-stage attack detection framework for the CAN bus that fuses temporal and deep features. The framework is designed to achieve both high classification accuracy and computational efficiency, thereby reconciling the tension between detection performance and practical deployability.  Methods  The proposed framework adopts a “detect-then-classify” strategy and incorporates two key innovations. (1) Stage 1: Temporal Feature-Aware Anomaly Detection. Two custom features are designed to quantify anomalies: Payload Data Entropy (PDE), which measures content randomness, and ID Frequency Mean Deviation (IFMD), which captures behavioral deviations. These features are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network that exploits contextual temporal information to achieve high-recall anomaly detection. (2) Stage 2: Deep Feature-Based Fine-Grained Classification. Triggered only for samples flagged as anomalous, this stage employs a lightweight one-dimensional ParC1D-Net. The core ParC1D Block (Fig. 4) integrates depthwise separable one-dimensional convolution, Squeeze-and-Excitation (SE) attention, and a Feed-Forward Network (FFN), enabling efficient feature extraction with minimal parameters. Stage 1 is optimized using BCEWithLogitsLoss, whereas Stage 2 is trained with Cross-Entropy Loss.  Results and Discussions  The efficacy of the proposed framework is evaluated on public datasets. (1) State-of-the-art performance. 
On the Car-Hacking dataset (Table 5), an accuracy and F1-score of 99.99% are achieved, exceeding advanced baselines. On the more challenging Challenge dataset (Table 6), superior accuracy (99.90%) and a competitive F1-score (99.70%) are also obtained. (2) Feature contribution analysis. Ablation studies (Tables 7 and 8) confirm the critical role of the proposed features. Removal of the IFMD feature results in the largest performance reduction, highlighting the importance of behavioral modeling. A synergistic effect is observed when PDE and IFMD are applied together. (3) Spatiotemporal efficiency. The complete model remains lightweight at only 0.39 MB. Latency tests (Table 9) demonstrate real-time capability, with average detection times of 0.62 ms on a GPU and 0.93 ms on a simulated CPU (batch size = 1). A system-level analysis (Section 3.5.4) further shows that the two-stage framework is approximately 1.65 times more efficient than a single-stage model in a realistic sparse-attack scenario.  Conclusions  This study establishes the two-stage framework as an effective and practical solution for CAN bus intrusion detection. By decoupling detection from classification, the framework resolves the trade-off between accuracy and on-board deployability. Its strong performance, combined with a minimal computational footprint, indicates its potential for securing real-world vehicular systems. Future research could extend the framework and explore hardware-specific optimizations.
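The abstract names the two custom features but does not give their formulas, so the definitions below are one plausible realization, not the authors' exact ones: PDE as the Shannon entropy of the byte distribution within a CAN payload, and IFMD as the mean absolute deviation of per-ID message frequencies from a learned baseline.

```python
import math
from collections import Counter

def payload_data_entropy(payload: bytes) -> float:
    """PDE (assumed form): Shannon entropy, in bits, of the byte distribution
    within one CAN frame payload. High entropy suggests random/fuzzed content."""
    if not payload:
        return 0.0
    counts = Counter(payload)
    n = len(payload)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def id_frequency_mean_deviation(id_counts, baseline_means):
    """IFMD (assumed form): mean absolute deviation of observed per-ID
    frequencies in a window from their baseline means. Captures behavioral
    anomalies such as flooding or suppression of specific CAN IDs."""
    ids = set(id_counts) | set(baseline_means)
    return sum(abs(id_counts.get(i, 0) - baseline_means.get(i, 0.0)) for i in ids) / len(ids)
```

In the framework these two scalars, computed per time window, would be fed alongside the raw frame fields into the BiLSTM detector of Stage 1.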
A One-Dimensional 5G Millimeter-Wave Wide-Angle Scanning Array Antenna Using AMC Structure
MA Zhangang, ZHANG Qing, FENG Sirun, ZHAO Luyu
Available online  , doi: 10.11999/JEIT250719
Abstract:
  Objective  With the rapid advancement of 5G millimeter-wave technology, antennas are required to achieve high gain, wide beam coverage, and compact size, particularly in environments characterized by strong propagation loss and blockage. Conventional millimeter-wave arrays often face difficulties in reconciling wide-angle scanning with high gain and broadband operation due to element coupling and narrow beamwidths. To overcome these challenges, this study proposes a one-dimensional linear array antenna incorporating an Artificial Magnetic Conductor (AMC) structure. The AMC’s in-phase reflection is exploited to improve bandwidth and gain while enabling wide-angle scanning of ±80° at 26 GHz. By adopting a 0.4-wavelength element spacing and stacked topology, the design provides an effective solution for 5G millimeter-wave terminals where spatial constraints and performance trade-offs are critical. The findings highlight the potential of AMC-based arrays to advance antenna technology for future high-speed, low-latency 5G applications by combining broadband operation, high directivity, and broad coverage within compact form factors.  Methods  This study develops a high-performance single-polarized one-dimensional linear millimeter-wave array antenna through a multi-layered structural design integrated with AMC technology. The design process begins with theoretical analysis of the pattern multiplication principle and array factor characteristics, which identify 0.4-wavelength element spacing as an optimal balance between wide-angle scanning and directivity. A stacked three-layer antenna unit is then constructed, consisting of square patch radiators on the top layer, a cross-shaped coupling feed structure in the middle layer, and an AMC-loaded substrate at the bottom. The AMC provides in-phase reflection in the 21–30 GHz band, enhancing bandwidth and suppressing surface wave coupling. 
Full-wave simulations (HFSS) are performed to optimize AMC dimensions, feed networks, and array layout, confirming a bandwidth of 23.7–28 GHz, a peak gain of 13.9 dBi, and scanning capability of ±80°. A prototype is fabricated using printed circuit board technology and evaluated with a vector network analyzer and anechoic chamber measurements. Experimental results agree closely with simulations, demonstrating an operational bandwidth of 23.3–27.7 GHz, isolation better than −15 dB, and scanning coverage up to ±80°. These results indicate that the synergistic interaction between AMC-modulated radiation fields and the array coupling mechanism enables a favorable balance among wide bandwidth, high gain, and wide-angle scanning.  Results and Discussions  The influence of the array factor on directional performance is analyzed, and the maximum array factor is observed when the element spacing is between 0.4λ and 0.46λ (Fig. 2). The in-phase reflection of the AMC structure in the 21–30 GHz range significantly enhances antenna characteristics, broadening the bandwidth by 50% compared with designs without AMC and increasing the gain at 26 GHz by 1.5 dBi (Fig. 10, Fig. 13). The operational bandwidth of 23.3–27.7 GHz is confirmed by measurements (Fig. 17a). When the element spacing is optimized to 4.6 mm (0.4λ) and the coupling radiation mechanisms are adjusted, the H-plane half-power beamwidth (HPBW) of the array elements is extended to 180° (Fig. 8, Fig. 9), with a further gain improvement of 0.6 dBi at the scanning edges (Fig. 11b). The three-layer stacked structure—comprising the radiation, isolation, and AMC layers—achieves isolation better than −15 dB (Fig. 17a). Experimental validation demonstrates wide-angle scanning capability up to ±80°, showing close agreement between simulated and measured results (Fig. 11, Fig. 17b). 
The proposed antenna is therefore established as a compact, high-performance solution for 5G millimeter-wave terminals, offering wide bandwidth, high gain, and broad scanning coverage.  Conclusions  A one-dimensional linear wide-angle scanning array antenna based on an AMC structure is presented for 5G millimeter-wave applications. Through theoretical analysis, simulation optimization, and experimental validation, balanced improvement in broadband operation, high gain, and wide-angle scanning is achieved. Pattern multiplication theory and array factor analysis are applied to determine 0.4-wavelength element spacing as the optimal compromise between scanning angle and directivity. A stacked three-layer configuration is adopted, and the AMC’s in-phase reflection extends the bandwidth to 23.7–28.5 GHz, representing a 50% increase. Simulation and measurement confirm ±80° scanning at 26 GHz with a peak gain of 13.8 dBi, which is 1.3 dBi higher than that of non-AMC designs. The close consistency between experimental and simulated results verifies the feasibility of the design, providing a compact and high-performance solution for millimeter-wave antennas in mobile communication and vehicular systems. Future research is expected to explore dual-polarization integration and adaptation to complex environments.
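The pattern-multiplication reasoning above can be sketched numerically: the total pattern is the element pattern times the array factor, and at 0.4-wavelength spacing the main lobe can be steered to the ±80° scan edge without a grating lobe entering visible space. The sketch below is an illustration under assumed values (8 elements, a 0.25° angle grid), not the authors' design code.

```python
import numpy as np

def array_factor(n_elems, spacing_wl, scan_deg, theta_deg):
    """Normalized array factor of a uniform linear array.

    spacing_wl is the element spacing in wavelengths; scan_deg is the
    steering angle set by the progressive feed phase (pattern
    multiplication: total pattern = element pattern x this factor).
    """
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    psi = 2 * np.pi * spacing_wl * (np.sin(theta) - np.sin(np.radians(scan_deg)))
    n = np.arange(n_elems)
    return np.abs(np.exp(1j * np.outer(psi, n)).sum(axis=1)) / n_elems

theta = np.linspace(-90, 90, 721)
af = array_factor(8, 0.4, 80, theta)          # steer to the 80-degree scan edge
peak_angle = theta[int(np.argmax(af))]
# With 0.4-wavelength spacing the main lobe follows the 80-degree steering
# command and no grating lobe enters visible space.
```

At 0.6-wavelength spacing and the same 80° steering, the grating-lobe condition ψ = −2π is satisfied near −43° inside visible space, which is the lobe the 0.4λ choice avoids.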
Integrating Representation Learning and Knowledge Graph Reasoning for Diabetes and Complications Prediction
WANG Yuao, HUANG Yeqi, LI Qingyuan, LIU Yun, JING Shenqi, SHAN Tao, GUO Yongan
Available online  , doi: 10.11999/JEIT250798
Abstract:
  Objective  Diabetes mellitus and its complications are recognized as major global health challenges, causing severe morbidity, high healthcare costs, and reduced quality of life. Accurate joint prediction of these conditions is essential for early intervention but is hindered by data heterogeneity, sparsity, and complex inter-entity relationships. To address these challenges, a Representation Learning Enhanced Knowledge Graph-based Multi-Disease Prediction (REKG-MDP) model is proposed. Electronic Health Records (EHRs) are integrated with supplementary medical knowledge to construct a comprehensive Medical Knowledge Graph (MKG), and higher-order semantic reasoning combined with relation-aware representation learning is applied to capture complex dependencies and improve predictive accuracy across multiple diabetes-related conditions.  Methods  The REKG-MDP framework consists of three modules. First, an MKG is constructed by integrating structured EHR data from the MIMIC-IV dataset with external disease knowledge. Patient-side features include demographics, laboratory indices, and medical history, whereas disease-side attributes cover comorbidities, susceptible populations, etiological factors, and diagnostic criteria. This integration mitigates data sparsity and enriches semantic representation. Second, a relation-aware embedding module captures four relational patterns: symmetric, antisymmetric, inverse, and compositional. These patterns are used to optimize entity and relation embeddings for semantic reasoning. Third, a Hierarchical Attention-based Graph Convolutional Network (HA-GCN) aggregates multi-hop neighborhood information. Dynamic attention weights capture both local and global dependencies, and a bidirectional mechanism enhances the modeling of patient–disease interactions.  
Results and Discussions  Experiments demonstrate that REKG-MDP consistently outperforms four baselines: two machine learning models (DCKD-RF and bSES-AC-RUN-FKNN) and two graph-based models (KGRec and PyRec). Compared with the strongest baseline, REKG-MDP achieves average improvements in P, F1, and NDCG of 19.39%, 19.67%, and 19.39% for single-disease prediction (\begin{document}$ n=1 $\end{document}); 16.71%, 21.83%, and 23.53% for \begin{document}$ n=3 $\end{document}; and 22.01%, 20.34%, and 20.88% for \begin{document}$ n=5 $\end{document} (Table 4). Ablation studies confirm the contribution of each module. Removing relation-pattern modeling reduces performance metrics by approximately 12%, removing hierarchical attention decreases them by 5–6%, and excluding disease-side knowledge produces the largest decline of up to 20% (Fig. 5). Sensitivity analysis indicates that increasing the embedding dimension from 32 to 128 enhances performance by more than 11%, whereas excessive dimensionality (256) leads to over-smoothing (Fig. 6). Adjusting the \begin{document}$ \beta $\end{document} parameter strengthens sample discrimination, improving P, F1, and NDCG by 9.28%, 27.9%, and 8.08%, respectively (Fig. 7).  Conclusions  REKG-MDP integrates representation learning with knowledge graph reasoning to enable multi-disease prediction. The main contributions are as follows: (1) integrating heterogeneous EHR data with disease knowledge mitigates data sparsity and enhances semantic representation; (2) modeling diverse relational patterns and applying hierarchical attention improves the capture of higher-order dependencies; and (3) extensive experiments confirm the model’s superiority over state-of-the-art baselines, with ablation and sensitivity analyses validating the contribution of each module. Remaining challenges include managing extremely sparse data and ensuring generalization across broader populations. 
Future research will extend REKG-MDP to model temporal disease progression and additional chronic conditions.
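The four relational patterns listed in the Methods (symmetric, antisymmetric, inverse, and compositional) are precisely the patterns expressible by rotation-style knowledge-graph embeddings such as RotatE. The abstract does not specify the paper's embedding parameterization, so the sketch below is an assumed RotatE-style illustration; the relation names, the 8-dimensional complex embeddings, and the distance score are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

def apply_rel(h, phases):
    """Relation as a per-dimension rotation of a complex entity embedding."""
    return h * np.exp(1j * phases)

def score(h, phases, t):
    """Plausibility of triple (h, r, t): 0 is a perfect match."""
    return -np.linalg.norm(apply_rel(h, phases) - t)

h = np.exp(1j * rng.uniform(0, 2 * np.pi, dim))   # unit-modulus entity

# Symmetric relation (e.g. a hypothetical "comorbid_with"): phase pi, r(r(h)) = h.
r_sym = np.full(dim, np.pi)
sym_ok = np.allclose(apply_rel(apply_rel(h, r_sym), r_sym), h)

# Inverse pair (e.g. "has_complication" / "complication_of"): negated phases.
r = rng.uniform(0, 2 * np.pi, dim)
inv_ok = np.allclose(apply_rel(apply_rel(h, r), -r), h)

# Composition: applying r1 then r2 equals one relation with summed phases.
r1, r2 = rng.uniform(0, 2 * np.pi, dim), rng.uniform(0, 2 * np.pi, dim)
comp_ok = np.allclose(apply_rel(apply_rel(h, r1), r2), apply_rel(h, r1 + r2))

# Antisymmetry: any phase vector away from 0 and pi gives r(r(h)) != h.
anti_ok = not np.allclose(apply_rel(apply_rel(h, r), r), h)
```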
Wave-MambaCT: Low-dose CT Artifact Suppression Method Based on Wavelet Mamba
CUI Xueying, WANG Yuhang, LIU Bin, SHANGGUAN Hong, ZHANG Xiong
Available online  , doi: 10.11999/JEIT250489
Abstract:
  Objective  Low-Dose Computed Tomography (LDCT) reduces patient radiation exposure but introduces substantial noise and artifacts into reconstructed images. Convolutional Neural Network (CNN)-based denoising approaches are limited by local receptive fields, which restrict their ability to capture long-range dependencies. Transformer-based methods alleviate this limitation but incur quadratic computational complexity relative to image size. In contrast, State Space Model (SSM)–based Mamba frameworks achieve linear complexity for long-range interactions. However, existing Mamba-based methods often suffer from information loss and insufficient noise suppression. To address these limitations, we propose the Wave-MambaCT model.  Methods  The proposed Wave-MambaCT model adopts a multi-scale framework that integrates Discrete Wavelet Transform (DWT) with a Mamba module based on the SSM. First, DWT performs a two-level decomposition of the LDCT image, decoupling noise from Low-Frequency (LF) content. This design directs denoising primarily toward the High-Frequency (HF) components, facilitating noise suppression while preserving structural information. Second, a residual module combined with a Spatial-Channel Mamba (SCM) module extracts both local and global features from LF and HF bands at different scales. The noise-free LF features are then used to correct and enhance the corresponding HF features through an attention-based Cross-Frequency Mamba (CFM) module. Finally, inverse wavelet transform is applied in stages to progressively reconstruct the image. To further improve denoising performance and network stability, multiple loss functions are employed, including L1 loss, wavelet-domain LF loss, and adversarial loss for HF components.  
  Results and Discussions  Extensive experiments on the simulated Mayo Clinic datasets, the real Piglet datasets, and the hospital clinical dataset DeepLesion show that Wave-MambaCT provides superior denoising performance and generalization. On the Mayo dataset, a PSNR of 31.6528 is achieved, which is higher than that of the second-best method DenoMamba (PSNR 31.4219), while MSE is reduced to 0.00074 and SSIM and VIF are improved to 0.8851 and 0.4629, respectively (Table 1). Visual results (Figs. 4–6) demonstrate that edges and fine details such as abdominal textures and lesion contours are preserved, with minimal blurring or residual artifacts compared with competing methods. Computational efficiency analysis (Table 2) indicates that Wave-MambaCT maintains low FLOPs (17.2135 G) and parameters (5.3913 M). FLOPs are lower than those of all networks except RED-CNN, and the parameter count is higher only than those of RED-CNN and CTformer. During training, 4.12 minutes per epoch are required, longer only than RED-CNN. During testing, 0.1463 seconds are required per image, which is at a medium level among the compared methods. Generalization tests on the Piglet datasets (Figs. 7, 8, Tables 3, 4) and DeepLesion (Fig. 9) further confirm the robustness and generalization capacity of Wave-MambaCT. In the proposed design, HF sub-bands are grouped, and noise-free LF information is used to correct and guide their recovery. This strategy is based on two considerations. First, it reduces network complexity and parameter count. Second, although the sub-bands correspond to HF information in different orientations, they are correlated and complementary as components of the same image. Joint processing enhances the representation of HF content, whereas processing them separately would require a multi-branch architecture, inevitably increasing complexity and parameters. 
Future work will explore approaches to reduce complexity and parameters when processing HF sub-bands individually, while strengthening their correlations to improve recovery. For structural simplicity, SCM is applied to both HF and LF feature extraction. However, redundancy exists when extracting LF features, and future studies will explore the use of different Mamba modules for HF and LF features to further optimize computational efficiency.  Conclusions  Wave-MambaCT integrates DWT for multi-scale decomposition, a residual module for local feature extraction, and an SCM module for efficient global dependency modeling to address the denoising challenges of LDCT images. By decoupling noise from LF content through DWT, the model enables targeted noise removal in the HF domain, facilitating effective noise suppression. The designed RSCM, composed of residual blocks and SCM modules, captures fine-grained textures and long-range interactions, enhancing the extraction of both local and global information. In parallel, the Cross-band Enhancement Module (CEM) employs noise-free LF features to refine HF components through attention-based CFM, ensuring structural consistency across scales. Ablation studies (Table 5) confirm the essential contributions of both SCM and CEM modules to maintaining high performance. Importantly, the model’s staged denoising strategy achieves a favorable balance between noise reduction and structural preservation, yielding robustness to varying radiation doses and complex noise distributions.
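The DWT decoupling step can be illustrated with a one-level 2-D Haar transform: smooth content concentrates in the LL (LF) band while additive noise spreads into the LH/HL/HH (HF) sub-bands, which is what lets denoising target the HF components. The paper uses a two-level decomposition inside a learned network; the Haar filters, the synthetic image, and the noise level below are assumptions for illustration only.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT: (LL, (LH, HL, HH)).
    A second level, as in a two-level decomposition, is this function
    applied again to LL. img must have even height and width."""
    a, d = (img[0::2] + img[1::2]) / 2.0, (img[0::2] - img[1::2]) / 2.0
    LL, LH = (a[:, 0::2] + a[:, 1::2]) / 2.0, (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL, HH = (d[:, 0::2] + d[:, 1::2]) / 2.0, (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, (LH, HL, HH)

def haar_idwt2(LL, bands):
    """Exact inverse of haar_dwt2 (perfect reconstruction)."""
    LH, HL, HH = bands
    a = np.empty((LL.shape[0], LL.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = LL + LH, LL - LH
    d[:, 0::2], d[:, 1::2] = HL + HH, HL - HH
    out = np.empty((a.shape[0] * 2, a.shape[1]))
    out[0::2], out[1::2] = a + d, a - d
    return out

rng = np.random.default_rng(1)
xx, yy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
clean = np.sin(4 * xx) + yy            # smooth stand-in for anatomy
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

LL, (LH, HL, HH) = haar_dwt2(noisy)
# The clean image's energy sits almost entirely in LL, while the HF
# sub-bands are noise-dominated, so denoising can act on HF only.
```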
Performance Analysis of Double RIS-Assisted Multi-Antenna Cooperative NOMA with Short-Packet Communication
SONG Wenbin, CHEN Dechuan, ZHANG Xingang, WANG Zhipeng, SUN Xiaolin, WANG Baoping
Available online  , doi: 10.11999/JEIT250761
Abstract:
  Objective  Numerous existing studies on short-packet communication systems rely on the assumption of ideal transceiver devices. However, this assumption is unrealistic because radio-frequency transceiver hardware inevitably suffers from impairments such as phase noise and amplifier nonlinearity. Such impairments are particularly pronounced in short-packet communication systems, where low-cost hardware components are widely employed. Nevertheless, the performance of reconfigurable intelligent surface (RIS)-assisted multi-antenna cooperative non-orthogonal multiple access (NOMA) short-packet communication systems with hardware impairments has not been investigated. Furthermore, the impact of the number of base station (BS) antennas and RIS reflecting elements on reliability performance remains unexplored. Therefore, this paper investigates the reliability performance of double RIS-assisted multi-antenna cooperative NOMA short-packet communication systems, where one RIS facilitates communication between a multi-antenna BS and a near user, and the other RIS enhances communication between the near user and a far user.  Methods  Based on finite blocklength information theory, closed-form expressions for the average block error rate (BLER) of the near and far users are derived under the optimal antenna selection strategy. These expressions provide an efficient and convenient approach for evaluating the reliability of the considered system. Next, the effective throughput is formulated, and the optimal blocklength that maximizes it under reliability and latency constraints is derived.  Results and Discussions  The theoretical average BLER results show excellent agreement with Monte Carlo simulation results, confirming the validity of the derivations. The average BLER for the near and far users decreases as the transmit signal-to-noise ratio (SNR) increases. 
Moreover, for a given transmit SNR, increasing the blocklength significantly reduces the average BLER for the near and far users (Fig. 2). The reason for this improvement is that longer blocklengths decrease the transmission rate, thereby enhancing system reliability. The average BLER for the near user initially decreases before reaching a minimum and then increases as the power allocation coefficient increases (Fig. 3). This trend occurs because increasing the power allocation coefficient reduces the BLER for decoding the near user's signal but increases the complexity of the successive interference cancellation (SIC) process. In contrast, the average BLER for the far user increases as the power allocation coefficient increases. The double RIS-assisted transmission scheme demonstrates superior performance compared to the single RIS-assisted and non-RIS-assisted transmission schemes (Fig. 4). Specifically, as the number of RIS reflecting elements increases, the performance advantage of the proposed scheme over these benchmark schemes becomes increasingly significant. The average BLER for the far user saturates as the number of BS antennas increases (Fig. 5). This is because the relaying link becomes the dominant performance bottleneck when the number of BS antennas exceeds a certain value. As the blocklength increases, the effective throughput first reaches a maximum and then gradually decreases (Fig. 6). This is because when the blocklength is too small, a higher BLER results in poor effective throughput. When the blocklength is too large, a lower transmission rate also leads to poor effective throughput. As the quality of hardware improves, the optimal blocklength decreases. This is because lower hardware impairments reduce decoding errors, meaning that shorter blocklengths can be used to reduce transmission latency while still satisfying reliability constraints.  
Conclusions  This paper investigates the performance of the double RIS-assisted multi-antenna cooperative NOMA short-packet communication system under hardware impairments. Closed-form expressions for the average BLER of the near and far users are derived under the optimal antenna selection strategy. Furthermore, the effective throughput is analyzed, and the optimal blocklength that maximizes the effective throughput under reliability and latency constraints is determined. Simulation results demonstrate that the double RIS-assisted transmission scheme achieves superior performance compared to the single RIS-assisted and non-RIS-assisted transmission schemes. In addition, increasing the number of BS antennas does not always improve the average BLER for the far user due to the relaying link constraint. Power allocation is critical for ensuring user reliability. The near user should carefully balance self-signal demodulation and SIC under a total power constraint. Superior hardware quality enhances short-packet communication efficiency by lowering the optimal blocklength. Future work will focus on developing RIS configuration schemes that simultaneously maximize energy efficiency (EE) and ensure user fairness in NOMA to address the needs of energy-constrained IoT devices.
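The blocklength trade-off described above follows from finite-blocklength information theory: the BLER falls as the blocklength n grows, but the rate k/n falls too, so the effective throughput peaks at an interior optimum. The sketch below uses the standard normal approximation for a static AWGN link, not the paper's RIS-assisted fading model (whose closed-form average-BLER expressions are more involved), with assumed values of 128 information bits and a 5 dB SNR.

```python
import math
import numpy as np

def bler_normal_approx(n, k, snr):
    """Normal-approximation block error rate at blocklength n with k
    information bits over AWGN (finite-blocklength regime)."""
    cap = math.log2(1 + snr)                                  # capacity
    disp = (1 - (1 + snr) ** -2) * math.log2(math.e) ** 2     # dispersion
    x = (n * cap - k + 0.5 * math.log2(n)) / math.sqrt(n * disp)
    return 0.5 * math.erfc(x / math.sqrt(2))                  # Q(x)

def effective_throughput(n, k, snr):
    """Information bits delivered per channel use: (k/n) * (1 - BLER)."""
    return (k / n) * (1 - bler_normal_approx(n, k, snr))

k, snr = 128, 10 ** (5 / 10)       # assumed: 128-bit packets, 5 dB SNR
ns = range(64, 513)
taus = [effective_throughput(n, k, snr) for n in ns]
n_opt = 64 + int(np.argmax(taus))
# Short blocklengths suffer high BLER and long ones waste rate, so the
# throughput curve is unimodal and n_opt sits strictly inside the sweep.
```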
Data-Driven Secure Control for Cyber-Physical Systems under Denial-of-Service Attacks: An Online Mode-Dependent Switching-Q-Learning Strategy
ZHANG Ruifeng, YANG Rongni
Available online  , doi: 10.11999/JEIT250746
Abstract:
  Objective   The open network architecture of cyber-physical systems (CPSs) enables remarkable flexibility and scalability, but it also renders CPSs highly vulnerable to cyber-attacks. In particular, denial-of-service (DoS) attacks have emerged as one of the predominant threats, which can cause packet loss and reduce system performance by directly jamming channels. On the other hand, CPSs under dormant and active DoS attacks can be regarded as dual-mode switched systems with stable and unstable subsystems, respectively. Therefore, it is worth exploring how to utilize switched system theory to design a secure control approach with high degrees of freedom and low conservatism. However, due to the influence of complex environments such as attacks and noises, it is difficult to model practical CPSs exactly. Currently, although Q-learning-based control methods demonstrate potential for handling unknown CPSs, a significant research gap remains in switched systems with unstable modes, particularly in establishing an evaluable stability criterion. Therefore, it remains open to apply switched system theory to the design of a learning-based control algorithm and an evaluable security criterion for unknown CPSs under DoS attacks.   Methods   An online mode-dependent switching-Q-learning strategy is presented to study the data-driven evaluable criterion and secure control for unknown CPSs under DoS attacks. Initially, the CPSs under dormant and active DoS attacks are transformed into switched systems with stable and unstable subsystems, respectively. Subsequently, the optimal control problem of the value function is addressed for the model-based switched systems by designing a new generalized switching algebraic Riccati equation (GSARE) and obtaining the corresponding mode-dependent optimal security controller. Furthermore, the existence and uniqueness of the GSARE’s solution are proved. 
In what follows, with the help of model-based results, a data-driven optimal security control law is proposed by developing a novel online mode-dependent switching-Q-learning control algorithm. Finally, by utilizing the learned control gain and parameter matrices from the above algorithm, a data-driven evaluable security criterion with the attack frequency and duration is established based on the switching constraints and subsystem constraints.   Results and Discussions   To verify the efficiency and advantage of the proposed methods, comparative experiments on a wheeled robot are presented in this work. Firstly, the model-based result (Theorem 1) is compared with the data-driven result (Algorithm 1): from the iterative process curves of control gain and parameter matrices (Fig. 2 and Fig. 3), it can be observed that the optimal control gain and parameter matrices under threshold errors can all be successfully obtained from both the model-based GSARE and the data-driven algorithm. Meanwhile, the tracking errors of CPSs can converge to 0 by utilizing the above data-driven controller (Fig. 5), which ensures the exponential stability of CPSs and verifies the efficiency of our proposed switching-Q-learning algorithm. Secondly, it is evident from the learning process curves (Fig. 4) that although the initial control gain is not stabilizing, the optimal control gain can still be successfully learned to stabilize the system from Algorithm 1. This result significantly reduces conservatism compared to existing Q-learning approaches, which require a stabilizing initial control gain as the learning premise. 
Thirdly, the data-driven evaluable security criterion in Theorem 2 is compared with existing criteria: although the switching parameters learned from Algorithm 1 do not satisfy the popular switching constraint used to obtain the mode dwell-time, the attack frequency and duration are still obtained from the new switching constraints and subsystem constraints by utilizing the evaluable security criterion proposed in this paper. Furthermore, it is seen from the comparison of the evaluable security criteria (Tab. 1) that our proposed evaluable security criterion is less conservative than the existing evaluable criteria. Finally, the learned optimal controller and the obtained DoS attack constraints are applied to the tracking control experiment of a wheeled robot under DoS attacks, and the result is compared with existing results via Q-learning controllers. It is evident from the tracking trajectory comparisons of the robot (Fig. 6 and Fig. 7) that the robot achieves significantly faster and more accurate trajectory tracking with the help of our proposed switching-Q-learning controller. Therefore, the efficiency and advantage of the proposed algorithm and criterion in this work are verified.   Conclusions   Based on the learning strategy and the switched system theory, this study presents an online mode-dependent switching-Q-learning control algorithm and the corresponding evaluable security criterion for the unknown CPSs under DoS attacks. The detailed results are provided as follows: (1) By representing the unknown CPSs under dormant and active DoS attacks as unknown switched systems with stable and unstable subsystems, respectively, the security problem of CPSs under DoS attacks is transformed into a stabilization problem of the switched systems, which offers high design freedom and low conservatism. (2) A novel online mode-dependent switching-Q-learning control algorithm is developed for unknown switched systems with unstable modes. 
The comparative experiments show that the proposed switching-Q-learning algorithm effectively increases the design freedom of controllers and reduces conservatism relative to existing Q-learning algorithms. (3) A new data-driven evaluable security criterion with the attack frequency and duration is established based on the switching constraints and subsystem constraints. The comparison of criteria shows that the proposed criterion is significantly less conservative than existing evaluable criteria that rely on single-subsystem constraints and traditional mode dwell-time constraints.
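The role of the attack-duration constraint in such a criterion can be caricatured with a scalar dual-mode switched system: a contraction factor for the dormant-attack (stable) mode, a growth factor for the active-attack (unstable) mode, and a resulting bound on the fraction of time the attack may stay active. The rates 0.8 and 1.1 below are assumptions for illustration; the paper's criterion additionally involves the attack frequency, switching constraints, and parameters learned from data.

```python
import math

def max_attack_duty(rho_stable, rho_unstable):
    """Largest fraction of time the DoS attack may be active while the
    switched system (stable/unstable modes) still contracts on average:
    rho_s^(1-alpha) * rho_u^alpha < 1  iff  alpha < ls / (ls + lu)."""
    ls = -math.log(rho_stable)    # decay rate of the dormant-attack mode
    lu = math.log(rho_unstable)   # growth rate of the active-attack mode
    return ls / (ls + lu)

def simulate(duty, steps=1000, rho_s=0.8, rho_u=1.1, x0=1.0):
    """Scalar switched system: the attack is active for a `duty` fraction
    of each 10-step period; returns the final state magnitude."""
    x, period = x0, 10
    active = max(0, min(period, round(duty * period)))
    for t in range(steps):
        x *= rho_u if (t % period) < active else rho_s
    return abs(x)

bound = max_attack_duty(0.8, 1.1)   # about 0.70 for these assumed rates
```

A duty cycle below the bound (e.g. 0.5) drives the state to zero, while one above it (e.g. 0.9) lets the active-attack growth dominate and the state diverge.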
Stealthy Path Planning Algorithm for UAV Swarm Based on Improved APF-RRT* Under Dynamic Threat
ZHANG Xinrui, SHI Chenguang, WU Zhifeng, WEN Wen, ZHOU Jianjiang
Available online  , doi: 10.11999/JEIT250554
Abstract:
  Objective  The efficient penetration and survivability of unmanned aerial vehicle (UAV) swarms in complex battlefield environments critically depend on robust trajectory planning. With the increasing deployment of advanced air defense systems—featuring radar networks, anti-aircraft artillery, and dynamic no-fly zones—conventional planning methods struggle to meet simultaneous requirements for stealth, feasibility, and safety. Although prior studies have contributed valuable insights into UAV swarm path planning, they present several limitations: (1) Most research focuses on detection models for single radars and does not sufficiently incorporate the coupling between UAV radar cross section (RCS) and stealth trajectory optimization; (2) UAV kinematic constraints are often treated independently of stealth characteristics; (3) Environmental threats are typically modeled as static and singular, overlooking real-time dynamic threats; (4) Stealth planning is predominantly studied for individual UAVs, with limited attention to swarm-level coordination. This work addresses these gaps by proposing a cooperative stealth trajectory planning framework that integrates real-time threat perception with swarm dynamics optimization, significantly enhancing survivability in contested airspace.  Methods  To overcome the aforementioned challenges, this paper proposes a stealth path planning algorithm for UAV swarms based on an improved artificial potential field (APF) and rapidly-exploring random tree star (RRT*) framework under dynamic threats. First, a multi-threat environment model is constructed, incorporating radars, anti-aircraft artillery, and fixed obstacles. A comprehensive stealth cost function is developed by integrating UAV RCS characteristics, accounting for flight distance, radar detection probability, and artillery threat probability. 
Second, a stealth trajectory optimization model is formulated with the objective of minimizing the overall cost function, subject to strict constraints on UAV kinematics, swarm coordination, and path feasibility. To solve this model efficiently, an enhanced APF-RRT* algorithm is designed. A rolling-window strategy is introduced to facilitate continuous local replanning in response to dynamic threats, enabling real-time trajectory updates and improving responsiveness to sudden changes in the threat landscape. Furthermore, a target-biased sampling technique is applied to reduce sampling redundancy, thereby enhancing algorithmic convergence speed. By combining the global search capability of RRT* with the local adaptability of APF, the proposed approach enables UAV swarms to generate stealth-optimal paths in real time while maintaining high levels of safety and coordination in adversarial environments.  Results and Discussions  Simulation experiments validate the effectiveness of the proposed algorithm. During global path planning, some UAVs enter regions threatened by dynamic no-fly zones, radars, and artillery systems, while others successfully reach their destinations through unobstructed paths. In the local replanning phase, affected UAVs adaptively adjust their trajectories to minimize radar detection probability and overall stealth cost. When encountering mobile threats, UAVs perform lateral evasive maneuvers to avoid collisions and ensure mission completion. In contrast, the detection probabilities of the UAVs requiring replanning all exceed the specified threshold for networked radar detection under the comparison algorithms. This indicates that, in practical scenarios, the comparison algorithms fail to generate UAV swarm trajectories that meet platform safety requirements, rendering them ineffective. 
Comparative simulations demonstrate that the proposed method significantly outperforms existing approaches by reducing stealth costs and improving trajectory feasibility and swarm coordination. The algorithm achieves optimal swarm-level stealth and ensures safe and efficient penetration in dynamic environments.  Conclusions  This study addresses the problem of stealth trajectory planning for UAV swarms in dynamic threat environments by proposing an improved APF-RRT* algorithm. The following key findings are derived from extensive simulations conducted across different contested scenarios (Section 5): (1) The proposed algorithm reduces the voyage distance by 11.1 km in Scene 1 and 66.9 km in Scene 2 compared with the baseline RRT* method (Tab. 3, Tab. 5), primarily due to RCS-minimizing attitude adjustments by heading angle changes (Fig. 3, Fig. 6); (2) The networked radar detection probability remains below the 30% threshold for all UAVs (Fig. 4(a), Fig. 7(a)), whereas the comparison algorithms exceed the safety limit for up to 98% of the swarm members (Fig. 4(b), Fig. 7(b), Fig. 9(a), Fig. 9(b)); (3) The rolling-window replanning mechanism enables real-time avoidance of mobile threats such as dynamic no-fly zones and anti-aircraft artillery (Fig. 5, Fig. 8), while simultaneously reducing the comprehensive trajectory cost by 9.0% in Scene 1 and 15.6% in Scene 2 compared with the baseline RRT* method (Tab. 3, Tab. 5); (4) Cooperative constraints embedded in the planning algorithm maintain safe inter-UAV separation and jointly optimize swarm-level stealth performance (Fig. 2, Fig. 5, Fig. 8). These results collectively demonstrate the superiority of the proposed method in balancing stealth optimization, dynamic threat adaptation, and swarm kinematic feasibility. Future research will extend this framework to 3D complex terrain environments and integrate deep reinforcement learning to further enhance predictive threat response and battlefield adaptability.
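Two ingredients of the improved APF-RRT* pipeline, target-biased sampling and a local artificial-potential-field step, can be sketched as follows. The workspace bounds, threat position, influence distance, and gains are illustrative assumptions, not the paper's parameter values.

```python
import math
import random

def biased_sample(goal, bounds, goal_bias=0.2, rng=random):
    """Target-biased sampling for RRT*: with probability goal_bias draw the
    goal itself, otherwise sample uniformly over the workspace bounds.
    This removes much of the sampling redundancy of plain RRT*."""
    if rng.random() < goal_bias:
        return goal
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

def apf_step(p, target, threats, step=1.0, d0=10.0):
    """One local APF move: unit attraction toward the target plus a
    repulsive term from every threat center within influence range d0."""
    tx, ty = target[0] - p[0], target[1] - p[1]
    dist = math.hypot(tx, ty) or 1e-9
    fx, fy = tx / dist, ty / dist
    for cx, cy in threats:
        dx, dy = p[0] - cx, p[1] - cy
        r = math.hypot(dx, dy) or 1e-9
        if r < d0:
            w = (1.0 / r - 1.0 / d0) / r   # stronger the closer the threat
            fx, fy = fx + w * dx / r, fy + w * dy / r
    norm = math.hypot(fx, fy) or 1e-9
    return (p[0] + step * fx / norm, p[1] + step * fy / norm)

# Toy local replanning run: fly from (0,0) to (50,0) past a threat at (25,3).
p, goal, threats = (0.0, 0.0), (50.0, 0.0), [(25.0, 3.0)]
path = [p]
for _ in range(200):
    p = apf_step(p, goal, threats)
    path.append(p)
    if math.hypot(goal[0] - p[0], goal[1] - p[1]) < 1.0:
        break
```

The repulsive weight grows sharply as a threat is approached, so the toy trajectory bends around the threat and still reaches the goal, which is the local behavior the rolling-window replanning exploits.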
Low Complexity Sequential Decoding Algorithm of PAC Code for Short Packet Communication
DAI Jingxin, YIN Hang, WANG Yuhuan, LV Yansong, YANG Zhanxin, LV Rui, XIA Zhiping
Available online  , doi: 10.11999/JEIT250533
Abstract:
  Objective  With the rise of the intelligent Internet of Things (IoT), short packet communication among IoT devices must meet stringent requirements for low latency, high reliability, and very short packet length, posing challenges to the design of channel coding schemes. As an advanced variant of polar codes, Polarization-Adjusted Convolutional (PAC) codes enhance the error-correction performance of polar codes at medium and short code lengths, approaching the dispersion bound in some cases. This makes them promising for short packet communication. However, the high decoding complexity required to achieve near-bound error-correction performance limits their practicality. To address this, we propose two low complexity sequential decoding algorithms: Low Complexity Fano Sequential (LC-FS) and Low Complexity Stack (LC-S). Both algorithms effectively reduce decoding complexity with negligible loss in error-correction performance.  Methods  To reduce the decoding complexity of Fano-based sequential decoding algorithms, we propose the LC-FS algorithm. This method exploits special nodes to terminate decoding at intermediate levels of the decoding tree, thereby reducing the complexity of tree traversal. Special nodes are classified into two types according to decoder structure: low-rate nodes (Type-\begin{document}$ \mathrm{T} $\end{document} node) and high-rate nodes [Rate-1 and Single Parity-Check (SPC) nodes]. This classification minimizes unnecessary hardware overhead by avoiding excessive subdivision of special nodes. For each type, a corresponding LC-FS decoder and node-movement strategy are developed. To reduce the complexity of stack-based decoding algorithms, we propose the LC-S algorithm. While preserving the low backtracking feature of stack-based decoding, this method introduces tailored decoding structures and node-movement strategies for low-rate and high-rate special nodes. 
Therefore, the LC-S algorithm achieves significant complexity reduction without compromising error-correction performance.  Results and Discussions  The performance of the proposed LC-FS and LC-S decoding algorithms is evaluated through extensive simulations in terms of Frame Error Rate (FER), Average Computational Complexity (ACC), Maximum Computational Complexity (MCC), and memory requirements. Traditional Fano sequential, traditional stack, and Fast Fano Sequential (FFS) decoding algorithms are set as benchmarks. The simulation results show that the LC-FS and LC-S algorithms exhibit negligible error-correction performance loss compared with traditional Fano sequential and stack decoders (Fig. 5). Across different PAC codes, both algorithms effectively reduce decoding complexity. Specifically, as increases, the reductions in ACC and MCC become more pronounced. For ACC, the LC-FS decoding algorithm (\begin{document}$T = 4$\end{document}) achieves reductions of 13.77% (\begin{document}$N = 256$\end{document}, \begin{document}$K = 128$\end{document}), 11.42% (\begin{document}$N = 128$\end{document}, \begin{document}$K = 64$\end{document}), and 25.52% (\begin{document}$N = 64$\end{document}, \begin{document}$K = 32$\end{document}) on average compared with FFS (Fig. 6). The LC-S decoding algorithm (\begin{document}$T = 4$\end{document}) reduces ACC by 56.48% (\begin{document}$N = 256$\end{document}, \begin{document}$K = 128$\end{document}), 47.63% (\begin{document}$N = 128$\end{document}, \begin{document}$K = 64$\end{document}), and 49.61% (\begin{document}$N = 64$\end{document}, \begin{document}$K = 32$\end{document}) on average compared with the traditional stack algorithm (Fig. 6). 
For MCC, the LC-FS decoding algorithm (\begin{document}$T = 4$\end{document}) achieves reductions of 29.71% (\begin{document}$N = 256$\end{document}, \begin{document}$K = 128$\end{document}), 21.18% (\begin{document}$N = 128$\end{document}, \begin{document}$K = 64$\end{document}), and 23.62% (\begin{document}$N = 64$\end{document}, \begin{document}$K = 32$\end{document}) on average compared with FFS (Fig. 7). The LC-S decoding algorithm (\begin{document}$T = 4$\end{document}) reduces MCC by 67.17% (\begin{document}$N = 256$\end{document}, \begin{document}$K = 128$\end{document}), 49.33% (\begin{document}$N = 128$\end{document}, \begin{document}$K = 64$\end{document}), and 51.84% (\begin{document}$N = 64$\end{document}, \begin{document}$K = 32$\end{document}) on average compared with the traditional stack algorithm (Fig. 7). By exploiting low-rate and high-rate special nodes to terminate decoding at intermediate levels of the decoding tree, the LC-FS and LC-S algorithms also reduce memory requirements (Table 2). However, as \begin{document}$T$\end{document} increases, the memory usage of LC-S rises because all extended paths of low-rate special nodes are pushed into the stack. The increase in \begin{document}$T$\end{document} enlarges the number of extended paths, indicating its critical role in balancing decoding complexity and memory occupation (Fig. 8).  Conclusions  To address the high decoding complexity of sequential decoding algorithms for PAC codes, this paper proposes two low complexity approaches: the LC-FS and LC-S algorithms. Both methods classify special nodes into low-rate and high-rate categories and design corresponding decoders and movement strategies. By introducing Type-\begin{document}$ \mathrm{T} $\end{document} nodes, the algorithms further eliminate redundant computations during decoding, thereby reducing complexity. 
Simulation results demonstrate that the LC-FS and LC-S algorithms substantially decrease decoding complexity while maintaining the error-correction performance of PAC codes at medium and short code lengths.
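The high-rate special nodes named in this abstract (Rate-1 and Single Parity-Check nodes) admit closed-form decisions that allow a sequential decoder to terminate at intermediate levels of the decoding tree instead of descending to every leaf. A minimal Python sketch of these two standard node decoders follows; it illustrates the general fast-node idea only and is not the authors' LC-FS/LC-S implementation.

```python
import numpy as np

def decode_rate1(llrs):
    """Rate-1 node: every position carries an information bit, so the
    maximum-likelihood decision is a per-position hard decision on the LLRs."""
    return (llrs < 0).astype(int)

def decode_spc(llrs):
    """Single Parity-Check (SPC) node: hard-decide all bits, then, if the
    even-parity constraint is violated, flip the least reliable bit."""
    bits = (llrs < 0).astype(int)
    if bits.sum() % 2 != 0:                 # parity constraint violated
        bits[np.argmin(np.abs(llrs))] ^= 1  # flip bit with smallest |LLR|
    return bits
```

A low-rate (Type-T) node handler would analogously return early after enumerating a small, bounded set of candidate paths rather than traversing the full subtree.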
Multimodal Hypergraph Learning Guidance with Global Noise Enhancement for Sentiment Analysis under Missing Modality Information
HUANG Chen, LIU Huijie, ZHANG Yan, YANG Chao, SONG Jianhua
Available online, doi: 10.11999/JEIT250649
Abstract:
  Objective  Multimodal Sentiment Analysis (MSA) has shown considerable promise in interdisciplinary domains such as Natural Language Processing (NLP) and Affective Computing, particularly by integrating information from ElectroEncephaloGraphy (EEG) signals, visual images, and text to classify sentiment polarity and provide a comprehensive understanding of human emotional states. However, in complex real-world scenarios, challenges including missing modalities, limited high-level semantic correlation learning across modalities, and the lack of mechanisms to guide cross-modal information transfer substantially restrict the generalization ability and accuracy of sentiment recognition models. To address these limitations, this study proposes a Multimodal Hypergraph Learning Guidance method with Global Noise Enhancement (MHLGNE), designed to improve the robustness and performance of MSA under conditions of missing modality information in complex environments.  Methods  The overall architecture of the MHLGNE model is illustrated in Fig. 2 and consists of the Adaptive Global Noise Sampling Module, the Multimodal Hypergraph Learning Guiding Module, and the Sentiment Prediction Target Module. A pretrained language model is first applied to encode the multimodal input data. To simulate missing modality conditions, the input data are constructed with incomplete modal information, where a modality \begin{document}$ m\in \{e,v,t\} $\end{document} is randomly absent. The adaptive global noise sampling strategy is then employed to supplement missing modalities from a global perspective, thereby improving adaptability and enhancing both robustness and generalization in complex environments. This design allows the model to handle noisy data and missing modalities more effectively. 
The Multimodal Hypergraph Learning Guiding Module is further applied to capture high-level semantic correlations across different modalities, overcoming the limitations of conventional methods that rely only on feature alignment and fusion. By guiding cross-modal information transfer, this module enables the model to focus on essential inter-modal semantic dependencies, thereby improving sentiment prediction accuracy. Finally, the performance of MHLGNE is compared with that of State-Of-The-Art (SOTA) MSA models under two conditions: complete modality data and randomly missing modality information.  Results and Discussions  Three publicly available MSA datasets (SEED-IV, SEED-V, and DREAMER) are employed, with features extracted from EEG signals, visual images, and text. To ensure robustness, standard cross-validation is applied, and the training process is conducted with iterative adjustments to the noise sampling strategy, modality fusion method, and hypergraph learning structure to optimize sentiment prediction. Under the complete modality condition, MHLGNE is observed to outperform the second-best M2S model across most evaluation metrics, with accuracy improvements of 3.26%, 2.10%, and 0.58% on SEED-IV, SEED-V, and DREAMER, respectively. Additional metrics also indicate advantages over other SOTA methods. Under the random missing modality condition, MHLGNE maintains superiority over existing MSA approaches, with improvements of 1.03% in accuracy, 0.24% in precision, and 0.08 in Kappa score. The adaptive noise sampling module is further shown to effectively compensate for missing modalities. Unlike conventional models that suffer performance degradation under such conditions, MHLGNE maintains robustness by generating complementary information. 
In addition, the multimodal hypergraph structure enables the capture of high-level semantic dependencies across modalities, thereby strengthening cross-modal information transfer and offering clear advantages when modalities are absent. Ablation experiments confirm the independent contributions of each module. The removal of either the adaptive noise sampling or the multimodal hypergraph learning guiding module results in notable performance declines, particularly under high-noise or severely missing modality conditions. The exclusion of the cross-modal information transfer mechanism causes a substantial decline in accuracy and robustness, highlighting its essential role in MSA.  Conclusions  The MHLGNE model, equipped with the Adaptive Global Noise Sampling Module and the Multimodal Hypergraph Learning Guiding Module, markedly improves the performance of MSA under conditions of missing modalities and in tasks requiring effective cross-modal information transfer. Experiments on SEED-IV, SEED-V, and DREAMER confirm that MHLGNE exceeds SOTA MSA models across multiple evaluation metrics, including accuracy, precision, Kappa score, and F1 score, thereby demonstrating its robustness and effectiveness. Future work may focus on refining noise sampling strategies and developing more sophisticated hypergraph structures to further strengthen performance under extreme modality-missing scenarios. In addition, this framework has the potential to be extended to broader sentiment analysis tasks across diverse application domains.
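The missing-modality setup described above, where one modality \begin{document}$ m\in \{e,v,t\} $\end{document} is randomly absent and compensated from a global perspective, can be sketched in a few lines. The code below is an illustrative stand-in for the paper's adaptive global noise sampling: it simply draws a replacement feature vector from global per-modality statistics (the dictionary keys and Gaussian form are assumptions for this example, not the authors' exact mechanism).

```python
import numpy as np

def fill_missing_modality(batch, missing, global_stats, rng):
    """Return a copy of `batch` in which the absent modality `missing`
    ('e', 'v', or 't') is filled with features sampled from global
    statistics of that modality (illustrative noise-based compensation)."""
    mu, sigma = global_stats[missing]       # global mean/std per feature dim
    filled = dict(batch)                    # keep observed modalities intact
    filled[missing] = rng.normal(mu, sigma, size=mu.shape)
    return filled
```

In the full model, the sampled features would then be refined by the hypergraph learning module rather than used directly.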
Entropy Quantum Collaborative Planning Method for Emergency Path of Unmanned Aerial Vehicles Driven by Survival Probability
WANG Enliang, ZHANG Zhen, SUN Zhixin
Available online, doi: 10.11999/JEIT250694
Abstract:
  Objective  Natural disaster emergency rescue places stringent requirements on the timeliness and safety of Unmanned Aerial Vehicle (UAV) path planning. Conventional optimization objectives, such as minimizing total distance, often fail to reflect the critical time-sensitive priority of maximizing the survival probability of trapped victims. Moreover, existing algorithms struggle with the complex constraints of disaster environments, including no-fly zones, caution zones, and dynamic obstacles. To address these challenges, this paper proposes an Entropy-Enhanced Quantum Ripple Synergy Algorithm (E2QRSA). The primary goals are to establish a survival probability maximization model that incorporates time decay characteristics and to design a robust optimization algorithm capable of efficiently handling complex spatiotemporal constraints in dynamic disaster scenarios.  Methods  E2QRSA enhances the Quantum Ripple Optimization framework through four key innovations: (1) information entropy–based quantum state initialization, which guides population generation toward high-entropy regions; (2) multi-ripple collaborative interference, which promotes beneficial feature propagation through constructive superposition; (3) entropy-driven parameter control, which dynamically adjusts ripple propagation according to search entropy rates; and (4) quantum entanglement, which enables information sharing among elite individuals. The model employs a survival probability objective function that accounts for time-sensitive decay, base conditions, and mission success probability, subject to constraints including no-fly zones, caution zones, and dynamic obstacles.  Results and Discussions  Simulation experiments are conducted in medium- and large-scale typhoon disaster scenarios. The proposed E2QRSA achieves the highest survival probabilities of 0.847 and 0.762, respectively (Table 1), exceeding comparison algorithms such as SEWOA and PSO by 4.2–16.0%. 
Although the paths generated by E2QRSA are not the shortest, they are the most effective in maximizing survival chances. The ablation study (Table 3) confirms the contribution of each component, with the removal of multi-ripple interference causing the largest performance decrease (9.97%). The dynamic coupling between search entropy and ripple parameters (Fig. 2) is validated, demonstrating the effectiveness of the adaptive control mechanism. The entanglement effect (Fig. 4) is shown to maintain population diversity. In terms of constraint satisfaction, E2QRSA-planned paths consume only 85.2% of the total available energy (Table 5), ensuring a safe return, and all static and dynamic obstacles are successfully avoided, as visually verified in the 3D path plots (Figs. 6 and 7).  Conclusions  E2QRSA effectively addresses the challenge of UAV path planning for disaster relief by integrating adaptive entropy control with quantum-inspired mechanisms. The survival probability objective captures the essential requirements of disaster scenarios more accurately than conventional distance minimization. Experimental validation demonstrates that E2QRSA achieves superior solution quality and faster convergence, providing a robust technical basis for strengthening emergency response capabilities.
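A survival probability objective with time decay, of the kind this abstract contrasts with plain distance minimization, can be written compactly. The exponential decay form, the rate `lam`, and the multiplicative base-condition and mission-success factors below are assumptions chosen for illustration; the paper's exact model may differ.

```python
import math

def survival_probability(t_arrival, lam=0.01, base_factor=1.0, p_success=1.0):
    """Survival probability of a victim reached at time t_arrival, assuming
    exponential time decay at rate lam, scaled by base conditions and the
    probability of mission success (illustrative form only)."""
    return base_factor * p_success * math.exp(-lam * t_arrival)

def path_objective(arrival_times, **kw):
    """Path-level objective: expected number of survivors over all victims
    visited along the planned path; the planner maximizes this value."""
    return sum(survival_probability(t, **kw) for t in arrival_times)
```

Under such an objective, a slightly longer path that reaches time-critical victims earlier can score higher than the shortest path, which matches the observation above that E2QRSA's paths are not the shortest yet maximize survival chances.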
A Method for Named Entity Recognition in Military Intelligence Domain Using Large Language Models
LI Yongbin, LIU Lian, ZHENG Jie
Available online, doi: 10.11999/JEIT250764
Abstract:
  Objective  Named Entity Recognition (NER) is a fundamental task in information extraction within specialized domains, particularly military intelligence. It plays a critical role in situation assessment, threat analysis, and decision support. However, conventional NER models face major challenges. First, the scarcity of high-quality annotated data in the military intelligence domain is a persistent limitation. Due to the sensitivity and confidentiality of military information, acquiring large-scale, accurately labeled datasets is extremely difficult, which severely restricts the training performance and generalization ability of supervised learning–based NER models. Second, military intelligence requires handling complex and diverse information extraction tasks. The entities to be recognized often possess domain-specific meanings, ambiguous boundaries, and complex relationships, making it difficult for traditional models with fixed architectures to adapt flexibly to such complexity or achieve accurate extraction. This study aims to address these limitations by developing a more effective NER method tailored to the military intelligence domain, leveraging Large Language Models (LLMs) to enhance recognition accuracy and efficiency in this specialized field.  Methods  To achieve the above objective, this study focuses on the military intelligence domain and proposes an NER method based on LLMs. The central concept is to harness the strong semantic reasoning capabilities of LLMs, which enable deep contextual understanding of military texts, accurate interpretation of complex domain-specific extraction requirements, and autonomous execution of extraction tasks without heavy reliance on large annotated datasets. To ensure that general-purpose LLMs can rapidly adapt to the specialized needs of military intelligence, two key strategies are employed. First, instruction fine-tuning is applied. 
Domain-specific instruction datasets are constructed to include diverse entity types, extraction rules, and representative examples relevant to military intelligence. Through fine-tuning with these datasets, the LLMs acquire a more precise understanding of the characteristics and requirements of NER in this field, thereby improving their ability to follow targeted extraction instructions. Second, Retrieval-Augmented Generation (RAG) is incorporated. A domain knowledge base is developed containing expert knowledge such as entity dictionaries, military terminology, and historical extraction cases. During the NER process, the LLM retrieves relevant knowledge from this base in real time to support entity recognition. This strategy compensates for the limited domain-specific knowledge of general LLMs and enhances recognition accuracy, particularly for rare or complex entities.  Results and Discussions  Experimental results indicate that the proposed LLM–based NER method, which integrates instruction fine-tuning and RAG, achieves strong performance in military intelligence NER tasks. Compared with conventional NER models, it demonstrates higher precision, recall, and F1-score, particularly in recognizing complex entities and managing scenarios with limited annotated data. The effectiveness of this method arises from several key factors. The powerful semantic reasoning capability of LLMs enables a deeper understanding of contextual nuances and ambiguous expressions in military texts, thereby reducing missed and false recognitions commonly caused by rigid pattern-matching approaches. Instruction fine-tuning allows the model to better align with domain-specific extraction requirements, ensuring that the recognition results correspond more closely to the practical needs of military intelligence analysis. 
Furthermore, the incorporation of RAG provides real-time access to domain expert knowledge, markedly enhancing the recognition of entities that are highly specialized or morphologically variable within military contexts. This integration effectively mitigates the limitations of traditional models that lack sufficient domain knowledge.  Conclusions  This study proposes an LLM-based NER method for the military intelligence domain, effectively addressing the challenges of limited annotated data and complex extraction requirements encountered by traditional models. By combining instruction fine-tuning and RAG, general-purpose LLMs can be rapidly adapted to the specialized demands of military intelligence, enabling the construction of an efficient domain-specific expert system at relatively low cost. The proposed method provides an effective and scalable solution for NER tasks in military intelligence scenarios, enhancing both the efficiency and accuracy of information extraction in this field. It offers not only practical value for military intelligence analysis and decision support but also methodological insight for NER research in other specialized domains facing similar data and complexity constraints, such as aerospace and national security. Future research will focus on optimizing instruction fine-tuning strategies, expanding the domain knowledge base, and reducing computational cost to further improve model performance and applicability.
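The retrieval-augmented step described above, retrieving entries from a domain knowledge base and grounding the extraction prompt with them, can be sketched in miniature. The toy word-overlap ranking and the prompt wording below are illustrative assumptions; a real system would use dense vector retrieval over an entity dictionary and terminology base.

```python
def retrieve(query, knowledge_base, k=2):
    """Toy retrieval: rank knowledge-base entries by word overlap with the
    query (a stand-in for proper dense/semantic retrieval)."""
    def overlap(entry):
        return len(set(query.lower().split()) & set(entry.lower().split()))
    return sorted(knowledge_base, key=overlap, reverse=True)[:k]

def build_ner_prompt(text, knowledge_base):
    """Assemble an extraction prompt that grounds the LLM with retrieved
    domain knowledge before asking it to extract entities."""
    context = "\n".join(retrieve(text, knowledge_base))
    return ("Domain knowledge:\n" + context +
            "\n\nExtract all named entities from:\n" + text)
```

The fine-tuned LLM would then be called on the assembled prompt, with the retrieved context compensating for rare or highly specialized entities.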
Secrecy Rate Maximization Algorithm for IRS Assisted UAV-RSMA Systems
WANG Zhengqiang, KONG Weidong, WAN Xiaoyu, FAN Zifu, DUO Bin
Available online, doi: 10.11999/JEIT250452
Abstract:
  Objective  Under the stringent requirements of Sixth-Generation (6G) mobile communication networks for spectral efficiency, energy efficiency, low latency, and wide coverage, Unmanned Aerial Vehicle (UAV) communication has emerged as a key solution for 6G and beyond, leveraging its Line-of-Sight propagation advantages and flexible deployment capabilities. Functioning as aerial base stations, UAVs significantly enhance network performance by improving spectral efficiency and connection reliability, demonstrating irreplaceable value in critical scenarios such as emergency communications, remote area coverage, and maritime operations. However, UAV communication systems face dual challenges in high-mobility environments: severe multi-user interference in dense access scenarios that substantially degrades system performance, alongside critical physical-layer security threats resulting from the broadcast nature and spatial openness of wireless channels that enable malicious interception of transmitted signals. Rate-Splitting Multiple Access (RSMA) mitigates these challenges by decomposing user messages into common and private streams, thereby providing a flexible interference management mechanism that balances decoding complexity with spectral efficiency. This makes RSMA especially suitable for high-density user access scenarios. In parallel, Intelligent Reflecting Surfaces (IRS) have emerged as a promising technology to dynamically reconfigure wireless propagation through programmable electromagnetic unit arrays. IRS improves the quality of legitimate links while reducing the capacity of eavesdropping links, thereby enhancing physical-layer security in UAV communications. It is noteworthy that while existing research has predominantly centered on conventional multiple access schemes, the application potential of RSMA technology in IRS-assisted UAV communication systems remains relatively unexplored. 
Against this background, this paper investigates secure transmission strategies in IRS-assisted UAV-RSMA systems.  Methods  This paper investigates the effect of eavesdroppers on the security performance of UAV communication systems and proposes an IRS-assisted RSMA-based UAV communication model. The system comprises a multi-antenna UAV base station, an IRS mounted on a building, multiple single-antenna legitimate users, and multiple single-antenna eavesdroppers. The optimization problem is formulated to maximize the system secrecy rate by jointly optimizing precoding vectors, common secrecy rate allocation, IRS phase shifts, and UAV positioning. The problem is highly non-convex due to the strong coupling among these variables, rendering direct solutions intractable. To overcome this challenge, a two-layer optimization framework is developed. In the inner layer, with the UAV position fixed, an alternating optimization strategy divides the problem into two subproblems: (1) joint optimization of precoding vectors and common secrecy rate allocation and (2) optimization of IRS phase shifts. Non-convex constraints are transformed into convex forms using techniques such as Successive Convex Approximation (SCA), relaxation variables, first-order Taylor expansion, and Semidefinite Relaxation (SDR). In the outer layer, the Particle Swarm Optimization (PSO) algorithm determines the UAV deployment position based on the optimized inner-layer variables.  Results and Discussions  Simulation results show that the proposed algorithm outperforms RSMA without IRS, NOMA with IRS, and NOMA without IRS in terms of secrecy rate. Fig. 2 illustrates that the secrecy rate increases with the number of iterations and converges under different UAV maximum transmit power levels and antenna configurations. Fig. 3 demonstrates that increasing UAV transmit power significantly enhances the secrecy rate for both the proposed and benchmark schemes. 
This improvement arises because higher transmit power strengthens the signal received by legitimate users, increasing their achievable rates and enhancing system secrecy performance. Fig. 4 indicates that the secrecy rate grows with the number of UAV antennas. This improvement is due to expanded signal coverage and greater spatial degrees of freedom, which amplify effective signal strength in legitimate user channels. Fig. 5 shows that both the proposed scheme and NOMA with IRS achieve higher secrecy rates as the number of IRS reflecting elements increases. The additional elements provide greater spatial degrees of freedom, improving channel gains for legitimate users and strengthening resistance to eavesdropping. In contrast, benchmark schemes operating without IRS assistance exhibit no performance improvement and maintain a constant secrecy rate. This result highlights the critical role of the IRS in enabling secure communications. Finally, Fig. 6 demonstrates the optimal UAV position when \begin{document}${P_{\max }} = 30{\text{ dBm}}$\end{document}. Deploying the UAV near the center of legitimate users and adjacent to the IRS minimizes the average distance to users, thereby reducing path loss and fully exploiting IRS passive beamforming. This placement strengthens legitimate signals while suppressing the eavesdropping link, leading to enhanced secrecy performance.  Conclusions  This study addresses secure communication scenarios with multiple eavesdroppers by proposing an IRS-assisted secure resource allocation algorithm for UAV-enabled RSMA systems. An optimization problem is formulated to maximize the system secrecy rate under multiple constraints, including UAV transmit power, by jointly optimizing precoding vectors, common rate allocation, IRS configurations, and UAV positioning. Due to the non-convex nature of the problem, a hierarchical optimization framework is developed to decompose it into two subproblems. 
These are effectively solved using techniques such as SCA, SDR, Gaussian randomization, and PSO. Simulation results confirm that the proposed algorithm achieves substantial secrecy rate gains over three benchmark schemes, thereby validating its effectiveness.
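The secrecy-rate quantity being maximized throughout this abstract has a standard per-stream form: the positive part of the gap between the legitimate user's achievable rate and the eavesdropper's rate. A minimal sketch of this standard definition (not the paper's full multi-user, multi-eavesdropper objective):

```python
import math

def secrecy_rate(sinr_user, sinr_eve):
    """Per-stream secrecy rate in bits/s/Hz: the positive part of the
    difference between the legitimate link rate log2(1 + SINR_user)
    and the eavesdropping link rate log2(1 + SINR_eve)."""
    return max(0.0, math.log2(1 + sinr_user) - math.log2(1 + sinr_eve))
```

The IRS enters this expression indirectly: its phase shifts raise `sinr_user` (stronger legitimate link) while lowering `sinr_eve` (suppressed eavesdropping link), which is why adding reflecting elements increases the achievable secrecy rate in the results above.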
BIRD1445: Large-scale Multimodal Bird Dataset for Ecological Monitoring
WANG Hongchang, XIAN Fengyu, XIE Zihui, DONG Miaomiao, JIAN Haifang
Available online, doi: 10.11999/JEIT250647
Abstract:
  Objective  With the rapid advancement of Artificial Intelligence (AI) and growing demands in ecological monitoring, high-quality multimodal datasets have become essential for training and deploying AI models in specialized domains. Existing bird datasets, however, face notable limitations, including challenges in field data acquisition, high costs of expert annotation, limited representation of rare species, and reliance on single-modal data. To overcome these constraints, this study proposes an efficient framework for constructing large-scale multimodal datasets tailored to ecological monitoring. By integrating heterogeneous data sources, employing intelligent semi-automatic annotation pipelines, and adopting multi-model collaborative validation based on heterogeneous attention fusion, the proposed approach markedly reduces the cost of expert annotation while maintaining high data quality and extensive modality coverage. This work offers a scalable and intelligent strategy for dataset development in professional settings and provides a robust data foundation for advancing AI applications in ecological conservation and biodiversity monitoring.  Methods  The proposed multimodal dataset construction framework integrates multi-source heterogeneous data acquisition, intelligent semi-automatic annotation, and multi-model collaborative verification to enable efficient large-scale dataset development. The data acquisition system comprises distributed sensing networks deployed across natural reserves, incorporating high-definition intelligent cameras, custom-built acoustic monitoring devices, and infrared imaging systems, supplemented by standardized public data to enhance species coverage and modality diversity. 
The intelligent annotation pipeline is built upon four core automated tools: (1) spatial localization annotation leverages object detection algorithms to generate bounding boxes; (2) fine-grained classification employs Vision Transformer models for hierarchical species identification; (3) pixel-level segmentation combines detection outputs with SegGPT models to produce instance-level masks; and (4) multimodal semantic annotation uses Qwen large language models to generate structured textual descriptions. To ensure annotation quality and minimize manual verification costs, a multi-scale attention fusion verification mechanism is introduced. This mechanism integrates seven heterogeneous deep learning models, each with different feature perception capacities across local detail, mid-level semantic, and global contextual scales. A global weighted voting module dynamically assigns fusion weights based on model performance, while a prior knowledge-guided fine-grained decision module applies category-specific accuracy metrics and Top-K model selection to enhance verification precision and computational efficiency.  Results and Discussions  The proposed multi-scale attention fusion verification method dynamically assesses data quality based on heterogeneous model predictions, forming the basis for automated annotation validation. Through optimized weight allocation and category-specific verification strategies, the collaborative verification framework evaluates the effect of different model combinations on annotation accuracy. Experimental results demonstrate that the optimal verification strategy—achieved by integrating seven specialized models—outperforms all baseline configurations across evaluation metrics. Specifically, the method attains a Top-1 accuracy of 95.39% on the CUB-200-2011 dataset, exceeding the best-performing single-model baseline, which achieves 91.79%, thereby yielding a 3.60% improvement in recognition precision. 
The constructed BIRD1445 dataset, comprising 3.54 million samples spanning 1,445 bird species and four modalities, outperforms existing datasets in terms of coverage, quality, and annotation accuracy. It serves as a robust benchmark for fine-grained classification, density estimation, and multimodal learning tasks in ecological monitoring.  Conclusions  This study addresses the challenge of constructing large-scale multimodal datasets for ecological monitoring by integrating multi-source data acquisition, intelligent semi-automatic annotation, and multi-model collaborative verification. The proposed approach advances beyond traditional manual annotation workflows by incorporating automated labeling pipelines and heterogeneous attention fusion mechanisms as the core quality control strategy. Comprehensive evaluations on benchmark datasets and real-world scenarios demonstrate the effectiveness of the method: (1) the verification strategy improves annotation accuracy by 3.60% compared to single-model baselines on the CUB-200-2011 dataset; (2) optimal trade-offs between precision and computational efficiency are achieved using Top-K = 3 model selection, based on performance–complexity alignment; and (3) in large-scale annotation scenarios, the system ensures high reliability across 1,445 species categories. Despite its effectiveness, the current approach primarily targets species with sufficient data. Future work should address the representation of rare and endangered species by incorporating advanced data augmentation and few-shot learning techniques to mitigate the limitations posed by long-tail distributions.
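The global weighted voting with Top-K model selection described above can be sketched as follows. The code fuses per-model class probabilities with performance-based weights and optionally restricts the vote to the K highest-weighted models; the uniform weighting scheme and flat (non-category-specific) Top-K rule are simplifying assumptions relative to the paper's prior-knowledge-guided module.

```python
import numpy as np

def weighted_vote(probs, weights, top_k=None):
    """Fuse per-model class-probability vectors with performance-based
    weights and return the winning class index. If top_k is given, only
    the top_k highest-weighted models participate (illustrative Top-K
    selection)."""
    probs = np.asarray(probs, dtype=float)      # shape (n_models, n_classes)
    weights = np.asarray(weights, dtype=float)  # one weight per model
    if top_k is not None:
        keep = np.argsort(weights)[-top_k:]     # indices of strongest models
        probs, weights = probs[keep], weights[keep]
    fused = weights @ probs / weights.sum()     # weighted average of probs
    return int(np.argmax(fused))
```

In the verification pipeline, agreement between the fused vote and an automated annotation would mark that sample as validated, routing only disagreements to human experts.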
Optimal Federated Average Fusion of Gaussian Mixture–Probability Hypothesis Density Filters
XUE Yu, XU Lei
Available online, doi: 10.11999/JEIT250759
Abstract:
  Objective  To realize optimal decentralized fusion tracking of uncertain targets, this study proposes a federated average fusion algorithm for Gaussian Mixture–Probability Hypothesis Density (GM-PHD) filters, designed with a hierarchical structure. Each sensor node operates a local GM-PHD filter to extract multi-target state estimates from sensor measurements. The fusion node performs three key tasks: (1) maintaining a master filter that predicts the fusion result from the previous iteration; (2) associating and merging the GM-PHDs of all filters; and (3) distributing the fused result and several parameters to each filter. The association step decomposes multi-target density fusion into four categories of single-target estimate fusion. We derive the optimal single-target estimate fusion both in the absence and presence of missed detections. Information assignment applies the covariance upper-bounding theory to eliminate correlation among all filters, enabling the proposed algorithm to achieve the accuracy of Bayesian fusion. Simulation results show that the federated fusion algorithm achieves optimal tracking accuracy and consistently outperforms the conventional Arithmetic Average (AA) fusion method. Moreover, the relative reliability of each filter can be flexibly adjusted.  Methods  The multi-sensor multi-target density fusion is decomposed into multiple groups of single-target component merging through the association operation. Federated filtering is employed as the merging strategy, which achieves the Bayesian optimum owing to its inherent decorrelation capability. Section 3 rigorously extends this approach to scenarios with missed detections. To satisfy federated filtering’s requirement for prior estimates, a master filter is designed to compute the predicted multi-target density, thereby establishing a hierarchical architecture for the proposed algorithm. 
In addition, auxiliary measures are incorporated to compensate for the observed underestimation of cardinality.  Results and Discussions  Filtered components belonging to the same target are accurately associated by means of a modified Mahalanobis distance (Fig. 3). The precise association and the single-target decorrelation capability together ensure the theoretical optimality of the proposed algorithm, as illustrated in Fig. 2. Compared with conventional density fusion, the Optimal Sub-Pattern Assignment (OSPA) error is reduced by 8.17% (Fig. 4). The advantage of adopting a small average factor for the master filter is demonstrated in Figs. 5 and 6. The effectiveness of the measures for achieving cardinality consensus is also validated (Fig. 7). Another competitive strength of the algorithm lies in the flexibility of adjusting the average factors (Fig. 8). Furthermore, the algorithm consistently outperforms AA fusion across all missed detection probabilities (Fig. 9).  Conclusions  This paper achieves theoretically optimal multi-target density fusion by employing federated filtering as the merging method for single-target components. The proposed algorithm inherits the decorrelation capability and single-target optimality of federated filtering. A hierarchical fusion architecture is designed to satisfy the requirement for prior estimates. Extensive simulations demonstrate that: (1) the algorithm can accurately associate filtered components belonging to the same target, thereby extending single-target optimality to multi-target fusion tracking; (2) the algorithm supports flexible adjustment of average factors, with smaller values for the master filter consistently preferred; and (3) the superiority of the algorithm persists even under sensor malfunctions and high missed detection rates. Nonetheless, this study is limited to GM-PHD filters with overlapping Fields Of View (FOVs). Future work will investigate its applicability to other filter types and spatially non-overlapping FOVs.
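For a single associated target, the federated merging step reduces to a simple information-form fusion: each local covariance is inflated by its information-sharing factor (covariance upper-bounding, which removes cross-correlation between filters), and the inflated information matrices and information states are summed. The Gaussian single-target form and the factors `betas` below are illustrative assumptions for this sketch, not the full GM-PHD fusion.

```python
import numpy as np

def federated_fuse(estimates, betas):
    """Fuse local Gaussian estimates (x_i, P_i) in information form.
    Each covariance is first inflated by its information-sharing factor
    beta_i (covariance upper-bounding), then the information matrices
    P_i^{-1} and information states P_i^{-1} x_i are summed."""
    info = np.zeros_like(estimates[0][1])        # fused information matrix
    info_state = np.zeros_like(estimates[0][0])  # fused information state
    for (x, P), beta in zip(estimates, betas):
        Pi = np.linalg.inv(P / beta)             # inflate, then invert
        info += Pi
        info_state += Pi @ x
    P_fused = np.linalg.inv(info)
    return P_fused @ info_state, P_fused
```

With equal factors and identical local covariances, the fused state is the plain average of the local states, while unequal factors let the fusion node weight more reliable filters more heavily, matching the adjustable-reliability property noted above.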
Recent Advances of Programmable Schedulers
ZHAO Yazhu, GUO Zehua, DOU Songshi, FU Xiaoyang
Available online  , doi: 10.11999/JEIT250657
Abstract:
  Objective  In recent years, diversified user demands, dynamic application scenarios, and massive data transmissions have imposed increasingly stringent requirements on modern networks. Network schedulers play a critical role in ensuring efficient and reliable data delivery, enhancing overall performance and stability, and directly shaping user-perceived service quality. Traditional scheduling algorithms, however, rely largely on fixed hardware, with scheduling logic hardwired during chip design. These designs are inflexible, provide coarse and static scheduling granularity, and offer limited capability to represent complex policies. Therefore, they hinder rapid deployment, increase upgrade costs, and fail to meet the evolving requirements of heterogeneous and large-scale network environments. Programmable schedulers, in contrast, leverage flexible hardware architectures to support diverse strategies without hardware replacement. Scheduling granularity can be dynamically adjusted at the flow, queue, or packet level to meet varied application requirements with precision. Furthermore, they enable the deployment of customized logic through data plane programming languages, allowing rapid iteration and online updates. These capabilities significantly reduce maintenance costs while improving adaptability. The combination of high flexibility, cost-effectiveness, and engineering practicality positions programmable schedulers as a superior alternative to traditional designs. Therefore, the design and optimization of high-performance programmable schedulers have become a central focus of current research, particularly for data center networks and industrial Internet applications, where efficient, flexible, and controllable traffic scheduling is essential.  Methods  The primary objective of current research is to design universal, high-performance programmable schedulers. 
Achieving simultaneous improvements across multiple performance metrics, however, remains a major challenge. Hardware-based schedulers deliver high performance and stability but incur substantial costs and typically support only a limited range of scheduling algorithms, restricting their applicability in large-scale and heterogeneous network environments. In contrast, software-based schedulers provide flexibility in expressing diverse algorithms but suffer from inherent performance constraints. To integrate the high performance of hardware with the flexibility of software, recent designs of programmable schedulers commonly adopt First-In First-Out (FIFO) or Push-In First-Out (PIFO) queue architectures. These approaches emphasize two key performance metrics: scheduling accuracy and programmability. Scheduling accuracy is critical, as modern applications such as real-time communications, online gaming, telemedicine, and autonomous driving demand strict guarantees on packet timing and ordering. Even minor errors may result in increased latency, reduced throughput, or connection interruptions, compromising user experience and service reliability. Programmability, by contrast, enables network devices to adapt to diverse scenarios, supporting rapid deployment of new algorithms and flexible responses to application-specific requirements. Improvements in both accuracy and programmability are therefore essential for developing efficient, reliable, and adaptable network systems, forming the basis for future high-performance deployments.  Results and Discussions  The overall packet scheduling process is illustrated in Fig. 1, where scheduling is composed of scheduling algorithms and schedulers. At the ingress or egress pipelines of end hosts or network devices, scheduling algorithms assign a Rank value to each packet, determining the transmission order based on relative differences in Rank. 
Upon arrival at the traffic manager, the scheduler sorts and forwards packets according to their Rank values. Through the joint operation of algorithms and schedulers, packet scheduling is executed while meeting quality-of-service requirements. A comparative analysis of the fundamental principles of FIFO and PIFO scheduling mechanisms (Fig. 2) highlights their differences in queue ordering and disorder control. At present, most studies on programmable schedulers build upon these two foundational architectures (Fig. 3), with extensions and optimizations primarily aimed at improving scheduling accuracy and programmability. Specific strategies include admission control, refinement of scheduling algorithms, egress control, and advancements in data structures and queue mechanisms. On this basis, the current research progress on programmable schedulers is reviewed and systematically analyzed. Existing studies are compared along three key dimensions: structural characteristics, expressive capability, and approximation accuracy (Table 1).  Conclusions  Programmable schedulers, as a key technology for next-generation networks, enable flexible traffic management and open new possibilities for efficient packet scheduling. This review has summarized recent progress in the design of programmable schedulers across diverse application scenarios. The background and significance of programmable schedulers within the broader packet scheduling process were first clarified. An analysis of domestic and international literature shows that most current studies focus on FIFO-based and PIFO-based architectures to improve scheduling accuracy and programmability. The design approaches of these two architectures were examined, the main technical methods for enhancing performance were summarized, and their structural characteristics, expressive capabilities, and approximation accuracy were compared, highlighting respective advantages and limitations. 
Potential improvements in existing research were also identified, and future development directions were discussed. Nevertheless, the design of a universal, high-performance programmable scheduler remains a critical challenge. Achieving optimal performance across multiple metrics while ensuring high-quality network services will require continued joint efforts from both academia and industry.
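The PIFO abstraction that most of the surveyed designs build on can be sketched in a few lines: packets are pushed into an arbitrary position determined by their rank but dequeued strictly from the head. The minimal Python sketch below uses a binary heap with FIFO tie-breaking; the rank values and packet names are illustrative assumptions, not drawn from any surveyed system.

```python
import heapq
from itertools import count

class PIFO:
    """Push-In First-Out queue: enqueue at any rank, dequeue from the head."""
    def __init__(self):
        self._heap = []
        self._seq = count()  # arrival counter breaks rank ties FIFO-style

    def push(self, rank, pkt):
        heapq.heappush(self._heap, (rank, next(self._seq), pkt))

    def pop(self):
        # Always returns the smallest-rank (head-of-line) packet
        return heapq.heappop(self._heap)[2]

q = PIFO()
for rank, pkt in [(30, "c"), (10, "a"), (20, "b"), (10, "a2")]:
    q.push(rank, pkt)
order = [q.pop() for _ in range(4)]  # dequeued in rank order
```

A FIFO queue, by contrast, would return packets strictly in arrival order regardless of rank, which is why FIFO-based designs need extra admission or egress control to approximate rank-ordered scheduling.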
Research on ECG Pathological Signal Classification Empowered by Diffusion Generative Data
GE Beining, CHEN Nuo, JIN Peng, SU Xin, LU Xiaochun
Available online  , doi: 10.11999/JEIT250404
Abstract:
  Objective  ElectroCardioGram (ECG) signals are key indicators of human health. However, their complex composition and diverse features make visual recognition prone to errors. This study proposes a classification algorithm for ECG pathological signals based on data generation. A Diffusion Generative Network (DGN), also known as a diffusion model, progressively adds noise to real ECG signals until they approach a noise distribution, thereby facilitating model processing. To improve generation speed and reduce memory usage, a Knowledge Distillation-Diffusion Generative Network (KD-DGN) is proposed, which demonstrates superior memory efficiency and generation performance compared with the traditional DGN. This work compares the memory usage, generation efficiency, and classification accuracy of DGN and KD-DGN, and analyzes the characteristics of the generated data after lightweight processing. In addition, the classification effects of the original MIT-BIH dataset and an extended dataset (MIT-BIH-PLUS) are evaluated. Experimental results show that convolutional networks extract richer feature information from the extended dataset generated by DGN, leading to improved recognition performance of ECG pathological signals.  Methods  The generative network-based ECG signal generation algorithm is designed to enhance the performance of convolutional networks in ECG signal classification. The process begins with a Gaussian noise-based image perturbation algorithm, which obscures the original ECG data by introducing controlled randomness. This step simulates real-world variability, enabling the model to learn more robust representations. A diffusion generative algorithm is then applied to reconstruct and reproduce the data, generating synthetic ECG signals that preserve the essential characteristics of the original categories despite the added noise. 
This reconstruction ensures that the underlying features of ECG signals are retained, allowing the convolutional network to extract more informative features during classification. To improve efficiency, the approach incorporates knowledge distillation. A teacher-student framework is adopted in which a lightweight student model is trained from the original, more complex teacher ECG data generation model. This strategy reduces computational requirements and accelerates the data generation process, improving suitability for practical applications. Finally, two comparative experiments are designed to validate the effectiveness and accuracy of the proposed method. These experiments evaluate classification performance against existing approaches and provide quantitative evidence of its advantages in ECG signal processing.  Results and Discussions  The data generation algorithm yields ECG signals with a Signal-to-Noise Ratio (SNR) comparable to that of the original data, while presenting more discernible signal features. The student model constructed through knowledge distillation produces ECG samples with the same SNR as those generated by the teacher model, but with substantially reduced complexity. Specifically, the student model achieves a 50% reduction in size, 37.5% lower memory usage, and a 57% shorter runtime compared with the teacher model (Fig. 6). When the convolutional network is trained with data generated by the KD-DGN, its classification performance improves across all metrics compared with a convolutional network trained without KD-DGN. Precision reaches 95.7%, and the misidentification rate is reduced to approximately 3% (Fig. 9).  Conclusions  The DGN provides an effective data generation strategy for addressing the scarcity of ECG datasets. By supplying additional synthetic data, it enables convolutional networks to extract more diverse class-specific features, thereby improving recognition performance and reducing misidentification rates. 
Optimizing DGN with knowledge distillation further enhances efficiency, while maintaining SNR equivalence with the original DGN. This optimization reduces computational cost, conserves machine resources, and supports simultaneous task execution. Moreover, it enables the generation of new data without loss, allowing convolutional networks to learn from larger datasets at lower cost. Overall, the proposed approach markedly improves the classification performance of convolutional networks on ECG signals. Future work will focus on further algorithmic optimization for real-world applications.
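The progressive noising that a diffusion model applies to an ECG signal has, in the standard DDPM formulation, a closed form: x_t = √ᾱ_t·x_0 + √(1-ᾱ_t)·ε with ᾱ_t the cumulative product of (1-β). The sketch below illustrates this forward process on a synthetic beat; the linear β schedule and the sine stand-in signal are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Closed-form DDPM forward step: x_t = sqrt(a_bar_t)*x0 + sqrt(1-a_bar_t)*eps."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps, alpha_bar

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)              # linear noise schedule
ecg = np.sin(2 * np.pi * np.linspace(0, 1, 256))   # stand-in for a real beat
x_T, a_bar = forward_diffuse(ecg, 999, betas, rng)
# At the final step a_bar is tiny, so x_T is essentially pure Gaussian noise
```

The generative direction, learned by the network, reverses these steps; knowledge distillation as described above compresses that learned reverse process into a smaller student model.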
Cross Modal Hashing of Medical Image Semantic Mining for Large Language Model
LIU Qinghai, WU Qianlin, LUO Jia, TANG Lun, XU Liming
Available online  , doi: 10.11999/JEIT250529
Abstract:
  Objective  A novel cross-modal hashing framework driven by Large Language Models (LLMs) is proposed to address the semantic misalignment between medical images and their corresponding textual reports. The objective is to enhance cross-modal semantic representation and improve retrieval accuracy by effectively mining and matching semantic associations between modalities.  Methods  The generative capacity of LLMs is first leveraged to produce high-quality textual descriptions of medical images. These descriptions are integrated with diagnostic reports and structured clinical data using a dual-stream semantic enhancement module, designed to reinforce inter-modality alignment and improve semantic comprehension. A structural similarity-guided hashing scheme is then developed to encode both visual and textual features into a unified Hamming space, ensuring semantic consistency and enabling efficient retrieval. To further enhance semantic alignment, a prompt-driven attention template is introduced to fuse image and text features through fine-tuned LLMs. Finally, a contrastive loss function with hard negative mining is employed to improve representation discrimination and retrieval accuracy.  Results and Discussions  Experiments are conducted on a multimodal medical dataset to compare the proposed method with existing cross-modal hashing baselines. The results indicate that the proposed method significantly outperforms baseline models in terms of precision and Mean Average Precision (MAP) (Table 3; Table 4). On average, a 7.21% improvement in retrieval accuracy and a 7.72% increase in MAP are achieved across multiple data scales, confirming the effectiveness of the LLM-driven semantic mining and hashing approach.  Conclusions  The LLM-driven framework effectively mines and matches semantic associations between medical images and their reports, and the structural similarity-guided hashing preserves this alignment in the unified Hamming space. The average gains of 7.21% in retrieval accuracy and 7.72% in MAP over existing baselines confirm the practical value of combining LLM-based semantic mining with cross-modal hashing for medical image retrieval.
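The retrieval mechanics shared by such hashing methods can be sketched independently of the learned encoders: real-valued features are binarized into compact codes and database items are ranked by Hamming distance. In the sketch below the projection matrix is random and the feature vectors are synthetic stand-ins for the paper's learned image and text embeddings.

```python
import numpy as np

def to_hash(features, proj):
    """Project real features and binarize by sign into {0,1} codes."""
    return (features @ proj > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database codes by Hamming distance to the query code."""
    dists = (db_codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists

rng = np.random.default_rng(1)
proj = rng.standard_normal((128, 32))            # stand-in for a learned encoder
img = rng.standard_normal(128)                   # query-image embedding
txt = img + 0.01 * rng.standard_normal(128)      # semantically matched report
other = rng.standard_normal(128)                 # unrelated report

codes = to_hash(np.stack([txt, other]), proj)
order, dists = hamming_rank(to_hash(img[None], proj)[0], codes)
# The semantically matched report lands near the query in Hamming space
```

Because Hamming distances reduce to XOR-and-popcount operations, retrieval over millions of codes is fast, which is the efficiency motivation for hashing-based cross-modal search.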
Depression Screening Method Driven by Global-Local Feature Fusion
ZHANG Siyong, QIU Jiefan, ZHAO Xiangyun, XIAO Kejiang, CHEN Xiaofu, MAO Keji
Available online  , doi: 10.11999/JEIT250035
Abstract:
  Objective  Depression is a globally prevalent mental disorder that poses a serious threat to the physical and mental health of millions of individuals. Early screening and diagnosis are essential to reducing severe consequences such as self-harm and suicide. However, conventional questionnaire-based screening methods are limited by their dependence on the reliability of respondents’ answers, their difficulty in balancing efficiency with accuracy, and the uneven distribution of medical resources. New auxiliary screening approaches are therefore needed. Existing Artificial Intelligence (AI) methods for depression detection based on facial features primarily emphasize global expressions and often overlook subtle local cues such as eye features. Their performance also declines in scenarios where partial facial information is obscured, for instance by masks, and they raise privacy concerns. This study proposes a Global-Local Fusion Axial Network (GLFAN) for depression screening. By jointly extracting global facial and local eye features, this approach enhances screening accuracy and robustness under complex conditions. A corresponding dataset is constructed, and experimental evaluations are conducted to validate the method’s effectiveness. The model is deployed on edge devices to improve privacy protection while maintaining screening efficiency, offering a more objective, accurate, efficient, and secure depression screening solution that contributes to mitigating global mental health challenges.  Methods  To address the challenges of accuracy and efficiency in depression screening, this study proposes GLFAN. For long-duration consultation videos with partial occlusions such as masks, data preprocessing is performed using OpenFace 2.0 and facial keypoint algorithms, combined with peak detection, clustering, and centroid search strategies to segment the videos into short sequences capturing dynamic facial changes, thereby enhancing data validity. 
At the model level, GLFAN adopts a dual-branch parallel architecture to extract global facial and local eye features simultaneously. The global branch uses MTCNN for facial keypoint detection and enhances feature extraction under occlusion using an inverted bottleneck structure. The local branch detects eye regions via YOLO v7 and extracts eye movement features using a ResNet-18 network integrated with a convolutional attention module. Following dual-branch feature fusion, an integrated convolutional module optimizes the representation, and classification is performed using an axial attention network.  Results and Discussions  The performance of GLFAN is evaluated through comprehensive, multi-dimensional experiments. On the self-constructed depression dataset, high accuracy is achieved in binary classification tasks, and non-depression and severe depression categories are accurately distinguished in four-class classification. Under mask-occluded conditions, a precision of 0.72 and a recall of 0.690 are obtained for depression detection. Although these values are lower than the precision of 0.87 and recall of 0.840 observed under non-occluded conditions, reliable screening performance is maintained. Compared with other advanced methods, GLFAN achieves higher recall and F1 scores. On the public AVEC2013 and AVEC2014 datasets, the model achieves lower Mean Absolute Error (MAE) values and shows advantages in both short- and long-sequence video processing. Heatmap visualizations indicate that GLFAN dynamically adjusts its attention according to the degree of facial occlusion, demonstrating stronger adaptability than ResNet-50. Edge device tests further confirm that the average processing delay remains below 17.56 ms per frame, and stable performance is maintained under low-bandwidth conditions.  Conclusions  This study proposes a depression screening approach based on edge vision technology. A lightweight, end-to-end GLFAN is developed to address the limitations of existing screening methods. The model integrates global facial features extracted via MTCNN with local eye-region features captured by YOLO v7, followed by effective feature fusion and classification using an Axial Transformer module. By emphasizing local eye-region information, GLFAN enhances performance in occluded scenarios such as mask-wearing. Experimental validation using both self-constructed and public datasets demonstrates that GLFAN reduces missed detections and improves adaptability to short-duration video inputs compared with existing models. 
Grad-CAM visualizations further reveal that GLFAN prioritizes eye-region features under occluded conditions and shifts focus to global facial features when full facial information is available, confirming its context-specific adaptability. The model has been successfully deployed on edge devices, offering a lightweight, efficient, and privacy-conscious solution for real-time depression screening.
Precise Hand Joint Motion Analysis Driven by Complex Physiological Information
YAN Jiaqing, LIU Gengchen, ZHOU Qingqi, XUE Weiqi, ZHOU Weiao, TIAN Yunzhi, WANG Jiaju, DONG Zhekang, LI Xiaoli
Available online  , doi: 10.11999/JEIT250033
Abstract:
  Objective  The human hand is a highly dexterous organ essential for performing complex tasks. However, dysfunction due to trauma, congenital anomalies, or disease substantially impairs daily activities. Restoring hand function remains a major challenge in rehabilitation medicine. Virtual Reality (VR) technology presents a promising approach for functional recovery by enabling hand pose reconstruction from surface ElectroMyoGraphy (sEMG) signals, thereby facilitating neural plasticity and motor relearning. Current sEMG-based hand pose estimation methods are limited by low accuracy and coarse joint resolution. This study proposes a new method to estimate the motion of 15 hand joints using eight-channel sEMG signals, offering a potential improvement in rehabilitation outcomes and quality of life for individuals with hand impairment.  Methods  The proposed method, termed All Hand joints Posture Estimation (AHPE), incorporates a continuous denoising network that combines sparse attention and multi-channel attention mechanisms to extract spatiotemporal features from sEMG signals. A dual-decoder architecture estimates both noisy hand poses and the corresponding correction ranges. These outputs are subsequently refined using a Bidirectional Long Short-Term Memory (BiLSTM) network to improve pose accuracy. Model training employs a composite loss function that integrates Mean Squared Error (MSE) and Kullback-Leibler (KL) divergence to enhance joint angle estimation and capture inter-joint dependencies. Performance is evaluated using the NinaproDB8 and NinaproDB5 datasets, which provide sEMG and hand pose data for single-finger and multi-finger movements, respectively.  Results and Discussions  The AHPE model outperforms existing methods—including CNN-Transformer, DKFN, CNN-LSTM, TEMPOnet, and RPC-Net—in estimating hand poses from multi-channel sEMG signals. 
In within-subject validation (Table 1), AHPE achieves a Root Mean Squared Error (RMSE) of 2.86, a coefficient of determination (R2) of 0.92, and a Mean Absolute Deviation (MAD) of 1.79° for MetaCarPophalangeal (MCP) joint rotation angle estimation. In between-subject validation (Table 2), the model maintains high accuracy with an RMSE of 3.72, an R2 of 0.88, and an MAD of 2.36°, demonstrating strong generalization. The model’s capacity to estimate complex hand gestures is further confirmed using the NinaproDB5 dataset. Estimated hand poses are visualized with the ManoTorch hand model (Fig. 4, Fig. 5). The average R2 values for finger joint extension estimation are 0.72 (thumb), 0.692 (index), 0.696 (middle), 0.689 (ring), and 0.696 (little finger). Corresponding RMSE values are 10.217°, 10.257°, 10.290°, 10.293°, and 10.303°, respectively. A grid error map (Fig. 6) highlights prediction accuracy, with red regions indicating higher errors.  Conclusions  The AHPE model offers an effective approach for estimating hand poses from sEMG signals, addressing key challenges such as signal noise, high dimensionality, and inter-individual variability. By integrating mixed attention mechanisms with a dual-decoder architecture, the model enhances both accuracy and robustness in multi-joint hand pose estimation. Results confirm the model’s capacity to reconstruct detailed hand kinematics, supporting its potential for applications in hand function rehabilitation and human-machine interaction. Future work will aim to improve robustness under real-world conditions, including sensor noise and environmental variation.
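The composite training objective (MSE plus a KL-divergence term) can be illustrated in a few lines. Treating the KL term as a divergence between softmax-normalized angle profiles, as below, is an assumption for illustration; the paper's exact formulation may pair the distributions differently. The joint-angle values are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def composite_loss(pred, target, lam=0.1, eps=1e-12):
    """MSE on joint angles plus KL divergence between normalized angle profiles."""
    mse = np.mean((pred - target) ** 2)
    p, q = softmax(target), softmax(pred)          # target / predicted profiles
    kl = np.sum(p * np.log((p + eps) / (q + eps)))
    return mse + lam * kl

angles = np.array([10.0, 35.0, 20.0, 5.0])   # hypothetical joint angles (degrees)
perfect = composite_loss(angles, angles)     # exact prediction -> zero loss
noisy = composite_loss(angles + 1.0, angles) # biased prediction -> positive loss
```

The MSE term penalizes per-joint angular error, while the KL term encourages the predicted profile across joints to match the target's relative shape, which is one way to capture the inter-joint dependencies the abstract mentions.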
Breakthrough in Solving NP-Complete Problems Using Electronic Probe Computers
XU Jin, YU Le, YANG Huihui, JI Siyuan, ZHANG Yu, YANG Anqi, LI Quanyou, LI Haisheng, ZHU Enqiang, SHI Xiaolong, WU Pu, SHAO Zehui, LENG Huang, LIU Xiaoqing
Available online  , doi: 10.11999/JEIT250352
Abstract:
This study presents a breakthrough in addressing NP-complete problems using a newly developed Electronic Probe Computer (EPC60). The system employs a hybrid serial–parallel computational model and performs large-scale parallel operations through seven probe operators. In benchmark tests on 3-coloring problems in graphs with 2,000 vertices, EPC60 achieves 100% accuracy, outperforming the mainstream solver Gurobi, which succeeds in only 6% of cases. Computation time is reduced from 15 days to 54 seconds. The system demonstrates high scalability and offers a general-purpose solution for complex optimization problems in areas such as supply chain management, finance, and telecommunications.  Objective   NP-complete problems pose a fundamental challenge in computer science. As problem size increases, the required computational effort grows exponentially, making it infeasible for traditional electronic computers to provide timely solutions. Alternative computational models have been proposed, with biological approaches (particularly DNA computing) demonstrating notable theoretical advances. However, DNA computing systems continue to face major limitations in practical implementation.  Methods  Computational model: EPC is based on a non-Turing computational model in which data are multidimensional and processed in parallel. Its database comprises four types of graphs, and the probe library includes seven operators, each designed for specific graph operations. By executing parallel probe operations, EPC efficiently addresses NP-complete problems. Structural features: EPC consists of four subsystems: a conversion system, an input system, a computation system, and an output system. 
The conversion system transforms the target problem into a graph coloring problem; the input system allocates tasks to the computation system; the computation system performs parallel operations via probe computation cards; and the output system maps the solution back to the original problem format. EPC60 features a three-tier hierarchical hardware architecture comprising a control layer, optical routing layer, and probe computation layer. The control layer manages data conversion, format transformation, and task scheduling. The optical routing layer supports high-throughput data transmission, while the probe computation layer conducts large-scale parallel operations using probe computation cards.  Results and Discussions  EPC60 successfully solved 100 instances of the 3-coloring problem for graphs with 2,000 vertices, achieving a 100% success rate. In comparison, the mainstream solver Gurobi succeeded in only 6% of cases. Additionally, EPC60 rapidly solved two 3-coloring problems for graphs with 1,500 and 2,000 vertices, which Gurobi failed to resolve after 15 days of continuous computation on a high-performance workstation. Using an open-source dataset, we identified 1,000 3-colorable graphs with 1,000 vertices and 100 3-colorable graphs with 2,000 vertices. These correspond to theoretical complexities of O(1.3289^n) for both cases. The test results are summarized in Table 1. Currently, EPC60 can directly solve 3-coloring problems for graphs with up to n vertices, with theoretical complexity of at least O(1.3289^n). On April 15, 2023, a scientific and technological achievement appraisal meeting organized by the Chinese Institute of Electronics was held at Beijing Technology and Business University. A panel of ten senior experts conducted a comprehensive technical evaluation and Q&A session. The committee reached the following unanimous conclusions: (1) the probe computer represents an original breakthrough in computational models; (2) the system architecture design demonstrates significant innovation; (3) the technical complexity reaches internationally leading levels; and (4) it provides a novel approach to solving NP-complete problems. Experts at the appraisal meeting stated, “This is a major breakthrough in computational science achieved by our country, with not only theoretical value but also broad application prospects.” In cybersecurity, EPC60 has also demonstrated remarkable potential. Supported by the National Key R&D Program of China (2019YFA0706400), Professor Xu Jin’s team developed an automated binary vulnerability mining system based on a function call graph model. Evaluation of the system using the Modbus Slave software showed over 95% vulnerability coverage, far exceeding the 75 vulnerabilities detected by conventional depth-first search algorithms. The system also discovered a previously unknown flaw, the “Unauthorized Access Vulnerability in Changyuan Shenrui PRS-7910 Data Gateway” (CNVD-2020-31406), highlighting EPC60’s efficacy in cybersecurity applications. The high efficiency of EPC60 derives from its unique computational model and hardware architecture. Given that all NP-complete problems can be polynomially reduced to one another, EPC60 provides a general-purpose solution framework. It is therefore expected to be applicable in a wide range of domains, including supply chain management, financial services, telecommunications, energy, and manufacturing.  Conclusions   The successful development of EPC offers a novel approach to solving NP-complete problems. As technological capabilities continue to evolve, EPC is expected to demonstrate strong computational performance across a broader range of application domains. Its distinctive computational model and hardware architecture also provide important insights for the design of next-generation computing systems.
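To make the benchmark problem concrete, the sketch below solves graph 3-coloring by classical backtracking search, the kind of exact baseline whose worst-case cost grows exponentially (the O(1.3289^n) bound cited above). It illustrates the problem only; it does not model EPC60's probe operators or parallel architecture.

```python
def three_color(n, edges):
    """Backtracking search for a proper 3-coloring; returns None if none exists."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n

    def place(v):
        if v == n:
            return True
        for c in range(3):
            # A color is legal if no already-colored neighbor uses it
            if all(color[u] != c for u in adj[v]):
                color[v] = c
                if place(v + 1):
                    return True
        color[v] = None  # undo on backtrack
        return False

    return color if place(0) else None

# The 5-cycle needs exactly 3 colors; the complete graph K4 needs 4
c5 = three_color(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)])
k4 = three_color(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
```

On toy graphs this finishes instantly, but the search tree explodes with vertex count, which is why a 2,000-vertex instance can occupy a conventional solver for days.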
Personalized Federated Learning Method Based on Coalition Game and Knowledge Distillation
SUN Yanhua, SHI Yahui, LI Meng, YANG Ruizhe, SI Pengbo
Available online  , doi: 10.11999/JEIT221203
Abstract:
To overcome the limitations of Federated Learning (FL) when both the data and the model of each client are heterogeneous, and to improve accuracy, a personalized Federated learning algorithm with Coalition game and Knowledge distillation (pFedCK) is proposed. First, each client uploads its soft predictions on a public dataset and downloads the k most correlated soft predictions. Then, the Shapley value from coalition game theory is applied to measure the multi-wise influence among clients and to quantify each client’s marginal contribution to the personalized learning performance of the others. Finally, each client identifies its optimal coalition, distills the coalition’s knowledge into its local model, and trains on its private dataset. The results show that, compared with state-of-the-art algorithms, this approach achieves superior personalized accuracy, with improvements of about 10%.
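The Shapley-value computation at the heart of pFedCK can be sketched by exact enumeration over player orderings, which is feasible only for a small number of clients (the paper restricts attention to the k most correlated ones). The accuracy-gain value function below is a hypothetical example, not data from the paper.

```python
from itertools import permutations

def shapley(players, value):
    """Exact Shapley values: average each player's marginal gain over all orderings."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
    return {p: v / len(perms) for p, v in phi.items()}

# Hypothetical value function: accuracy gain a coalition brings to one client.
# A and B each help a little alone but are complementary together.
gains = {frozenset(): 0.0, frozenset("A"): 0.10,
         frozenset("B"): 0.10, frozenset("AB"): 0.30}
phi = shapley("AB", lambda s: gains[s])
```

Symmetric players split the total surplus equally here, and the values always sum to the grand-coalition gain, which is what makes the Shapley value a principled way to quantify each client's marginal contribution.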
The Range-angle Estimation of Target Based on Time-invariant and Spot Beam Optimization
Wei CHU, Yunqing LIU, Wenyug LIU, Xiaolong LI
Available online  , doi: 10.11999/JEIT210265
Abstract:
The application of Frequency Diverse Array-Multiple Input Multiple Output (FDA-MIMO) radar to the joint range-angle estimation of targets has attracted increasing attention. The FDA can simultaneously obtain transmit-beampattern degrees of freedom in both angle and range. However, estimation performance is degraded by the periodicity and time-varying nature of the beampattern. Therefore, an improved Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm is proposed to estimate the target parameters, based on a new waveform synthesis model of the Time Modulation and Range Compensation FDA-MIMO (TMRC-FDA-MIMO) radar. Finally, the proposed method is compared with an identical-frequency-increment FDA-MIMO radar system, a logarithmically-increasing-frequency-offset FDA-MIMO radar system, and the MUltiple SIgnal Classification (MUSIC) algorithm in terms of the Cramér-Rao lower bound and the root mean square error of range and angle estimation, and the superior performance of the proposed method is verified.
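The rotational-invariance principle behind ESPRIT reduces, in the single-source toy case, to reading a parameter off the phase rotation between two shifted subarrays: if x[n+1] = e^{jω}x[n], then ω falls out of a least-squares fit between the subarrays. The sketch below is this generic textbook case; it does not model the TMRC-FDA-MIMO waveform or the joint range-angle setting.

```python
import numpy as np

def esprit_freq(x):
    """Single-tone ESPRIT: the shift invariance x[n+1] = exp(j*w) * x[n]
    gives w as the phase of the least-squares rotation between the two
    maximally overlapping subarrays."""
    a, b = x[:-1], x[1:]
    return np.angle(np.vdot(a, b) / np.vdot(a, a))

w_true = 0.7                      # rad/sample; stands in for an angle/range phase term
n = np.arange(64)
x = np.exp(1j * w_true * n)       # noiseless complex exponential
w_hat = esprit_freq(x)
```

In the full FDA-MIMO problem the same invariance structure appears in the transmit and receive steering vectors, with two rotation operators yielding the angle and (after compensation) range estimates jointly.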
The 3rd Intelligent Aerospace Forum - Special Topic on Intelligent Processing and Application Technology of Satellite Information
Remote Sensing Data Intelligent Interpretation Task Scheduling Algorithm Based on Heterogeneous Platform
HAO Lijiang, TIAN Luyun, SUN Peng, CHEN Jian, LIU Pengying, HE Guangjun, LOU Shuqin
Available online  , doi: 10.11999/JEIT251072
Abstract:
  Objective   Remote sensing data intelligent interpretation tasks executed on heterogeneous platforms exhibit diverse task types, heterogeneous resources, sensitivity to real-time environmental disturbances, and inter-task resource contention. These characteristics often lead to load imbalance and reduced resource utilization across platforms. Therefore, adaptive and efficient scheduling of complex multi-task workloads in resource-heterogeneous environments remains a central challenge in heterogeneous platform task scheduling.  Methods   A Heterogeneous Remote Sensing-Intelligent Task Scheduling (HRS-ITS) algorithm is proposed. The CP-SAT optimizer is enhanced by incorporating four score factors (data affinity, load balancing, makespan prediction, and cross-device transmission efficiency) as optimization objectives to generate an initial task-resource mapping. An adaptive resource-scaling-based Dueling Double Deep Q-Network (D3QN) model is then constructed to optimize task execution sequences for makespan reduction. Resource allocation is dynamically adjusted to eliminate idle time during task queuing, enabling dynamic resource perception and configuration optimization.  Results and Discussions   By integrating static optimization with dynamic adaptation, the HRS-ITS algorithm improves scheduling efficiency and resource utilization on heterogeneous platforms, providing an effective solution for complex remote sensing data intelligent interpretation tasks.  Conclusions   The proposed framework combines global optimization with dynamic adaptation to achieve computationally efficient real-time remote sensing processing. It provides a basis for extension to more complex task dependencies and larger-scale clusters.
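As a simple reference point for what the CP-SAT plus D3QN pipeline improves upon, a classical greedy earliest-finish-time list scheduler for heterogeneous devices is sketched below. The task sizes and device speeds are illustrative assumptions; real interpretation workloads would also carry the data-affinity and transmission-cost terms the abstract describes.

```python
def eft_schedule(tasks, speeds):
    """Greedy earliest-finish-time assignment on heterogeneous devices.

    tasks: list of work amounts; speeds: per-device throughput.
    Each task is placed on the device that would finish it soonest
    given the device's current queue, largest tasks first.
    """
    finish = [0.0] * len(speeds)        # current ready time per device
    plan = []
    for work in sorted(tasks, reverse=True):
        d = min(range(len(speeds)), key=lambda i: finish[i] + work / speeds[i])
        finish[d] += work / speeds[d]
        plan.append((work, d))
    return plan, max(finish)            # task-device mapping and makespan

# Four tasks on a fast device (speed 2.0) and a slow one (speed 1.0)
plan, makespan = eft_schedule([8, 4, 4, 2], speeds=[2.0, 1.0])
```

A constraint solver like CP-SAT can beat such greedy mappings by optimizing globally, and a learned policy can then adapt the execution order online when runtimes deviate from predictions.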
A One-Shot Object Detection Method Fusing Dual-Branch Optimized SAM and Global-Local Collaborative Matching
FAN Shenghua, YIN Hang, LIU Jian, QU Tao
Available online  , doi: 10.11999/JEIT250982
Abstract:
  Objective  DOS-GLNet targets high-precision object recognition and localization through hierarchical collaboration of model components, using only a single query image with novel category prototypes and a target image. The method follows a two-layer architecture consisting of feature extraction and matching interaction. In the feature extraction layer, the Segment Anything Model (SAM) is adopted as the base extractor and is fine-tuned using a dual-branch strategy. This strategy preserves SAM’s general visual and category-agnostic perception while enhancing local spatial detail representation. A multi-scale module is further incorporated to construct a feature pyramid and address the single-scale limitation of SAM. In the matching interaction layer, a global-local collaborative two-stage matching mechanism is designed. The Global Matching Module (GMM) performs coarse-grained semantic alignment by suppressing background responses and guiding the Region Proposal Network (RPN) to generate high-quality candidate regions. The Bidirectional Local Matching Module (BLMM) then establishes fine-grained spatial correspondence between candidate regions and the query image to capture part-level associations.  Methods  A detection network based on Dual-Branch Optimized SAM and Global-Local Collaborative Matching, termed DOS-GLNet, is proposed. The main contributions are as follows. (1) In the feature matching stage, a two-stage global-local matching mechanism is constructed. A GMM is embedded before the RPN to achieve robust matching of overall target features using a large receptive field. (2) A BLMM is embedded before the detection head to capture pixel-level, bidirectional fine-grained semantic correlation through a four-dimensional correlation tensor. This progressive matching strategy establishes cross-sample correlations in the feature space, optimizes feature representation, and improves object localization accuracy.  
Results and Discussions  On the Pascal VOC dataset, the proposed method is compared with SiamRPN, which was originally developed for one-shot tracking and is adapted for detection due to task similarity, as well as OSCD, CoAE, AIT, BHRL, and BSPG. The results show that the proposed method outperforms all baseline approaches and achieves stronger overall one-shot detection performance. On the MS COCO dataset, comparative methods include SiamMask, CoAE, AIT, BHRL, and BSPG. Although base and novel class performance varies across different data splits, consistent trends are observed. DOS-GLNet matches state-of-the-art performance on base classes while maintaining strong accuracy on fully trained categories. It further achieves state-of-the-art results on novel classes, with an average improvement of approximately 2%. These results indicate more effective feature alignment and relationship modeling based on one-shot samples, as well as improved representation of novel class features under limited prior information.  Conclusions  To improve feature optimization in one-shot object detection, enhancements are introduced at both the backbone network and the matching mechanism levels. A DOS-GLNet framework based on dual-branch optimized SAM and global-local collaborative matching is proposed. For the backbone, a SAM-based dual-branch fine-tuning feature extraction network is constructed. Lightweight adapters are integrated into the SAM encoder to enable parameter-efficient fine-tuning, preserving generalization capability while improving task adaptability. In parallel, a convolutional local branch is designed to strengthen local feature perception, and cross-layer fusion is applied to enhance local detail representation. A multi-scale module further increases the scale diversity of the feature pyramid. For feature matching, a two-stage global-local collaborative strategy is adopted.
Global matching focuses on target-level semantic alignment, whereas local matching refines instance-level detail discrimination. Together, these designs effectively improve one-shot object detection performance.
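The four-dimensional correlation tensor behind the bidirectional local matching can be sketched in a few lines. This is an illustrative toy (plain dot-product scores on tiny hand-made grids), not the BLMM implementation; real features come from the SAM backbone and the scoring details differ.

```python
# Toy sketch of a 4-D correlation tensor between a candidate-region feature
# grid and a query feature grid, plus a bidirectional matching score.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def correlation_4d(cand, query):
    """corr[i][j][k][l] = <cand[i][j], query[k][l]> over all position pairs."""
    return [[[[dot(cand[i][j], query[k][l])
               for l in range(len(query[0]))]
              for k in range(len(query))]
             for j in range(len(cand[0]))]
            for i in range(len(cand))]

def bidirectional_score(corr):
    """Average best match in both directions (candidate->query, query->candidate)."""
    H1, W1 = len(corr), len(corr[0])
    H2, W2 = len(corr[0][0]), len(corr[0][0][0])
    fwd = [max(corr[i][j][k][l] for k in range(H2) for l in range(W2))
           for i in range(H1) for j in range(W1)]
    bwd = [max(corr[i][j][k][l] for i in range(H1) for j in range(W1))
           for k in range(H2) for l in range(W2)]
    return 0.5 * (sum(fwd) / len(fwd) + sum(bwd) / len(bwd))

# toy 1x2 candidate grid and 1x1 query grid of 2-dim features
cand = [[[1.0, 0.0], [0.0, 1.0]]]
query = [[[1.0, 0.0]]]
print(bidirectional_score(correlation_4d(cand, query)))  # 0.75
```

The bidirectional reduction is what lets the matcher penalize candidates that match the query only one way, which is the intuition behind the part-level correspondence described above.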
PATC: Prototype Alignment and Topology-Consistent Pseudo-Supervision for Multimodal Semi-Supervised Semantic Segmentation of Remote Sensing Images
HAN Wenqi, JIANG Wen, GENG Jie, BAO Yanchen
Available online  , doi: 10.11999/JEIT251115
Abstract:
  Objective   The high annotation cost of remote sensing data and the heterogeneity between optical and SAR modalities limit the performance and scalability of semantic segmentation systems. This study examines a practical semi-supervised setting where only a small set of paired optical–SAR samples is labeled, whereas numerous single-modality SAR images remain unlabeled. The objective is to design a semi-supervised multimodal framework capable of learning discriminative and topology-consistent fused representations under sparse labels by aligning cross-modal semantics and preserving structural coherence through pseudo-supervision. The proposed Prototype Alignment and Topology-Consistent (PATC) method aims to achieve robust land-cover segmentation on challenging multimodal datasets, improving region-level accuracy and connectivity-aware structure quality.  Methods   PATC adopts a teacher–student framework that exploits limited labeled optical–SAR pairs and abundant unlabeled SAR data. A shared semantic prototype space is first constructed to reduce modality gaps, where class prototypes are updated with a momentum mechanism for stability. A prototype-level contrastive alignment strategy enhances intra-class compactness and inter-class separability, guiding optical and SAR features of the same category to cluster around unified prototypes and improving cross-modal semantic consistency. To preserve structural integrity, a topology-consistent pseudo-supervision mechanism is incorporated. Inspired by persistent homology, a topology-aware loss constrains the teacher-generated pseudo-labels by penalizing errors such as incorrect formation or removal of connected components and holes. This structural constraint complements pixel-wise losses by maintaining boundary continuity and fine structures (e.g., roads and rivers), ensuring that pseudo-supervised learning remains geometrically and topologically coherent.  
Results and Discussions   Experiments show that PATC reduces cross-modal semantic misalignment and topology degradation. By regularizing pseudo-labels with a topology-consistent loss derived from persistent homology, the method preserves connectivity and boundary integrity, especially for thin or fragmented structures. Evaluations on the WHU-OPT-SAR and Suzhou datasets demonstrate consistent improvements over state-of-the-art fully supervised and semi-supervised baselines under 1/16, 1/8, and 1/4 label regimes (Fig. 4, Fig. 5, Fig. 6; Table 3, Table 4). Ablation studies confirm the complementary roles of prototype alignment and topology regularization (Table 5). The findings indicate that unlabeled SAR data provides structural priors that, when used through topology-aware consistency and prototype-level alignment, substantially enhance multimodal fusion under sparse annotation.  Conclusions   This study proposes PATC, a multimodal semi-supervised semantic segmentation method that addresses limited annotations, modality misalignment, and weak generalization. PATC constructs multimodal semantic prototypes in a shared feature subspace and applies prototype-level contrastive learning to improve cross-modal consistency and feature discriminability. A topology-consistent loss based on persistent homology further regularizes the student network, improving the connectivity and structural stability of segmentation results. By incorporating structural priors from unlabeled SAR data within a teacher–student framework with EMA updates, PATC achieves robust multimodal feature fusion and accurate segmentation under scarce labels. Future work will expand topology-based pseudo-supervision to broader multimodal configurations and integrate active learning to refine pseudo-label quality.
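The momentum update that keeps the shared class prototypes stable is a standard exponential moving average; a minimal sketch follows. The momentum value, feature dimensionality, and toy features are assumptions for illustration, not PATC's actual hyperparameters.

```python
# Minimal sketch of the momentum (EMA) prototype update used to maintain a
# shared optical/SAR class-prototype space: proto <- m*proto + (1-m)*feat.

def update_prototype(proto, feat, momentum=0.9):
    return [momentum * p + (1.0 - momentum) * f for p, f in zip(proto, feat)]

proto = [0.0, 0.0]
# class features arriving from either modality over training steps
for feat in ([1.0, 0.0], [1.0, 0.0], [0.8, 0.2]):
    proto = update_prototype(proto, feat)
print(proto)  # drifts slowly toward the class mean
```

Because the prototype moves slowly, the contrastive alignment loss pulls optical and SAR features of a class toward a common, stable target rather than chasing noisy per-batch statistics.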
A Deep Reinforcement Learning Enhanced Adaptive Large Neighborhood Search for Imaging Satellite Scheduling
WEI Puyuan, HE Lei
Available online  , doi: 10.11999/JEIT251009
Abstract:
  Objective  The Satellite Scheduling Problem (SSP) is a typical NP-hard combinatorial optimization problem. The objective is to maximize observation benefits or the number of completed tasks under complex physical and operational constraints. Adaptive Large Neighborhood Search (ALNS) is an effective metaheuristic for this class of problems; however, its performance strongly depends on the selection of destroy and repair operators. Traditional ALNS methods usually employ heuristic scoring mechanisms based on historical performance to adjust operator selection probabilities. These mechanisms are sensitive to parameter settings and cannot adapt dynamically to complex state changes during the search process. This study aims to address this limitation and proposes an improved algorithm to enhance ALNS performance for SSP.  Methods  To achieve this objective, a Deep Reinforcement Learning based Adaptive Large Neighborhood Search algorithm (DR-ALNS) is proposed. The operator selection process is formulated as a Markov Decision Process (MDP). A Deep Reinforcement Learning (DRL) agent is employed to select destroy and repair operators dynamically according to the current solution state at each iteration. Through end-to-end learning, the DRL agent acquires an implicit and efficient operator selection strategy. This strategy guides the search process and improves both global exploration and local exploitation. Experiments are conducted on a standard satellite scheduling test suite, and the results indicate that DR-ALNS outperforms conventional ALNS and other comparison algorithms in solution quality and convergence speed.  Results and Discussions  To verify the effectiveness of DR-ALNS, experiments are conducted on 100 scenarios selected from the Tianzhi-Cup dataset. These scenarios are classified into small, medium, and large categories based on the number of task strips. 
The experimental results are summarized in Table 4, and detailed comparisons of average scores across scenario types are reported in Table 5. In small scenarios, the average score of DR-ALNS is 0.8% higher than that of the comparison algorithms. In medium scenarios, the average score exceeds that of the second-ranked algorithm by 2.5%. In large scenarios, DR-ALNS outperforms the second-ranked algorithm by 3.6%.  Conclusions  A DR-ALNS model for the SSP is proposed. By integrating DRL, destroy and repair operator selection and destruction coefficient settings in ALNS are dynamically guided through iterative learning of solution states. This strategy accelerates convergence toward high-quality solutions. Experiments on the Tianzhi-Cup dataset confirm the effectiveness of the proposed method, with clear advantages over A-ALNS and GRILS, particularly in large-scale satellite cluster scheduling. Future studies will evaluate the method in ultra-large-scale scenarios to assess stability and will explore adaptation to dynamic constraints to enhance practical applicability.
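The formulation of operator selection as an MDP can be sketched with a toy search loop. A tabular epsilon-greedy Q-learner stands in for the DRL agent of DR-ALNS here, and the two-value state, operator list, and synthetic "evaluation" are all illustrative assumptions; the real algorithm learns from rich solution-state features.

```python
# Toy sketch: ALNS operator selection as an MDP, with tabular Q-learning
# standing in for the DRL policy. The search dynamics are synthetic.
import random

random.seed(0)

OPS = ["random_destroy", "worst_destroy", "greedy_repair", "regret_repair"]

def select(Q, state, eps=0.2):
    """Epsilon-greedy stand-in for the learned operator-selection policy."""
    if random.random() < eps:
        return random.randrange(len(OPS))
    return max(range(len(OPS)), key=lambda a: Q[state][a])

def q_update(Q, s, a, r, s2, alpha=0.1, gamma=0.9):
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

Q = {s: [0.0] * len(OPS) for s in ("improving", "stuck")}
state, best = "stuck", 100.0          # best = current objective (minimize)
for _ in range(200):
    a = select(Q, state)
    # stand-in for destroy/repair + evaluation; operator 3 improves more often
    new = best - (random.random() * (1.5 if a == 3 else 1.0) - 0.4)
    reward = 1.0 if new < best else -0.1
    nxt = "improving" if new < best else "stuck"
    q_update(Q, state, a, reward, nxt)
    state, best = nxt, min(best, new)
print(best)
```

The point of the MDP view is visible even in this toy: the selection probabilities react to the current search state rather than to fixed historical scores, which is the limitation of classical ALNS that DR-ALNS targets.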
Special Topic on Converged Cloud and Network Environment
Robust Resource Allocation Algorithm for Active Reconfigurable Intelligent Surface-Assisted Symbiotic Secure Communication Systems
MA Rui, LI Yanan, TIAN Tuanwei, LIU Shuya, DENG Hao, ZHANG Jinlong
Available online  , doi: 10.11999/JEIT250811
Abstract:
  Objective  Research on Reconfigurable Intelligent Surface (RIS)-assisted symbiotic radio systems is mainly centered on passive RIS. In practice, passive RIS suffers from a pronounced double-fading effect, which restricts capacity gains in scenarios dominated by strong direct paths. This work examines the use of active RIS, whose amplification capability increases the signal-to-noise ratio of the secondary signal and strengthens the security of the primary signal. Imperfect Successive Interference Cancellation (SIC) is considered, and a penalized Successive Convex Approximation (SCA) algorithm based on alternating optimization is analyzed to enable robust resource allocation.  Methods  The original optimization problem is difficult to address directly because it contains complex and non-convex constraints. An alternating optimization strategy is therefore adopted to decompose the problem into two subproblems: the design of the transmit beamforming vector at the primary transmitter and the design of the reflection coefficient matrix at the active RIS. Variable substitution, equivalent transformation, and a penalty-based SCA method are then applied in an alternating iterative manner. For the beamforming design, the rank-one constraint is first transformed into an equivalent form. The penalty-based SCA method is used to recover the rank-one optimal solution, after which iterative optimization is carried out to obtain the final result. For the reflection coefficient matrix design, the problem is reformulated and auxiliary variables are introduced to avoid feasibility issues. A penalty-based SCA approach is then used to handle the rank-one constraint, and the solution is obtained using the CVX toolbox. Based on these procedures, a penalty-driven robust resource allocation algorithm is established through alternating optimization.  
Results and Discussions  The convergence curves of the proposed algorithm under different numbers of primary transmitter antennas (K) and RIS reflecting elements (N) are shown (Fig. 3). The total system power consumption decreases as the number of iterations increases and converges within a finite number of steps. The relationship between total power consumption and the Signal-to-Interference-and-Noise Ratio (SINR) threshold of the secondary signal is illustrated (Fig. 4). As the SINR threshold increases, the system requires more power to maintain the minimum service quality of the secondary signal, which results in higher total power consumption. In addition, as the imperfect interference cancellation factor decreases, the total power consumption is further reduced. To compare performance, three baseline algorithms are examined (Fig. 5): the passive RIS, the active RIS with random phase shift, and the non-robust algorithm. The total system power consumption under the proposed algorithm remains lower than that of the passive RIS and the active RIS with random phase shift. Although the active RIS consumes additional power, the corresponding reduction in transmit power more than compensates for this consumption, thereby improving overall energy efficiency. When random phase shifts are applied, the active beamforming and amplification capabilities of the RIS cannot be fully utilized. This forces the primary transmitter to compensate alone to meet performance constraints, which increases its power consumption. Furthermore, because imperfect SIC is considered in the proposed algorithm, additional transmit power is required to counter residual interference and satisfy the minimum SINR constraint of the secondary system. Therefore, the total power consumption remains higher than that of the non-robust algorithm. The effect of the secrecy rate threshold of the primary signal on the secure energy efficiency of the primary system under different values of N is shown (Fig. 6). 
The results indicate that an optimal secrecy rate threshold exists that maximizes the secure energy efficiency of the primary system. To investigate the effect of active RIS placement on total system power consumption, the node positions are rearranged (Fig. 7). As the active RIS is positioned closer to the receiver, the fading effect weakens and the total system power consumption decreases.  Conclusions  This paper investigates the total power consumption of an active RIS-assisted symbiotic secure communication system under imperfect SIC. To enhance system energy efficiency, a total power minimization problem is formulated with constraints on the quality of service for both primary and secondary signals and on the power and phase shift of the active RIS. To address the non-convexity introduced by uncertain disturbance parameters, variable substitution, equivalent transformation, and a penalty-based SCA method are applied to convert the original formulation into a convex optimization problem. Simulation results confirm the effectiveness of the proposed algorithm and show that it achieves a notable reduction in total system power consumption compared with benchmark schemes.
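The penalty-based SCA treatment of the rank-one constraint mentioned above follows a pattern that is standard in SDP-based beamforming design; the sketch below uses generic symbols ($f$, $\rho$, $\boldsymbol{u}^{(t)}$) and may differ in detail from the paper's exact formulation. For a PSD matrix $\boldsymbol{W}$, $\mathrm{rank}(\boldsymbol{W})=1$ is equivalent to $\operatorname{Tr}(\boldsymbol{W})-\|\boldsymbol{W}\|_2=0$, so the constraint is moved into the objective as a penalty:

```latex
% rank-one constraint as a penalty term on the objective f(W):
\min_{\boldsymbol{W}\succeq \boldsymbol{0}}\;
  f(\boldsymbol{W})
  + \rho\,\bigl(\operatorname{Tr}(\boldsymbol{W}) - \|\boldsymbol{W}\|_{2}\bigr)
% the spectral norm is convex, so each SCA iteration linearizes it at W^{(t)}:
\|\boldsymbol{W}\|_{2} \;\ge\; \|\boldsymbol{W}^{(t)}\|_{2}
  + \operatorname{Tr}\!\bigl(\boldsymbol{u}^{(t)}(\boldsymbol{u}^{(t)})^{\mathsf H}
    \bigl(\boldsymbol{W}-\boldsymbol{W}^{(t)}\bigr)\bigr)
% with u^{(t)} the principal eigenvector of W^{(t)}
```

Substituting the linear lower bound for $\|\boldsymbol{W}\|_2$ makes each subproblem convex, which is what allows the alternating iterations to be solved with the CVX toolbox as described.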
Service Migration Algorithm for Satellite-terrestrial Edge Computing Networks
FENG Yifan, WU Weihong, SUN Gang, WANG Ying, LUO Long, YU Hongfang
Available online  , doi: 10.11999/JEIT250835
Abstract:
  Objective   In highly dynamic Satellite-Terrestrial Edge Computing Networks (STECN), achieving coordinated optimization between user service latency and system migration cost is a central challenge in service migration algorithm design. Existing approaches often fail to maintain stable performance in such environments. To address this, a Multi-Agent Service Migration Optimization (MASMO) algorithm based on multi-agent deep reinforcement learning is proposed to provide an intelligent and forward-looking solution for dynamic service management in STECN.  Methods   The service migration optimization problem is formulated as a Multi-Agent Markov Decision Process (MAMDP), which offers a framework for sequential decision-making under uncertainty. The environment represents the spatiotemporal characteristics of a Low Earth Orbit (LEO) satellite network, where satellite movement and satellite-user visibility define time-varying service availability. Service latency is expressed as the sum of transmission delay and computation delay. Migration cost is modeled as a function of migration distance between satellite nodes to discourage frequent or long-range migrations. A Trajectory-Aware State Enhancement (TASE) method is proposed to incorporate predictable orbital information of LEO satellites into the agent state representation, enabling proactive and stable migration decisions. Optimization is performed using the recurrent Multi-Agent Proximal Policy Optimization (rMAPPO) algorithm, which is suitable for cooperative multi-agent tasks. The reward function balances the objectives by penalizing high migration cost and rewarding low service latency.  Results and Discussions  Simulations are conducted in dynamic STECN scenarios to compare MASMO with MAPPO, MADDPG, Greedy, and Random strategies. The results consistently confirm the effectiveness of MASMO. As the number of users increases, MASMO shows slower performance degradation. 
With 16 users, it reduces average service latency by 2.90%, 6.78%, 11.01%, and 14.63% compared with MAPPO, MADDPG, Greedy, and Random. It also maintains high cost efficiency, lowering migration cost by up to 14.69% at 12 users (Fig. 3). When satellite resources increase, MASMO consistently leverages the added availability to reduce both latency and migration cost, whereas myopic strategies such as Greedy do not exhibit similar improvements. With 10 satellites, MASMO achieves the lowest service latency and outperforms the next-best method by 7.53% (Fig. 4). These findings show that MASMO achieves an effective balance between transmission latency and migration latency through its forward-looking decision policy.  Conclusions   This study addresses the service migration challenge in STECN through the MASMO algorithm, which integrates the TASE method with rMAPPO. The method improves service latency and reduces migration cost at the same time, demonstrating strong performance advantages. The trajectory-enhanced state representation improves foresight and stability of migration behavior in predictable dynamic environments. This study assumes ideal real-time state perception, and future work should evaluate communication delays and partial observability, as well as investigate scalability in larger satellite constellations with heterogeneous user demands.
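The latency/migration-cost trade-off in the reward can be sketched concretely. The weights, the hop-count distance, and the unit cost below are hypothetical; the paper models migration cost as a function of inter-satellite migration distance, which the hop count merely stands in for.

```python
# Hedged sketch of a reward trading off service latency against migration
# cost, in the spirit of MASMO's reward design (weights are assumptions).

def migration_cost(src, dst, unit_cost=0.5):
    """Cost grows with migration distance; hop count is a stand-in metric."""
    return 0.0 if src == dst else unit_cost * abs(src - dst)

def reward(latency, src, dst, w_lat=1.0, w_mig=1.0):
    return -(w_lat * latency + w_mig * migration_cost(src, dst))

# staying put with low latency beats a long-range migration at equal latency
print(reward(0.2, src=3, dst=3))   # no migration penalty
print(reward(0.2, src=3, dst=7))   # 4-hop migration penalized
```

Because both terms are penalties, the agent only migrates when the predicted latency saving outweighs the distance-dependent cost, which is exactly the balance the TASE-enhanced state lets it anticipate.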
Satellite Navigation
Research on GRI Combination Design of eLORAN System
LIU Shiyao, ZHANG Shougang, HUA Yu
Available online  , doi: 10.11999/JEIT201066
Abstract:
To solve the problem of Group Repetition Interval (GRI) selection in the construction of enhanced LORAN (eLORAN) supplementary transmitting stations, a screening algorithm based on the cross-rate interference is proposed from a mathematical point of view. Firstly, the method considers the requirement of conveying second information and, on this basis, conducts a first screening by comparing the mutual Cross-Rate Interference (CRI) with adjacent Loran-C stations in neighboring countries. Secondly, a second screening is conducted through permutation and pairwise comparison. Finally, the optimal GRI combination scheme is given by considering the requirements of data rate and system specification. Then, in view of the high-precision timing requirements of the new eLORAN system, an optimized selection is made among multiple optimal combinations. The analysis results show that the average interference rate of the optimal combination scheme obtained by this algorithm is comparable to that between the current navigation chains while meeting the timing requirements, providing reference suggestions and a theoretical basis for the construction of a high-precision ground-based timing system.
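The two-stage screening idea can be illustrated with a toy sketch: enumerate candidate GRIs that satisfy a simplified second-information constraint, then keep pairs whose mutual interference falls below a threshold. The CRI proxy used here (coincidence frequency via the least common multiple of the two emission periods) and the divisibility condition are illustrative stand-ins, not the formulas used in the paper.

```python
# Toy sketch of two-stage GRI screening. GRIs are expressed in units of
# 10 us, so one second equals 100 000 units; requiring the GRI to divide
# one second is a simplified stand-in for the second-information requirement.
from math import gcd

def candidate_gris(lo=4000, hi=9999, second_units=100_000):
    return [g for g in range(lo, hi + 1) if second_units % g == 0]

def cri_proxy(g1, g2):
    """Pulse groups of two chains coincide every lcm(g1, g2) units, so a
    smaller lcm means more frequent coincidence (worse interference)."""
    return 1.0 / (g1 * g2 // gcd(g1, g2))

def screen_pairs(gris, threshold=2e-5):
    return [(a, b) for i, a in enumerate(gris) for b in gris[i + 1:]
            if cri_proxy(a, b) < threshold]

gris = candidate_gris()
print(gris)                # [4000, 5000, 6250]
print(screen_pairs(gris))  # [(4000, 6250)]
```

In the actual algorithm the surviving pairs then undergo the permutation-based second screening and are ranked against data-rate, system-specification, and timing requirements.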