Articles in press have been peer-reviewed and accepted. They are not yet assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Graph-structured Data-driven Topology Inference for Non-cooperative Clustered Wireless Communication Networks
HOU Changbo, FU Dingyi, SONG Zhen, WANG Bin, ZHOU Zhichao
 doi: 10.11999/JEIT250084
Abstract:
  Objective  The emergence of clustered target communication networks complicates electromagnetic environment detection in non-cooperative scenarios, creating challenges for electromagnetic situation awareness and electronic countermeasures. Existing research seldom addresses topology prediction under conditions with no prior knowledge, where the absence of explicit structural information and the dynamic nature of the networks hinder accurate inference. This study investigates topology prediction for non-cooperative clustered wireless communication networks using graph-structured data-driven approaches. Specifically, it evaluates the performance of multiple topology inference methods, including the Multivariate Hawkes Process (MHP), Peter–Clark Momentary Conditional Independence (PCMCI), Graph Encoder–Decoder (GED), and Graph Convolutional Networks (GCN). The effects of network properties such as node count and edge probability on inference accuracy are analyzed. Additionally, a hybrid framework that integrates statistical models with graph-based learning is explored to improve inference accuracy and computational efficiency.  Methods  The proposed methodology combines causal inference with Graph Neural Network (GNN)–based learning. Adjacency matrices are first generated through causal discovery, using time-domain matrices derived from simulated wireless communication events. These matrices are constructed by thresholding power spectra to yield binary communication states. The GNN module subsequently refines the causal discovery output by suppressing false positives and optimizing global topology through encoder–decoder operations with multi-head attention mechanisms. To assess robustness, synthetic datasets are generated with NS-3 simulations under varying conditions: edge probabilities (0.15–0.60), node densities (8–13 nodes), sampling durations (0.05–0.30 ms), and node feature completeness (partial, 50%; full, 100%). Connectivity patterns are modeled by incorporating distance-adjusted edge probabilities. Performance evaluation uses F1-score, accuracy, recall, and inference time, with systematic comparison across baseline models (MHP, PCMCI, GCN, GED) and hybrid variants (PCMCI+GED, MHP+GED).  Results and Discussions  The PCMCI+GED hybrid framework consistently achieves superior topology prediction across diverse network configurations. At an edge probability of 0.45, PCMCI+GED with full node features attains an F1-score of 0.808, exceeding the performance of standalone PCMCI and GED by 31.1% and 4.9%, respectively (Fig. 7). This improvement arises from the synergy between causal priors and graph neural networks: PCMCI establishes preliminary causal relationships, while GED refines inference through global attention mechanisms that reduce false positives. Comparative analysis reveals that richer node features enhance topology inference in causal inference methods (Fig. 7). For example, MHP+GED with full features exceeds its 50% feature counterpart by 2.10%, and PCMCI+GED with full features improves by 3.04%. Yet, the most substantial gains come from combining causal inference with GED. Relative to standalone MHP with full features, MHP+GED improves by 30.65% with 50% features and 33.40% with full features. Similarly, PCMCI+GED improves by 34.43% and 38.51% under the same conditions. In contrast, relying solely on GNNs proves insufficient for modeling causal relationships. 
GED alone performs similarly to GCN, with Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) values of 0.0149 and 0.0206, respectively (Fig. 8). Without causal priors, GED offers no significant advantage over GCN; however, when priors are incorporated, GED outperforms GCN in inference accuracy (Fig. 9). Sampling duration analysis shows that 0.1 ms is optimal, balancing redundancy and information loss (Fig. 10, Table 2). Shorter intervals (0.05 ms) inflate computational costs through high-dimensional operations, whereas longer intervals (0.30 ms) obscure temporal dependencies, reducing the F1-score of PCMCI+GED with full features to 40.57% of its value at 0.1 ms. Efficiency evaluations highlight trade-offs between accuracy and runtime. With 50% node features, PCMCI+GED reduces inference time by 88.63% while retaining 96.96% of its F1-score. Under the same conditions, MHP+GED reduces inference time by 85.48% with only a 2.07% drop in performance (Fig. 11). PCMCI’s exponential complexity makes it computationally prohibitive in high-dimensional settings, whereas MHP’s quadratic scaling with node count and linear scaling with event frequency yield more modest efficiency gains. In low-dimensional settings, however, MHP’s event-driven computation leads to longer runtimes than PCMCI. Heatmap analysis further confirms the precision of the hybrid models. Adjacency matrices generated by PCMCI+GED and MHP+GED with full features closely align with the ground truth, demonstrating high predictive accuracy (Fig. 9). In sparse networks, standalone PCMCI introduces noise by linking non-interacting nodes, while GCN generates fragmented predictions due to the absence of causal priors. The hybrid framework alleviates these limitations by combining PCMCI’s local causal inference with GED’s global optimization. Overall, the hybrid framework addresses key shortcomings of individual methods: the high computational cost of PCMCI and MHP, and the limited interpretability of GNNs. By integrating causal discovery with graph-based deep learning, the model achieves state-of-the-art predictive accuracy while maintaining scalability. Its performance highlights the potential for real-time applications in resource-constrained environments, emphasizing the importance of balancing causal priors and data-driven learning for advancing non-cooperative wireless communication network analysis.  Conclusions  This study demonstrates the effectiveness of combining PCMCI-based causal inference with GED-enhanced GNN for topology prediction in non-cooperative clustered wireless communication networks. The hybrid model achieves state-of-the-art accuracy, particularly in dense networks, while partial node feature inputs substantially reduce computational overhead. Although the exponential complexity of PCMCI constrains scalability in high-dimensional settings, integration with GED alleviates this limitation through feature reduction and global optimization. The findings highlight the need to balance accuracy and efficiency in practical applications, where GCN offers a viable option for real-time inference. Future research will explore attention mechanisms and self-supervised learning to further enhance robustness. These advancements hold promise for improving electromagnetic situation awareness and electronic countermeasure strategies in dynamic adversarial environments.
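To make the preprocessing concrete, the following Python sketch illustrates the thresholding of power measurements into binary communication states described in the Methods, together with a toy lagged co-activation score standing in for the causal-discovery stage; the threshold, lag window, and function names are illustrative assumptions, and the actual PCMCI and GED stages are not reproduced here.

```python
import numpy as np

def binary_states(power, threshold):
    """Threshold per-node received-power samples into binary
    communication states (1 = node transmitting), as in the Methods."""
    return (power > threshold).astype(int)

def lagged_coactivation_prior(states, max_lag=3):
    """Toy stand-in for the causal-discovery stage: score a directed
    edge i -> j by how often node j is active shortly after node i."""
    n_nodes, n_steps = states.shape
    adj = np.zeros((n_nodes, n_nodes))
    for lag in range(1, max_lag + 1):
        adj += states[:, :n_steps - lag] @ states[:, lag:].T
    np.fill_diagonal(adj, 0)
    return adj / max(adj.max(), 1)  # soft adjacency scores in [0, 1]

# toy usage: 10 nodes, 200 samples of simulated received power
rng = np.random.default_rng(0)
prior = lagged_coactivation_prior(binary_states(rng.random((10, 200)), 0.7))
print(prior.shape)  # (10, 10) prior handed to the GNN refinement stage
```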
Pareto Optimization of Sensing and Communication Performance of Near-field Integrated Sensing and Communication System
ZHANG Guangchi, XIE Zhili, CUI Miao, WU Qingqing
 doi: 10.11999/JEIT250231
Abstract:
  Objective  With the rapid development of Sixth-Generation (6G) communication technology, Integrated Sensing And Communication (ISAC) systems are regarded as key enablers of emerging applications such as the Internet of Things, smart cities, and autonomous driving. High-precision communication and sensing are required under limited spectrum resources. However, most existing studies concentrate on the far-field region, where incomplete derivation of the sensing mutual information metric, neglect of scatterer interference, and insufficient consideration of communication–sensing trade-offs limit the flexibility of beamforming design and reduce practical effectiveness. As application scenarios expand, the demand for efficient integration of communication and sensing becomes more pronounced, particularly in near-field environments where scatterer interference strongly affects system performance. In this work, beamforming design for near-field ISAC systems under scatterer interference is investigated. A general expression for sensing mutual information is derived, a multi-objective optimization problem is formulated, and auxiliary variables, the Schur complement, and the Dinkelbach algorithm are employed to obtain Pareto optimal solutions. The proposed method provides a flexible and effective approach for balancing communication and sensing performance, thereby enhancing overall system performance and resource utilization in diverse application scenarios. The findings serve as a valuable reference for the optimal trade-off design of communication and sensing in near-field ISAC systems.  Methods  The proposed beamforming design method first derives a general expression for sensing mutual information in near-field scenarios, explicitly accounting for and quantifying the effect of scatterer interference on sensing targets. A multi-objective optimization problem is then formulated, with the Signal-to-Interference-plus-Noise Ratio (SINR) of communication users and sensing mutual information as objectives. Within this multi-objective framework, communication and sensing performance can be flexibly balanced to satisfy the requirements of different application scenarios. To enable tractable optimization, the sensing mutual information expression is transformed into a Semi-Definite Programming (SDP) problem using auxiliary variables and the Schur complement. Multi-user SINR expressions are reformulated with the Dinkelbach algorithm to convert them into convex functions, facilitating efficient optimization. The multi-objective problem is subsequently reduced to a single-objective one by constructing a system utility function, and the Pareto optimal solution is obtained to achieve the optimal balance between communication and sensing performance. This method provides a flexible and effective design strategy for near-field ISAC systems, substantially enhancing overall system performance and resource utilization.  Results and Discussions  This study presents a beamforming design method that balances communication and sensing performance through innovative optimization strategies. The method derives the general expression of sensing mutual information under scatterer interference, formulates a multi-objective optimization problem with the SINR of communication users and sensing mutual information as objectives, and transforms the problem into a convex form using auxiliary variables, the Schur complement, and the Dinkelbach algorithm. 
The Pareto optimal solution is then obtained via a system utility function, enabling the optimal balance between communication and sensing performance. Simulation results demonstrate that adjusting the weight parameter ρ flexibly balances user communication and target sensing performance (Fig. 2). As ρ increases from 0 to 1, sensing mutual information rises while user rate decreases, showing that a controllable trade-off can be achieved by tuning weights. In multi-user scenarios, near-field ISAC systems exhibit superior performance compared with far-field systems (Fig. 3). Under near-field conditions, the proposed method achieves more flexible and adjustable trade-offs than the classic Zero-Forcing (ZF) algorithm and single-objective optimization algorithms (Fig. 4, Fig. 5), confirming its effectiveness and superiority in practical applications. Furthermore, the study reveals the interference pattern of scatterers on sensing targets with respect to distance (Fig. 6, Fig. 7). The results indicate that the greater the distance difference between a scatterer and a sensing target, the weaker the interference on the target, with sensing mutual information gradually increasing and eventually converging. This finding provides a valuable reference for the design of near-field ISAC systems.  Conclusions  This paper proposes a beamforming design method for balancing communication and sensing performance by jointly optimizing sensing mutual information and communication rate. The method derives the general form of sensing mutual information, reformulates it as an SDP problem, and applies the Dinkelbach algorithm to process multi-user SINR expressions, thereby establishing a multi-objective optimization framework that can flexibly adapt to diverse application requirements. The results demonstrate three key findings: (1) The method enables flexible adjustment of communication and sensing performance, achieving an optimal trade-off through weight tuning, and allowing dynamic adaptation of system performance to specific application needs. (2) It reveals the interference pattern of scatterers on sensing targets with respect to distance, providing critical insights for near-field ISAC system design and supporting optimized system layout and parameter selection in complex environments. (3) In multi-user scenarios, the proposed approach outperforms traditional single-objective optimization methods in both communication rate and sensing mutual information, highlighting its competitiveness and practical value.
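The weight parameter ρ reported above suggests a standard weighted scalarization of the two objectives. A plausible form of such a utility is sketched below in LaTeX; the normalizing terms and notation are assumptions for illustration, and the paper's exact utility function may differ.

```latex
% One plausible scalarized utility: \rho = 1 emphasizes sensing mutual
% information I_s(W), \rho = 0 emphasizes the normalized user SINRs;
% W collects the transmit beamformers, K is the number of users.
\max_{W}\; U(W) = \rho\,\frac{I_{\mathrm{s}}(W)}{I_{\mathrm{s}}^{\max}}
  + (1-\rho) \sum_{k=1}^{K} \frac{\mathrm{SINR}_k(W)}{\mathrm{SINR}_k^{\max}},
\qquad \rho \in [0, 1].
```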
Collaborative Inference for Large Language Models Against Jamming Attacks
LIN Zhiping, XIAO Liang, CHEN Hongyi, XU Xiaoyu, LI Jieling
 doi: 10.11999/JEIT250675
Abstract:
  Objective  Collaborative inference with Large Language Models (LLMs) is employed to enable mobile devices to offload multi-modal data, including images, text, video, and environmental information such as temperature and humidity, to edge servers. This offloading improves the performance of inference tasks such as human-computer question answering, logical reasoning, and decision support. Jamming attacks, however, increase transmission latency and packet loss, which reduces task completion rates and slows inference. A reinforcement learning-based collaborative inference scheme is proposed to enhance inference speed, accuracy, and task completion under jamming conditions. LLMs with different sparsity levels and quantization precisions are deployed on edge servers to meet heterogeneous inference requirements across tasks.  Methods  A reinforcement learning-based collaborative inference scheme is proposed to enhance inference accuracy, speed, and task completion under jamming attacks. The scheme jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and channels for data offloading, based on task type, data volume, channel gains, and received jamming power. A policy risk function is formulated to quantify the probability of inference task failure given offloading latency and packet loss rate, thereby reducing the likelihood of unsafe policy exploration. Each edge server deploys LLMs with varying sparsity rates and quantization precisions, derived from layer-wise unstructured pruning and model parameter quantization, to process token vectors of multi-modal data including images, text, video, and environmental information such as temperature and humidity. This configuration is designed to meet diverse requirements for inference accuracy and speed across different tasks. The LLM inference system is implemented with mobile devices offloading images and text to edge servers for human-computer question answering and driving decision support. The edge servers employ a vision encoder and tokenizer to transform the received sensing data into token vectors, which serve as inputs to the LLMs. Pruning and parameter quantization are applied to the foundation model LLaVA-1.5-7B, generating nine LLM variants with different sparsity rates and quantization precisions to accommodate heterogeneous inference demands.  Results and Discussions  Experiments are conducted with three vehicles offloading images (i.e., captured traffic scenes) and texts (i.e., user prompts) using a maximum transmit power of 100 mW on 5.170~5.330 GHz frequency channels. The system is evaluated against a smart jammer that applies Q-learning to block one of the 20 MHz channels within this band. The results show consistent performance gains over benchmark schemes. Faster responses and more accurate driving advice are achieved, enabled by reduced offloading latency and lower packet loss in image transmission, which allow the construction of more complete traffic scenes. Over 20 repeated runs, inference speed is improved by 20.3%, task completion rate by 14.1%, and accuracy by 12.2%. These improvements are attributed to the safe exploration strategy, which prevents performance degradation and satisfies diverse inference requirements across tasks.  
Conclusions  This paper proposes a reinforcement learning-based collaborative inference scheme that jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and offloading channels, to counter jamming attacks. The inference system deploys nine LLM variants with different sparsity rates and quantization precisions for human-computer question answering and driving decision support, thereby meeting heterogeneous requirements for accuracy and speed. Experimental results demonstrate that the proposed scheme provides faster responses and more reliable driving advice. Specifically, it improves inference speed by 20.3%, task completion rate by 14.1%, and accuracy by 12.2%, achieved through reduced offloading latency and packet loss compared with benchmark approaches.
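As a rough illustration of the safe-exploration idea, the Python sketch below masks actions whose estimated policy risk exceeds a threshold before epsilon-greedy selection; the action-space sizes, budgets, and risk function are hypothetical placeholders rather than the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical discrete action space: (server, sparsity, quantization,
# power level, channel); the sizes here are illustrative only
ACTIONS = [(s, sp, q, p, c) for s in range(3) for sp in range(3)
           for q in range(3) for p in range(2) for c in range(4)]

def policy_risk(latency, loss_rate, max_latency=0.5, max_loss=0.1):
    """Toy policy risk: does this offloading choice violate the
    latency or packet-loss budget of the inference task?"""
    return float(latency > max_latency or loss_rate > max_loss)

def select_action(q_values, risks, eps=0.1, risk_threshold=0.5):
    """Epsilon-greedy restricted to the low-risk subset of actions."""
    safe = [i for i, r in enumerate(risks) if r < risk_threshold] \
           or list(range(len(q_values)))
    if rng.random() < eps:
        return int(rng.choice(safe))
    return max(safe, key=lambda i: q_values[i])

q = rng.random(len(ACTIONS))
risks = [policy_risk(rng.random(), rng.random() * 0.2) for _ in ACTIONS]
print(ACTIONS[select_action(q, risks)])  # chosen (server, ..., channel)
```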
Human-Machine Fusion Intelligent Decision-Making: Concepts, Framework, and Applications
LI Zhe, WANG Ke, WANG Biao, ZHAO Ziqi, LI Yafei, GUO Yibo, HU Yazhou, WANG Hua, LV Pei, XU Mingliang
 doi: 10.11999/JEIT250260
Abstract:
  Significance  The exponential growth of data volume, advances in computational power, and progress in algorithmic theory have accelerated the development of Artificial Intelligence (AI). Although AI offers unprecedented opportunities across industries, it continues to face limitations such as dependence on large datasets, poor interpretability of learning and decision-making mechanisms, limited robustness, and susceptibility to hallucinations. To overcome these challenges, integrating human cognitive decision-making capabilities and human-like cognitive models into AI systems is essential. This integration gives rise to a new form of intelligence—Human-Machine Fusion Intelligence—which combines physiological and physical characteristics. The core concept is to harness the complementary strengths of humans and machines in information processing and decision-making: humans provide intuitive judgment and contextual understanding, whereas machines are capable of high-speed computation and large-scale data analysis. By establishing a synergistic, collaborative “partnership,” Human-Machine Fusion Intelligent Decision-Making seeks to optimize decision quality through coordinated organic and probabilistic integration. This paradigm holds significant potential to improve decision reliability in mission-critical contexts, such as military operations, medical procedures, and autonomous driving, thus offering both theoretical research value and practical application relevance.  Conclusions  This review adopts a systematic research approach to examine Human-Machine Fusion Intelligence in decision-making across four core dimensions. First, it presents a theoretical analysis of the fundamental concepts underpinning Human-Machine Fusion Intelligence and highlights its unique advantages in complex decision-making contexts. Second, it proposes a general framework for Human-Machine Fusion Intelligent Decision-Making systems, emphasizing two key components: situational awareness and collaborative decision-making. Based on this framework, decision-making approaches are categorized into three types according to task characteristics and the nature of human-machine interaction: human-led, machine-led, and human-machine collaborative decision-making. Third, the review synthesizes recent practical advancements in representative application domains. Finally, it examines emerging trends in the development of Human-Machine Fusion Intelligent Decision-Making.  Progress  Unlike prior reviews that focus primarily on specific application domains, this article presents a comprehensive overview of Human-Machine Fusion Intelligence across four key dimensions: conceptual foundations, system framework, practical applications, and current challenges and future prospects. The core contributions of this review are summarized in the following four areas: First, it elucidates the advantages of Human-Machine Fusion Intelligent Decision-Making systems: (1) Improved decision-making accuracy—By combining machines’ strengths in data processing and logical reasoning with human capabilities in handling unstructured problems and ethically complex decisions, the system enables dynamic adjustment through a human-in-the-loop mechanism. (2) Enhanced interpretability of decision outcomes—The decision-making process bridges the cognitive gap between humans and machines, providing a transparent, traceable decision path and clarifying accountability boundaries. 
(3) Greater system robustness—By integrating machines’ risk monitoring and adaptive capabilities with human experiential judgment in complex or uncertain environments, the system establishes a closed-loop collaboration that balances technological rationality with human cognition. Second, the article highlights that Human-Machine Fusion systems cannot operate independently in safety-critical contexts due to imperfect trust mechanisms and ethical constraints. In response, it proposes a hierarchical architecture comprising two key layers: (1) Situational awareness layer, including three core processes: multimodal data perception, cross-modal information fusion, and situational analysis. (2) Collaborative decision-making layer, which distinguishes three decision-making paradigms based on task characteristics and human-machine interaction mode: (a) Human-led decision-making, suited for tasks with high uncertainty and open-ended conditions, where an enhanced intelligence model with a human-in-the-loop is adopted. (b) Machine-led decision-making, appropriate for tasks with lower uncertainty, emphasizing hybrid intelligence through cognitive model integration in automated workflows. (c) Human-machine collaborative decision-making, applicable when human and machine strengths are complementary, allowing for equal, synergistic cooperation to optimize decision efficiency. Third, the article synthesizes recent technological progress, summarizing representative applications of Human-Machine Fusion Intelligent Decision-Making in mission-critical domains such as the military, healthcare, and autonomous driving. Finally, it identifies six key directions for future development: optimization of multimodal perception, fusion of semantic and feature spaces, construction of deep collaborative feedback loops, dynamic task allocation mechanisms, enhancement of system reliability, and development of ethical guidelines. These directions aim to advance efficient collaboration and sustainable evolution of human-machine intelligence.  Prospects  Human-Machine Fusion Intelligent Decision-Making offers substantial research value and strong application potential for advancing emerging industries and enabling new intelligent paradigms. Although several exploratory efforts have been made, the field remains in its infancy, lacking a unified and mature theoretical or technological foundation. Key scientific and engineering challenges persist, including the optimization of multimodal perception and data fusion, bridging the semantic gap between human cognition and machine-represented feature spaces, and achieving deep integration of human and machine intelligence. Continued interdisciplinary collaboration will be essential to drive theoretical progress and technological innovation, further unlocking the potential of Human-Machine Fusion Intelligent Decision-Making.
Lightweight Incremental Deployment for Computing-Network Converged AI Services
WANG Qinding, TAN Bin, HUANG Guangping, DUAN Wei, YANG Dong, ZHANG Hongke
 doi: 10.11999/JEIT250663
Abstract:
  Objective   The rapid expansion of Artificial Intelligence (AI) computing services has heightened the demand for flexible access and efficient utilization of computing resources. Traditional Domain Name System (DNS) and IP-based scheduling mechanisms are constrained in addressing the stringent requirements of low latency and high concurrency, highlighting the need for integrated computing-network resource management. To address these challenges, this study proposes a lightweight deployment framework that enhances network adaptability and resource scheduling efficiency for AI services.  Methods   The AI-oriented Service IDentifier (AISID) is designed to encode service attributes into four dimensions: Object, Function, Method, and Performance. Service requests are decoupled from physical resource locations, enabling dynamic resource matching. AISID is embedded within IPv6 packets (Fig. 5), consisting of a 64-bit prefix for identification and a 64-bit service-specific suffix (Fig. 4). A lightweight incremental deployment scheme is implemented through hierarchical routing, in which stable wide-area routing is managed by ingress gateways, and fine-grained local scheduling is handled by egress gateways (Fig. 6). Ingress and egress gateways are incrementally deployed under the coordination of an intelligent control system to optimize resource allocation. AISID-based paths are encapsulated at ingress gateways using Segment Routing over IPv6 (SRv6), whereas egress gateways select optimal service nodes according to real-time load data using a weighted least-connections strategy (Fig. 8). AISID lifecycle management includes registration, query, migration, and decommissioning phases (Table 2), with global synchronization maintained by the control system. Resource scheduling is dynamically adjusted according to real-time network topology and node utilization metrics (Fig. 7).  Results and Discussions   Experimental results show marked improvements over traditional DNS/IP architectures. The AISID mechanism reduces service request initiation latency by 61.3% compared to DNS resolution (Fig. 9), as it eliminates the need for round-trip DNS queries. Under 500 concurrent requests, network bandwidth utilization variance decreases by 32.8% (Fig. 10), reflecting the ability of AISID-enabled scheduling to alleviate congestion hotspots. Computing resource variance improves by 12.3% (Fig. 11), demonstrating more balanced workload distribution across service nodes. These improvements arise from AISID’s precise semantic matching in combination with the hierarchical routing strategy, which together enhance resource allocation efficiency while maintaining compatibility with existing IPv6/DNS infrastructure (Fig. 23). The incremental deployment approach further reduces disruption to legacy networks, confirming the framework’s practicality and viability for real-world deployment.  Conclusions   This study establishes a computing-network convergence framework for AI services based on semantic-driven AISID and lightweight deployment. The key innovations include AISID’s semantic encoding, which enables dynamic resource scheduling and decoupled service access, together with incremental gateway deployment that optimizes routing without requiring major modifications to legacy networks. Experimental validation demonstrates significant improvements in latency reduction, bandwidth efficiency, and balanced resource utilization. 
Future research will explore AISID’s scalability across heterogeneous domains and its robustness under dynamic network conditions.
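A minimal sketch of how the four AISID dimensions might be packed into the 64-bit service-specific suffix and embedded in an IPv6 address (cf. Figs. 4 and 5); the field widths and example values are assumptions, since the abstract does not specify the bit allocation.

```python
import ipaddress

# assumed field widths for the 64-bit suffix; the paper defines the
# Object/Function/Method/Performance dimensions, not this exact layout
FIELDS = (("object", 24), ("function", 16), ("method", 12), ("performance", 12))

def make_aisid(prefix64: int, **values) -> ipaddress.IPv6Address:
    """Pack the service attributes into the low 64 bits and prepend
    the 64-bit identification prefix, as in the AISID structure."""
    suffix = 0
    for name, width in FIELDS:
        v = values[name]
        assert v < (1 << width), f"{name} overflows {width} bits"
        suffix = (suffix << width) | v
    return ipaddress.IPv6Address((prefix64 << 64) | suffix)

aisid = make_aisid(0x2001_0db8_0000_0001,
                   object=0x1A2B3C, function=0x0042,
                   method=0x003, performance=0x00F)
print(aisid)  # an IPv6 literal carrying the service semantics
```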
Multi-Mode Anti-Jamming for UAV Communications: A Cooperative Mode-Based Decision-Making Approach via Two-Dimensional Transfer Reinforcement Learning
WANG Shiyu, WANG Ximing, KE Zhenyi, LIU Dianxiong, LIU Jize, DU Zhiyong
 doi: 10.11999/JEIT250566
Abstract:
  Objective  With the widespread application of Unmanned Aerial Vehicles (UAVs) in military reconnaissance, logistics, and emergency communications, ensuring the security and reliability of UAV communication systems has become a critical challenge. Wireless channels are highly vulnerable to diverse jamming attacks. Traditional anti-jamming techniques, such as Frequency-Hopping Spread Spectrum (FHSS), are limited in dynamic spectrum environments and may be compromised by advanced machine learning algorithms. Furthermore, UAVs operate under strict constraints on onboard computational power and energy, which hinders the real-time use of complex anti-jamming algorithms. To address these challenges, this study proposes a multi-mode anti-jamming framework that integrates Intelligent Frequency Hopping (IFH), Jamming-based Backscatter Communication (JBC), and Energy Harvesting (EH) to strengthen communication resilience in complex electromagnetic environments. A Multi-mode Transfer Deep Q-Learning (MT-DQN) method is further proposed, enabling two-dimensional transfer to improve learning efficiency and adaptability under resource constraints. By leveraging transfer learning, the framework reduces computational load and accelerates decision-making, thereby allowing UAVs to counter jamming threats effectively even with limited resources.  Methods  The proposed framework adopts a multi-mode anti-jamming architecture that integrates IFH, JBC, and EH to establish a comprehensive defense strategy of “avoiding, utilizing, and converting” interference. The system is formulated as a Markov Decision Process (MDP) to dynamically optimize the selection of anti-jamming modes and communication channels. To address the challenges of high-dimensional state-action spaces and restricted onboard computational resources, a two-dimensional transfer reinforcement learning framework is developed. This framework comprises a cross-mode strategy-sharing network for extracting common features across different anti-jamming modes (Fig. 3) and a parallel network for cross-task transfer learning to adapt to variable task requirements (Fig. 4). The cross-mode strategy-sharing network accelerates convergence by reusing experiences, whereas the cross-task transfer learning network enables knowledge transfer under different task weightings. The reward function is designed to balance communication throughput and energy consumption. It guides the UAV to select the optimal anti-jamming strategy in real time based on spectrum sensing outcomes and task priorities.  Results and Discussions  The simulation results validate the effectiveness of the proposed MT-DQN. The dynamic weight allocation mechanism exhibits strong cross-task transfer capability (Fig. 6), as weight adjustments enable rapid convergence toward the corresponding optimal reward values. Compared with conventional Deep Reinforcement Learning (DRL) algorithms, the proposed method achieves a 64% faster convergence rate while maintaining the probability of communication interruption below 20% in dynamic jamming environments (Fig. 7). The framework shows robust performance in terms of throughput, convergence rate, and adaptability to variations in jamming patterns. In scenarios with comb-shaped and sweep-frequency jamming, the proposed method yields higher normalized throughput and faster convergence, exceeding baseline DQN and other transfer learning-based approaches. 
The results also indicate that MT-DQN improves stability and accelerates policy optimization during jamming pattern switching (Fig. 7), highlighting its adaptability to abrupt changes in jamming patterns through transfer learning.  Conclusions  This study proposes a multi-mode anti-jamming framework that integrates IFH, JBC, and EH, thereby enhancing the communication capability of UAVs. The proposed solution shifts the paradigm from traditional jamming avoidance toward active jamming exploitation, repurposing jamming signals as covert carriers to overcome the limitations of conventional frequency-hopping systems. Simulation results confirm the advantages of the proposed method in throughput performance, convergence rate, and environmental adaptability, demonstrating stable communication quality even under complex electromagnetic conditions. Although DRL approaches are inherently constrained in handling completely random jamming without intrinsic patterns, this work improves adaptability to dynamic jamming through transfer learning and cross-mode strategy sharing. These findings provide a promising approach for countering complex jamming threats in UAV networks. Future work will focus on validating the proposed algorithm in hardware implementations and enhancing the robustness of DRL methods under highly non-stationary, though not entirely unpredictable, jamming conditions such as pseudo-random or adaptive interference.
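The reward design summarized in the Methods, balancing throughput against energy while letting the EH mode turn jamming energy into a credit, might look like the toy Python sketch below; the weights and values are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

MODES = ("IFH", "JBC", "EH")  # hop away, ride the jammer, harvest it

def reward(throughput, energy_used, harvested, w_thr=1.0, w_en=0.3):
    """Sketch of a reward that favors throughput, penalizes battery
    energy spent, and credits energy harvested from jamming signals."""
    return w_thr * throughput + w_en * (harvested - energy_used)

def greedy_mode(q_row):
    """Pick the anti-jamming mode with the highest learned value."""
    return MODES[int(np.argmax(q_row))]

# toy learned values: JBC preferred under a strong, steady jammer
print(greedy_mode(np.array([0.2, 0.9, 0.5])),
      reward(throughput=0.8, energy_used=0.1, harvested=0.3))
```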
Unmanned Aircraft Vehicle-assisted Multi-Cluster Concurrent Authentication Scheme for Internet of Things Devices
MA Ruhui, HE Shiyang, CAO Jin, LIU Kui, LI Hui, QIU Yuan
 doi: 10.11999/JEIT250279
Abstract:
  Objective  With the rapid expansion of Internet of Things (IoT) devices, application scenarios such as smart cities and industrial intelligent manufacturing demand wider coverage and higher connection density from communication systems. Traditional terrestrial base stations have limited capacity to serve IoT devices in remote or complex environments. Unmanned Aircraft Vehicles (UAVs), owing to their flexible deployment and high mobility, can function as aerial base stations that effectively complement terrestrial networks, providing reliable and energy-efficient access for remote IoT terminals. Additionally, the expected 6G connectivity of tens of billions of devices may give rise to signaling conflicts and congestion at key nodes. To address these challenges, multi-cluster access schemes based on cluster division have been proposed. In these schemes, different clusters connect simultaneously to orthogonal subchannels, enabling UAVs to assist multiple IoT device clusters in accessing terrestrial networks concurrently. However, UAV-assisted multi-cluster communication faces pressing security and performance issues, including the susceptibility of air interface channels to attacks, the limited computational and storage capacities of IoT devices, signaling conflicts arising from massive concurrent access, and the requirement for seamless handover mechanisms due to the restricted endurance of UAVs. Therefore, the development of a secure and efficient UAV-assisted multi-cluster concurrent access and handover authentication scheme is essential.  Methods  This study proposes a secure authentication scheme for the UAV-assisted multi-cluster IoT device communication model, comprising four main components. First, UAV access authentication is achieved through a traditional pre-shared key mechanism, enabling mutual authentication and key agreement between the UAV and the ground network. Second, concurrent access authentication for multi-cluster IoT devices is realized using multi-layer aggregated signaling and aggregated message authentication code technologies, which effectively mitigate signaling conflicts and node congestion during massive concurrent access. Third, a Physically Unclonable Function (PUF) mechanism is incorporated to strengthen device-level security, protecting IoT devices against physical attacks while maintaining low storage and computational requirements. Finally, the UAV-assisted concurrent handover authentication integrates multi-layer aggregated signaling, aggregated message authentication code, and a pre-distribution key mechanism to enable fast and secure handovers between multi-cluster IoT devices and new UAVs, thereby ensuring the continuous security of network services.  Results and Discussions  The security of the proposed scheme is validated through formal analysis with the Tamarin tool, complemented by informal security analysis. The results show that the scheme satisfies mutual authentication and data security, and resists replay and man-in-the-middle attacks. The signaling overhead, as well as the computational and storage requirements of IoT devices during concurrent access and handover in multi-cluster communication, are also evaluated. The findings indicate that the scheme generates minimal signaling overhead (Fig. 3), thereby preventing signaling conflicts and node congestion. Moreover, the computational cost on devices remains low (Fig. 4), and the storage demand is minimal (Fig.
5), demonstrating that the scheme is well suited for resource-constrained IoT devices.  Conclusions  This paper proposes a UAV-assisted authentication scheme for concurrent access and handover of multi-cluster IoT devices. In this scheme, UAVs can securely and efficiently access the ground network, while multi-cluster IoT devices achieve concurrent and secure access through UAVs and perform rapid authentication and key agreement during handover to a new UAV. Security and performance analyses demonstrate that the scheme ensures multiple security properties, including mutual authentication, data security, and resistance to replay, man-in-the-middle, and physical attacks, while maintaining low computational and storage overhead on IoT devices. In addition, the scheme features low signaling overhead, effectively preventing signaling conflicts and key node congestion during large-scale concurrent access. Nevertheless, some limitations remain. Future work will explore more comprehensive and practical authentication mechanisms. Specifically, lightweight dynamic key update mechanisms tailored to UAV communication scenarios will be investigated to enhance security with minimal overhead. To address design complexity and environmental adaptability issues caused by PUF hardware dependence, more robust hardware security mechanisms will be considered to improve system stability in complex environments. Moreover, to mitigate the computational and energy burden on UAVs resulting from aggregation and forwarding tasks, approaches such as edge computing offloading will be examined to enable dynamic task allocation and load balancing, ensuring efficient and sustainable operation. Finally, a prototype system will be developed, and field experiments will be conducted to validate the feasibility and performance of the proposed solution in real-world scenarios.
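For intuition about the aggregated message authentication code used in the access and handover phases, the sketch below shows the textbook XOR-aggregate MAC construction, which lets a cluster forward one fixed-size tag instead of one per device; the keys and messages are hypothetical, and the paper's exact protocol is not reproduced.

```python
import hmac, hashlib
from functools import reduce

def device_mac(key: bytes, msg: bytes) -> bytes:
    """Per-device HMAC tag over its access-request message."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def aggregate(tags):
    """XOR-combine individual tags into one fixed-size aggregate."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), tags)

# toy cluster of 3 devices with pre-distributed keys
keys = [b"k1" * 16, b"k2" * 16, b"k3" * 16]
msgs = [b"dev%d|nonce|req" % i for i in range(3)]
agg = aggregate([device_mac(k, m) for k, m in zip(keys, msgs)])

# the verifier recomputes each expected tag and checks the aggregate
expected = aggregate([device_mac(k, m) for k, m in zip(keys, msgs)])
print(hmac.compare_digest(agg, expected))  # True -> whole cluster accepted
```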
A Successive Convex Approximation Optimization-based Prototype Filter Design Method for Universal Filtered Multi-Carrier Systems
HUA Jingyu, YANG Le, WEN Jiangang, ZOU Yuanping, SHENG Bin
 doi: 10.11999/JEIT250278
Abstract:
  Objective  In response to the extensive demands of sixth-generation (6G) communications, new waveform designs are expected to play a critical role. Conventional Orthogonal Frequency Division Multiplexing (OFDM) relies on strict orthogonality among subcarriers; however, this orthogonality is highly vulnerable to synchronization errors, which lead to severe Inter-Carrier Interference (ICI). To address this issue, filtered multicarrier modulation techniques apply high-performance filters to each subcarrier, thereby confining spectral leakage and mitigating ICI caused by non-ideal frequency synchronization. Among these techniques, Universal Filtered Multi-Carrier (UFMC) has shown particular promise, offering enhanced spectral flexibility and reduced out-of-band emissions compared with traditional OFDM. Despite these advantages, most existing studies recommend Dolph-Chebyshev (DC) filters as UFMC prototype filters. Nevertheless, DC filters suffer from limited controllability over design parameters and insufficient robustness against interference. Recent research has sought to improve system performance by applying constrained optimization techniques in filter design, typically optimizing metrics such as Signal-to-Interference Ratio (SIR) and Signal-to-Interference-plus-Noise Ratio (SINR). However, the Symbol Error Rate (SER) has not achieved an optimal level, indicating room for further improvement. To bridge this gap, this paper proposes a novel prototype filter design method that directly targets the average SER in interference-limited UFMC systems. This approach improves the anti-interference capability of UFMC systems and contributes to the development of robust waveform solutions for 6G communications.  Methods  This study first derives the SINR of the UFMC system under non-zero Carrier Frequency Offset (CFO) and formulates the SER expression under interference-limited conditions. A mathematical model is then established for prototype filter optimization, with SER defined as the objective function. Because the nonlinear coupling between SINR and the filter coefficients introduces strong non-convexity, the Successive Convex Approximation (SCA) framework is employed to locally linearize the non-convex components. Furthermore, a quadratic upper-bound technique is applied to guarantee both convexity and convergence of the approximated problem. Finally, an iterative algorithm is developed to solve the optimization model and determine the optimal prototype filter.  Results and Discussions  The interference suppression capability of the proposed SCA filter is comprehensively evaluated, as shown in Figs. 2 and 3. The simulation results in Fig. 2 reveal several important findings. (1) The deviation between the theoretical SINR and Monte Carlo simulation results is less than 0.1 dB (Fig. 2), confirming the accuracy of the derived closed-form expressions. (2) CFO is shown to have a strong association with system interference. As the residual CFO increases from 0 to 0.05, the SINR with conventional DC filters decreases by 3.6 dB, whereas the SCA filter achieves an SINR gain of approximately 1 dB compared with the DC filter. (3) Under a CFO of 0.025, the UFMC waveform demonstrates clear superiority over the ideal OFDM system. At a Signal-to-Noise Ratio (SNR) of 18 dB, the UFMC system with the SCA filter attains an SINR of 18.4 dB, outperforming OFDM by 0.3 dB. Fig. 3 further highlights the robustness of the SCA filter in dynamic interference environments. 
Although the SER increases with both larger CFO and higher modulation orders, the SCA filter consistently yields the lowest SER across all interference scenarios. Under severe interference conditions (CFO = 0.05, 16QAM modulation, SNR = 17 dB), the SCA filter achieves an SER of 7.4×10⁻³, markedly outperforming the DC filter, which exhibits an SER of 2.9×10⁻². These results demonstrate that the proposed SCA filter substantially enhances the anti-interference capability of UFMC systems.  Conclusions  This study first derives analytical expressions for the SINR and SER of the UFMC system under CFO. On this basis, an optimization model is established to design the prototype filter with the objective of minimizing the average SER. To address the non-convexity arising from the nonlinear coupling between SINR and filter coefficients, the SCA method is employed to reformulate the problem into a series of convex subproblems. An iterative algorithm is then proposed to obtain the optimal prototype filter. Simulation results demonstrate that, compared with conventional filters, the proposed SCA-based optimization algorithm provides flexible control over key filter parameters, achieving a narrower transition band and higher stopband attenuation under the same filter length. This improvement translates into significantly enhanced anti-interference performance under various system conditions. In summary, the main contributions of this work are: (1) Proposing a novel SCA-based optimization method for UFMC prototype filter design, which overcomes the parameter control limitations of traditional DC filters; (2) Systematically analyzing the performance advantages of the SCA filter under different modulation schemes and CFO conditions, and quantitatively demonstrating its contributions to SINR and SER improvements.
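For readers unfamiliar with SCA, the generic iteration implied by the Methods, local linearization plus a quadratic upper bound, can be written as below; g, w, and the curvature bound L are generic placeholders, and the paper's exact surrogate for the SER objective is not reproduced here.

```latex
% SCA surrogate at iteration t: the non-convex term g(w) is replaced
% by a convex quadratic upper bound that is tight at the current
% iterate w^{(t)}; each subproblem is then convex and easily solved.
\hat{g}\bigl(w; w^{(t)}\bigr)
  = g\bigl(w^{(t)}\bigr)
  + \nabla g\bigl(w^{(t)}\bigr)^{\top}\bigl(w - w^{(t)}\bigr)
  + \tfrac{L}{2}\bigl\lVert w - w^{(t)} \bigr\rVert^{2}
  \;\ge\; g(w),
\qquad
w^{(t+1)} = \arg\min_{w \in \mathcal{W}} \hat{g}\bigl(w; w^{(t)}\bigr).
```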
A Ku-band Circularly Polarized Leaky-wave Antenna Loaded with Parasitic Slots
HUANG Zhiyuan, ZHANG Yunhua, ZHAO Xiaowen
 doi: 10.11999/JEIT250347
Abstract:
This paper proposes a Ku-band circularly polarized Leaky-Wave Antenna (LWA) based on a Substrate Integrated Waveguide (SIW). A parasitic slot, with the same configuration as the main radiation slot but reduced in size, is employed to address the open-stopband problem and enhance impedance matching. The radiation slot excites Circularly Polarized (CP) waves, while the parasitic slot simultaneously broadens the Axial Ratio (AR) bandwidth and suppresses the open-stopband effect. A prototype antenna is designed, fabricated, and measured. The results demonstrate that the antenna achieves a 32% 3-dB AR bandwidth from 12.6 GHz to 17.4 GHz, with CP beam scanning from –49° to +14°. The simulated and measured results are in good agreement. In addition, the realized gain remains stable across the operating band. Compared with existing works, the proposed design achieves the widest scanning range.  Objective  Compared with traditional phased array antennas, frequency-scanning antennas have extensive applications in both military and civilian fields owing to their advantages of low profile, low cost, and lightweight design. CP waves offer superior anti-interference performance compared with linearly polarized waves. As a representative frequency-scanning antenna, the LWA has attracted sustained global research interest. This study focuses on the investigation of a Ku-band Circularly Polarized Leaky-Wave Antenna (CP-LWA), with emphasis on wide-bandwidth and wide-scanning techniques, as well as methods for achieving circular polarization. The aim is to provide potential design concepts for next-generation mobile communication and radar system antennas.  Methods   The fan-shaped slot is modified based on previous work, and an additional size-reduced parasitic slot of the same shape as the main slot is introduced. The parasitic slots cancel the reflected waves generated by the main radiating slot, thereby suppressing the Open-Stop-Band (OSB) effect, and they also enlarge the effective radiating aperture, which improves radiation efficiency and impedance matching. By exploiting the metallic boundary of the conductors, the parasitic slots enhance CP performance and broaden the AR bandwidth. To validate the proposed design, an antenna consisting of 12 main slots and 11 parasitic slots is designed, simulated, and measured.  Results and Discussions  A prototype is designed, fabricated, and measured in a microwave anechoic chamber to validate the proposed antenna. Both simulated and measured S11 values remain below –10 dB across the entire Ku-band. The measured S11 is slightly higher in the low-frequency range (12~13 GHz) and slightly lower in the high-frequency range (16~18 GHz), while maintaining an overall consistent trend with the simulations, except for a frequency shift of approximately 0.2 GHz toward lower frequencies. For the AR bandwidth, the simulated and measured 3-dB AR bandwidths are 32.7% (12.8~17.8 GHz) and 32.0% (12.6~17.4 GHz), respectively. The realized gains are on average 0.6 dB lower than the simulated values across the AR bandwidth, likely due to measurement system errors and fabrication tolerances. The simulated and measured peak gains reach 14.26 dB and 13.65 dB, respectively, with maximum gain variations of 2.91 dB and 2.85 dB. The measured AR and gain results therefore show strong agreement with the simulations. The measured sidelobe level increases on average by approximately 0.65 dB. 
The simulated CP scanning range extends from –47° to +17°, while the measured range narrows slightly to –49° to +14°. The frequency shift of the LWA is analyzed, and based on the simulated effect of variations in εr on the scanning patterns, the shift toward lower frequencies is attributed to the actual dielectric constant of the substrate being smaller than the nominal value of 2.2 specified by the manufacturer.  Conclusions  This paper proposes a Ku-band CP-LWA based on an SIW. The antenna employs etched slots consisting of fan-shaped radiation slots and size-reduced parasitic slots. The radiation slots excite circular polarization due to their inherent geometric properties, while the parasitic slots suppress the OSB effect and broaden the CP bandwidth. Measurements confirm that the proposed LWA achieves a wide 3-dB AR bandwidth of 12.6~17.4 GHz (32%) with a CP beam scanning range from –49° to +14°. Meanwhile, the antenna demonstrates stable gain performance across the entire AR bandwidth.
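For reference, the 32% figure quoted above follows directly from the measured band edges as a fractional bandwidth relative to the center frequency:

```latex
\mathrm{BW}_{\mathrm{AR}}
  = \frac{f_{\mathrm{H}} - f_{\mathrm{L}}}{(f_{\mathrm{H}} + f_{\mathrm{L}})/2}
  = \frac{17.4 - 12.6}{(17.4 + 12.6)/2}
  = \frac{4.8}{15.0} = 32\%.
```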
Parametric Holographic MIMO Channel Modeling and Its Bayesian Estimation
YUAN Zhengdao, GUO Yabo, GAO Dawei, GUO Qinghua, HUANG Chongwen, LIAO Guisheng
 doi: 10.11999/JEIT250436
Abstract:
  Objective  Holographic Multiple-Input Multiple-Output (HMIMO), based on continuous-aperture antennas and programmable metasurfaces, is regarded as a cornerstone of 6G wireless communication. Its potential to overcome the limitations of conventional massive MIMO is critically dependent on accurate channel modeling and estimation. Three major challenges remain: (1) oversimplified electromagnetic propagation models, such as far-field approximations, cause severe mismatches in near-field scenarios; (2) statistical models fail to characterize the coupling between channel coefficients, user positions, and random orientations; and (3) the high dimensionality of parameter spaces results in prohibitive computational complexity. To address these challenges, a hybrid parametric-Bayesian framework is proposed in which neural networks, factor graphs, and convex optimization are integrated. Precise channel estimation, user position sensing, and angle decoupling in near-field HMIMO systems are thereby achieved. The methodology provides a pathway toward high-capacity 6G applications, including Integrated Sensing And Communication (ISAC).  Methods  A hybrid channel estimation method is proposed to decouple the “channel-coordinate-angle” parameters and to enable joint estimation of channel coefficients, coordinates, and angles under random user orientations. A neural network is first employed to capture the nonlinear relationship between holographic channel characteristics and the relative coordinates of the base station and user. The trained network is then embedded into a factor graph, where global optimization is performed. The neural network is dynamically approximated through Taylor expansion, allowing bidirectional message propagation and iterative refinement of parameter estimates. To address random user orientations, Euler angle rotation theory is introduced. Finally, convex optimization is applied to estimate the rotation mapping matrix, resulting in the decoupling of coordinate and angle parameters and accurate channel estimation.  Results and Discussions  The simulations evaluate the performance of different algorithms under varying key parameters, including Signal-to-Noise Ratio (SNR), pilot length L, and base station antenna number M. Two performance metrics are considered: Normalized Mean Square Error (NMSE) of channel estimation and user positioning accuracy, with the Cramér–Rao Lower Bound (CRLB) serving as the theoretical benchmark. At an SNR of 10 dB, the proposed method achieves a channel NMSE below –40 dB, outperforming Least Squares (LS) estimation and approximate model-based approaches. Under high SNR conditions, the NMSE converges toward the CRLB, confirming near-optimal performance (Fig. 5a). The proposed channel model demonstrates superior performance over “approximate methods” due to its enhanced characterization of real-world channels. Moreover, the positioning error gap between the proposed method and the “parallel bound” narrows to nearly 3 dB at high SNR, confirming the accuracy of angle estimation and the effectiveness of parameter decoupling (Fig. 5b). In addition, the proposed method maintains performance close to the theoretical bounds when system parameters, such as user antenna number N, base station antenna number M, and pilot length L, are varied, demonstrating strong robustness (Figs. 6–8). These results also show that the Euler angle rotation-based estimation effectively compensates for coordinate offsets induced by random user orientations.  
Conclusions  This study proposes a framework for HMIMO channel estimation by integrating neural networks, factor graphs, and convex optimization. The main contributions are threefold. First, Euler angles and coordinate mapping are incorporated into the parameterized channel model through factorization and factor graphs, enabling channel modeling under arbitrary user antenna orientations. Second, neural networks and convex optimization are embedded as factor nodes in the graph, allowing nonlinear function approximation and global optimization. Third, bidirectional message passing between neural network and convex optimization nodes is realized through Taylor expansion, thereby achieving joint decoupling and estimation of channel parameters, coordinates, and angles. Simulation results confirm that the proposed framework achieves higher accuracy, exceeding benchmarks by more than 3 dB, and demonstrates strong robustness across a range of scenarios. Future work will extend the method to multi-user environments, incorporate polarization diversity, and address hardware impairments such as phase noise, with the aim of supporting practical deployment in 6G systems.
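The Taylor-expansion step that enables bidirectional message passing through the neural-network factor node can be summarized as the local linearization below, where x₀ is the current coordinate estimate and J the network Jacobian; this is a generic statement of the technique, not the paper's full derivation.

```latex
% Local linearization of the trained network f(.) around the current
% estimate x_0, turning it into a (locally) linear factor node that
% supports Gaussian message passing in both directions:
f(x) \approx f(x_0) + J(x_0)\,(x - x_0),
\qquad
J(x_0) = \left.\frac{\partial f}{\partial x}\right|_{x = x_0}.
```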
Research on ECG Pathological Signal Classification Empowered by Diffusion Generative Data
GE Beining, CHEN Nuo, JIN Peng, SU Xin, LU Xiaochun
 doi: 10.11999/JEIT250404
Abstract:
  Objective  ElectroCardioGram (ECG) signals are key indicators of human health. However, their complex composition and diverse features make visual recognition prone to errors. This study proposes a classification algorithm for ECG pathological signals based on data generation. A Diffusion Generative Network (DGN), also known as a diffusion model, progressively adds noise to real ECG signals until they approach a noise distribution, thereby facilitating model processing. To improve generation speed and reduce memory usage, a Knowledge Distillation-Diffusion Generative Network (KD-DGN) is proposed, which demonstrates superior memory efficiency and generation performance compared with the traditional DGN. This work compares the memory usage, generation efficiency, and classification accuracy of DGN and KD-DGN, and analyzes the characteristics of the generated data after lightweight processing. In addition, the classification effects of the original MIT-BIH dataset and an extended dataset (MIT-BIH-PLUS) are evaluated. Experimental results show that convolutional networks extract richer feature information from the extended dataset generated by DGN, leading to improved recognition performance of ECG pathological signals.  Methods  The generative network-based ECG signal generation algorithm is designed to enhance the performance of convolutional networks in ECG signal classification. The process begins with a Gaussian noise-based image perturbation algorithm, which obscures the original ECG data by introducing controlled randomness. This step simulates real-world variability, enabling the model to learn more robust representations. A diffusion generative algorithm is then applied to reconstruct and reproduce the data, generating synthetic ECG signals that preserve the essential characteristics of the original categories despite the added noise. This reconstruction ensures that the underlying features of ECG signals are retained, allowing the convolutional network to extract more informative features during classification. To improve efficiency, the approach incorporates knowledge distillation. A teacher-student framework is adopted in which a lightweight student model is trained from the original, more complex teacher ECG data generation model. This strategy reduces computational requirements and accelerates the data generation process, improving suitability for practical applications. Finally, two comparative experiments are designed to validate the effectiveness and accuracy of the proposed method. These experiments evaluate classification performance against existing approaches and provide quantitative evidence of its advantages in ECG signal processing.  Results and Discussions  The data generation algorithm yields ECG signals with a Signal-to-Noise Ratio (SNR) comparable to that of the original data, while presenting more discernible signal features. The student model constructed through knowledge distillation produces ECG samples with the same SNR as those generated by the teacher model, but with substantially reduced complexity. Specifically, the student model achieves a 50% reduction in size, 37.5% lower memory usage, and a 57% shorter runtime compared with the teacher model (Fig. 6). When the convolutional network is trained with data generated by the KD-DGN, its classification performance improves across all metrics compared with a convolutional network trained without KD-DGN. Precision reaches 95.7%, and the misidentification rate is reduced to approximately 3% (Fig. 9).  
Conclusions  The DGN provides an effective data generation strategy for addressing the scarcity of ECG datasets. By supplying additional synthetic data, it enables convolutional networks to extract more diverse class-specific features, thereby improving recognition performance and reducing misidentification rates. Optimizing DGN with knowledge distillation further enhances efficiency while maintaining SNR equivalence with the original DGN. This optimization reduces computational cost, conserves machine resources, and supports simultaneous task execution. Moreover, it enables the generation of new data without loss, allowing convolutional networks to learn from larger datasets at lower cost. Overall, the proposed approach markedly improves the classification performance of convolutional networks on ECG signals. Future work will focus on further algorithmic optimization for real-world applications.
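To make the two mechanisms concrete, the sketch below pairs a forward-diffusion noising step with a teacher-student distillation objective of the kind described above. It is a minimal numpy illustration: the linear noise schedule, segment length, and synthetic "beat" are placeholder assumptions, not the paper's configuration.

```python
import numpy as np

def forward_diffuse(x0, t, alpha_bar):
    """Corrupt a clean ECG segment x0 to step t of the forward process:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    eps = np.random.randn(*x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

# Illustrative linear beta schedule over T steps.
T = 1000
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))

x0 = np.sin(np.linspace(0, 8 * np.pi, 360))   # stand-in for one ECG segment
xt, eps = forward_diffuse(x0, t=500, alpha_bar=alpha_bar)

def distill_loss(student_pred, teacher_pred):
    """Distillation objective: the student's noise estimate is regressed
    onto the teacher's prediction for the same noisy input."""
    return np.mean((student_pred - teacher_pred) ** 2)

teacher_pred = eps                                     # stand-in predictions
student_pred = eps + 0.05 * np.random.randn(eps.size)
print(distill_loss(student_pred, teacher_pred))
```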
Signal Sorting Method Based on Multi-station Time Difference and Dirichlet Process Mixture Model
CHEN Jinli, WANG Yanjie, FAN Yu, LI Jiaqiang
 doi: 10.11999/JEIT250191
[Abstract](41) [FullText HTML](21) [PDF 1944KB](5)
Abstract:
  Objective  Signal sorting is a crucial technology in electronic reconnaissance that enables the deinterleaving of mixed pulse sequences emitted by multiple radar radiation sources, thereby supporting military decision-making. With the rapid advancement of electronic technology, multi-station cooperative signal sorting has received increasing attention. However, existing multi-station signal sorting methods depend heavily on manually selected parameters, which limits adaptability. Moreover, in complex environments with pulse loss and noise interference, conventional methods struggle to process unpaired pulses effectively, reducing the accuracy and stability of sorting. To address these challenges, this study applies the Dirichlet Process Mixture Model (DPMM) to multi-station cooperative signal sorting. The proposed approach enables adaptive sorting even when the number of radiation sources is unknown or measurement errors exist, thereby improving flexibility and adaptability. Furthermore, it can effectively classify unpaired pulses caused by pulse loss or noise, enhancing the robustness and reliability of sorting. This research provides a novel strategy for signal sorting in complex electromagnetic environments and holds promising application value in radar signal processing.  Methods  In multi-station cooperative signal sorting, the spatial distribution of multiple receiving stations detecting the same radar signal makes efficient and accurate signal pairing and classification a core challenge. To address this issue, a multi-station cooperative signal sorting method based on the DPMM is proposed. The process comprises three stages: pulse pairing, time-difference clustering and sorting, and mismatched pulse classification. In the pulse pairing stage, identical pulses originating from the same radiation source are identified from the sequences intercepted by each receiving station. To ensure accurate pairing, a dual-constraint strategy is adopted, combining a time-difference window with multi-parameter matching. Successfully paired pulses are then constructed into a time-difference vector set, which provides the data foundation for the subsequent clustering and sorting stage. In the time-difference clustering and sorting stage, DPMM is employed to cluster the time-difference vector set. DPMM adaptively determines the number of clusters to model the data structure, enabling the system to infer the optimal cluster count. Gibbs sampling is used to optimize model parameters, further enhancing clustering robustness. Based on the clustering results, radar pulse sets are constructed, achieving signal sorting across multiple radiation sources. In the mismatched pulse classification stage, unpaired pulses caused by noise interference or pulse loss during transmission are further processed. DPMM is applied to fit radar pulse parameter vectors, including pulse width, radio frequency, and bandwidth. The affiliation degree of each mismatched pulse relative to the radar pulse sets is then calculated. Pulses with affiliation degrees exceeding a predefined threshold are merged into the corresponding pulse set, whereas those below the threshold are classified as anomalous pulses, likely due to interference or noise, and are discarded. This method enhances the adaptability and robustness of multi-station cooperative signal sorting and provides an effective solution for complex electromagnetic environments.  
Results and Discussions  In the experimental validation, radar pulse data are generated through simulation to evaluate the effectiveness of the proposed method. Compared with traditional multi-station cooperative signal sorting approaches, the method achieves high-precision sorting without requiring prior knowledge of the number of radiation sources or parameter measurement errors, thereby demonstrating strong adaptability and practicality. To comprehensively assess performance in complex environments, simulations are conducted to analyze sorting capability under varying measurement errors, pulse loss rates, and interference rates. The final sorting results are summarized in (Table 3). The results indicate that even in the presence of noise interference and data loss, most radar pulses are accurately identified, with only a small fraction misclassified as interference signals. The final sorting accuracy reaches 98.8%, confirming the robustness and stability of the method against pulse loss, noise, and other uncertainties. To further validate its superiority, the method is compared with other algorithms under different conditions. Sorting accuracy under different Time of Arrival (TOA) measurement errors (Fig. 6) shows that stable performance is maintained even under severe noise interference, reflecting strong noise resistance. Further analyses of sorting accuracy under different pulse loss rates and interference rates (Figs. 7 and 8) demonstrate that higher efficiency and stability are achieved in handling unpaired pulses, and pulses that fail to be paired are more accurately classified. The sorting accuracy of different algorithms in various scenarios (Fig. 9) further confirms that the method performs more consistently in complex environments, indicating higher adaptability. Overall, the method adapts well to diverse application scenarios and provides efficient, stable, and reliable signal sorting for multi-station cooperative electronic reconnaissance tasks.  Conclusions  This study proposes a multi-station cooperative signal sorting method based on the DPMM to address the limitations of traditional approaches, which rely heavily on prior information and perform poorly in processing unpaired pulses. By applying DPMM for adaptive clustering of time-difference information, the proposed method avoids sorting errors caused by improper manual parameter settings and effectively classifies unpaired pulses based on radar pulse parameter characteristics. Simulation results show that this method not only improves the accuracy and stability of multi-station cooperative signal sorting but also maintains high sorting performance even when the number of radiation sources is unknown or measurement errors are present, highlighting its engineering application value. Future research may extend this approach to dynamic electromagnetic environments and adaptive real-time processing to meet the demands of more complex electronic reconnaissance tasks.
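As an illustration of the clustering stage, the sketch below fits a truncated Dirichlet-process mixture to synthetic time-difference vectors. The emitter positions and noise level are hypothetical, and scikit-learn fits the model by variational inference rather than the Gibbs sampling used in the paper.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic time-difference vectors (two TDOA components per paired pulse,
# in microseconds) from three hypothetical emitters, with measurement noise.
centers = np.array([[12.0, -3.0], [4.5, 8.0], [-7.0, 1.5]])
tdoa = np.vstack([c + 0.2 * rng.standard_normal((200, 2)) for c in centers])

# Truncated Dirichlet-process mixture: of the 10 candidate components, the
# model keeps only as many as the data support, so the emitter count need
# not be known in advance.
dpmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    random_state=0,
).fit(tdoa)

labels = dpmm.predict(tdoa)
print("clusters found:", np.unique(labels).size)   # expected: 3
```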
Advancements in Quantum Circuit Design for ARIA: Implementation and Security Evaluation
LI Lingchen, LI Pei, MO Shenyong, WEI Yongzhuang, YE Tao
 doi: 10.11999/JEIT250440
[Abstract](67) [FullText HTML](35) [PDF 3972KB](7)
Abstract:
  Objective  ARIA was established as the Korean national standard block cipher (KS X 1213) in 2003 to meet the demand for robust cryptographic solutions across government, industrial, and commercial sectors in South Korea. Designed by a consortium of Korean cryptographers, the algorithm adopts a hardware-efficient architecture that supports 128-, 192-, and 256-bit key lengths, providing a balance between computational performance and cryptographic security. This design allows ARIA to serve as a competitive alternative to the Advanced Encryption Standard (AES), with comparable encryption and decryption speeds suitable for deployment in resource-constrained environments, including embedded systems and high-performance applications. The security of ARIA is ensured by its Substitution–Permutation Network (SPN) structure, which incorporates two distinct substitution layers and a diffusion layer to resist classical cryptanalytic methods such as differential, linear, and related-key attacks. This robustness has promoted its adoption in secure communication protocols and financial systems within South Korea and internationally. The emergence of quantum computing poses new challenges to classical ciphers: quantum algorithms such as Grover’s algorithm reduce the effective key strength of symmetric ciphers, necessitating reassessment of their post-quantum security. In this study, ARIA’s quantum circuit implementation is optimized through tower-field decomposition and in-place circuit optimization, enabling a comprehensive evaluation of its resilience against quantum adversaries.  Methods  The quantum resistance of ARIA is evaluated by estimating the resources required for exhaustive key search attacks under Grover’s algorithm. Grover’s quantum search algorithm achieves quadratic speedup, effectively reducing the security strength of a 128-bit key to the classical equivalent of 64 bits. To ensure accurate assessment, the quantum circuits for ARIA’s encryption and decryption processes are optimized within Grover’s framework, thereby reducing the required quantum resources. The core technique employed is tower-field decomposition, which transforms high-order finite field operations into equivalent lower-order operations, yielding compact computational representations. Specifically, the S-box and linear layer circuits are optimized using automated search tools to identify efficient combinations of low-order field operations. The resulting quantum circuits are then applied to estimate Grover-attack resource requirements, and the results are compared against the National Institute of Standards and Technology (NIST) post-quantum security standards.  Results and Discussions  Optimized quantum circuits for all four ARIA S-boxes are constructed using tower-field decomposition and automated circuit search tools (Fig. 7, Table 2). By integrating these with the linear layer, a complete quantum encryption circuit is implemented, and Grover-attack resource requirements are re-evaluated (Tables 5 and 6). Detailed implementation data are provided in the Clifford+T gate set. The experimental results show that ARIA-192 does not meet the NIST Level 3 post-quantum security standard, indicating vulnerabilities to quantum adversaries. In contrast, ARIA-128 and ARIA-256 comply with Level 1 and Level 3 requirements, respectively. Further optimization is theoretically feasible through methods such as pseudo-key techniques. 
Future research may focus on developing automated circuit search tools to extend this framework, enabling systematic post-quantum security evaluations of ARIA and comparable symmetric ciphers (e.g., AES, SM4) within a generalized assessment model.  Conclusions  This study investigates the quantum resistance of classical cryptographic algorithms in the context of quantum computing, with a particular focus on the Korean block cipher ARIA. By leveraging the distinct algebraic structures of ARIA’s four S-boxes, tower-field decomposition is applied to design optimized quantum circuits for all S-boxes. Additionally, the circuit depth of the ARIA linear layer is optimized through an in-place quantum circuit implementation, resulting in a more efficient realization of the ARIA algorithm in the quantum setting. A complete quantum encryption circuit is constructed by integrating these optimization components, and the security of the ARIA family of algorithms is evaluated against quantum adversaries using Grover’s key search attack model. The results demonstrate improved implementation efficiency under the newly designed quantum scheme. However, ARIA-192 exhibits resistance below the NIST Level 3 quantum security threshold, indicating a potential vulnerability to quantum attacks.
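The core of the resource estimate is simple arithmetic: Grover's search needs about (π/4)·2^(k/2) iterations for a k-bit key, and each iteration costs a fixed number of gates set by the cipher circuit. The sketch below reproduces only this arithmetic; the per-iteration gate counts are placeholders, not the optimized ARIA circuit figures reported in the paper.

```python
import math

def grover_total_gates(key_bits, gates_per_iteration):
    """Total gate count of Grover key search: about (pi/4) * 2^(k/2)
    oracle iterations, each dominated by the cost of evaluating the
    cipher circuit (folded into gates_per_iteration)."""
    iterations = (math.pi / 4.0) * 2.0 ** (key_bits / 2.0)
    return iterations * gates_per_iteration

# Placeholder per-iteration costs, not the optimized ARIA circuit figures.
for k, g in [(128, 2**20), (192, 2**21), (256, 2**21)]:
    print(f"ARIA-{k}: ~2^{math.log2(grover_total_gates(k, g)):.1f} total gates")
```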
Bit-configurable Physical Unclonable Function Circuit Based on Self-detection and Repair Method
XU Mengfan, ZHANG Yuejun, LIU Tianxiang, PAN Yu
 doi: 10.11999/JEIT250359
[Abstract](118) [FullText HTML](67) [PDF 7274KB](14)
Abstract:
  Objective  The proliferation of Internet of Things (IoT) devices has intensified the need for robust, hardware-level security. Among hardware-based security primitives, Physical Unclonable Functions (PUFs) serve a critical role in lightweight authentication and dynamic key generation by leveraging inherent process variations to produce unique, unclonable responses. Achieving reliable PUF performance under environmental fluctuations, such as temperature and supply voltage variations, requires balancing sensitivity to process variations with environmental robustness. Conventional approaches, including circuit-level stabilization and architecture-level error correction, can improve reliability but often increase area, power, and test complexity. To overcome these drawbacks, recent work has explored voltage or bias perturbation for unstable response correction. However, entropy degradation during mode transitions in dual-mode PUFs remains a major concern, compromising both reliability and energy efficiency. This study proposes a bit-configurable bistable electric bridge-divider PUF that addresses these challenges by maintaining entropy independence between operational modes, reducing error correlation, and limiting repair and masking overhead. The proposed solution improves randomness, reliability, and energy efficiency, making it suitable for secure, cost-effective authentication in IoT edge devices operating under dynamic conditions.  Methods  Hardware overhead and testing complexity associated with conventional PUF stabilization techniques are reduced by introducing a bit-configurable bistable electric bridge-divider PUF architecture. Entropy generation is enhanced by amplifying process-induced variations through electric bridge imbalance and the exponential behavior of subthreshold current. A reconfigurable bit-cell is employed to enable seamless switching between electric bridge mode and voltage divider mode without additional layout cost; dual-mode operation is thus supported while preserving area efficiency. A voltage-skew-based self-detection and repair mechanism is integrated to dynamically identify and mitigate unstable responses, thereby improving reliability under varying environmental conditions. The PUF circuit is fully custom-designed and fabricated in the TSMC 28 nm CMOS process. Post-layout simulations confirm the robustness of the architecture, demonstrating effective self-repair capabilities and consistent performance under temperature and voltage fluctuations.  Results and Discussions  The proposed design is fabricated using the TSMC 28 nm CMOS process. The total layout area measures 3,283.3 μm², and each PUF cell occupies 0.7888 μm² (Fig. 11). Simulation waveforms of the self-detection, repair, and masking operations are presented in (Fig. 12). Inter-chip Hamming distance histograms and fitted curves for both electric bridge mode and voltage divider mode are shown in (Fig. 13a, Fig. 14a). Autocorrelation results of the 40,960-bit output are illustrated in (Fig. 13b, Fig. 14b). The randomness of the responses is evaluated using the NIST test suite provided by the U.S. National Institute of Standards and Technology, with the results summarized in (Table 1). The native Bit Error Rate (BER), measured before repair or masking, is analyzed under various temperature and supply voltage conditions (Fig. 15). 
By dynamically adjusting the voltage skew, precise control of the error correction rate is achieved, leading to a substantial reduction in BER across different environments (Fig. 16). A performance comparison with previously reported designs is provided in (Table 2). After applying the entropy source repair and masking mechanism, the BER converges to below 1.62 × 10⁻⁹, approaching the ideal “zero” BER.  Conclusions  A bit-configurable PUF architecture is proposed to address environmental variability and hardware constraints in IoT edge devices. A reconfigurable bit-cell is employed to support dynamic switching between electric bridge mode and voltage divider mode without incurring additional layout cost. Process-induced variations are amplified through bridge imbalance and the exponential behavior of subthreshold current, which enhances the randomness and uniqueness of the PUF responses. A voltage-skew-based self-detection and repair mechanism is integrated to identify and correct unstable responses, effectively reducing the BER under varying environmental conditions. The proposed design, fabricated using the TSMC 28 nm CMOS process, demonstrates high entropy, robustness, and low overhead in terms of area and power consumption. These characteristics make it suitable for secure and lightweight authentication and key generation in resource-constrained IoT systems.
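For readers unfamiliar with the two headline metrics, the sketch below computes inter-chip Hamming distance and native BER on simulated response matrices. The chip count, response length, and 2% bit-flip model are illustrative assumptions, not measurements from the fabricated design.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
# Hypothetical responses: 32 chips x 1024 bits (an ideal PUF emits
# i.i.d. fair bits, so these are drawn uniformly at random).
responses = rng.integers(0, 2, size=(32, 1024))

def inter_chip_hd(resps):
    """Mean normalized Hamming distance over all chip pairs (ideal: 0.5)."""
    return float(np.mean([np.mean(a != b) for a, b in combinations(resps, 2)]))

def bit_error_rate(ref, reeval):
    """Native BER: disagreement between a reference readout and a
    re-evaluation of the same chips under changed conditions."""
    return float(np.mean(ref != reeval))

noisy = responses ^ (rng.random(responses.shape) < 0.02)  # 2% flip model
print("inter-chip HD:", inter_chip_hd(responses))
print("native BER  :", bit_error_rate(responses, noisy))
```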
A Novel Transient Execution Attack Exploiting Loop Prediction Mechanisms
GUO Jiayi, QIU Pengfei, YUAN Jie, LAN Zeru, WANG Chunlu, ZHANG Jiliang, WANG Dongsheng
 doi: 10.11999/JEIT250361
[Abstract](248) [FullText HTML](101) [PDF 3185KB](40)
Abstract:
  Objective  Modern processors rely heavily on branch prediction to improve pipeline efficiency; however, the transient execution windows created by speculative execution expose critical security vulnerabilities. While prior research has primarily examined conditional branch instructions, this study identifies a previously overlooked attack surface: loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ in x86 architectures, which use the RCX register to determine branch outcomes. These instructions produce significantly longer transient windows than JCC instructions, posing heightened threats to hardware-level isolation. This work demonstrates the exploitability of these instructions, quantifies their transient execution behavior, and validates practical attack scenarios.  Methods  This study employs a systematic methodology to investigate the speculative behavior of loop instructions and assess their exploitability. First, the microarchitectural behavior of LOOP, LOOPZ, LOOPNZ, and JRCXZ instructions is reverse-engineered using Performance Monitoring Counters (PMCs), with a focus on their dependency on RCX register values and interaction with the branch prediction unit. Speculative durations of loop and JCC instructions are compared using cycle-accurate profiling via the RDPMC instruction, which accesses fixed-function PMCs to record clock cycles. Based on these observations, exploit primitives are constructed by manipulating RCX values to induce speculative execution paths. The feasibility of these primitives is evaluated through four real-world attack scenarios on Intel CPUs: (1) Cross-user/kernel data leakage through speculative memory access following mispredicted loop exits. (2) Covert channel creation between Simultaneous MultiThreading (SMT) threads by measuring timing differences between correctly and incorrectly predicted branches during speculative execution. (3) SGX enclave compromise via speculative access to secrets gated by RCX-controlled branching. (4) Kernel Address Space Layout Randomization (KASLR) bypass using page fault timing during transient execution of loop-based probes. Each scenario is tested on real hardware under controlled conditions to assess reliability, reproducibility, and attack robustness.  Results and Discussions  The proposed transient execution attack targeting loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ offers notable advantages over traditional Spectre exploits. These RCX-dependent instructions exhibit transient execution windows that are, on average, 40% longer than those of conventional JCC branches (Table 1). The extended speculative duration significantly improves attack reliability: in cross-user/kernel boundary experiments, the proposed method achieves an average data leakage accuracy of 90%, compared to only 10% for JCC-based techniques under identical conditions. The attack also demonstrates high efficacy in bypassing hardware isolation mechanisms. In Intel SMT environments, a covert channel is established with 97.5% accuracy and a throughput of 256.9 kbit/s (Table 4), exploiting timing discrepancies between correctly and incorrectly predicted branches during speculative execution. In trusted execution environments, the attack achieves 98% accuracy in extracting secret values from Intel SGX enclaves, highlighting the susceptibility of RCX-controlled speculation to enclave compromise. Additionally, KASLR is completely defeated by exploiting speculative page fault timing during loop instruction execution. 
Kernel base addresses are recovered deterministically in all test cases (Fig. 4), demonstrating the critical security implications of this attack vector.  Conclusions  This study identifies a critical vulnerability in modern speculative execution mechanisms by demonstrating that loop instructions (LOOP, LOOPZ, LOOPNZ) and JRCXZ, which rely on the RCX register for branch decisions, serve as novel vectors for transient execution attacks. The key contributions are threefold: (1) These instructions generate speculative execution windows that are, on average, 40% longer than those of JCC instructions. (2) Practical exploits are demonstrated across key hardware isolation boundaries, including user/kernel space, SMT, and Intel SGX enclaves, with success rates exceeding 90% in targeted scenarios. (3) The findings expose critical limitations in current Spectre defenses, indicating that existing mitigations are insufficient to address RCX-dependent speculative paths, thereby motivating the need for specialized countermeasures.
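On the receiver side, the SMT covert channel reduces to thresholding measured branch latencies: mispredicted (slow) executions encode one bit value, fast ones the other. The toy sketch below shows only this decoding step on synthetic cycle counts; real measurements come from the RDPMC-based profiling described above, and all numbers here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def decode_bits(cycles, threshold):
    """Read each timing sample: above-threshold latency is taken as a
    mispredicted (slow) branch encoding bit 1, fast samples as bit 0."""
    return (np.asarray(cycles) > threshold).astype(int)

# Hypothetical cycle counts: ~60 cycles when predicted, ~140 when not.
sent = rng.integers(0, 2, 1000)
cycles = np.where(sent == 1, 140.0, 60.0) + rng.normal(0.0, 15.0, 1000)
received = decode_bits(cycles, threshold=100.0)
print("accuracy:", np.mean(received == sent))
```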
Cover
2025, 47(8).  
[Abstract](112) [PDF 7629KB](24)
2025, 47(8): 1-4.  
[Abstract](81) [FullText HTML](31) [PDF 302KB](18)
Excellence Action Plan Leading Column
Optimal and Suboptimal Architectures of Millimeter-wave Large-scale Arrays for 6G
HONG Wei, XU Jun, CHEN Jixin, HAO Zhangcheng, ZHOU Jianyi, YU Zhiqiang, YANG Guangqi, JIANG Zhihao, YU Chao, HU Yun, HOU Debin, ZHU Xiaowei, CHEN Zhe, ZHOU Peigen
2025, 47(8): 2405-2415.   doi: 10.11999/JEIT250109
[Abstract](1010) [FullText HTML](436) [PDF 5723KB](317)
Abstract:
  Objective  Beamforming array technology is a key enabler for modern radio systems, evolving across three primary dimensions: advancements in electromagnetic theory, innovations in semiconductor manufacturing, and iterations in system architectures. The development of mobile communication has driven the progress of beamforming array technologies. Hybrid beamforming array technology, in particular, was established as a critical solution for 5G millimeter-wave communications under the 3GPP Release 15 standard. To address the needs of future 6G communication and sensing, millimeter-wave beamforming arrays will evolve towards ultra-large-scale (>1000 elements), intelligent (AI-driven), and heterogeneous (integration of photonics and electronics) architectures, serving as the foundation for ubiquitous intelligent connectivity. This article investigates optimal and suboptimal large-scale beamforming array architectures for 6G millimeter-wave communications.  Methods  Large-scale beamforming arrays can be classified into three primary types: analog-domain beamforming arrays, digital-domain beamforming arrays, and hybrid-domain beamforming arrays. These arrays can be further categorized into single-beam and multi-beam configurations based on the number of supported beams. Each category includes various architectural variants (Figure 1). Analog-domain beamforming arrays (Figure 2) consist of passive and active beamforming arrays. Active beamforming arrays are further divided into Radio Frequency (RF) phase-shifting, Intermediate Frequency (IF) phase-shifting, and Local Oscillator (LO) phase-shifting architectures. Digital-domain implementations (Figure 3) include symmetric and asymmetric full-digital beamforming architectures. Hybrid-domain configurations (Figure 4) offer various combinations, such as architectures integrating RF phase-shifting phased subarrays with digital beamforming, or hybrid multi-beam arrays that combine passive beamforming networks with digital processing. In terms of performance, the symmetric full-digital beamforming architecture (Figure 5) is considered the optimal solution among all beamforming arrays. However, it faces challenges such as high system costs, excessive power consumption, and increased complexity due to the need for numerous high-speed ADCs and real-time processing of large data streams. To address these limitations in symmetric full-digital multi-beam large-scale arrays, an asymmetric large-scale full-digital multi-beam array architecture was proposed (Figure 6). Additionally, a spatial-domain orthogonal hybrid beamforming array (Figure 8) is proposed, which uses differentiated beamforming techniques across spatial dimensions to implement large-scale hybrid beamforming arrays.  Results and Discussions  Current 5G millimeter-wave base stations primarily utilize hybrid beamforming massive MIMO array architectures (Figure 7), which integrate RF phase-shifting phased subarrays with digital-domain beamforming. In this configuration, each two-dimensional RF phase-shifting phased subarray is connected to a dedicated Up-Down Converter (UDC), followed by secondary digital beamforming processing. However, this architecture limits independent beam control flexibility, often leading to the abandonment of digital beamforming in practical implementations. Therefore, each beam achieves only subarray-level gain. 
For mobile communication base stations, the adoption of asymmetric full-digital phased arrays (Figure 6), which include large-scale transmit arrays and smaller receive arrays, offers an optimal balance between cost, power consumption, complexity, and performance. This configuration meets uplink/downlink traffic demands while enabling wider receive beams (corresponding to compact receive arrays) that support accurate angle-of-arrival estimation. Theoretically, hybrid multi-beam architectures that combine digital beamforming in the horizontal dimension with analog beamforming in the vertical dimension (or vice versa) can reduce system complexity, cost, and power consumption without degrading performance, potentially outperforming current 5G mmWave hybrid beamforming solutions. However, these architectures face limitations due to beam bundling. Building upon the 4-subarray hybrid beamforming architecture (256 elements) used in existing 5G mmWave base stations (Figure 7), a spatial-domain orthogonal hybrid beamforming array is proposed (Figure 8). In this configuration, vertical-dimension elements are grouped into sixteen 1D RF phase-shifting phased subarrays (16 elements each), with each subarray connected to individual UDCs and ADC/DAC chains. The 16 data streams are processed through horizontal-dimension digital beamforming, achieving spatial orthogonality between the analog and digital beamforming domains. This architecture preserves full-aperture gain for each beam, supporting enhanced transmission rates and system capacity. This hybrid multi-beam solution maintains the same beamforming chip count as conventional hybrid architectures (Figure 7) for identical array scales, requiring only an increase in UDC channels from 4 to 16, with minimal cost impact. The proposed solution supports 16 simultaneous beams/data streams, resulting in a significant capacity improvement. For dual-polarization configurations, this extends to 32 beams/data streams, further enhancing system capacity. In horizontal-digital/vertical-analog implementations, all beams align along the horizontal plane, with vertical scanning limited to simultaneous elevation adjustment (Figure 9). Although vertical beam grouping enables independent elevation control, it results in beam gain degradation.  Conclusions  From a performance perspective, the symmetric full-digital beamforming array architecture can be considered the optimal solution. However, it is hindered by high system complexity and cost. The asymmetric full-digital beamforming array architecture significantly reduces system complexity and cost while closely approaching the performance of its symmetric counterpart, making it a suboptimal yet practical choice for large-scale beamforming arrays. Additionally, the spatially orthogonal hybrid beamforming array architecture—such as the design combining vertical analog phased subarrays with horizontal digital beamforming—represents another suboptimal solution. Notably, this hybrid architecture outperforms current 5G mmWave hybrid beamforming systems in terms of performance.
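The spatial-orthogonal idea can be written compactly as an outer product of an analog (vertical) steering vector and per-beam digital (horizontal) weights. The numpy sketch below, with illustrative array sizes and angles rather than any deployed configuration, shows why every beam retains full-aperture gain.

```python
import numpy as np

N_V, N_H = 16, 16   # elements per vertical subarray, number of subarrays
d = 0.5             # element spacing in wavelengths

def steering(n, sin_theta):
    """Uniform linear array steering vector for n elements."""
    return np.exp(1j * 2 * np.pi * d * np.arange(n) * sin_theta)

# Analog stage: one RF phase-shift vector, shared by all 16 vertical
# subarrays, points the array to a common elevation angle.
w_analog = steering(N_V, np.sin(np.deg2rad(10))) / np.sqrt(N_V)

# Digital stage: one independent horizontal beam per data stream.
azimuths = np.deg2rad(np.linspace(-50, 50, N_H))
W_digital = np.stack([steering(N_H, np.sin(a)) / np.sqrt(N_H) for a in azimuths])

# Full-array weight of beam k is the outer product of the two stages, so
# each of the 16 beams spans all 256 elements (full-aperture gain).
W_full = np.einsum('v,kh->kvh', w_analog, W_digital)
print(W_full.shape)   # (16 beams, 16 vertical, 16 horizontal)
```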
Overviews
Advances and Challenges in Wireless Channel Hardware Twin
FANG Sheng, ZHU Qiuming, XIE Yuetian, JIANG Hao, LI Hui, WU Qihui, MAO Kai, HUA Boyu
2025, 47(8): 2416-2428.   doi: 10.11999/JEIT241093
[Abstract](224) [FullText HTML](151) [PDF 1738KB](48)
Abstract:
  Significance   Wireless channel characteristics are key determinants of communication system performance and reliability. Channel twin technology—defined as the physical or digital reproduction of channel distortion effects—has become essential for system testing and validation. Digital twin approaches are favored for their flexibility and efficiency, and hardware-based platforms (i.e., channel emulators, CEs) are widely used for large-scale performance evaluation. However, as networks advance toward Terahertz (THz) bands, extremely large-scale massive Multiple-Input Multiple-Output (XL-MIMO) systems (e.g., 1024-antenna arrays), and integrated air-space-ground-sea communications, three key limitations remain: (1) Inability to support real-time processing for ultra-wide bandwidths over 10 GHz; (2) Inadequate dynamic emulation accuracy for non-stationary channels under high mobility; and (3) Insufficient hardware resources for simulating over 10⁶ concurrent channels in XL-MIMO architectures. This study reviews state-of-the-art hardware twin technologies, identifies critical technical bottlenecks, and outlines future directions for next-generation channel emulation platforms.  Progress   Existing channel hardware twin technologies can be categorized into three paradigms based on channel data sources, model types, and implementation architectures. First, measured data-driven twin technology uses real-world propagation data to enable high-fidelity emulation. Signal replay reproduces specific electromagnetic environments by replaying recorded signals. Although this approach preserves scene-specific authenticity, it lacks flexibility due to dependence on measured datasets and storage constraints. Channel Impulse Response (CIR) replay extracts propagation characteristics from measurement data, making it applicable to unmodeled environments such as underwater acoustics. However, its accuracy depends on precise channel estimation and is limited by data sampling resolution and storage capacity. Second, deterministic model-driven twin technology generates CIR using Finite Impulse Response (FIR) filters by synthesizing multipath delays and fading coefficients for predefined scenarios. Techniques such as sparse filtering and subspace projection optimize the trade-off between accuracy and hardware efficiency. For example, current large-scale emulators support 256×256 MIMO systems with 512-tap FIR filters, requiring only four active taps. Nonetheless, limited clock resolution introduces phase distortion in the frequency response, reducing fidelity in high-frequency terahertz applications. Third, statistical model-driven twin technology emulates time-varying channel behavior by generating fading and delay profiles based on probability distributions. The sum-of-sinusoids method is widely employed due to its simplicity and low computational demand, while enhanced implementations—such as the Coordinate Rotation Digital Computer (CORDIC) algorithm—minimize storage requirements for sinusoid generation. This paradigm offers strong scalability but sacrifices scenario-specific fidelity, limiting its ability to reproduce certain channel characteristics accurately. A comparative analysis across fidelity, flexibility, scalability, and implementation complexity shows that measured data-driven methods are best suited for reproducing real-world environments; deterministic models support configurable scenario design for known settings; and statistical models facilitate efficient emulation of large-scale networks. 
Each approach balances distinct advantages against inherent limitations.  Prospects   Future developments in channel hardware twin technologies are expected to integrate emerging innovations to overcome current limitations: (1) Deep learning techniques—such as generative adversarial networks—can learn from limited measured channel data to synthesize channel characteristics, reducing dependence on extensive datasets in measured data-driven approaches. (2) The environment-aware capabilities of next-generation communication networks enable dynamic reconstruction of propagation environments, addressing the lack of real-time adaptability in deterministic model-driven technologies. (3) Transfer learning enables the migration of knowledge across propagation scenarios, improving the cross-scenario generalization of statistical model-driven emulation without requiring large amounts of measured data. Future applications of channel hardware twin technologies are expected to advance in three primary directions: (1) real-time optimization of communication systems; (2) network planning and design; and (3) testing and evaluation of electromagnetic devices. Through the integration of deep learning and environmental sensing, hardware twin platforms will support intelligent, self-adaptive communication systems capable of meeting the increasing complexity of future network demands.  Conclusions  This review synthesizes recent progress in channel hardware twin technologies and addresses critical challenges posed by future communication scenarios characterized by ultra-wide bandwidths, high channel dynamics, and large-scale networking. Key issues include high-frequency wideband signal processing, emulation of non-stationary dynamic environments, and scalability to large multi-branch network architectures. A classification framework is proposed, categorizing existing hardware twin approaches into three paradigms—measured data-driven, deterministic model-driven, and statistical model-driven—based on data sources, modeling strategies, and implementation architectures. A comparative analysis of these paradigms evaluates their relative strengths and limitations in terms of authenticity, flexibility, scalability, and emulation duration, providing guidance for selecting appropriate emulation strategies in complex environments. Furthermore, this study explores the integration of emerging technologies such as generative networks, environmental sensing, and transfer learning to support data-efficient generation, dynamic scenario adaptation, and cross-scene generalization. These advancements are expected to enhance the efficiency and adaptability of channel hardware twins, enabling them to meet the stringent requirements of future communication systems in performance validation, network design, and device testing. This work offers a foundation for advancing innovation in channel hardware twin technologies and accelerating the development of next-generation wireless networks.
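A minimal software analogue of the deterministic paradigm is a sparse FIR line whose few active taps carry sum-of-sinusoids fading, as sketched below. The sample rate, tap positions, gains, and Doppler spread are illustrative assumptions, not parameters of any emulator surveyed here.

```python
import numpy as np

fs = 100e6                     # sample rate (illustrative, not from a CE)
delays = [3, 57, 190, 402]     # the 4 active taps of a 512-tap FIR line
gains = [1.0, 0.5, 0.25, 0.1]  # static path gains

def rayleigh_sos(n, fd, fs, n_sin=16, seed=0):
    """Sum-of-sinusoids Rayleigh fading: superpose n_sin sinusoids with
    random Doppler frequencies (at most fd) and random phases."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    f = fd * np.cos(2 * np.pi * rng.random(n_sin))
    ph_i = 2 * np.pi * rng.random(n_sin)
    ph_q = 2 * np.pi * rng.random(n_sin)
    i = np.cos(2 * np.pi * np.outer(f, t) + ph_i[:, None]).sum(0)
    q = np.sin(2 * np.pi * np.outer(f, t) + ph_q[:, None]).sum(0)
    return (i + 1j * q) / np.sqrt(n_sin)

x = np.exp(2j * np.pi * 1e6 * np.arange(4096) / fs)   # test tone
y = np.zeros_like(x)
for d, g in zip(delays, gains):        # sparse FIR: only active taps are
    fade = rayleigh_sos(x.size, fd=200.0, fs=fs, seed=d)   # ever computed
    y[d:] += g * fade[d:] * x[:x.size - d]
```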
Exploration of Application of Artificial Intelligence Technology in Underwater Acoustic Network Routing Protocols
ZHAO Yihao, CHEN Yougan, LI Jianghui, WAN Lei, TAO Yi, WANG Xuchen, DONG Yanhan, TU Shen’ao, XU Xiaomei
2025, 47(8): 2429-2447.   doi: 10.11999/JEIT250110
[Abstract](404) [FullText HTML](163) [PDF 2698KB](70)
Abstract:
  Significance   In response to the strategic emphasis on maritime power, China has experienced growing demand for ocean resource exploration, ecological monitoring, and defense applications. Underwater acoustic networks provide an effective solution for data acquisition in these domains, with network performance largely dependent on the design and implementation of routing protocols. These protocols determine the transmission path and method, forming a foundation for optimizing underwater communication. Recent advances in Artificial Intelligence (AI) have prompted efforts to apply AI techniques to underwater acoustic network routing. By leveraging AI’s learning capacity, data insight capability, and adaptability, researchers aim to address challenges posed by dynamic underwater environments, energy limitations of nodes, and potential security threats. This paper examines the integration of AI technology into underwater acoustic network routing protocols and provides a critical evaluation of current research progress.   Progress   This paper reviews the application of AI technology in underwater acoustic network routing protocols, classifying existing approaches into flat and hierarchical routing categories. In flat routing, AI methods such as conventional heuristic algorithms, reinforcement learning, and deep learning have been applied to improve routing decisions. For hierarchical routing, AI is utilized not only for routing optimization but also for node clustering and layer structuring. These applications offer potential benefits, including enhanced routing efficiency, reduced energy consumption, improved end-to-end delay, and strengthened network security. Most performance evaluations are based on simulations. However, simulation environments vary considerably across studies, particularly in node quantity and density, ranging from small-scale to very large-scale networks. This variability complicates quantitative comparisons of performance metrics. Additionally, replicating these simulation scenarios in sea trials is limited by the logistical and financial constraints of deploying and recovering large numbers of nodes, thus impeding the validation of protocol performance under real-world conditions. The review further identifies critical challenges in applying AI to underwater acoustic networks. Many AI-based protocols operate under impractical assumptions, such as global knowledge of node positions and energy levels, which is rarely achievable in dynamic underwater settings. Maintaining such information requires substantial communication overhead, thereby increasing energy consumption and delay. Furthermore, the computational complexity of AI algorithms—particularly deep learning models—presents difficulties for implementation on underwater nodes with limited power, processing, and storage capacities. Few studies provide detailed complexity analyses, and hardware-based performance verifications remain scarce. This lack of real-world validation limits the assessment of the practical feasibility and effectiveness of AI-enabled routing protocols.  Conclusions  AI technology offers considerable potential for enhancing underwater acoustic network routing protocols by addressing key challenges such as environmental variability, energy constraints, and security threats. However, current research is constrained by several limitations. Many studies rely on unrealistic assumptions regarding the availability of complete node information, which is impractical in dynamic underwater settings. 
The acquisition and maintenance of such information entail substantial communication overhead, leading to increased energy consumption and delay. Moreover, the computational demands of AI algorithms—particularly deep learning models—often exceed the capabilities of resource-limited underwater nodes. Performance assessments remain predominantly simulation-based, with limited hardware implementation, thereby restricting the verification of real-world feasibility and effectiveness.  Prospects  Future research should prioritize the development of more accurate and realistic simulation platforms to support the evaluation of AI-based routing protocols. This includes the integration of advanced channel models and real-world observational data to improve simulation fidelity. Establishing standardized simulation conditions will also be essential for enabling consistent performance comparisons across studies. In parallel, greater emphasis should be placed on hardware implementation of AI algorithms, with efforts directed toward reducing algorithmic complexity and storage demands to accommodate the limitations of energy-constrained underwater nodes. Exploring cost-effective validation approaches, such as small-scale sea trials and semi-physical simulation frameworks, will be critical for assessing the practical performance and deployment feasibility of AI-enabled routing protocols.
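As a concrete instance of the reinforcement-learning approaches surveyed above, the sketch below learns next-hop choices with tabular Q-learning on a toy five-node topology. The reward shaping and topology are hypothetical; practical protocols additionally weigh residual energy, delay, and link quality.

```python
import random

# Toy topology: node -> candidate next hops toward the sink "S".
neighbors = {"A": ["B", "C"], "B": ["S"], "C": ["D"], "D": ["S"], "S": []}
Q = {(n, h): 0.0 for n in neighbors for h in neighbors[n]}
alpha, gamma, eps = 0.5, 0.9, 0.1

def reward(hop):
    return 10.0 if hop == "S" else -1.0   # reaching the sink vs. per-hop cost

for _ in range(500):                      # learn over repeated deliveries
    node = "A"
    while node != "S":
        hops = neighbors[node]
        hop = (random.choice(hops) if random.random() < eps
               else max(hops, key=lambda h: Q[(node, h)]))
        nxt = max((Q[(hop, h)] for h in neighbors[hop]), default=0.0)
        Q[(node, hop)] += alpha * (reward(hop) + gamma * nxt - Q[(node, hop)])
        node = hop

print(max(neighbors["A"], key=lambda h: Q[("A", h)]))  # learned hop: "B"
```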
An Overview of Resource Management Technology of 6G Integrated Communication, Sensing, and Computation Enabled Satellite-Terrestrial Intelligent Network
WU Yanyan, WU Song, DENG Wei
2025, 47(8): 2448-2472.   doi: 10.11999/JEIT250140
[Abstract](305) [FullText HTML](209) [PDF 5140KB](74)
Abstract:
  Significance   As 6G mobile communication systems continue to evolve, Integrated Communication, Sensing, and Computation (ICSC) technology has emerged as a key area of research. ICSC not only improves network performance but also meets increasingly diverse and personalized user requirements. Recent progress in spectrum sharing, high-precision sensing algorithms, dynamic computing resource scheduling, and Artificial Intelligence (AI) has supported the development of 6G networks. However, several challenges remain. These include inefficient spectrum utilization, limited accuracy and real-time performance of sensing algorithms, and insufficient adaptability and intelligence in computing resource scheduling strategies. Moreover, integrating these technologies into the 6G ICSC Enabled Satellite-Terrestrial Intelligent Network (6G-ICSC-STIN) for effective resource management and optimal allocation is an unresolved issue. To address demands for high bandwidth, low latency, and wide coverage in future networks, a distributed intelligent resource management strategy is designed. Based on this approach, a resource management framework combining game theory and multi-agent reinforcement learning is proposed, offering guidance for advancing resource management in 6G-ICSC-STIN systems.  Progress   This paper provides a comprehensive discussion of resource management technologies for 6G ICSC Enabled Satellite–Terrestrial Intelligent Networks (6G-ICSC-STIN). It summarizes key technological advances driving the field and presents recent progress in four core areas: spectrum sharing, high-precision sensing algorithms, dynamic computing resource scheduling, and the application of AI in ICSC systems. Measurement indicators for ICSC performance are also examined. Based on this review, a 6G-ICSC-STIN architecture is proposed (Fig. 2), integrating 6G communication, sensing, computation, and intelligent coordination technologies. This architecture fully leverages the capabilities of satellites, unmanned aerial vehicles, High-Altitude Platforms (HAPs), and ground terminals to enable seamless and full-domain coverage across space, air, ground, and sea. It supports deep integration of communication, sensing, computation, intelligence, and security, resulting in a unified network system characterized by more precise perception and transmission, improved resource coordination, lower system overhead, and enhanced user experience. To address complex resource management challenges, a functional block diagram comprising the application, service, capability, and resource layers is introduced (Fig. 3), aiming to identify new approaches for efficient resource allocation. A distributed intelligent resource management strategy is further proposed for the ICSC central network, fog nodes, edge networks, and terminals (Fig. 4). Within the integrated edge network, a novel “Master–Slave two-level edge node” architecture is designed, in which the Master node deploys a resource demand prediction model to estimate regional demand in real time (Fig. 6). Building on this strategy, a resource management framework based on game theory and multi-agent reinforcement learning is proposed (Fig. 5). This framework employs the Nash-Equilibrium Asynchronous Advantage Actor-Critic (Nash-E-A3C) algorithm, adopts a parallelized multi-agent and distributed computing approach, and integrates Nash equilibrium theory (Fig. 7), with the aim of achieving intelligent, collaborative, and efficient network resource management.  
Conclusions  The distributed intelligent resource management strategy is essential for achieving efficient resource coordination and optimal utilization in the 6G-ICSC-STIN architecture. By decentralizing computing, storage, and communication resources across network nodes, it enables resource sharing and collaborative operation. The proposed architecture, grounded in game theory and multi-agent reinforcement learning, supports dynamic resource allocation and optimization. Agents are deployed at each node, where they make decisions based on local demands and environmental conditions using game-theoretic reasoning and Reinforcement Learning (RL) algorithms. This approach enables globally efficient resource management across the network.  Prospects   Cross-domain technological integration is fundamental to the realization of 6G-ICSC-STIN. Deep integration of sensing, communication, and computing capabilities can substantially enhance overall network performance and efficiency. However, this integration faces several challenges, including heterogeneous network compatibility, complex resource scheduling, fragmented security mechanisms, and slow progress in standardization. Efficient resource representation is critical for effective resource management and performance optimization. Existing studies show that resources in satellite-terrestrial integrated networks are heterogeneous, multidimensional, and unevenly distributed across large spatiotemporal scales, posing new challenges to resource coordination. This paper outlines future development trends in intelligent resource management for 6G-ICSC-STIN, synthesizing current research progress, key challenges, and future directions in cross-domain technology fusion and resource representation. These emerging technologies together form a foundation for intelligent and efficient resource management in 6G-ICSC-STIN and offer new pathways for the advancement of next-generation wireless communication systems.
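A stripped-down view of the game-theoretic component is best-response dynamics on a two-node spectrum-sharing game, sketched below. The quadratic utility is a toy assumption and stands in for, rather than reproduces, the Nash-E-A3C formulation.

```python
# Best-response dynamics for a two-node spectrum-sharing game (a toy
# Cournot-style stand-in, not the paper's Nash-E-A3C formulation).
# Utility of node i: u_i = a*x_i - x_i**2 - b*x_i*x_j, where x_i is the
# bandwidth share it claims and the b term models mutual interference.
a, b = 10.0, 0.5

def best_response(x_other):
    # Maximizer of u_i given the other claim: du/dx = a - 2x - b*x_other = 0.
    return max(0.0, (a - b * x_other) / 2.0)

x1 = x2 = 0.0
for _ in range(50):                 # iterate until the claims stop moving
    x1, x2 = best_response(x2), best_response(x1)

print(round(x1, 3), round(x2, 3))   # Nash equilibrium: a / (2 + b) = 4.0
```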
A Review on Action Recognition Based on Contrastive Learning
SUN Zhonghua, WU Shuang, JIA Kebin, FENG Jinchao, LIU Pengyu
2025, 47(8): 2473-2485.   doi: 10.11999/JEIT250131
[Abstract](325) [FullText HTML](248) [PDF 1325KB](40)
Abstract:
  Significance   Action recognition is a key topic in computer vision research and has evolved into an interdisciplinary area integrating computer vision, deep learning, and pattern recognition. It seeks to identify human actions by analyzing diverse modalities, including skeleton sequences, RGB images, depth maps, and video frames. Currently, action recognition plays a central role in human-computer interaction, video surveillance, virtual reality, and intelligent security systems. Its broad application potential has led to increasing attention in recent years. However, the task remains challenging due to the large number of action categories and significant intra-class variation. A major barrier to improving recognition accuracy is the reliance on large-scale annotated datasets, which are costly and time-consuming to construct. Contrastive learning offers a promising solution to this problem. Since its initial proposal in 1992, contrastive learning has undergone substantial development, yielding a series of advanced models that have demonstrated strong performance when applied to action recognition.  Progress   Recent developments in contrastive learning-based action recognition methods are comprehensively reviewed. Contrastive learning is categorized into three stages: traditional contrastive learning, clustering-based contrastive learning, and contrastive learning without negative samples. In the traditional contrastive learning stage, mainstream action recognition approaches are examined with reference to the Simple framework for Contrastive Learning of visual Representations (SimCLR) and Momentum Contrast v2 (MoCo-v2). For SimCLR-based methods, the principles are discussed progressively across three dimensions: temporal contrast, spatio-temporal contrast, and the integration of spatio-temporal and global-local contrast. For MoCo-v2, early applications in action recognition are briefly introduced, followed by methods proposed to enrich the positive sample set. Cross-view complementarity is addressed through a summary of methods incorporating knowledge distillation. For different data modalities, approaches that exploit the hierarchical structure of human skeletons are reviewed. In the clustering-based stage, methods are examined under the frameworks of Prototypical Contrastive Learning (PCL) and Swapping Assignments between multiple Views of the same image (SwAV). For contrastive learning without negative samples, representative methods based on Bootstrap Your Own Latent (BYOL) and Simple Siamese networks (SimSiam) are analyzed. Additionally, the roles of data augmentation and encoder design in the integration of contrastive learning with action recognition are discussed in detail. Data augmentation strategies are primarily dependent on input modality and dimensionality, whereas encoder selection is guided by the characteristics of the input and its representation mapping. Various contrastive loss functions are categorized systematically, and their corresponding formulas are provided. Several benchmark datasets used for evaluation are introduced. Performance results of the reviewed methods are presented under three categories: unsupervised single-stream, unsupervised multi-stream, and semi-supervised approaches. Finally, the methods are compared both horizontally (across techniques) and vertically (across stages).  Conclusions  In the data augmentation analysis, two dimensions are considered: modality and transformation type. 
For RGB images or video frames, which contain rich pixel-level information, augmentations such as spatial cropping, horizontal flipping, color jittering, grayscale conversion, and Gaussian blurring are commonly applied. These operations generate varied views of the same content without altering its semantic meaning. For skeleton sequences, augmentation methods are selected to preserve structural integrity. Common strategies include shearing, rotation, scaling, and the use of view-invariant coordinate systems. Skeleton data can also be segmented by individual joints, multiple joints, all joints, or along spatial and temporal axes separately. Regarding dimensional transformations, spatial augmentations include cropping, flipping, rotation, and axis masking, all of which enhance the salience of key spatial features. Temporal transformations apply time-domain cropping and flipping, or resampling to different frame rates, to leverage temporal continuity and short-term action invariance. Spatio-temporal transformations typically use Gaussian blur and Gaussian noise to simulate real-world perturbations while preserving overall action semantics. For encoder selection, temporal modeling commonly uses Gated Recurrent Units (GRUs), Long Short-Term Memory networks (LSTMs), and Sequence-to-Sequence (S2S) models. LSTM is suitable for long-term temporal dependencies, while bidirectional GRU captures temporal patterns in both forward and backward directions, allowing for richer temporal representations. Spatial encoders are typically based on the ResNet architecture. ResNet18, a shallower model, is preferred for small datasets or low-resource scenarios, whereas ResNet50, a deeper model, is better suited for complex feature extraction on larger datasets. For spatio-temporal encoding, Spatio-Temporal Graph Convolutional Networks (ST-GCNs) are employed to jointly model spatial configurations and temporal dynamics of skeletal actions. In the experimental evaluation, performance comparisons of the reviewed methods yield several constructive insights and summaries, providing guidance for future research on contrastive learning in action recognition.  Prospects   The limitations and potential developments of action recognition methods based on contrastive learning are discussed from three aspects: runtime efficiency, the quality of negative samples, and the design of contrastive loss functions.
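Most of the reviewed traditional-stage methods optimize a variant of the InfoNCE (NT-Xent) objective. The numpy sketch below shows a simplified form in which only cross-view samples serve as negatives; SimCLR's full loss also counts within-view negatives.

```python
import numpy as np

def info_nce(z1, z2, tau=0.1):
    """Simplified NT-Xent / InfoNCE: for each embedding in z1, the same
    index in z2 (its augmented view) is the positive; every other sample
    in the batch is a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                     # cosine similarity / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

rng = np.random.default_rng(0)
z_a = rng.standard_normal((32, 128))             # two augmented views
z_b = z_a + 0.1 * rng.standard_normal((32, 128))
print(info_nce(z_a, z_b))
```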
A Review of Electronic Skin and Its Application in Clinical Diagnosis and Treatment of Traditional Chinese Medicine
WANG Zheng, MI Jinpeng, CHEN Guodong
2025, 47(8): 2486-2498.   doi: 10.11999/JEIT250148
[Abstract](263) [FullText HTML](171) [PDF 7032KB](17)
Abstract:
Integrating Electronic Skin (e-skin) into Traditional Chinese Medicine (TCM) diagnostics offers a novel approach to addressing long-standing issues of standardization and objectivity. Core diagnostic practices in TCM (pulse assessment, tongue analysis, and acupuncture) are predominantly based on subjective interpretation, which hinders reproducibility and limits broader clinical acceptance. This review examines recent advances in e-skin technology, including flexible electronics, multimodal sensing, and Artificial Intelligence (AI), and discusses their potential to support quantifiable, data-driven diagnostic frameworks. These developments may provide a technological basis for modernizing TCM while maintaining its holistic orientation. This review systematically examines the convergence of TCM clinical requirements and e-skin technologies through a comprehensive survey of over 60 peer-reviewed studies and patents published between 2015 and 2024. First, the current state of e-skin research is mapped onto the diagnostic needs of TCM, with a focus on material flexibility, multisensory integration, and energy autonomy. Second, key technical challenges are identified through comparative analysis of sensor performance metrics (e.g., sensitivity, durability) and TCM-specific biomarker detection requirements. Third, a framework is proposed for optimizing e-skin architectures in accordance with TCM’s systemic diagnostic logic. The analysis highlights three technical domains: (1) Material innovations: Graphene-polymer composites and liquid metal-hydrogel interfaces that enable conformal adherence to dynamic biological surfaces (Fig. 3). (2) Multimodal sensing: Heterogeneous sensor arrays capable of synchronously capturing pulse waveforms, tongue coatings, and acupoint bioimpedance (Table 1). (3) AI-driven signal interpretation: Deep learning models such as ResNet-1D and transformer networks for classifying TCM pulse patterns and body constitutions. e-skin technologies have advanced significantly in supporting the digital transformation of TCM through innovations in materials, sensing functions, and algorithmic design. In pulse diagnosis, graphene-based sensor arrays achieve 89.3% classification accuracy across 27 pulse categories (Table 2), exceeding manual assessments (Kappa: 0.72 vs. 0.51) by quantifying nuanced differences in pulse types such as “slippery” and “wiry” (Fig. 1). For tongue diagnosis, MXene-enabled multispectral imaging (400~1000 nm) supports automated analysis of coating thickness with an F1-score of 0.91, and reveals thermal-humidity gradients correlated with Yang Deficiency patterns (Fig. 6). Acupuncture standardization has improved through the use of piezoresistive needle arrays, which reduce insertion depth errors to ±0.3 mm. Integration with machine learning further enables classification of nine TCM body constitutions at 86.4% accuracy, supporting personalized therapeutic strategies (Fig. 5). Despite these achievements, key technical limitations remain. Material degradation and signal synchronization latency over 72 ms restrict real-time applications. Variability in sensor specifications (sampling rates from 50 to 2,000 Hz) and the lack of quantifiable biomarkers for TCM concepts such as Qi-Stagnation continue to hinder clinical validation (Table 2). Future research should focus on: (1) Self-healing materials: Bioinspired hydrogels with strain tolerance over 300% and enhanced fatigue resistance. 
(2) Edge-AI architectures: Lightweight transformer-CNN hybrids optimized for reduced latency (<20 ms). (3) TCM-specific biomarkers: Electrochemical sensors designed to detect molecular correlates of Yin-Yang imbalances. This review outlines a roadmap for modernizing TCM through e-skin integration by aligning technological advances with clinical requirements. Three key insights are emphasized: (1) Material-device co-design: Engineering stretchable electronics to accommodate the dynamic diagnostic contexts of TCM. (2) Multimodal data fusion: Combining pulse, tongue, and meridian signals to support systemic pattern differentiation. (3) Regulatory frameworks: Establishing TCM-oriented standards for sensor validation and clinical reliability. Emerging applications, including Internet of Things (IoT)-connected e-skin patches for continuous Zang-Fu organ monitoring and AI-guided acupuncture robotics, illustrate the field’s transformative potential. By 2030, the interdisciplinary integration of flexible electronics, artificial intelligence, and TCM principles is projected to enable e-skin diagnostic systems to be adopted in 40% of tertiary hospitals, supporting the transition of TCM toward a globally recognized precision medicine paradigm.
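Where the review cites ResNet-1D models for pulse-pattern classification, the following minimal sketch illustrates the idea in Python; the channel counts, kernel sizes, and the 27-category head are illustrative assumptions keyed to the pulse categories discussed above, not a published architecture.

```python
# A minimal sketch of a ResNet-1D pulse-waveform classifier (assumed layout).
import torch
import torch.nn as nn

class ResBlock1D(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(ch, ch, 7, padding=3), nn.BatchNorm1d(ch), nn.ReLU(),
            nn.Conv1d(ch, ch, 7, padding=3), nn.BatchNorm1d(ch))

    def forward(self, x):
        return torch.relu(x + self.body(x))        # identity shortcut

model = nn.Sequential(
    nn.Conv1d(1, 32, 7, padding=3), nn.ReLU(),
    ResBlock1D(32), ResBlock1D(32),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 27))                             # 27 pulse categories (assumed)

logits = model(torch.randn(8, 1, 1024))            # (batch, channels, samples)
print(logits.shape)                                # torch.Size([8, 27])
```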
DTDS: Dilithium Dataset for Power Analysis
YUAN Qingjun, ZHANG Haojin, FAN Haopeng, GAO Yang, WANG Yongjuan
2025, 47(8): 2499-2508.   doi: 10.11999/JEIT250048
[Abstract](372) [FullText HTML](213) [PDF 3135KB](76)
Abstract:
  Objective  The development of quantum computing threatens the security of traditional cryptosystems and has accelerated the research and standardization of post-quantum cryptographic algorithms. The Dilithium digital signature algorithm is based on lattice theory and was selected by the U.S. National Institute of Standards and Technology (NIST) as a post-quantum cryptographic standard in 2024. Meanwhile, side-channel analysis of Dilithium, especially power analysis, has become a research hotspot. However, existing power analysis datasets mainly target classical block ciphers such as AES, and the lack of datasets for newer algorithms such as Dilithium restricts research on side-channel analysis methods.  Results and Discussions  To address this gap, this paper collects and releases the first power analysis dataset for the Dilithium algorithm, aiming to facilitate research on power analysis of post-quantum cryptographic algorithms. The dataset is based on the open-source reference implementation of Dilithium, running on a Cortex-M4 processor and captured by a dedicated device, and contains 60,000 traces captured during the Dilithium signature process, together with the signature source data and sensitive intermediate values corresponding to each trace.  Conclusions  The constructed DTDS dataset is further visualized and analyzed, and the execution process of the random polynomial generation function polyz_unpack and its effect on the traces are investigated in detail. Finally, the dataset is modeled and tested using template analysis and deep learning-based analysis to verify its validity and usefulness. The dataset and code are available at https://doi.org/10.57760/sciencedb.j00173.00001.
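As an illustration of how such a trace set is typically consumed, the following is a minimal sketch of correlation power analysis against a Hamming-weight leakage model; the file names, array shapes, and 8-bit intermediate values are assumptions, not the documented DTDS layout.

```python
# A minimal sketch of correlation power analysis (CPA) on a trace set like DTDS.
import numpy as np

traces = np.load("dtds_traces.npy")          # shape: (n_traces, n_samples), assumed
ivs = np.load("dtds_intermediates.npy")      # shape: (n_traces,), assumed 8-bit values

# Hamming-weight leakage model for the sensitive intermediate values.
hw = np.unpackbits(ivs.astype(np.uint8)[:, None], axis=1).sum(axis=1)

# Pearson correlation between the model and every sample point of the traces.
t_c = traces - traces.mean(axis=0)
h_c = hw - hw.mean()
corr = (h_c @ t_c) / (np.linalg.norm(h_c) * np.linalg.norm(t_c, axis=0))

print("max |corr| =", np.abs(corr).max(), "at sample", np.abs(corr).argmax())
```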
Wireless Communication and Internet of Things
Adaptive Multi-Mode Blind Equalization Scheme for OFDM-NOMA Systems
YANG Long, YU Kaixin, LI Jin, JIA Ziyi
2025, 47(8): 2509-2520.   doi: 10.11999/JEIT250153
[Abstract](171) [FullText HTML](102) [PDF 5593KB](27)
Abstract:
  Objective  Orthogonal Frequency Division Multiplexing (OFDM) combined with Non-Orthogonal Multiple Access (NOMA) is widely applied in next-generation wireless communication systems for its high spectral efficiency and support for concurrent multi-user transmission. However, in downlink transmission, the superposition of signals from multiple users on the same subcarrier yields non-standard Quadrature Amplitude Modulation (QAM) constellations, rendering conventional equalization techniques ineffective. In addition, channel variability and impulsive noise introduce severe distortion, further degrading system performance. To overcome these limitations, this paper proposes an unsupervised adaptive multi-mode blind equalization scheme designed for OFDM-NOMA systems.  Methods  The proposed equalization scheme combines the Multi-Mode Algorithm (MMA) with a Soft-Decision Directed (SDD) strategy to construct an adaptive cost function. This function incorporates the power allocation factors of NOMA users to compensate for amplitude and phase distortions introduced by the wireless channel. To minimize the cost function efficiently, an optimized Newton method is employed, which avoids direct matrix inversion to reduce computational complexity. An iterative update rule is derived to enable fast convergence with low processing overhead. The algorithm is implemented on a real-time Software-Defined Radio (SDR) system using the GNU Radio platform for practical validation.  Results and Discussions  Simulation results show that the proposed equalization algorithm substantially outperforms conventional methods in both convergence speed and accuracy. Compared with the traditional Minimum Mean Square Error (MMSE) algorithm, it reduces convergence time by 90% while achieving comparable performance without the use of pilot signals (Fig. 8). Constellation diagrams before and after equalization confirm that the algorithm effectively restores non-standard QAM constellations distorted by NOMA signal superposition (Fig. 9). The method also demonstrates strong robustness to impulsive noise and dynamic channel variations. Complexity analysis indicates that the proposed algorithm incurs lower computational overhead than conventional Newton-based equalization approaches (Table 1). Experimental validation on the GNU Radio platform confirms its ability to separate user signals and support accurate decoding in real-world OFDM-NOMA downlink conditions (Fig. 12).  Conclusions  This study presents a blind equalization scheme for OFDM-NOMA systems based on an MMA-SDD adaptive cost function and an optimized Newton method. The proposed algorithm compensates for amplitude and phase distortions, enabling reliable signal recovery without pilot information. Theoretical analysis, simulation results, and experimental validation confirm its fast convergence, robustness to noise, and low computational complexity. These characteristics support its potential for practical deployment in future NOMA-based wireless communication networks.
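For intuition, the following is a minimal sketch of the multi-modulus update at the heart of such schemes, using a plain stochastic-gradient step; the paper's soft-decision-directed term and optimized Newton iteration are omitted, and the modulus constants are illustrative rather than taken from the paper.

```python
# A minimal sketch of a multi-modulus algorithm (MMA) blind equalizer with
# an LMS-style gradient update.
import numpy as np

def mma_equalize(x, n_taps=11, mu=1e-3, r2_re=1.32, r2_im=1.32):
    """Blind equalization of a complex baseband sequence x."""
    w = np.zeros(n_taps, dtype=complex)
    w[n_taps // 2] = 1.0                      # center-spike initialization
    y = np.zeros(len(x), dtype=complex)
    for n in range(n_taps, len(x)):
        u = x[n - n_taps:n][::-1]             # regressor (most recent first)
        y[n] = w @ u
        # MMA error: real and imaginary rails driven to separate moduli
        e = (y[n].real * (y[n].real**2 - r2_re)
             + 1j * y[n].imag * (y[n].imag**2 - r2_im))
        w -= mu * e * np.conj(u)              # stochastic gradient step
    return y, w
```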
Anti-interrupted Sampling Repeater Jamming Method Based on down-sampling Processing Blind Source Separation
LIU Yipin, YU Lei, WEI Yinsheng
2025, 47(8): 2521-2534.   doi: 10.11999/JEIT250193
[Abstract](232) [FullText HTML](78) [PDF 6246KB](45)
Abstract:
  Objective  Advancements in radar jamming technology have made coherent jamming generated by Digital Radio Frequency Memory (DRFM) a significant threat to radar detection. This type of jamming exhibits considerable spectral overlap with target echo signals and shares similar time-frequency characteristics. Even after matched filtering is applied to the received signal, the jamming can still achieve high gain. Among various forms, Interrupted Sampling Repeater Jamming (ISRJ) presents both suppression and deception effects, combined with high agility and diversity, posing a considerable challenge to radar detection systems. Existing ISRJ suppression methods face several limitations, including reliance on prior knowledge of jamming parameters, reduced robustness against ISRJ style variations, and the need for advance detection of ISRJ forwarding strategies. Blind Source Separation (BSS) can extract source signals based solely on the received mixture, without requiring prior information about the source or transmission parameters. BSS is widely applied in radar anti-jamming scenarios due to its high robustness. However, as ISRJ is primarily deployed for self-defense jamming, conventional BSS methods lack spatial degrees of freedom and cannot effectively suppress such interference. To address this limitation, this study proposes a down-sampling BSS method for ISRJ suppression. By applying dechirping and down-sampling to the echo signal, varying the down-sampling retention positions produces multiple down-sampled output signals. Theoretical analysis demonstrates that the jamming and target signals in these multi-channel down-sampled outputs satisfy the linear mixing model required for BSS. BSS is subsequently applied to separate the ISRJ and target components. This study introduces BSS into ISRJ suppression, providing a highly robust approach that does not depend on prior knowledge, with theoretical validation supporting the method.  Methods  In self-defense ISRJ scenarios, the jamming and target share the same azimuth angle, resulting in a loss of spatial freedom in the received signal. Therefore, conventional BSS methods based on linear instantaneous mixing models are no longer applicable. When all source signals originate from the same azimuth, the rank of the receiving array manifold matrix reduces to one, causing the array receiving model to degenerate into an effective single-channel system. However, BSS requires multiple mixed signals to perform signal separation. To overcome this limitation, this study proposes a down-sampling BSS method for ISRJ suppression. The approach begins by applying oversampling to the received signal, followed by dechirp processing of the single-channel echo signal that contains both jamming and target components. Through conjugate multiplication of the echo signal with a reference signal, both the ISRJ and target echo are converted into sinusoidal signals with fixed frequencies and time-domain windowing characteristics. Subsequently, the signal undergoes down-sampling, during which multiple down-sampled output signals are generated by varying the retention positions of the sampled data. This process effectively restores the degrees of freedom required for separation. Theoretical analysis confirms that the ISRJ and target components in the down-sampled output signals satisfy the linear mixing model necessary for BSS processing. The multi-channel down-sampled signals are then used as input for BSS, enabling the separation of jamming and target components.
Pulse compression is performed via Fourier transform to enhance detection resolution. Finally, target detection is conducted on each separated component to isolate the jamming signals and recover the target echoes, thereby achieving jamming suppression.  Results and Discussions  The key innovation of the proposed method is the application of BSS to ISRJ suppression, eliminating the requirement for precise estimation of ISRJ parameters and demonstrating high robustness. Furthermore, a single-frequency, single-channel BSS approach based on down-sampling is presented, which has potential applications beyond jamming suppression. Simulation results confirm that the proposed method effectively separates ISRJ from the target signal (Fig. 7) and suppresses multiple ISRJ types, including direct forwarding ISRJ (Fig. 5), repeated forwarding ISRJ (Fig. 7), and frequency-shift forwarding ISRJ (Fig. 6). Comparative experiments demonstrate that this method resolves the degradation in suppression performance that existing BSS approaches suffer when the jamming and target share the same azimuth. Compared with conventional ISRJ suppression algorithms, the proposed method maintains stable performance regardless of ISRJ slice width or jamming power. Moreover, it achieves superior output Signal-to-Interference-plus-Noise Ratio (SINR), confirming its effectiveness in enhancing anti-jamming capabilities.  Conclusions  To address the threat posed by ISRJ to radar systems, this study proposes an ISRJ suppression method based on down-sampling BSS. By applying down-sampling and dechirp processing to the received signal, multiple signals are generated, and the Joint Approximate Diagonalization of Eigenmatrices (JADE) BSS algorithm is employed to separate the jamming and target components. This method overcomes the dependence of conventional BSS approaches on spatial separability and remains effective in self-defense jamming scenarios where the jamming and target share the same azimuth. The proposed method demonstrates effective suppression of various ISRJ types, including direct forwarding, repeated forwarding, and frequency-shift forwarding. Compared with existing ISRJ suppression techniques, this approach provides improved anti-jamming performance, as it is largely unaffected by ISRJ slice width, does not require prior knowledge of jamming parameters, and exhibits minimal sensitivity to variations in Signal-to-Interference Ratio (SIR).
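The dechirp-plus-down-sampling step that builds the multi-channel BSS input can be sketched as follows; the signal parameters and the toy ISRJ model are illustrative, and the JADE separation stage itself (available in third-party packages) is not shown.

```python
# A minimal sketch of dechirp + polyphase down-sampling to form multi-channel
# inputs for BSS.
import numpy as np

fs, T, k = 100e6, 50e-6, 1e12            # sample rate, pulse width, chirp rate
t = np.arange(int(fs * T)) / fs
ref = np.exp(1j * np.pi * k * t**2)      # LFM reference signal

# Received echo: target return plus a gated, delayed repeat as a toy ISRJ model
echo = (0.8 * np.roll(ref, 200)
        + 1.2 * np.roll(ref, 500) * (np.arange(t.size) % 2048 < 1024))

dechirped = echo * np.conj(ref)          # target/jamming become near-sinusoids

# Polyphase down-sampling: D channels that differ only in retention offset,
# yielding the linear instantaneous mixture required by BSS.
D = 4
channels = np.stack([dechirped[d::D] for d in range(D)])
print(channels.shape)                    # (4, 1250)
```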
LLM Channel Prediction Method for TDD OTFS Low-Earth-Orbit Satellite Communication Systems
YOU Yuxin, JIANG Xinglong, LIU Huijie, LIANG Guang
2025, 47(8): 2535-2548.   doi: 10.11999/JEIT250105
[Abstract](373) [FullText HTML](242) [PDF 3234KB](87)
Abstract:
Orthogonal Time Frequency Space (OTFS) modulation shows promise in Low Earth Orbit (LEO) satellite-to-ground communications. However, rapid Doppler shift variation and high latency in LEO systems lead to channel aging. Real-time channel estimation increases the computational complexity of onboard receivers and reduces transmission efficiency due to substantial pilot overhead. This study addresses a Ka-band Multiple-Input Single-Output (MISO) OTFS satellite-to-ground communication system by designing a Downlink (DL) channel prediction scheme based on Uplink (UL) channel estimation. A high-precision channel estimation method is proposed, combining matched filtering with data detection to extract UL Channel State Information (CSI). An Adaptive Sparse Large Language Model (ASLLM)-based channel prediction network is then constructed to predict DL CSI. Compared with existing methods, simulations show that the proposed approach achieves lower Normalized Mean Square Error (NMSE) and Bit Error Rate (BER), with improved generalization across multiple scenarios and within an acceptable computational complexity range.  Objective   LEO satellite communication systems offer advantages over Medium-Earth-Orbit (MEO) and Geostationary-Earth-Orbit (GEO) systems, particularly in terms of reduced transmission latency and lower path loss. Therefore, LEO satellites are considered a key element of the Sixth-Generation (6G) Non-Terrestrial Network (NTN) satellite internet architecture. However, high-mobility channels between LEO satellites and ground stations introduce significant challenges for conventional Orthogonal Frequency Division Multiplexing (OFDM), resulting in marked performance degradation. OTFS modulation, which operates in the Delay-Doppler (DD) domain, has been shown to outperform OFDM in high-mobility scenarios, Multiple-Input Multiple-Output (MIMO) systems, and millimeter-wave frequency bands. This performance advantage is attributed to its robustness to Doppler shifts and inter-symbol interference. In modern Time Division Duplexing (TDD) satellite communication systems, OTFS receivers require high-complexity real-time channel estimation, and transmitters rely on extensive pilot overhead to encode CSI for reliable data recovery. To mitigate these limitations, channel prediction schemes using UL CSI to predict DL CSI have been proposed. However, broadband MISO-OTFS systems with large antenna arrays and high-resolution transmission demand precise and efficient CSI prediction under rapidly varying DD-domain conditions. The dynamic and rapidly aging characteristics of DD domain CSI present significant challenges for accurate prediction in broadband, high-mobility, and large-scale antenna communication systems. To address this, an ASLLM-based channel prediction method is developed. The proposed method enables accurate prediction of DD-domain CSI under these conditions.  Methods  By modeling the input–output relationship of a MISO OTFS satellite-to-ground communication system, this study proposes a data-assisted fractional Doppler matched filtering algorithm for channel estimation. This method leverages the shift property of correlation functions and integrates iterative optimization through Minimum Mean Square Error (MMSE) signal detection to achieve accurate estimation of DD domain CSI. The resulting high-precision CSI serves as a reliable input for the subsequent prediction network.
The task of predicting DL slot CSI from UL slot CSI is formulated as a minimization of the NMSE between the network’s predicted CSI and the true DL CSI. The proposed ASLLM prediction network consists of a preprocessing layer, an embedding layer, a Generative Pre-trained Transformer (GPT) layer, and an output layer. The raw DD-domain CSI is first processed through the preprocessing layer to extract convolutional features. In the embedding layer, a value attention module and a position attention module are applied to convert the CSI features into a structured, text-like input suitable for GPT processing. The value attention module adaptively extracts sparse feature values of the CSI, while the position attention module encodes positional characteristics in a non-trainable manner. The core of the prediction network is a pre-trained, open-source GPT-2 backbone, which is used to model and forecast the CSI sequence. The network output is then passed through a linear transformation layer to recover the predicted DD-domain CSI.  Results and Discussions  The satellite-to-ground channel is modeled using the NTN-TDL-D dual mobility channel and simulated with QuaDRiGa. First, the performance of the data-assisted matched filtering channel estimation method is validated (Fig. 7). At a Signal-to-Noise Ratio (SNR) of 20 dB, the BER reaches the order of 0.001 after three iterations. Next, training loss curves for several neural network models are compared (Fig. 8). The ASLLM model exhibits the fastest convergence and highest stability. It also achieves superior NMSE and BER performance in MMSE data detection compared with other approaches (Fig. 9). ASLLM demonstrates strong generalization across different channel models and varying terminal velocities (Fig. 10). However, in cross-frequency generalization scenarios, a small number of additional training samples are still required to maintain accuracy (Fig. 11). Finally, ablation experiments confirm the contribution of each core module within the ASLLM architecture (Table 2). Comparisons of network parameters, training time, and inference time indicate that the computational complexity of ASLLM remains within an acceptable range (Table 3).  Conclusions  This study proposes a channel prediction method for TDD MISO OTFS systems, termed ASLLM, tailored to high-mobility scenarios such as communication between LEO satellites and high-speed trains. The approach leverages high-precision historical UL CSI, obtained through a data-assisted matched filtering algorithm, to predict future DL CSI. By extracting sparse features from DD domain CSI, the method fine-tunes a pre-trained GPT-2 model, originally trained on general knowledge, to improve predictive accuracy. Simulation results show that: (1) considering both computational complexity and estimation accuracy, optimal stopping criteria for the channel estimation algorithm are defined as an iteration number of 3 and a threshold of 0.001; (2) ASLLM outperforms existing prediction methods in terms of convergence speed, NMSE, BER, and generalization capability; and (3) each module of the network contributes effectively to performance, while overall computational complexity remains within a feasible range.
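The following is a minimal sketch of the core idea of driving a frozen pre-trained GPT-2 backbone with CSI embeddings; the paper's value and position attention modules are reduced to single linear layers here, and all dimensions are assumptions.

```python
# A minimal sketch of GPT-2-backed CSI sequence prediction in the spirit of ASLLM.
import torch
import torch.nn as nn
from transformers import GPT2Model

class CSIPredictor(nn.Module):
    def __init__(self, csi_dim=256, d_model=768):
        super().__init__()
        self.embed = nn.Linear(csi_dim, d_model)     # CSI feature -> token embedding
        self.gpt2 = GPT2Model.from_pretrained("gpt2")
        for p in self.gpt2.parameters():             # freeze the backbone
            p.requires_grad = False
        self.head = nn.Linear(d_model, csi_dim)      # token -> predicted CSI

    def forward(self, csi_seq):                      # (batch, seq_len, csi_dim)
        h = self.gpt2(inputs_embeds=self.embed(csi_seq)).last_hidden_state
        return self.head(h[:, -1])                   # predict the next snapshot

pred = CSIPredictor()(torch.randn(2, 8, 256))        # -> shape (2, 256)
```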
Chinese Semantic Communication System Based on Word-level and Sentence-level Semantics
DENG Jiewen, ZHAO Haitao, WEI Jibo, CAO Kuo, ZHANG Yichi, LUO Peng, ZHANG Yuyuan, LIU Yueling
2025, 47(8): 2549-2562.   doi: 10.11999/JEIT250137
[Abstract](143) [FullText HTML](73) [PDF 3188KB](23)
Abstract:
  Objective  To address the mismatch between limited communication resources and growing service demands, semantic communication, a novel paradigm, has been proposed and is expected to offer an effective solution. Unlike traditional approaches that focus on accurate symbol transmission, semantic communication operates at the semantic level, aiming to convey intended meaning by leveraging shared background knowledge at both the transmitter and receiver. Advances in semantic information theory provide a theoretical basis for this paradigm, while the development of artificial intelligence techniques for semantic extraction and understanding supports practical system implementation. Most existing semantic communication systems for textual data are based on English corpora; however, Chinese text differs markedly in word segmentation, lexical annotation, and syntactic structure. Systems tailored for Chinese corpora remain underexplored. Furthermore, current lexical code-based systems primarily focus on word-level semantics and fail to fully capture sentence-level semantics. This study addresses these limitations by mining and processing lexical and contextual semantics specific to Chinese text. A semantic communication system is proposed that uses Chinese corpora to learn and extract both word-level and sentence-level semantic associations. Lexical coding is performed at the transmitter, and joint context decoding is realized at the receiver, thereby improving the effectiveness and reliability of the communication process.  Methods  A Chinese semantic communication system is designed to capture both word-level and sentence-level semantics, leveraging the unique characteristics of Chinese text to enable efficient and reliable transmission of meaning. At the transmitter, a lexical coding method is proposed that encodes words based on their combined lexical semantic features. At the receiver, a two-stage decoding process is implemented. First, the Continuous Bag-of-Words (CBOW) model is used to learn word-level semantics from shared knowledge, estimating the conditional probability of the next word based on preceding words. Second, the Bidirectional Encoder Representations from Transformers (BERT) model is applied to capture sentence-level semantics, using Chinese characters as the fundamental processing unit to compute the probability distribution of words at each position in the sentence. Upon receiving the bit sequence, Huffman decoding is performed with a candidate code list mechanism to generate a set of candidate words. A recursive memoization algorithm then selects the most probable words based on word-level semantics. Finally, sentence-level semantics are applied to correct potential errors in the sentence, producing the recovered text.  Results and Discussions  The proposed semantic communication system improves effectiveness by encoding combined phrases during lexical coding, thereby reducing the number of coding objects. Reliability is enhanced by leveraging contextual associations during feature learning and joint decoding. For effectiveness, the average code length of the Huffman coding dictionary is 10.61, while the lexical coding dictionary for four categories achieves an average of 8.98. This represents an 18.15% increase in average coding rate. Experiments conducted on 100 randomly selected texts across different corpus sizes yield consistent results (Table 3, Fig. 5), validating the effectiveness of lexical coding.
For reliability, system performance is first evaluated under varying parameter settings. The optimal values for context window size, lexical category count, and Hamming distance threshold are identified (Figs. 6–10). Comparative analysis across different systems is then conducted. Under an AWGN channel, the lexical+word-level+sentence-level semantic system achieves higher BLEU scores than the Huffman-only system when the Signal-to-Noise Ratio (SNR) is ≤6 dB, and matches the performance of DeepSC between –3 dB and 3 dB. At SNR ≥9 dB, its BLEU scores are slightly lower than those of the Huffman-only system but significantly higher than those of DeepSC. Across all SNR ranges, the lexical+word-level+sentence-level system outperforms the lexical+word-level system. The BLEU scores of the Huffman+word-level and Huffman+sentence-level systems are similar and consistently exceed those of the Huffman-only system. Similar trends are observed on Rayleigh and Rician fading channels and with METEOR scores (Figs. 11, 12). These results indicate that combining word-level and sentence-level semantics with a candidate set mechanism for joint context decoding substantially enhances transmission reliability at the receiver.  Conclusions  A Chinese semantic communication system based on word-level and sentence-level semantics is proposed. First, a lexical grouping and coding method based on LAC segmentation is developed by analyzing lexical features in Chinese text, which improves the effectiveness of the communication system. Second, the receiver models context co-occurrence probabilities by extracting word-level and sentence-level semantic features, enabling joint decoding through word selection and sentence-level error correction, thereby enhancing reliability. Simulation results show that the average code length of the Huffman coding dictionary is 10.61, while the lexical coding dictionary for four categories achieves an average of 8.98, resulting in an 18.15% increase in coding rate. On the AWGN channel, the proposed lexical+word-level+sentence-level system outperforms the Huffman-only system at low SNR and the DeepSC system at high SNR. The Huffman+word-level and Huffman+sentence-level systems yield similar reliability scores, both consistently higher than the Huffman-only system. These findings confirm that incorporating both word-level and sentence-level semantics significantly enhances system reliability.
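The receiver-side word selection can be sketched as a memoized dynamic program over Huffman candidate lists; the toy bigram scorer below stands in for the paper's CBOW model, and the vocabulary and probabilities are purely illustrative.

```python
# A minimal sketch of candidate-word selection via recursive memoization.
from functools import lru_cache
import math

# Candidate words per position, as produced by Huffman decoding with a
# candidate code list (illustrative vocabulary).
candidates = [["今天", "令天"], ["天气", "大气"], ["很好", "很奸"]]
bigram = {("<s>", "今天"): 0.6, ("<s>", "令天"): 0.01,
          ("今天", "天气"): 0.5, ("今天", "大气"): 0.05,
          ("天气", "很好"): 0.7, ("天气", "很奸"): 0.001}

@lru_cache(maxsize=None)
def best(pos, prev):
    """Highest log-probability completion of positions pos.. given the previous word."""
    if pos == len(candidates):
        return 0.0, ()
    options = []
    for w in candidates[pos]:
        p = bigram.get((prev, w), 1e-6)          # back-off for unseen pairs
        score, tail = best(pos + 1, w)
        options.append((math.log(p) + score, (w,) + tail))
    return max(options)

print(best(0, "<s>")[1])                          # ('今天', '天气', '很好')
```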
Service Caching and Task Migration Mechanism Based on Internet of Vehicles
ZUO Linli, XIA Shichao, LI Yun, PAN Junnan, CHEN Bingyi
2025, 47(8): 2563-2572.   doi: 10.11999/JEIT241097
[Abstract](140) [FullText HTML](78) [PDF 2726KB](17)
Abstract:
  Objective  In the era of digital transformation and smart mobility, the Internet of Vehicles (IoV) has emerged as a transformative paradigm reshaping transportation systems and vehicle-related services. In recent years, the proliferation of IoV applications has led to the generation and processing of large volumes of real-time data, requiring ultra-low latency and high-efficiency computation to maintain seamless functionality and ensure high-quality user experiences. To meet these demands, Mobile Edge Computing (MEC) has been widely adopted in the IoV domain, effectively reducing the load on backhaul links. However, the dynamic and mobile nature of vehicular networks poses significant challenges to the effective deployment of edge services and the efficient management of task migration. Vehicles continuously move across regions with heterogeneous network conditions, edge node coverage, and service availability. Conventional static or rule-based approaches for service caching and task migration often fail to adapt to these environmental dynamics, leading to degraded performance, frequent service interruptions, and elevated energy consumption. This study proposes a Joint Service Caching and Task Migration Algorithm (SCTMA) tailored to the dynamic characteristics of the IoV environment. By incorporating machine learning, optimization techniques, and context-aware decision-making, SCTMA dynamically adjusts caching and migration strategies to ensure that appropriate services are delivered to suitable edge nodes at the optimal time, thereby minimizing latency and improving resource utilization.  Methods  This study systematically considers multiple constraints within the IoV system, including caching decisions, the number of cached services, cache capacity, CPU resource consumption, and task migration policies at edge nodes. To jointly optimize service caching and task migration under these constraints, a Markov Decision Process (MDP) model is constructed. The MDP framework captures the temporal dynamics of the IoV environment, wherein system states, such as vehicle location, service demand, and cache status, evolve over time. The reward function is formulated to balance competing objectives, including minimizing latency, reducing energy consumption, and improving the cache hit ratio. To address inefficient utilization of Base Station (BS) caching resources and mitigate storage waste, the concept of a service hit ratio is introduced. Based on this ratio, BSs proactively cache frequently requested services, thereby reducing latency and energy usage during Vehicle User (VU) service requests and enhancing overall caching efficiency. A task migration algorithm is also developed, incorporating vehicle velocity to estimate the remaining dwell time of a VU within the coverage area of a RoadSide Unit (RSU). This estimation is used to compute the associated service data migration volume and assess migration costs. Building on this framework, SCTMA is developed, employing the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method to address uncertainties in multi-agent settings. This approach reduces system communication and computation costs, optimizes migration notification strategies, and improves the cache hit ratio.  Results and Discussions  Simulation results indicate that the proposed SCTMA algorithm effectively reduces caching and task migration costs while improving the cache hit ratio.
Following training, the system’s long-term average reward under SCTMA markedly exceeds that of baseline algorithms (Fig. 3). Specifically, SCTMA maintains the long-term average reward at approximately –30, whereas the best-performing comparative method stabilizes at around –38, corresponding to an improvement of at least 21.05%. Further analysis of edge device caching performance (Fig. 5(a), Fig. 5(b)) shows that as the maximum cache capacity increases, the system using SCTMA consistently achieves the highest cache hit ratio across all tested scenarios.  Conclusions  In edge computing-enabled IoV ecosystems, where vehicles interact with infrastructure and peer nodes through interconnected edge networks, this study examines decision-making mechanisms for service hit rate optimization and task migration. By formulating the joint optimization of service caching and task migration as an MDP, the joint SCTMA is proposed. Simulation results show that SCTMA reduces service caching and task migration costs, shortens service request latency for VUs, and improves overall system performance. However, the current study assumes an idealized IoV environment. Future research should evaluate the algorithm’s robustness and efficiency under real-world conditions.
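The kind of multi-objective reward used when casting joint caching and migration as an MDP can be sketched as follows; the weights and cost terms are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a multi-objective caching/migration reward.
from dataclasses import dataclass

@dataclass
class StepMetrics:
    latency: float      # service delivery latency in this slot (ms)
    energy: float       # energy consumed by caching + migration (J)
    hit_ratio: float    # fraction of requests served from edge caches

def reward(m: StepMetrics, w_lat=1.0, w_en=0.5, w_hit=2.0) -> float:
    """Negative costs plus a cache-hit bonus; larger is better."""
    return -w_lat * m.latency - w_en * m.energy + w_hit * m.hit_ratio

print(reward(StepMetrics(latency=12.0, energy=3.0, hit_ratio=0.8)))
```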
Sparse Channel Estimation and Array Blockage Diagnosis for Non-Ideal RIS-Assisted MIMO Systems
LI Shuangzhi, LEI Haojie, GUO Xin
2025, 47(8): 2573-2583.   doi: 10.11999/JEIT241108
[Abstract](219) [FullText HTML](102) [PDF 3935KB](20)
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RISs) offer a promising approach to enhance Millimeter-Wave (mmWave) Multiple-Input Multiple-Output (MIMO) systems by dynamically manipulating wireless propagation. However, practical deployments are challenged by hardware faults and environmental blockages (e.g., dust or rain), which impair Channel State Information (CSI) accuracy and reduce Spectral Efficiency (SE). Most existing studies either overlook the interdependence between the CSI and blockage vector or fail to leverage the dual sparsity of multipath channels and blockage patterns. This study proposes a joint sparse channel estimation and blockage diagnosis scheme to overcome these limitations, thereby enabling reliable beamforming and enhancing system robustness in non-ideal RIS-assisted mmWave MIMO environments.  Methods  A third-order Parallel Factor (PARAFAC) decomposition model is constructed for the received signals using a tensor-based signal representation. The intrinsic relationship between mmWave channel parameters and the blockage vector is exploited to estimate spatial angular frequencies at the User Equipment (UE) and Base Station (BS) using Orthogonal Matching Pursuit (OMP). Based on these frequencies, a coupled observation matrix is formed to jointly capture residual channel parameters and blockage vector information. The associated recovery task is reformulated as a Least Absolute Shrinkage and Selection Operator (LASSO) problem, which is solved using the Alternating Direction Method of Multipliers (ADMM) to estimate the blockage vector. The remaining channel parameters are then recovered using sparse reconstruction techniques by leveraging their inherent sparsity. Iterative refinement updates both the blockage vector and channel parameters, ensuring convergence under limited pilot overhead conditions.  Results and Discussions  For a non-ideal RIS-assisted mmWave MIMO system (Fig. 1), a signal transmission framework is designed (Fig. 2), in which the received signals are represented as a third-order tensor. Leveraging the dual-sparsity of multipath channels and the blockage vector, a joint estimation scheme is developed (Algorithm 1), enabling effective parameter decoupling through tensor-based parallel factor decomposition and iterative optimization. Simulation results show that the proposed scheme achieves superior performance in both channel estimation and blockage diagnosis compared with baseline methods by fully exploiting dual-sparsity characteristics (Fig. 3). SE analysis confirms the detrimental effect of blockages on system throughput and highlights that the proposed scheme improves SE by compensating for blockage-induced impairments (Fig. 4). The method also demonstrates strong estimation accuracy under reduced pilot overhead (Fig. 5) and improved robustness as the number of blocked RIS elements increases (Fig. 6). A decline in spatial angular frequency estimation accuracy is observed with fewer UE antennas, which negatively affects overall performance; however, estimation stabilizes as antenna count increases (Fig. 7). Moreover, when Non-Line-of-Sight (NLoS) path contributions decrease, the scheme exhibits enhanced performance due to improved resolution between Line-of-Sight (LoS) and NLoS components (Fig. 8).  Conclusions  This study proposes a joint channel estimation and blockage diagnosis scheme for non-ideal RIS-assisted mmWave MIMO systems, based on the dual sparsity of multipath channels and blockage vectors.
Analysis of the tensor-based parallel factor decomposition model reveals that the estimation of spatial angular frequencies at the UE and BS is unaffected by blockage conditions. The proposed scheme accounts for the contributions of NLoS paths, enabling accurate decoupling of residual channel parameters and blockage vector across different propagation paths. Simulation results confirm that incorporating NLoS path information improves both channel estimation accuracy and blockage detection. Compared with existing methods, the proposed approach achieves superior performance in both aspects. In practical scenarios, real-time adaptability may be challenged if blockage states vary more rapidly than channel characteristics. Future work will focus on enhancing the scheme’s responsiveness to dynamic blockage conditions.
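The LASSO subproblem at the core of the blockage-vector estimate admits a compact ADMM solver; the following sketch uses a random placeholder observation matrix rather than the paper's coupled observation model.

```python
# A minimal sketch of solving min_b 0.5*||y - A b||^2 + lam*||b||_1 with ADMM.
import numpy as np

def lasso_admm(A, y, lam=0.1, rho=1.0, iters=200):
    m, n = A.shape
    AtA, Aty = A.T @ A, A.T @ y
    L = np.linalg.cholesky(AtA + rho * np.eye(n))   # factor once, reuse
    x = z = u = np.zeros(n)
    for _ in range(iters):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Aty + rho * (z - u)))
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z
    return z

A = np.random.randn(40, 100)
b_true = np.zeros(100)
b_true[[3, 17, 42]] = [1.0, -0.5, 0.8]              # sparse "blockage" pattern
b_hat = lasso_admm(A, A @ b_true, lam=0.05)
print(np.flatnonzero(np.abs(b_hat) > 0.1))          # ideally [3, 17, 42]
```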
An Improved Modulation Recognition Method Based on Hybrid Kolmogorov-Arnold Convolutional Neural Network
ZHENG Qinghe, LIU Fanglin, YU Lisu, JIANG Weiwei, HUANG Chongwen, LI Bin, SHU Feng
2025, 47(8): 2584-2597.   doi: 10.11999/JEIT250161
[Abstract](524) [FullText HTML](278) [PDF 7312KB](82)
Abstract:
  Objective  With the rapid growth of communication devices and increasing complexity of electromagnetic environments, spectrum efficiency has become a critical performance metric for sixth-generation communication systems. Modulation recognition is an essential function of dynamic spectrum access, aiming to automatically identify the modulation scheme of received signals to enhance spectrum utilization. In practice, wireless signals are often affected by multipath propagation, interference, and noise, which pose challenges for accurate recognition. To address these issues, this study proposes a deep learning-based approach using an end-to-end model that eliminates manual feature extraction, mitigates limitations of handcrafted features, and improves recognition accuracy. By transferring general knowledge from signal classification to modulation recognition, a well-generalized method based on a hybrid Kolmogorov-Arnold Convolutional Neural Network (KA-CNN) is developed. This approach supports reliable communication in applications such as intelligent transportation, the Internet of Things (IoT), vehicular ad hoc networks, and satellite communication.  Methods  The proposed modulation recognition method first decomposes the signal into a multi-dimensional wavelet domain using a dual-tree complex wavelet packet transform. Different frequency components are then combined to construct a multi-scale signal representation, enabling the neural network to learn consistent features across frequencies. A deep learning structure, KA-CNN, is designed by integrating spline functions with nonlinear activation functions to enhance nonlinear fitting and continuous learning of periodic features. Spline functions are used to address the curse of dimensionality. To improve adaptability to varying signal parameters and enhance generalization across communication scenarios, multilevel grid training with Lipschitz regularization constraints is applied. In KA-CNN, the hybrid module transfers the characteristics of the spline function into convolution operations, which improves the model’s capacity to capture complex mappings between input signals and modulation schemes while retaining the efficiency of the Kolmogorov-Arnold network. This enhances both the expressive power and adaptability of deep learning models under complex communication conditions.  Results and Discussions  During the experimental phase, modulation recognition performance testing, an ablation study, and comparative analysis are conducted on three publicly available datasets (RadioML 2016.10a, RadioML 2018.01a, and CSPB.ML.2023) to evaluate the performance of KA-CNN. Results show that KA-CNN achieves modulation recognition accuracies of 65.14%, 65.56%, and 78.40% on RadioML 2016.10a, RadioML 2018.01a, and CSPB.ML.2023, respectively (Fig. 6). The main performance limitation arises in the classification of QPSK versus 8PSK, AM-DSB versus WBFM, and high-order QAM modulation types (Fig. 7). The maximum differences in recognition accuracy of KA-CNN driven by different signal representations reach 2.04%, 3.46%, and 4.54% across the three datasets, demonstrating the effect of signal representation (Fig. 8). The wavelet packet transform constructs a multi-scale time-frequency representation of signals that is insensitive to the maximum decomposition scale L and supports complementary learning of different modulation features.
The hybrid Kolmogorov-Arnold convolutional module and the multi-dimensional perceptual cascade attention mechanism play key roles in enhancing modulation recognition accuracy, particularly under relatively high Signal-to-Noise Ratio (SNR) conditions (Fig. 9). Additionally, finer grids and higher decomposition orders improve the model’s ability to extract discriminative signal features, thereby increasing recognition accuracy (Fig. 10). Finally, a comparative evaluation against several deep learning models, including GGCNN, Transformer, PR-LSTM, and MobileViT, confirms the superior performance of KA-CNN (Fig. 11).  Conclusions  This study proposes a hybrid KA-CNN to address the reduced modulation recognition accuracy caused by noise and parameter variation, as well as the limited generalization across communication scenarios in existing deep learning models. By integrating spline functions with nonlinear activation functions, KA-CNN mitigates the curse of dimensionality and improves its capacity for continuous learning of periodic features. A dual-tree complex wavelet packet transform is used to construct a multi-scale signal representation, enabling the model to extract consistent features across frequencies. The model is trained using multilevel grids with Lipschitz regularization constraints to enhance adaptability to varying signal parameters and improve generalization. Experimental results on three public datasets demonstrate that KA-CNN improves modulation recognition accuracy and exhibits robust generalization, particularly under low SNRs.
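A Kolmogorov-Arnold-style convolutional block can be sketched as a standard convolution followed by a learnable spline-like activation; the grid size, Gaussian basis, and residual base nonlinearity below are assumptions for illustration, not the paper's exact module.

```python
# A minimal sketch of a convolution + learnable spline-style activation block.
import torch
import torch.nn as nn

class SplineActivation(nn.Module):
    def __init__(self, channels, grid_size=8, span=3.0):
        super().__init__()
        self.register_buffer("grid", torch.linspace(-span, span, grid_size))
        self.coef = nn.Parameter(torch.zeros(channels, grid_size))
        self.base = nn.SiLU()                      # smooth base nonlinearity

    def forward(self, x):                          # x: (batch, ch, length)
        basis = torch.exp(-(x.unsqueeze(-1) - self.grid) ** 2)   # Gaussian bumps
        spline = (basis * self.coef[:, None, :]).sum(-1)         # per-channel spline
        return self.base(x) + spline

block = nn.Sequential(nn.Conv1d(2, 16, 7, padding=3), SplineActivation(16))
print(block(torch.randn(4, 2, 128)).shape)         # torch.Size([4, 16, 128])
```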
Multi-dimensional Performance Adaptive Content Caching in Mobile Networks Based on Meta Reinforcement Learning
LIN Peng, WANG Jun, LIU Yan, ZHANG Zhizhong
2025, 47(8): 2598-2607.   doi: 10.11999/JEIT250100
[Abstract](102) [FullText HTML](71) [PDF 2967KB](19)
Abstract:
  Objective  Content caching enhances the efficiency of video services in mobile networks. However, most existing studies optimize caching strategies for a single performance objective, overlooking their combined effect on key metrics such as content delivery latency, cache hit rate, and redundancy rate. An effective caching strategy must simultaneously satisfy multiple performance requirements and adapt to their dynamic changes over time. This study addresses these limitations by investigating the joint optimization of content delivery latency, cache hit rate, and redundancy rate. To capture the interdependencies and temporal variations among these metrics, a meta-reinforcement learning-based caching decision algorithm is proposed. Built on conventional reinforcement learning frameworks, the proposed method enables adaptive optimization across multiple performance dimensions, supporting a dynamic and balanced content caching strategy.  Methods  To address the multi-dimensional objectives of content caching, namely, content delivery latency, cache hit rate, and redundancy rate, this study proposes a performance-aware adaptive caching strategy. Given the uncertainty and temporal variability of interrelationships among performance metrics in real-world environments, dynamic correlation parameters are introduced to simulate the evolving behavior of these metrics. The caching problem is formulated as a dynamic joint optimization task involving delivery latency efficiency, cache hit rate, and a cache redundancy index. This problem is further modeled as a Markov Decision Process (MDP), where the state comprises the content popularity distribution and the caching state from the previous time slot; the action represents the caching decision at the current time slot. The reward function is defined as a cumulative metric that integrates dynamic correlation parameters across latency, hit rate, and redundancy. To solve the MDP, a Model-Agnostic Meta-Reinforcement Learning Algorithm (MAML-DDPG) is proposed. This algorithm reformulates the joint optimization task as a multi-task reinforcement learning problem, enabling adaptation to dynamically changing optimization targets and improving decision-making efficiency.  Results and Discussions  This study compares the performance of MAML-DDPG with baseline algorithms under a gradually changing Zipf parameter (0.5 to 1.5). Results show that MAML-DDPG maintains more stable system performance throughout the change, indicating superior adaptability. The algorithm’s response to abrupt shifts in optimization objectives is further evaluated by modifying weight parameters during training. Specifically, the experiments include comparisons among DDPG, DDPG|100, MAML-DDPG|100, and MAML-DDPG|150, where DDPG|100 denotes a change in weight parameters at the 100th training cycle to simulate task mutation. Results show that the DDPG model exhibits a sharp drop in convergence value following the change and stabilizes at a lower performance level.
In contrast, MAML-DDPG, although initially affected by the shift, recovers rapidly due to its meta-learning capability and ultimately converges to a higher-performing caching strategy.  Conclusions  This study addresses the content caching problem in mobile edge networks by formulating it as a joint optimization task involving cache hit rate, cache redundancy index, and delivery latency efficiency. To handle the dynamic uncertainty associated with these performance metrics, the MAML-DDPG algorithm is proposed. The algorithm enables rapid adaptation to changing optimization targets, improving decision-making efficiency. Simulation results confirm that MAML-DDPG effectively adapts to dynamic performance objectives and outperforms existing methods across multiple caching metrics. The findings demonstrate the algorithm’s capability to meet evolving performance requirements while maintaining strong overall performance.
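The meta-learning mechanism behind such rapid recovery can be illustrated with a minimal MAML loop: an inner gradient step on a sampled task, then an outer update of the meta-parameters through the adapted loss; the quadratic toy loss below stands in for the DDPG actor and critic objectives, and all task details are illustrative.

```python
# A minimal sketch of the MAML inner/outer update behind MAML-DDPG.
import torch

theta = torch.zeros(4, requires_grad=True)         # meta-parameters
meta_opt = torch.optim.SGD([theta], lr=0.05)
inner_lr = 0.1

def task_loss(params, target):
    return ((params - target) ** 2).sum()          # stand-in for the critic loss

for _ in range(100):
    meta_opt.zero_grad()
    for target in (torch.ones(4), -torch.ones(4)): # two caching "tasks"
        inner_g = torch.autograd.grad(task_loss(theta, target), theta,
                                      create_graph=True)[0]
        adapted = theta - inner_lr * inner_g       # one inner adaptation step
        task_loss(adapted, target).backward()      # accumulate meta-gradient
    meta_opt.step()

print(theta.detach())                              # near the task midpoint, 0
```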
RIS-Enhanced Semantic Communication Systems Oriented towards Semantic Importance and Robustness
ZHANG Zufan, YIN Xingran, ZHOU Jianping, LIU Yue
2025, 47(8): 2608-2620.   doi: 10.11999/JEIT250159
[Abstract](171) [FullText HTML](64) [PDF 5084KB](50)
Abstract:
  Objective  The deep integration of Deep Learning (DL) and Semantic Communication (SC) has become a key trend in next-generation communication systems. Current SC systems primarily adopt DL-based Joint Source-Channel Coding (JSCC) with end-to-end training to enable efficient semantic transmission. However, several limitations remain. Existing systems often optimize physical-layer channel characteristics or semantic-layer feature extraction in isolation, without establishing cross-layer mapping mechanisms. In addition, protection strategies for critical semantic features in fading channel environments are insufficient, limiting semantic recovery performance. To address these challenges, this study integrates Reconfigurable Intelligent Surfaces (RIS) into SC systems and proposes an intelligent transmission scheme based on dual-dimensional semantic feature metrics. The proposed approach effectively enhances semantic recovery capability under adverse channel conditions. This work provides a new intelligent solution for protecting semantic features in fading channels and establishes theoretical support for collaborative mechanisms between physical and semantic layers in SC systems.  Methods  This study develops a joint semantic importance-robustness metric model. Semantic importance is quantified using Bidirectional Encoder Representations from Transformers (BERT) combined with cosine similarity, while semantic robustness is assessed by measuring the loss increments of high-dimensional feature vectors during transmission. A dynamically updated background knowledge base is constructed to support a priority evaluation framework for semantic features (Fig. 2). During transmission, the system partitions the original text into high- and low-priority data streams based on feature priority. High-priority streams are transmitted through RIS-assisted channels, whereas low-priority streams are transmitted over conventional fading channels. At the physical layer, an alternating optimization algorithm jointly designs active precoding beamforming vectors and RIS passive phase matrices. At the receiver, semantic reconstruction is performed under the guidance of feature priority index lists (Fig. 1).  Results and Discussions  The proposed SISR-RIS system effectively reduces the distortion effects of channel fading on critical semantic features by establishing cross-layer mapping between semantic features and physical channels. Simulation results show that, in medium-to-low Signal-to-Noise Ratio (SNR) environments, the SISR-RIS system maintains high low-order BLEU scores and approaches the theoretical performance boundary near the 10 dB SNR threshold, achieving approximately 95% recovery accuracy for BLEU-1 and 92% for BLEU-2 (Fig. 3(a)). As the n-gram order increases, the system outperforms the baseline Deep-SC system by approximately 10% in BLEU-4, confirming its improved capability for contextual semantic reconstruction (Fig. 3(b)). Owing to the dual-dimensional metric mechanism, the system demonstrates stable performance with less than 1% variance in recovery accuracy across short and long sentences (Fig. 4). Case analysis indicates that when the original statements cannot be fully restored, the system maintains semantic equivalence through appropriate synonym substitutions. Additionally, core verbs and nouns are consistently assigned higher feature priority scores, which reduces the effect of channel fading on critical semantic features (Tables 2 and 3; Figs. 5 and 6).
Conclusions  This study proposes a RIS-enhanced SC system designed to account for semantic importance and robustness. By extracting semantic importance and robustness features to prioritize transmission and implementing a joint physical-semantic layer design enabled by RIS, the system provides enhanced protection for high-importance, low-robustness semantic features. Evaluations based on BLEU scores, BERT Semantic Similarity (BERT-SS) metrics, and case analyses demonstrate the following: (1) The proposed system achieves a 15% performance improvement over baseline systems in low SNR environments, with performance approaching theoretical limits near the 10 dB SNR threshold; (2) In high-SNR conditions, the system performs comparably to state-of-the-art methods across both BLEU and BERT-SS metrics; (3) The dual-dimensional semantic feature metric mechanism enhances contextual semantic relevance, reduces the recovery discrepancy between long and short sentences to below 1% in high-SNR scenarios, and demonstrates strong adaptability to varying text lengths.
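One plausible way to realize the BERT-plus-cosine-similarity importance metric is a leave-one-word-out probe, sketched below; the model choice, pooling, and scoring are assumptions, and the paper's exact metric may differ.

```python
# A minimal sketch of word-importance scoring via semantic drop-out with BERT.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(text):
    with torch.no_grad():
        out = bert(**tok(text, return_tensors="pt")).last_hidden_state
    return out.mean(dim=1).squeeze(0)              # mean-pooled sentence vector

sentence = "the rover transmits telemetry every hour"
words = sentence.split()
ref = embed(sentence)
for i, w in enumerate(words):
    ablated = " ".join(words[:i] + words[i + 1:])  # delete one word at a time
    sim = torch.cosine_similarity(ref, embed(ablated), dim=0).item()
    print(f"{w:>10}: importance ~ {1 - sim:.4f}")  # bigger drop = more important
```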
Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy
CHEN Jiamei, SUN Huiwen, LI Yufeng, WANG Yupeng, BIE Yuxia
2025, 47(8): 2621-2629.   doi: 10.11999/JEIT250021
[Abstract](282) [FullText HTML](172) [PDF 2557KB](44)
Abstract:
  Objective  To address the future trend of network densification and spatial distribution, this study proposes a multi-base station air–ground integrated ultra-dense network architecture and develops a semi-distributed scheme for resource optimization. The network comprises coexisting macro, micro, and Unmanned Aerial Vehicle (UAV) base stations. A semi-distributed Double Deep Q Network (DDQN)-based power control scheme is designed to reduce computational burden, improve response speed, and overcome the lack of global optimization in conventional fully centralized approaches. The proposed scheme enhances energy efficiency by combining distributed decision-making at the base station level with centralized training via a network trainer, enabling a balance between computational complexity and performance. The DDQN algorithm facilitates local decision-making while centralized coordination ensures overall network optimization.  Methods  This study establishes a complex dense network model for air–ground integration with coexisting macro, micro, and UAV base stations, and proposes a semi-distributed DDQN scheme to improve network energy efficiency. The methods are as follows: (1) Construct an integrated air–ground dense network model in which macro, micro, and UAV base stations share the spectrum through a cooperative mechanism, thereby overcoming the performance bottlenecks of conventional heterogeneous networks. (2) Develop an improved semi-distributed DDQN algorithm that enhances Q-value estimation accuracy, addressing the limitations of traditional centralized and distributed control modes and mitigating Q-value overestimation observed in conventional Deep Q Network (DQN) approaches. (3) Introduce a disturbance factor to increase the probability of exploring random actions, strengthen the algorithm’s ability to escape local optima, and improve estimation accuracy.  Results and Discussions  Simulation results demonstrate that the proposed semi-distributed DDQN scheme effectively adapts to dense and complex network topologies, yielding marked improvements in both energy efficiency and total throughput relative to traditional DQN and Q-learning algorithms. Key results include the following: The total throughput achieved by DDQN exceeds that of the baseline DQN and Q-learning algorithms (Fig. 3). In terms of energy efficiency, DDQN exhibits a clear advantage, converging to 84.60%, which is 15.18 percentage points higher than DQN (69.42%) and 17.10 percentage points higher than Q-learning (67.50%) (Fig. 4). The loss value of DDQN also decreases more rapidly and stabilizes at a lower level. With increasing iterations, the loss curve becomes smoother and ultimately converges to approximately 100, about 100 lower than the converged loss of DQN (Fig. 5). Moreover, DDQN achieves the highest user access success rate compared with DQN and Q-learning (Fig. 6). When the access success rate reaches 80%, DDQN requires significantly fewer iterations than the other two algorithms. This advantage becomes more pronounced under high user density. For example, when the number of users reaches 800, DDQN requires fewer iterations than both DQN and Q-learning to achieve comparable performance (Fig. 7).  Conclusions  This study proposes a semi-distributed DDQN strategy for intelligent control of base station transmission power in ultra-dense air–ground networks. Unlike traditional methods that target energy efficiency at the individual base station level, the proposed strategy focuses on optimizing the overall energy efficiency of the network system.
By dynamically adjusting the transmission power of macro, micro, and airborne base stations through intelligent learning, the scheme achieves system-level coordination and adaptation. Simulation results confirm the superior adaptability and performance of the proposed DDQN scheme under complex and dynamic network conditions. Compared with conventional DQN and Q-learning approaches, DDQN exhibits greater flexibility and effectiveness in resource control, achieving higher energy efficiency and sustained improvements in total throughput. These findings offer a new approach for the design and management of integrated air–ground networks and provide a technical basis for the development of future large-scale dense network architectures.
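The double-DQN target that curbs the Q-value overestimation discussed above can be sketched in a few lines: the main network selects the next action and the target network evaluates it; network sizes and the exploration schedule are illustrative assumptions.

```python
# A minimal sketch of the double-DQN target computation.
import torch
import torch.nn as nn

def make_q(n_state=8, n_action=5):
    return nn.Sequential(nn.Linear(n_state, 64), nn.ReLU(), nn.Linear(64, n_action))

q_main, q_target = make_q(), make_q()
q_target.load_state_dict(q_main.state_dict())

def ddqn_target(reward, next_state, gamma=0.95):
    with torch.no_grad():
        a_star = q_main(next_state).argmax(dim=1, keepdim=True)   # select (main net)
        q_next = q_target(next_state).gather(1, a_star)           # evaluate (target net)
    return reward + gamma * q_next.squeeze(1)

y = ddqn_target(torch.ones(32), torch.randn(32, 8))
print(y.shape)                                                    # torch.Size([32])
```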
Joint Resource Optimization Algorithm for Intelligent Reflective Surface Assisted Wireless Soft Video Transmission
WU Junjie, LUO Lei, ZHU Ce, JIANG Pei
2025, 47(8): 2630-2641.   doi: 10.11999/JEIT250019
[Abstract](207) [FullText HTML](109) [PDF 5317KB](34)
Abstract:
  Objective  Intelligent Reflecting Surface (IRS) technology is a key enabler for next-generation mobile communication systems, addressing the growing demands for massive device connectivity and increasing data traffic. Video data accounts for over 80% of global mobile traffic, and this proportion continues to rise. Although video SoftCast offers a simpler structure and more graceful degradation compared to conventional separate source-channel coding schemes, its transmission efficiency is restricted by the limited availability of wireless transmission resources. Moreover, existing SoftCast frameworks are not inherently compatible with IRS-assisted wireless channels. To address these limitations, this paper proposes an IRS-assisted wireless soft video transmission scheme.  Methods  Video soft transmission distortion is jointly determined by three critical wireless resources: transmit power, active beamforming at the primary transmitter, and passive beamforming at the IRS. Minimizing video soft transmission distortion is therefore formulated as a joint optimization problem over these resources. To solve this multivariable problem, an Alternating Optimization (AO) framework is employed to decouple the original problem into single-variable subproblems. For the fractional nonhomogeneous quadratic optimization and unit-modulus constraints arising in this process, the Semi-Definite Relaxation (SDR) method is applied to obtain near-optimal solutions for both active and passive beamforming vectors. Based on the derived beamforming vectors, the optimal power allocation factor for soft transmission is then computed using the Lagrange multiplier method.  Results and Discussions  Simulation results indicate that the proposed method yields an improvement of at least 1.82 dB in Peak Signal-to-Noise Ratio (PSNR) compared to existing video soft transmission approaches (Fig. 3). In addition, evaluation across extensive HEVC test sequences shows that the proposed method achieves an average received quality gain of no less than 1.51 dB (Table 1). Further simulations reveal that when the secondary link channel quality falls below a critical threshold, it no longer contributes to improving the received video quality (Fig. 5). Rapid variations in the secondary signal c degrade the reception quality of the primary signal, with a reduction of approximately 0.52 dB observed (Fig. 6). Increasing the number of IRS elements significantly enhances both video reception quality and achievable rates for the primary and secondary links (Fig. 7); however, this improvement comes with a power-law increase in computational complexity. Additional simulations confirm that the proposed method maintains per-frame quality fluctuations within an acceptable range across each Group Of Pictures (GOP) (Fig. 8). As GOP size increases, temporal redundancy within the source is more effectively removed, leading to further improvements in received quality, although this is accompanied by higher computational complexity (Fig. 9).  Conclusions  This paper proposes an IRS-assisted soft video transmission scheme that leverages IRS-aided secondary links to improve received video quality. To minimize video signal distortion, a multivariable optimization problem is formulated for joint resource allocation. An AO framework is adopted to decouple the problem into single-variable subproblems, which are solved iteratively.
Simulation results show that the proposed method achieves significant improvements in both objective and subjective visual quality compared to existing video transmission algorithms. In addition, the effects of secondary link channel gain, secondary signal characteristics, the number of IRS elements, and GOP parameters on transmission performance are systematically examined. This study demonstrates, for the first time, the performance enhancement of video soft transmission using IRS and provides a technical basis for the development of video soft transmission in IRS-assisted communication environments.
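For reference, the final power-allocation step can be illustrated with the classical SoftCast result, in which chunk gains follow an inverse fourth-root law obtained from a Lagrange multiplier under a total-power constraint. The numpy sketch below shows only this textbook step with illustrative chunk variances; the paper's actual allocation is coupled with the optimized beamforming vectors.

```python
import numpy as np

def softcast_power_allocation(chunk_vars, total_power):
    """Classical SoftCast scaling g_i = lambda_i**(-1/4) * sqrt(P / sum(sqrt(lambda))).

    Minimizes total reconstruction MSE subject to the transmit-power
    constraint sum_i g_i**2 * lambda_i = P (Lagrange multiplier solution).
    """
    lam = np.asarray(chunk_vars, dtype=float)
    c = np.sqrt(total_power / np.sum(np.sqrt(lam)))  # Lagrange constant
    return c * lam ** -0.25

# Example: four DCT chunks with decaying variance and unit total power.
lam = np.array([4.0, 2.0, 1.0, 0.5])
g = softcast_power_allocation(lam, total_power=1.0)
print("gains:", g)
print("power check:", np.sum(g**2 * lam))  # ~1.0
```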
Energy Characteristic Map Based Resource Allocation Algorithm for High-density V2V Communications
QIU Gongan, LIU Yongsheng, ZHANG Guoan, LIU Min
2025, 47(8): 2642-2651.   doi: 10.11999/JEIT250004
[Abstract](293) [FullText HTML](166) [PDF 4463KB](30)
Abstract:
  Objective  In high-density scenarios, the random resource selection method is limited in handling the high access collision probability of traffic safety messages under scarce frequency resources. At the same time, the variable topology caused by high mobility increases the failure rate of Vehicle-to-Vehicle (V2V) links. Yet traffic safety messages demanding ultra-high reliability and ultra-low latency are essential for traffic safety and road efficiency in such scenarios. To address these challenges, integrating the energy characteristic parameters of sub-frames and sub-carriers into the resource block map has emerged as a promising approach. By incorporating distributed V2V links and designing effective reward functions, it is possible to decrease the access collision probability and smooth the dynamics of the variable topology while maintaining high resource efficiency, thereby better meeting the needs of dense traffic. This research offers an intelligent solution for resource allocation in Cellular Vehicle-to-Everything (C-V2X) and provides theoretical support for coordinated access to limited frequency resources with diverse link quality.  Methods  Exploiting the sustained adjacency among neighboring vehicles in high-density V2V communications, an Energy Characteristic Map (ECM)-based resource allocation algorithm is proposed using Deep Reinforcement Learning. The guidance logic of the ECM algorithm periodically renews the energy indicators of candidate resources to train the weight coefficient matrix of a two-layer Deep Neural Network (DNN), based on the characteristic results within the sensing window. The ECM output then serves as the action space of a double Deep Q-Network (DQN) agent, comprising a main DQN and a target DQN, which maximizes V2V throughput. The state space in the DQN model includes the energy indicators of candidate resources, such as the Received Signal Strength Indicator (RSSI) in sub-frames and the Signal-to-Interference-plus-Noise Ratio (SINR) in sub-carriers, along with dynamic factors such as the relative position and speed of other vehicles. The reward function is crucial for ensuring resource efficiency and the performance of safety messages during resource block selection; it accounts for factors such as the bandwidth and SINR of V2V links to optimize decision-making. Additionally, the discount factor determines the weight of future rewards, balancing immediate against future returns: a lower discount factor emphasizes immediate rewards and leads to frequent resource block reselection, whereas a higher discount factor enhances the robustness of occupied resources.  Results and Discussions  The ECM algorithm periodically renews the energy indicators of candidate resources based on the characteristic results within the sensing window, and these indicators serve as the action space of the double DQN agent. With an appropriate reward function, the main DQN is trained to select candidate resources with high energy indicators for V2V links. The numerical relationship (Eq. (11) and Eq. (15)) between the Packet Reception Ratio (PRR) and the energy indicators is analyzed using discrete-time Markov chains. The end-to-end dissemination performance of safety messages under variable V2V distances, simulated on WiLabV2Xsim, is presented (Fig. 6, Fig. 7).
The reliability, PRR, exceeds 0.95 at densities below 160 veh/km (the blue line), whereas the comparative schemes sustain a PRR above 0.95 only below 120 veh/km (the green line) and 90 veh/km (the red line), respectively (Fig. 10). Likewise, the latency, TD, stays below 3 ms at densities up to 180 veh/km (the blue line), whereas the comparative TD remains below 3 ms only up to 160 veh/km (the green line) and about 80 veh/km (the red line), respectively (Fig. 11). The resource utilization, RU, exceeds 0.6 at densities up to 180 veh/km (the blue line), whereas the comparative RU exceeds 0.6 only up to 160 veh/km (the green line) and about 80 veh/km (the red line), respectively (Fig. 12), demonstrating a 10% to 20% improvement in resource efficiency. When the discount factor is set to 0.9 and the learning rate to 0.01 (Fig. 8, Fig. 9), the Vehicle User Equipment (VUE) selects resource blocks that balance immediate and long-term throughput, effectively improving the robustness of the main DQN and meeting advanced V2V service requirements such as platooning in C-V2X.  Conclusions  This paper addresses resource allocation in high-density V2V communications by integrating the ECM algorithm with a double DQN agent. The proposed resource selection scheme enhances the random resource selection (RSS) method by establishing distributed V2V links on high-quality resource blocks to maximize throughput. The scheme is evaluated through safety message dissemination simulations under variable density, and the results show that: (1) the proposed scheme achieves high reliability, with a PRR above 0.95, and ultra-low latency, with a TD below 3 ms, at densities up to 160 veh/km; (2) resource efficiency is improved by 10% to 20% over the RSS method; (3) selecting a discount factor of 0.9 and a learning rate of 0.01 balances long-term and short-term rewards and enhances the robustness of the DQN model. However, this study does not consider distinct resource characteristics for heterogeneous messages with diverse Quality of Service (QoS) requirements, which should be addressed in future work.
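The double DQN described above can be summarized by its target computation: the main network selects the greedy action and the target network evaluates it, which curbs value overestimation. Below is a minimal numpy sketch of that step; the batch size, action count, and reward values are illustrative, while the discount factor of 0.9 follows the paper.

```python
import numpy as np

def double_dqn_targets(q_main_next, q_target_next, rewards, dones, gamma=0.9):
    """Double-DQN bootstrap target: y = r + gamma * Q_target(s', argmax_a Q_main(s', a))."""
    greedy = np.argmax(q_main_next, axis=1)                 # action selection (main DQN)
    evals = q_target_next[np.arange(len(greedy)), greedy]   # action evaluation (target DQN)
    return rewards + gamma * (1.0 - dones) * evals

# Toy batch: 3 transitions, 4 candidate resource blocks as actions.
rng = np.random.default_rng(0)
q_main = rng.normal(size=(3, 4))
q_tgt = rng.normal(size=(3, 4))
r = np.array([1.0, 0.5, 0.0])
d = np.array([0.0, 0.0, 1.0])   # 1.0 marks a terminal transition
print(double_dqn_targets(q_main, q_tgt, r, d))
```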
Federated Deep Reinforcement Learning-based Intelligent Routing Design for LEO Satellite Networks
LI Xuehua, LIAO Hailong, ZHANG Xian, ZHOU Jiaen
2025, 47(8): 2652-2664.   doi: 10.11999/JEIT250072
[Abstract](347) [FullText HTML](160) [PDF 4203KB](53)
Abstract:
  Objective  The topology of Low Earth Orbit (LEO) satellite communication networks is highly dynamic, rendering traditional terrestrial routing methods unsuitable for direct application. Additionally, due to the limited onboard resources of satellites, Artificial Intelligence (AI)-based routing methods often experience low learning efficiency. Collaborative training requires data sharing and transmission, which poses significant challenges and data security risks. To address these issues, this research introduces Federated Deep Reinforcement Learning (FDRL) into LEO satellite communication networks. By leveraging FDRL’s capabilities in distributed perception, decision-making, and training, it facilitates the efficient learning of global routing strategies. Through local model aggregation and global model sharing among satellite nodes, FDRL dynamically adapts to topology changes while ensuring data privacy, thereby generating optimal routing decisions and enhancing the overall routing performance of LEO satellite networks. Furthermore, integrating Federated Learning (FL) into the LEO satellite network enables autonomous constellation training within regions, eliminating the need to transmit raw data to Ground Stations (GS), thus reducing reliance on GS and minimizing communication overhead during collaborative training.  Methods  A novel FDRL-based intelligent routing method for LEO satellite communication networks is proposed. This method develops a routing model that integrates network, communication, and computational energy consumption, with the optimization objective focused on maximizing the energy efficiency of the LEO satellite network. Utilizing a satellite clustering algorithm, the entire LEO satellite network is partitioned into multiple clusters. Within each cluster, the FDRL framework is implemented, where each LEO satellite uses the Advantage Actor-Critic (A2C) algorithm for local reinforcement learning. The policy network generates efficient routing actions, while the value network dynamically evaluates state values to reduce variance in policy updates. After a specified number of training rounds, the Federated Proximal Algorithm (FedProx) is applied at the cluster head satellite to conduct federated aggregation within the cluster. By collaboratively sharing model parameters among satellites, a global model is jointly trained, enhancing the generalization capability to optimize the network's energy efficiency.  Results and Discussions  To validate the effectiveness of the proposed method, the LEO satellite constellation is first clustered using the suggested clustering algorithm. The number of Cluster Member (CM) nodes within each cluster ranges from 6 to 8 (Fig. 5), with the variation in the CM node count not exceeding 5, indicating relatively stable clustering. FDRL training is then conducted within each cluster. Simulation results show that when the aggregation frequency is set to 400 (i.e., aggregation occurs every 400 time slots), training energy consumption is minimized (Fig. 6), and the reward is most stable (Fig. 7) compared to other aggregation frequencies. Next, the performance of the designed FL-A2C algorithm is compared to other baseline algorithms. The results demonstrate that the FL-A2C algorithm exhibits better convergence and higher total reward values than the benchmarks, namely Sarsa, MAD2QN, and REINFORCE (Fig. 8), although its total reward is slightly lower than that of A2C. 
Compared to Sarsa, REINFORCE, and MAD2QN, the designed method improves average network throughput by 83.7%, 19.8%, and 14.1%, respectively (Fig. 9); reduces average hop count by 25.0%, 18.9%, and 9.1%, respectively (Fig. 10); and enhances energy efficiency by 55.6%, 42.9%, and 45.8%, respectively (Fig. 11).  Conclusions  To address the challenges posed by the highly dynamic network topology of LEO satellite networks and the limitations of traditional terrestrial routing methods, this research presents a multi-agent FDRL routing method combined with satellite clustering. Comprehensive simulations are conducted to evaluate the intelligent routing method, and the results demonstrate that: (1) The designed FL-A2C algorithm achieves better convergence and enhances the energy efficiency of LEO satellite networks; (2) The stability of LEO satellite clustering is ensured by the proposed scheme; (3) The intelligent routing method outperforms the benchmark schemes (Sarsa, REINFORCE, MAD2QN) on three metrics, achieving 83.7%/19.8%/14.1% higher network throughput, 25.0%/18.9%/9.1% lower hop counts, and 55.6%/42.9%/45.8% better energy efficiency, respectively.
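The FedProx aggregation performed at the cluster head admits a compact statement: each member minimizes its local loss plus a proximal penalty tying it to the cluster model, and the head averages the results. The sketch below shows one proximal gradient step and the averaging; the proximal weight mu, learning rate, and toy parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fedprox_local_step(w_local, w_global, grad, lr=0.01, mu=0.1):
    """One gradient step on F_k(w) + (mu/2) * ||w - w_global||**2.
    The proximal term keeps each satellite's model near the cluster model."""
    return w_local - lr * (grad + mu * (w_local - w_global))

def fedprox_aggregate(local_models):
    """Cluster-head aggregation: average of the member models."""
    return np.mean(np.stack(local_models), axis=0)

# Toy cluster: three member satellites with 2-parameter local models.
w_global = np.zeros(2)
members = [fedprox_local_step(np.array([1.0, -1.0]), w_global, grad=np.array([0.2, 0.1]))
           for _ in range(3)]
print(fedprox_aggregate(members))
```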
Swin Transformer-based Wideband Wireless Image Transmission Semantic Joint Encoding and Decoding Method
SHEN Bin, LI Xuan, LAI Xuebing, YANG Shuhan
2025, 47(8): 2665-2674.   doi: 10.11999/JEIT250039
[Abstract](295) [FullText HTML](179) [PDF 3029KB](54)
Abstract:
  Objective  Conventional studies on image semantic communication primarily address simplified channel models, such as Gaussian and Rayleigh fading channels. However, real-world wireless communication environments are characterized by complex multipath fading, which necessitates advanced signal processing at both the transmitter and receiver. To address this challenge, this paper proposes a Wideband Wireless Image Transmission Semantic Communication (WWIT-SC) system based on the Swin Transformer. The proposed method enhances image transmission performance in multipath fading channels through end-to-end semantic joint encoding and decoding.  Methods  The WWIT-SC system adopts the Swin Transformer as the core architecture for semantic encoding and decoding. This network not only processes semantic image representations but also improves adaptability to complex channel conditions through a joint mechanism based on Channel State Information (CSI) and Coordinate Attention (CA). CSI, a key signal in wireless systems, enables accurate estimation of channel conditions. However, due to temporal variations in wireless channels, CSI is often subject to attenuation and distortion, reducing its effectiveness when used in isolation. To address this limitation, the system incorporates a CSI-guided CA mechanism that enables fine-grained mapping and adjustment of semantic features across subcarriers. This mechanism integrates spatial and channel-domain features to localize critical information adaptively, thereby accommodating the channel’s time-varying behavior. A Channel Estimation Subnetwork (CES) is further implemented at the receiver to correct CSI estimation errors introduced by noise and dynamic channel variations. The CES enhances CSI accuracy during decoding, resulting in improved semantic image reconstruction quality.  Results and Discussions   The WWIT-SC and CA-JSCC models are trained under fixed Signal-to-Noise Ratio (SNR) conditions and evaluated at the same SNR values. Across all SNR levels, the WWIT-SC model consistently outperforms CA-JSCC. Specifically, Peak Signal-to-Noise Ratio (PSNR) improves by 6.4%, 8.5%, and 9.3% at different bandwidth ratios (R=1/12, 1/6, 1/3)(Fig.4). Both models are also trained using SNR values randomly selected from the range [0, 15] dB and tested at various SNR levels. Although random SNR training leads to reduced overall performance compared to fixed SNR training, WWIT-SC maintains superior performance over CA-JSCC across all conditions. Under these settings, PSNR gains of up to 6.8%, 8.3%, and 9.8% are achieved at different bandwidth ratios (R=1/12, 1/6, 1/3)(Fig. 4). Further evaluation is conducted by training both models on randomly cropped ImageNet images and testing them on the Kodak dataset. The WWIT-SC model trained on the larger dataset achieves up to a 4% PSNR improvement over CA-JSCC on Kodak (Fig. 6). A series of ablation experiments are conducted to assess the contributions of each module in WWIT-SC. First, the Swin Transformer is replaced with the Feature Learning (FL) module from CA-JSCC. Across all three bandwidth ratios, PSNR values for WWIT-SC exceed those of the modified WWIT-SC-FL variant at all SNR levels (Fig. 5(a)), confirming the importance of multi-scale feature extraction. Next, the CSI-CA module is replaced with the Channel Learning (CL) module from CA-JSCC. Again, WWIT-SC outperforms the modified WWIT-SC-CL model across all bandwidth ratios and SNR values (Fig. 
5(b)), highlighting the role of the long-range dependency mechanism in enhancing feature localization and adaptation. Finally, the CES is removed to assess its contribution. The original WWIT-SC model consistently achieves higher PSNR values than the variant without CES at all bandwidth ratios and SNR levels (Fig. 5(c)), demonstrating that the inclusion of CES substantially improves channel decoding accuracy.  Conclusions  This paper proposes a Swin Transformer-based WWIT-SC system, integrating Orthogonal Frequency Division Multiplexing (OFDM) technology to enhance semantic image transmission under multipath fading channels. The scheme employs the Swin Transformer as the backbone for the semantic encoder-decoder and incorporates a CSI-assisted CA mechanism to accurately map critical semantic features to subcarriers, adapting to time-varying channel conditions. In addition, a CES at the receiver compensates for channel estimation errors, improving CSI accuracy. Experimental results show that, compared to CA-JSCC, the WWIT-SC system achieves up to a 9.8% PSNR improvement. This work presents a novel solution for semantic image transmission in complex broadband wireless communication environments.
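All of the comparisons above are reported in PSNR; for completeness, the metric reduces to a one-line helper, shown here with illustrative 8-bit image data.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak**2 / MSE)."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(reconstruction, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy 8-bit image corrupted by Gaussian noise.
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(64, 64))
noisy = np.clip(ref + rng.normal(0, 10, ref.shape), 0, 255)
print(f"PSNR: {psnr(ref, noisy):.2f} dB")
```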
Radar, Navigation and Array Signal Processing
Design of a Very Low Frequency Magnetic Induction Communication System Based on a Series-Array Magnetoelectric Antenna
ZHANG Feng, LI Jiaran, TIAN Yuxiao, XU Ziyang, GONG Zhaoqian, ZHUANG Xin
2025, 47(8): 2675-2684.   doi: 10.11999/JEIT250065
[Abstract](243) [FullText HTML](114) [PDF 4195KB](36)
Abstract:
  Objective  MagnetoElectric (ME) antennas, recognized for their high energy conversion efficiency and compact structure, have gained attention in portable cross-medium communication systems. In the Very Low Frequency (VLF) range, conventional antennas are typically large and difficult to deploy, whereas mechanical antennas—though smaller—exhibit limited radiation intensity, constraining communication range. To address these limitations, this study proposes a portable VLF magnetic induction communication system based on a series-array ME antenna. By connecting seven ME antenna units in series, the radiated field strength is substantially increased. Through the combination of strong ME coupling and an optimized system design, this work offers a practical solution for compact low-frequency communication.  Methods  The radiated magnetic flux density of the antenna is evaluated using a small air-core coil (diameter: 50 mm; length: 120 mm) with a gain-500 preamplifier as the receiving antenna. The conversion coefficient Tr of the receiving antenna is calibrated using a standard Helmholtz coil, enabling conversion of the measured voltage to magnetic flux density. The ME antenna is driven by a signal generator and power amplifier, and the magnetic field strength is measured at a distance of 1.2 m under different drive voltages. To balance hardware simplicity and efficient bandwidth usage, Binary Amplitude Shift Keying (BASK) modulation is employed. On the transmitter side, a computer transmits the bitstream to a Field-Programmable Gate Array (FPGA), which generates the baseband signal and multiplies it by a 27.2 kHz carrier to produce the modulated signal. Following power amplification, the signal directly drives the ME antenna. On the receiver side, the air-core coil receives the transmitted signal, which is subsequently amplified by the preamplifier. A National Instruments (NI) data acquisition module digitizes the signal. Demodulation, including filtering, coherent detection, and symbol decision, is performed on a computer. For laboratory safety and signal stability, the Root Mean Square (RMS) drive voltage is set to 14.8 V, and the symbol rate is fixed at 50 bps. Communication experiments are conducted over distances from 1.2 m to 11.4 m.  Results and Discussions  (1) Antenna radiation intensity. When the RMS drive voltage of the series-array ME antenna is 180.5 V (25.8 V per unit), the measured magnetic field strength reaches 93.6 nT at 1.2 m and 165 nT at 1.0 m. These values indicate strong performance among acoustically driven ME antennas. The results demonstrate that the combination of ME materials with a seven-element series configuration substantially enhances both ME coupling and radiated field strength. (2) System communication performance. The BASK system operates at 50 bps, matching the measured 111 Hz bandwidth of the ME antenna. The receiving antenna exhibits a bandwidth of 851 Hz at 27.6 kHz, which fully covers the transmitted signal. Due to laboratory space constraints, 128-bit random data are transmitted over distances ranging from 1.2 m to 11.4 m. Even at 11.4 m—where the received signal amplitude falls below 0.004 V—the proposed demodulation scheme successfully recovers the transmitted data. To verify these results, a theoretical model of magnetic field attenuation with distance is fitted to the experimental data, showing strong agreement except for minor deviations attributed to environmental noise. 
Noise spectrum analysis within a 100 Hz bandwidth centered at 27.2 kHz indicates a maximum environmental noise level of approximately 4.41 pT, resulting in a Signal-to-Noise Ratio (SNR) of 12.65 dB at 11.4 m. Based on the theoretical relationship between SNR and Bit Error Rate (BER) for coherent ASK, the maximum BER under these conditions is approximately 0.12%, consistent with the measured performance.  Conclusions  This study presents a VLF magnetic induction communication system based on a series-array ME antenna, with the ME antenna serving as the transmitter and an air-core coil as the receiver. A standard Helmholtz coil circuit is used to calibrate the conversion coefficient between received voltage and magnetic flux density. The radiated magnetic field strength is characterized by varying the ME antenna’s drive voltage. Notably, at an RMS drive voltage of 180.5 V, the ME antenna generates a magnetic induction of 165 nT at a distance of 1 m. Laboratory communication experiments confirm that, with a drive voltage of 14.8 V, ASK transmission achieves a range of 11.4 m at a symbol rate of 50 bps. In a high-noise environment with an in-band noise level of 4.41 pT, the system achieves a BER of 0.12%, consistent with theoretical predictions and confirming the reliability of the demodulation process. These results demonstrate the feasibility and efficiency of ME antennas for compact, low-frequency magnetic communication. Further performance improvements may be achieved by (1) operating in low-noise environments and (2) increasing the drive voltage to enhance radiation strength by up to a factor of 6.4.
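The BASK chain described above (27.2 kHz carrier, 50 bps, coherent detection) is simple enough to sketch end to end. The sample rate, noise level, and fixed decision threshold below are illustrative assumptions; the actual receiver also includes filtering and calibrated gain stages.

```python
import numpy as np

FS = 200_000          # sample rate (assumed for illustration)
FC = 27_200.0         # carrier frequency from the paper, 27.2 kHz
RB = 50               # symbol rate from the paper, 50 bps
SPS = FS // RB        # samples per symbol

def bask_modulate(bits):
    t = np.arange(len(bits) * SPS) / FS
    return np.repeat(bits, SPS).astype(float) * np.sin(2 * np.pi * FC * t)

def bask_demodulate(signal):
    t = np.arange(len(signal)) / FS
    mixed = signal * np.sin(2 * np.pi * FC * t)   # coherent detection
    sym = mixed.reshape(-1, SPS).mean(axis=1)     # integrate over each symbol
    return (sym > 0.25).astype(int)               # half the expected "on" level of 0.5

bits = np.random.default_rng(3).integers(0, 2, 16)
rx = bask_modulate(bits) + np.random.default_rng(4).normal(0, 0.1, 16 * SPS)
print("bit errors:", int(np.sum(bits != bask_demodulate(rx))))
```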
Time-Series Information-Driven Parallel Interactive Multiple Model Algorithm for Underwater Target Tracking
LAN Chaofeng, ZHANG Tongji, CHEN Huan
2025, 47(8): 2685-2693.   doi: 10.11999/JEIT250044
[Abstract](218) [FullText HTML](107) [PDF 2071KB](40)
Abstract:
  Objective  Accurate underwater target tracking is critical in marine surveillance, military reconnaissance, and resource management. The nonlinear, stochastic, and uncertain motion of underwater targets, exacerbated by complex environmental dynamics, limits the effectiveness of traditional tracking methods. The Interacting Multiple Model (IMM) algorithm addresses this challenge by integrating several motion models and adaptively switching between them based on the target’s dynamic state. Such adaptability enables improved tracking under abrupt motion transitions, such as submarine maneuvers or the irregular paths of unmanned underwater vehicles. However, the classical IMM algorithm relies on a fixed Transition Probability Matrix (TPM), which can delay model switching and reduce tracking accuracy in highly dynamic settings. To overcome these limitations, this paper proposes an adaptive IMM algorithm that incorporates timing information, parallel processing, and information entropy. These enhancements improve model-switching speed and accuracy, increase adaptability to environmental changes, and boost overall tracking performance and stability.  Methods  This study proposes a Temporal Information Parallel Interacting Multiple Model (TIP-IMM) target tracking algorithm that integrates temporal information, information entropy evaluation, and parallel processing to adaptively correct the TPM. At each time step, the algorithm identifies the model with the highest probability and assesses whether this model remains dominant across consecutive time steps. If consistent, it is designated the main model. The algorithm then evaluates changes in the main model’s probability relative to other models and updates the TPM based on defined correction rules. A parallel structure is introduced: Module A performs TPM correction, while Module B executes a standard IMM algorithm. Both modules operate concurrently. Information entropy is employed to quantify the uncertainty of the model probability distribution. When entropy is low in Module A, its corrected results are prioritized; when entropy is high, the system places greater reliance on Module B to ensure stable and robust performance under varying conditions.  Results and Discussions  The proposed TIP-IMM algorithm is evaluated through simulation experiments, demonstrating improved tracking accuracy and stability. True motion and observation trajectories are generated using predefined initial parameters and a motion model. Filtering results from TIP-IMM are compared with those of three benchmark algorithms. The estimated trajectories from TIP-IMM align more closely with the true trajectories, as confirmed by the enlarged views in panels A and B (Fig. 2). Analysis of model probability evolution during filtering indicates that TIP-IMM exhibits smaller fluctuations during model transitions and identifies the dominant model more rapidly than the comparison methods (Fig. 3). To quantify tracking performance, Root Mean Square Error (RMSE) serves as the evaluation metric. TIP-IMM consistently yields lower and smoother RMSE curves across the full trajectory and in both X and Y directions (Figs. 4 and 5). Furthermore, average RMSE (ARMSE) serves as a comprehensive indicator. TIP-IMM achieves lower errors in both position and velocity estimates (Table 2), consistent with the trend observed in RMSE analysis. 
Although the algorithm incurs a slightly higher runtime than the reference methods, its execution time remains within the millisecond range, meeting the real-time requirements of practical applications (Table 3).  Conclusions  This study proposes a TIP-IMM algorithm to address limitations of the classical IMM algorithm, particularly model-switching delays and over-smoothing during abrupt target motion changes. By incorporating temporal correlation of model probabilities, parallel processing, and information entropy, TIP-IMM improves responsiveness and transition smoothness in dynamic environments. Simulation experiments confirm that TIP-IMM achieves faster and more accurate model switching than existing methods. Compared with the traditional IMM and benchmark algorithms, TIP-IMM improves overall tracking accuracy by 3.52% to 7.87% across multiple scenarios. It also reduces estimation error recovery time while maintaining high accuracy during sudden motion transitions. These results demonstrate the algorithm’s enhanced adaptability, robustness, and stability, making it well suited for underwater target tracking applications.
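The classical model-probability update that TIP-IMM builds on is shown below: the TPM propagates the previous model probabilities, and each model is then reweighted by its measurement likelihood. The two-model TPM here is a fixed illustration; it is precisely the matrix that TIP-IMM corrects adaptively.

```python
import numpy as np

def imm_model_probs(mu_prev, tpm, likelihoods):
    """One IMM model-probability update: predict with the TPM,
    then apply a Bayes reweighting with per-model likelihoods."""
    predicted = tpm.T @ mu_prev          # c_j = sum_i p_ij * mu_i
    mu = likelihoods * predicted
    return mu / mu.sum()

# Two motion models, e.g., constant velocity vs. coordinated turn.
tpm = np.array([[0.95, 0.05],
                [0.05, 0.95]])           # fixed TPM; TIP-IMM adapts these entries
mu = np.array([0.9, 0.1])
print(imm_model_probs(mu, tpm, likelihoods=np.array([0.2, 1.4])))
```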
Trust Adaptive Event-triggered Robust Extended Kalman Fusion Filtering for Target Tracking
ZHU Hongbo, JIN Jiahui
2025, 47(8): 2694-2702.   doi: 10.11999/JEIT250103
[Abstract](132) [FullText HTML](114) [PDF 2409KB](24)
Abstract:
  Objective  Mobile Wireless Sensor Networks (MWSNs) with dynamic topology exhibit considerable application value across various fields, making target tracking a critical area of research. Although conventional filtering algorithms and event-triggered schemes have enabled basic target tracking, they remain limited in addressing motion modeling errors, Received Signal Strength (RSS) quantization inaccuracies, and adaptation to dynamic network conditions. To overcome these limitations, this study proposes a trust-adaptive event-triggered mechanism combined with an improved Extended Kalman Filter (EKF). The mechanism dynamically schedules a suitable number of trust anchor nodes based on network conditions, while the robust EKF estimates the motion state of the mobile target. This approach ensures stable, accurate, and consistent estimation even under time-varying process and measurement noise covariance. The proposed method offers an effective solution for RSS-based tracking in resource-constrained MWSNs by reducing power, computation, and bandwidth consumption, while improving tracking accuracy and maintaining robustness against measurement uncertainty and faulty nodes.  Methods  In the resource-constrained MWSN environment, a robust extended Kalman fusion filtering method with trust-adaptive event triggering is proposed for target tracking. This method incorporates a trust-adaptive, event-driven anchor node scheduling and information exchange mechanism. It dynamically adjusts to the spatial distribution of trusted anchor nodes near the target, schedules a number of anchor nodes close to the optimal value, and streamlines communication between these nodes and the mobile target. This design substantially reduces power, computational, and bandwidth demands while maintaining measurement credibility. To address uncertainties arising from motion modeling and RSS quantization, a robust extended Kalman trust fusion filtering algorithm based on mean drift is developed. The algorithm estimates the actual covariance by randomly sampling uniformly distributed process and measurement noise covariance matrices, thereby compensating for discrepancies between model predictions and observations. Additionally, only measurements from nodes identified as reliable are incorporated via adaptive weighted fusion, which enhances the stability, robustness, and accuracy of target tracking.  Results and Discussions  The proposed trust-adaptive event-triggered robust extended Kalman fusion filtering method substantially improves target tracking performance in resource-constrained MWSNs. By integrating a dynamic anchor node scheduling mechanism with a dual-layer noise compensation strategy, the method adjusts the response radius in real time through trust-adaptive event triggering. Therefore, the average number of trust response anchors remains stable at a preset target (for example, ANoTRA = 5.0583 when N_t = 5), while reducing communication resource consumption by 53.8% compared with the fixed-threshold method (Fig. 2; Table 2). Furthermore, the use of uniformly distributed random sampling enables the algorithm to account for system uncertainty when the process noise covariance q is within [0.25, 1.5]. The introduction of a mean-shift algorithm helps to eliminate abnormal measurements, leading to a 42.6% reduction in tracking Root-Mean-Square Error (RMSE) compared with traditional approaches (Fig. 3, Fig. 4, Fig. 5).
Under complex environmental conditions, with parameters set as q ∈ [0.25, 1.5], H = 10, L = 6, N_t = 5, and m = 8, the method demonstrates high accuracy and robustness. These results indicate that the proposed approach not only enhances tracking precision but also significantly improves the efficiency of resource utilization.  Conclusions  This study addresses the problem of mobile target tracking in resource-constrained MWSNs by integrating a trust-adaptive event-triggering mechanism with a robust extended Kalman fusion filtering algorithm. The proposed method leverages the advantages of trust-based adaptive triggering and robust filtering to achieve high tracking accuracy while reducing power, computational, and communication overhead. Simulation results demonstrate that (1) the robust EKF reduces the tracking root mean square error by 42.6% compared with the conventional EKF, and (2) the trust-adaptive event-triggering mechanism reduces communication resource consumption by 53.8% relative to static schemes such as non-trust-based adaptive triggering. This work focuses on tracking under low-noise conditions. Future research will extend the method to more complex nonlinear systems and explore the integration of statistical approaches and deep learning techniques for enhanced outlier identification and suppression under high interference.
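The abstract does not give the exact scheduling law, so the sketch below is a hypothetical proportional-control stand-in for the trust-adaptive trigger: the response radius grows or shrinks so that the number of responding trusted anchors tracks the preset target N_t. The gain and radius bounds are invented for illustration.

```python
def adapt_response_radius(radius, responders, target=5, gain=0.2,
                          r_min=1.0, r_max=50.0):
    """Hypothetical trigger-radius update: contract when too many trusted
    anchors respond, expand when too few, tracking the target N_t."""
    radius *= 1.0 + gain * (target - responders) / max(target, 1)
    return min(max(radius, r_min), r_max)

# Toy run: an oversubscribed radius contracts toward the target of 5.
r = 20.0
for responders in [9, 8, 6, 5, 5]:
    r = adapt_response_radius(r, responders)
    print(f"responders={responders}, new radius={r:.2f}")
```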
Radar High-speed Target Tracking via Quick Unscented Kalman Filter
SONG Jiazhen, SHI Zhuoyue, ZHANG Xiaoping, LIU Zhenyu
2025, 47(8): 2703-2713.   doi: 10.11999/JEIT250010
[Abstract](372) [FullText HTML](135) [PDF 3331KB](43)
Abstract:
  Objective  The increasing prevalence of high-speed targets due to advancements in space technology presents new challenges for radar tracking. The pronounced motion of such targets within a single frame induces large variations in range, causing dispersion of echo energy across the range-Doppler plane and invalidating the assumption of concentrated target energy. This results in “range cell migration” and “Doppler cell migration”, both of which degrade tracking accuracy. To address these challenges, this study proposes a Quick Unscented Kalman Filter (Q-UKF) algorithm tailored for high-speed radar target tracking. The Q-UKF performs recursive, pulse-by-pulse state estimation directly from radar echo signals, thereby improving tracking precision and eliminating the need for conventional energy correction and migration compensation. Furthermore, the algorithm employs the Woodbury matrix identity to reduce computational burden while preserving the estimation accuracy of the standard Unscented Kalman Filter (UKF).  Methods  The target state vector at each pulse time is modeled as a three-dimensional random vector representing position, velocity, and acceleration. Target motion is governed by a kinematic model that characterizes its temporal dynamics. A measurement model is formulated based on the radar echo signals received at each pulse, defining a nonlinear relationship between the target state and the observed measurements. This formulation supports recursive state estimation. In the classical UKF, the high dimensionality of radar echo data necessitates frequent inversion of large covariance matrices, imposing a substantial computational burden. To mitigate this issue, the Q-UKF is developed. By incorporating the Woodbury matrix identity, the Q-UKF reduces the computational complexity of matrix inversion without compromising estimation accuracy relative to the classical UKF. Within this framework, Q-UKF performs pulse-by-pulse recursive estimation, integrating all measurements up to the current pulse to improve prediction accuracy. In contrast to conventional radar tracking methods that process complete frame data and apply multiple signal correction steps, Q-UKF operates directly on raw measurements and avoids such corrections, thereby simplifying the processing pipeline. This efficiency makes Q-UKF well suited for real-time tracking of high-speed targets.  Results and Discussions  The performance of the proposed Q-UKF method is assessed using Monte Carlo simulations. Estimation errors of the Q-UKF and Extended Kalman Filter (EKF) are compared over time (Fig. 3). During the effective pulse periods within each frame cycle, both methods yield accurate target state estimates. Estimation errors increase during the delay intervals, but rapidly decrease and stabilize once effective pulse signals resume, forming a periodic error pattern. To evaluate robustness, the Root Mean Square Error (RMSE) of state estimation is examined under varied initial conditions, including different positions, velocities, and accelerations. In all scenarios, both Q-UKF and EKF perform reliably, with Q-UKF consistently demonstrating superior accuracy (Fig. 4). Under Signal-to-Noise Ratios (SNRs) from –15 dB to 0 dB, the RMSEs in both Gaussian and Rayleigh noise environments (Fig. 5a and Fig. 5b) decrease with increasing SNR. Q-UKF maintains high accuracy even under low SNR conditions. 
In the Gaussian noise setting, Q-UKF improves estimation accuracy by an average of 10.60% relative to EKF; in the Rayleigh environment, the average improvement is 9.55%. In terms of computational efficiency, Q-UKF demonstrates the lowest runtime among the evaluated methods (EKF, UKF, and Particle Filter (PF)). The average computation time per effective pulse is reduced by 8.91% compared to EKF, 72.55% compared to UKF, and over 90% compared to PF (Table 2). This efficiency gain results from applying the Woodbury matrix identity, which alleviates the computational load of matrix inversion in high-dimensional radar echo data processing.  Conclusions  This study presents the Q-UKF method for high-speed target tracking in radar systems. The algorithm performs pulse-by-pulse state estimation directly from radar echo signals, advancing estimation granularity from the frame level to the pulse level. By removing the need for energy accumulation and migration correction, Q-UKF simplifies the conventional signal processing pipeline. The method incorporates the Woodbury matrix identity to efficiently invert covariance matrices, substantially reducing computational load. Simulation results show that Q-UKF consistently outperforms the EKF in estimation accuracy under varied initial target states, achieving an average improvement of approximately 10.60% under Gaussian noise and 9.55% under Rayleigh noise. Additionally, Q-UKF improves computational efficiency by 8.91% compared to EKF. Compared to the classical UKF, Q-UKF delivers equivalent accuracy with significantly reduced runtime. Although the PF may yield slightly better accuracy under certain conditions, its computational demand limits its practicality in real-time applications. Overall, Q-UKF provides a favorable balance between accuracy and efficiency, making it a viable solution for real-time tracking of high-speed targets. Its ability to address high-dimensional, nonlinear measurement problems also highlights its potential for broader application.
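The complexity saving comes from the Woodbury matrix identity, which replaces an n x n re-inversion with a small k x k solve when the matrix changes by a low-rank update. The numpy sketch below verifies the identity numerically on illustrative dimensions.

```python
import numpy as np

def woodbury_inverse(A_inv, U, C, V):
    """(A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}.
    Only a k x k system is solved instead of re-inverting the n x n matrix."""
    S = np.linalg.inv(C) + V @ A_inv @ U          # small k x k matrix
    return A_inv - A_inv @ U @ np.linalg.solve(S, V @ A_inv)

rng = np.random.default_rng(5)
n, k = 200, 3                                     # n: echo dimension, k: update rank
A_inv = np.eye(n) / 2.0                           # known inverse of A = 2I
U, V = rng.normal(size=(n, k)), rng.normal(size=(k, n))
C = np.eye(k)
direct = np.linalg.inv(2.0 * np.eye(n) + U @ C @ V)
print(np.allclose(woodbury_inverse(A_inv, U, C, V), direct))  # True
```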
Non-orthogonal Prolate Spheroidal Wave Functions Signal Detection Method with Cross-terms
LU Faping, MAO Zhongyang, XU Zhichao, SHU Yihao, KANG Jiafang, WANG Feng, WANG Mengjiao
2025, 47(8): 2714-2723.   doi: 10.11999/JEIT250052
[Abstract](200) [FullText HTML](73) [PDF 2216KB](34)
Abstract:
  Objective  Non-orthogonal Shape Modulation based on Prolate Spheroidal Wave Functions (NSM-PSWFs) utilizes PSWFs with high time-frequency energy concentration as basis waveforms. This structure enables high spectral efficiency and time-frequency energy aggregation, making it a promising candidate for B5G/6G waveform design. However, due to the non-orthogonality of the PSWFs used for information transmission in NSM-PSWFs, mutual interference between non-orthogonal signals and poor bit error performance in coherent detection systems significantly limit their practical deployment. This issue is a common challenge in non-orthogonal modulation and access technologies. To address the problem of low detection performance resulting from mutual interference among non-orthogonal PSWFs, this study incorporates time-frequency domain characteristics into signal detection. A novel detection mechanism for non-orthogonal PSWFs in the time-frequency domain is proposed, with the aim of reducing interference between non-orthogonal PSWFs and enhancing detection performance.  Methods  Given the different time-frequency domain energy distribution characteristics of PSWF signals at various stages, particularly the "local" energy concentration in different regions, this study introduces cross-terms between signals. Based on an analysis of non-orthogonal signal time-frequency characteristics, with a focus on innovating detection mechanisms, a combined approach of theoretical modeling and numerical simulation is employed to explore novel methods for detecting non-orthogonal PSWF signals via cross-terms. Specifically: (1) The impact of interference between PSWF signals and Gaussian white noise on the time-frequency distribution of cross-terms is analyzed, demonstrating the feasibility of detecting PSWF signals in the time-frequency domain. (2) Building on this analysis, time-frequency characteristics are integrated into the detection process. A novel method for detecting non-orthogonal PSWFs based on cross-terms is proposed, accompanied by a strategy for selecting time-frequency feature parameters. The "integral value of cross-terms over symmetric time intervals at the frequency corresponding to the peak energy density of cross-terms" is chosen as the feature parameter. This shifts signal detection from the "one-dimensional energy domain (time or frequency)" to the "two-dimensional time-frequency energy domain," enabling detection by exploiting localized energy regions while simultaneously mitigating interference during statistical acquisition.  Results and Discussions  This study demonstrates the feasibility of detecting signals in the two-dimensional time-frequency domain and analyzes the impact of different PSWFs and AWGN on the distribution characteristics of cross-terms. Notably, AWGN interference can be regarded as a special form of “interference between PSWFs” exhibiting a linear superposition with PSWF-induced interference. The interference from PSWFs with time-domain parity opposite to that of the template signal can be eliminated through “symmetric time-interval integration” (Fig. 1, Table 1, Table 2). This establishes a theoretical foundation for the novel detection mechanism based on cross-terms and provides a reference for incorporating other two-dimensional distribution characteristics into signal detection. Additionally, a novel detection mechanism for non-orthogonal PSWFs based on cross-terms is proposed, utilizing time-frequency distribution characteristics for signal detection. 
This method effectively reduces interference between non-orthogonal PSWFs, thereby enhancing detection performance. It also offers valuable insights for exploring detection mechanisms based on other two-dimensional distribution characteristics. For example, compared with conventional coherent detection, the proposed method achieves a gain of approximately 1 dB at a bit error rate of 4 × 10⁻⁵ (Fig. 4).  Conclusions  This paper demonstrates the feasibility of detecting PSWFs in the two-dimensional time-frequency domain and proposes a novel detection method for non-orthogonal PSWFs based on cross-terms. The proposed method shifts traditional coherent detection from “global energy in the time/frequency domain” to “local energy in the time-frequency domain”, significantly reducing interference between non-orthogonal signals and enhancing detection performance. This approach not only provides a new perspective for developing efficient detection methods for non-orthogonal signals but also serves as a valuable reference for investigating novel signal detection mechanisms in two-dimensional domains.
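The parity-cancellation property invoked above has a one-screen demonstration: integrating over a symmetric interval annihilates any odd-parity component while preserving the even one, which is why interference with parity opposite to the template drops out of the cross-term statistic. The functions below are toy stand-ins, not PSWFs.

```python
import numpy as np

# Over t in [-T, T], the integral of any odd function vanishes, so
# opposite-parity interference cancels while the same-parity part survives.
T, n = 1.0, 10_001
t = np.linspace(-T, T, n)
dt = t[1] - t[0]
even_term = np.exp(-4 * t**2)             # same parity as an even template
odd_term = t * np.exp(-4 * t**2)          # opposite-parity interference
print(np.sum(even_term + odd_term) * dt)  # ~0.88: the even part survives
print(np.sum(odd_term) * dt)              # ~0: the odd part cancels
```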
Multiple Maneuvering Target Poisson Multi-Bernoulli Mixture Filter for Gaussian Process Cognitive Learning
ZHAO Ziwen, CHEN Hui, LIAN Feng, ZHANG Guanghua, ZHANG Wenxu
2025, 47(8): 2724-2735.   doi: 10.11999/JEIT241139
[Abstract](143) [FullText HTML](71) [PDF 2769KB](10)
Abstract:
  Objective  Multiple Maneuvering Target Tracking (MMTT) remains a critical yet challenging problem in radar signal processing and sensor fusion, particularly under complex and uncertain conditions. The primary difficulty arises from the unpredictable or highly dynamic nature of target motion. Conventional model-based methods, especially Multiple Model (MM) approaches, rely on predefined motion models to accommodate varying target behaviors. However, these methods face limitations, including sensitivity to initial parameter settings, high computational cost due to model switching, and degraded performance when actual target behavior deviates from the assumed model set. To address these limitations, this study proposes a data-driven MMTT method that combines Gaussian Process (GP) learning with the Poisson Multi-Bernoulli Mixture (PMBM) filter to improve robustness and tracking accuracy in dynamic environments without requiring extensive model assumptions.  Methods  The proposed method exploits the data-driven modeling capability of GP, a non-parametric Bayesian inference approach that learns high-dimensional, nonlinear function mappings from limited historical data without specifying explicit functional forms. In this study, GP models both the state transition and observation processes of multi-target systems, reducing the dependence on predefined motion models. During the offline phase, historical target trajectories and sensor measurements are collected to build a training dataset. The squared exponential kernel is selected for its smoothness and infinite differentiability, which effectively captures the continuity and dynamic characteristics of target state evolution. GP hyperparameters, including length scale, signal variance, and observation noise variance, are jointly optimized by maximizing the log-marginal likelihood, ensuring generalization and expressiveness in complex environments. In the online filtering phase, the trained GP models are incorporated into the PMBM filter, forming a recursive GP-PMBM filtering structure. Within this framework, the PMBM filter employs a Poisson point process to represent undetected targets and a multi-Bernoulli mixture to characterize the posterior state distribution of detected targets. During the prediction step, the GP-derived nonlinear state transition model is propagated using the Cubature Kalman Filter (CKF). In the update step, the GP-learned observation model refines state estimates, enhancing both tracking accuracy and robustness.  Results and Discussions  Extensive simulation experiments under two different MMTT scenarios validate the effectiveness and performance advantages of the proposed method. In Scenario 1, a moderate 2D surveillance environment with clutter and a varying number of targets is constructed. The GP-PMBM filter significantly outperforms existing methods, including LSTM-PMBM, MM-PMBM, MM-GLMB, and MM-PHD filters, based on the Generalized Optimal Sub-Pattern Assignment (GOSPA) metric (Fig. 3). In addition, the GP-PMBM filter achieves the lowest standard deviation in cardinality estimation, demonstrating high accuracy and stability (Fig. 4). Further experiments under different monitoring conditions confirm the robustness of GP-PMBM. When clutter rates vary, the GP-PMBM filter consistently achieves the lowest average GOSPA error, reflecting strong stability under interference (Fig. 5). As detection probability decreases, most algorithms show significant degradation in accuracy. 
However, GP-PMBM maintains superior tracking performance, achieving the lowest GOSPA distance across all detection conditions (Fig. 6). In Scenario 2, target motion becomes more complex, with increased maneuverability and higher–frequency birth–death dynamics. Despite these challenges, the GP-PMBM filter maintains superior tracking performance, even under highly maneuverable conditions and frequent target appearance and disappearance (Fig. 9, Fig. 10).  Conclusions  This study proposes a novel GP-PMBM filtering framework for MMTT in complex environments. By integrating the data-driven learning capability of the GP with the PMBM filter, the proposed method addresses the limitations of conventional model-based tracking approaches. The GP-PMBM filter automatically learns unknown motion and observation models from historical data, eliminating the dependence on predefined model sets and significantly improving adaptability. Simulation results confirm that the GP-PMBM filter achieves superior tracking accuracy, improved cardinality estimation, and enhanced robustness under varying clutter levels and detection conditions. These results indicate that the proposed method is well-suited for environments characterized by frequent maneuvering changes and uncertain target behavior. Future work will focus on extending the GP-PMBM framework to multi-maneuvering extended target tracking tasks to address more challenging scenarios.
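At the core of the framework is plain GP regression with the squared exponential kernel. The sketch below computes the posterior mean of a 1D state-transition map from noisy samples; the hyperparameters are fixed for brevity, whereas the paper fits them by maximizing the log-marginal likelihood.

```python
import numpy as np

def se_kernel(a, b, length=1.0, signal_var=1.0):
    """Squared exponential kernel k(a, b) = s**2 * exp(-(a-b)**2 / (2*l**2))."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise_var=0.01):
    """GP regression mean: K_* (K + sigma_n**2 I)^{-1} y."""
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    return se_kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

# Learn a 1D transition map x_{k+1} = f(x_k) from noisy trajectory samples.
rng = np.random.default_rng(6)
x = np.linspace(-2, 2, 25)
y = 0.9 * x + 0.3 * np.sin(2 * x) + rng.normal(0, 0.05, x.size)
print(gp_posterior_mean(x, y, np.array([-1.0, 0.0, 1.0])))
```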
Research on Unmanned Aircraft Radio Frequency Signal Recognition Algorithm Based on Wavelet Entropy Features
LIU Bing, SHI Mingxin, LIU Jiaqi
2025, 47(8): 2736-2745.   doi: 10.11999/JEIT250051
[Abstract](484) [FullText HTML](192) [PDF 3910KB](65)
Abstract:
  Objective   With the rapid development and broad application of Unmanned Aerial Vehicle (UAV) technology, its use in military reconnaissance, agricultural spraying, logistics, and film production presents growing challenges in signal classification and safety supervision. Accurate classification of UAV Radio Frequency (RF) signals in complex electromagnetic environments is critical for real-time flight monitoring, autonomous obstacle avoidance, and communication reliability in multi-agent coordination. However, conventional recognition methods exhibit limitations in both feature extraction and classification accuracy, particularly under interference or multipath propagation, which severely reduces recognition performance and constrains practical implementation. To address this limitation, this study proposes a recognition algorithm based on wavelet entropy features and an optimized Support Vector Machine (SVM). The method enhances classification accuracy and robustness by extracting wavelet entropy features from UAV RF signals and optimizing SVM parameters using the Great Cane Rat Optimization Algorithm (GCRA). The proposed approach offers a reliable strategy for UAV signal identification under complex electromagnetic conditions. The results contribute to UAV airspace regulation and unauthorized flight detection and establish a foundation for future applications, including autonomous navigation and intelligent route planning. This work holds both theoretical value and practical relevance for supporting the secure and standardized advancement of UAV systems.   Methods   This study adopts a systematic approach to achieve accurate classification and recognition of UAV RF signals, including four key stages: data acquisition, feature extraction, classifier design, and performance verification. First, the publicly available DroneRFa dataset is selected as the experimental dataset. It contains RF signals from 24 mainstream UAV models (e.g., DJI Phantom 3, DJI Inspire 2) across three ISM frequency bands—915 MHz, 2.4 GHz, and 5.8 GHz (Fig. 1). Data collection follows a “pick-store-pick-store” protocol to preserve signal integrity and ensure accurate classification. During preprocessing, 50,000 sampling points are extracted from each channel (RF0_I, RF0_Q, RF1_I, RF1_Q), balancing data continuity and feature representativeness under hardware read/write constraints. Signal magnitudes are normalized to eliminate amplitude-related bias. For feature extraction, a three-level wavelet transform using the Daubechies “db4” wavelet is applied to decompose the signal at multiple scales. A four-dimensional feature matrix is constructed by computing wavelet spectral entropy (Figs. 2 and 3), which captures both time-frequency characteristics and energy distribution. Feature differences among UAV models are confirmed by F-test analysis (Table 1), establishing a robust foundation for classification. In the classifier design stage, the GCRA is applied to optimize the penalty parameter C and Gaussian kernel parameter σ of the SVM. The classification error rate serves as the fitness function during optimization (Fig. 5). Inspired by the foraging behavior of cane rats, GCRA offers improved global search performance. Finally, algorithm performance is evaluated using 10-fold cross-validation and benchmarked against unoptimized SVM, PSO-SVM, GA-SVM, and GWO-SVM (Table 3), demonstrating the robustness and reliability of the proposed method.   Results and Discussions   This study yields several key findings. 
For wavelet entropy feature extraction, the F-test confirms that features from all four channels are statistically significant (p < 0.05), demonstrating their effectiveness in distinguishing among UAV models (Table 1). In classifier optimization, the GCRA exhibits strong parameter search capability, with fitness convergence achieved within 50 iterations at approximately 0.03 (Fig. 6). The optimized SVM classifier reaches an average recognition accuracy of 98.5%, representing a 6.8 percentage point improvement over the traditional SVM (Table 3). At the individual model level, the highest recognition accuracy is observed for DJI Inspire 2 (99.0%), with all other models exceeding 97% (Table 2). Confusion matrix analysis indicates that all misclassification rates are below 3% (Table 2, Fig. 7). Notably, under identical experimental conditions, GCRA-SVM outperforms other optimization algorithms, achieving higher accuracy than PSO-SVM (94.7%) and GA-SVM (94.2%), with lower variance (±0.00032), indicating greater stability (Table 3). These results validate the discriminative power of wavelet entropy features and highlight the enhanced performance and robustness of GCRA-based SVM optimization.   Conclusions   Through systematic theoretical analysis and experimental validation, this study reaches several key conclusions. The wavelet entropy-based feature extraction method effectively captures the time-frequency characteristics of UAV RF signals. By employing multi-scale decomposition and energy distribution analysis, it accurately identifies the unique signal features of various UAV models. Statistical tests confirm significant differences among the features of different UAV categories, providing a solid foundation for feature selection in UAV identification. The optimization of SVM parameters using the GCRA substantially enhances classification performance, achieving an average accuracy of 98.5% and a peak of 99% on the DroneRFa dataset, with excellent stability. This method addresses the technical challenge of UAV RF signal recognition in complex electromagnetic environments, with performance metrics fully meeting practical application needs. The findings offer a reliable technical solution for UAV flight supervision and lay the groundwork for advanced applications such as autonomous obstacle avoidance. Future research may focus on evaluating the method’s performance in high-noise environments and exploring fusion strategies with other models. Overall, this study provides significant contributions both in terms of theoretical innovation and engineering application.
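One plausible reading of the wavelet-entropy feature is the Shannon entropy of the relative sub-band energies from a 3-level db4 decomposition, sketched below. It assumes the PyWavelets package (pywt); the paper's exact feature definition and the toy two-tone signal are not taken from the source.

```python
import numpy as np
import pywt  # PyWavelets, assumed installed

def wavelet_spectral_entropy(signal, wavelet="db4", level=3):
    """Shannon entropy (bits) of the relative energy across the
    sub-bands of a multi-level wavelet decomposition."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    p = energies / energies.sum()            # sub-band energy distribution
    return -np.sum(p * np.log2(p + 1e-12))

# Toy channel record: a two-tone burst in noise, amplitude-normalized.
rng = np.random.default_rng(7)
t = np.arange(50_000) / 50_000
x = np.sin(2 * np.pi * 40 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
x = x + rng.normal(0, 0.2, t.size)
x /= np.abs(x).max()                         # amplitude normalization
print(f"wavelet entropy: {wavelet_spectral_entropy(x):.3f} bits")
```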
Real-time Adaptive Suppression of Broadband Noise in General Sensing Signals
WEN Yumei, ZHU Yu
2025, 47(8): 2746-2756.   doi: 10.11999/JEIT250018
[Abstract](187) [FullText HTML](131) [PDF 8308KB](33)
Abstract:
  Objective  Broadband noise is inevitable in sensing outputs due to thermal noise from the sensing system and various uncorrelated environmental disturbances. Adaptive filtering is a common method for removing such noise. At convergence, the adaptive filter output provides the optimal estimate of the sensing signal. However, during actual sensing, changes in the sensing signal lead to alterations in the statistical characteristics of the output. Therefore, the adaptive process must be re-adjusted to converge to a new steady state. The filter output during this adjustment is not the optimal estimate and introduces distortion, thereby adding extra noise. Fast-converging adaptive algorithms are typically employed to improve the filter’s response speed to such changes. Regardless of the speed of convergence and the method used to update filter coefficients, the adjustment process remains unavoidable, during which the filter output is distorted and additional noise is introduced. To ensure the filter remains at steady state without being influenced by changes in the sensing signal, a new adaptive filtering method is proposed. This method ensures that the input to the adaptive filter remains stationary, thereby preventing output distortion and the introduction of extra noise.  Methods  First, a threshold R and quantization scale Q are defined in terms of the noise standard deviation σ, where R = 3√2σ and Q = 3σ. A quantization transformation is applied to the sensing output x(n) in real time, with the transformation result q(n) used as the new sequence to be filtered. When the absolute value of the first-order difference of x(n) is no less than R, the sensing signal s(n) is considered to have changed, and p(n) is set to the quantization value of x(n) according to Q. When the absolute value of the first-order difference of x(n) is less than R, s(n) is considered unchanged, and p(n) retains its previous value, i.e., p(n) = p(n − 1). Let q(n) = x(n) − p(n); then q(n) contains both the information of the sensing signal and the noise. Although its variance may change slightly, the mean of q(n) remains 0, ensuring that q(n) stays relatively stationary. Next, q(n − n0) is used as the input to the adaptive filter, with q(n) serving as its reference, where q(n − n0) is a delayed copy of q(n) and n0 denotes the delay length. This method performs adaptive linear prediction of q(n) and filters out broadband noise.
Finally, the output of the adaptive filter, $y(n)$, is compensated with $p(n)$ to obtain an estimate of the sensing signal $s(n)$ with the noise removed.  Results and Discussions  The maximum mean square errors produced by the proposed method and conventional adaptive algorithms are compared using computer-simulated noisy band-limited step signals and noisy one-sided sinusoidal signals. Signal-to-Noise Ratio (SNR) improvements obtained during filtering are evaluated concurrently. For the noisy band-limited step signal (Table 1), the maximum mean square error of the proposed method is only 0.18% of that produced by the Recursive Least Squares (RLS) algorithm and 0.15%–0.19% of those generated by the Least Mean Square (LMS) algorithms. Correspondingly, the SNR improvement is 25.88 dB higher than that of the RLS algorithm and 28.65–32.35 dB greater than those of the LMS algorithms. In processing a noisy one-sided sinusoidal signal (Table 2), the maximum mean square error generated by the proposed method is 0.3% of that generated by the RLS algorithm and 0.06%–0.08% of those generated by the compared LMS algorithms. The SNR improvement is 10.25 dB higher than that of the RLS algorithm and 26.53–29.61 dB higher than those of the compared LMS algorithms. Figures 3 and 5 illustrate the quantization transformation outcomes for the noisy band-limited step signal and the noisy sinusoidal signal, demonstrating stability and consistency with theoretical expectations. Real sensing outputs primarily cover static or quasi-static signals (Figures 7 and 8), step or step-like signals (Figures 9 and 10), and periodic or quasi-periodic signals (Figures 11 and 12). Comparative analysis of the proposed method against common adaptive algorithms on varied real sensing outputs consistently shows superior filtering performance by the proposed method, with minimal distortion and no additional noise introduced, regardless of whether the sensing signals change.  Conclusions  A new adaptive filtering method is proposed in this paper. The proposed method ensures that the adaptive filter always operates at steady state, avoiding the additional noise caused by distortion during adjustment to a new steady state. Results from computer simulations and actual signal processing demonstrate that the proposed method filters both dynamic and static sensing signals effectively, outperforming commonly used adaptive algorithms.
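The processing chain above maps directly to a short implementation. A minimal NumPy sketch follows; the round-to-nearest quantizer, the LMS predictor, and the parameters `taps`, `mu`, and `n0` are illustrative assumptions rather than the paper’s exact choices:

```python
import numpy as np

def stationary_adaptive_filter(x, sigma, n0=1, taps=32, mu=0.01):
    # Quantization transformation: R and Q follow the definitions above.
    R = 3.0 * np.sqrt(2.0) * sigma   # change-detection threshold
    Q = 3.0 * sigma                  # quantization scale
    N = len(x)
    p = np.zeros(N)                  # piecewise-constant quantized level p(n)
    p[0] = Q * np.round(x[0] / Q)
    for n in range(1, N):
        if abs(x[n] - x[n - 1]) >= R:      # s(n) deemed changed
            p[n] = Q * np.round(x[n] / Q)  # re-quantize to the new level
        else:
            p[n] = p[n - 1]                # s(n) deemed unchanged: hold level
    q = x - p                              # zero-mean, near-stationary residual

    # Adaptive linear prediction: input q(n - n0), reference q(n) (LMS here).
    w = np.zeros(taps)
    y = np.zeros(N)
    for n in range(n0 + taps - 1, N):
        u = q[n - n0 - taps + 1 : n - n0 + 1][::-1]  # delayed tap vector
        y[n] = w @ u
        w += mu * (q[n] - y[n]) * u                  # LMS coefficient update
    return y + p                     # compensate with p(n) to estimate s(n)
```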
Cryptography and Network Information Security
Design of Private Set Intersection Protocol Based on National Cryptographic Algorithms
HUANG Hai, GUAN Zhibo, YU Bin, MA Chao, YANG Jinbo, MA Xiangyu
2025, 47(8): 2757-2767.   doi: 10.11999/JEIT250050
[Abstract](211) [FullText HTML](135) [PDF 1125KB](40)
Abstract:
  Objective  The rapid development of global digital transformation has exposed Private Set Intersection (PSI) as a key bottleneck constraining the digital economy. Although technical innovations and architectural advances in PSI protocols continue to emerge, current protocols face persistent challenges, including algorithmic vulnerabilities in international cryptographic primitives and limited computational efficiency when applied to large-scale datasets. To address these limitations, this study integrates domestic SM2 elliptic curve cryptography and the SM3 cryptographic hash function to enhance PSI protocol performance and protect sensitive data, providing technical support for China’s cyberspace security. A PSI protocol based on national cryptographic standards (SM-PSI) is proposed, with hardware acceleration of core cryptographic operations implemented using domestic security chips. This approach achieves simultaneous improvements in both security and computational efficiency.  Methods  SM-PSI integrates the domestic SM2 and SM3 cryptographic algorithms to reveal only the intersection results without disclosing additional information, while preserving the privacy of each participant’s input set. By combining SM2 elliptic curve public-key encryption with the SM3 hash algorithm, the protocol reconstructs encryption parameter negotiation, data obfuscation, and ciphertext mapping processes, thereby eliminating dependence on international algorithms such as RSA and SHA-256. An SM2-based non-interactive zero-knowledge proof mechanism is designed to verify the validity of public–private key pairs using a single communication round. This reduces communication overhead, mitigates man-in-the-middle attack risks, and prevents private key exposure. The domestic reconfigurable cryptographic chip RSP S20G is integrated to offload core computations, including SM2 modular exponentiation and SM3 hash iteration, to dedicated hardware. This software-hardware co-acceleration approach significantly improves protocol performance.  Results and Discussions  Experimental results on simulated datasets demonstrate that SM-PSI, through hardware-software co-optimization, significantly outperforms existing protocols at comparable security levels. The protocol achieves an average speedup of 4.2 times over the CPU-based SpOT-Light PSI scheme and 6.3 times over DH-IPP (Table 3), primarily due to offloading computationally intensive operations, including SM2 modular exponentiation and SM3 hash iteration, to dedicated hardware. Under the semi-honest model, SM-PSI reduces both the number of dataset encryption operations and communication rounds, thereby lowering data transmission volume and computational overhead. Its computational and communication complexities are substantially lower than those of SpOT-Light, DH-IPP, and FLASH-RSA, making it suitable for large-scale data processing and low-bandwidth environments (Table 1). Simulation experiments further show that the hardware-accelerated framework consistently outperforms CPU-only implementations, achieving a peak speedup of 9.0 times. The speedup ratio exhibits a near-linear relationship with dataset size, indicating stable performance as the ID data volume increases with minimal efficiency loss (Fig. 3). These results demonstrate SM-PSI’s ability to balance security, efficiency, and scalability for practical privacy-preserving data intersection applications.  
Conclusions  This study proposes SM-PSI, a PSI protocol that integrates national cryptographic algorithms SM2 and SM3 with hardware-software co-optimization. By leveraging domestic security chip acceleration for core operations, including non-interactive zero-knowledge proofs and cryptographic computations, the protocol addresses security vulnerabilities present in international algorithms and overcomes computational inefficiencies in large-scale applications. Theoretical analysis confirms its security under the semi-honest adversary model, and experimental results demonstrate substantial performance improvements, with an average speedup of 4.2 times over CPU-based SpOT-Light and 6.3 times over DH-IPP. These results establish SM-PSI as an efficient and autonomous solution for privacy-preserving set intersection, supporting China’s strategic objective of achieving technical independence and high-performance computation in privacy-sensitive environments.  Prospects  Future work will extend this research by exploring more efficient PSI protocols based on national cryptographic standards, aiming to improve chip-algorithm compatibility, reduce power consumption, and enhance large-scale data processing efficiency. Further efforts will target optimizing protocol scalability in multi-party scenarios and developing privacy-preserving set intersection mechanisms suitable for multiple participants to meet complex practical application demands. In addition, this research will promote integration with other privacy-enhancing technologies, such as federated learning and differential privacy, to support the development of a more comprehensive privacy protection framework.
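To illustrate the protocol shape (blind, exchange, re-blind, match), the following toy sketch uses modular exponentiation and SHA-256 as stand-ins for the SM2 point multiplication and SM3 hash of SM-PSI; the modulus, the key sampling, and the single-function two-party flow are simplifying assumptions, not the paper’s construction:

```python
import hashlib
import secrets

P = 2**255 - 19          # illustrative prime modulus (stand-in for the SM2 group)

def h2g(item: str) -> int:
    """Hash an item into the group (SHA-256 stands in for SM3)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def toy_psi(set_a, set_b):
    # In a real protocol the two keys live on separate parties; they are
    # co-located here only to keep the sketch self-contained.
    a = secrets.randbelow(P - 2) + 1     # party A's private scalar
    b = secrets.randbelow(P - 2) + 1     # party B's private scalar
    # Round 1: each party blinds its hashed items with its own key.
    a_blind = [pow(h2g(x), a, P) for x in set_a]
    b_blind = [pow(h2g(x), b, P) for x in set_b]
    # Round 2: each re-blinds the other's values; exponents commute, so
    # common items collide after double blinding (commutative encryption).
    ab = {pow(v, b, P) for v in a_blind}
    ba = [pow(v, a, P) for v in b_blind]
    return [x for x, v in zip(set_b, ba) if v in ab]

print(toy_psi(["alice", "bob", "carol"], ["bob", "dave", "carol"]))  # ['bob', 'carol']
```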
One-sided Personalized Differential Privacy Random Response Algorithm Driven by User Sensitive Weights
LIU Zhenhua, WANG Wenxin, DONG Xinfeng, WANG Baocang
2025, 47(8): 2768-2779.   doi: 10.11999/JEIT250099
[Abstract](176) [FullText HTML](110) [PDF 7027KB](31)
Abstract:
  Objective  One-sided differential privacy has received increasing attention in privacy protection due to its ability to shield sensitive information. This mechanism ensures that adversaries cannot substantially reduce uncertainty regarding record sensitivity, thereby enhancing privacy. However, its use in practical datasets remains constrained. Specifically, the random response algorithm under one-sided differential privacy performs effectively only when the proportion of sensitive records is low, but yields limited results in datasets with high sensitivity ratios. Examples include medical records, financial transactions, and personal data in social networks, where sensitivity levels are inherently high. Existing algorithms often fail to meet privacy protection requirements in such contexts. This study proposes an extension of the one-sided differential privacy random response algorithm by introducing user-sensitive weights. The method enables efficient processing of highly sensitive datasets while substantially improving data utility and maintaining privacy guarantees, supporting secure analysis and application of high-sensitivity data.  Methods  This study proposes a one-sided personalized differential privacy random response algorithm comprising three key stages: sensitivity specification, personalized sampling, and fixed-value noise addition. In the sensitivity specification stage, user data are mapped to sensitivity weight values using a predefined sensitivity function. This function reflects both the relative importance of each record to the user and its quantified sensitivity level. The resulting sensitivity weights are then normalized to compute a comprehensive sensitivity weight for each user. In the personalized sampling stage, the data sampling probability is adjusted dynamically according to the user’s comprehensive sensitivity weight. Unlike uniform-probability sampling employed in conventional methods, this personalized approach reduces sampling bias and improves data representativeness, thereby enhancing utility. In the fixed-value noise addition stage, the noise amount is determined in proportion to the comprehensive sensitivity weight. In high-sensitivity scenarios, a larger noise value is added to reinforce privacy protection; in low-sensitivity scenarios, the noise is reduced to preserve data availability. This adaptive mechanism allows the algorithm to balance privacy protection with utility across different application contexts.  Results and Discussions  The primary innovations of this study are reflected in three areas. First, a one-sided personalized differential privacy random response algorithm is proposed, incorporating a sensitivity specification function to allocate personalized sensitivity weights to user data. This design captures user-specific sensitivity requirements across data attributes and improves system efficiency by minimizing user interaction. Second, a personalized sampling method based on comprehensive sensitivity weights is developed to support fine-grained privacy protection. Compared with conventional approaches, this method dynamically adjusts sampling strategies in response to user-specific privacy preferences, thereby increasing data representativeness while maintaining privacy. Third, the algorithm’s sensitivity shielding property is established through theoretical analysis, and its effectiveness is validated via simulation experiments. 
The results show that the proposed algorithm outperforms the traditional one-sided differential privacy random response algorithm in both data utility and robustness. In high-sensitivity scenarios, improvements in query accuracy and robustness are particularly evident. When the data follow a Laplace distribution, for the sum function, the Root Mean Square Error (RMSE) produced by the proposed algorithm is approximately 76.67% of that generated by the traditional algorithm, with the threshold upper bound set to 0.6 (Fig. 4(c)). When the data follow a normal distribution, in the coefficient of variation function, the RMSE produced by the proposed algorithm remains below 200 regardless of whether the upper bound of the threshold t is 0.7, 0.8, or 0.9, while the RMSE of the traditional algorithm consistently exceeds 200 (Fig. 5(g,h,i)). On real-world datasets, the proposed algorithm achieves higher data utility across all three evaluated functions compared with the traditional approach (Fig. 6).  Conclusions  The proposed one-sided personalized differential privacy random response algorithm achieves effective performance under an equivalent level of privacy protection. It is applicable not only in datasets with a low proportion of sensitive records but also in those with high sensitivity, such as healthcare and financial transaction data. By integrating sensitivity specification, personalized sampling, and fixed-value noise addition, the algorithm balances privacy protection with data utility in complex scenarios. This approach offers reliable technical support for the secure analysis and application of highly sensitive data. Future work may investigate the extension of this algorithm to scenarios involving correlated data in relational databases.
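The three stages can be sketched compactly; the comprehensive-weight normalization, the sampling rule, and the noise scale below are illustrative assumptions, not the paper’s formal definitions:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_sided_personalized_rr(users, sensitivity, base_rate=0.8, noise_unit=0.5):
    """Sketch of the three stages; `sensitivity` is assumed to map each
    attribute value into [0, 1]."""
    released = []
    for values in users:
        # Stage 1: sensitivity specification -- per-attribute weights,
        # normalized into a comprehensive sensitivity weight W for the user.
        w = np.array([sensitivity(v) for v in values], dtype=float)
        W = float(w.mean())
        # Stage 2: personalized sampling -- the more sensitive the user's
        # data, the lower the probability that it is sampled at all.
        if rng.random() < 1.0 - base_rate * W:
            # Stage 3: fixed-value noise proportional to W, sign-randomized.
            noise = noise_unit * W * rng.choice([-1.0, 1.0])
            released.append([v + noise for v in values])
    return released

demo = [[0.2, 0.4, 0.1], [0.9, 0.8, 1.0]]   # two users' normalized records
print(one_sided_personalized_rr(demo, sensitivity=lambda v: v))
```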
Locality-optimized Forward-secure Dynamic Searchable Symmetric Encryption
GUO Yutao, LIU Feng, WANG Feng, XUE Kaiping
2025, 47(8): 2780-2790.   doi: 10.11999/JEIT250107
[Abstract](161) [FullText HTML](95) [PDF 1706KB](16)
Abstract:
  Objective  With the rapid development of cloud computing, more individuals and enterprises are outsourcing data storage, raising significant concerns regarding data privacy. Traditional encryption methods preserve confidentiality but render the data unsearchable, severely limiting usability. Searchable Symmetric Encryption (SSE) addresses this limitation by enabling efficient keyword searches over encrypted data, and dynamic SSE further enhances practicality by supporting data updates. However, a critical challenge in dynamic SSE is the trade-off between forward security (ensuring that past queries cannot retrieve results from newly added data) and locality. Locality, defined as the number of non-contiguous storage accesses during a search, is a key metric for I/O efficiency and directly affects search performance. Poor locality causes search latency to increase linearly with keyword frequency, creating a significant performance bottleneck. Existing schemes either constrain the number of updates between searches or degrade update and read efficiency, limiting real-world applicability. This study proposes a novel scheme that transforms existing forward-secure dynamic SSE into locality-optimized variants without compromising key performance metrics such as update and read efficiency.  Methods  The proposed scheme improves locality by reorganizing SSE updates into batched operations. Instead of uploading each update individually, the client temporarily stores updates in a local buffer. Once a predefined threshold is reached, the accumulated updates are uploaded as a single package. Within each package, updates corresponding to the same keyword are stored contiguously to minimize non-contiguous storage accesses, thereby enhancing locality. The transformed scheme retains the use of the underlying forward-secure dynamic SSE to store essential metadata required for extracting the contents of each package during a search, thereby preserving forward security. However, search operations may reveal the storage positions of updates for some keywords within a package, potentially constraining the inferred distribution of updates for other keywords. To address this issue, a secure packaging algorithm is designed to mitigate such leakage and maintain the overall security of the scheme.  Results and Discussions  By implementing client-side buffering and batched updating, the proposed scheme transforms compatible forward-secure dynamic SSE schemes into locality-optimized variants. The integration of a secure packaging algorithm into the batching process ensures that forward security is preserved, as confirmed by a formal security proof, without introducing additional information leakage. A comprehensive evaluation is conducted, comparing a typical forward-secure dynamic SSE scheme (referred to as the original scheme), its transformed variants under various buffer sizes, and an existing locality-optimized forward-secure dynamic SSE scheme. Both theoretical and experimental analyses are performed. Theoretical analysis indicates that although the transformed scheme imposes an upper bound on locality, it maintains similar computational complexity to the original scheme in other critical aspects, such as update and read efficiency. Moreover, its update complexity and read performance outperform those of the existing locality-optimized scheme (Table 1). Experimental results yield three main findings.
(1) Although client-side buffering requires additional storage, the overall client storage remains comparable to that of the original scheme (Table 2). (2) Update times are similar to those of the original scheme and are reduced to between 1% and 10% of those observed in the existing locality-optimized solution (Fig. 4). (3) For low-frequency keywords, search latency increases moderately, by up to 70%, relative to the original scheme. In contrast, for high-frequency keywords, latency is substantially reduced, to between 3.5% and 23.1% of that in the original scheme. Overall, the transformed scheme consistently achieves lower search latency than the existing solution (Fig. 5).  Conclusions  This study proposes a novel scheme that transforms forward-secure dynamic SSE into locality-optimized variants through client-side buffering and batched updating, without degrading core performance metrics (e.g., update and read efficiency). A secure packaging algorithm is introduced, and a formal security proof demonstrates that the scheme preserves forward security without incurring additional information leakage. Both theoretical and experimental results show that the scheme significantly improves locality and search efficiency for high-frequency keywords, while maintaining comparable update and read performance for other keywords. A notable limitation is that the scheme requires predefining the total number of distinct keywords, or an upper bound on it, which restricts flexibility in dynamic environments. Addressing this limitation remains a key direction for future research. Additionally, extending the scheme to operate under malicious server assumptions or to support further security properties, such as backward security, also warrants investigation.
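A minimal sketch of the client-side buffering transform is shown below; `encrypt`, `base_update`, and `upload` are hypothetical stand-ins for the underlying forward-secure scheme’s operations, and the secure packaging algorithm that hides the per-keyword layout is elided:

```python
from collections import defaultdict

def upload(package):
    """Hypothetical server upload; a real client sends ciphertexts here."""
    print(f"uploaded package of {len(package)} entries")

class BufferedSSEClient:
    def __init__(self, threshold, encrypt, base_update):
        self.threshold = threshold
        self.encrypt = encrypt          # (keyword, doc_id) -> ciphertext
        self.base_update = base_update  # underlying SSE stores (offset, length)
        self.buffer = defaultdict(list) # keyword -> pending doc ids
        self.count = 0

    def update(self, keyword, doc_id):
        self.buffer[keyword].append(doc_id)
        self.count += 1
        if self.count >= self.threshold:   # flush one package per threshold
            self.flush()

    def flush(self):
        package = []
        for kw, ids in self.buffer.items():
            # Entries of the same keyword are laid out contiguously, so a
            # later search touches one contiguous region per package.
            self.base_update(kw, (len(package), len(ids)))
            package.extend(self.encrypt(kw, d) for d in ids)
        upload(package)
        self.buffer.clear()
        self.count = 0

client = BufferedSSEClient(
    threshold=4,
    encrypt=lambda kw, d: f"Enc({kw},{d})",   # placeholder cipher
    base_update=lambda kw, meta: None,        # placeholder metadata store
)
for i in range(4):
    client.update("w1" if i % 2 else "w2", i)  # flushes at the 4th update
```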
A Chosen-Plaintext Method on SM4: Linear Operation Challenges and the Countermeasures
TANG Xiaolin, FENG Yan, LI Zhiqiang, GUO Ye, GONG Guanfei
2025, 47(8): 2791-2799.   doi: 10.11999/JEIT250014
[Abstract](237) [FullText HTML](135) [PDF 3028KB](21)
Abstract:
  Objective  With increasing concerns over hardware security, techniques for exploiting hardware vulnerabilities have advanced rapidly. Among these, Side-Channel Attacks (SCAs) have received substantial attention for their ability to extract sensitive information via physical leakage. Power analysis, a prominent form of SCA, has been extensively applied to the Advanced Encryption Standard (AES). However, SM4, a block cipher issued by China’s State Cryptography Administration, presents greater challenges due to its unique linear transformation. Existing chosen-plaintext methods for attacking SM4 still encounter key limitations, including difficulty in constructing four-round chosen plaintexts, recovering initial keys from intermediate values, resolving symmetrical attack ambiguities, and filtering highly correlated incorrect guesses. This study systematically analyzes the root causes of these issues and proposes targeted countermeasures, effectively mitigating the constraints imposed by SM4’s linear operations.  Methods  This study systematically investigates the challenges in chosen-plaintext attacks on SM4 and proposes targeted countermeasures. To enable initial key recovery, the inverse transformation is expressed as a system of linear equations, and a new round-key derivation algorithm is developed. To facilitate the construction of four-round chosen plaintexts, additional critical constraints are incorporated into plaintext generation, yielding leak model expressions that are more concise and plaintext-dependent. To resolve symmetrical attack results, the set of 4-byte round-key candidates is reduced, and incorrect candidates are eliminated through analysis in subsequent rounds. To suppress interference from highly correlated false guesses, an average ranking method is applied.  Results and Discussions  The proposed countermeasures collectively resolve key limitations in chosen-plaintext attacks on SM4 and enhance attack efficiency. The key recovery algorithm (Algorithm 1) integrates Gaussian elimination with Boolean operations to extract round keys. For four-round plaintext construction, detailed expressions for the final three rounds are provided for the first time, expanding the number of valid values from 256 to at least $2^{32}$ (Table 1), thereby enabling the recovery of four round keys (Fig. 4). The number of symmetrical attack results is reduced from 16 to 2 (Fig. 5), and subsequent round verification identifies the correct candidate (Fig. 6). The average ranking method yields clearer attack traces when analyzing 50,000 plaintexts across 10 groups (Fig. 7).  Conclusions  The proposed countermeasures effectively address the challenges introduced by linear operations in chosen-plaintext attacks on SM4. Correlation Power Analysis (CPA)-based experiments demonstrate that: (1) the key recovery algorithm and plaintext generation strategy enable successful extraction of round keys and reconstruction of the initial key; (2) symmetrical attack results can be resolved using only seven attack sets; and (3) the average ranking method reduces interference from secondary correlation peaks. This study focuses on unprotected SM4 implementations; future work will extend the analysis to masked versions.
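The average ranking countermeasure can be sketched in a few lines: rank all key-byte guesses by their CPA correlation peak within each plaintext group, then average the ranks across groups, so a wrong guess that correlates strongly in one group rarely stays on top in all of them. The (groups × 256) correlation array and the demo values below are hypothetical:

```python
import numpy as np

def average_ranking(corr_by_group):
    """corr_by_group: hypothetical (n_groups, 256) array of per-group
    CPA correlation peaks for each key-byte guess. Returns the guess
    with the best (lowest) average rank, where rank 0 = strongest."""
    ranks = np.argsort(np.argsort(-corr_by_group, axis=1), axis=1)
    return int(np.argmin(ranks.mean(axis=0)))

# Toy demo: guess 0x3C correlates most strongly in most (not all) groups.
rng = np.random.default_rng(1)
corr = rng.uniform(0.0, 0.2, size=(10, 256))
corr[:, 0x3C] += 0.15
print(hex(average_ranking(corr)))  # -> 0x3c
```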
Asymptotically Good Multi-twisted Codes over Finite Chain Rings
GAO Jian, CUI Qingxiang, ZHENG Yuqi
2025, 47(8): 2800-2807.   doi: 10.11999/JEIT250032
[Abstract](123) [FullText HTML](88) [PDF 787KB](21)
Abstract:
  Objective  This study aims to address the theoretical gap in the asymptotic analysis of multi-twisted codes over finite chain rings and to provide a foundation for their application in high-efficiency communication and secure data transmission. As modern communication systems demand higher data rates, enhanced error resilience, and robust security, the design of error-correcting codes must balance code rate, error correction capability, and implementation complexity. Finite chain rings, as algebraic structures situated between finite fields and general rings, exhibit a hierarchical ideal structure that enables sophisticated code designs while retaining the algebraic properties of linear codes. Compared with finite fields, codes over finite chain rings achieve flexible error correction and higher information density through homogeneous weights and Gray mapping. However, existing research has focused primarily on multi-twisted codes over finite fields, leaving the asymptotic properties over finite chain rings unexplored. By constructing 1-generator multi-twisted codes, this work is the first to prove their asymptotic goodness over finite chain rings, i.e., the existence of infinite code sequences $\mathcal{C}_i$ with code rate $R(\mathcal{C}_i)$ and relative distance $\Delta(\mathcal{C}_i)$ bounded away from zero as code lengths approach infinity. This result not only demonstrates the attainability of Shannon’s Second Theorem in finite chain ring coding but also offers novel solutions for practical systems, such as quantum-resistant encrypted communication and reliable transmission in high-noise channels.  Methods  In the basic concepts section, the structure of a finite chain ring is defined, utilizing its ideal chain structure to study code generation and properties. The concept of homogeneous weight is introduced, and the homogeneous distance $d_{\hom}$ is established to quantify error correction capabilities. A Gray map is constructed to transform distance problems over finite chain rings into Hamming distance problems over finite fields. To study the asymptotic properties of multi-twisted codes, 1-generator multi-twisted codes are defined using the module structure of $R[x]$, and their freeness condition is discussed, as demonstrated in Theorem 1: each subcode $\mathcal{C}_i = \langle a_i(x) \rangle$ must be a free constacyclic code, and the rank of $\mathcal{C}_i$ is determined by the degree of the check polynomial $h(x)$. The asymptotic properties of multi-twisted codes with identical block lengths, which are simpler to analyze than those with varying block lengths, are considered. The selection of generators $(a_1(x), a_2(x), \ldots, a_l(x))$ is treated as a random process, defining a probability space. By introducing the $q^s$-ary entropy function $H(x) = x\log_{q^s}(q^s - 1) - x\log_{q^s}x - (1-x)\log_{q^s}(1-x)$, the code rate $R(\mathcal{C})$ and the relative distance $\Delta(\mathcal{C})$ are analyzed.
The Chinese Remainder Theorem is applied to decompose the finite chain ring into the direct product of local rings, transforming the global ideal analysis into localized studies to reduce complexity. Finally, it is proven that the relative homogeneous distance and the rate of multi-twisted codes are bounded below by positive constants. As the code length index $i \to \infty$, the relative distance of the code satisfies $\Pr\left(\Delta\left(\mathcal{C}'^{(i)}\right) \ge \delta\right) = 1$ (Theorem 2) and $\Pr\left(\mathrm{rank}\left(\mathcal{C}'^{(i)}\right) = m_i - 1\right) = 1$ (Theorem 3), leading to the conclusion that this class of multi-twisted codes over finite chain rings is asymptotically good.  Results and Discussions  This paper systematically constructs a class of 1-generator multi-twisted codes (Label 1) over finite chain rings and demonstrates that these codes are asymptotically good based on probabilistic methods and the Chinese Remainder Theorem. This constitutes the first analysis of the asymptotic properties of such codes over finite chain rings. Previous studies on the asymptotic properties of codes have primarily focused on codes over finite fields (e.g., cyclic and quasi-cyclic codes). By leveraging the hierarchical ideal structures of rings (e.g., homogeneous weight and the Chinese Remainder Theorem), the analytical complexity inherent to rings is overcome, thereby extending the scope of asymptotically good codes. This work extends classical finite-field random code analysis to finite chain rings, addressing the complexity of distance computation via homogeneous weights and Gray mappings. Additionally, leveraging the bijection between q-cyclotomic cosets modulo $M$ and the irreducible factors of $x^M - 1$, combined with CRT-based ideal decomposition, significantly simplifies the asymptotic analysis (Lemma 4).  Conclusions  The asymptotic goodness of multi-twisted codes over finite chain rings has been systematically resolved, addressing a critical theoretical gap. By constructing 1-generator free codes and applying probabilistic methods combined with the Chinese Remainder Theorem, this work provides the first proof of infinite code sequences over finite chain rings that approach Shannon’s theoretical limits in terms of code rate and relative distance. These codes are suitable for high-frequency communications in 5G/6G networks, deep-space links, and other noisy environments, offering enhanced spectral efficiency through high code rates and robust error correction. This result not only extends the algebraic framework of coding theory but also provides a new coding scheme with strong anti-interference capabilities and high security for practical communication systems. Future research may extend these findings to more complex ring structures and practical application scenarios, further advancing the application of coding theory in the information age.
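For reference, the homogeneous weight and Gray-map isometry invoked above take the following standard form on a chain ring $R$ with residue field $\mathbb{F}_q$ and $|R| = q^s$ (a common normalization; the paper’s constants may differ):

```latex
% Standard homogeneous weight on a finite chain ring R (|R| = q^s):
w_{\hom}(x) =
  \begin{cases}
    0,             & x = 0, \\
    q^{s-1},       & x \in \mathrm{soc}(R) \setminus \{0\}, \\
    (q-1)q^{s-2},  & \text{otherwise},
  \end{cases}
% and the Gray map extends coordinatewise to an isometry
\Phi : \bigl(R^n, d_{\hom}\bigr) \longrightarrow
       \bigl(\mathbb{F}_q^{\,n q^{s-1}}, d_{\mathrm{Ham}}\bigr).
```

For $\mathbb{Z}_4$ (so $q = 2$, $s = 2$) this recovers the Lee weight: units get weight 1 and the socle element 2 gets weight 2, matching the classical $\mathbb{Z}_4 \to \mathbb{F}_2^2$ Gray map.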
A Black-Box Query Adversarial Attack Method for Signal Detection Networks Based on Sparse Subspace Sampling
LI Dongyang, WANG Linyuan, PENG Jinxian, MA Dekui, YAN Bin
2025, 47(8): 2808-2818.   doi: 10.11999/JEIT241019
[Abstract](155) [FullText HTML](92) [PDF 2275KB](20)
Abstract:
  Objective  The application of deep neural networks to signal detection has raised concerns regarding their vulnerability to adversarial example attacks. In black-box attack scenarios, where internal model information is inaccessible, this paper proposes a black-box query adversarial attack method based on sparse subspace sampling. The method offers an approach to evaluate the robustness of signal detection networks under black-box conditions, providing theoretical support and practical guidance for improving the reliability and robustness of these networks.  Methods  This study combines the characteristics of signal detection networks with the attack objective of reducing the recall rate of signal targets. The disappearance ratio of detected signal targets is used as the constraint for determining attack success, forming an adversarial attack model for signal detection networks. Based on the HopSkipJumpAttack (HSJA) algorithm, a black-box query adversarial attack method for signal detection networks is designed, which generates adversarial examples by approaching the model’s decision boundary. To further improve query efficiency, a sparse subspace query adversarial attack method is proposed. This approach constructs sparse subspace sampling based on the characteristics of signal adversarial perturbations. Specifically, during the generation of adversarial examples, signal components with large amplitudes are selected in proportion, and only these components are perturbed.  Results and Discussions  Experimental results show that under a decision boundary condition with a signal target disappearance ratio of 0.3, the proposed sparse subspace sampling black-box adversarial attack method reduces the mean Average Precision (mAP) by 43.6% and the recall rate by 41.2%. Under the same number of queries, all performance metrics for the sparse subspace sampling method exceed those of the full-space sampling approach, demonstrating improved attack effectiveness, with the success rate increasing by 2.5% (Table 2). In terms of signal perturbation intensity, the proposed method effectively reduces perturbation intensity through iterative optimization under both sampling spaces. At the beginning of the iterations, the perturbation energies for the two spaces are similar. As the number of query rounds increases, the perturbation energy required for sparse subspace sampling becomes slightly lower than that of full-space sampling, and the difference continues to widen. The average adversarial perturbation energy ratio for full-space sampling is 5.18%, whereas sparse subspace sampling achieves 5.00%, reflecting a 3.47% reduction relative to full-space sampling (Fig. 4). For waveform perturbations, both sampling strategies proposed in this study can generate effective adversarial examples while preserving the primary waveform characteristics of the original signal. Specifically, the full-space query method applies perturbations to every sampling point, whereas the sparse subspace query method selectively perturbs only the large-amplitude signal components, leaving other components unchanged (Fig. 5). This selective approach provides the sparse subspace method with a notable $l_0$-norm control property for adversarial perturbations, minimizing the number of perturbed components without compromising attack performance. In contrast, the full-space sampling method focuses on optimizing the $l_2$-norm of perturbations, without achieving this selective control.
Conclusions  This study proposes a black-box query adversarial attack method for signal detection networks based on sparse subspace sampling. The disappearance ratio of detected signal targets is used as the success criterion for attacks, and an adversarial example generation model for signal detection networks is established. Drawing on the HSJA algorithm, a decision-boundary-based black-box query attack method is designed to generate adversarial signal examples. To further enhance query efficiency, a sparse subspace sampling strategy is constructed based on the characteristics of signal adversarial perturbations. Experimental results show that under a decision boundary with a target disappearance ratio of 0.3, the proposed sparse subspace sampling black-box attack method reduces the mAP of the signal detection network by 43.6% and the recall rate by 41.2%. Compared with full-space sampling, sparse subspace sampling increases the attack success rate by 2.5% and reduces the average perturbation energy ratio by 3.47%. The sparse subspace method significantly degrades signal detection network performance while achieving superior attack efficiency and lower perturbation intensity relative to full-space sampling. Furthermore, the full-space query method introduces perturbations at all sampling points, whereas the sparse subspace method selectively perturbs only the high-amplitude signal components, leaving other components unchanged. This approach enforces $l_0$-norm sparsity constraints, minimizing the number of perturbed components without compromising attack effectiveness. The proposed method provides a practical solution for evaluating the robustness of signal detection networks under black-box conditions and offers theoretical support for improving the reliability of these networks against adversarial threats.
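The sparse subspace sampling step reduces to masking a random perturbation direction onto the large-amplitude components; a short sketch follows, with `keep_ratio` as an illustrative stand-in for the paper’s selection proportion:

```python
import numpy as np

def sparse_subspace_direction(signal, rng, keep_ratio=0.2):
    """Sample a perturbation direction restricted to the subspace spanned
    by the largest-amplitude sample points; all other components stay zero,
    which is what gives the method its l0-norm control."""
    k = max(1, int(keep_ratio * signal.size))
    idx = np.argsort(np.abs(signal))[-k:]    # large-amplitude components only
    d = np.zeros_like(signal)
    d[idx] = rng.standard_normal(k)          # random step inside the subspace
    return d / np.linalg.norm(d)

# Inside an HSJA-style boundary walk, this direction would replace the
# full-space Gaussian sample, leaving the other components unperturbed.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)                # stand-in signal samples
print(np.count_nonzero(sparse_subspace_direction(x, rng)))  # -> 204
```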
Image and Intelligent Information Processing
Multi-objective Remote Sensing Product Production Task Scheduling Algorithm Based on Double Deep Q-Network
ZHOU Liming, YU Xi, FAN Minghu, ZUO Xianyu, QIAO Baojun
2025, 47(8): 2819-2829.   doi: 10.11999/JEIT250089
[Abstract](310) [FullText HTML](104) [PDF 3759KB](52)
Abstract:
  Objective  Remote sensing product generation is a multi-task scheduling problem influenced by dynamic factors, including resource contention and real-time environmental changes. Achieving adaptive, multi-objective, and efficient scheduling remains a central challenge. To address this, a Multi-Objective Remote Sensing scheduling algorithm (MORS) based on a Double Deep Q-Network (DDQN) is proposed. A subset of candidate algorithms is first identified using a value-driven, parallel-executable screening strategy. A deep neural network is then designed to perceive the characteristics of both remote sensing algorithms and computational nodes. A reward function is constructed by integrating algorithm execution time and node resource status. The DDQN is employed to train the model to select optimal execution nodes for each algorithm in the processing subset. This approach reduces production time and enables load balancing across computational nodes.  Methods  The MORS scheduling process comprises two stages: remote sensing product processing and screening, followed by scheduling model training and execution. A time-triggered strategy is adopted, whereby all newly arrived remote sensing products within a predefined time window are collected and placed in a task queue. For efficient scheduling, each product is parsed into a set of executable remote sensing algorithms. Based on the model illustrated in Figure 2, the processing unit extracts all constituent algorithms to form an algorithm set. An optimal subset is then selected using a value-driven parallel-executable screening strategy. The scheduling process is modeled as a Markov decision process, and the DDQN is applied to assign each algorithm in the selected subset to the optimal virtual node.  Results and Discussions  Simulation experiments use varying numbers of tasks and nodes to evaluate the performance of MORS. Comparative analyses are conducted against several baseline scheduling algorithms, including First-Come, First-Served (FCFS), Round Robin (RR), Genetic Algorithm (GA), Deep Q-Network (DQN), and Dueling Deep Q-Network (Dueling DQN). The results demonstrate that MORS outperforms all other algorithms in terms of scheduling efficiency and adaptability in remote sensing task scheduling. The learning rate, a critical hyperparameter in DDQN, influences the step size for parameter updates during training. When the learning rate is set to 0.00001, the model fails to converge even after 5,000 iterations due to extremely slow optimization. A learning rate of 0.0001 achieves a balance between convergence speed and training stability, avoiding oscillations associated with overly large learning rates (Figure 3 and Figure 4). The corresponding DDQN loss values show a steady decline, reflecting effective optimization and gradual convergence. In contrast, the unpruned DDQN initially declines sharply but plateaus prematurely, failing to reach optimal convergence. DDQN without soft updates shows large fluctuations in loss and remains unstable during later training stages, indicating that the absence of soft updates impairs convergence (Figure 5). Regarding decision quality, the reward values of DDQN gradually approach 25 in the later training stages, reflecting stable convergence and strong decision-making performance. Conversely, DDQN models without pruning or soft updates display unstable reward trajectories, particularly the latter, which exhibits pronounced reward fluctuations and slower convergence (Figure 6). 
A comparison of DQN, Dueling DQN, and DDQN reveals that all three show decreasing training loss, suggesting continuous optimization (Figure 7). However, the reward curve of Dueling DQN shows higher volatility and reduced stability (Figure 8). To further assess scalability, four sets of simulation experiments use 30, 60, 90, and 120 remote sensing tasks, with the number of virtual machine nodes fixed at 15. Each experimental configuration is evaluated using 100 Monte Carlo iterations to ensure statistical robustness. DDQN consistently shows superior performance under high-concurrency conditions, effectively managing increased scheduling pressure (Table 7). In addition, DDQN exhibits lower standard deviations in node load across all task volumes, reflecting more balanced resource allocation and reduced fluctuation in system utilization (Table 8 and Table 9).  Conclusions  The proposed MORS algorithm addresses the variability and complexity inherent in remote sensing task scheduling. Experimental results demonstrate that MORS not only improves scheduling efficiency but also significantly reduces production time and achieves balanced allocation of node resources.
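Two ingredients the ablations isolate, the Double DQN target and the soft target-network update, can be sketched as follows (a PyTorch sketch; tensor shapes, `gamma`, and `tau` are illustrative):

```python
import torch

def ddqn_target(online, target, reward, next_state, done, gamma=0.99):
    """Double DQN target: the online network selects the next action and
    the target network evaluates it, curbing the Q-value over-estimation
    of plain DQN. reward/done are (batch,) tensors."""
    with torch.no_grad():
        a = online(next_state).argmax(dim=1, keepdim=True)   # action selection
        q = target(next_state).gather(1, a).squeeze(1)       # action evaluation
    return reward + gamma * (1.0 - done) * q

def soft_update(target, online, tau=0.005):
    """Polyak soft update of the target network, whose absence the ablation
    above (Figure 5) shows to destabilize training."""
    with torch.no_grad():
        for tp, op in zip(target.parameters(), online.parameters()):
            tp.lerp_(op, tau)   # tp <- (1 - tau) * tp + tau * op
```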
A Multi-Agent Path Finding Strategy Combining Selective Communication and Conflict Resolution
WANG Yu, ZHANG Xuxiu
2025, 47(8): 2830-2840.   doi: 10.11999/JEIT250122
[Abstract](280) [FullText HTML](201) [PDF 2257KB](52)
Abstract:
  Objective  The rapid development of intelligent manufacturing, automated warehousing, and Internet of Things technologies has made Multi-Agent Path Finding (MAPF) a key approach for addressing complex coordination tasks. Traditional centralized methods face limitations in large-scale multi-agent systems due to excessive communication load, high computational complexity, and susceptibility to path conflicts and deadlocks. Existing methods rely on broadcast-based communication, which leads to information overload and poor scalability. Furthermore, current conflict resolution strategies are static and overly simplistic, making them ineffective for dynamically balancing task priorities and environmental congestion. This study proposes an MAPF strategy based on selective communication and hierarchical conflict resolution to optimize communication efficiency, reduce path deviations and deadlocks, and improve path planning performance and task completion rates in complex environments.  Methods  The proposed Decision Causal Communication with Prioritized Resolution (DCCPR) method integrates reinforcement learning and the A* algorithm and introduces selective communication with hierarchical conflict resolution. A dynamic joint masking decision mechanism enables targeted agent selection within the selective communication framework. The model is instantiated and validated using the Dueling Double Deep Q-Network (D3QN) algorithm, which dynamically selects agents for communication, reducing information redundancy, lowering communication overhead, and enhancing computational efficiency. The Q-network reward function incorporates expected paths generated by the A* algorithm, penalizing path deviations and cumulative congestion to guide agents toward low-congestion routes, thereby accelerating task completion. A hierarchical conflict resolution strategy is also proposed, which considers target distance, task Q-values, and task urgency. By combining dynamic re-planning using the A* algorithm with a turn-taking mechanism, this approach effectively resolves conflicts, enables necessary detours to avoid collisions, increases task success rates, and reduces average task completion time.  Results and Discussions  The experimental results show that the DCCPR method outperforms conventional approaches in task success rate, computational efficiency, and path planning performance, particularly in large-scale and complex environments. In terms of task success rate, DCCPR demonstrates superior performance across random maps of different sizes. In the 40 × 40 random map environment (Fig. 6), DCCPR consistently maintains a success rate above 90%, significantly higher than other baseline methods, with no apparent decline as the number of agents increases. In contrast, methods such as DHC and PRIMAL exhibit substantial performance degradation, with success rates dropping below 50% as agent numbers grow. DCCPR reduces communication overhead through its selective communication mechanism, while the hierarchical conflict resolution strategy minimizes path conflicts, maintaining stable performance even in high-density environments. In the 80 × 80 map (Fig. 7), under extreme conditions with 128 agents, DCCPR’s success rate remains above 90%, confirming its applicability to both small-scale and large-scale, complex scenarios. DCCPR also achieves significantly improved computational efficiency.
In the 40 × 40 map (Fig. 8), it records the shortest average episode length among all methods, and the increase in episode length with higher agent numbers is substantially lower than that observed in other approaches. In the 80 × 80 environment (Fig. 9), despite the larger map size, DCCPR consistently maintains the shortest episode length. The hierarchical conflict resolution strategy effectively reduces path conflicts and prevents deadlocks. In environments with dense obstacles and high agent numbers, DCCPR dynamically adjusts task priorities and employs a turn-taking mechanism to mitigate delays caused by path competition. Moreover, in structured map environments not encountered during training, DCCPR maintains high success rates and efficiency (Table 2), demonstrating strong scalability. Compared to baseline methods, DCCPR achieves approximately a 79% improvement in task success rate and a 46.4% reduction in average episode length. DCCPR also performs well in warehouse environments with narrow passages, where congestion typically presents challenges. Through turn-taking and dynamic path re-planning, agents are guided toward previously unused suboptimal paths, reducing oscillatory behavior and lowering the risk of task failure. Overall, DCCPR sustains high computational efficiency while maintaining high success rates, effectively addressing the challenges of multi-agent path planning in complex dynamic environments.  Conclusions  The DCCPR method proposed in this study provides an effective solution for multi-agent path planning. Through selective communication and hierarchical conflict resolution, DCCPR significantly improves path planning efficiency and task success rates. Experimental results confirm the strong adaptability and stability of DCCPR across diverse complex environments, particularly in dynamic scenarios, where it effectively reduces conflicts and enhances system performance. Future work will focus on refining the communication strategy by integrating global and local communication benefits to improve performance in large-scale environments. In addition, real-world factors such as dynamic environmental changes and the energy consumption of intelligent agents will be considered to further enhance the system’s effectiveness.
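The hierarchical conflict-resolution rule can be sketched as a scoring function over the three factors named above (target distance, task Q-value, task urgency); the linear form and weights are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class AgentState:
    dist_to_goal: float   # remaining A* path length
    q_value: float        # task value from the learned Q-network
    urgency: float        # deadline slack mapped into [0, 1]

def resolve_conflict(a: AgentState, b: AgentState, w=(0.4, 0.4, 0.2)):
    """Score both agents; the winner keeps its path, the loser yields and
    re-plans with A* or waits one step (the turn-taking mechanism)."""
    def score(s):
        return w[0] / (1.0 + s.dist_to_goal) + w[1] * s.q_value + w[2] * s.urgency
    return (a, b) if score(a) >= score(b) else (b, a)   # (proceeds, yields)

winner, yielder = resolve_conflict(AgentState(3.0, 0.7, 0.2),
                                   AgentState(8.0, 0.9, 0.9))
```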
Long-Term Trajectory Prediction Model Based on Points of Interest and Joint Loss Function
ZHOU Chuanxin, JIAN Gang, LI Lingshu, YANG Yi, HU Yu, LIU Zhengming, ZHANG Wei, RAO Zhenzhen, LI Yunxiao, WU Chao
2025, 47(8): 2841-2849.   doi: 10.11999/JEIT250011
[Abstract](180) [FullText HTML](90) [PDF 1488KB](22)
Abstract:
  Objective  With the rapid development of modern maritime and aerospace sectors, trajectory prediction plays an increasingly critical role in applications such as ship scheduling, aviation, and security. Growing demand for higher prediction accuracy exposes limitations in traditional methods, such as Kalman filtering and Markov chains, which struggle with complex, nonlinear trajectory patterns and fail to meet practical needs. In recent years, deep learning techniques, including LSTM, GRU, CNN, and TCN models, have demonstrated notable advantages in trajectory prediction by effectively capturing time series features. However, these models still face challenges in representing the heterogeneity and diversity of trajectory data, with limited capacity to extract features from multidimensional inputs. To address these gaps, this study proposes a long-term trajectory prediction model, PL-Transformer, based on points of interest and a joint loss function.  Methods  Building on the TrAISformer framework, the proposed PL-Transformer incorporates points of interest and a joint loss function to enhance long-term trajectory prediction. The model defines the positions of points of interest within the prediction range using expert knowledge and introduces correlation features between trajectory points and points of interest. These features are integrated into a sparse data representation that improves the model’s ability to capture global trajectory patterns, addressing the limitation of conventional Transformer models, which primarily focus on local feature changes. Additionally, the model employs a joint loss function that links latitude and longitude predictions with feature losses associated with points of interest. This approach leverages inter-feature loss relationships to enhance the model’s capability for accurate long-term trajectory prediction.  Results and Discussions  The convergence performance of the PL-Transformer model is evaluated by analyzing the variation in training and validation losses and comparing them with those of the TrAISformer model. The corresponding loss curves are presented in Fig. 5. The PL-Transformer model exhibits faster convergence and improved training stability on both datasets. These results indicate that the introduction of the joint loss function enhances convergence efficiency and training stability, yielding performance superior to the TrAISformer model. In terms of short-term prediction accuracy, the results in Table 1 show that the PL-Transformer model achieves comparable overall prediction accuracy to the TrAISformer model. The PL-Transformer model performs better in terms of the Mean Absolute Percentage Error (MAPE) metric, while it shows slightly worse results than the TrAISformer model for Mean Absolute Error (MAE), median Absolute Error (MdAE), and the coefficient of determination ($R^2$). For the widely used Mean Squared Error (MSE) metric, both models perform similarly. These results indicate that after incorporating points of interest and optimizing the loss function, the PL-Transformer model retains competitive performance in relative error control and fitting accuracy, while preserving the stability and robustness of the TrAISformer model in complex trajectory prediction tasks. For long-term prediction visualization, Table 2 presents the loss values for both models across medium to long-term prediction horizons (1 to 3 h). The PL-Transformer model achieves better long-term prediction accuracy than the TrAISformer model.
Specifically, the loss for the PL-Transformer model increases from 2.058 (1 h) to 5.561 (3 h), whereas the TrAISformer model’s loss rises from 2.160 to 6.145 over the same period. In terms of time complexity analysis, although the PL-Transformer model incorporates additional feature engineering and joint loss computation steps, these enhancements do not substantially increase the overall time complexity. The total computational complexity of the PL-Transformer model remains consistent with that of the TrAISformer model.  Conclusions  This study proposes the PL-Transformer model, which incorporates points of interest and an optimized loss function to address the challenges posed by complex dynamic features and heterogeneity in trajectory prediction tasks. By introducing distance and bearing angle through feature engineering and designing a joint loss function, the model effectively learns and captures spatial and motion characteristics within trajectory data. Experimental results demonstrate that the PL-Transformer model achieves higher prediction accuracy, faster convergence, and greater robustness than the TrAISformer model and other widely used baseline models, particularly in long-term and complex dynamic trajectory prediction scenarios. Despite the strong performance of the PL-Transformer model in experimental settings, trajectory prediction tasks in real-world applications remain affected by various challenges, including data noise, high-frequency trajectory fluctuations, and the influence of external environmental factors. Future research will focus on improving the model’s adaptability to multimodal trajectory data, integrating multi-source information to enhance generalization capability, and incorporating additional feature engineering and optimization strategies to address more complex prediction tasks. In summary, the proposed PL-Transformer model provides an effective advancement for Transformer-based trajectory prediction frameworks and offers valuable reference for practical applications in trajectory forecasting and related fields.
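The correlation features and the joint loss admit a compact sketch; the haversine distance, initial bearing, and additive loss with weight `lam` are assumptions consistent with, but not fixed by, the description above:

```python
import math

def poi_features(lat, lon, poi_lat, poi_lon):
    """Correlation features between a trajectory point and a point of
    interest: great-circle distance (km) and initial bearing (degrees)."""
    R = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat), math.radians(poi_lat)
    dphi = p2 - p1
    dlam = math.radians(poi_lon - lon)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    distance = 2 * R * math.asin(math.sqrt(a))
    bearing = math.degrees(math.atan2(
        math.sin(dlam) * math.cos(p2),
        math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlam),
    )) % 360.0
    return distance, bearing

def joint_loss(l_lat, l_lon, l_dist, l_bearing, lam=0.5):
    """Joint loss linking position losses to the POI feature losses; the
    additive form and lam are illustrative."""
    return l_lat + l_lon + lam * (l_dist + l_bearing)

print(poi_features(31.23, 121.47, 31.00, 122.00))  # distance, bearing to a POI
```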
Detection and Interaction Analysis of Place Cell Firing Information in Dual Brain Regions of Awake Active Rats
LI Ming, XU Wei, XU Zhaojie, MO Fan, YANG Gucheng, LV Shiya, LUO Jinping, JIN Hongyan, LIU Juntao, CAI Xinxia
2025, 47(8): 2850-2858.   doi: 10.11999/JEIT250024
[Abstract](176) [FullText HTML](111) [PDF 4579KB](15)
Abstract:
  Objective  Continuous monitoring of neural activities in free-moving rats is essential for understanding brain function but presents significant challenges regarding the stability and biocompatibility of the detection device. This study aims to provide comprehensive data on brain activity by simultaneously monitoring two brain regions. This approach is crucial for elucidating the neural encoding differences within these regions and the information exchange between them, both of which are integral to spatial memory and navigation processes. Spatial navigation is a fundamental behavior in rats, vital for their survival and interaction with their environment. Central to this behavior are place cells—neurons that selectively respond to an animal’s location, forming the basis of spatial memory and navigation. This study focuses on the hippocampal CA1 region and the Barrel Cortex (BC), both of which are critical for spatial processing. By monitoring these regions simultaneously, the aim is to uncover the neural dynamics underlying spatial memory formation and retrieval. Understanding these dynamics provides insights into the neural mechanisms of spatial cognition and memory, which are fundamental to higher cognitive functions and are often disrupted in neurological disorders such as Alzheimer’s disease and schizophrenia.  Methods  To achieve dual brain region monitoring, a four-electrode MicroElectrode Array (MEA) is designed to conform to the shape of the dual brain regions and is surface-modified with a Polypyrrole/Silver Nanowire (PPy/AgNW) nanocomposite material. Each probe of the MEA consists of eight recording sites with a diameter of 20 μm and one reference site. The MEA is fabricated using Microelectromechanical Systems (MEMS) technology and modified via an electrochemical deposition process. The PPy/AgNW nanocomposite modification is selected for its low impedance and high biocompatibility, which are critical for stable, long-term recordings. The deposition of PPy/AgNW is carried out using cyclic voltammetry. The stability of the modified MEA is assessed by cyclic voltammetry in phosphate-buffered saline to simulate in vivo charge/discharge processes. The MEA is then implanted into the CA1 and BC regions of rats, and neural activities are recorded during a two-week spatial memory task. Spike signals are analyzed to identify place cells and assess their firing patterns, while Local Field Potential (LFP) power is measured to evaluate overall neural activity. Mutual information analysis is performed to quantify the interaction between the two brain regions. The experimental setup includes a behavior arena where rats perform spatial navigation tasks, with continuous neural signal recording using the modified MEA.  Results and Discussions  The PPy/AgNW-modified MEA exhibits low impedance (53.01 ± 2.59 kΩ) at 1 kHz (Fig. 2). This low impedance is critical for high-fidelity signal acquisition, enabling the detection of subtle neural activities. The stability of the MEA is evaluated through 1000 cycles of cyclic voltammetry scanning, demonstrating high capacitance retention (92.51 ± 2.21%) and no significant increase in impedance (Fig. 3). These results suggest that the MEA maintains stable performance over extended periods, which is essential for long-term in vivo monitoring. The modified MEA successfully detects neural activities from the BC and CA1 regions over the two-week period. 
The average firing rates and LFP power in both regions progressively increase, indicating enhanced neural activity as the rats become more familiar with the spatial memory task (Fig. 4). This increase suggests that the rats’ spatial memory and navigation abilities improve over time, likely due to increased familiarity with the environment and task requirements. Place cells are identified in the recorded neurons, confirming the presence of spatially selective neuronal activity (Fig. 5). The identification of place cells is a key finding, as these neurons are fundamental to spatial memory and navigation. Additionally, the spatial stability of place cells in the CA1 region is higher than in the BC region, indicating functional differences between these areas in spatial memory processing (Fig. 5). This suggests that the CA1 region plays a more critical role in spatial memory consolidation. Mutual information analysis reveals significant information exchange between the dual brain regions during the initial memory phase, suggesting a role in memory storage (Fig. 6). This inter-regional communication is crucial for understanding how spatial information is processed and stored in the brain. The observed increase in mutual information over time indicates that the interaction between the BC and CA1 regions becomes more pronounced as the rats engage in spatial navigation, highlighting the dynamic nature of neural interactions during memory formation and retrieval.  Conclusions  This study successfully demonstrated continuous dual brain region monitoring in freely moving rats using a PPy/AgNW-modified MEA. The findings reveal dynamic interactions between the BC and CA1 regions during spatial memory tasks and highlight the importance of place cells in memory formation. Monitoring neural activities in dual brain regions over extended periods provides new insights into the neural basis of spatial memory and navigation. The results suggest that the CA1 region plays a critical role in spatial memory consolidation, while the BC region also contributes to spatial processing. This distinction highlights the value of studying multiple brain regions simultaneously to gain a comprehensive understanding of neural processes. The PPy/AgNW-modified MEA serves as a powerful tool for investigating the complex neural mechanisms underlying spatial cognition and memory, with potential applications in related neurological disorders.
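The mutual information analysis reduces to a histogram estimate over simultaneously binned activity from the two arrays; a sketch with an illustrative bin count and stand-in firing-rate series follows:

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate (in bits) of the mutual information between two
    simultaneously binned activity series, e.g., firing rates or LFP power
    recorded in BC and CA1."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)         # marginal of x
    py = pxy.sum(axis=0, keepdims=True)         # marginal of y
    nz = pxy > 0                                # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
ca1 = rng.poisson(5.0, size=2000).astype(float)   # stand-in CA1 rates
bc = ca1 + rng.normal(0.0, 1.0, size=2000)        # correlated BC rates
print(round(mutual_information(bc, ca1), 3))      # shared information, bits
```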
Personalized Tensor Decomposition Based High-order Complementary Cloud API Recommendation
SUN Mengmeng, LIU Xiaowei, CHEN Wenhui, SHEN Limin, YOU Dianlong, CHEN Zhen
2025, 47(8): 2859-2871.   doi: 10.11999/JEIT250003
[Abstract](265) [FullText HTML](99) [PDF 3150KB](25)
Abstract:
  Objective  With the emergence of the cloud era in the Internet of Things, cloud Application Programming Interfaces (APIs) have become essential for managing data element dynamics, facilitating AI algorithm implementation, and coordinating access to computing resources. Cloud APIs have developed into critical digital infrastructure that supports the digital economy and the operation of service-oriented software. However, the rapid expansion of cloud APIs has burdened users’ decision-making processes and complicated the promotion of cloud APIs. This situation underscores the urgent need for effective cloud API recommendation methods to foster the development of the API economy and encourage the widespread adoption of cloud APIs. While existing research has focused on modeling invocation preferences, search keywords, or a combination of both to recommend suitable cloud APIs for a given Mashup, it does not address the need for personalized high-order complementary cloud APIs in practical software development. Personalized high-order complementary cloud API recommendation aims to provide developers with APIs that align with their personalized invocation preferences and complement the other APIs in their query set, thereby addressing the developers’ joint interests.  Methods  To address this issue, a Personalized Tensor Decomposition-based High-order Complementary cloud API Recommendation (PTDHCR) method is proposed. First, the invocation relationships between Mashups and cloud APIs, as well as the complementary relationships between cloud APIs, are represented as a three-dimensional tensor. RESCAL tensor decomposition is applied to jointly learn and uncover personalized asymmetric complementary relationships between cloud APIs. Second, a personalized high-order complementary perception network is designed to account for the varying influence of different complementary relationships on recommendations. This network dynamically calculates the attention of a Mashup to the complementary relationships between different query and candidate cloud APIs using the multi-modal features of the Mashup, query cloud APIs, and candidate cloud APIs. Finally, the personalized complementary relationships are extended to higher orders, yielding a comprehensive personalized complementarity between candidate cloud APIs and the query set.  Results and Discussions  Extensive experiments are conducted on two real cloud API datasets. First, PTDHCR is compared with 11 baseline methods suitable for personalized high-order complementary cloud API recommendation. The experimental results (Tables 2 and 3) show that, on the PWA dataset, PTDHCR outperforms the best baseline by 0.12%, 0.14%, 1.46%, and 2.93% in terms of AUC. HR@10 improves by 0.91%, 1.01%, 3.45%, and 10.84%, while RMSE decreases by 0.33%, 0.7%, 1.36%, and 2.67%. PTDHCR also performs well on the HGA dataset, significantly outperforming the baseline methods in AUC, HR@10, and RMSE metrics. Second, experiments are conducted with varying complementary thresholds to evaluate PTDHCR’s performance at different complementary orders. The experimental results (Figure 4) indicate that PTDHCR’s recommendation performance improves progressively as the complementary order increases. This improvement is attributed to the method’s ability to incorporate more complementary information, thereby enhancing its recommendation capability. 
Next, a comparison experiment is performed to assess whether the personalized high-order complementary perception network can better capture high-order complementary relationships than the mean-value and semantic similarity-based methods. The experimental results (Figures 5 and 6) demonstrate that the personalized high-order complementary perception network outperforms other methods. This is due to the network’s ability to consider the contribution of different complementary relationships and dynamically compute the Mashup’s attention to each complementary relationship. Finally, an example is provided, evaluating the predicted probability of a Mashup invoking other candidate cloud APIs, given that it has already invoked the “Google Maps API” and the “Google AdSense API.” This example illustrates the personalized nature of the high-order complementary cloud API recommendation achieved by the PTDHCR method.  Conclusions  Existing methods fail to address the actual needs of developers for personalized high-order complementary cloud APIs in the development of service-oriented software. This paper defines the recommendation problem of personalized high-order complementary cloud APIs and proposes a solution. A personalized high-order complementary cloud API recommendation method based on tensor decomposition is introduced. Initially, the invocation relationships between Mashups and cloud APIs, as well as the complementary relationships between cloud APIs, are modeled as a three-dimensional tensor. RESCAL tensor decomposition is then applied to jointly learn and explore the personalized asymmetric complementary relationships. Additionally, a high-order complementary perception network is constructed to dynamically compute Mashups’ attention towards various complementary relationships, which extends these relationships to higher orders. Experimental results show that PTDHCR outperforms state-of-the-art cloud API recommendation methods on real cloud API datasets. PTDHCR offers an effective approach to address the cloud API selection problem and contributes to the healthy development and popularization of the cloud API economy.
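RESCAL factorizes each slice of a three-way tensor as X_k = A R_k Aᵀ, where a shared matrix A embeds the entities and each R_k encodes one relation; because R_k need not be symmetric, asymmetric complementarity can be learned. The following is a minimal gradient-descent sketch of that factorization on toy data; the paper's actual optimizer, regularization, and tensor construction are not specified in the abstract.

```python
import numpy as np

def rescal_gd(X, rank=3, iters=2000, lr=0.002, seed=0):
    """Gradient-descent sketch of RESCAL: factor each relation slice
    X[k] ~ A @ R[k] @ A.T with a shared entity matrix A. Here A's rows
    would embed cloud APIs and each R[k] one (asymmetric) relation,
    e.g. invocation or complementarity -- an illustrative assumption."""
    rng = np.random.default_rng(seed)
    K, n, _ = X.shape
    A = rng.normal(scale=0.3, size=(n, rank))
    R = rng.normal(scale=0.3, size=(K, rank, rank))
    for _ in range(iters):
        for k in range(K):
            E = A @ R[k] @ A.T - X[k]            # residual of slice k
            A -= lr * (E @ A @ R[k].T + E.T @ A @ R[k])
            R[k] -= lr * (A.T @ E @ A)
    return A, R

# Tiny synthetic 2-relation tensor over 6 entities (hypothetical).
rng = np.random.default_rng(1)
A_true = rng.normal(size=(6, 3))
R_true = rng.normal(size=(2, 3, 3))
X = np.stack([A_true @ Rk @ A_true.T for Rk in R_true])
A_hat, R_hat = rescal_gd(X)
recon = np.stack([A_hat @ Rk @ A_hat.T for Rk in R_hat])
print("reconstruction error:", np.linalg.norm(recon - X))
```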
LFTA: Lightweight Feature Extraction and Additive Attention-based Feature Matching Method
GUO Zhiqiang, WANG Zihan, WANG Yongsheng, CHEN Pengyu
2025, 47(8): 2872-2882.   doi: 10.11999/JEIT250124
[Abstract](357) [FullText HTML](266) [PDF 4453KB](62)
Abstract:
  Objective  With the rapid development of deep learning, feature matching has advanced considerably, particularly in computer vision. This progress has led to improved performance in tasks such as 3D reconstruction, motion tracking, and image registration, all of which depend heavily on accurate feature matching. Nevertheless, current techniques often face a trade-off between accuracy and computational efficiency. Some methods achieve high matching accuracy and robustness but suffer from slow processing due to algorithmic complexity. Others offer faster processing but compromise matching accuracy, especially under challenging conditions such as dynamic scenes, low-texture environments, or large view-angle variations. The key challenge is to provide a balanced solution that ensures both accuracy and efficiency. To address this, this paper proposes a Lightweight Feature exTraction and matching Algorithm (LFTA), which integrates an additive attention mechanism within a lightweight architecture. LFTA enhances the robustness and accuracy of feature matching while maintaining the computational efficiency required for real-time applications.  Methods  LFTA utilizes a multi-scale feature extraction network designed to capture information from images at different levels of detail. A triple-exchange fusion attention mechanism merges information across multiple dimensions, including spatial and channel features, allowing the network to learn more robust feature representations. This mechanism improves matching accuracy, particularly in scenarios with sparse textures or large viewpoint variations. LFTA further integrates an adaptive Gaussian kernel to dynamically generate keypoint heatmaps. The kernel adjusts according to local feature strength, enabling accurate keypoint extraction in both high-response and low-response regions. To improve keypoint precision, a dynamic Non-Maximum Suppression (NMS) strategy is applied, which adapts to varying keypoint densities across different image regions. This approach reduces redundancy and improves detection accuracy. In the final stage, LFTA employs a lightweight module with an additive Transformer attention mechanism to refine feature matching. This module strengthens feature fusion while reducing computational complexity through depthwise separable convolutions. These operations substantially lower parameter count and computational cost without affecting performance. Through this combination of techniques, LFTA achieves accurate pixel-level matching with fast inference times, making it suitable for real-time applications.  Results and Discussions  The performance of LFTA is assessed through extensive experiments conducted on two widely used and challenging datasets: MegaDepth and ScanNet. These datasets offer diverse scenarios for evaluating the robustness and efficiency of feature matching methods, including variations in texture, environmental complexity, and viewpoint changes. The results indicate that LFTA achieves higher accuracy and computational efficiency than conventional feature matching approaches. On the MegaDepth dataset, an AUC@20° of 79.77% is attained, which is comparable to or exceeds state-of-the-art methods such as LoFTR. Notably, this level of performance is achieved while reducing inference time by approximately 70%, supporting the suitability of LFTA for practical, time-sensitive applications. 
When compared with other efficient methods, including Xfeat and Alike, LFTA demonstrates superior matching accuracy with only a marginal increase in inference time, proving its competitive performance in both accuracy and speed. The improvement in accuracy is particularly apparent in scenarios characterized by sparse textures or large viewpoint variations, where traditional methods often fail to maintain robustness. Ablation studies confirm the contribution of each LFTA component. Exclusion of the triple-exchange fusion attention mechanism results in a significant reduction in accuracy, indicating its function in managing complex feature interactions. Similarly, both the adaptive Gaussian kernel and dynamic NMS are found to improve keypoint extraction, emphasizing their roles in enhancing overall matching precision.  Conclusions  The LFTA algorithm addresses the long-standing trade-off between feature extraction accuracy and computational efficiency in feature matching. By integrating the triple-exchange fusion attention mechanism, adaptive Gaussian kernels, and lightweight fine-tuning strategies, LFTA achieves high matching accuracy in dynamic and complex environments while maintaining low computational requirements. Experimental results on the MegaDepth and ScanNet datasets demonstrate that LFTA performs well under typical feature matching conditions and shows clear advantages in more challenging scenarios, including low-texture regions and large viewpoint variations. Given its efficiency and robustness, LFTA is well suited for real-time applications such as Augmented Reality (AR), autonomous driving, and robotic vision, where fast and accurate feature matching is essential. Future work will focus on further optimizing the algorithm for high-resolution images and more complex scenes, with the potential integration of hardware acceleration to reduce computational overhead. The method could also be extended to other computer vision tasks, including image segmentation and object detection, where reliable feature matching is required.
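The parameter savings from depthwise separable convolution, which the abstract credits for LFTA's reduced cost, come from splitting a standard convolution into a per-channel spatial filter followed by a 1×1 channel-mixing step. A minimal NumPy sketch with toy shapes (not LFTA's actual layers):

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """x: (C, H, W); dw: (C, k, k), one spatial filter per channel;
    pw: (C_out, C), 1x1 pointwise channel mixing. Valid padding, stride 1."""
    C, H, W = x.shape
    k = dw.shape[-1]
    Ho, Wo = H - k + 1, W - k + 1
    depth = np.empty((C, Ho, Wo))
    for c in range(C):                    # depthwise: per-channel filtering
        for i in range(Ho):
            for j in range(Wo):
                depth[c, i, j] = (x[c, i:i + k, j:j + k] * dw[c]).sum()
    out = pw @ depth.reshape(C, -1)       # pointwise: mix channels at each pixel
    return out.reshape(pw.shape[0], Ho, Wo)

x = np.random.rand(8, 16, 16)
dw = np.random.rand(8, 3, 3)
pw = np.random.rand(16, 8)
y = depthwise_separable_conv(x, dw, pw)
# Parameters: 8*3*3 + 16*8 = 200, vs. 16*8*3*3 = 1152 for a standard conv.
print(y.shape)  # (16, 14, 14)
```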
Autonomous Teaming and Task Collaboration for Multi-Agent Systems in Dynamic Environments
WANG Chen, ZHU Cheng, LEI Hongtao
2025, 47(8): 2883-2894.   doi: 10.11999/JEIT250079
[Abstract](169) [FullText HTML](109) [PDF 6518KB](40)
Abstract:
  Objective  In dynamic and volatile battlefield environments, where the command structure of combat units may be disrupted, combat units must autonomously form appropriate tactical groups in edge operational settings, determine group affiliation, and rapidly allocate tasks. This study proposes a combat unit aggregation and planning method based on an adaptive clustering contract network, addressing the real-time limitations of traditional centralized optimization algorithms. The proposed method enables collaborative decision-making for autonomous group formation and supports multi-task optimization and allocation under dynamic battlefield conditions.  Methods  (1) An adaptive combat group division algorithm based on the second-order relative change rate is proposed. The optimal number of groups is determined using the Sum of Squared Errors (SSE) indicator, and spatial clustering of combat units is performed via an improved K-means algorithm. (2) A dual-layer contract network architecture is designed. In the first layer, combat groups participate in bidding by computing the net effectiveness of tasks, incorporating attributes such as attack, defense, and value. In the second layer, individual combat units conduct bidding with a load balancing factor to optimize task selection. (3) Mechanisms for task redistribution and exchange are introduced, improving global utility through a secondary bidding process that reallocates unassigned tasks and replaces those with negative effectiveness.  Results and Discussions  (1) The adaptive combat group division algorithm demonstrates enhanced situational awareness (Algorithm 1). Through dynamic clustering analysis, it accurately captures the spatial aggregation of combat units (Fig. 6 and Fig. 9), showing greater adaptability to environmental variability than conventional fixed-group models. (2) The multi-layer contract network architecture exhibits marked advantages in complex task allocation. The group-level pre-screening mechanism significantly reduces computational overhead, while the unit-level negotiation process improves resource utilization by incorporating load balancing. (3) The dynamic task optimization mechanism enables continuous refinement of the allocation scheme. It resolves unassigned tasks and enhances overall system effectiveness through intelligent task exchanges. Comparative experiments confirm that the proposed framework outperforms traditional approaches in task coverage and resource utilization efficiency (Table 4 and Table 5), supporting its robustness in dynamic battlefield conditions.  Conclusions  This study integrates clustering analysis with contract network protocols to establish an intelligent task allocation framework suited to dynamic battlefield conditions. By implementing dual-layer optimization in combat group division and task assignment, the approach improves combat resource utilization and shortens the kill chain. Future research will focus on validating the framework in multi-domain collaborative combat scenarios, refining bidding strategies informed by combat knowledge, and advancing command and control technologies toward autonomous coordination.
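One plausible reading of step (1), choosing the group count from the SSE curve via a second-order relative change rate before clustering, is sketched below. The exact formula is not given in the abstract, so the elbow heuristic, synthetic unit positions, and scikit-learn usage are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def pick_k_by_second_order_change(X, k_max=8):
    """Sweep k, record SSE (K-means inertia), and pick the elbow where
    the second-order change of the relative SSE drop is largest."""
    ks = np.arange(1, k_max + 1)
    sse = np.array([KMeans(n_clusters=k, n_init=10, random_state=0)
                    .fit(X).inertia_ for k in ks])
    rel = -np.diff(sse) / sse[:-1]   # first-order relative drop
    curvature = -np.diff(rel)        # how sharply the drop rate falls
    return int(ks[1 + np.argmax(curvature)]), sse

# Hypothetical unit positions: three spatial clusters of combat units.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2))
               for c in ([0, 0], [5, 5], [10, 0])])
k_best, sse = pick_k_by_second_order_change(X)
print("chosen number of groups:", k_best)  # expected: 3
```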
YOMANet-Accel: A Lightweight Algorithm Accelerator for Pedestrians and Vehicles Detection at the Edge
CHEN Ningjiang, LU Yaozong
2025, 47(8): 2895-2908.   doi: 10.11999/JEIT250059
[Abstract](226) [FullText HTML](108) [PDF 6758KB](50)
Abstract:
  Objective  Accurate and real-time detection of pedestrians and vehicles is essential for autonomous driving at the edge. However, deep learning-based object detection algorithms are often challenging to deploy in edge environments due to their high computational demands and complex parameter structures. To address these limitations, this study proposes a software-hardware coordination strategy. A lightweight neural network model, Yolo Model Adaptation Network (YOMANet), is designed, and a corresponding neural network accelerator, YOMANet Accelerator (YOMANet-Accel), is implemented on a heterogeneous Field-Programmable Gate Array (FPGA) platform. This system enables efficient algorithm acceleration for pedestrian and vehicle detection in edge-based autonomous driving scenarios.  Methods  The lightweight backbone of YOMANet adopts MobileNetv2 to reduce the number of network parameters. The neck network incorporates the Spatial Pyramid Pooling (SPP) and Path Aggregation Network (PANet) structures from YOLOv4 to expand the receptive field and accommodate targets of varying sizes. Depthwise separable convolution replaces standard convolution, thereby reducing training complexity and improving convergence speed. To enhance detail extraction, the Normalization-based Attention Module (NAM) is integrated into the head network, allowing suppression of irrelevant feature weights. For deployment on an FPGA platform, parallel computing and data storage schemes are designed. The parallel computing strategy adopts a loop blocking method to reorder inner and outer loops, enabling access to different output array elements through adjacent loop layers and facilitating parallel processing of output feature map pixels. Multiply-add trees are implemented in the Processing Engine (PE) to support efficient task allocation and operation scheduling. A double-buffer mechanism is introduced in the data storage scheme to increase data reuse, minimize transmission latency, and enhance system throughput. In addition, int8 quantization is applied to both weight parameters and activation functions, reducing the overall parameter size and accelerating parallel computation.  Results and Discussions  Experimental results on the training platform indicate that YOMANet achieves the inference speed characteristic of lightweight models while maintaining the detection accuracy of large-scale models, thereby improving overall detection performance (Fig. 12, Table 2). The ablation study demonstrates that the integration of MobileNetv2 and depthwise separable convolution significantly reduces the number of model parameters. Embedding the NAM attention mechanism does not noticeably increase model size but enhances detail extraction and improves detection of small targets (Table 3). Compared with other lightweight algorithms, the enhanced YOMANet shows improved detail extraction and superior detection of small and occluded targets, with substantially lower false and missed detection rates (Fig. 13). Results on the accelerator platform reveal that quantization has minimal effect on accuracy while substantially reducing model size, supporting deployment on resource-constrained edge devices (Table 4). When deployed on the FPGA platform, YOMANet retains detection accuracy comparable to GPU/CPU platforms, while power consumption is reduced by an order of magnitude, meeting the efficiency requirements for edge deployment (Fig. 14). 
Compared with related accelerator designs, YOMANet-Accel achieves competitive throughput and the highest Digital Signal Processing (DSP) efficiency, demonstrating the effectiveness of the proposed parallel computing and storage schemes in utilizing FPGA resources (Table 5).  Conclusions  Experimental results demonstrate that YOMANet achieves high detection accuracy and fast inference speed on the training platform, with enhanced performance for small and occluded targets, leading to a reduced missed detection rate. When deployed on the FPGA platform, YOMANet-Accel achieves an effective balance between detection performance and resource efficiency, supporting real-time pedestrian and vehicle detection in edge computing scenarios.
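The int8 quantization applied to YOMANet's weights and activations can be illustrated with a symmetric per-tensor scheme; the calibration actually used in the paper is not described in the abstract, so the sketch below is a generic example.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map [-max|w|, max|w|]
    onto [-127, 127] and keep the scale for dequantization."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)   # hypothetical conv weights
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
print(f"4x smaller than float32, max abs error {err:.4f}")
```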
FSG: Feature-level Semantic-aware Guidance for multi-modal Image Fusion Algorithm
ZHANG Mei, JIN Ye, ZHU Jinhui, HE Lin
2025, 47(8): 2909-2918.   doi: 10.11999/JEIT250042
[Abstract](295) [FullText HTML](180) [PDF 4340KB](59)
Abstract:
  Objective  Multimodal vision techniques offer greater advantages than unimodal ones in autonomous driving scenarios. Fused images from multiple modalities enhance salient radiation information from targets while preserving background texture and detail. Furthermore, such fused images improve the performance of downstream visual tasks, e.g., semantic segmentation, compared with visible-light images alone, thereby enhancing the decision accuracy of automated driving systems. However, most existing fusion algorithms prioritize visual quality and standard evaluation metrics, often overlooking the requirements of downstream tasks. Although some approaches attempt to integrate task-specific guidance, they are constrained by weak interaction between semantic priors and fusion processes, and fail to address cross-modal feature variability. To address these limitations, this study proposes a multimodal image fusion algorithm, termed Feature-level Semantic-aware Guidance (FSG), which leverages feature-level semantic information from segmentation networks to guide the fusion process. The proposed method aims to enhance the utility of fused images in advanced vision tasks by strengthening the alignment between semantic understanding and feature integration.  Methods  The proposed algorithm adopts a parallel fusion framework integrating a fusion network and a segmentation network. Feature-level semantic prior knowledge from the segmentation network guides the fusion process, aiming to enhance the semantic richness of the fused image and improve performance in downstream visual tasks. The overall architecture comprises a fusion network, a segmentation network, and a feature interaction mechanism connecting the two. Infrared and visible images serve as inputs to the fusion network, whereas only visible images, which are rich in texture and detail, are used as inputs to the segmentation network. The fusion network uses a dual-branch structure for modality-specific feature extraction, with each branch containing two Adaptive Gabor convolution Residual (AGR) modules. A Multimodal Spatial Attention Fusion (MSAF) module is incorporated to effectively integrate features from different modalities. In the reconstruction phase, semantic features from the segmentation network are combined with image features from the fusion network via a Dual Feature Interaction (DFI) module, enhancing semantic representation before generating the final fused image.  Results and Discussions  This study includes fusion experiments and joint segmentation task experiments. For the fusion experiments, the proposed method is compared with seven state-of-the-art algorithms: DenseFuse, DIDFuse, U2Fusion, TarDal, SeAFusion, DIVFusion, and CDDFuse, across three datasets: MFNet, M3FD, and RoadScene. Both subjective and objective evaluations are conducted. For subjective evaluation, the fused images generated by each method are visually compared. For objective evaluation, six metrics are employed: Mutual Information (MI), Visual Information Fidelity (VIF), Average Gradient (AG), Sum of Correlation Differences (SCD), Structural Similarity Index Measure (SSIM), and Gradient-based Similarity Measurement (QAB/F). The results show that the proposed method performs consistently well across all datasets, effectively preserves complementary information from infrared and visible images, and achieves superior scores on all evaluation metrics. In the joint segmentation experiments, comparisons are made on the MFNet dataset. 
Subjective evaluation is presented through semantic segmentation visualizations, and objective evaluation uses Intersection over Union (IoU) and mean IoU (mIoU) metrics. The segmentation results produced by the proposed method more closely resemble ground truth labels and achieve the highest or second-highest IoU scores across all classes. Overall, the proposed method not only yields improved visual fusion results but also demonstrates clear advantages in downstream segmentation performance.  Conclusions  This study proposes an FSG strategy for multimodal image fusion networks, designed to fully leverage semantic information to improve the utility of fused images in downstream visual tasks. The method accounts for the variability among heterogeneous features and integrates the segmentation and fusion networks into a unified framework. By incorporating feature-level semantic information, the approach enhances the quality of the fused images and strengthens their performance in segmentation tasks. The proposed DFI module serves as a bridge between the segmentation and fusion networks, enabling effective interaction and selection of semantic and image features. This reduces the influence of feature variability and enriches the semantic content of the fusion results. In addition, the proposed MSAF module promotes the complementarity and integration of features from infrared and visible modalities while mitigating the disparity between them. Experimental results demonstrate that the proposed method not only achieves superior visual fusion quality but also outperforms existing methods in joint segmentation performance.
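As a rough illustration of spatial-attention fusion across two modalities (not the internals of the paper's MSAF module, which the abstract does not specify), the sketch below derives a per-pixel weight map from pooled activations of both branches and blends the infrared and visible feature maps accordingly.

```python
import numpy as np

def spatial_attention_fuse(f_ir, f_vis):
    """Blend two (C, H, W) feature maps with a per-pixel weight map
    computed from channel-pooled activations of both modalities."""
    def spatial_logits(f):
        # channel-average and channel-max pooling summarize each location
        return f.mean(axis=0) + f.max(axis=0)
    a_ir, a_vis = spatial_logits(f_ir), spatial_logits(f_vis)
    m = np.maximum(a_ir, a_vis)                # numerically stable softmax
    w_ir = np.exp(a_ir - m) / (np.exp(a_ir - m) + np.exp(a_vis - m))
    return w_ir[None] * f_ir + (1.0 - w_ir)[None] * f_vis

f_ir = np.random.rand(32, 60, 80)    # infrared branch features (toy)
f_vis = np.random.rand(32, 60, 80)   # visible branch features (toy)
fused = spatial_attention_fuse(f_ir, f_vis)
print(fused.shape)  # (32, 60, 80)
```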
A Fake News Detection Approach Enhanced by Multi-Source Feature Fusion
HU Ze, CHEN Zhinan, YANG Hongyu
2025, 47(8): 2919-2934.   doi: 10.11999/JEIT250041
[Abstract](279) [FullText HTML](173) [PDF 4014KB](40)
Abstract:
  Objective  News exhibits multidimensional complexity, comprising structural, temporal, and content features. Structural features are reflected in the propagation path, depth, and breadth. Fake news often exhibits distinctive structural patterns, such as rapid diffusion through a limited number of “opinion leader” nodes or the formation of densely connected propagation clusters. Temporally, fake news tends to spread quickly within short timeframes, characterized by unusually high dissemination speeds and elevated interaction rates. Content features include the information conveyed in headlines and body text; fake news often contains sensationalized headlines, emotive language, inaccurate data, or fabricated claims. Detection models that rely solely on a single feature type often demonstrate limited discriminative performance. Therefore, capturing the hierarchical and heterogeneous attributes of news is critical to improving detection accuracy and remains a major focus of ongoing research. Current approaches predominantly emphasize content features, with limited incorporation of structural characteristics. Although Graph Neural Networks (GNN) have been employed to model propagation structures, their integration of content and temporal information remains inadequate. To address these limitations, this study proposes a fake news detection approach based on multi-source feature fusion, which enables more comprehensive feature representation and substantially enhances detection performance.  Methods  To enhance fake news detection performance, this study proposes a Multi-Source Feature Fusion Enhancement (MSFFE) approach that extracts features from three sources (structural, temporal, and content) and integrates them using an adaptive fusion mechanism. This mechanism dynamically adjusts the weight of each feature type to generate a unified representation with comprehensive expressiveness. The model comprises three core components: propagation tree encoding, multi-source feature extraction, and news classification (Fig. 2). In the propagation tree encoding component, a GNN is employed to represent the news propagation structure. Specifically, the GraphSAGE (Graph SAmple and aggreGatE) algorithm is used to aggregate node information from the propagation tree to the root node, enabling efficient capture of local structural patterns and temporal dynamics. Compared with conventional GNN methods, GraphSAGE improves scalability for large-scale graphs and reduces computational complexity by avoiding full-graph updates. In the multi-source feature extraction component, the model extracts structural, temporal, and content features. For structural features, the encoded propagation tree nodes are organized into a hypergraph. A hypergraph attention mechanism is then applied: first, hyperedge representations are updated via node-level attention; next, node representations are updated via hyperedge-level attention; and finally, structural-level features are obtained. For temporal features, node activity across multiple time windows is modeled using time-scale expansion and compression. A time decay attention mechanism is introduced to extract multi-scale temporal features, which are then fused into a unified temporal representation. For content features, the root node’s associated text is processed using a multi-head self-attention mechanism to capture semantic information, yielding content-level features. 
After extracting the three feature types, an adaptive multi-source fusion mechanism integrates them into a final news representation. This representation is passed through a fully connected layer and activation function for classification. The fully connected layer applies a linear transformation using a learnable weight matrix and bias term to produce predicted scores for each news instance. During training, model parameters are optimized to maximize classification accuracy. The final output is mapped to a probability in [0,1] using a Sigmoid activation function, indicating the likelihood of the news being classified as “true” or “fake.” A threshold of 0.5 is used for binary classification: probabilities above 0.5 are labeled “fake,” and those below are labeled “true.”  Results and Discussions  As shown in Table 3, the ablation experiments demonstrate that incorporating features from different sources into the base model significantly improves fake news detection accuracy. This finding confirms the effectiveness of the core components in the proposed approach. The integration of multi-source features enhances the overall detection performance, highlighting the advantage of the fusion mechanism in identifying fake news. Comparative experiments further support these results. As shown in Table 2, the proposed approach outperforms existing approaches on both the Politifact and Gossipcop datasets. On the Politifact dataset, it improves accuracy by 3.64% and the F1 score by 3.41% compared with the State-Of-The-Art (SOTA) method Robust Trust Evaluation Architecture (RTRUST). On the Gossipcop dataset, the accuracy and F1 score increase by 0.55% and 0.56%, respectively. These improvements are attributed to the approach’s ability to effectively model high-order structural features and integrate temporal and content features, resulting in more comprehensive and discriminative feature representations.  Conclusions  Experimental results demonstrate that the proposed approach effectively extracts and fuses multi-source features, substantially improving the performance of fake news detection. By enhancing the model’s ability to represent structural, temporal, and content characteristics, the approach contributes to more accurate classification. This has the potential to mitigate the societal consequences of fake news, including public misinformation, reputational damage to organizations, and policy misjudgments.
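The adaptive multi-source fusion and sigmoid classification stage described above can be sketched as follows. The gating scheme (a softmax over per-source relevance scores) is an illustrative assumption; the abstract states only that the weight of each feature type is adjusted dynamically.

```python
import numpy as np

def adaptive_fuse(feats, W_gate):
    """feats: list of three source vectors (structural, temporal, content),
    each of dimension d. W_gate: (3, d) scoring weights. A softmax over
    per-source scores yields adaptive weights for the fused vector."""
    F = np.stack(feats)                     # (3, d)
    scores = (W_gate * F).sum(axis=1)       # one relevance score per source
    w = np.exp(scores - scores.max())
    w /= w.sum()                            # adaptive source weights
    return (w[:, None] * F).sum(axis=0), w

def classify(h, w_out, b_out):
    """Linear layer + sigmoid -> probability that the news is fake."""
    z = h @ w_out + b_out
    p = 1.0 / (1.0 + np.exp(-z))
    return float(p), "fake" if p > 0.5 else "true"

rng = np.random.default_rng(0)
d = 16
feats = [rng.normal(size=d) for _ in range(3)]   # toy source features
fused, weights = adaptive_fuse(feats, rng.normal(size=(3, d)))
print("source weights:", np.round(weights, 3))
print(classify(fused, rng.normal(size=d), 0.0))
```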
Research on Deep Learning Analysis Model of Sleep ElectroEncephalography Signals for Anxiety Improvement
HUANG Chen, MA Yaolong, ZHANG Yan, WANG Shihui, YANG Chao, SONG Jianhua, CHEN Kansong, YANG Weiping
2025, 47(8): 2935-2944.   doi: 10.11999/JEIT241123
[Abstract](149) [FullText HTML](96) [PDF 2799KB](26)
Abstract:
  Objective  Anxiety is a common emotional disorder, characterized by excessive worry and fear, which negatively affects mental, physical, and social well-being. A bidirectional relationship exists between anxiety and sleep: poor sleep quality worsens anxiety symptoms, and anxiety disrupts normal sleep patterns. ElectroEncephaloGraphy (EEG) signals provide a non-invasive and informative means to investigate brain activity, making them useful for studying the neurophysiological mechanisms underlying this association. However, conventional EEG analysis methods often fail to capture the complex, multiscale features needed to assess anxiety modulation during sleep. This study proposes an Improved Feature Pyramid Network (IFPN) model to enhance EEG analysis in sleep settings, with the aim of improving the detection and interpretation of anxiety-related brain activity.  Methods  The IFPN model comprises a preprocessing module, feature extraction module, and classification module, each optimized for analyzing EEG signals related to anxiety during sleep. The preprocessing module applies Z-score normalization to EEG signals from individuals with anxiety to standardize signal amplitude across channels. Noise artifacts are reduced using a denoising process based on a feature pyramid network. Preprocessed signals are then converted into brain entropy topographies using Singular Spectral Entropy (SSE), which quantifies signal complexity. These entropy maps are processed by the IFPN backbone, which incorporates convolutional layers, SSE-guided upsampling, and lateral connections to enable multiscale feature fusion. The resulting features are input to a modified ResNet-50 network for classification, with SSE-based regularization applied to enhance model robustness and accuracy. The model is evaluated using two independent EEG datasets: a sleep deprivation dataset and a cognitive-state EEG dataset, both comprising participants with varying levels of anxiety.  Results and Discussions  The experimental results demonstrate that the IFPN model improves the detection of anxiety-related features in EEG signals during sleep. Spectral power analysis shows a significant reduction in β-band power after sleep, reflecting decreased hyperarousal commonly associated with anxiety. In Dataset 1, β-band power declines from 16% to 13% (p < 0.01), and in Dataset 2, from 19.5% to 15% (p < 0.05). This is accompanied by an increase in the θ/β power ratio, suggesting a shift toward a more relaxed neural state post-sleep. The IFPN model achieves 85% accuracy in identifying severe anxiety, outperforming baseline methods, which reach 78%. This improvement results from the model’s capacity to integrate multiscale features and selectively emphasize anxiety-related patterns, supporting more accurate classification of elevated anxiety states.  Conclusions  This study proposes an IFPN model for EEG analysis during sleep, with a focus on detecting anxiety-related neural activity. Unlike traditional approaches that rely on shallow architectures or frequency-limited metrics, the IFPN model addresses the multiscale and spatially heterogeneous nature of brain activity associated with anxiety. By incorporating SSE as a nonlinear dynamic feature, the model captures subtle regional and frequency-specific variations in EEG complexity. SSE functions as both a signal complexity metric and a functional biomarker of neural disorganization linked to anxiety. 
Integrated with the multiscale fusion capability of the feature pyramid network, SSE enhances the model’s ability to extract salient spatiotemporal features relevant to anxiety states. Experimental results show that the IFPN model outperforms existing methods in both accuracy and robustness, particularly in identifying severe anxiety, where conventional models often struggle due to noise and reduced discriminative performance. These findings highlight the model’s potential utility in clinical assessment of anxiety during sleep.
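Singular Spectral Entropy, the complexity feature at the core of the IFPN pipeline, is commonly computed by delay-embedding the signal into a trajectory matrix, taking its singular values, and measuring the Shannon entropy of their normalized distribution. A minimal sketch under that common formulation (the paper's exact normalization and embedding dimension are not given in the abstract):

```python
import numpy as np

def singular_spectral_entropy(x, embed_dim=20):
    """Delay-embed x into a trajectory matrix, normalize its singular
    values into a probability distribution, and return the Shannon
    entropy: higher values indicate a more disordered signal."""
    traj = np.lib.stride_tricks.sliding_window_view(x, embed_dim)
    s = np.linalg.svd(traj, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
t = np.linspace(0, 4, 1024)
rhythmic = np.sin(2 * np.pi * 10 * t)   # ordered signal: low entropy
noisy = rng.normal(size=t.size)         # disordered signal: high entropy
print(f"sinusoid SSE: {singular_spectral_entropy(rhythmic):.3f}")
print(f"noise    SSE: {singular_spectral_entropy(noisy):.3f}")
```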
Circuit and System Design
A 2 pJ/bit, 4×112 Gbps PAM4 linear driver for MZM in LPO Application
ZHANG Shu’an, ZHU Wenrui, GU Yuandong, LEI Meng, ZHANG Jianling
2025, 47(8): 2945-2952.   doi: 10.11999/JEIT250176
[Abstract](286) [FullText HTML](200) [PDF 5055KB](40)
Abstract:
  Objective  The rapid increase in data transmission demands, driven by big data, cloud computing, and Artificial Intelligence (AI), requires advanced optical module technologies capable of supporting higher data rates, such as 800 Gbps. Conventional optical modules depend on power-intensive Digital Signal Processors (DSPs) for signal compensation, which increases cost, complexity, and energy consumption. This study addresses these limitations by proposing a Linear Driver Pluggable Optics (LPO) solution that eliminates the DSP while preserving high performance. The primary objective is to design a low-power, high-efficiency Mach–Zehnder Modulator (MZM) driver using 130 nm SiGe BiCMOS technology for 400 Gbps PAM4 applications. The design integrates Continuous-Time Linear Equalization (CTLE) and gain control to support reliable, cost-effective, and energy-efficient data transmission.  Methods  The proposed quad-channel MZM driver adopts a two-stage architecture: a merged Continuous-Time Linear Equalizer (CTLE) and Variable Gain Amplifier (VGA) stage (Stage 1), and an output driver (OUTDRV) stage (Stage 2). By integrating CTLE and VGA functions (Fig. 3), the design removes the pre-driver stage, improves current reuse, and enhances drive capability. Stage 1 employs a Gilbert cell-based core amplifier (Fig. 5a) with programmable peaking via Re and Ce, enabling a transfer function with adjustable gain (η) and peaking characteristics (Eq. 1). A novel low-frequency gain adjustment branch (Fig. 6) mitigates nonlinearity induced by conductor loss (Fig. 4), resulting in a flattened frequency response (Eq. 2). Stage 2 uses a cascode open-drain output structure to achieve a 3 Vppd swing at 56 Gbaud while reducing power consumption. Simulations and measurements confirm the design’s performance, with key metrics including S-parameters, Total Harmonic Distortion (THD), and Transmitter Dispersion Eye Closure for PAM4 (TDECQ).  Results and Discussions  The driver achieves a maximum gain of 19.49 dB with 9.2 dB peaking and a 12.57 dB gain control range. Measured S-parameters (Fig. 9) confirm the 19.49 dB gain, 47 GHz bandwidth, and a 4.4 dB programmable peaking range. The low-frequency adjustment circuit reduces gain by 1.6 dB below 3 GHz (Fig. 9c), effectively compensating for distortion caused by the skin effect. THD remains below 3.5% across input swings of 300~800 mVppd (Fig. 10). Eye diagrams (Fig. 11) demonstrate 56 Gbaud PAM4 operation, achieving a 3 Vppd output swing with TDECQ below 2.57 dB. The driver achieves a power efficiency of 2 pJ/bit (225.23 mW per channel), outperforming previous designs (Table 1). The use of a single 3.3 V supply eliminates the need for external DC-DC converters, facilitating system integration. Compared with recent drivers [11,14–16], this work demonstrates the highest data rate (112 Gb/s via PAM4) implemented in a mature 130 nm process while maintaining the lowest power consumption per bit.  Conclusions  This study presents a high-performance, energy-efficient MZM driver designed for LPO-based 400 Gbps optical modules. Key contributions include the merged CTLE–VGA architecture for optimized current reuse, a low-frequency gain adjustment technique that mitigates skin effect distortion, and a cascode output stage that achieves high swing and linearity. Measured results are consistent with simulations, confirming 19.49 dB gain, 3 Vppd output swing, and 2 pJ/bit energy efficiency. 
The elimination of DSPs, compatibility with cost-effective BiCMOS technology, and improved power performance highlight the driver’s potential for deployment in next-generation data centers and high-speed optical interconnects.
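A generic way to see where CTLE peaking comes from: a zero placed below two poles boosts gain over the band between them. The sketch below evaluates the magnitude response of such a first-order model; the pole and zero values are illustrative placeholders, not the paper's circuit parameters (Eq. 1 itself is not reproduced in the abstract).

```python
import numpy as np

def ctle_gain_db(f, g_dc=1.0, f_z=3e9, f_p1=30e9, f_p2=60e9):
    """Magnitude response of a generic CTLE model:
    H(s) = g_dc * (1 + s/wz) / ((1 + s/wp1) * (1 + s/wp2)).
    Placing the zero below the poles boosts high frequencies (peaking)."""
    s = 2j * np.pi * f
    H = g_dc * (1 + s / (2 * np.pi * f_z)) / (
        (1 + s / (2 * np.pi * f_p1)) * (1 + s / (2 * np.pi * f_p2)))
    return 20 * np.log10(np.abs(H))

f = np.logspace(8, 11, 7)   # 100 MHz .. 100 GHz sweep
for fi, g in zip(f, ctle_gain_db(f)):
    print(f"{fi:10.3e} Hz  {g:6.2f} dB")
```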
Calculation and Testing Method of Gap Impedance for X-Band Klystron Multi-Gap Output Cavity
GUO Xin, ZHANG Zhiqiang, GU Honghong, LIANG Yuan, SHEN Bin
2025, 47(8): 2953-2962.   doi: 10.11999/JEIT250002
[Abstract](106) [FullText HTML](85) [PDF 4001KB](10)
Abstract:
  Objective  From a structural standpoint, the klystron is a narrowband device that realizes beam-wave interaction through a sequence of independent resonant cavities. Advances in simulation techniques and computational methods for klystrons, as well as advancements in electric vacuum materials, manufacturing processes, and related technologies, have continuously enhanced their power and bandwidth performance. For example, high-power wideband klystrons are now widely applied in radar and communication systems. X-band wideband klystrons have achieved megawatt-level output pulse power, offering substantial utility in a range of radar applications. To satisfy the bandwidth demands of microwave electronic systems, research into wideband klystron technologies is increasingly prioritized. Expanding the bandwidth of the klystron output section is therefore a critical technology in the development of broadband klystrons.  Methods  Current approaches to expanding the frequency bandwidth of klystrons primarily rely on techniques such as staggered tuning of resonant cavities, integration of waveguide filters at the output cavity, and utilization of overlapping mode configurations in a Multi-Gap Output Cavity (MGOC). The output section of high-frequency structures typically adopts a double-gap coupled cavity, for which gap impedance testing methods are relatively well established. Building on this foundation, a triple-gap coupled output cavity structure is developed, enabling further bandwidth enhancement. The flatness of the gap impedance across the operating band of an MGOC directly determines the gain and bandwidth performance of the klystron. Therefore, accurate calculation and testing of gap impedance are essential. This study proposes a method for calculating MGOC impedance based on cavity equivalent circuit theory. The MGOC is modeled as a resonant circuit comprising capacitive and inductive elements, and the gap impedance matrix is derived using the mesh current method. Based on microwave network theory, a corresponding experimental method for measuring MGOC impedance is also proposed. By analyzing the phase of the reflection coefficient at the output coupling port under various conditions, including all gaps open, single-gap short-circuits, and localized perturbations at individual gaps, the gap impedance within the frequency band of the cold test sample is determined. Using this theoretical framework, an X-band four-gap output cavity structure is designed. The gap impedance of the fabricated sample is measured to verify the validity of the proposed method.  Results and Discussions  The form of the MGOC impedance derived using equivalent circuit theory is presented (Equation 6). The experimental model of the X-band four-gap output cavity is constructed, and the optimized electrical parameters for each cavity are listed (Table 1). The calculated frequency bandwidth over which the internal impedance exceeds 3300 Ω in the X-band reaches 1200 MHz (Fig. 3). This represents a 30% increase compared to the triple-gap cavity and a twofold improvement over the double-gap cavity, meeting the expected design performance. The structural dimensions of the X-band four-gap output cavity are summarized (Table 2). A schematic of the MGOC modeled as an (n+1)-port microwave network is shown (Fig. 5). By solving for the impedance at the output coupling port, the relationship between the output port and other ports is obtained (Equation 13). The impedance for the all-gaps-open condition is given (Equation 14). 
The impedance for the case where a single gap is short-circuited and all other gaps remain open is derived (Equation 15), and the impedance corresponding to a perturbation in any single gap capacitance, with the remaining gaps open, is expressed (Equation 16). Based on transmission line theory, the impedance at each gap is calculated using these three sets of expressions (Equations 26~29). Using this theoretical framework, the X-band four-gap cavity prototype is fabricated and tested. To support structural optimization, the four fundamental mode field distributions of the four-gap cavity are first analyzed (Figs. 6 and 7). The parameters obtained via the equivalent circuit method are refined and adjusted for the cold test component. The final measured impedance distribution of the X-band four-gap cavity is presented (Fig. 9). The measured bandwidth, with a gap impedance exceeding 3400 Ω, reaches 1185 MHz, which closely agrees with the calculated result based on the equivalent circuit model.  Conclusions  This study proposes a method for calculating the gap impedance of a klystron MGOC based on the mesh current approach within the cavity equivalent circuit framework. A design scheme for an X-band four-gap output cavity is presented, and its impedance bandwidth is compared with those of triple-gap and double-gap cavities. The calculated bandwidth of the four-gap cavity is 33% greater than that of the triple-gap design and twice that of the double-gap counterpart. Building on this, a measurement method for MGOC gap impedance is developed using microwave network theory. Cold test experiments are conducted on an X-band four-gap cavity prototype. The measured results closely match the theoretical predictions, with the impedance exceeding 3400 Ω across nearly 1.2 GHz of bandwidth. Moreover, the proposed cold measurement technique enables the estimation of mutual impedance between cavity gaps by measuring impedance with any two gaps in a short-circuited state. This capability offers important insights into the coupling behavior among cavity modes. These findings provide a robust theoretical and experimental foundation for advancing broadband klystron technologies.
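The mesh-current formulation behind the gap impedance calculation can be illustrated on a toy model: n series-RLC meshes with nearest-neighbor inductive coupling, solved for the driving-point impedance seen at one gap. The element values below are placeholders chosen to resonate near the X-band, not the paper's optimized parameters (Tables 1 and 2).

```python
import numpy as np

def gap_impedance(freqs, n_gaps=4, L=1e-9, C=0.25e-12, R=5.0, Lm=0.1e-9):
    """Mesh-current sketch: each gap is a series-RLC mesh, neighboring
    meshes couple through a mutual inductance Lm. Solving Z @ I = V with
    V = e1 gives the driving-point impedance at gap 1 as 1 / inv(Z)[0,0]."""
    out = []
    for f in freqs:
        w = 2 * np.pi * f
        z_self = R + 1j * w * L + 1 / (1j * w * C)   # series RLC per mesh
        Z = np.diag(np.full(n_gaps, z_self, dtype=complex))
        for i in range(n_gaps - 1):                  # nearest-neighbor coupling
            Z[i, i + 1] = Z[i + 1, i] = 1j * w * Lm
        out.append(1.0 / np.linalg.inv(Z)[0, 0])
    return np.array(out)

freqs = np.linspace(8e9, 12e9, 9)   # X-band sweep (toy values)
for f, z in zip(freqs, gap_impedance(freqs)):
    print(f"{f/1e9:5.2f} GHz  |Z| = {abs(z):10.1f} Ohm")
```

Coupling between meshes splits the single resonance into multiple modes, which is the mechanism the multi-gap cavity exploits to flatten impedance over a wider band.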