Latest Articles

Articles in press have been peer-reviewed and accepted. They have not yet been assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Electromagnetic Finite-Difference Time-Domain Scattering Analysis of Multilayered/Porous Materials in Specific Geometric Meshing
ZHANG Yuxian, YANG Zijiang, HUANG Zhixiang, FENG Xiaoli, FENG Naixing, YANG Lixia
Available online, doi: 10.11999/JEIT250348
Abstract:
The Finite-Difference Time-Domain (FDTD) method is a widely used tool for analyzing the electromagnetic properties of dielectric media, but its application is often constrained by model complexity and mesh discretization. To enhance the efficiency of electromagnetic scattering simulations in multilayered/porous materials, this paper proposes an accelerated FDTD scheme. Computational geometry algorithms can be employed with the proposed method to rapidly generate Yee’s grids, utilizing a three-dimensional voxel array to define material distributions and field components. By exploiting the voxel characteristics, parallel algorithms are employed to efficiently compute Radar Cross Sections (RCS) for non-analytical geometries. In contrast to conventional volumetric mesh generation, which relies on analytic formulas, this work integrates ray-intersection techniques with Signed Distance Functions (SDFs). Calculations of tangent planes and intersection points minimize invalid traversals and reduce computational complexity, thus expediting grid-based electromagnetic parameter assignment for porous and irregular structures. The approach is applied to the RCS calculations of multilayered/porous models, demonstrating excellent consistency with results from popular commercial solvers (FEKO, CST, HFSS) while offering substantially higher efficiency. Numerical experiments confirm significant reductions in computation time and computer memory without compromising accuracy. Overall, the proposed acceleration scheme enhances the FDTD method’s ability to handle complex dielectric structures, providing an effective balance between computational speed and accuracy, and offering innovative solutions for rapid mesh generation and processing of complex internal geometries.
Objective   The FDTD method, a reliable approach for computing the electromagnetic properties of dielectric media, faces constraints in computational efficiency and accuracy due to model structure and mesh discretization. A major challenge in the field is achieving efficient electromagnetic scattering analysis with minimal computational resources while maintaining sufficient wavelength sampling resolution. To address this difficulty, we propose an FDTD-based electromagnetic analysis acceleration scheme that enhances simulation efficiency by significantly improving mesh generation and optimizing grid partitioning for complex multilayered/porous models.  Methods   In this study, Yee’s grids for complex materials are generated efficiently using computational geometry algorithms and a 3D voxel array to define material distribution and field components. A parallel algorithm leverages voxel data to accelerate RCS calculations for non-analytical geometries. Unlike conventional volumetric meshing methods that rely on analytic formulas, this approach integrates ray-intersection techniques with SDFs. Calculations of tangent planes and intersection points further reduce invalid traversals and geometric complexity, facilitating faster grid-based assignment of electromagnetic parameters. Numerical experiments validate that the method effectively supports porous and multilayered non-analytical structures, demonstrating both high efficiency and accuracy.  Results and Discussions   The accelerated volumetric meshing algorithm is validated using a Boeing 737 model, showing more than a 67.5% reduction in computation time across different resolutions. Efficiency decreases at very fine meshes because of heavier computational loads and suboptimal valid-grid ratios. The method is further evaluated on three multilayered/porous structures, achieving 85.55% faster computation and 9.8% lower memory usage compared with conventional FDTD.
In comparison with commercial solvers (FEKO, CST, HFSS), equivalent accuracy is maintained while runtimes are reduced by 87.58% and memory consumption by 81.6%. In all tested cases, errors remain below 6% relative to high-resolution FDTD, confirming that the proposed acceleration scheme provides both high efficiency and reliable accuracy.  Conclusions   In this study, we optimize volumetric mesh generation in FDTD through computational geometry algorithms. By combining ray-intersection techniques with reliable SDFs, the proposed approach efficiently manages internal cavities, while tangent-plane calculations minimize traversal operations and complexity, thereby accelerating scattering analysis. The scheme extends the applicability of FDTD to a broader range of dielectric structures and materials, delivering substantial savings in computation time and memory without compromising accuracy. Designed to support universal geometric model files, the framework shows strong potential for stealth optimization of multi-material structures and the development of electromagnetic scattering systems. It represents an important step toward integrating computational geometry with computational electromagnetics.
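The SDF-based voxel classification at the core of the meshing scheme can be illustrated with a minimal sketch. This is not the authors' implementation: the analytic sphere SDF, grid size, and material codes are illustrative assumptions standing in for the paper's ray-intersection handling of general geometry files.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

# Build a small 3D voxel grid (Yee-style cell centers) over [-1, 1]^3.
n = 32
axis = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
centers = np.stack([x, y, z], axis=-1)

# Classify every voxel by the sign of the SDF in one vectorized pass;
# for this analytic shape no ray traversal is needed at all.
d = sphere_sdf(centers, center=np.array([0.0, 0.0, 0.0]), radius=0.5)
material = np.where(d < 0.0, 1, 0)   # 1 = dielectric, 0 = background

print(material.sum(), "of", n**3, "voxels marked as dielectric")
```

For non-analytical meshes the same sign test applies, but the SDF values come from ray-intersection parity or nearest-surface queries rather than a closed-form expression.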
A Survey of Lightweight Techniques for Segment Anything Model
LUO Yichang, QI Xiyu, ZHANG Borui, SHI Hanru, ZHAO Yan, WANG Lei, LIU Shixiong
Available online, doi: 10.11999/JEIT250894
Abstract:
  Objective  The Segment Anything Model (SAM) demonstrates strong zero-shot generalization in image segmentation and sets a new direction for visual foundation models. The original SAM, especially the ViT-Huge version with about 637 million parameters, requires high computational resources and substantial memory. This restricts deployment in resource-limited settings such as mobile devices, embedded systems, and real-time tasks. Growing demand for efficient and deployable vision models has encouraged research on lightweight variants of SAM. Existing reviews describe applications of SAM, yet a structured summary of lightweight strategies across model compression, architectural redesign, and knowledge distillation is still absent. This review addresses this need by providing a systematic analysis of current SAM lightweight research, classifying major techniques, assessing performance, and identifying challenges and future research directions for efficient visual foundation models.  Methods  This review examines recent studies on SAM lightweight methods published in leading conferences and journals. The techniques are grouped into three categories based on their technical focus. The first category, Model Compression and Acceleration, covers knowledge distillation, network pruning, and quantization. The second category, Efficient Architecture Design, replaces the ViT backbone with lightweight structures or adjusts attention mechanisms. The third category, Efficient Feature Extraction and Fusion, refines the interaction between the image encoder and prompt encoder. A comparative assessment is conducted for representative studies, considering model size, computational cost, inference speed, and segmentation accuracy on standard benchmarks (Table 3).  Results and Discussions  The reviewed models achieve clear gains in inference speed and parameter efficiency. 
MobileSAM reduces the model to 9.6 M parameters, and Lite-SAM reaches up to 16× acceleration while maintaining suitable segmentation accuracy. Approaches based on knowledge distillation and hybrid design support generalization across domains such as medical imaging, video segmentation, and embedded tasks. Although accuracy and speed still show a degree of tension, the selection of a lightweight strategy depends on the intended application. Challenges remain in prompt design, multi-scale feature fusion, and deployment on low-power hardware platforms.  Conclusions  This review provides an overview of the rapidly developing field of SAM lightweight research. The development of efficient SAM models is a multifaceted challenge that requires a combination of compression, architectural innovation, and optimization strategies. Current studies show that real-time performance on edge devices can be achieved with a small reduction in accuracy. Although progress is evident, challenges remain in handling complex scenarios, reducing the cost of distillation data, and establishing unified evaluation benchmarks. Future research is expected to emphasize more generalizable lightweight architectures, explore data-free or few-shot distillation approaches, and develop standardized evaluation protocols that consider both accuracy and efficiency.
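Knowledge distillation, the most widely used of the compression techniques surveyed, can be sketched with the standard temperature-scaled KL loss. The logits, temperature, and loss form below are generic textbook choices, not drawn from any specific SAM variant discussed above.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax with the usual max-shift for stability."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2 as in
    standard knowledge distillation."""
    p = softmax(teacher_logits, T)   # soft targets from the heavy teacher
    q = softmax(student_logits, T)   # lightweight student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

teacher = np.array([[2.0, 0.5, -1.0]])
aligned = np.array([[2.1, 0.4, -0.9]])   # student close to the teacher
far_off = np.array([[-1.0, 2.0, 0.5]])   # student far from the teacher
print(distillation_loss(aligned, teacher), "<", distillation_loss(far_off, teacher))
```

Minimizing this loss pulls the student's output distribution toward the teacher's; methods such as MobileSAM apply it (with image features rather than class logits) to transfer the ViT-H encoder's behavior into a much smaller backbone.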
Key Technologies for Low-Altitude Internet Networks: Architecture, Security, and Optimization
WANG Yuntao, SU Zhou, GAO Yuan, BA Jianle
Available online, doi: 10.11999/JEIT250947
Abstract:
Low-Altitude Intelligent Networks (LAINs) function as a core infrastructure for the emerging low-altitude digital economy by connecting humans, machines, and physical objects through the integration of manned and unmanned aircraft with ground networks and facilities. This paper provides a comprehensive review of recent research on LAINs from four perspectives: network architecture, resource optimization, security threats and protection, and large model-enabled applications. First, existing standards, general architecture, key characteristics, and networking modes of LAINs are investigated. Second, critical issues related to airspace resource management, spectrum allocation, computing resource scheduling, and energy optimization are discussed. Third, existing/emerging security threats across sensing, network, application, and system layers are assessed, and multi-layer defense strategies in LAINs are reviewed. Furthermore, the integration of large model technologies with LAINs is also analyzed, highlighting their potential in task optimization and security enhancement. Future research directions are discussed to provide theoretical foundations and technical guidance for the development of efficient, secure, and intelligent LAINs.  Significance   LAINs support the low-altitude economy by enabling the integration of manned and unmanned aircraft with ground communication, computing, and control networks. By providing real-time connectivity and collaborative intelligence across heterogeneous platforms, LAINs support applications such as precision agriculture, public safety, low-altitude logistics, and emergency response. However, LAINs continue to face challenges created by dynamic airspace conditions, heterogeneous platforms, and strict real-time operational requirements. 
The development of large models also presents opportunities for intelligent resource coordination, proactive defense, and adaptive network management, which signals a shift in the design and operation of low-altitude networks.  Progress  Recent studies on LAINs have reported progress in network architecture, resource optimization, security protection, and large model integration. Architecturally, hierarchical and modular designs are proposed to integrate sensing, communication, and computing resources across air, ground, and satellite networks, which enables scalable and interoperable operations. In system optimization research, attention is given to airspace resource management, spectrum allocation, computing offloading, and energy-efficient scheduling through distributed optimization and AI-driven orchestration methods. In security research, multi-layer defense frameworks are developed to address sensing-layer spoofing, network-layer intrusions, and application-layer attacks through cross-layer threat intelligence and proactive defense mechanisms. Large Language Models (LLMs), Vision-Language Models (VLMs), and Multimodal LLMs (MLLMs) also support intelligent task planning, anomaly detection, and autonomous decision-making in complex low-altitude environments, which enhances the resilience and operational efficiency of LAINs.  Conclusions  This survey provides a comprehensive review of the architecture, security mechanisms, optimization techniques, and large model applications in LAINs. The challenges in multi-dimensional resource coordination, cross-layer security protection, and real-time system adaptation are identified, and existing or potential approaches to address these challenges are analyzed. By synthesizing recent research on architectural design, system optimization, and security defense, this work offers a unified perspective for researchers and practitioners aiming to build secure, efficient, and scalable LAIN systems. 
The findings emphasize the need for integrated solutions that combine algorithmic intelligence, system engineering, and architectural innovation to meet future low-altitude network demands.  Prospects  Future research on LAINs is expected to advance the integration of architecture design, intelligent optimization, security defense, and privacy preservation technologies to meet the demands of rapidly evolving low-altitude ecosystems. Key directions include developing knowledge-driven architectures for cross-domain semantic fusion, service-oriented network slicing, and distributed autonomous decision-making. Furthermore, research should also focus on proactive cross-layer security mechanisms supported by large models and intelligent agents, efficient model deployment through AI-hardware co-design and hierarchical computing architectures, and improved multimodal perception and adaptive decision-making to strengthen system resilience and scalability. In addition, establishing standardized benchmarks, open-source frameworks, and realistic testbeds is essential to accelerate innovation and ensure secure, reliable, and intelligent deployment of LAIN systems in real-world environments.
A Learning-Based Security Control Method for Cyber-Physical Systems Based on False Data Detection
MIAO Jinzhao, LIU Jinliang, SUN Le, ZHA Lijuan, TIAN Engang
Available online, doi: 10.11999/JEIT250537
Abstract:
  Objective  Cyber-Physical Systems (CPS) constitute the backbone of critical infrastructures and industrial applications, but the tight coupling of cyber and physical components renders them highly susceptible to cyberattacks. False data injection attacks are particularly dangerous because they compromise sensor integrity, mislead controllers, and can trigger severe system failures. Existing control strategies often assume reliable sensor data and lack resilience under adversarial conditions. Furthermore, most conventional approaches decouple attack detection from control adaptation, leading to delayed or ineffective responses to dynamic threats. To overcome these limitations, this study develops a unified secure learning control framework that integrates real-time attack detection with adaptive control policy learning. By enabling the dynamic identification and mitigation of false data injection attacks, the proposed method enhances both stability and performance of CPS under uncertain and adversarial environments.  Methods  To address false data injection attacks in CPS, this study proposes an integrated secure control framework that combines attack detection, state estimation, and adaptive control strategy learning. A sensor grouping-based security assessment index is first developed to detect anomalous sensor data in real time without requiring prior knowledge of attacks. Next, a multi-source sensor fusion estimation method is introduced to reconstruct the system’s true state, thereby improving accuracy and robustness under adversarial disturbances. Finally, an adaptive learning control algorithm is designed, in which dynamic weight updating via gradient descent approximates the optimal control policy online. This unified framework enhances both steady-state performance and resilience of CPS against sophisticated attack scenarios. 
Its effectiveness and security performance are validated through simulation studies under diverse false data injection attack settings.  Results and Discussions  Simulation results confirm the effectiveness of the proposed secure adaptive learning control framework under multiple false data injection attacks in CPS. As shown in Fig. 1, system states rapidly converge to steady values and maintain stability despite sensor attacks. Fig. 2 demonstrates that the fused state estimator tracks the true system state with greater accuracy than individual local estimators. In Fig. 3, the compensated observation outputs align closely with the original, uncorrupted measurements, indicating precise attack estimation. Fig. 4 shows that detection indicators for sensor groups 2–5 increase sharply during attack intervals, while unaffected sensors remain near zero, verifying timely and accurate detection. Fig. 5 further confirms that the estimated attack signals closely match the true injected values. Finally, Fig. 6 compares different control strategies, showing that the proposed method achieves faster stabilization and smaller state deviations. Together, these results demonstrate robust control, accurate state estimation, and real-time detection under unknown attack conditions.  Conclusions  This study addresses secure perception and control in CPS under false data injection attacks by developing an integrated adaptive learning control framework that unifies detection, estimation, and control. A sensor-level anomaly detection mechanism is introduced to identify and localize malicious data, substantially enhancing attack detection capability. The fusion-based state estimation method further improves reconstruction accuracy of true system states, even when observations are compromised. At the control level, an adaptive learning controller with online weight adjustment enables real-time approximation of the optimal control policy without requiring prior knowledge of the attack model. 
Future research will extend the proposed framework to broader application scenarios and evaluate its resilience under diverse attack environments.
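The pattern of a per-sensor-group security indicator feeding a fused state estimate can be sketched generically. The median/MAD residual test, threshold, and averaging fusion below are illustrative assumptions, not the paper's detection index or estimator.

```python
import numpy as np

def detect_and_fuse(measurements, threshold=3.0):
    """Flag sensor groups whose reading deviates from the robust consensus,
    then fuse only the trusted groups into a state estimate."""
    measurements = np.asarray(measurements, dtype=float)
    median = np.median(measurements)
    mad = np.median(np.abs(measurements - median)) + 1e-9  # robust spread
    scores = np.abs(measurements - median) / mad           # per-group indicator
    attacked = scores > threshold                          # anomaly decision
    fused = measurements[~attacked].mean()                 # fused state estimate
    return fused, attacked

# Five sensor groups observing the same scalar state (true value 1.0);
# group index 2 carries an injected false-data offset.
readings = [1.02, 0.98, 5.0, 1.01, 0.99]
fused, attacked = detect_and_fuse(readings)
print("fused estimate:", round(fused, 3), "attacked groups:", np.where(attacked)[0])
```

The key property mirrored here is that detection requires no prior attack model: only consistency across redundant sensor groups is exploited.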
Joint Mask and Multi-Frequency Dual Attention GAN Network for CT-to-DWI Image Synthesis in Acute Ischemic Stroke
ZHANG Zehua, ZHAO Ning, WANG Shuai, WANG Xuan, ZHENG Qiang
Available online, doi: 10.11999/JEIT250643
Abstract:
  Objective  In the clinical management of Acute Ischemic Stroke (AIS), Computed Tomography (CT) and Diffusion-Weighted Imaging (DWI) serve complementary roles at different stages. CT is widely applied for initial evaluation due to its rapid acquisition and accessibility, but it has limited sensitivity in detecting early ischemic changes, which can result in diagnostic uncertainty. In contrast, DWI demonstrates high sensitivity to early ischemic lesions, enabling visualization of diffusion-restricted regions soon after symptom onset. However, DWI acquisition requires a longer time, is susceptible to motion artifacts, and depends on scanner availability and patient cooperation, thereby reducing its clinical accessibility. The limited availability of multimodal imaging data remains a major challenge for timely and accurate AIS diagnosis. Therefore, developing a method capable of rapidly and accurately generating DWI images from CT scans has important clinical significance for improving diagnostic precision and guiding treatment planning. Existing medical image translation approaches primarily rely on statistical image features and overlook anatomical structures, which leads to blurred lesion regions and reduced structural fidelity.  Methods  This study proposes a Joint Mask and Multi-Frequency Dual Attention Generative Adversarial Network (JMMDA-GAN) for CT-to-DWI image synthesis to assist in the diagnosis and treatment of ischemic stroke. The approach incorporates anatomical priors from brain masks and adaptive multi-frequency feature fusion to improve image translation accuracy. JMMDA-GAN comprises three principal modules: a mask-guided feature fusion module, a multi-frequency attention encoder, and an adaptive fusion weighting module. The mask-guided feature fusion module integrates CT images with anatomical masks through convolution, embedding spatial priors to enhance feature representation and texture detail within brain regions and ischemic lesions. 
The multi-frequency attention encoder applies Discrete Wavelet Transform (DWT) to decompose images into low-frequency global components and high-frequency edge components. A dual-path attention mechanism facilitates cross-scale feature fusion, reducing high-frequency information loss and improving structural detail reconstruction. The adaptive fusion weighting module combines convolutional neural networks and attention mechanisms to dynamically learn the relative importance of input features. By assigning adaptive weights to multi-scale features, the module selectively enhances informative regions and suppresses redundant or noisy information. This process enables effective integration of low- and high-frequency features, thereby improving both global contextual consistency and local structural precision.  Results and Discussions  Extensive experiments were performed on two independent clinical datasets collected from different hospitals to assess the effectiveness of the proposed method. JMMDA-GAN achieved Mean Squared Error (MSE) values of 0.0097 and 0.0059 on Clinical Dataset 1 and Clinical Dataset 2, respectively, exceeding state-of-the-art models by reducing MSE by 35.8% and 35.2% compared with ARGAN. The proposed network reached peak Signal-to-Noise Ratio (PSNR) values of 26.75 and 28.12, showing improvements of 30.7% and 7.9% over the best existing methods. For Structural Similarity Index (SSIM), JMMDA-GAN achieved 0.753 and 0.844, indicating superior structural preservation and perceptual quality. Visual analysis further demonstrates that JMMDA-GAN restores lesion morphology and fine texture features with higher fidelity, producing sharper lesion boundaries and improved structural consistency compared with other methods. Cross-center generalization and multi-center mixed experiments confirm that the model maintains stable performance across institutions, highlighting its robustness and adaptability in clinical settings. 
Parameter sensitivity analysis shows that the combination of Haar wavelet and four attention heads achieves an optimal balance between global structural retention and local detail reconstruction. Moreover, superpixel-based gray-level correlation experiments demonstrate that JMMDA-GAN exceeds existing models in both local consistency and global image quality, confirming its capacity to generate realistic and diagnostically reliable DWI images from CT inputs.  Conclusions  This study proposes a novel JMMDA-GAN designed to enhance lesion and texture detail generation by incorporating anatomical structural information. The method achieves this through three principal modules. (1) The mask-guided feature fusion module effectively integrates anatomical structure information, with particular optimization of the lesion region. The mask-guided network focuses on critical lesion features, ensuring accurate restoration of lesion morphology and boundaries. By combining mask and image data, the method preserves the overall anatomical structure while enhancing lesion areas, preventing boundary blurring and texture loss commonly observed in traditional approaches, thereby improving diagnostic reliability. (2) The multi-frequency feature fusion module jointly optimizes low- and high-frequency features to enhance image detail. This integration preserves global structural integrity while refining local features, producing visually realistic and high-fidelity images. (3) The adaptive fusion weighting module dynamically adjusts the learning strategy for frequency-domain features according to image content, enabling the network to manage texture variations and complex anatomical structures effectively, thereby improving overall image quality. Through the coordinated function of these modules, the proposed method enhances image realism and diagnostic precision. 
Experimental results demonstrate that JMMDA-GAN exceeds existing advanced models across multiple clinical datasets, highlighting its potential to support clinicians in the diagnosis and management of AIS.
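The DWT decomposition used by the multi-frequency attention encoder can be illustrated with a one-level 2D Haar transform, the wavelet the sensitivity analysis found optimal. The transform below is the standard construction; the test image with a sharp edge is an illustrative assumption.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar DWT: returns the low-frequency approximation (LL)
    and the high-frequency detail bands (LH, HL, HH). Even dimensions assumed."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row-pair average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row-pair difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0      # smooth in both directions
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0      # horizontal detail (vertical edges)
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return ll, lh, hl, hh

img = np.zeros((8, 8))
img[:, 3:] = 1.0                              # sharp vertical edge
ll, lh, hl, hh = haar_dwt2(img)
print("LL mean:", ll.mean(), "max |LH| at the edge:", np.abs(lh).max())
```

The edge registers in the high-frequency LH band while LL keeps the smooth global structure, which is exactly the separation the dual-path attention exploits: global context from LL, boundary detail from the detail bands.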
Tensor-Train Decomposition for Lightweight Liver Tumor Segmentation
MA Jinlin, YANG Jipeng
Available online, doi: 10.11999/JEIT250293
Abstract:
  Objective  Convolutional Neural Networks (CNNs) have recently achieved notable progress in medical image segmentation. Their conventional convolution operations, however, remain constrained by locality, which reduces their ability to capture global contextual information. Researchers have pursued two main strategies to address this limitation. Hybrid CNN–Transformer architectures use self-attention to model long-range dependencies, and this markedly improves segmentation accuracy. State-space models such as the Mamba series reduce computational cost and retain global modeling capacity, and they also show favorable scalability. Although CNN–Transformer models remain computationally demanding for real-time use, Mamba-based approaches still face challenges such as boundary blur and parameter redundancy when segmenting small targets and low-contrast regions. Lightweight network design has therefore become a research focus. Existing lightweight methods, however, still show limited segmentation accuracy for liver tumor targets with very small sizes and highly complex boundaries. This paper proposes an efficient lightweight method for liver tumor segmentation that aims to meet the combined requirements of high accuracy and real-time performance for small targets with complex boundaries.  Methods  The proposed method integrates three strategies. A Tensor-Train Multi-Scale Convolutional Attention (TT-MSCA) module is designed to improve segmentation accuracy for small targets and complex boundaries. This module optimizes multi-scale feature fusion through a TT_Layer and employs tensor decomposition to integrate feature information across scales, which supports more accurate identification and segmentation of tumor regions in challenging images. A feature extraction module with a multi-branch residual structure, termed the IncepRes Block, strengthens the model’s capacity to capture global contextual information. 
Its parallel multi-branch design processes features at several scales and enriches feature representation at a relatively low computational cost. All standard 3×3 convolutions are then decoupled into two consecutive strip convolutions. This reduces the number of parameters and computational cost while preserving feature extraction capacity. The combination of these modules allows the method to improve segmentation accuracy and maintain high efficiency, and it demonstrates strong performance for small targets and blurry boundary regions.  Results and Discussions  Experiments on the LiTS2017 and 3Dircadb datasets show that the proposed method reaches Dice coefficients of 98.54% and 97.95% for liver segmentation, and 94.11% and 94.35% for tumor segmentation. Ablation studies show that the TT-MSCA module and the IncepRes Block improve segmentation performance with only a modest computational cost, and the SC Block reduces computational cost while accuracy is preserved (Table 2). When the TT-MSCA module is inserted into the reduced U-Net on the LiTS2017 dataset, the tumor Dice and IoU reach 93.73% and 83.60%. These values are second only to the final model. On the 3Dircadb dataset, adding the SC Block after TT-MSCA produces a slight accuracy decrease but reduces GFLOPs by a factor of 4.15. Compared with the original U-Net, the present method improves liver IoU by 3.35% and tumor IoU by 5.89%. The TT-MSCA module also consistently exceeds the baseline MSCA module. It increases liver and tumor IoU by 2.59% and 1.95% on LiTS2017, and by 2.03% and 3.13% on 3Dircadb (Table 5). These results show that the TT_Layer strengthens global context perception and fine-detail representation through multi-scale feature fusion. The proposed network contains 0.79 M parameters and 1.43 GFLOPs, which represents a 74.9% reduction in parameters compared with CMUNeXt (3.15 M).
Real-time performance evaluation records 156.62 FPS, more than three times the 50.23 FPS of the vanilla U-Net (Table 6). Although accuracy decreases slightly in a few isolated metrics, the overall accuracy–compression balance is improved, and the method demonstrates strong practical value for lightweight liver tumor segmentation.  Conclusions  This paper proposes an efficient liver tumor segmentation method that improves segmentation accuracy and meets real-time requirements. The TT-MSCA module enhances recognition of small targets and complex boundaries through the integration of spatial and channel attention. The IncepRes Block strengthens the network’s perception of liver tumors of different sizes. The decoupling of standard 3×3 convolutions into two consecutive strip convolutions reduces the parameter count and computational cost while preserving feature extraction capacity. Experimental evidence shows that the method reduces errors caused by complex boundaries and small tumor sizes and can satisfy real-time deployment needs. It offers a practical technical option for liver tumor segmentation. The method requires many training iterations to reach optimal data fitting, and future work will address improvements in convergence speed.
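The saving from decoupling a 3×3 convolution into consecutive 3×1 and 1×3 strip convolutions can be checked with a simple parameter count (weights only, bias omitted; the channel sizes below are illustrative, not the network's actual widths).

```python
def conv_params(kh, kw, cin, cout):
    """Weight count of a single 2D convolution layer: kh * kw * cin * cout."""
    return kh * kw * cin * cout

cin, cout = 64, 64
full = conv_params(3, 3, cin, cout)                                    # standard 3x3
strip = conv_params(3, 1, cin, cout) + conv_params(1, 3, cout, cout)   # 3x1 then 1x3
print(f"3x3: {full}  strip pair: {strip}  ratio: {strip / full:.2f}")
```

When input and output channels match, the strip pair needs 6/9 of the weights of the full kernel, and the same ratio applies to multiply-accumulate operations, which is where the GFLOPs reduction reported above comes from.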
Energy Consumption Optimization of Cooperative NOMA Secure Offload for Mobile Edge Computing
CHEN Jian, MA Tianrui, YANG Long, LV Lu, XU Yongjun
Available online, doi: 10.11999/JEIT250606
Abstract:
  Objective  Mobile Edge Computing (MEC) significantly enhances the computational capabilities and response speed of mobile devices by migrating functions such as computing and caching to the network edge. The application of Non-Orthogonal Multiple Access (NOMA) further supports the realization of ultra-high spectral efficiency and massive connectivity. However, given the broadcast nature of wireless channels, the offloading transmission process in MEC may be vulnerable to eavesdropping attacks. Therefore, the integration of physical layer security techniques into a NOMA-MEC system is proposed to ensure the security of the offloading process. Existing research primarily focuses on optimizing system performance metrics such as energy consumption, latency, and throughput, or enhancing system security through NOMA-based co-channel interference and cooperative interference. However, the joint impact of these two aspects, performance and security, is largely overlooked. To reduce the energy consumption of secure offloading in MEC, this paper designs a secure offloading scheme based on cooperative NOMA. Compared with existing works, the novelty of this paper lies in the cooperative nodes providing both forwarding and computational assistance simultaneously. By leveraging joint local computing between users and cooperative nodes, the proposed scheme guarantees the security of the offloading process while reducing system energy consumption.  Methods  This paper investigates the joint design of computational and communication resource allocation schemes for nodes, dividing the offloading process into two stages: NOMA offloading and cooperative offloading. It also considers the offloading strategies of different nodes in different stages and formulates an optimization problem that minimizes the system-wide weighted total energy consumption under secrecy outage constraints.
To address this multi-variable coupled and non-convex optimization problem, the secrecy transmission rate constraints and the secrecy outage probability constraints, which are given in a complex probabilistic form, are first transformed. The original optimization problem is then decomposed into two subproblems: slot and task allocation, and power allocation. For the non-convex power allocation subproblem, the non-convex constraints are handled through bilinear substitution and successive convex approximation. Ultimately, an alternating iterative resource allocation algorithm is proposed, which adjusts the load, power, and slot allocation between users and cooperative nodes according to the channel state, thereby minimizing energy consumption while satisfying security constraints.  Results and Discussions  Theoretical analysis and simulation results demonstrate that the proposed scheme converges rapidly and has low complexity. Compared with existing NOMA full-offloading schemes, assisted computing schemes, and NOMA collaborative interference schemes, the proposed offloading scheme significantly reduces system energy consumption and achieves higher load capacity under the same secrecy constraints. Moreover, the scheme exhibits strong robustness, as it is less affected by poor channel transmission conditions and deteriorating eavesdropping conditions.  Conclusions  This paper reveals the inherent trade-off between system energy consumption and security constraints. During the offloading process in Mobile Edge Computing, communication, computation, and security are not mutually exclusive. Instead, performance and security can be enhanced simultaneously through the rational utilization of cooperative nodes. When cooperative nodes are available, Non-Orthogonal Multiple Access and forwarding cooperation can mitigate the impact of poor channel conditions or high eavesdropping risks on security and performance.
Moreover, cooperative nodes can share the local computational burden of users to improve system performance. Joint local computation between users and cooperative nodes can also avoid the eavesdropping risks associated with long-distance wireless transmission. In other words, the secure offloading scenario in MEC is not merely a Physical Layer Security issue in the transmission process. It also encompasses the complex coupling relationship unique to MEC between communication and computation. By leveraging idle resources in the network, the security performance of the system can be enhanced through cooperative communication and computation among idle nodes, while maintaining system performance.
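The secrecy outage constraint at the heart of this formulation can be made concrete numerically. The sketch below estimates the outage probability P([C_main - C_eve]+ < R_s) by Monte Carlo under independent Rayleigh fading; the channel model, SNR values, and function names are illustrative assumptions, not taken from the paper, which instead transforms the probabilistic constraint into a deterministic one.

```python
import numpy as np

def secrecy_outage_prob(snr_main_db, snr_eve_db, target_rate,
                        n_trials=200_000, seed=0):
    """Monte Carlo estimate of P([C_main - C_eve]^+ < R_s) under
    independent Rayleigh fading (illustrative, not the paper's model)."""
    rng = np.random.default_rng(seed)
    snr_m = 10.0 ** (snr_main_db / 10.0)
    snr_e = 10.0 ** (snr_eve_db / 10.0)
    h2 = rng.exponential(1.0, n_trials)   # |h_main|^2 ~ Exp(1)
    g2 = rng.exponential(1.0, n_trials)   # |h_eve|^2  ~ Exp(1)
    c_s = np.log2(1.0 + snr_m * h2) - np.log2(1.0 + snr_e * g2)
    # for R_s > 0, [x]^+ < R_s  iff  x < R_s
    return float(np.mean(c_s < target_rate))
```

A secrecy outage constraint of the kind the paper handles then reads `secrecy_outage_prob(...) <= epsilon` for a prescribed outage tolerance epsilon.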
A Survey on Physical Layer Security in Near-Field Communication
XU Yongjun, LI Jing, LUO Dongxin, WANG Ji, LI Xingwang, YANG Long, CHEN Li
Available online  , doi: 10.11999/JEIT250336
Abstract:
  Significance   Traditional wireless communication systems have relied on far-field plane-wave models to support wide-area coverage and long-distance transmission. However, emerging Sixth-Generation (6G) applications—such as extended reality, holographic communication, pervasive intelligence, and smart factories—demand ultra-high bandwidth, ultra-low latency, and sub-centimeter-level localization accuracy. These requirements exceed the spatial multiplexing gains and interference suppression achievable under far-field assumptions. Enabled by extremely large-scale antenna arrays and terahertz technologies, the near-field region has expanded to hundreds of meters, where spherical-wave propagation enables precise beam focusing and flexible spatial resource management. The additional degrees of freedom in the angle and distance domains, however, give rise to new Physical Layer Security (PLS) challenges, including joint angle–distance eavesdropping, beam-split-induced information leakage caused by frequency-dependent focusing, and security–interference conflicts in hybrid near- and far-field environments. This paper provides a comprehensive survey of near-field PLS techniques, advancing theoretical understanding of spherical-wave propagation and associated threat models while offering guidance for designing robust security countermeasures and informing the development of future 6G security standards.  Progress   This paper presents a comprehensive survey of recent advances in PLS for near-field communications in 6G networks, with an in-depth discussion of key enabling technologies and optimization methodologies. Core security techniques, including beam focusing, Artificial Noise (AN), and multi-technology integration, are first examined in terms of their security objectives. 
Beam focusing exploits ultra-large-scale antenna arrays and the spherical-wave propagation characteristics of near-field communication to achieve precise spatial confinement, thereby reducing information leakage. AN introduces deliberately crafted noise toward undesired directions to hinder eavesdropping. Multi-technology integration combines terahertz communications, Reconfigurable Intelligent Surfaces (RIS), and Integrated Sensing And Communication (ISAC), markedly enhancing overall security performance. Tailored strategies are then analyzed for different transmission environments, including Line-of-Sight (LoS), Non-Line-of-Sight (NLoS), and hybrid near–far-field conditions. In LoS scenarios, beamforming optimization strengthens interference suppression. In NLoS scenarios, RIS reconstructs transmission links, complicating unauthorized reception. For hybrid near–far-field environments, multi-beam symbol-level precoding spatially distinguishes users and optimizes beamforming patterns, ensuring robust security for mixed-distance user groups. Finally, critical challenges are highlighted, including complex channel modeling, tradeoffs between security and performance, and interference management in converged multi-network environments. Promising directions for future research are also identified, such as Artificial Intelligence (AI)-assisted security enhancement, cooperative multi-technology schemes, and energy-efficient secure communications in near-field systems.  Conclusions  This paper provides a comprehensive survey of PLS techniques for near-field communications, with particular emphasis on enabling technologies and diverse transmission scenarios. The fundamentals and system architecture of near-field communications are first reviewed, highlighting their distinctions from far-field systems and their unique channel characteristics. 
Representative PLS approaches are then examined, including beam focusing, AN injection, and multi-technology integration with RIS and ISAC. Secure transmission strategies are further discussed for LoS, NLoS, and hybrid near–far-field environments. Finally, several open challenges are identified, such as accurate modeling of complex channels, balancing security and performance, and managing interference in multi-network integration. Promising research directions are also outlined, including hybrid near–far-field design and AI-enabled security. These directions are expected to provide theoretical foundations for advancing and standardizing near-field communication security in future 6G networks.  Prospects   Research on PLS for near-field communications remains at an early stage, with no unified or systematic framework established to date. As communication scenarios become increasingly diverse and complex, future studies should prioritize hybrid far-field and near-field environments, where channel coupling and user heterogeneity raise new security challenges. AI-driven PLS techniques show strong potential for adaptive optimization and improved resilience against adversarial threats. In parallel, integrating near-field PLS with advanced technologies such as RIS and ISAC can deliver joint improvements in security, efficiency, and functionality. Moreover, low-power design will be essential to balance security performance with energy efficiency, enabling the development of high-performance, intelligent, and sustainable near-field secure communication systems.
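The distance-domain selectivity that near-field PLS exploits can be illustrated with a minimal spherical-wave beam-focusing sketch. The array geometry, carrier frequency, and function names below are illustrative assumptions rather than parameters from any surveyed work.

```python
import numpy as np

def nearfield_steering(n, spacing, wavelength, r, theta):
    """Unit-norm spherical-wave (near-field) response of a ULA on the
    x-axis toward a point at range r and angle theta from broadside."""
    pos = (np.arange(n) - (n - 1) / 2) * spacing
    # exact element-to-point distances (spherical wavefront, no plane-wave approx.)
    dist = np.sqrt(r**2 + pos**2 - 2 * r * pos * np.sin(theta))
    return np.exp(-2j * np.pi * (dist - r) / wavelength) / np.sqrt(n)

def gain(w, a):
    """Normalized beamforming gain |w^H a|^2 (1.0 at a perfect match)."""
    return abs(np.vdot(w, a)) ** 2

# 256-element half-wavelength ULA at 30 GHz, focused at 10 m broadside
n, lam = 256, 0.01
a_focus = nearfield_steering(n, lam / 2, lam, r=10.0, theta=0.0)
```

A matched beamformer w = a_focus attains gain 1 at (10 m, 0°) but a sharply reduced gain at (100 m, 0°) despite the identical angle; this range selectivity is what a far-field plane-wave beam cannot provide and what near-field beam focusing leverages against eavesdroppers at the same angle but a different distance.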
A Cross-Dimensional Collaborative Framework for Header-Metadata-Driven Encrypted Traffic Identification
WANG Menghan, ZHOU Zhengchun, JI Qingbing
Available online  , doi: 10.11999/JEIT250434
Abstract:
  Objective  With the widespread adoption of network communication encryption technologies, encrypted traffic identification has become a critical problem in network security. Traditional identification methods based on payload content face the risk of feature invalidation due to the continuous evolution of encryption algorithms, leading to detection blind spots in dynamic network environments. Meanwhile, the structured information embedded in packet headers, an essential carrier for protocol interaction, remains underutilized. Furthermore, as encryption protocols evolve, existing encrypted traffic identification approaches encounter limitations such as poor feature interpretability and weak model robustness against adversarial attacks. To address these challenges, this paper proposes a cross-dimensional collaborative identification framework for encrypted traffic, driven by header metadata features. The framework systematically reveals and demonstrates the dominant role of header features in encrypted traffic identification, overcoming the constraints of single-perspective analyses and reducing dependence on payload data. It further enables the assessment of deep model performance boundaries and decision credibility. Through effective feature screening and pruning, redundant attributes are eliminated, enhancing the framework’s anti-interference capability in encrypted scenarios. This approach reduces model complexity while improving interpretability and robustness, facilitating the design of lighter and more reliable encrypted traffic identification models.  Methods  This study performs a three-dimensional analysis including (1) network traffic feature selection and identification performance, (2) quantitative evaluation of feature importance in classification, and (3) assessment of model robustness under adversarial perturbations. 
First, the characteristics, differences, and effects on identification performance are compared among three forms of encrypted traffic packets using a One-Dimensional Convolutional Neural Network (1D-CNN). This comparison verifies the dominant role of header features in encrypted traffic identification. Second, two explainable algorithms, Layer-wise Relevance Propagation (LRP) and Deep Taylor Decomposition (DTD), are employed to further confirm the essential contribution of header features to network traffic classification. The relative importance of header and payload features is quantified from two perspectives: (i) the relevance of backpropagation and (ii) the contribution coefficients derived from Taylor series expansion, thereby enhancing feature interpretability. Finally, adversarial attack experiments are conducted using Projected Gradient Descent (PGD) and random perturbations. By injecting carefully constructed adversarial perturbation data into the initial and terminal parts of the payload, or by adding randomly generated noise to produce adversarial traffic, the study examines the effect of these perturbations on model decision-making. This analysis evaluates the stability and anti-interference capabilities of the encrypted traffic identification model under adversarial conditions.  Results and Discussions  Comparative experiments conducted on the ISCXVPN2016 and ISCXTor2016 datasets yield three key findings. (1) Recognition performance. The model based solely on header features achieves an F1 score up to 6% higher than that of the model using complete traffic, and up to 61% higher than that of the model using only payload features. These results verify that header features possess irreplaceable significance in encrypted traffic identification. The structural information embedded in headers plays a dominant role in enabling the model to accurately classify traffic types. 
Even without payload data, high identification accuracy can be achieved using header information alone (Figure 2, Table 4). (2) Interpretability evaluation. The LRP and DTD methods are used to quantify the contribution of header features to model classification. The correlation between header features and classification performance is markedly higher than that of payload features, with the average share of the relevance score up to 89.8% greater (Figures 3~4, Table 5). This result is highly consistent with the classification behavior of the 1D-CNN, further confirming the critical importance and dominant influence of header features in encrypted traffic identification. (3) Anti-interference robustness. The combined Header–Payload model exhibits strong robustness under adversarial attacks. Particularly under low-bandwidth conditions, the model incorporating header features shows a markedly higher maximum performance retention rate under equivalent bandwidth perturbation than the pure payload model, with the maximum difference reaching 98.46%. This finding confirms the essential role of header features in enhancing model robustness (Figures 5~6). Header-based models maintain stable recognition performance, whereas payload information is more susceptible to interference, leading to sharp performance degradation. In addition, the identification performance, contribution quantification, and anti-attack effectiveness of header features are influenced by data type and distribution characteristics. In certain cases, payload features provide auxiliary support, suggesting a complementary relationship between the two feature domains.  Conclusions  This study addresses core challenges in encrypted traffic identification, including feature degradation, limited interpretability, and weak adversarial robustness in traditional payload-dependent methods. 
A cross-dimensional collaborative identification framework driven by header features is proposed. Through systematic theoretical analysis and experimental validation from three perspectives, the framework demonstrates the irreplaceable value of header features in network traffic identification and overcomes the limitations of conventional single-perspective approaches. It provides a theoretical foundation for improving the efficiency, interpretability, and robustness of encrypted traffic identification models. Future work will focus on enhancing dynamic adaptability, integrating multi-modal features, implementing lightweight architectures, and strengthening adversarial defense mechanisms. These directions are expected to advance encrypted traffic identification technology toward higher intelligence, adaptability, and resilience.
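The three input views compared in such studies (header-only, payload-only, full packet) amount to simple byte-level preprocessing before the 1D-CNN. A minimal sketch, in which the 54-byte header budget and 784-byte model input length are illustrative assumptions rather than the paper's configuration:

```python
import numpy as np

HEADER_LEN = 54   # assumed Ethernet+IP+TCP header budget (illustrative)
INPUT_LEN = 784   # assumed fixed model input length (illustrative)

def to_model_input(packet: bytes, n: int) -> np.ndarray:
    """Truncate or zero-pad a byte string to n bytes, scaled to [0, 1]."""
    buf = np.zeros(n, dtype=np.float32)
    chunk = np.frombuffer(packet[:n], dtype=np.uint8)
    buf[:len(chunk)] = chunk / 255.0
    return buf

def split_views(packet: bytes):
    """Header-only, payload-only, and full-packet views of one packet."""
    header, payload = packet[:HEADER_LEN], packet[HEADER_LEN:]
    return (to_model_input(header, HEADER_LEN),
            to_model_input(payload, INPUT_LEN - HEADER_LEN),
            to_model_input(packet, INPUT_LEN))
```

Each view would then feed the same classifier architecture, so any performance gap between views is attributable to the information content of the bytes rather than the model.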
Dual Mode Index Modulation-aided Orthogonal Chirp Division Multiplexing System in High-dynamic Scenes
NING Xiaoyan, TANG Zihan, YIN Qiaoling, WANG Shihan
Available online  , doi: 10.11999/JEIT250475
Abstract:
  Objective  In high-dynamic environments, the Orthogonal Chirp Division Multiplexing (OCDM) system has attracted significant attention due to its inherent advantage of time-frequency two-dimensional expansion gain. The OCDM with Index Modulation (OCDM-IM) system extends the index domain of the traditional OCDM system, selectively activating subcarriers through index modulation. This reduces inter-carrier interference to some extent. However, the OCDM-IM system necessitates that certain subcarriers remain inactive, which, on one hand, diminishes the time-frequency expansion gain of the OCDM system and, on the other hand, leads to more pronounced Doppler interference in high-dynamic environments. Additionally, the inactive subcarriers do not contribute to data transmission, resulting in throughput loss. To overcome these challenges, this study proposes a novel communication system architecture, the Dual Mode Index Modulation-aided OCDM (DM-OCDM-IM). This architecture incorporates a dual-mode index mapping scheme and introduces new modulation dimensions within the OCDM system. The DM-OCDM-IM system preserves the interference immunity associated with the time-frequency two-dimensional expansion of the OCDM system while achieving higher spectral efficiency with low-order constellation modulation, offering enhanced communication performance in high-dynamic scenarios.  Methods  In this study, a DM-OCDM-IM communication system architecture is proposed, consisting of two main components: the dual mode index modulation module and the receiving algorithm. In the dual mode index modulation module, the DM-OCDM-IM system partitions the subcarriers in each subblock into two groups, each transmitting constant-amplitude and mutually distinguishable constellation symbols. This design expands the modulation dimensions and improves spectral efficiency. 
At the same time, low-order constellation modulation can be applied in a single dimension, thereby strengthening the system’s anti-jamming capability in high-dynamic environments. The constant-amplitude dual mode index mapping scheme also reduces performance fluctuations caused by channel gain variations and offers ease of hardware implementation. For signal reception, the system must contend with substantial Doppler frequency shifts and the computational complexity of demodulation in high-dynamic conditions. To address this, the DM-OCDM-IM employs a receiving algorithm based on feature decomposition of the Discrete Fresnel Transform (DFnT), which reduces complexity. The discrete time-domain transmit signal is reconstructed by applying the Discrete Fourier Transform (DFT) and feature decomposition to the received frequency-domain signal. Finally, the original transmitted bits are recovered through index demodulation and constellation demodulation of the reconstructed time-domain signal using a maximum-likelihood receiver.  Results and Discussions  The performance of the proposed DM-OCDM-IM system is simulated and compared with that of the existing Dual Mode Index Modulation-aided OFDM (DM-OFDM-IM) system and the OCDM-IM system under three channel conditions: AWGN, multipath, and Doppler frequency shift. The results show that, relative to the DM-OFDM-IM system, the proposed DM-OCDM-IM system exploits multipath diversity more effectively and exhibits stronger resistance to fading in all three channels (Fig. 5, Fig. 6). When compared with the OCDM-IM system, the Bit Error Rate (BER) performance of the proposed DM-OCDM-IM system is significantly improved across all three channel conditions, particularly at high spectral efficiency (Fig. 7(b), Fig. 8(b)). These results confirm that the introduction of the dual mode index modulation technique extends the modulation dimensions within the OCDM framework. 
Information is transmitted not only through index modulation but also through dual mode modulation, enabling higher spectral efficiency without increasing the modulation order. At the same time, the time-frequency expansion gain characteristic of OCDM is preserved, while receiver complexity is effectively controlled. These combined features make the proposed DM-OCDM-IM system well suited for communication in high-dynamic channel environments.  Conclusions  This paper establishes a novel DM-OCDM-IM system framework. First, by integrating a constant-amplitude dual mode index mapping scheme into the traditional OCDM system, the proposed design expands the modulation dimensions and allows the use of low-order constellation modulation in a single dimension. This improves spectral efficiency while enhancing system reliability in high-dynamic environments. Second, to reduce receiver-side complexity, a receiving algorithm based on feature decomposition of the DFnT is proposed, simplifying the digital signal processing of the DM-OCDM-IM system. Finally, the performance of the system is evaluated under AWGN, multipath, and Doppler frequency shift channels. The results demonstrate that, compared with the existing DM-OFDM-IM system, the proposed DM-OCDM-IM system exhibits stronger resistance to multipath fading and Doppler frequency shifts. In comparison with the OCDM-IM system, the proposed DM-OCDM-IM design preserves the time-frequency expansion gain of OCDM and provides stronger fading resistance at high spectral efficiency. Therefore, the proposed DM-OCDM-IM system offers superior adaptability in high-dynamic scenarios and has the potential to serve as a next-generation physical-layer waveform for mobile communications.
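Low-complexity DFnT reception rests on a standard property of the transform: for even N its kernel depends only on (m-k) and is N-periodic in it, so the DFnT matrix is circulant and is diagonalized by the DFT. It can therefore be applied with one FFT, a per-bin chirp phase, and one IFFT. A numerical sketch using the common OCDM kernel (the paper's exact phase convention and receiver structure may differ):

```python
import numpy as np

def dfnt_matrix(n):
    """Discrete Fresnel Transform (DFnT) matrix for even n (common OCDM kernel)."""
    m, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-1j * np.pi / 4) / np.sqrt(n) * np.exp(1j * np.pi / n * (m - k) ** 2)

def dfnt_apply_fft(x):
    """Apply the DFnT via its eigendecomposition: for even n the matrix is
    circulant, so its eigenvalues are the DFT of its first column and
    Phi @ x == ifft(eigs * fft(x))."""
    n = len(x)
    eigs = np.fft.fft(dfnt_matrix(n)[:, 0])
    return np.fft.ifft(eigs * np.fft.fft(x))
```

The FFT-based path replaces an O(N^2) matrix product with O(N log N) operations, which is the complexity reduction the receiver design targets.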
Inverse Design of a Silicon-Based Compact Polarization Splitter-Rotator
HUI Zhanqiang, ZHANG Xinglong, HAN Dongdong, LI Tiantian, GONG Jiamin
Available online  , doi: 10.11999/JEIT250858
Abstract:
  Objective  The integrated polarization splitter-rotator (PSR), as one of the key photonic devices for manipulating the polarization state of light waves, has been widely used in various photonic integrated circuits (PICs). For PICs, device size becomes a major bottleneck limiting integration density. In contrast to traditional design methods, which are time-consuming and yield larger devices, inverse design determines the optimal structural parameters of integrated photonic devices from target performance specifications by employing dedicated optimization algorithms. This approach can significantly reduce device size while ensuring performance and is currently used to design various integrated photonic devices, such as wavelength/mode division multiplexers, all-optical logic gates, and power splitters. In this paper, the Momentum Optimization algorithm and the Adjoint Method are combined to inverse design a compact PSR. This not only significantly improves the integration level of PICs but also offers a design approach for the miniaturization of other photonic devices.  Methods  First, based on a silicon-on-insulator (SOI) wafer with a thickness of 220 nm, the design region was discretized into 25×50 cylindrical elemental structures. Each structure has a radius of 50 nm and a height of 150 nm and is filled with an intermediate material possessing a relative permittivity of 6.55. Next, the adjoint method was employed for simulation to obtain gradient information over the design region. This gradient information was processed using the Momentum Optimization algorithm. Based on the processed gradient, the relative permittivity of each elemental structure was modified. During the optimization process, the momentum factor in the Momentum Optimization algorithm was dynamically adjusted according to the iteration number to accelerate the optimization. 
Meanwhile, a linear bias was introduced to artificially control the optimization direction of the relative permittivity. This bias gradually steered the permittivity values towards those of silicon and air as the iterations progressed. Upon completion of the optimization, the elemental structures were binarized based on their final relative permittivity values: structures with permittivity less than 6.55 were filled with air, while those greater than 6.55 were filled with silicon. At this stage, the design region consisted of multiple irregularly distributed air holes. To compensate for the performance loss incurred during binarization, the etching depth of air holes (whose pre-binarization permittivity was between 3 and 6.55) was optimized. Furthermore, adjacent air holes are merged to reduce manufacturing errors. This resulted in a final device structure composed of air holes with five distinct radii. Among these, three types of larger-radius air holes were selected. Their etching radii and depths were further optimized to compensate for the remaining performance loss. Finally, the device performance was evaluated through numerical analysis. Key parameters calculated include insertion loss (IL), crosstalk (CT), polarization extinction ratio (PER), and bandwidth. Additionally, tolerance analysis was performed to assess the robustness of the performance.  Results and Discussions   This paper presents the design of a compact PSR based on a 220-nm-thick SOI wafer, with dimensions of 5 µm in length and 2.5 µm in width. During the design process, the momentum factor within the Momentum Optimization algorithm was dynamically adjusted: a large momentum factor was selected in the initial optimization stages to leverage high momentum for accelerating escape from local maxima or plateau regions, while a smaller momentum factor was used in later stages to increase the weight of the current gradient. 
Compared to other optimization methods, the algorithm employed in this work required only 20%-33% of the iteration counts needed by other algorithms to achieve a Figure of Merit (FOM) value of 1.7, significantly enhancing optimization efficiency. Numerical analysis results demonstrate that the device achieves the following performance across the 1520-1575 nm wavelength band: low IL (TM0 < 1 dB, TE0 < 0.68 dB), low CT (TM0 < -23 dB, TE0 < -25.2 dB), and high PER (TM0 > 17 dB, TE0 > 28.5 dB). Process tolerance analysis indicates that the device exhibits robust fabrication tolerance: within the 1520-1540 nm bandwidth, performance shows no significant degradation under etching depth deviations of ±9 nm or etching radius deviations of ±5 nm, demonstrating excellent manufacturability robustness.  Conclusions   Through numerical analysis and comparison with devices designed in other literature, this work clearly demonstrates the feasibility of combining the adjoint method with the Momentum Optimization algorithm for designing the integrated PSR. Its design principle involves manipulating light propagation to achieve the polarization splitting and rotation effect by adjusting the relative permittivity to control the positions of the air holes. Compared to traditional design methods, inverse design enables the efficient utilization of the design region, thereby achieving a more compact structure. The PSR proposed in this work is not only significantly smaller in size but also exhibits larger fabrication tolerance. It holds significant potential for application in future large-scale PIC chips, while also offering valuable design insights for the miniaturization of other photonic devices.
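The dynamically adjusted momentum schedule can be sketched in one dimension. In the toy example below, an analytic gradient of a quadratic figure of merit stands in for the adjoint-derived gradients, and the learning rate, schedule endpoints, and function names are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def momentum_ascent(grad, x0, lr=0.1, beta_hi=0.9, beta_lo=0.4, iters=200):
    """Gradient ascent with an iteration-dependent momentum factor:
    large beta early (to escape plateaus and local maxima quickly),
    small beta late (to weight the current gradient more heavily)."""
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for t in range(iters):
        beta = beta_hi + (beta_lo - beta_hi) * t / (iters - 1)  # linear decay
        v = beta * v + grad(x)   # momentum-filtered gradient
        x = x + lr * v
    return x

# toy figure of merit: maximize -(x - 3)^2, whose gradient is -2(x - 3)
xs = momentum_ascent(lambda x: -2.0 * (x - 3.0), np.array([0.0]))
```

In the actual design loop, `grad` would be supplied per elemental structure by the adjoint simulation, and `x` would be the vector of relative permittivities subsequently binarized toward silicon or air.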
An Interpretable Vulnerability Detection Method Based on Graph and Code Slicing
GAO Wenchao, SUO Jianhua, ZHANG Ao
Available online  , doi: 10.11999/JEIT250363
Abstract:
  Objective   Deep learning technology has been widely applied to source code vulnerability detection. The mainstream methods can be categorized into sequence-based and graph-based approaches. Sequence-based models usually convert structured code into a linear sequence, which ignores the syntactic and structural information of the program and often leads to a high false-positive rate. Graph-based models can effectively capture structural features, but they fail to model the execution order of the program. In addition, their prediction granularity is usually coarse and limited to the function level. Both types of methods lack interpretability, which makes it difficult for developers to locate the root causes of vulnerabilities. Although Large Language Models (LLMs) have made progress in code understanding, they still suffer from high computational overhead, hallucination problems in the security domain, and insufficient understanding of complex program logic. To address these issues, this paper proposes an interpretable vulnerability detection method based on graphs and code slicing (GSVD). The proposed method integrates structural semantics and sequential features, and provides fine-grained, line-level explanations for model decisions.  Methods   The proposed method consists of four main components: code graph feature extraction, code sequence feature extraction, feature fusion, and an interpreter module (Fig. 1). First, the source code is normalized, and the Joern static analysis tool is used to convert it into multiple code graphs, including the Abstract Syntax Tree (AST), Data Dependency Graph (DDG), and Control Dependency Graph (CDG). These graphs comprehensively represent the syntactic structure, data flow, and control flow of the program. Then, node features are initialized by combining CodeBERT embeddings with one-hot encodings of node types. 
With the adjacency matrix of each graph, a Gated Graph Convolutional Network (GGCN) equipped with a self-attention pooling layer is applied to extract deep structural semantic features. At the same time, a code slicing algorithm based on taint analysis (Algorithm 1) is designed. In this algorithm, taint sources are identified, and taints are propagated according to data and control dependencies, thereby generating concise code slices that are highly related to potential vulnerabilities. These slices remove irrelevant code noise and are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network to capture long-range sequential dependencies. After obtaining both graph and sequence features, a gating mechanism is introduced for feature fusion. The two feature vectors are fed into a Gated Recurrent Unit (GRU), which automatically learns the dependency relationships between structural and sequential information through its dynamic state updates. Finally, to address vulnerability detection and localization, a VDExplainer is designed in line with the characteristics of the task. Inspired by the HITS algorithm, it iteratively computes the “authority” and “hub” values of nodes to evaluate their importance under the constraint of an edge mask, thus achieving node-level interpretability for vulnerability explanation.  Results and Discussions   To evaluate the effectiveness of GSVD, a series of comparative experiments (Table 2) is conducted on the Devign (FFmpeg + Qemu) dataset. GSVD is compared with several baseline models. The experimental results show that GSVD achieves the highest accuracy and F1-score of 64.57% and 61.89%, respectively. The recall rate also increases to 62.63%, indicating that the proposed method effectively performs the vulnerability detection task and reduces the number of missed vulnerability reports. 
To verify the effectiveness of the GRU-based fusion mechanism, three feature fusion strategies—feature concatenation, weighted sum, and attention mechanism—are compared (Table 3). GSVD achieves the best overall performance, with accuracy, recall, and F1-score reaching 64.57%, 62.63%, and 61.89%, respectively. Its precision reaches 61.17%, which is slightly lower than the 63.33% obtained by the weighted sum method. Ablation experiments (Tables 4-5) further confirm the importance of the proposed slicing algorithm. The taint propagation-based slicing method reduces the average number of code lines from 51.98 to 17.30 (a 66.72% reduction) and lowers the data redundancy rate to 6.42%, compared with 19.58% for VulDeePecker and 22.10% for SySeVR. This noise suppression effect leads to a 1.53% improvement in the F1-score, demonstrating its ability to focus on key code segments. Finally, interpretability experiments (Table 6) on the Big-Vul dataset further validate the effectiveness of the VDExplainer. The proposed method outperforms the standard GNNExplainer at all evaluation thresholds. When 50% of the nodes are selected, the localization accuracy improves by 7.65%, showing its advantage in node-level vulnerability localization. In summary, GSVD not only achieves superior detection performance but also significantly improves the interpretability of model decisions, providing practical support for vulnerability localization and remediation.  Conclusions   The GSVD model effectively addresses the limitations of single-modal approaches by deeply integrating graph structures with taint analysis-based code slices. It achieves notable improvements in vulnerability detection accuracy and interpretability. In addition, the VDExplainer provides node-level and line-level vulnerability localization, enhancing the practical value of the model. Experimental results confirm the superiority of the proposed method in both detection performance and interpretability.
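The authority/hub iteration that VDExplainer adapts from HITS can be sketched on a toy graph; the edge-mask constraint and relevance weighting of the actual interpreter are omitted here, and all names are illustrative:

```python
import numpy as np

def hits(adj, iters=100):
    """Plain HITS power iteration on a directed adjacency matrix.
    Authorities are nodes pointed to by good hubs; hubs are nodes
    pointing to good authorities. Assumes the graph has at least one edge."""
    n = adj.shape[0]
    auth = np.ones(n)
    hub = np.ones(n)
    for _ in range(iters):
        auth = adj.T @ hub              # in-links from strong hubs
        auth /= np.linalg.norm(auth)
        hub = adj @ auth                # out-links to strong authorities
        hub /= np.linalg.norm(hub)
    return auth, hub
```

In the explainer setting, high-scoring nodes of the code graph would be mapped back to source lines to produce line-level vulnerability localization.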
An Implicit Certificate-Based Lightweight Authentication Scheme for Power Industrial Internet of Things
WANG Sheng, ZHANG Linghao, TENG Yufei, LIU Hongli, HAO Junyang, WU Wenjuan
Available online  , doi: 10.11999/JEIT250457
Abstract:
  Objective  With the rapid advancement of technologies such as the Internet of Things, cloud computing, and edge computing, the Power Industrial Internet of Things (PIIoT) is evolving into a key infrastructure for smart electricity systems. In this architecture, terminal devices continuously collect operational data and transmit it to edge gateways for initial processing before forwarding it to cloud platforms for further intelligent analysis and control. Such integration significantly enhances operational efficiency, reliability, and security in power systems. However, the close coupling between traditional industrial systems and open network environments introduces new cybersecurity threats. Resource-constrained devices in PIIoT are particularly vulnerable to attacks, leading to data leakage, privacy breaches, and even the disruption of power services. Existing identity authentication mechanisms either incur high computational and communication overheads or fail to provide adequate security guarantees, such as forward secrecy or resistance to replay and man-in-the-middle attacks. Therefore, this study aims to design a secure and efficient identity authentication scheme tailored to the PIIoT environment. The proposed work addresses the urgent need for a solution that balances lightweight performance with strong security, especially for power terminals with limited processing capabilities.  Methods  To address this challenge, a secure and lightweight identity authentication scheme is proposed. Specifically, the scheme introduces implicit certificate technology during the device identity registration phase. This technique embeds public key authentication information into the signature, eliminating the need to transmit the full certificate explicitly during communication. 
Compared to traditional explicit certificates, implicit certificates feature shorter lengths and more efficient verification, thereby reducing overhead in both transmission and validation processes. Building upon this, a lightweight authentication protocol is constructed, relying only on hash functions, XOR operations, and elliptic curve point multiplications. This enables secure mutual authentication and session key agreement between devices while maintaining suitability for resource-constrained power terminal devices. Furthermore, a formal analysis is conducted to evaluate the security of the proposed scheme. The results demonstrate that it achieves secure mutual authentication, ensures the confidentiality and forward secrecy of session keys, and provides strong resistance against various attacks, including replay and man-in-the-middle attacks. Finally, comprehensive experiments are conducted to compare the proposed scheme with existing advanced authentication protocols. The results confirm that the proposed solution achieves significantly lower computational and communication overhead, making it a practical choice for real-world deployment.  Results and Discussions  The proposed scheme was evaluated through both simulation and numerical comparisons with existing methods. The implementation was conducted on a virtual machine configured with 8 GB RAM, an Intel i7-12700H processor, and Ubuntu 22.04, using the Miracl-Python cryptographic library. The security level was set to 128 bits, employing the ed25519 elliptic curve, SHA-256 as the hash function, and AES-128 for symmetric encryption. Table 1 presents the performance of the underlying cryptographic primitives. As shown in Table 2, the proposed scheme achieves the lowest computational cost, requiring only three elliptic curve point multiplications on the device side and five on the gateway side. 
This is substantially lower than traditional certificate-based schemes, which demand up to 14 and 12 such operations, respectively. Compared to other representative schemes, our method further reduces the device-side burden, improving its applicability in resource-constrained environments. Table 3 illustrates that the scheme also minimizes communication overhead, achieving the smallest message size (3456 bits) and requiring only three message exchange rounds, attributed to the use of implicit certificates. As depicted in Fig.6, the authentication phase exhibits the shortest runtime among all evaluated schemes—47.72 ms for devices and 82.88 ms for gateways—demonstrating the scheme’s lightweight nature and practical deployability in real-world Industrial Internet of Things scenarios.  Conclusions  This paper presents a lightweight and secure identity authentication scheme based on implicit certificates, specifically designed for resource-constrained terminal devices in the Power Industrial Internet of Things. By integrating a low-overhead authentication protocol with efficient certificate handling, the scheme achieves a balanced trade-off between security and performance. The protocol ensures secure mutual authentication, protects the confidentiality of session keys, and satisfies forward secrecy, all while maintaining minimal computational and communication overhead. Security proofs and experimental evaluations verify that the proposed solution outperforms existing methods in both security robustness and resource efficiency. It offers a practical and scalable approach to enhancing the security infrastructure of modern power systems.
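The abstract does not reproduce the protocol itself. As a rough, self-contained illustration of how an ECQV-style implicit certificate lets any verifier reconstruct a device's public key from short certificate data, rather than transmitting an explicit certificate with a signature, consider the sketch below; it is a generic instance of the same family of techniques, with all keys, identities, and helper names chosen by us, not taken from the paper.

```python
import hashlib

# Toy affine arithmetic on secp256k1 (standard public parameters).
p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Affine point addition (None is the point at infinity)."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P); k >>= 1
    return R

def h(*parts):
    """Hash-to-scalar for binding identity to certificate data."""
    return int.from_bytes(hashlib.sha256(b"|".join(parts)).digest(), "big") % n

# --- ECQV-style implicit certificate issuance (values illustrative) ---
d_ca = 0x1234; Q_ca = mul(d_ca, G)        # CA key pair
r_u = 0x5678; R_u = mul(r_u, G)           # device's ephemeral request
k = 0x9ABC                                 # CA's ephemeral scalar
P_u = add(R_u, mul(k, G))                  # public key reconstruction point
e = h(b"device-01", str(P_u).encode())     # binds identity to cert data
s = (e * k + d_ca) % n                     # CA's private contribution
d_u = (e * r_u + s) % n                    # device's final private key
# Anyone holding the short cert (identity, P_u) recomputes the public key:
Q_u = add(mul(e, P_u), Q_ca)
assert Q_u == mul(d_u, G)                  # reconstructed key is consistent
```

The certificate reduces to the identity plus a single point P_u, which is why implicit certificates are shorter than explicit ones: no separate signature travels with the public key, and verification happens implicitly when the reconstructed key is first used.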
Geospatial Identifier Network Modal Design and Scenario Applications for Vehicle-Infrastructure Cooperative Networks
PAN Zhongxia, SHEN Congqi, LUO Hanguang, ZHU Jun, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250807
Abstract:
  Objective  Vehicle-infrastructure cooperative networks (V2X) are characterized by environmental openness, large node counts, high node mobility, frequent changes in network topology, unstable wireless channels, and diverse service requirements. These technical features present significant challenges to the efficient transmission of data. Thus, rapid network reconfiguration based on different service requirements becomes crucial, and constructing a flexible real-time network is essential for the application of V2X technologies in intelligent transportation systems (ITS). With the rise of programmable network technologies, programmable data plane techniques are driving a shift from “rigid architectures” to “flexible, adaptive” systems, enhancing the flexibility of intelligent transportation networks. In this context, a network protocol standard based on geospatial location information is proposed. Combining this standard with a polymorphic network architecture, a geospatial identifier network modal is designed. In this modal, the traditional three-layer protocol is replaced by packets containing geographic location identifiers, enabling packet forwarding directly based on geographic location information. Addressing and routing based on geographic location are more efficient and convenient than traditional IP-based addressing and routing. Furthermore, a geospatial identifier-based vehicle-infrastructure cooperative traffic system is designed for intelligent transportation scenarios. This system supports direct forwarding of packets based on geographic location information, offering flexibility in supporting the dissemination of road safety and traffic information within the V2X system, ensuring vehicle safety and improving route planning efficiency.  Methods  Based on the network protocol standard for geospatial location information and the flexible and scalable architecture of polymorphic networks, a geospatial identifier network modal is proposed. 
This modal replaces IP with a geospatial identifier network protocol at the network layer and implements addressing and routing based on geospatial information on programmable polymorphic network elements. To achieve end-to-end transmission, a geospatial identifier network modal protocol stack is designed, effectively supporting the unified transmission of various network modals. Additionally, considering the service demands and transmission characteristics of the GEO network modal, we develop a dynamic geographic routing mechanism. This mechanism operates within a multimodal network controller and leverages the relatively stable coverage areas of roadside base stations to establish a two-level mapping: "geographic region - base station/geographic coordinates - terminal". This enables precise end-to-end path matching for GEO network modal packets, achieving flexible and centrally controlled geographic forwarding. To validate the usability of the geospatial identifier network modal, a vehicle-infrastructure cooperative intelligent transportation system that supports the geospatial identifier addressing mechanism is developed, effectively facilitating the dissemination of road safety and traffic information. A detailed analysis of the business functional requirements of the intelligent transportation system is conducted, followed by the design of the business processing flow and the overall system. Additionally, key hardware and software modules, including geospatial identifier data plane code, traffic control center services, roadside base stations, and vehicle terminals, are designed and their implementation logic is provided.  Results and Discussions  System evaluation includes four main aspects: the evaluation environment, operational effectiveness, theoretical analysis, and performance evaluation. As shown in Figures 7 and 8, a prototype intelligent transportation system is deployed.
The system is tested and validated to ensure it can transmit messages according to the geospatial identifier modal. Taking a typical V2V communication scenario as an example (e.g., onboard terminal T3 sending a road condition alert M to T2), we use sequence analysis to compare the forwarding efficiency of the GEO network modal against traditional IP protocols. Theoretical analysis demonstrates that the GEO network modal offers significant technical advantages in forwarding efficiency, as illustrated in Figure 9. Further tests are conducted by varying conditions such as the number of terminals (Figure 10), background traffic (Figure 11), and transmission bandwidth (Figure 12) to assess the changes in transmission performance of geospatially represented modal packets. The network modal transmission performance of the intelligent transportation system is analyzed. System performance evaluation experiments demonstrate that the system exhibits good stability and high efficiency, meeting the demands of typical V2X traffic scenarios, such as massive connectivity and elastic traffic flows.  Conclusions  Combining the flexible and scalable architecture of polymorphic networks with the network protocol standard for geospatial location information, the geospatial identifier network modal is proposed and successfully implemented, enabling direct packet forwarding based on geospatial location. Additionally, for intelligent transportation scenarios, a prototype vehicle-infrastructure cooperative intelligent transportation system based on geospatial identifier addressing is designed. This system supports a variety of applications within the V2X context, such as road safety alerts and traffic information broadcasting. The intelligent transportation system ensures vehicle safety and enhances route planning efficiency. 
Experimental results show that the system provides excellent stability and efficiency, effectively supporting typical traffic scenarios involving massive connectivity, network background traffic fluctuations, and elastic service traffic. As vehicular network technologies continue to evolve, this system is expected to play a significant role in broader intelligent transportation fields, providing strong support for the development of safer and more efficient smart transportation systems.
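The two-level mapping "geographic region - base station / geographic coordinates - terminal" amounts to a controller-side lookup. The sketch below illustrates the idea only; the one-degree grid cells, station names, and coordinates are our own placeholders, not the system's actual addressing scheme.

```python
# Controller-side lookup for the two-level mapping
# "geographic region -> base station / geographic coordinates -> terminal".
# Grid granularity and all identifiers below are illustrative placeholders.

def region_of(lat, lon):
    """Quantize coordinates into a coarse grid cell used as the region key."""
    return (int(lat), int(lon))

region_to_bs = {(39, 116): "BS-A", (39, 117): "BS-B"}          # region -> base station
terminal_pos = {"T1": (39.90, 116.40), "T2": (39.10, 117.20)}  # terminal -> coordinates

def route(dst_terminal):
    """Resolve a geospatial-identifier packet to its serving base station."""
    lat, lon = terminal_pos[dst_terminal]
    return region_to_bs[region_of(lat, lon)]

assert route("T1") == "BS-A"
assert route("T2") == "BS-B"
```

Because base-station coverage areas are relatively stable, the region-to-station table changes slowly, while only the terminal-to-coordinates table must track mobility; this is the property the dynamic geographic routing mechanism exploits.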
Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation
WU Ting, WEN Shulin, YAN Zhaoli, FU Gaoyuan, LI Linfeng, LIU Xudu, CHENG Xiaobin, YANG Jun
Available online  , doi: 10.11999/JEIT250934
Abstract:
  Objective  Hydro-turbine generator units (HTGUs) require reliable early-stage fault detection to ensure operational safety and reduce maintenance costs. Acoustic signals provide a non-intrusive and sensitive monitoring modality, yet their use is hindered by complex structural acoustics, strong background noise, and the scarcity of abnormal data. This work presents an unsupervised acoustic anomaly detection framework that integrates a large-scale pretrained audio model with density-based k-nearest neighbors estimation, enabling accurate anomaly detection using only normal data while maintaining robustness and strong generalization across diverse HTGU conditions.  Methods  The proposed framework performs unsupervised acoustic anomaly detection for HTGUs using only normal data. Time-domain signals are preprocessed with Z-score normalization and Fbank features, followed by random masking to enhance robustness and generalization. A large-scale pretrained BEATs model serves as the feature encoder, and an Attentive Statistical Pooling module is applied to aggregate frame-level representations into discriminative segment-level embeddings by emphasizing informative frames. To improve class separability, an ArcFace loss replaces the conventional classification layer during training. A warm-up learning rate strategy is adopted to ensure stable convergence. During inference, density-based k-nearest neighbors estimation is performed on the learned embeddings to detect acoustic anomalies.  Results and Discussions  This study verifies the effectiveness of the proposed unsupervised acoustic anomaly detection framework for HTGUs using data collected from eight real-world machines. As shown in Fig. 7 and Table 2, large-scale pretrained audio representations significantly outperform traditional features in distinguishing abnormal sounds. 
With the FED-KE algorithm, the method achieves high accuracy across six metrics, with Hmean reaching 98.7% in the wind tunnel and over 99.9% in the slip-ring environment, demonstrating strong robustness under complex industrial conditions. As shown in Table 4, ablation studies confirm the complementary contributions of feature enhancement, ASP-based representation refinement, and density-based k-NN inference. The framework requires only normal data for training, reducing dependence on scarce fault labels and improving practical applicability. Remaining challenges include computational cost due to the pretrained model and the lack of multimodal fusion, which will be investigated in future work.  Conclusions  This study proposes an unsupervised acoustic anomaly detection framework for HTGUs, addressing the scarcity of fault samples and the complexity of industrial acoustic environments. A pretrained large-scale audio foundation model is adopted and further fine-tuned with turbine-specific strategies to enhance the modeling of normal operational acoustics. During inference, a density-estimation-based k-NN mechanism is employed to detect abnormal patterns using only normal data. Experiments on real-world hydropower station recordings demonstrate high detection accuracy and strong generalization across diverse operating conditions, outperforming conventional supervised approaches. The framework introduces foundation-model-based audio representation learning into the hydro-turbine domain, establishes an efficient adaptation strategy tailored to turbine acoustics, and integrates a robust density-based anomaly scoring mechanism. These components jointly reduce reliance on labeled anomalies and enable practical deployment for intelligent condition monitoring. 
Future work will investigate model compression, such as knowledge distillation, to support on-device deployment, and explore semi-/self-supervised learning and multimodal fusion to further enhance robustness, scalability, and cross-station adaptability.
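The density-based k-nearest-neighbors scoring used at inference — fit only on normal embeddings, then score a query by its mean distance to the k closest normal points — can be sketched with NumPy. The embedding dimension, choice of k, and synthetic data below are our assumptions, not the paper's settings.

```python
import numpy as np

def knn_anomaly_score(train_emb, query_emb, k=5):
    """Density-style anomaly score: mean Euclidean distance from each
    query embedding to its k nearest neighbors among normal embeddings.
    Larger scores mean the query lies off the normal data manifold."""
    d = np.linalg.norm(query_emb[:, None, :] - train_emb[None, :, :], axis=-1)
    nearest = np.sort(d, axis=1)[:, :k]     # k smallest distances per query
    return nearest.mean(axis=1)

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 16))  # stand-in "normal" embeddings
ok = rng.normal(0.0, 1.0, size=(10, 16))       # queries from the same regime
odd = rng.normal(6.0, 1.0, size=(10, 16))      # shifted regime: anomalies
assert knn_anomaly_score(normal, odd).min() > knn_anomaly_score(normal, ok).max()
```

Because the scorer needs only the stored normal embeddings and a distance computation, it requires no abnormal training data, which matches the unsupervised setting described above.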
Adaptive Detection and Statistical Performance Analysis in Nonzero Mean Clutter
LIU Weijian, XU Zhenyu, ZHANG Jing, QI Chongying, GE Jianjun, CHEN Hui
Available online  , doi: 10.11999/JEIT250935
Abstract:
  Objective  Target detection in nonzero-mean clutter is a critical challenge in radar and hyperspectral imaging systems. Traditional detectors assuming zero-mean clutter often suffer performance degradation in practical scenarios where clutter exhibits nonzero-mean characteristics due to environmental factors or interference. This work aims to design adaptive detectors robust to nonzero-mean clutter and analyze their statistical performance under signal mismatch conditions.  Methods  Three adaptive detectors are derived based on the generalized likelihood ratio test (GLRT), Rao, and Wald tests. The detectors are designed to account for the unknown clutter mean and covariance matrix, using training samples for estimation. A generalized signal mismatch scenario is considered, where the actual signal steering vector may deviate from the nominal one. Analytical expressions for the probability of detection (PD) and probability of false alarm (PFA) are derived for each detector to evaluate performance.  Results and Discussions  Analytic expressions for the PDs and PFAs of the three detectors are confirmed by Monte Carlo simulations. All the detectors possess the constant false alarm rate (CFAR) property. The amplitude characteristic of the nonzero mean does not directly affect the detection performance. Instead, its influence is exerted through the loss factor of the output signal-to-clutter ratio (SCR) and the degrees of freedom (DOFs) of the detectors’ statistical distributions. Numerical results based on simulated and real data show that the proposed detectors outperform conventional ones.  Conclusions  The proposed three CFAR adaptive detectors based on the GLRT, Rao, and Wald tests are effective for target detection in nonzero-mean clutter. The nonzero mean of clutter affects the detection performance in two aspects: reducing the optimal output SCR of the detectors and decreasing the DOFs of the detectors’ statistical distributions.
Based on simulated data, when there is no signal mismatch, the GLRT-NMC detector has the highest PD. When using measured data and there is no signal mismatch, either the Rao-NMC or Wald-NMC will provide a higher PD than the GLRT-NMC. When there is signal mismatch, whether with measured data or simulated data, the Rao-NMC has the best mismatch sensitivity, while the Wald-NMC has the best robustness.
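The GLRT-NMC, Rao-NMC, and Wald-NMC statistics are not given in the abstract, so the sketch below does not reproduce them. It only illustrates the common ingredients discussed above — estimating the unknown clutter mean and covariance from training samples, then setting a CFAR-style threshold by Monte Carlo — using a plain adaptive matched filter on mean-removed data; all scenario parameters are our own toy choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 8, 64                          # channels, training snapshots
s = np.ones(N) / np.sqrt(N)           # nominal steering vector
mu = 3.0 * np.ones(N)                 # unknown nonzero clutter mean
R = 0.5 * np.eye(N) + 0.5             # toy clutter covariance matrix
L = np.linalg.cholesky(R)

def clutter(m):
    """Draw m clutter snapshots with nonzero mean mu and covariance R."""
    return mu + rng.standard_normal((m, N)) @ L.T

train = clutter(K)
mu_hat = train.mean(axis=0)                       # estimated clutter mean
Si = np.linalg.inv(np.cov(train, rowvar=False))   # inverse sample covariance

def stat(x):
    """Adaptive matched filter on mean-removed data (illustrative only)."""
    xc = x - mu_hat
    return (s @ Si @ xc) ** 2 / (s @ Si @ s)

# CFAR-style threshold for PFA = 1e-2, set by Monte Carlo on clutter-only data
null = np.array([stat(x) for x in clutter(20000)])
thr = np.quantile(null, 0.99)
pd = np.mean([stat(x + 10.0 * s) > thr for x in clutter(2000)])
assert pd > 0.5   # a strong target is detected far above the false-alarm rate
```

Skipping the mean-removal step in `stat` biases the statistic by a term depending on `mu`, which is the intuition behind the performance loss of zero-mean detectors noted in the Objective.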
The Storage and Calculation of Biological-like Neural Networks for Locally Active Memristor Circuits
LI Fupeng, WANG Guangyi, LIU Jingbiao, YING Jiajie
Available online  , doi: 10.11999/JEIT250631
Abstract:
  Objective  At present, binary computing systems have encountered bottlenecks in terms of power consumption, operation speed, and storage capacity. In contrast, the biological nervous system seems to have almost unlimited capacity, with significant advantages in low-power computing and dynamic storage, which is closely related to the working mechanism by which neurons transmit neural signals through the directional secretion of neurotransmitters. After analyzing the Hodgkin-Huxley model of the squid giant axon, Professor Leon Chua proposed that synapses could be composed of locally passive memristors and neurons of locally active memristors; the two types of memristors share similar electrical characteristics with nerve fibers. Since the memristor was first reported, locally active memristive devices have been identified in research on devices with layered structures. The circuits constructed from those devices exhibit different types of neuromorphic dynamics under different excitations. However, a single two-terminal device capable of achieving multi-state storage has not yet been reported. Locally active memristors have advantages in generating biologically-inspired neural signals, and various forms of locally active memristor models can produce neuromorphic signals based on spike pulses. The generation of neural signals involves the amplification and computation of stimulus signals, and its working mechanism can be realized using capacitance-controlled memristor oscillators. When a memristor operates in the locally active domain, the output voltage of its third-order circuit undergoes a period-doubling bifurcation as the capacitance in the circuit changes regularly, forming a multi-state mapping between capacitance values and oscillating voltages.
In this paper, the locally active memristor-based third-order circuit is used as a unit to generate neuromorphic signals, thereby forming a biologically-inspired neural operation unit, and an operation network can be formed from such units.  Methods  The mathematical model of the Chua Corsage Memristor proposed by Leon Chua was selected for analysis. The characteristics of its locally active domain were examined, and an appropriate operating point and external components were chosen to establish a third-order memristor chaotic circuit, on which circuit simulation and analysis were then conducted. When the memristor operates in the locally active domain, the oscillator formed by its third-order circuit can simultaneously perform the functions of signal amplification, computation, and storage. In this way, the third-order circuit can act as the nerve cell and the variable capacitors as synapses. This enables the electrical signal and the dielectric capacitor to work in succession, allowing the third-order oscillation circuit of the memristor to function like a neuron, with alternating electrical fields and neurotransmitters forming a brain-like computing and storage system. The secretion of biological neurotransmitters has a threshold characteristic: the membrane threshold voltage controls the secretion of neurotransmitters to the postsynaptic membrane, thereby forming the transmission of neural signals. The step peak value of the oscillation circuit can serve as the trigger voltage for the transfer of the capacitor dielectric.  Results and Discussions  This study utilizes the third-order circuit of a locally active memristor to generate stable period-doubling bifurcation voltage signal oscillations as the external capacitance changes.
The variation of capacitance in the circuit causes different forms of electrical signals to be serially output at the terminals of the memristor, and the voltage amplitude of these signals changes stably in a periodic manner. This results in a stable multi-state mapping relationship between the changed capacitance and the output voltage signal, thereby forming a storage and computing unit, and subsequently a storage and computing network. A structure that enables the dielectric to be transferred to the next stage and change its capacitance value under the control of the modulated voltage threshold, analogous to neurotransmitter secretion, still needs to be realized. The feasibility of using the third-order oscillation circuit of the memristor as a storage and computing unit is expounded, and a storage and computing structure based on the change of capacitance value is obtained.  Conclusions  When the Chua Corsage Memristor operates in its locally active domain, its third-order circuit, powered solely by a voltage-stabilized source, generates stable period-doubling bifurcation oscillations as the external capacitance changes. The serially output oscillating signals exhibit stable voltage amplitudes and periods and have threshold characteristics. The change of the capacitance in the circuit causes different forms of electrical signals to be serially output at the terminals of the memristor, and the voltage amplitude of these signals changes stably in a periodic manner. This results in a stable multi-state mapping relationship between the changed capacitance and the output voltage signal, thereby forming a storage and computing unit, and subsequently a storage and computing network. A structure that transfers the dielectric to the next stage under the control of the modulated voltage threshold, similar to the function of neurotransmitter secretion, still needs to be developed.
The feasibility of using the third-order oscillation circuit of the memristor as a storage and computing unit is established, and a storage and computing structure based on the variation of capacitance value is described.
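The Chua Corsage Memristor equations are not reproduced in the abstract, so no attempt is made to simulate them here. As a loosely analogous toy, a Van der Pol oscillator, whose negative-damping term plays the role of local activity by sustaining a steady oscillation from a constant supply, can be integrated with RK4 to show how a circuit parameter maps to a distinguishable steady waveform, in the spirit of the capacitance-to-voltage-state mapping described above. All dynamics and parameter values are placeholders, not the CCM model.

```python
import numpy as np

def rk4(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, y); k2 = f(t + dt / 2, y + dt / 2 * k1)
    k3 = f(t + dt / 2, y + dt / 2 * k2); k4 = f(t + dt, y + dt * k3)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(w0, mu=0.5, T=60.0, dt=0.001):
    """Van der Pol oscillator: x'' - mu*(1 - x^2)*x' + w0^2*x = 0.
    The negative-damping term stands in for local activity; w0 plays
    the role of the tunable element (the capacitance in the circuit)."""
    f = lambda t, y: np.array([y[1], mu * (1 - y[0] ** 2) * y[1] - w0 ** 2 * y[0]])
    y, xs = np.array([0.1, 0.0]), []
    for i in range(int(T / dt)):
        y = rk4(f, y, i * dt, dt)
        xs.append(y[0])
    return np.array(xs[len(xs) // 2:])   # keep only the steady-state half

def upward_crossings(x):
    """Count upward zero crossings, a simple waveform fingerprint."""
    return int(np.sum((x[:-1] < 0) & (x[1:] >= 0)))

# Distinct parameter values map to distinguishable steady waveforms.
assert upward_crossings(simulate(1.0)) < upward_crossings(simulate(2.0))
```

The point of the toy is only the readout principle: once a self-sustained oscillator's waveform depends reproducibly on a stored parameter, reading the waveform recovers the stored state, which is the multi-state mapping the abstract describes.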
ISAR Sequence Motion Modeling and Fuzzy Attitude Classification Method for Small Sample Space Target
YE Juhang, DUAN Jia, ZHANG Lei
Available online  , doi: 10.11999/JEIT250689
Abstract:
  Objective  With the intensification of space activities, Space Situational Awareness (SSA) is required to ensure national security and collision avoidance. A key task is the classification of space target attitudes to interpret states and predict behavior. Current approaches mainly rely on Ground-Based Inverse Synthetic Aperture Radar (GBISAR), which exhibit certain limitations. Model-driven methods rely on accurate prior models and involve high computational costs, while data-driven methods such as deep learning depend on large annotated datasets, which are difficult to obtain for space targets, and thus perform poorly in small-sample scenarios. To address this, a fuzzy attitude classification (FAC) method is proposed, which integrates temporal motion modeling with fuzzy set theory. The method is designed as a training-free and real-time classifier for rapid deployment under data-constrained conditions.  Methods  The method establishes a mapping between three-dimensional (3D) attitude dynamics and two-dimensional (2D) ISAR features through a framework combining the Horizon Coordinate System (HCS), the UNW orbital system, and the Body-Fixed Reference Frame (BFRF). Attitude changes are modeled as Euler rotations of BFRF relative to UNW. The periodic 3D rotation is projected onto the 2D Range-Doppler plane as circular keypoint trajectories. Fourier series analysis is then used to decompose the motion into one-dimensional (1D) cosine features, where phase encodes angular velocity and amplitude indicates motion magnitude. A 10-point annotation model is employed to represent targets, and dimensionless roll, pitch, and yaw feature vectors are derived. For classification, magnitude- and angle-based criteria are defined and processed by a softmax membership function, which incorporates variance across the sequence to compute fuzzy membership degrees. 
The algorithm operates directly on keypoint sequences, avoids training, and maintains linear computational complexity O(n), enabling real-time application.  Results and Discussions  The FAC method is evaluated on a Ku-band GBISAR simulated dataset of a spinning target. The dataset consists of 36 sequences, each with 36 frames of 512×512 images, divided into a reference set and a testing set. While raw keypoint tracks appear disordered (Fig. 4(a)), the engineered features form clustered patterns (Fig. 4(b)). The variance of the criteria effectively represents motion significance (Fig. 4(c)). Robustness is demonstrated: across nine imaging angles, classification consistency remains 100% within a 0.04 tolerance (Fig. 5(a)). Under noise, consistency is preserved from 10 dB to 1 dB SNR (Fig. 5(b)). With frame loss, 90% consistency is sustained at a 0.1 threshold, with six frames being the minimum for effective classification (Fig. 5(c)). Benchmark comparisons show that FAC outperforms HMM and CNN, maintaining accuracy under noise (Fig. 6(a)), stability under frame loss where HMM degrades to random behavior (Fig. 6(b)), and achieving much lower processing time than both HMM and CNN (Fig. 6(c)).  Conclusions  A fuzzy attitude classification method combining motion modeling and fuzzy reasoning is presented for small-sample space target classification. By mapping multi-coordinate kinematics into interpretable cosine features, the method reduces dependence on prior models and large datasets, while achieving training-free, linear-time operation. Simulations verify robustness across observation angles, SNR levels, and frame availability. Benchmark results confirm superior accuracy, stability, and efficiency compared with HMM and CNN. The FAC method therefore provides a practical solution for real-time, small-sample attitude classification. 
Future work will extend the framework to multi-axis tumbling and validation using measured data, with potential integration of multi-modal observations to further enhance adaptability.
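The Fourier-series step above, which decomposes a projected keypoint trajectory into 1-D cosine features whose amplitude reflects motion magnitude and whose phase and frequency encode angular position and rate, can be sketched as follows. The 36-frame synthetic track and the specific feature definition are our simplification, not the paper's exact feature set.

```python
import numpy as np

def cosine_feature(track):
    """Extract the dominant 1-D cosine component of a keypoint coordinate
    sequence via the FFT: the bin index gives the rotation rate in cycles
    per sequence, the amplitude the motion magnitude, the phase the
    angular offset."""
    x = track - track.mean()              # remove the DC offset
    spec = np.fft.rfft(x)
    k = 1 + np.argmax(np.abs(spec[1:]))   # dominant non-DC frequency bin
    amp = 2 * np.abs(spec[k]) / len(x)
    phase = np.angle(spec[k])
    return k, amp, phase

n = 36                                     # frames per sequence, as above
t = np.arange(n)
track = 5.0 * np.cos(2 * np.pi * 3 * t / n + 0.7)  # synthetic projected keypoint
k, amp, phase = cosine_feature(track)
assert k == 3                              # three rotations over the sequence
assert abs(amp - 5.0) < 1e-6
assert abs(phase - 0.7) < 1e-6
```

Because the feature is three numbers per keypoint coordinate, comparing sequences reduces to comparing small dimensionless vectors, which is what keeps the downstream fuzzy classification training-free and linear-time.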
Detection of Underwater Acoustic Transient Signals under Alpha-Stable Distribution Noise
CHEN Wen, ZOU Nan, ZHANG Guangpu, LI Yanhe
Available online  , doi: 10.11999/JEIT250500
Abstract:
  Objective  Transient signals are generated by state changes of underwater acoustic targets and are difficult to suppress or eliminate, thus becoming an important means for covert underwater target detection. However, practical marine environmental noise exhibits non-Gaussian characteristics, such as impulsive spikes, which severely degrade or even invalidate traditional Gaussian-based detection methods, particularly the energy detection widely used in engineering applications. While existing studies employ nonlinear transformations or fractional lower-order statistics to address non-Gaussian noise, they suffer from limitations such as signal distortion and high computational complexity. To overcome these challenges, the Alpha-stable distribution is adopted to replace traditional Gaussian modeling, and a Data Preprocessing denoising and Short-Time Cross-Correntropy Detection (DP-STCCD) method is proposed to achieve passive detection and Time of Arrival (ToA) estimation for unknown deterministic transient signals in non-Gaussian noise environments.  Methods  The proposed method comprises two stages: data preprocessing denoising and short-time cross-correntropy detection. In the data preprocessing stage, an outlier detection technique based on the Interquartile Range (IQR) is applied. Upper and lower thresholds are calculated to effectively remove impulsive spikes while preserving local signal features. Then multi-stage filtering is employed to further suppress noise: median filtering reconstructs the signal with minimal detail loss, and modified mean filtering eliminates residual spikes by discarding extreme values in local windows. In the detection stage, the denoised signal is segmented into frames. Short-time cross-correntropy based on a Gaussian kernel is computed between adjacent frames to construct detection statistics. A first-order recursive filter estimates background noise to set detection thresholds. Joint amplitude-width decision logic generates detection results. 
ToA estimation is achieved by identifying peaks in the short-time cross-correntropy. This method eliminates dependence on prior noise knowledge and enhances robustness in non-Gaussian environments through data cleaning and information-theoretic feature extraction.  Results and Discussions  Simulations under standard symmetric Alpha-stable distributed noise validate the algorithm’s performance. The data preprocessing denoising algorithm effectively eliminates impulsive spikes while retaining critical time-domain signal characteristics (Fig. 3). After denoising, the detection performance of the energy detector is partially restored, and the peak-to-average ratio of short-time cross-correntropy features improves by 10 dB (Fig. 4, Fig. 5). Experimental results demonstrate that DP-STCCD significantly outperforms Data Preprocessing denoising-Energy Detection (DP-ED) in both detection probability and ToA estimation accuracy under identical conditions. At a characteristic index of α = 1.5 and a Generalized SNR (GSNR) of -11 dB, DP-STCCD achieves a 30.2% higher detection probability and an 18.4% improvement in ToA estimation precision compared to DP-ED (Fig. 6, Fig. 9(a)). These results validate the effectiveness and robustness of the proposed method in complex noise environments.  Conclusions  A joint detection method, DP-STCCD, integrating data preprocessing denoising and short-time cross-correntropy features is proposed to address underwater transient signal detection under Alpha-stable distributed noise. Preprocessing techniques, including IQR-based outlier detection and multi-stage filtering, effectively suppress impulsive interference while preserving key signal characteristics. The short-time cross-correntropy feature enhances detection sensitivity and ToA estimation accuracy. 
Results indicate that the proposed method outperforms traditional energy detectors under low GSNR conditions and maintains superior stability across varying characteristic indices. This study provides a novel approach for covert underwater target detection in non-Gaussian noise environments. Future work will focus on optimizing the algorithm for practical marine noise interference to enhance its engineering applicability.
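Two of the ingredients above, IQR-based spike removal and Gaussian-kernel cross-correntropy between frames, can be sketched as follows. The fence factor, kernel width, frame length, and test signal are illustrative choices of ours, not the paper's settings.

```python
import numpy as np

def iqr_clean(x, k=1.5):
    """Replace samples outside the IQR fences (impulsive spikes)
    with the median, preserving the rest of the waveform."""
    q1, q3 = np.percentile(x, [25, 75])
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    out = x.copy()
    out[(x < lo) | (x > hi)] = np.median(x)
    return out

def cross_correntropy(a, b, sigma=1.0):
    """Gaussian-kernel cross-correntropy between two frames:
    V(a, b) = mean(exp(-(a - b)^2 / (2 * sigma^2)))."""
    return float(np.mean(np.exp(-(a - b) ** 2 / (2 * sigma ** 2))))

clean = np.sin(2 * np.pi * np.arange(400) / 40)         # periodic test signal
noisy = clean.copy()
noisy[[50, 180, 300]] += np.array([40.0, -35.0, 55.0])  # impulsive spikes
den = iqr_clean(noisy)
assert np.max(np.abs(noisy)) > 30 and np.max(np.abs(den)) < 5  # spikes removed
# Adjacent frames of a repeating waveform yield correntropy near 1.
assert cross_correntropy(clean[:40], clean[40:80]) > 0.99
```

The bounded Gaussian kernel is what gives correntropy its robustness: a residual outlier contributes at most 1 to the average instead of its squared amplitude, unlike the energy statistic.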
Two-Channel Joint Coding Detection for Cyber-Physical Systems against Integrity Attacks
MO Xiaolei, ZENG Weixin, FU Jiawei, DOU Keqin, WANG Yanwei, SUN Ximing, LIN Sida, SUI Tianju
Available online  , doi: 10.11999/JEIT250729
Abstract:
  Objective  With the rapid development of computing, control, and sensing technologies, cyber-physical systems (CPS), which deeply integrate information and physical processes, have been widely used in various industries, such as infrastructure, aviation, energy, healthcare, manufacturing, and transportation. However, due to the real-time and heterogeneous nature of information and physical processes, CPS are more vulnerable to attacks and damage when their components communicate and interact with each other. CPS attacks can be categorized into three major types, namely availability attacks, integrity attacks, and confidentiality attacks, according to the three elements of information security. Integrity attacks are launched against the data flow of CPS to destroy the consistency of input and output data, and are more difficult to detect and defend against than other CPS attacks due to their changeable and covert nature. Currently, the mainstream detection methods involve actively changing control signals, sensing signals, or system models; they can only address the detection of a single class of attacks and suffer from degraded control performance, model complexity, and high latency due to the active-change approach.  Methods  In this paper, a joint additive-multiplicative data coding detection scheme for the two control and output channels is proposed and verified against three typical integrity attacks: the control channel bias attack, the output channel replay attack, and the two-channel covert attack. These attacks achieve stealthiness by partially or fully obtaining and controlling system information, so that the statistic of the residual-based χ² detector stays below the threshold. 
In this paper, additive positive/negative watermark pairs and multiplicative coding/decoding matrix pairs are innovatively arranged on the two sides of each channel. The unknown signals and components introduce information uncertainty for the attacker, so the statistical characteristics of the residuals deviate from the values the attacker carefully designed from known system information when constructing the integrity attack. In addition, because the watermark pairs and matrix pairs are introduced through different mechanisms, they remain mutually decoupled; their positive-negative or mutually inverse form leaves the control performance of the normal system unaffected in the absence of attacks, and their time-varying form prevents the attacker from reconstructing the detection components.  Results and Discussions  Simulation experiments on the flight trajectory of an aerial vehicle are designed to verify both the effect of integrity attacks on the trajectory and the effectiveness of the proposed scheme. Based on Newton's equations of motion, a point-mass trajectory model of the vehicle is established; attitude dynamics and rotational motion are ignored to focus on trajectory analysis. The detection results with and without the proposed scheme are compared for the three attacks (Fig.2, Fig.3, Fig.4), demonstrating the effectiveness and advancement of the designed scheme.  Conclusions  This paper investigates the detection of integrity attacks in CPS. It models three typical attack types (bias, replay, and covert attacks) and identifies the necessary conditions for successfully executing each. Building on this foundation, it innovatively proposes a detection scheme that combines additive watermarks with multiplicative encoding matrices and successfully detects all three attack types. 
The proposed scheme employs additive positive-negative watermark pairs and multiplicative encoding/decoding matrix pairs to achieve detection without compromising normal control performance, and adopts a time-varying design that prevents attackers from reconstructing the watermark and matrix pairs. Finally, a flight-trajectory simulation of an aerial vehicle demonstrates the effectiveness and advancement of the proposed detection scheme.
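The stealth condition of the replay attack and the way an additive watermark pair exposes it can be illustrated with a minimal scalar simulation (the plant, controller and estimator gains, watermark variance, and noise levels below are hypothetical choices for illustration, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar plant x' = a*x + u + w with measurement y = x + v
a, q, r, n = 0.9, 0.1, 0.1, 2000

def run(attack, watermark):
    """Return the normalized residual energy (a chi^2-style statistic)
    over the attack window of an output-channel replay simulation."""
    x, xhat = 0.0, 0.0
    recorded, residuals = [], []
    for k in range(n):
        u = -0.8 * xhat                     # simple stabilizing controller
        x = a * x + u + rng.normal(0.0, np.sqrt(q))
        y = x + rng.normal(0.0, np.sqrt(r))
        d = rng.normal(0.0, 0.3) if watermark else 0.0
        y_sent = y + d                      # +d injected before the channel
        recorded.append(y_sent)             # what a replay attacker records
        if attack and k >= n // 2:
            y_sent = recorded[k - n // 2]   # replay of old channel data
        y_rx = y_sent - d                   # -d removed after the channel
        y_pred = a * xhat + u               # one-step output prediction
        res = y_rx - y_pred
        residuals.append(res)
        xhat = y_pred + 0.5 * res           # fixed-gain estimator update
    window = np.array(residuals[n // 2:])
    return float(np.mean(window ** 2) / (q + r))

stat_replay = run(attack=True, watermark=False)  # replay stays near nominal
stat_marked = run(attack=True, watermark=True)   # watermark mismatch exposed
```

Without the watermark, the replayed outputs reproduce nominal residual statistics and the attack stays stealthy; with it, the mismatch between the current and recorded watermarks inflates the residual energy and trips the detector, while exact +d/-d cancellation leaves the attack-free loop untouched.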
A Fake Attention Map-Driven Multi-Task Deepfake Video Detection Model
LIU Pengyu, ZHENG Tianyang, DONG Min Liu
Available online  , doi: 10.11999/JEIT250926
Abstract:
  Objective  With the rapid advancement of synthetic media generation, deepfake detection has become a critical challenge in multimedia forensics and information security. Most high-quality detection methods rely on supervised binary classification models with implicit attention mechanisms. Although such methods can automatically learn discriminative features and identify manipulation traces, their performance degrades significantly when facing unseen forgery techniques. The lack of explicit guidance in feature fusion leads to limited sensitivity to subtle artifacts and poor cross-domain generalization. To address these limitations, a novel detection framework named F-BiFPN-MTLNet is proposed. The framework aims to achieve high detection accuracy and strong generalization by introducing an explicit forgery-attention-guided multi-scale feature fusion mechanism and a multi-task learning strategy. This research is of great significance for improving the interpretability and robustness of deepfake detection models, especially in real-world scenarios where forgeries are diverse and evolving.  Methods  The proposed F-BiFPN-MTLNet consists of two main components: a Forgery-attention-guided Bidirectional Feature Pyramid Network (F-BiFPN) and a Multi-Task Learning Network (MTLNet). The F-BiFPN (Fig.1) is designed to explicitly guide the fusion of multi-scale feature representations from different backbone layers. Instead of performing simple top-down and bottom-up fusion, a forgery-attention map is introduced to supervise the fusion process. The map highlights potential manipulation regions and applies adaptive weighting to each feature level, ensuring that both semantic and spatial details are preserved while redundant information is suppressed. This attention-guided fusion enhances the sensitivity of the network to fine-grained forged traces and improves representation quality.  
Results and Discussions  Experiments are conducted on multiple benchmark datasets, including FaceForensics++, DFDC, and Celeb-DF (Table 1). The proposed F-BiFPN-MTLNet achieves consistent improvements over state-of-the-art approaches in both Area Under the Curve (AUC) and Average Precision (AP) metrics (Table 2). The results indicate that the introduction of attention-guided fusion significantly enhances the detection of subtle manipulations, while the multi-task learning structure improves model stability across different forgery types. Ablation analyses (Table 3) confirm the complementary contributions of the two modules. Removing F-BiFPN reduces sensitivity to local artifacts, whereas omitting the self-consistency branch weakens robustness under cross-dataset evaluation. Visualization results (Fig.3) further demonstrate that F-BiFPN-MTLNet effectively focuses on forged regions and produces interpretable attention maps aligned with actual manipulation areas. The framework thus achieves an improved balance between accuracy, generalization, and transparency, while maintaining computational efficiency suitable for practical forensic applications.  Conclusions  In this study, a forgery-attention-guided weighted bidirectional feature pyramid network combined with a multi-task learning framework is proposed for robust and interpretable deepfake detection. The F-BiFPN explicitly supervises multi-scale feature fusion through forgery-attention maps, reducing redundancy and emphasizing informative regions. The MTLNet introduces a learnable mask branch and a self-consistency branch, jointly enhancing localization accuracy and cross-domain robustness. Experimental results confirm that the proposed model surpasses existing baselines in AUC and AP metrics while maintaining strong interpretability through visualized attention maps. Overall, F-BiFPN-MTLNet effectively balances fine-grained localization, detection reliability, and generalization ability. 
Its explicit attention and multi-task strategies provide a new perspective for designing interpretable and resilient deepfake detection systems. Future work will focus on extending the framework to weakly supervised and unsupervised scenarios, reducing dependency on pixel-level annotations, and exploring adversarial training techniques to further improve adaptability against evolving forgery methods.
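The attention-guided weighted fusion idea can be sketched with a small numpy example (the fusion rule below follows generic BiFPN-style fast normalized fusion gated by an attention map; it is an assumption for illustration, not the paper's exact F-BiFPN formulation):

```python
import numpy as np

def attention_weighted_fusion(levels, raw_weights, attn_map, eps=1e-4):
    """Fast-normalized weighted fusion of already-resized feature levels,
    modulated by a forgery-attention map that boosts suspected regions."""
    w = np.maximum(np.asarray(raw_weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)                    # normalized level weights
    fused = sum(wi * f for wi, f in zip(w, levels))
    return fused * (1.0 + attn_map), w         # attention-gated output

# Toy example: three 4x4 feature levels, attention peaked at one pixel
levels = [np.full((4, 4), s) for s in (1.0, 2.0, 3.0)]
attn = np.zeros((4, 4))
attn[0, 0] = 1.0
fused, w = attention_weighted_fusion(levels, [0.2, 0.3, 0.5], attn)
```

The non-negative normalized weights play the role of learnable per-level importances, and the attention gate amplifies responses only where manipulation is suspected.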
For Electric Power Disaster Early Warning Scenarios: A Large Model and Lightweight Models Joint Deployment Scheme Based on Limited Spectrum Resources
CHEN Lei, HUANG Zaichao, LIU Chuan, ZHANG Weiwei
Available online  , doi: 10.11999/JEIT250321
Abstract:
  Objective  Traditional approaches to electric power disaster early warning rely on dedicated, scenario-specific systems, leading to redundant data collection and high development costs. To enhance accuracy and reduce costs, comprehensive early warning frameworks based on Artificial Intelligence (AI) large models have become an important research direction. However, large models are typically deployed in the cloud, and limited wireless spectrum resources constrain the uploading of complete data streams. Deploying lightweight models at terminal devices through substantial model compression can alleviate spectrum limitations but inevitably compromises model performance.  Methods  To address these limitations, this study proposes a cloud–terminal collaborative joint deployment scheme integrating large and lightweight models. In this framework, a high-precision large model is deployed in the cloud to process complex tasks, whereas lightweight models are deployed at terminal devices to handle simple tasks. Task offloading decisions are governed by a confidence threshold that dynamically determines whether computation occurs locally or in the cloud. A power-domain Non-Orthogonal Multiple Access (NOMA) technique is incorporated to allow multiple terminals to share identical time–frequency resources, thereby improving system detection accuracy by increasing the proportion of tasks processed in the cloud. Additionally, for scenarios considering (i) only uplink shared-channel bandwidth constraints and (ii) both terminal access collision constraints and shared-channel bandwidth constraints, corresponding algorithms are developed to determine the maximum number of terminals supported under a given bandwidth and to identify the optimal confidence threshold that maximizes detection accuracy.  
  Results and Discussions  (1) As shown in Figures 3(a) and 3(b), when the uplink shared-channel bandwidth \begin{document}$ W $\end{document} increases, the number of supported terminals rises for both the proposed scheme and the Orthogonal Multiple Access (OMA)-based scheme. This occurs because a larger \begin{document}$ W $\end{document} enables more terminals with low-confidence detection results to upload raw data to the cloud for further processing, thereby enhancing detection accuracy and reducing the missed detection rate. (2) In contrast, the number of supported terminals \begin{document}$ M $\end{document} in the pure on-device processing scheme remains constant with varying \begin{document}$ W $\end{document}, as this scheme relies entirely on the lightweight model deployed at the terminal and is therefore independent of bandwidth. (3) Compared with the OMA-based and pure on-device schemes, the proposed approach markedly increases the number of supported terminals, confirming that non-orthogonal reuse of time–frequency resources and cloud–terminal collaborative deployment of large and lightweight models are key to improving system performance. (4) As shown in Table 3, an increase in the number of preambles reduces the probability of terminal access collisions, allowing more terminals to successfully transmit raw data to the cloud for detection. Therefore, the missed detection rate decreases, and overall detection accuracy improves.  Conclusions  For electric power disaster early warning scenarios, this study integrates power-domain NOMA and proposes a cloud–terminal collaborative deployment scheme combining a large model with lightweight models. By dynamically determining whether tasks are processed locally by a lightweight model or in the cloud by a large model, the system achieves optimized detection accuracy and a reduced missed detection rate. 
Numerical results indicate that, under given uplink shared-channel bandwidth, minimum detection accuracy, and maximum missed detection rate, the introduction of power-domain NOMA effectively increases the number of supported terminals. Furthermore, when both terminal access collision constraints and shared-channel bandwidth constraints are considered, optimizing the confidence threshold to regulate the number of terminals transmitting data to the cloud further enhances detection accuracy and reduces the missed detection rate.
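The confidence-threshold offloading decision can be sketched as a small simulation (the model accuracies, slot budget, and uniform confidence distribution below are hypothetical, and the bandwidth cap is reduced to a simple per-round upload limit):

```python
import random

random.seed(0)

ACC_CLOUD, ACC_EDGE = 0.98, 0.90   # hypothetical large/lightweight accuracies
ACC_EDGE_LOWCONF = 0.70            # local accuracy on low-confidence samples

def system_accuracy(tau, n_terminals, cloud_slots):
    """Expected accuracy when samples with lightweight-model confidence
    below tau are offloaded, capped by the shared uplink bandwidth."""
    offloaded = 0
    correct = 0.0
    for _ in range(n_terminals):
        conf = random.random()     # lightweight model's confidence score
        if conf < tau and offloaded < cloud_slots:
            offloaded += 1
            correct += ACC_CLOUD   # processed by the cloud large model
        elif conf >= tau:
            correct += ACC_EDGE    # confident enough to stay on-device
        else:                      # low confidence but no bandwidth left
            correct += ACC_EDGE_LOWCONF
    return correct / n_terminals

# Sweep the confidence threshold: too low wastes cloud accuracy, too high
# exhausts the bandwidth and strands low-confidence samples locally.
best_acc, best_tau = max(
    (system_accuracy(t / 10, 1000, 300), t / 10) for t in range(11))
```

The sweep recovers the qualitative result: accuracy peaks at an intermediate threshold matched to the available uplink capacity.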
Design and Optimization for Orbital Angular Momentum-Based Wireless-Powered NOMA Communication System
CHEN Ruirui, CHEN Yu, RAN Jiale, SUN Yanjing, LI Song
Available online  , doi: 10.11999/JEIT250634
Abstract:
  Objective  The Internet of Things (IoT) requires not only interconnection among devices but also seamless connectivity among users, information, and things. Ensuring stable operation and extending the lifespan of IoT Devices (IDs) through continuous power supply have become urgent challenges in IoT-driven Sixth-Generation (6G) communications. Radio Frequency (RF) signals can simultaneously transmit information and energy, forming the basis for Simultaneous Wireless Information and Power Transfer (SWIPT). Non-Orthogonal Multiple Access (NOMA), a key technology in Fifth-Generation (5G) communications, enables multiple users to share the same time and frequency resources. Efficient wireless-powered NOMA communication requires a Line-of-Sight (LoS) channel. However, the strong correlation in LoS channels severely limits the degree of freedom, making it difficult for conventional spatial multiplexing to achieve capacity gains. To address this limitation, this study designs an Orbital Angular Momentum (OAM)-based wireless-powered NOMA communication system. By exploiting OAM mode multiplexing, multiple data streams can be transmitted independently through orthogonal OAM modes, thereby significantly enhancing communication capacity in LoS channels.  Methods  The OAM-based wireless-powered NOMA communication system is designed to enable simultaneous energy transfer and multi-channel information transmission for IDs under LoS conditions. Under the constraints of the communication capacity threshold and the harvested energy threshold, this study formulates a sum-capacity maximization problem by converting harvested energy into the achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal Power-Splitting (PS) factor is derived, and the optimal power allocation is obtained using the subgradient method. 
The transmitting Uniform Circular Array (UCA) employs the Movable Antenna (MA) technique to adjust both position and array angle. To maintain system performance under typical parallel misalignment conditions, a beam-steering method is investigated.  Results and Discussions  Simulation results demonstrate that the proposed OAM-based wireless-powered NOMA communication system effectively enhances capacity performance compared with conventional wireless communication systems. As the OAM mode increases, the sum capacity of the ID decreases. This occurs because higher OAM modes exhibit stronger hollow divergence characteristics, resulting in greater energy attenuation of the received OAM signals (Fig. 3). The sum capacity of the ID increases with the PS factor (Fig. 4). However, as the harvested energy threshold increases, the system’s sum capacity decreases (Fig. 5). When the communication capacity threshold increases, the sum capacity first rises and then gradually declines (Fig. 6). In power allocation optimization, allocating more power to the ID with the best channel condition further improves the total system capacity.  Conclusions  To enhance communication capacity under LoS conditions, this study designs an OAM-based wireless-powered NOMA communication system that employs mode multiplexing to achieve independent multi-channel information transmission. On this basis, a sum-capacity maximization problem is formulated under communication capacity and harvested energy threshold constraints by transforming harvested energy into achievable uplink information capacity. The optimization problem is decomposed into two subproblems. A closed-form expression for the optimal PS factor is derived, and the optimal power allocation is obtained using the subgradient method. 
In future work, the MA technique will be integrated into the proposed OAM-based wireless-powered NOMA system to further optimize sum-capacity performance based on the three-dimensional spatial configuration and adjustable array angle.
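A simplified numerical sketch of the power-splitting trade-off follows (the mode gains, conversion efficiency, and thresholds are hypothetical, and the bisection is a generic way to sit on the feasibility boundary rather than the paper's closed-form PS expression):

```python
import math

# Hypothetical gains for three multiplexed OAM modes; higher modes diverge
# more (hollow-beam effect), hence the decreasing values.
g_dl = [1.0, 0.5, 0.25]          # downlink mode gains
g_ul = [0.8, 0.4, 0.2]           # uplink mode gains
p_dl, eta, sigma2 = 1.0, 0.6, 0.05
C_th = 1.0                       # downlink capacity threshold (bit/s/Hz)

def dl_capacity(rho):
    """(1 - rho) of the received power feeds the information decoder."""
    return sum(math.log2(1 + (1 - rho) * p_dl * g / sigma2) for g in g_dl)

def ul_capacity(rho):
    """Harvested energy eta*rho*sum(p*g) powers the uplink, split evenly
    across the orthogonal OAM modes."""
    e = eta * rho * sum(p_dl * g for g in g_dl)
    return sum(math.log2(1 + (e / len(g_ul)) * g / sigma2) for g in g_ul)

# The uplink sum capacity grows with rho, so the optimal PS factor sits on
# the downlink-threshold boundary; locate it by bisection.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if dl_capacity(mid) >= C_th else (lo, mid)
rho_opt = lo
```

This reproduces the qualitative behavior reported above: raising the PS factor increases the harvested-energy-driven sum capacity until the downlink capacity constraint binds.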
Performance Analysis of Spatial-Reference-Signal-Based Digital Interference Cancellation Systems
XIN Yedi, HE Fangmin, GE Songhu, XING Jinling, GUO Yu, CUI Zhongpu
Available online  , doi: 10.11999/JEIT250679
Abstract:
  Objective  With the rapid development of wireless communications, an increasing number of transceivers are deployed on platforms with limited spatial and spectral resources. Restrictions in frequency and spatial isolation allow high-power local transmitters to couple signals into nearby high-sensitivity receivers, causing co-site interference. Interference cancellation serves as an effective mitigation technique, whose performance depends on precise acquisition of a reference signal representing the interference waveform. Compared with digital sampling, Radio Frequency (RF) sampling enables simpler implementation. However, existing RF-based approaches are generally restricted to low-power communication systems. In high-power RF systems, RF sampling faces critical challenges, including excessive sampling power loss and high integration complexity. Therefore, developing new sampling methods and cancellation architectures suitable for high-power RF systems is of substantial theoretical and practical value.  Methods  To overcome the limitations of conventional high-power RF interference sampling methods based on couplers, a spatial-reference-based digital cancellation architecture is proposed. A directional sampling antenna and its associated link are positioned near the transmitter to acquire the reference signal. This configuration, however, introduces spatial noise, link noise, and possible multipath effects, which can degrade cancellation performance. A system model is developed, and closed-form expressions for the cancellation ratio under multipath conditions are derived. The validity of these expressions is verified through Monte Carlo simulations using three representative modulated signals. Furthermore, a systematic analysis is conducted to evaluate the effects of key system parameters on cancellation performance.  
  Results and Discussions  Based on the proposed spatial-reference-based digital cancellation architecture, analytical expressions for the cancellation ratio are derived and validated through extensive simulations. These expressions enable systematic evaluation of the key performance factors. For three representative modulation schemes, the cancellation ratio shows excellent consistency between theoretical predictions and simulation results under various conditions, including receiver and sampling channel Interference-to-Noise Ratios (INRs), time-delay mismatch errors, and filter tap numbers (Figs. 2–4). The established theoretical framework is further applied to analyze the effects of system parameters. Simulations quantitatively assess (1) the influence of filter tap number, multipath delay spread, and the number of multipaths on cancellation performance in multipath environments (Figs. 5–7), and (2) the upper performance bounds and contour characteristics under different INR combinations in the receiver and sampling channels (Figs. 8–9).  Conclusions  To reduce the high deployment complexity and substantial insertion loss associated with coupler-based RF interference sampling in high-power systems, a digital interference cancellation architecture based on spatial reference signals is proposed. Closed-form expressions and performance bounds for the cancellation ratio of rectangular band-limited interference under multipath conditions are derived. Simulation results demonstrate that the proposed expressions provide high accuracy in representative scenarios. Based on the analytical findings, the effects of key parameters are examined, including INRs in receiver and sampling channels, filter tap length, multipath delay spread, number of paths, and time-delay mismatch. The results provide practical insights that support the design and optimization of spatial reference–based digital interference cancellation systems.
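How noise in the spatial sampling channel caps the achievable cancellation ratio can be illustrated with a least-squares Monte Carlo sketch (the coupling taps, noise levels, and tap count below are hypothetical, and white noise stands in for the band-limited interference):

```python
import numpy as np

rng = np.random.default_rng(2)
N, taps = 20000, 8

# Interference source and a 3-tap coupling channel to the victim receiver
s = rng.normal(size=N)
rx = np.convolve(s, [1.0, 0.4, 0.1])[:N] + rng.normal(scale=0.05, size=N)
# Spatial sampling antenna: reference = interference + sampling-link noise
ref = s + rng.normal(scale=0.05, size=N)

# Least-squares (Wiener) tap estimate, then digital subtraction
X = np.stack([ref[taps - 1 - i: N - i] for i in range(taps)], axis=1)
d = rx[taps - 1:]
w, *_ = np.linalg.lstsq(X, d, rcond=None)
resid = d - X @ w
cr_db = 10 * np.log10(np.mean(d ** 2) / np.mean(resid ** 2))
```

The residual floor is set jointly by the receiver noise and the reference-channel noise passed through the estimated taps, which is why the sampling-channel INR bounds the cancellation ratio.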
A Two-Stage Framework for CAN Bus Attack Detection by Fusing Temporal and Deep Features
TAN Mingming, ZHANG Heng, WANG Xin, LI Ming, ZHANG Jian, YANG Ming
Available online  , doi: 10.11999/JEIT250651
Abstract:
  Objective  The Controller Area Network (CAN), the de facto standard for in-vehicle communication, is inherently vulnerable to cyberattacks. Existing Intrusion Detection Systems (IDSs) face a fundamental trade-off: achieving fine-grained classification of diverse attack types often requires computationally intensive models that exceed the resource limitations of on-board Electronic Control Units (ECUs). To address this problem, this study proposes a two-stage attack detection framework for the CAN bus that fuses temporal and deep features. The framework is designed to achieve both high classification accuracy and computational efficiency, thereby reconciling the tension between detection performance and practical deployability.  Methods  The proposed framework adopts a “detect-then-classify” strategy and incorporates two key innovations. (1) Stage 1: Temporal Feature-Aware Anomaly Detection. Two custom features are designed to quantify anomalies: Payload Data Entropy (PDE), which measures content randomness, and ID Frequency Mean Deviation (IFMD), which captures behavioral deviations. These features are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network that exploits contextual temporal information to achieve high-recall anomaly detection. (2) Stage 2: Deep Feature-Based Fine-Grained Classification. Triggered only for samples flagged as anomalous, this stage employs a lightweight one-dimensional ParC1D-Net. The core ParC1D Block (Fig. 4) integrates depthwise separable one-dimensional convolution, Squeeze-and-Excitation (SE) attention, and a Feed-Forward Network (FFN), enabling efficient feature extraction with minimal parameters. Stage 1 is optimized using BCEWithLogitsLoss, whereas Stage 2 is trained with Cross-Entropy Loss.  Results and Discussions  The efficacy of the proposed framework is evaluated on public datasets. (1) State-of-the-art performance. 
On the Car-Hacking dataset (Table 5), an accuracy and F1-score of 99.99% are achieved, exceeding advanced baselines. On the more challenging Challenge dataset (Table 6), superior accuracy (99.90%) and a competitive F1-score (99.70%) are also obtained. (2) Feature contribution analysis. Ablation studies (Tables 7 and 8) confirm the critical role of the proposed features. Removal of the IFMD feature results in the largest performance reduction, highlighting the importance of behavioral modeling. A synergistic effect is observed when PDE and IFMD are applied together. (3) Spatiotemporal efficiency. The complete model remains lightweight at only 0.39 MB. Latency tests (Table 9) demonstrate real-time capability, with average detection times of 0.62 ms on a GPU and 0.93 ms on a simulated CPU (batch size = 1). A system-level analysis (Section 3.5.4) further shows that the two-stage framework is approximately 1.65 times more efficient than a single-stage model in a realistic sparse-attack scenario.  Conclusions  This study establishes the two-stage framework as an effective and practical solution for CAN bus intrusion detection. By decoupling detection from classification, the framework resolves the trade-off between accuracy and on-board deployability. Its strong performance, combined with a minimal computational footprint, indicates its potential for securing real-world vehicular systems. Future research could extend the framework and explore hardware-specific optimizations.
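The two custom features can be sketched as follows (the exact formulas are assumptions inferred from the feature names; window handling and normalization may differ from the paper):

```python
import math
from collections import defaultdict

def payload_data_entropy(payload: bytes) -> float:
    """PDE sketch: Shannon entropy (bits/byte) of a CAN frame payload,
    quantifying content randomness."""
    counts = defaultdict(int)
    for b in payload:
        counts[b] += 1
    n = len(payload)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def id_freq_mean_deviation(window_ids, baseline_freq):
    """IFMD sketch: mean absolute deviation of per-ID frequencies in a
    window from a learned baseline (hypothetical formulation)."""
    freq = defaultdict(int)
    for i in window_ids:
        freq[i] += 1
    n = len(window_ids)
    ids = set(freq) | set(baseline_freq)
    return sum(abs(freq[i] / n - baseline_freq.get(i, 0.0))
               for i in ids) / len(ids)
```

A constant payload yields zero entropy while a uniformly varying one approaches 8 bits/byte, and flooding or suppressing a CAN ID raises the IFMD value, which is the behavioral deviation the BiLSTM stage consumes.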
A One-Dimensional 5G Millimeter-Wave Wide-Angle Scanning Array Antenna Using an AMC Structure
MA Zhangang, ZHANG Qing, FENG Sirun, ZHAO Luyu
Available online  , doi: 10.11999/JEIT250719
Abstract:
  Objective  With the rapid advancement of 5G millimeter-wave technology, antennas are required to achieve high gain, wide beam coverage, and compact size, particularly in environments characterized by strong propagation loss and blockage. Conventional millimeter-wave arrays often face difficulties in reconciling wide-angle scanning with high gain and broadband operation due to element coupling and narrow beamwidths. To overcome these challenges, this study proposes a one-dimensional linear array antenna incorporating an Artificial Magnetic Conductor (AMC) structure. The AMC’s in-phase reflection is exploited to improve bandwidth and gain while enabling wide-angle scanning of ±80° at 26 GHz. By adopting a 0.4-wavelength element spacing and stacked topology, the design provides an effective solution for 5G millimeter-wave terminals where spatial constraints and performance trade-offs are critical. The findings highlight the potential of AMC-based arrays to advance antenna technology for future high-speed, low-latency 5G applications by combining broadband operation, high directivity, and broad coverage within compact form factors.  Methods  This study develops a high-performance single-polarized one-dimensional linear millimeter-wave array antenna through a multi-layered structural design integrated with AMC technology. The design process begins with theoretical analysis of the pattern multiplication principle and array factor characteristics, which identify 0.4-wavelength element spacing as an optimal balance between wide-angle scanning and directivity. A stacked three-layer antenna unit is then constructed, consisting of square patch radiators on the top layer, a cross-shaped coupling feed structure in the middle layer, and an AMC-loaded substrate at the bottom. The AMC provides in-phase reflection in the 21–30 GHz band, enhancing bandwidth and suppressing surface wave coupling. 
Full-wave simulations (HFSS) are performed to optimize AMC dimensions, feed networks, and array layout, confirming a bandwidth of 23.7–28 GHz, a peak gain of 13.9 dBi, and a scanning capability of ±80°. A prototype is fabricated using printed circuit board technology and evaluated with a vector network analyzer and anechoic chamber measurements. Experimental results agree closely with simulations, demonstrating an operational bandwidth of 23.3–27.7 GHz, isolation better than −15 dB, and scanning coverage up to ±80°. These results indicate that the synergistic interaction between AMC-modulated radiation fields and the array coupling mechanism enables a favorable balance among wide bandwidth, high gain, and wide-angle scanning.  Results and Discussions  The influence of the array factor on directional performance is analyzed, and the maximum array factor is observed when the element spacing is between 0.4λ and 0.46λ (Fig. 2). The in-phase reflection of the AMC structure in the 21–30 GHz range significantly enhances antenna characteristics, broadening the bandwidth by 50% compared with designs without AMC and increasing the gain at 26 GHz by 1.5 dBi (Fig. 10, Fig. 13). The operational bandwidth of 23.3–27.7 GHz is confirmed by measurements (Fig. 17a). When the element spacing is optimized to 4.6 mm (0.4λ) and the coupling radiation mechanisms are adjusted, the H-plane half-power beamwidth (HPBW) of the array elements is extended to 180° (Fig. 8, Fig. 9), with a further gain improvement of 0.6 dBi at the scanning edges (Fig. 11b). The three-layer stacked structure—comprising the radiation, isolation, and AMC layers—achieves isolation better than −15 dB (Fig. 17a). Experimental validation demonstrates wide-angle scanning capability up to ±80°, showing close agreement between simulated and measured results (Fig. 11, Fig. 17b). 
The proposed antenna is therefore established as a compact, high-performance solution for 5G millimeter-wave terminals, offering wide bandwidth, high gain, and broad scanning coverage.  Conclusions  A one-dimensional linear wide-angle scanning array antenna based on an AMC structure is presented for 5G millimeter-wave applications. Through theoretical analysis, simulation optimization, and experimental validation, balanced improvement in broadband operation, high gain, and wide-angle scanning is achieved. Pattern multiplication theory and array factor analysis are applied to determine 0.4-wavelength element spacing as the optimal compromise between scanning angle and directivity. A stacked three-layer configuration is adopted, and the AMC’s in-phase reflection extends the bandwidth to 23.7–28.5 GHz, representing a 50% increase. Simulation and measurement confirm ±80° scanning at 26 GHz with a peak gain of 13.8 dBi, which is 1.3 dBi higher than that of non-AMC designs. The close consistency between experimental and simulated results verifies the feasibility of the design, providing a compact and high-performance solution for millimeter-wave antennas in mobile communication and vehicular systems. Future research is expected to explore dual-polarization integration and adaptation to complex environments.
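The array-factor reasoning behind the 0.4-wavelength spacing can be reproduced with a short uniform-linear-array calculation (element count and comparison spacing are illustrative):

```python
import cmath
import math

def array_factor(n, d_wl, theta_deg, scan_deg):
    """|AF| of an n-element uniform linear array with element spacing d_wl
    (in wavelengths), phase-steered toward scan_deg."""
    psi = 2 * math.pi * d_wl * (math.sin(math.radians(theta_deg))
                                - math.sin(math.radians(scan_deg)))
    return abs(sum(cmath.exp(1j * k * psi) for k in range(n)))

# At 0.4-wavelength spacing the full array gain is preserved when steering
# to 80 degrees, whereas a 0.6-wavelength array steered to the same angle
# pulls a near-full-amplitude grating lobe into visible space.
peak_04 = array_factor(8, 0.4, 80, 80)
grating_06 = max(array_factor(8, 0.6, t, 80) for t in range(-90, -30))
lobe_04 = max(array_factor(8, 0.4, t, 80) for t in range(-90, 21))
```

The sub-half-wavelength spacing is what keeps grating lobes out of visible space at extreme scan angles, at the cost of somewhat stronger mutual coupling, which the stacked AMC design then mitigates.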
Research on Interference Suppression Algorithms for Millimeter-Wave Radar in Multi-Interference Environments
TAN Haonan, DONG Mei, CHEN Boxiao
Available online  , doi: 10.11999/JEIT250617
Abstract:
  Objective  With the widespread application of millimeter-wave radar in intelligent driving, mutual interference among radars has become increasingly prominent. Interference signals appear as sharp pulses in the time domain and elevated background noise in the frequency domain, severely degrading target information acquisition and threatening road traffic safety. To address this challenge, this paper proposes a joint envelope recovery–based signal reconstruction algorithm that exploits the time-domain characteristics of signals to enhance target detection performance in multi-interference environments.  Methods  The proposed algorithm consists of two core steps. Step 1: Interference region detection. A dual-criterion mechanism, combining interference envelope detection with transition point detection within the envelope, is employed. This approach substantially improves the accuracy of detecting both interference regions and useful signal segments in multi-interference environments. Step 2: Signal reconstruction. The detected useful signal segments and interference-free portions are used to reconstruct the interference regions. To ensure continuity and improve reconstruction accuracy, the Hilbert transform is applied to perform normalized envelope amplitude coordination on the reconstructed signal.  Results and Discussions  The algorithm first detects interference regions and useful signal segments with high precision through the dual-criterion mechanism, and then reconstructs the interference regions using the detected segments. Simulation results show that the algorithm achieves an interference detection accuracy of 93.7% and a useful signal segment detection accuracy of 97.2%, exceeding comparative algorithms (Table 3). The reconstructed signal effectively eliminates sharp interference pulses in the time domain, smooths the signal amplitude, and markedly improves the Signal-to-Interference-plus-Noise Ratio (SINR) in the frequency domain (Fig. 11). 
Compared with other interference suppression algorithms, the proposed method exhibits superior suppression performance (Fig. 12), achieving an SINR improvement of more than 3 dB in the frequency domain and maintaining better suppression effects across different SINR conditions (Fig. 13). In real-road tests, the algorithm successfully detects multiple interference regions and useful signal segments (Fig. 14) and significantly enhances the SINR after reconstruction (Fig. 15).  Conclusions  This paper proposes a joint envelope recovery–based signal reconstruction algorithm to address inaccurate target detection in multi-interference environments for millimeter-wave radar. The algorithm employs a dual-criterion mechanism to accurately detect interference regions and valid signal segments, and reconstructs the interference regions using the detected useful segments. The Hilbert transform is further applied to achieve collaborative normalization of the signal envelope. Experimental results demonstrate that the algorithm effectively identifies interference signals and reconstructs interference regions in multi-interference scenarios, significantly improving the signal-to-noise ratio, suppressing interference, and enabling accurate target information acquisition. These findings provide an effective anti-jamming solution for intelligent driving systems operating in multi-interference environments.
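The envelope-threshold detection and interpolation-based reconstruction steps can be sketched with a numpy-only Hilbert envelope (the signal parameters and the 3x-median threshold are hypothetical; the paper's dual-criterion mechanism and envelope amplitude coordination are more elaborate):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1024
t = np.arange(n) / n
clean = np.cos(2 * np.pi * 40 * t)            # beat signal of a real target
sig = clean + 0.05 * rng.normal(size=n)
sig[300:340] += 6.0 * rng.normal(size=40)     # sharp interference burst

def envelope(x):
    """Analytic-signal magnitude via the FFT (numpy-only Hilbert envelope)."""
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = 1.0
    h[1:len(x) // 2] = 2.0
    h[len(x) // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

env = envelope(sig)
mask = env > 3.0 * np.median(env)             # envelope-threshold detection
idx = np.arange(n)
recon = sig.copy()                            # rebuild flagged samples from
recon[mask] = np.interp(idx[mask], idx[~mask], sig[~mask])  # clean neighbors
```

The envelope flags the sharp time-domain pulses while leaving the clean beat signal untouched, and interpolating across the flagged region removes most of the interference energy, mirroring the detect-then-reconstruct flow described above.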
Adaptive Cache Deployment Based on Congestion Awareness and Content Value in LEO Satellite Networks
LIU Zhongyu, XIE Yaqin, ZHANG Yu, ZHU Jianyue
Available online  , doi: 10.11999/JEIT250670
Abstract:
  Objective  Low Earth Orbit (LEO) satellite networks are central to future space–air–ground integrated systems, offering global coverage and low-latency communication. However, their high-speed mobility leads to rapidly changing topologies, and strict onboard cache constraints hinder efficient content delivery. Existing caching strategies often overlook real-time network congestion and content attributes (e.g., freshness), which leads to inefficient resource use and degraded Quality of Service (QoS). To address these limitations, we propose an adaptive cache placement strategy based on congestion awareness. The strategy dynamically couples real-time network conditions, including link congestion and latency, with a content value assessment model that incorporates both popularity and freshness. This integrated approach enhances cache hit rates, reduces backhaul load, and improves user QoS in highly dynamic LEO satellite environments, enabling efficient content delivery even under fluctuating traffic demands and resource constraints.  Methods  The proposed strategy combines a dual-threshold congestion detection mechanism with a multi-dimensional content valuation model. It proceeds in three steps. First, satellite nodes monitor link congestion in real time using dual latency thresholds and relay congestion status to downstream nodes through data packets. Second, a two-dimensional content value model is constructed that integrates popularity and freshness. Popularity is updated dynamically using an Exponential Weighted Moving Average (EWMA), which balances historical and recent request patterns to capture temporal variations in demand. Freshness is evaluated according to the remaining data lifetime, ensuring that expired or near-expired content is deprioritized to maintain cache efficiency and relevance. Third, caching thresholds are adaptively adjusted according to congestion level, and a hop count control factor is introduced to guide caching decisions. 
This coordinated mechanism enables the system to prioritize high-value content while mitigating congestion, thereby improving overall responsiveness and user QoS.  Results and Discussions  Simulations conducted on ndnSIM demonstrate the superiority of the proposed strategy over PaCC (Popularity-Aware Closeness-based Caching), LCE (Leave Copy Everywhere), LCD (Leave Copy Down), and Prob (probability-based caching with probability = 0.5). The key findings are as follows. (1) Cache hit rate. The proposed strategy consistently outperforms conventional methods. As shown in Fig. 8, the cache hit rate rises markedly with increasing cache capacity and Zipf parameter, exceeding those of LCE, LCD, and Prob. Specifically, the proposed strategy achieves improvements of 43.7% over LCE, 25.3% over LCD, 17.6% over Prob, and 9.5% over PaCC. Under high content concentration (i.e., larger Zipf parameters), the improvement reaches 29.1% compared with LCE, highlighting the strong capability of the strategy in promoting high-value content distribution. (2) Average routing hop ratio. The proposed strategy also reduces routing hops compared with the baselines. As shown in Fig. 9, the average hop ratio decreases as cache capacity and Zipf parameter increase. Relative to PaCC, the proposed strategy lowers the average hop ratio by 2.24%, indicating that content is cached closer to users, thereby shortening request paths and improving routing efficiency. (3) Average request latency. The proposed strategy achieves consistently lower latency than all baseline methods. As summarized in Table 2 and Fig. 10, the reduction is more pronounced under larger cache capacities and higher Zipf parameters. For instance, with a cache capacity of 100 MB, latency decreases by approximately 2.9%, 5.8%, 9.0%, and 10.3% compared with PaCC, Prob, LCD, and LCE, respectively. When the Zipf parameter is 1.0, latency reductions reach 2.7%, 5.7%, 7.2%, and 8.8% relative to PaCC, Prob, LCD, and LCE, respectively. 
Concretely, under a cache capacity of 100 MB and Zipf parameter of 1.0, the average request latency of the proposed strategy is 212.37 ms, compared with 236.67 ms (LCE), 233.45 ms (LCD), 225.42 ms (Prob), and 218.62 ms (PaCC).  Conclusions  This paper presents a congestion-aware adaptive caching placement strategy for LEO satellite networks. By combining real-time congestion monitoring with multi-dimensional content valuation that considers both dynamic popularity and freshness, the strategy achieves balanced improvements in caching efficiency and network stability. Simulation results show that the proposed method markedly enhances cache hit rates, reduces average routing hops, and lowers request latency compared with existing schemes such as PaCC, Prob, LCD, and LCE. These benefits hold across different cache sizes and request distributions, particularly under resource-constrained or highly dynamic conditions, confirming the strategy’s adaptability to LEO environments. The main innovations include a closed-loop feedback mechanism for congestion status, dynamic adjustment of caching thresholds, and hop-aware content placement, which together improve resource utilization and user QoS. This work provides a lightweight and robust foundation for high-performance content delivery in satellite–terrestrial integrated networks. Future extensions will incorporate service-type differentiation (e.g., delay-sensitive vs. bandwidth-intensive services), and orbital prediction to proactively optimize cache migration and updates, further enhancing efficiency and adaptability in 6G-enabled LEO networks.
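The three-step policy described above can be sketched as a toy decision rule. All constants and function names here (the EWMA factor, the dual latency thresholds, the value weights, the hop factor) are illustrative assumptions, not parameters taken from the paper.

```python
ALPHA = 0.3  # assumed EWMA smoothing factor

def ewma_popularity(prev, observed):
    """Step 2a: popularity update balancing history and recent requests."""
    return ALPHA * observed + (1 - ALPHA) * prev

def freshness(remaining_lifetime, total_lifetime):
    """Step 2b: freshness from the remaining data lifetime (0 = expired)."""
    return max(0.0, remaining_lifetime / total_lifetime)

def congestion_level(latency, low_thresh, high_thresh):
    """Step 1: dual-threshold detection (0 = free, 1 = busy, 2 = congested)."""
    if latency < low_thresh:
        return 0
    return 1 if latency < high_thresh else 2

def content_value(pop, fresh, w_pop=0.6, w_fresh=0.4):
    """Two-dimensional content value; the weights are illustrative."""
    return w_pop * pop + w_fresh * fresh

def should_cache(value, hops, level, base_thresh=0.4, hop_factor=0.05):
    """Step 3: congestion-adapted threshold plus a hop-count control factor
    that favors caching content that has traveled farther from its source."""
    threshold = base_thresh * (1 + 0.5 * level)  # raise the bar when congested
    return value * (1 + hop_factor * hops) >= threshold
```

Under this sketch, a popular and fresh item is admitted on an uncongested node but can be rejected by the same node once it reports congestion, which is the adaptive behavior the strategy targets.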
A Space–Time Joint Waveform for Frequency Diverse Array Radar with Spatial Linear Frequency Modulation Weighting
LAN Yu, ZHOU Jianxiong
Available online  , doi: 10.11999/JEIT250561
Abstract:
  Objective  Frequency Diverse Array (FDA) radar exhibits a fast time-varying beampattern and a space-time coupled steering vector, offering potential advantages for multi-target tracking, wide-area surveillance, and mainlobe interference suppression. However, the beampattern of conventional coherent FDA radar is narrow, resulting in a shorter beam dwell time than that of phased arrays. This limitation prevents the ambiguity function of conventional coherent FDA from achieving both high range resolution and low sidelobe level simultaneously. When the baseband signal is modulated with a Linear Frequency Modulation (LFM) waveform, the ambiguity function presents low range resolution and low sidelobe level. Conversely, when the baseband signal is modulated with a phase-coded waveform, it achieves high range resolution but exhibits high sidelobe levels with strip-like high-gain sidelobes. The degradation in range resolution or sidelobe performance significantly constrains detection capability. To address this problem, this study proposes a novel space-time joint FDA waveform with spatial LFM weighting, which simultaneously achieves high range resolution, low sidelobe level, and reduced Doppler sensitivity.  Methods  The spatial-domain modulation scheme and the time-domain baseband waveform are two interdependent factors that determine the ambiguity function performance of FDA radar. Selecting a time-domain baseband waveform with a thumbtack-shaped ambiguity function enables the range resolution to remain independent of space-time coupling. By modulating the spatial weighting phase, the beampattern shape of the FDA can be adjusted to extend beam dwell time, suppress strip-like high-gain sidelobes, and smooth sidelobe energy distribution. The proposed space-time joint waveform thus achieves both high range resolution and low sidelobe level. Doppler tolerance is another key metric for evaluating ambiguity function performance. 
A space-time joint waveform with spatial phase-coded weighting exhibits high Doppler sensitivity, leading to significantly elevated sidelobe levels and sharp reductions in transmit beamforming gain. In contrast, the spatial LFM weighting method proposed in this study enhances Doppler tolerance while maintaining desirable range and sidelobe characteristics.  Results and Discussions  By combining the spatial LFM weighting method with a time-domain baseband waveform exhibiting a thumbtack-shaped ambiguity function (e.g., a phase-coded waveform), this study addresses the limitation of conventional coherent FDA waveforms, which cannot simultaneously achieve high range resolution and low sidelobe level. The proposed waveform demonstrates robust pulse compression performance, even under target motion. Simulation experiments were conducted to analyze the ambiguity functions under both stationary and motion conditions, and the results are summarized as follows: (1) The average sidelobe levels near the target peak for the space-time joint FDA waveform with spatial LFM weighting and spatial phase-coded weighting are both approximately –30 dB (Fig. 3(a) and 3(b)). In comparison, the average sidelobe level near the target peak for the spatial phase-coded weighting FDA using a time-domain LFM baseband waveform is about –20 dB (Fig. 3(c)), while that of the coherent FDA with a time-domain phase-coded waveform is about –12 dB (Fig. 3(d)). Thus, the two space-time joint FDA waveforms achieve the lowest average sidelobe levels. (2) The imaging results of both space-time joint FDA waveforms show no strip-like high-gain sidelobes (Fig. 4(a) and 4(b)). By contrast, the spatial phase-coded weighting FDA and the coherent FDA with a time-domain phase-coded waveform both display prominent high-gain sidelobes (Fig. 4(c) and 4(d)). These sidelobes from high Signal-to-Noise Ratio (SNR) targets can obscure nearby low-SNR targets. (3) All four FDA waveforms achieve a range resolution of 0.75 m (Fig. 5), corresponding to a bandwidth of 200 MHz. (4) Under motion conditions, the space-time joint FDA waveform with spatial phase-coded weighting exhibits a notable increase in peak sidelobe level compared with stationary conditions (Fig. 6(a)). In contrast, the space-time joint FDA waveform with spatial LFM weighting maintains the lowest peak sidelobe level among all four FDA configurations (Fig. 6(b)).  Conclusions  This study proposes a space-time joint FDA waveform with spatial LFM weighting. The proposed waveform effectively resolves the issue of degraded range resolution in conventional coherent FDA systems, ensuring that range resolution depends solely on bandwidth. It also eliminates the strip-like high-gain sidelobes commonly observed in conventional FDA waveforms. Under simulation conditions, the average sidelobe level near the target peak is reduced by approximately 10 dB and 18 dB compared with those of the spatial phase-coded weighting FDA and the coherent FDA with a time-domain phase-coded waveform, respectively. This reduction substantially mitigates the masking of low-SNR targets by sidelobes from high-SNR targets and demonstrates strong Doppler tolerance. However, under relative motion conditions, the proposed waveform exhibits Doppler-angle coupling, which will be addressed in future research through the development of coupling mitigation strategies.
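The spatial LFM weighting idea can be sketched numerically: each FDA element transmits at its own frequency offset, and a quadratic (LFM-like) phase πμm² is imposed across the aperture. The array geometry and every parameter value below are assumptions for illustration only, not the waveform design of the paper.

```python
import numpy as np

def fda_array_factor(theta, t, N=16, f0=10e9, df=30e3, d=0.015,
                     mu=0.0, c=3e8):
    """Normalized space-time array factor of a uniform linear FDA.

    Element m transmits at f0 + m*df; mu sets the assumed quadratic
    (spatial-LFM) weighting phase pi*mu*m**2 across the aperture.
    """
    m = np.arange(N)
    psi = (2 * np.pi * (m * df * t - f0 * m * d * np.sin(theta) / c)
           + np.pi * mu * m ** 2)
    return np.abs(np.exp(1j * psi).sum()) / N

# Without weighting, the coherent FDA focuses all energy at (theta=0, t=0);
# the quadratic weighting spreads the mainlobe, extending beam dwell time
# and smoothing the sidelobe energy distribution.
peak_coherent = fda_array_factor(0.0, 0.0, mu=0.0)
peak_weighted = fda_array_factor(0.0, 0.0, mu=0.02)
```

Sweeping `theta` and `t` with `mu > 0` shows the widened, flattened beampattern that underlies the longer dwell time and the suppression of strip-like sidelobes discussed above.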
Weakly Supervised Recognition of Aerial Adversarial Maneuvers via Contrastive Learning
ZHU Longjun, YUAN Weiwei, MEN Xuefeng, TONG Wei, WU Qi
Available online  , doi: 10.11999/JEIT250495
Abstract:
  Objective  Accurate recognition of aerial adversarial maneuvers is essential for situational awareness and tactical decision-making in modern air warfare. Conventional supervised approaches face major challenges: obtaining labeled flight data is costly due to the intensive human effort required for collection and annotation, and these methods are limited in capturing temporal dependencies inherent in sequential flight parameters. Temporal dynamics are crucial for describing the evolution of maneuvers, yet existing models fail to fully exploit this information. To address these challenges, this study proposes a weakly supervised maneuver recognition framework based on contrastive learning. The method leverages a small proportion of labeled data to learn discriminative representations, thereby reducing reliance on extensive manual annotations. The proposed framework enhances recognition accuracy in data-scarce scenarios and provides a robust solution for maneuver analysis in dynamic adversarial aerial environments.  Methods  The proposed framework extends the Simple framework for Contrastive Learning of visual Representations (SimCLR) into the time-series domain by incorporating five temporal-specific data augmentation strategies: time compression, masking, permutation, scaling, and flipping. These augmentations generate multi-view samples that form positive pairs for contrastive learning, thereby ensuring temporal invariance in the feature space. A customized ResNet-18 encoder is employed to extract hierarchical features from the augmented time-series data, and a Multi-Layer Perceptron (MLP) projection head maps these features into a contrastive space. The Normalized Temperature-scaled cross-entropy (NT-Xent) loss is adopted to maximize similarity between positive pairs and minimize it between negative pairs, which effectively mitigates pseudo-label noise. 
To further improve recognition performance, a fine-tuning strategy is introduced in which pre-trained features are combined with a task-specific classification head using a limited amount of labeled data to adapt to downstream recognition tasks. This contrastive learning framework enables efficient analysis of time-series flight data, achieves accurate recognition of fighter aircraft maneuvers, and reduces dependence on large-scale labeled datasets.  Results and Discussions  Experiments are conducted on flight simulation data obtained from DCS World. To address the class imbalance issue, hybrid datasets (Table 1) are constructed, and training data ratios ranging from 2% to 30% are employed to evaluate the effectiveness of the weakly supervised framework. The results demonstrate that contrastive learning effectively captures the temporal patterns within flight data. For example, on the D1 dataset, accuracy with the base method increases from 35.83% with 2% labeled data to 89.62% when the fine-tuning ratio reaches 30% (Tables 3–6, Fig. 2(a)–2(c)). To improve recognition of long maneuver sequences, a linear classifier and a voting strategy are introduced. The voting strategy markedly enhances few-shot learning performance. On the D1 dataset, accuracy reaches 54.5% with 2% labeled data and rises to 97.9% at a 30% fine-tuning ratio, representing a substantial improvement over the base method. On the D6 dataset, which simulates multi-source data fusion scenarios in air combat, the accuracy of the voting method increases from 0.476 with 2% labeled data to 0.928 with 30% labeled data (Fig. 2(d)–2(f)), with a growth rate in the low-data phase 53% higher than that of the base method. Additionally, on the comprehensive D7 dataset, the accuracy standard deviation of the voting method is only 0.011 (Fig. 2(g), Fig. 3), significantly lower than the 0.015 observed for the base method. 
The superiority of the proposed framework can be attributed to two factors: the suppression of noise through integration of multiple prediction results using the voting strategy and the extraction of robust features from unlabeled data via contrastive learning pre-training. Together, these techniques enhance generalization and stability in complex scenarios, confirming the effectiveness of the method in leveraging unlabeled data and managing multi-source information.  Conclusions  This study applies the SimCLR framework to maneuver recognition and proposes a weakly supervised approach based on contrastive learning. By incorporating targeted data augmentation strategies and combining self-supervised learning with fine-tuning, the method exploits the latent information in time-series data, yielding substantial improvements in recognition performance under limited labeled data conditions. Experiments on simulated air combat datasets demonstrate that the framework achieves stable recognition across different data categories, offering practical insights for feature learning and model optimization in time-series classification tasks. Future research will focus on three directions: first, integrating real flight data to evaluate the model’s generalization capability in practical scenarios; second, developing dynamically adaptive data augmentation strategies to enhance performance in complex environments; and third, combining reinforcement learning and related techniques to improve autonomous decision-making in dynamic aerial missions, thereby expanding opportunities for intelligent flight operations.
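The augmentation-plus-NT-Xent pipeline described above can be sketched in NumPy. The augmentation parameters and the temperature are illustrative assumptions, and the customized ResNet-18 encoder is replaced here by pre-computed embeddings; only the loss follows the standard NT-Xent definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Temporal augmentations that form positive pairs from one flight sequence.
def scale(x, s=1.2):
    return x * s

def flip(x):
    return x[::-1].copy()

def mask(x, start=10, length=5):
    y = x.copy()
    y[start:start + length] = 0.0
    return y

def permute(x, n_seg=4):
    segs = np.array_split(x, n_seg)
    return np.concatenate([segs[i] for i in rng.permutation(n_seg)])

def time_compress(x, factor=2):
    short = x[::factor]
    return np.interp(np.linspace(0, len(short) - 1, len(x)),
                     np.arange(len(short)), short)

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss over a batch of positive pairs (z1[i], z2[i])."""
    z = np.concatenate([z1, z2]).astype(float)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity
    b = len(z1)
    pos = np.concatenate([np.arange(b, 2 * b), np.arange(b)])
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    return -log_prob[np.arange(2 * b), pos].mean()
```

Embeddings of correctly matched augmented views yield a lower loss than mismatched ones, which is exactly the signal that pulls positive pairs together during pre-training.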
HRIS-Aided Layered Sparse Reconstruction Hybrid Near- and Far-Field Source Localization Algorithm
YANG Qingqing, PU Xuelai, PENG Yi, LI Hui, YANG Qiuping
Available online  , doi: 10.11999/JEIT250429
Abstract:
  Objective  Advances in Reconfigurable Intelligent Surface (RIS) technology have enabled larger arrays and higher frequencies, which expand the near-field region and improve positioning accuracy. The fundamental differences between near- and far-field propagation necessitate hybrid localization algorithms capable of seamlessly integrating both regimes.  Methods  A localization framework for mixed near- and far-field sources is proposed by integrating Fourth-Order Cumulant (FOC) matrices with hierarchical sparse reconstruction. A hybrid RIS architecture incorporating active elements is employed to directly receive pilot signals, thereby reducing parameter-coupling errors that commonly occur in passive RIS over multi-hop channels and enhancing reliability in Non-Line-of-Sight (NLOS) scenarios. Symmetrically placed active elements are employed to construct three FOC matrices for three-dimensional position estimation. The two-dimensional angle search is decomposed into two sequential one-dimensional searches, where elevation and azimuth are estimated separately to reduce computational complexity. The first FOC matrix (C1), formed from vertically symmetric elements, captures elevation characteristics. The second matrix (C2), constructed from centrally symmetric elements, suppresses nonlinear terms related to distance. The third matrix (C3) applies the previously estimated angles to select active elements, incorporates near-field effects, and enables accurate distance estimation as well as discrimination between near-field and far-field signals. To further improve the efficiency and accuracy of spectral searches, a hierarchical multi-resolution strategy based on sparse reconstruction is introduced. This method partitions the continuous parameter space into discrete intervals, incrementally generates a multi-resolution dictionary, and applies a progressive search procedure for precise position parameter estimation. 
During the search process, a tuning factor constrains the maximum reconstruction error between the sparse matrix and the projection of the original signal subspace. In addition, the algorithm exploits the orthogonality between the signal and noise subspaces to design a weight matrix, which reduces the effects of noise and position errors on the sparse solution. This hierarchical search enables rapid, coarse-to-fine parameter estimation and substantially improves localization accuracy.  Results and Discussions  The performance of the proposed algorithm is evaluated against Two-Stage Multiple Signal Classification (TSMUSIC), hybrid Orthogonal Matching Pursuit (OMP), and Holographic Multiple-Input Multiple-Output (HMIMO)-based methods with respect to noise resistance, convergence speed, and computational efficiency. Under varying SNR conditions (Fig. 5), traditional subspace methods exhibit degraded performance at low SNR because of reliance on signal–noise subspace orthogonality. In contrast, the proposed algorithm employs the FOC matrix to achieve accurate elevation and azimuth estimation while suppressing Gaussian noise. The hierarchical sparse reconstruction strategy further enhances estimation accuracy, resulting in superior far-field localization performance. Unlike the HMIMO-based algorithm, which depends on dynamic codebook switching, the proposed method retains nonlinear distance-dependent phase terms and constructs the distance codebook from initial angle estimates, thereby improving near-field localization accuracy. In Experiment 2, the effect of varying snapshot numbers on parameter estimation is examined. Owing to the angle-decoupling capability of the FOC matrix, the algorithm achieves rapid reduction in Root Mean Square Error (RMSE) even with a small number of snapshots. 
As the number of snapshots increases, estimation accuracy improves steadily and approaches convergence, indicating robustness against noise and fast convergence under low-snapshot conditions. Conventional methods typically require predefined near-field and far-field grids. By contrast, the nonlinear phase retention mechanism enables automatic discrimination between near-field and far-field sources without a predetermined distance threshold. While the nonlinear phase term introduces slightly slower convergence during distance decoupling, the proposed method still outperforms TSMUSIC and hybrid OMP. However, angle estimation errors during the decoupling process provide the HMIMO-based approach with a slight advantage in distance estimation accuracy (Fig. 6). Computational complexity is also compared between the hierarchical multi-resolution framework and traditional global search strategies (Fig. 7). Standard hybrid-field localization algorithms, such as TSMUSIC and hybrid OMP, require simultaneous optimization of angle and distance parameters, leading to exponential growth of computational cost. In contrast, the hierarchical strategy applies a phased search in which elevation and azimuth are estimated sequentially, reducing the two-dimensional angle spectrum search to two one-dimensional searches. The combination of progressive grid contraction, layer-by-layer tuning factors, and step-size decay narrows the search range efficiently, enabling rapid convergence through a three-layer dynamic grid structure. The distance dictionary constructed from angle estimates further removes redundant grids, thereby reducing complexity compared with global search methods.  Conclusions  This study presents a 3D localization framework for mixed near- and far-field sources in RIS-assisted systems by combining FOC decoupling with hierarchical sparse reconstruction. 
The method decouples angle and range estimation and uses a multi-resolution search strategy, achieving reliable performance and rapid convergence even under low SNR conditions and with limited snapshots. Simulation results demonstrate that the proposed approach consistently outperforms TSMUSIC, hybrid OMP, and HMIMO-based techniques, confirming its efficiency and robustness in mixed-field environments.
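The coarse-to-fine spectral search at the heart of the hierarchical strategy can be illustrated generically: each layer evaluates a small grid, keeps the best point, and contracts the interval around it at finer resolution. The grid size and shrink factor below are assumptions, and the paper's subspace-weighted objective with its tuning factor is abstracted into a generic `cost` callable.

```python
import numpy as np

def hierarchical_search(cost, lo, hi, n_points=11, n_levels=3, shrink=0.2):
    """Multi-resolution 1-D search: evaluate a coarse grid, keep the best
    point, contract the interval around it, and repeat at finer resolution."""
    best = lo
    for _ in range(n_levels):
        grid = np.linspace(lo, hi, n_points)
        best = grid[np.argmin([cost(g) for g in grid])]
        half = (hi - lo) * shrink
        lo, hi = best - half, best + half
    return best

# Example: locate the minimum of a hypothetical 1-D spectrum at 0.37 rad
# with only 33 evaluations instead of one dense global grid.
theta_hat = hierarchical_search(lambda a: (a - 0.37) ** 2, -1.0, 1.0)
```

In the algorithm above, one such 1-D search estimates elevation, a second estimates azimuth, and the distance dictionary is then built from those angle estimates, which is how the 2-D angle search is reduced to two sequential 1-D searches.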
Visible Figure Part of Multi-source Maritime Ship Dataset
CUI Yaqi, ZHOU Tian, XIONG Wei, XU Saifei, LIN Chuanqi, XIA Mutao, WANG Ziling, GU Xiangqi, SUN Weiwei, LI Haoran, KONG Zhan, TANG Hao, XU Pingliang, ZHANG Jie, DAN Bo, GUO Hengguang, DONG Kai, YU Hongbo, LU Yuan, CHEN Wei, HE Shaowei
Available online  , doi: 10.11999/JEIT250138
Abstract:
  Objective  The increasing intensity of marine resource development and maritime operations has heightened the need for accurate vessel detection under complex marine conditions, which is essential for protecting maritime rights and interests. In recent years, object detection algorithms based on deep learning—such as YOLO and Faster R-CNN—have emerged as key methods for maritime target perception due to their strong feature extraction capabilities. However, their performance relies heavily on large-scale, high-quality training data. Existing general-purpose datasets, such as COCO and PASCAL VOC, offer limited vessel classes and predominantly feature static, urban, or terrestrial scenes, making them unsuitable for marine environments. Similarly, specialized datasets like SeaShips and the Singapore Marine Dataset (SMD) suffer from constraints such as limited data sources, simple scenes, small sample sizes, and incomplete coverage of marine target categories. These limitations significantly hinder further performance improvement of detection algorithms. Therefore, the development of large-scale, multimodal, and comprehensive marine-specific datasets represents a critical step toward resolving current application challenges. This effort is urgently needed to strengthen marine monitoring capabilities and ensure operational safety at sea.  Methods  To overcome the aforementioned challenges, a multi-sensor marine target acquisition system integrating radar, visible-light, infrared, laser, Automatic Identification System (AIS), and Global Positioning System (GPS) technologies is developed. A two-month shipborne observation campaign is conducted, yielding 200 hours of maritime monitoring and over 90 TB of multimodal raw data. To efficiently process this large volume of low-value-density data, a rapid annotation pipeline is designed, combining automated labeling with manual verification. 
Iterative training of intelligent annotation models, supplemented by extensive manual correction, enables the construction of the Visible Figure Part of the Multi-Source Maritime Ship Dataset (MSMS-VF). This dataset comprises 265,233 visible-light images with 1,097,268 bounding boxes across nine target categories: passenger ship, cargo vessel, speedboat, sailboat, fishing boat, buoy, floater, offshore platform, and others. Notably, 55.84% of targets are small, with pixel areas below 1,024. The dataset incorporates diverse environmental conditions including backlighting, haze, rain, and occlusion, and spans representative maritime settings such as harbor basins, open seas, and navigation channels. MSMS-VF offers a comprehensive data foundation for advancing maritime target detection, recognition, and tracking research.  Results and Discussions  The MSMS-VF dataset exhibits substantially greater diversity than existing datasets (Table 1, Table 2). Small targets, including buoys and floaters, occur frequently (Table 5), posing significant challenges for detection. Five object detection models—YOLO series, Real-Time Detection Transformer (RT-DETR), Faster R-CNN, Single Shot MultiBox Detector (SSD), and RetinaNet—are assessed, together with five multi-object tracking algorithms: Simple Online and Realtime Tracking (SORT), Optimal Compute for SORT (OC-SORT), DeepSORT, ByteTrack, and MotionTrack. YOLO models exhibit the most favorable trade-off between speed and accuracy. YOLOv11 achieves a mAP50 of 0.838 on the test set and a processing speed of 34.43 FPS (Table 6). However, substantial performance gaps remain for small targets; for instance, YOLOv11 yields a mAP50 of 0.549 for speedboats, markedly lower than the 0.946 obtained for large targets such as cargo vessels (Table 7). RT-DETR shows moderate performance on small objects, achieving a mAP50 of 0.532 for floaters, whereas conventional models like Faster R-CNN perform poorly, with mAP50 values below 0.1. 
For tracking, MotionTrack performs best under low-frame-rate conditions, achieving a MOTA of 0.606, IDF1 of 0.750, and S of 0.681 using a Gaussian distance cascade-matching strategy (Table 8, Fig. 14).  Conclusions  This study presents the MSMS-VF dataset, which offers essential data support for maritime perception research through its integration of multi-source inputs, diverse environmental scenarios, and a high proportion of small targets. Experimental validation confirms the dataset’s utility in training and evaluating state-of-the-art algorithms, while also revealing persistent challenges in detecting and tracking small objects under dynamic maritime conditions. Nevertheless, the dataset has limitations. The current data are predominantly sourced from waters near Yantai, leading to imbalanced ship-type representation and the absence of certain vessel categories. Future efforts will focus on expanding data acquisition to additional maritime regions, broadening the scope of multi-source data collection, and incrementally releasing extended components of the dataset to support ongoing research.
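Two quantities used in the evaluation above can be made concrete: the small-target criterion (pixel area below 1,024) behind the 55.84% statistic, and the IoU overlap underlying mAP50 matching (a prediction counts as correct at IoU ≥ 0.5). The helper names are ours, not from the dataset toolkit.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def small_target_fraction(boxes, area_thresh=1024):
    """Fraction of (w, h) boxes whose pixel area falls below the
    small-target threshold used in the dataset statistics."""
    return sum(w * h < area_thresh for w, h in boxes) / len(boxes)
```

A 32×32 target sits exactly at the threshold, which explains why buoys and floaters, often far smaller than this, dominate the hard cases reported above.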
Integrating Representation Learning and Knowledge Graph Reasoning for Diabetes and Complications Prediction
WANG Yuao, HUANG Yeqi, LI Qingyuan, LIU Yun, JING Shenqi, SHAN Tao, GUO Yongan
Available online  , doi: 10.11999/JEIT250798
Abstract:
  Objective  Diabetes mellitus and its complications are recognized as major global health challenges, causing severe morbidity, high healthcare costs, and reduced quality of life. Accurate joint prediction of these conditions is essential for early intervention but is hindered by data heterogeneity, sparsity, and complex inter-entity relationships. To address these challenges, a Representation Learning Enhanced Knowledge Graph-based Multi-Disease Prediction (REKG-MDP) model is proposed. Electronic Health Records (EHRs) are integrated with supplementary medical knowledge to construct a comprehensive Medical Knowledge Graph (MKG), and higher-order semantic reasoning combined with relation-aware representation learning is applied to capture complex dependencies and improve predictive accuracy across multiple diabetes-related conditions.  Methods  The REKG-MDP framework consists of three modules. First, an MKG is constructed by integrating structured EHR data from the MIMIC-IV dataset with external disease knowledge. Patient-side features include demographics, laboratory indices, and medical history, whereas disease-side attributes cover comorbidities, susceptible populations, etiological factors, and diagnostic criteria. This integration mitigates data sparsity and enriches semantic representation. Second, a relation-aware embedding module captures four relational patterns: symmetric, antisymmetric, inverse, and compositional. These patterns are used to optimize entity and relation embeddings for semantic reasoning. Third, a Hierarchical Attention-based Graph Convolutional Network (HA-GCN) aggregates multi-hop neighborhood information. Dynamic attention weights capture both local and global dependencies, and a bidirectional mechanism enhances the modeling of patient–disease interactions.  
  Results and Discussions  Experiments demonstrate that REKG-MDP consistently outperforms four baselines: two machine learning models (DCKD-RF and bSES-AC-RUN-FKNN) and two graph-based models (KGRec and PyRec). Compared with the strongest baseline, REKG-MDP achieves average improvements in P, F1, and NDCG of 19.39%, 19.67%, and 19.39% for single-disease prediction (n = 1); 16.71%, 21.83%, and 23.53% for n = 3; and 22.01%, 20.34%, and 20.88% for n = 5 (Table 4). Ablation studies confirm the contribution of each module. Removing relation-pattern modeling reduces performance metrics by approximately 12%, removing hierarchical attention decreases them by 5–6%, and excluding disease-side knowledge produces the largest decline of up to 20% (Fig. 5). Sensitivity analysis indicates that increasing the embedding dimension from 32 to 128 enhances performance by more than 11%, whereas excessive dimensionality (256) leads to over-smoothing (Fig. 6). Adjusting the β parameter strengthens sample discrimination, improving P, F1, and NDCG by 9.28%, 27.9%, and 8.08%, respectively (Fig. 7).  Conclusions  REKG-MDP integrates representation learning with knowledge graph reasoning to enable multi-disease prediction. The main contributions are as follows: (1) integrating heterogeneous EHR data with disease knowledge mitigates data sparsity and enhances semantic representation; (2) modeling diverse relational patterns and applying hierarchical attention improves the capture of higher-order dependencies; and (3) extensive experiments confirm the model’s superiority over state-of-the-art baselines, with ablation and sensitivity analyses validating the contribution of each module. Remaining challenges include managing extremely sparse data and ensuring generalization across broader populations. 
Future research will extend REKG-MDP to model temporal disease progression and additional chronic conditions.
Wave-MambaCT: Low-dose CT Artifact Suppression Method Based on Wavelet Mamba
CUI Xueying, WANG Yuhang, LIU Bin, SHANGGUAN Hong, ZHANG Xiong
Available online  , doi: 10.11999/JEIT250489
Abstract:
  Objective  Low-Dose Computed Tomography (LDCT) reduces patient radiation exposure but introduces substantial noise and artifacts into reconstructed images. Convolutional Neural Network (CNN)-based denoising approaches are limited by local receptive fields, which restrict their ability to capture long-range dependencies. Transformer-based methods alleviate this limitation but incur quadratic computational complexity relative to image size. In contrast, State Space Model (SSM)–based Mamba frameworks achieve linear complexity for long-range interactions. However, existing Mamba-based methods often suffer from information loss and insufficient noise suppression. To address these limitations, we propose the Wave-MambaCT model.  Methods  The proposed Wave-MambaCT model adopts a multi-scale framework that integrates Discrete Wavelet Transform (DWT) with a Mamba module based on the SSM. First, DWT performs a two-level decomposition of the LDCT image, decoupling noise from Low-Frequency (LF) content. This design directs denoising primarily toward the High-Frequency (HF) components, facilitating noise suppression while preserving structural information. Second, a residual module combined with a Spatial-Channel Mamba (SCM) module extracts both local and global features from LF and HF bands at different scales. The noise-free LF features are then used to correct and enhance the corresponding HF features through an attention-based Cross-Frequency Mamba (CFM) module. Finally, inverse wavelet transform is applied in stages to progressively reconstruct the image. To further improve denoising performance and network stability, multiple loss functions are employed, including L1 loss, wavelet-domain LF loss, and adversarial loss for HF components.  
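The LF/HF decoupling idea can be illustrated with a one-level 2-D Haar DWT in plain NumPy, denoising only the HF sub-bands while leaving the LF band intact. This is a simplified stand-in for the paper's two-level decomposition and learned denoiser; the soft threshold, noise level, and test image are illustrative assumptions:

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2-D Haar DWT: split an image into LF (LL) and HF (LH, HL, HH) sub-bands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0    # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0    # row details
    return ((a[:, 0::2] + a[:, 1::2]) / 2.0,   # LL
            (a[:, 0::2] - a[:, 1::2]) / 2.0,   # LH
            (d[:, 0::2] + d[:, 1::2]) / 2.0,   # HL
            (d[:, 0::2] - d[:, 1::2]) / 2.0)   # HH

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

rng = np.random.default_rng(1)
clean = np.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 3, 64)))
noisy = clean + 0.2 * rng.normal(size=clean.shape)

# Noise concentrates in the HF sub-bands: soft-threshold only LH/HL/HH, keep LL untouched.
ll, lh, hl, hh = haar_dwt2(noisy)
soft = lambda c, t: np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
denoised = haar_idwt2(ll, soft(lh, 0.2), soft(hl, 0.2), soft(hh, 0.2))
assert np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2)
```

Because the transform is invertible, structural content carried by the LF band survives unchanged, which is the property the staged reconstruction in Wave-MambaCT relies on.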
Results and Discussions  Extensive experiments on the simulated Mayo Clinic datasets, the real Piglet datasets, and the hospital clinical dataset DeepLesion show that Wave-MambaCT provides superior denoising performance and generalization. On the Mayo dataset, a PSNR of 31.6528 is achieved, which is higher than that of the suboptimal method DenoMamba (PSNR 31.4219), while MSE is reduced to 0.00074 and SSIM and VIF are improved to 0.8851 and 0.4629, respectively (Table 1). Visual results (Figs. 4–6) demonstrate that edges and fine details such as abdominal textures and lesion contours are preserved, with minimal blurring or residual artifacts compared with competing methods. Computational efficiency analysis (Table 2) indicates that Wave-MambaCT maintains low FLOPs (17.2135 G) and parameters (5.3913 M). FLOPs are lower than those of all networks except RED-CNN, and the parameter count is higher only than those of RED-CNN and CTformer. During training, 4.12 minutes per epoch are required, longer only than RED-CNN. During testing, 0.1463 seconds are required per image, which is at a medium level among the compared methods. Generalization tests on the Piglet datasets (Figs. 7, 8, Tables 3, 4) and DeepLesion (Fig. 9) further confirm the robustness and generalization capacity of Wave-MambaCT. In the proposed design, HF sub-bands are grouped, and noise-free LF information is used to correct and guide their recovery. This strategy is based on two considerations. First, it reduces network complexity and parameter count. Second, although the sub-bands correspond to HF information in different orientations, they are correlated and complementary as components of the same image. Joint processing enhances the representation of HF content, whereas processing them separately would require a multi-branch architecture, inevitably increasing complexity and parameters. 
Future work will explore approaches to reduce complexity and parameters when processing HF sub-bands individually, while strengthening their correlations to improve recovery. For structural simplicity, SCM is applied to both HF and LF feature extraction. However, redundancy exists when extracting LF features, and future studies will explore the use of different Mamba modules for HF and LF features to further optimize computational efficiency.  Conclusions  Wave-MambaCT integrates DWT for multi-scale decomposition, a residual module for local feature extraction, and an SCM module for efficient global dependency modeling to address the denoising challenges of LDCT images. By decoupling noise from LF content through DWT, the model enables targeted noise removal in the HF domain, facilitating effective noise suppression. The designed RSCM, composed of residual blocks and SCM modules, captures fine-grained textures and long-range interactions, enhancing the extraction of both local and global information. In parallel, the Cross-band Enhancement Module (CEM) employs noise-free LF features to refine HF components through attention-based CFM, ensuring structural consistency across scales. Ablation studies (Table 5) confirm the essential contributions of both SCM and CEM modules to maintaining high performance. Importantly, the model’s staged denoising strategy achieves a favorable balance between noise reduction and structural preservation, yielding robustness to varying radiation doses and complex noise distributions.
Source Code Vulnerability Detection Method Integrating Code Sequences and Property Graphs
YANG Hongyu, LUO Jingchuan, CHENG Xiang, HU Juncheng
Available online  , doi: 10.11999/JEIT250470
Abstract:
  Objective  Code vulnerabilities create opportunities for hacker intrusions, and if they are not promptly identified and remedied, they pose serious threats to cybersecurity. Deep learning–based vulnerability detection methods leverage large collections of source code to learn secure programming patterns and vulnerability characteristics, enabling the automated identification of potential security risks and enhancing code security. However, most existing deep learning approaches rely on a single network architecture, extracting features from only one perspective, which constrains their ability to comprehensively capture multi-dimensional code characteristics. Some studies have attempted to address this by extracting features from multiple dimensions, yet the adopted feature fusion strategies are relatively simplistic, typically limited to feature concatenation or weighted combination. Such strategies fail to capture interdependencies among feature dimensions, thereby reducing the effectiveness of feature fusion. To address these challenges, this study proposes a source code vulnerability detection method integrating code sequences and property graphs. By optimizing both feature fusion and vulnerability detection processes, the proposed method effectively enhances the accuracy and robustness of vulnerability detection.  Methods  The proposed method consists of four components: feature representation, feature extraction, feature fusion, and vulnerability detection (Fig. 1). First, vector representations of the code sequence and the Code Property Graph (CPG) are obtained. Using word embedding and node embedding techniques, the code sequence and graph nodes are mapped into fixed-dimensional vectors, which serve as inputs for subsequent feature extraction. Next, the pre-trained UniXcoder model is employed to capture contextual information and extract semantic features from the code. 
In parallel, a Residual Gated Graph Convolution Network (RGGCN) is applied to the CPG to capture complex structural information, thereby extracting graph structural features. To integrate these complementary representations, a Multimodal Attention Fusion Network (MAFN) is designed to model the interactions between semantic and structural features. This network generates informative fused features for the vulnerability detection task. Finally, a MultiLayer Perceptron (MLP) performs classification on the semantic features, structural features, and fused features. An interpolated prediction classifier is then applied to optimize the detection process by balancing multiple prediction outcomes. By adaptively adjusting the model’s focus according to the characteristics of different code samples, the classifier enables the detection model to concentrate on the most critical features, thereby improving overall detection accuracy.  Results and Discussions  To validate the effectiveness of the proposed method, comparative experiments were conducted against baseline approaches on the Devign, Reveal, and SVulD datasets. The experimental results are summarized in (Tables 1–3). On the Devign dataset, the proposed method achieved an accuracy improvement of 1.38% over SCALE and a precision improvement of 5.19% over CodeBERT. On the Reveal dataset, it improved accuracy by 0.08% compared with SCALE, with precision being closest to that of SCALE. On the SVulD dataset, the method achieved an accuracy improvement of 0.13% over SCALE and a precision gain of 8.15% over Vul-LMGNNs. Collectively, these results demonstrate that the proposed method consistently yields higher accuracy and precision. This improvement can be attributed to its effective integration of semantic information extracted by UniXcoder and structural information captured by RGGCN. 
By contrast, CodeBERT and LineVul effectively learn code semantics but exhibit insufficient understanding of complex structural patterns, resulting in weaker detection performance. Devign and Reveal employ gated graph neural networks to capture structural information from code graphs but lack the ability to model semantic information contained in code sequences, which constrains their performance. Vul-LMGNNs attempt to improve detection performance by jointly learning semantic and structural features; however, their feature fusion strategy relies on simple concatenation. This approach fails to account for correlations between features, severely limiting the expressive power of the fused representation and reducing detection performance. In contrast, the proposed method fully leverages and integrates semantic and structural features through multimodal attention fusion. By modeling feature interactions rather than treating them independently, it achieves superior accuracy and precision, enabling more effective vulnerability detection.  Conclusions  Fully integrating code features across multiple dimensions can significantly enhance vulnerability detection performance. Compared with baseline methods, the proposed approach enables deeper modeling of interactions among code features, allowing the detection model to develop a more comprehensive understanding of code characteristics and thereby achieve superior detection accuracy and precision.
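The contrast between concatenation and interaction-aware fusion can be sketched with a minimal cross-attention layer in NumPy: queries come from one modality and keys/values from the other, so each fused feature is conditioned on both. This is a generic scaled dot-product sketch with random stand-in features and weights, not the MAFN architecture itself:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
sem = rng.normal(size=(10, d))    # token-level semantic features (stand-in for UniXcoder outputs)
struct = rng.normal(size=(6, d))  # node-level structural features (stand-in for RGGCN on the CPG)

def cross_attention(q_feats, kv_feats, Wq, Wk, Wv):
    """Scaled dot-product cross-attention: queries from one modality, keys/values from the other."""
    q, k, v = q_feats @ Wq, kv_feats @ Wk, kv_feats @ Wv
    scores = q @ k.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # row-wise softmax over the other modality
    return attn @ v

Wq, Wk, Wv = [0.1 * rng.normal(size=(d, d)) for _ in range(3)]

sem_enriched = cross_attention(sem, struct, Wq, Wk, Wv)      # structure-aware tokens
struct_enriched = cross_attention(struct, sem, Wq, Wk, Wv)   # semantics-aware graph nodes
fused = np.concatenate([sem_enriched.mean(axis=0), struct_enriched.mean(axis=0)])
assert fused.shape == (2 * d,)
```

Unlike plain concatenation, each pooled half of `fused` already mixes information from both views, which is what lets a downstream classifier exploit cross-modal correlations.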
Research on Directional Modulation Multi-carrier Waveform Design for Integrated Sensing and Communication
HUANG Gaojian, ZHANG Shengzhuang, DING Yuan, LIAO Kefei, JIN Shuanggen, LI Xingwang, OUYANG Shan
Available online  , doi: 10.11999/JEIT250680
Abstract:
  Objective  With the concurrent evolution of wireless communication and radar technologies, spectrum congestion has become increasingly severe. Integrated Sensing and Communication (ISAC) has emerged as an effective approach that unifies sensing and communication functionalities to achieve efficient spectrum and hardware sharing. Orthogonal Frequency Division Multiplexing (OFDM) signals are regarded as a key candidate waveform due to their high flexibility. However, estimating target azimuth angles and suppressing interference from non-target directions remain computationally demanding, and confidential information transmitted in these directions is vulnerable to eavesdropping. To address these challenges, the combination of Directional Modulation (DM) and OFDM, termed OFDM-DM, provides a promising solution. This approach enables secure communication toward the desired direction, suppresses interference in other directions, and reduces radar signal processing complexity. The potential of OFDM-DM for interference suppression and secure waveform design is investigated in this study.  Methods  As a physical-layer security technique, DM is used to preserve signal integrity in the intended direction while deliberately distorting signals in other directions. Based on this principle, an OFDM-DM ISAC waveform is developed to enable secure communication toward the target direction while simultaneously estimating distance, velocity, and azimuth angle. The proposed waveform has two main advantages: the Bit Error Rate (BER) at the radar receiver is employed for simple and adjustable azimuth estimation, and interference from non-target directions is suppressed without additional computational cost. The waveform maintains the OFDM constellation in the target direction while distorting constellation points elsewhere, which reduces correlation with the original signal and enhances target detection through time-domain correlation. 
Moreover, because element-wise complex division in the Two-Dimensional Fast Fourier Transform (2-D FFT) depends on signal integrity, phase distortion in signals from non-target directions disrupts phase relationships and further diminishes the positional information of interference sources.  Results and Discussions  In the OFDM-DM ISAC system, the transmitted signal retains its communication structure within the target beam, whereas constellation distortion occurs in other directions. Therefore, the BER at the radar receiver exhibits a pronounced main lobe in the target direction, enabling accurate azimuth estimation (Fig. 5). In the time-domain correlation algorithm, the target distance is precisely determined, while correlation in non-target directions deteriorates markedly due to DM, thereby achieving effective interference suppression (Fig. 6). Additionally, during 2-D FFT processing, signal distortion disrupts the linear phase relationship among modulation symbols in non-target directions, causing conventional two-dimensional spectral estimation to fail and further suppressing positional information of interference sources (Fig. 7). Additional simulations yield one-dimensional range and velocity profiles (Fig. 8). The results demonstrate that the OFDM-DM ISAC waveform provides structural flexibility, physical-layer security, and low computational complexity, making it particularly suitable for environments requiring high security or operating under strong interference conditions.  Conclusions  This study proposes an OFDM-DM ISAC waveform and systematically analyzes its advantages in both sensing and communication. The proposed waveform inherently suppresses interference from non-target directions, eliminating target ambiguity commonly encountered in traditional ISAC systems and thereby enhancing sensing accuracy. 
Owing to the spatial selectivity of DM, only legitimate directions can correctly demodulate information, whereas unintended directions fail to recover valid data, achieving intrinsic physical-layer security. Compared with existing methods, the proposed waveform simultaneously attains secure communication and interference suppression without additional computational burden, offering a lightweight and high-performance solution suitable for resource-constrained platforms. Therefore, the OFDM-DM ISAC waveform enables high-precision sensing while maintaining communication security and hardware feasibility, providing new insights for multi-carrier ISAC waveform design.
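The range-Doppler processing and the DM-induced interference suppression described above can be reproduced in a few lines: element-wise division removes the data symbols, a 2-D (I)FFT localizes the target, and scrambled phases in a non-target direction destroy the peak. This is a textbook OFDM-radar sketch with illustrative sizes, not the paper's full waveform:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sc, n_sym = 64, 32   # subcarriers x OFDM symbols
tx = np.exp(1j * np.pi / 4 * (2 * rng.integers(0, 4, size=(n_sc, n_sym)) + 1))  # QPSK grid

# A point target imprints linear phases: over subcarriers (range bin k) and symbols (Doppler bin l).
k_rng, l_dop = 5, 3
m = np.arange(n_sc)[:, None]
n = np.arange(n_sym)[None, :]
rx = tx * np.exp(-2j * np.pi * k_rng * m / n_sc) * np.exp(2j * np.pi * l_dop * n / n_sym)

# Element-wise division removes the data symbols; a 2-D (I)FFT yields the range-Doppler map.
rd = np.fft.fft(np.fft.ifft(rx / tx, axis=0), axis=1)
peak = np.unravel_index(np.abs(rd).argmax(), rd.shape)
assert peak == (k_rng, l_dop)

# In a non-target direction, DM scrambles the constellation phases, so the division no longer
# leaves a clean linear phase and the range-Doppler peak collapses toward the noise floor.
rx_dm = rx * np.exp(1j * rng.uniform(-np.pi, np.pi, size=rx.shape))
rd_dm = np.fft.fft(np.fft.ifft(rx_dm / tx, axis=0), axis=1)
assert np.abs(rd_dm).max() < 0.5 * np.abs(rd).max()
```

The second assertion is the interference-suppression property: no extra filtering is applied, yet the distorted direction loses its positional signature for free.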
Considering Workload Uncertainty in Policy Gradient-Based Hyper-Heuristic Scheduling for Software Projects
SHEN Xiaoning, SHI Jiangyi, MA Yanzhao, CHEN Wenyan, SHE Juan
Available online  , doi: 10.11999/JEIT250769
Abstract:
  Objective  The Software Project Scheduling Problem (SPSP) is critical for optimizing resource allocation and task sequencing in software development, directly impacting economic efficiency and market competitiveness. However, traditional SPSP models assume deterministic task attributes, ignoring pervasive uncertainties such as fluctuating efforts due to demand changes or estimation inaccuracies. These limitations often lead to infeasible or suboptimal scheduling solutions in real-world dynamic environments. To address this gap, this study establishes a novel multi-objective software project scheduling model that explicitly incorporates task effort uncertainty. The model employs asymmetric triangular interval type-2 fuzzy numbers to robustly characterize effort variability, ensuring a realistic representation of complex software development environments. The primary objective is to enhance decision-making quality under uncertainty by developing an efficient optimization algorithm that minimizes project duration while maximizing employee satisfaction, thereby improving scheduling robustness and adaptability in dynamic software projects.  Methods  A Policy Gradient-based Hyper-Heuristic Algorithm (PGHHA) is developed to address the formulated model. The algorithm framework consists of a High Level Strategy (HLS) and a set of Low Level Heuristics (LLHs). HLS employs an Actor-Critic reinforcement learning architecture, where the Actor network selects appropriate LLHs based on real-time evolutionary states characterized by population convergence and diversity, while the Critic network evaluates the quality of the selected actions. Eight LLHs are designed by combining two global search operators (the matrix crossover operator and the Jaya operator with random Jitter) with two local mining strategies (duration-based local search and satisfaction-based local search). 
Each LLH is further configured with two levels of neighborhood search depth (V1=5, V2=20), whose values are determined through Taguchi orthogonal experiments. Each individual is encoded as a real-valued task-employee effort matrix, and constraints such as skill coverage, the maximum dedication and the maximum participant limit are enforced during optimization. To accelerate convergence, a prioritized experience replay mechanism is integrated to sample and learn from historical interaction trajectories, thereby efficiently updating network parameters.  Results and Discussions  Experimental evaluations were conducted on 12 synthetic instances and three real-world software projects. The proposed strategies were validated, and PGHHA was compared with six state-of-the-art algorithms. Performance was measured using Hypervolume Ratio (HVR) and Inverted Generational Distance (IGD), with statistical significance assessed via Wilcoxon rank-sum tests at a significance level of 0.05. Results demonstrate that PGHHA significantly outperforms all comparison algorithms in both convergence and diversity across most test instances (Table 5, Table 6). Visual comparisons of Pareto fronts (Fig. 4, Fig. 5) further confirm that solutions obtained by PGHHA are located below those of other algorithms, reflecting enhanced convergence precision, while also exhibiting better spread and uniformity. Although PGHHA requires longer computational time due to the neural network training and experience replay mechanism (Fig. 6), the significant improvement in solution quality is considered acceptable given the typically longer cycles of software development projects. The incorporation of asymmetric triangular interval type-2 fuzzy numbers effectively handles task effort uncertainty, and the dynamic selection of LLH via the Actor-Critic framework, combined with prioritized experience replay, contributes to the algorithm’s robust performance in uncertain environments. 
These results validate that PGHHA provides a more effective scheduling support tool, balancing multiple objectives under uncertainty without compromising solution diversity.  Conclusions  This paper establishes a multi-objective software project scheduling model that incorporates task effort uncertainty using asymmetric triangular interval type-2 fuzzy numbers. To solve this model, a policy gradient-based hyper-heuristic algorithm is proposed, which employs an Actor-Critic reinforcement learning framework as the high-level strategy to dynamically select low-level heuristics according to the evolutionary state of the population. A prioritized experience replay mechanism is integrated to improve learning efficiency and convergence speed. Experimental results on synthetic and real-world instances demonstrate that: (1) The proposed algorithm achieves significantly better convergence and diversity in uncertain environments compared to six state-of-the-art algorithms; (2) The combination of global search operators and local mining strategies effectively balances exploration and exploitation during evolution; (3) The use of type-2 fuzzy numbers provides a more robust representation of effort uncertainty than type-1 fuzzy numbers. However, the current model is limited to single-project scenarios. In future research, the model will be extended to multi-project scheduling environments with shared resources and cross-project dependencies. Furthermore, the integration of more adaptive reward mechanisms and lightweight neural architectures will be explored to reduce computational cost while maintaining solution quality.
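The core loop of a policy-gradient high-level strategy choosing among LLHs can be sketched as a softmax policy with a REINFORCE update and a running-average baseline standing in for the Critic. The toy "environment" (one fixed mean improvement per LLH) is an illustrative assumption, far simpler than the paper's state-dependent Actor-Critic with prioritized replay:

```python
import numpy as np

rng = np.random.default_rng(0)
n_llh = 8                       # eight low-level heuristics (LLHs)
theta = np.zeros(n_llh)         # policy logits, one per LLH
baseline, lr = 0.0, 0.1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy stand-in for the scheduler: each LLH yields a noisy fitness improvement;
# LLH 3 is the best on average.
mean_gain = np.linspace(0.1, 0.4, n_llh)
mean_gain[3] = 1.0

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(n_llh, p=probs)          # the high-level strategy samples an LLH
    reward = mean_gain[a] + 0.1 * rng.normal()
    grad = -probs
    grad[a] += 1.0                          # gradient of log pi(a)
    theta += lr * (reward - baseline) * grad
    baseline += 0.05 * (reward - baseline)  # running-average baseline as a crude critic

assert softmax(theta).argmax() == 3         # the policy concentrates on the best LLH
```

In PGHHA the scalar logits would be replaced by an Actor network conditioned on convergence/diversity state features, but the update rule has the same shape.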
Unsupervised 3D Medical Image Segmentation With Sparse Radiation Measurement
YU Xiaofan, ZOU Lanlan, GU Wenqi, CAI Jun, KANG Bin, DING Kang
Available online  , doi: 10.11999/JEIT250841
Abstract:
  Objective  Three-dimensional medical image segmentation is recognized as a pivotal task in modern medical image analysis. Compared with two-dimensional imaging, it captures the spatial morphology of organs and lesions more comprehensively and provides clinicians with detailed structural information, thereby facilitating early disease screening, personalized surgical planning, and treatment evaluation. With rapid advances in artificial intelligence, three-dimensional segmentation is increasingly regarded as a key technology for diagnostic support, precision therapy, and intraoperative navigation. However, existing methods such as SwinUNETR-v2 and UNETR++ rely heavily on large-scale voxel-level annotations, which incur high annotation costs and limit clinical applicability. Moreover, high-quality segmentation frequently depends on multi-view projections to obtain complete volumetric data, resulting in increased radiation exposure and physiological burden for patients. Consequently, segmentation under sparse radiation measurements is posed as an important challenge. Neural Attenuation Fields (NAF) have recently been proposed as a promising approach for low-dose reconstruction by recovering linear attenuation coefficient fields from sparse views. Nevertheless, their potential for three-dimensional segmentation remains largely unexplored. To address this gap, a unified framework named NA-SAM3D is proposed, which integrates NAF-based reconstruction with interactive segmentation to achieve unsupervised 3D segmentation under sparse-view conditions, thereby reducing annotation dependence and improving boundary perception.  Methods  The proposed framework is designed in two stages. In the first stage, sparse-view reconstruction is performed using NAF to generate a continuous three-dimensional attenuation coefficient tensor from sparse X-ray projections. 
Ray sampling and positional encoding are applied to arbitrary 3D points, and the encoded features are passed into a multi-layer perceptron (MLP) to predict linear attenuation coefficients that serve as input for subsequent segmentation. In the second stage, interactive segmentation is conducted. A three-dimensional image encoder is used to extract high-dimensional features from the attenuation coefficient tensor, while clinician-provided point prompts indicate regions of interest. These prompts are embedded into semantic features by an interactive user module and fused with image features to guide the mask decoder in producing preliminary masks. Because point prompts provide only local positional cues, boundary ambiguity and mask over-expansion may occur. To mitigate these issues, a Density-Guided Module (DGM) is introduced at the decoder output stage: NAF-derived attenuation coefficients are converted into a density-aware attention map, which is fused with preliminary mask predictions to strengthen tissue boundary perception and improve segmentation accuracy in complex anatomical regions.  Results and Discussions  NA-SAM3D is validated on a self-constructed colorectal cancer dataset comprising 299 patient cases (in collaboration with Nanjing Hospital of Traditional Chinese Medicine) and on two public benchmarks: the Lung CT Segmentation Challenge (LCTSC) and the Liver Tumor Segmentation Challenge (LiTS). Experimental results show that NA-SAM3D achieves overall better performance than the mainstream unsupervised 3D segmentation methods based on full radiation observation (SAM-MED series) and reaches accuracy comparable to or even higher than the fully supervised model SwinUNETR-v2. Compared with SAM-MED3D, NA-SAM3D improves the Dice on the LCTSC dataset by more than 3%, while HD95 and ASD decrease by 5.29 mm and 1.32 mm, respectively, demonstrating better boundary localization and surface consistency. 
Compared with the sparse-field-based segmentation method SA3D, NA-SAM3D achieves higher Dice scores on all three datasets (Table 1). Compared with the fully supervised model SwinUNETR-v2, NA-SAM3D reduces HD95 by 1.28 mm, and the average Dice is only 0.3% lower. Compared with SA3D, NA-SAM3D increases the average Dice by about 6.6% and reduces HD95 by about 11 mm, further verifying its ability to recover structural details and boundary information under sparse-view conditions (Table 2). Although the overall performance of NA-SAM3D is slightly lower than that of the fully supervised UNETR++ model, it still demonstrates strong competitiveness and good generalization under label-free inference. Qualitative results show that in complex pelvic and intestinal regions, NA-SAM3D produces clearer boundaries and higher contour consistency (Fig. 3). On public datasets, segmentation of the lung and liver also shows superior boundary localization and contour integrity (Fig. 4). Three-dimensional visualization further verifies that in colorectal, lung, and liver regions, NA-SAM3D achieves better structural continuity and boundary preservation than SAM-MED2D and SAM-MED3D (Fig. 5). The Density-Guided Module further improves boundary sensitivity, increasing Dice and mIoU by 1.20% and 3.31% on the self-constructed dataset, and by 4.49 and 2.39 percentage points on the LiTS dataset (Fig. 6).  Conclusions  An unsupervised 3D medical image segmentation framework, NA-SAM3D, is proposed, which integrates NAF reconstruction with interactive 3D segmentation to achieve high-precision segmentation under sparse radiation measurements. The Density-Guided Module effectively leverages attenuation coefficient priors to enhance recognition of complex lesion boundaries. 
Experimental results demonstrate that the method approaches the performance of fully supervised approaches under unsupervised inference, with an average 2.0% Dice improvement, indicating substantial practical value and clinical potential for low-dose imaging and complex anatomical segmentation. Future work will focus on optimizing the model for additional anatomical regions and evaluating its practical application in preoperative planning.
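The positional-encoding step feeding the NAF MLP can be sketched as standard NeRF-style Fourier features over 3-D points. The frequency count and normalization here are generic assumptions, not the paper's exact settings:

```python
import numpy as np

def positional_encoding(xyz, n_freqs=4):
    """Fourier-feature encoding of 3-D points, as used in NeRF/NAF-style coordinate networks."""
    freqs = (2.0 ** np.arange(n_freqs)) * np.pi          # octave-spaced frequencies
    scaled = xyz[..., None, :] * freqs[:, None]          # (..., n_freqs, 3)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return enc.reshape(*xyz.shape[:-1], -1)              # (..., n_freqs * 6)

pts = np.array([[0.1, 0.2, 0.3],
                [0.5, 0.5, 0.5]])
feat = positional_encoding(pts)
assert feat.shape == (2, 4 * 6)
```

Lifting raw coordinates to these oscillatory features lets a small MLP represent high-frequency variation in the attenuation field, which is what makes sharp tissue boundaries recoverable from sparse views.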
Optimization of Short Packet Communication Resources for UAV Assisted Power Inspection
CHU Hang, DONG Zhihao, CAO Jie, SHI Huaifeng, ZENG Haiyong, ZHU Xu
Available online  , doi: 10.11999/JEIT250852
Abstract:
  Objective  In Unmanned Aerial Vehicle (UAV)-assisted power grid inspection, real-time collection and transmission of multi-modal data (key parameters, images, and videos) are critical for secure grid operation. These tasks present heterogeneous communication demands, including ultra-reliable low-latency and real-time high-bandwidth. However, the scarcity of wireless communication resources and UAV energy constraints make these demands difficult to meet, which in turn compromises data timeliness and overall task effectiveness. To address these challenges, this article aims to develop a collaborative optimization framework for data transmission scheduling and communication resource allocation, thereby minimizing system overhead while strictly satisfying task performance and reliability requirements.  Methods  To address the challenges mentioned above, this article constructs a collaborative optimization framework for data transmission scheduling and communication resource allocation. In terms of data transmission scheduling, it is modeled as a Markov Decision Process (MDP), incorporating communication consumption into the decision cost. At the resource allocation level, Non-Orthogonal Multiple Access (NOMA) technology is introduced to improve spectral efficiency. This method can significantly reduce communication costs while ensuring transmission reliability, providing effective support for heterogeneous data transmission in UAV-assisted power inspection scenarios.  Results and Discussions  To verify the effectiveness of the proposed framework, comprehensive simulations were conducted. A scenario was established where the task of the drone is to collect data from multiple distributed power towers within a designated area. There is a trade-off between reliability and speed (Fig. 3). At the same transmission rate, the bit error rate can be reduced by about an order of magnitude. 
When the minimum long-packet signal-to-noise ratio threshold of 7 dB is adopted in the simulation, the optimized transmission system can reduce the bit error rate from the 10⁻³ level to the 10⁻⁵ level while sacrificing only about 0.4 Mbps of transmission rate. After algorithm optimization, a lower effective signal-to-noise ratio is required at the same bit error rate; under the same signal-to-noise ratio, the short-packet error rate is lower, which means that the system performance is more stable and the transmission efficiency is higher (Fig. 4).  Conclusions  This paper proposes a novel collaborative optimization framework that effectively addresses the challenges of limited resources and heterogeneous demands in UAV power inspection. By establishing a coordinated framework that deeply integrates MDP-based adaptive scheduling with NOMA-based joint resource allocation, it successfully balances the trade-off between communication performance and system overhead. This work provides a valuable theoretical and practical foundation for achieving efficient, low-cost, and reliable data transmission in future intelligent autonomous aerial systems.
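The short-packet rate/reliability trade-off underlying these results can be sketched with the finite-blocklength normal approximation (Polyanskiy-style). The SNR, blocklengths, and error target below are illustrative values, not the paper's simulation parameters:

```python
import math
from statistics import NormalDist

def short_packet_rate(snr, n, eps):
    """Normal-approximation achievable rate (bits/channel use) for an AWGN channel
    at blocklength n and block error probability eps."""
    capacity = math.log2(1 + snr)
    dispersion = (1 - 1 / (1 + snr) ** 2) * (math.log2(math.e)) ** 2  # channel dispersion V
    q_inv = NormalDist().inv_cdf(1 - eps)                             # Q^{-1}(eps)
    return capacity - math.sqrt(dispersion / n) * q_inv

# At 7 dB SNR with a strict 1e-5 error target, short packets pay a visible rate
# penalty relative to long packets, and both sit below Shannon capacity.
snr = 10 ** (7 / 10)
r_short = short_packet_rate(snr, 200, 1e-5)
r_long = short_packet_rate(snr, 2000, 1e-5)
cap = math.log2(1 + snr)
assert r_short < r_long < cap
```

The backoff term sqrt(V/n)·Q⁻¹(ε) is exactly the price of reliability at short blocklengths: tightening ε or shrinking n both cut the usable rate, which is the trade-off Fig. 3 of the paper visualizes.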
High Area-efficiency Radix-4 NTT Hardware Architecture with Conflict-free Memory Access Optimization for Lattice-based Cryptography
ZHENG Jiwen, ZHAO Shilei, ZHANG Ziyue, LIU Zhiwei, YU Bin, HUANG Hai
Available online  , doi: 10.11999/JEIT250687
Abstract:
  Objective  With the advancement of Post-Quantum Cryptography (PQC) standardization, the efficient implementation of Number Theoretic Transform (NTT) modules has become crucial for enhancing the performance of PQC algorithms. While existing research on high-radix NTT primarily focuses on optimizing in-place computation structures and achieving configurability, it often overlooks performance bottlenecks caused by complex memory access patterns and lacks dedicated optimizations for the specific parameters of PQC algorithms. To address these challenges, this paper proposes a high area-efficiency radix-4 NTT design with a Constant-Geometry (CG) structure. The solution implements targeted optimizations for the modular multiplication unit by analyzing the common characteristics of moduli and integrating multi-level operations, while optimizing memory allocation and address generation strategies to effectively reduce memory capacity and improve data access efficiency. This paper provides an efficient solution for implementing radix-4 CG NTT in out-of-place storage, achieving conflict-free memory access.  Methods  At the algorithmic level, the proposed radix-4 CG NTT/INTT employs a low-complexity design and eliminates the bit-reversal operation to reduce multiplication count and computation cycles, with a correspondingly redesigned twiddle factor access mechanism. Regarding the most time-consuming modular multiplication in the radix-4 butterfly unit, the critical path is effectively shortened by integrating the multiplication with the first-stage K-RED reduction and simplifying the correction logic. Furthermore, to support three parameter configurations, a scalable modular multiplication scheme is proposed by analyzing common characteristics of different moduli. At the architectural level, a strategy is employed where two coefficients are concatenated and stored at the same memory address. 
By designing a data decomposition and reorganization scheme, the interaction between the memory and the dual-butterfly units is efficiently managed. To efficiently achieve conflict-free memory access, a cyclic memory space reuse strategy is employed, and read and write address generation schemes based on sequential and stepped access patterns are designed, significantly reducing the required memory capacity and the complexity of control logic.  Results and Discussions  Experimental results on Field Programmable Gate Arrays (FPGAs) demonstrate that the proposed NTT architecture achieves high operating frequencies and low resource consumption under three parameter configurations, while also significantly improving the Area-Time Product (ATP) compared to existing designs (Table 5). For the configuration with 256 terms and a modulus of 7681, it consumes 2397 slices, 4 BRAMs, and 16 DSPs, achieving an operating frequency of 363 MHz, resulting in at least a 56.4% improvement in ATP. For the configuration with 256 terms and a modulus of 8380417, it consumes 3760 slices, 6 BRAMs, and 16 DSPs, achieving an operating frequency of 338 MHz, resulting in at least a 69.8% improvement in ATP. For the configuration with 1024 terms and a modulus of 12289, it consumes 2379 slices, 4 BRAMs, and 16 DSPs, achieving an operating frequency of 357 MHz, resulting in at least a 50.3% improvement in ATP.  Conclusions  This paper proposes a high area-efficiency radix-4 NTT hardware architecture for lattice-based PQC algorithms. By employing a low-complexity radix-4 CG NTT/INTT and eliminating the bit-reversal operation, the latency is reduced. By analyzing the common characteristics of three specific moduli and merging partial computations, a scalable modular multiplication architecture based on K²-RED reduction is designed. 
This paper tackles the challenges of storage space doubling and address generation complexity by efficiently reusing memory and designing address generation schemes based on sequential and stepped access patterns. Experimental results demonstrate that the proposed scheme significantly improves operating frequency and reduces resource consumption, resulting in lower ATP under three parameter configurations. However, this study only considers a dual-butterfly unit architecture. Future research should further explore architectural designs with higher degrees of parallelism to meet performance requirements for various application scenarios.
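The K-RED family of reductions mentioned above exploits moduli of the form q = k·2^m + 1. A minimal Python sketch, using q = 7681 = 15·2^9 + 1 (one of the paper's three moduli), illustrates the idea behind the merged K²-RED modular multiplication; it is a toy congruence check, not the paper's hardware datapath.

```python
# Sketch of K-RED reduction for a modulus of the form Q = K*2^M + 1,
# here Q = 7681 = 15*2^9 + 1 (one of the three supported moduli).
Q, K, M = 7681, 15, 9

def k_red(c: int) -> int:
    """Return a value congruent to K*c (mod Q) using only shifts, adds,
    and one small constant multiply -- no division by Q."""
    c_lo = c & ((1 << M) - 1)   # low M bits
    c_hi = c >> M               # arithmetic shift, valid for c < 0 too
    # Since K*2^M = Q - 1 ≡ -1 (mod Q):
    #   K*c = K*c_hi*2^M + K*c_lo ≡ K*c_lo - c_hi  (mod Q)
    return K * c_lo - c_hi

def mod_mul(a: int, b: int) -> int:
    """Two chained reductions ("K^2-RED"): result ≡ K^2 * a * b (mod Q).
    A real design pre-divides the K^2 factor out of stored twiddles."""
    return k_red(k_red(a * b)) % Q
```

The congruence, rather than the final `% Q`, is what the hardware exploits: the intermediate stays in a narrow range, so folding it back into [0, Q) needs only cheap correction logic rather than a full modular divide.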
Full Field-of-View Optical Calibration with Microradian-Level Accuracy for Space Laser Communication Terminals on LEO Constellation Applications
XIE Qingkun, XU Changzhi, BIAN Jingying, ZHENG Xiaosong, ZHANG Bo
Available online  , doi: 10.11999/JEIT250734
Abstract:
  Objective  The coarse pointing assembly (CPA) serves as a core component in laser communication systems, enabling wide-field scanning, active orbit-attitude compensation, and dynamic disturbance isolation. To counteract multi-source disturbances such as orbital perturbations and attitude maneuvers, it is essential to develop a high-precision, high-bandwidth, and fast-response pointing, acquisition, and tracking (PAT) algorithm. Establishing a full field-of-view (FOV) optical calibration model between the CPA and the detector is critical to suppress image degradation caused by spatial pointing deviations. Conventional calibration methods typically employ ray tracing to simulate beam offsets and infer calibration relationships. However, they exhibit several inherent limitations, including: high modeling complexity arising from non-coaxial paths, multi-reflective surfaces, and freeform optics; susceptibility to systematic errors due to assembly tolerances, detector non-uniformity, and thermal drift; limited applicability across the full FOV resulting from spatial anisotropy. To overcome these technical barriers and ensure stable and reliable laser communication links, a high-precision calibration method applicable over the entire FOV is urgently needed.  Methods  To achieve precise CPA-detector calibration and overcome the shortcomings of traditional methods, this paper proposes a full field-of-view optical calibration method with microradian-level accuracy. Based on the optical design features of periscope-type laser terminals, an equivalent optical transmission model of the CPA is established and the mechanism of image rotation is analyzed. Leveraging the structural rigidity of the optical transceiver channel, the optical transmission matrix is simplified to a constant matrix, yielding a full-space calibration model that directly relates CPA micro-perturbations to the spot displacements. 
By correlating the CPA rotation angles between the calibration target points and the actual operating positions, the calibration task is further reduced to estimating the calibration matrix at the target points. Random micro-perturbations are applied to the CPA, inducing corresponding micro-displacements of the detector spot. A calibration equation based on the CPA motion and spot displacement is formulated, and the calibration matrix is accurately estimated via least-squares regression. Finally, the full-space calibration relationship between the CPA and detector is derived through matrix operations.  Results and Discussions  Based on the proposed calibration method, an experimental platform (Fig. 4) is built to conduct calibration and verification using a periscope laser terminal. Accurate measurements of the conjugate motion relationship between the CPA and the CCD detector spot are obtained (Table 1). To comprehensively assess the calibration accuracy and full-space applicability, systematic verification is conducted, including single-step static pointing and continuous dynamic tracking. In the static pointing verification, the mechanical rotary table is moved to three extreme diagonal positions, and the CPA performs open-loop pointing based on the established CPA-detector calibration relationship. Experimental results confirm that the spot accurately reaches the intended target position (Fig. 5), with a pointing accuracy of less than 12 μrad (RMS). In the dynamic tracking experiment, the system control parameters are optimized to ensure stable tracking of the platform beam. During the low-angular-velocity motion of the rotary table, the laser terminal maintains stable tracking (Fig. 6). The CPA trajectory exhibits a clear conjugate relationship with the rotary table motion (Fig. 6(a), Fig. 6(b)), and the tracking accuracy in both orthogonal directions is less than 4 μrad (Fig. 6(c), Fig. 6(d)). 
Furthermore, the independence of the optical transmission matrix from calibration target point selection is discussed. By improving the spatial accessibility of calibration points, this method reduces operational complexity without compromising calibration precision. Strategic optimization of the spatial distribution of calibration points further enhances calibration efficiency and accuracy.  Conclusions  This paper proposes a full field-of-view optical calibration method with microradian-level accuracy, based on single-target micro-perturbation measurement. To meet the engineering requirements of rapid linking and stable tracking, a full-space optical matrix model for the CPA-detector calibration is built via matrix optics. Random micro-perturbations applied to the CPA at a single target point yield a generalized transfer equation, from which the calibration matrix is determined by the least-squares estimation. Experimental results demonstrate that this model effectively mitigates issues such as image rotation, mirroring, and tracking anomalies, suppressing calibration residuals below 12 μrad across the entire FOV and limiting the dynamic tracking error to within 5 μrad per axis. The method eliminates the need for additional hardware and complex alignment procedures, providing a high-precision, low-complexity solution that supports rapid deployment in the mass production of Low-Earth-orbit (LEO) laser terminals.
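The single-target micro-perturbation step admits a compact numerical sketch: random CPA micro-rotations and the induced spot displacements are stacked into a linear system, and the calibration matrix is recovered by least squares. The matrix `A_TRUE`, the perturbation scale, and the noise level below are illustrative assumptions, not the paper's values.

```python
import numpy as np

# Toy version of the calibration step: random micro-perturbations of the CPA
# induce spot micro-displacements, and the 2x2 calibration matrix is
# recovered by least squares. A_TRUE and the noise scale are hypothetical.
rng = np.random.default_rng(0)
A_TRUE = np.array([[1.8, -0.2],
                   [0.3,  2.1]])                    # assumed CPA->spot matrix

d_theta = rng.normal(scale=50e-6, size=(100, 2))    # CPA micro-rotations (rad)
d_spot = d_theta @ A_TRUE.T \
         + rng.normal(scale=1e-7, size=(100, 2))    # measured spot shifts + noise

# Stacked calibration equations d_spot ≈ d_theta @ A.T, solved in the
# least-squares sense; lstsq returns the transposed matrix estimate.
A_hat = np.linalg.lstsq(d_theta, d_spot, rcond=None)[0].T
```

With perturbations well above the spot-measurement noise floor, the estimate converges to the true matrix, which is why a single accessible target point suffices in the paper's scheme.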
3D Localization Method with Uniform Circular Array Driven by Complex Subspace Neural Network
JIANG Wei, ZHI Boxin, YANG Junjie, WAN Hui, DING Pengfei, ZHANG Zheng
Available online  , doi: 10.11999/JEIT250395
Abstract:
  Objective  With the growing demand for high-precision indoor localization in intelligent service scenarios, existing positioning technologies still face significant challenges in complex environments, where factors such as signal frequency offset, multipath propagation, and noise interference severely degrade localization accuracy. To address these challenges, this paper proposes a 3D localization method with a uniform circular array (UCA) driven by a Complex Subspace Neural Network (CSNN), aiming to enhance accuracy and robustness in complex environments.  Methods  The proposed method establishes a complete localization pipeline based on a hierarchical signal processing framework, encompassing frequency offset compensation, two-dimensional angle estimation, and spatial mapping (Fig. 2). Firstly, a dual-estimation frequency compensation algorithm is proposed. By separately estimating the frequency offsets during the Constant Tone Extension (CTE) reference period and sample period, the frequency estimate obtained from the reference period of the CTE signal is used to disambiguate the frequency estimate from the antenna sample period, enabling high-precision frequency compensation. Subsequently, the CSNN algorithm is constructed to estimate the two-dimensional angles (Fig. 3), in which a Complex-Valued Convolutional Neural Network (CVCNN) (Fig. 4) is introduced to calibrate the covariance matrix of received signals, effectively suppressing correlated noise and multipath interference. Furthermore, based on the theory of mode space transformation, the calibrated covariance matrix is projected onto a virtual uniform linear array. Then the azimuth and elevation angles are jointly estimated by the ESPRIT algorithm. Finally, the estimated angles from three access points (APs) are fused to achieve position estimation.  Results and Discussions  Experiments are conducted to evaluate the performance of the proposed method. 
For frequency offset suppression, the proposed dual-estimation frequency compensation algorithm significantly reduces the adverse impact on angle estimation, improving estimation accuracy by 91.7% compared to uncorrected data and showing clear improvements over commonly used methods (Fig. 6). Regarding angle estimation, the CSNN algorithm achieves over 40% and 25% error reduction in azimuth and elevation, respectively, compared to the MUSIC algorithm under simulation conditions (Fig. 7); these results also verify the CVCNN module's capability to suppress various types of interference. In practical experiments, the CSNN algorithm achieves an average azimuth error of 1.07° and an elevation error of 1.28° in the training scenario (Table 1, Fig. 10). Moreover, generalization experiments in three distinct indoor environments (warehouse, corridor, and office) show that the average angular errors remain low at 2.78° for azimuth and 3.39° for elevation (Table 2, Fig. 11). Finally, the proposed method maintains an average positioning accuracy of 28.9 cm in 2D and 36.5 cm in 3D after cross-scene migration (Table 4, Fig. 13).  Conclusions  The proposed high-precision indoor localization method integrates dual-estimation frequency compensation, the CSNN angle estimation algorithm, and three-AP cooperative localization. It achieves excellent performance in both simulation and real-environment experiments, and exhibits strong cross-scene adaptability and accuracy, meeting the requirements of high-precision indoor localization.
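The dual-estimation idea, in which the coarse but unambiguous reference-period estimate resolves the alias ambiguity of the finer sample-period estimate, can be sketched as follows. The 25 kHz ambiguity interval and the example offsets are illustrative assumptions, not the paper's parameters.

```python
# Toy sketch of dual-estimation frequency compensation: a coarse but
# unambiguous estimate (reference period, short antenna spacing in time)
# selects the correct alias of a precise but wrapped estimate (sample
# period, longer spacing). All numbers below are illustrative.

def disambiguate(f_coarse_hz: float, f_fine_wrapped_hz: float,
                 f_amb_hz: float) -> float:
    """Pick the alias of the fine estimate closest to the coarse one."""
    k = round((f_coarse_hz - f_fine_wrapped_hz) / f_amb_hz)  # integer alias index
    return f_fine_wrapped_hz + k * f_amb_hz

F_AMB = 25e3        # assumed ambiguity interval of the sample-period estimate
f_true = 37e3       # true carrier frequency offset (for checking only)
f_coarse = 35.2e3   # noisy, unambiguous reference-period estimate
f_fine = 12e3       # precise sample-period estimate, wrapped into one interval
f_hat = disambiguate(f_coarse, f_fine, F_AMB)
```

The coarse estimate only needs to land within half an ambiguity interval of the truth; the final precision is inherited entirely from the fine estimate.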
Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation
ZHAO Chuanbin, XU Weihua, LIN Bo, ZHANG Tengyu, FENG Yuan, GAO Feifei
Available online  , doi: 10.11999/JEIT250685
Abstract:
  Objective  Integrated Sensing and Communications (ISAC) has become a key enabling technology for sixth-generation (6G) mobile networks, as it can sense and monitor the physical world while communicating with users, thereby empowering emerging application scenarios such as the low-altitude economy, digital twins, and vehicular networking. Currently, existing ISAC research mainly focuses on wireless devices, including base stations and terminals. Meanwhile, visual sensing has long been a hot research topic in computer science, offering advantages such as strong visibility and rich detail. This paper proposes to combine visual sensing with wireless device sensing to construct a new multimodal ISAC system, in which vision senses the environment to assist wireless communications, while wireless signals help overcome the limitations of visual sensing.  Methods  This paper first explores the inherent correlation mechanism between environmental vision and wireless communications. Then, we discuss several key algorithms for visual-sensing-assisted wireless communication, including beam prediction, occlusion prediction, and resource scheduling and allocation methods for multiple base stations and users. These schemes confirm that visual sensing as prior information can enhance the communications performance of the multimodal ISAC system. Next, we discuss the new sensing gains brought by wireless devices combined with visual sensors. Specifically, we propose static environment reconstruction and dynamic target sensing schemes based on the fusion of wireless signals and vision, aiming to obtain global information of the physical world. Besides, we construct a "vision-communication" simulation and measurement dataset, forming a complete theoretical and technical framework for multimodal ISAC.  
Results and Discussions  In terms of visual-sensing-assisted wireless communications, the hardware prototype system constructed in this paper is shown in Figure 6 and Figure 7, and the hardware test results are shown in Table 1. It can be seen that visual sensing helps millimeter-wave communications better complete tasks such as beam alignment and beam prediction, thereby enhancing system communications performance. In terms of wireless-communication-assisted sensing, the hardware prototype system constructed in this paper is shown in Figure 8, and the experimental results are presented in Figure 10 and Table 2. The static environment reconstruction that combines wireless signals and visual sensors is more robust and more accurate. Depth estimation based on visual-communications fusion remains highly robust in rainy and snowy weather, and its RMSE is reduced by about 50% compared with purely visual algorithms. These experimental results indicate that vision-enabled multimodal ISAC systems have great potential for application.  Conclusions  This paper proposes to combine visual sensing with wireless device sensing to construct a new multimodal ISAC system, in which vision senses the environment to assist wireless communications, while wireless signals help overcome the limitations of visual sensing. We discuss several key algorithms for visual-sensing-assisted wireless communication, including beam prediction, occlusion prediction, and resource scheduling and allocation methods for multiple base stations and users. We also discuss the new sensing gains brought by wireless devices combined with visual sensors. Specifically, we propose static environment reconstruction and dynamic target sensing schemes based on the fusion of wireless signals and vision, aiming to obtain global information of the physical world. 
Besides, we construct a "vision-communication" simulation and measurement dataset, forming a complete theoretical and technical framework for multimodal ISAC. The experimental results indicate that vision-enabled multimodal ISAC systems have great potential for application in 6G networks.
Defeating Voice Conversion Forgery by Active Defense with Diffusion Reconstruction
TIAN Haoyuan, CHEN Yuxuan, CHEN Beijing, FU Zhangjie
Available online  , doi: 10.11999/JEIT250709
Abstract:
  Objective  Deep voice generation technology can now produce highly realistic speech. While enriching entertainment and daily life, it is also easily abused by malicious actors for voice forgery, posing significant risks to personal privacy and social security. As one of the mainstream defenses against voice forgery, existing active defense techniques have made some progress, but they remain limited in balancing defense ability with the imperceptibility of defensive examples, as well as in robustness.  Methods  This paper proposes an active defense method against voice conversion forgery based on diffusion reconstruction. The proposed method utilizes the diffusion vocoder PriorGrad as a generator, which guides the gradual denoising process with the diffusion prior of the speech to be protected and reconstructs that speech, directly yielding defensive speech examples. Moreover, the method introduces a multi-scale auditory perceptual loss that suppresses the perturbation amplitude in frequency bands to which the human auditory system is sensitive, thereby enhancing the imperceptibility of defensive examples.  Results and Discussions  Defense experiments on four leading voice conversion models show that, while maintaining the imperceptibility of defensive speech examples and using speaker verification accuracy as the objective metric, compared with the second-best method, the proposed method improves defense ability on average by about 32% in white-box scenarios and about 16% in black-box scenarios, and achieves a better balance between defense ability and imperceptibility (Table 2). 
In the robustness experiments, compared with the second-best method, the proposed method achieves an average improvement of about 29% in white-box scenarios and about 18% in black-box scenarios under three types of compression attacks (Table 3), as well as an average improvement of about 35% in the white-box scenario and about 17% in the black-box scenario under the Gaussian filtering attack (Table 4). In the ablation experiments, the proposed method using the multi-scale auditory perceptual loss achieves a 5% to 10% improvement in defense ability compared with the method using a single-scale auditory perceptual loss (Table 5).  Conclusions  An active defense method against voice conversion forgery based on diffusion reconstruction is proposed in this paper. The method directly reconstructs defensive speech examples that better approximate the distribution of the original target speech data through the diffusion vocoder, and combines a multi-scale auditory perceptual loss to further enhance the imperceptibility of the defensive speech. Experimental results show that, compared with existing methods, the proposed method not only achieves superior defense performance in both white-box and black-box scenarios, but also exhibits robustness against compression coding and smoothing filtering. Although the proposed method attains strong results in defense performance and robustness, its computational efficiency still needs further improvement. Therefore, future work will focus on exploring diffusion generators with a single or a few time steps to improve computational efficiency while preserving defense performance as much as possible.
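A multi-scale auditory perceptual loss of the general kind described above can be sketched as a frequency-weighted, multi-resolution spectral distance. The Gaussian emphasis around 2.5 kHz below is a crude stand-in for an auditory sensitivity curve, not the paper's actual loss formulation.

```python
import numpy as np

# Generic sketch of a multi-scale spectral loss with a frequency weighting.
# The weighting curve is a hypothetical stand-in for auditory sensitivity.
def stft_mag(x, n_fft, hop):
    """Magnitude STFT via a Hann window and rFFT over overlapping frames."""
    win = np.hanning(n_fft)
    frames = np.stack([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft + 1, hop)])
    return np.abs(np.fft.rfft(frames, axis=1))

def multi_scale_loss(x, y, scales=(256, 512, 1024), sr=16000):
    """Average frequency-weighted L1 spectral distance over several FFT sizes."""
    total = 0.0
    for n in scales:
        X, Y = stft_mag(x, n, n // 4), stft_mag(y, n, n // 4)
        freqs = np.fft.rfftfreq(n, 1.0 / sr)
        # Emphasize the band where hearing is most sensitive (illustrative).
        w = 1.0 + np.exp(-((freqs - 2500.0) / 1500.0) ** 2)
        total += np.mean(w * np.abs(X - Y))
    return total / len(scales)
```

Penalizing perturbations more heavily in sensitive bands pushes the adversarial energy toward frequencies the ear discounts, which is the mechanism behind the imperceptibility gain reported above.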
Performance Analysis of Double RIS-Assisted Multi-Antenna Cooperative NOMA with Short-Packet Communication
SONG Wenbin, CHEN Dechuan, ZHANG Xingang, WANG Zhipeng, SUN Xiaolin, WANG Baoping
Available online  , doi: 10.11999/JEIT250761
Abstract:
  Objective  Numerous existing studies on short-packet communication systems rely on the assumption of ideal transceiver devices. However, this assumption is unrealistic because radio-frequency transceiver hardware inevitably suffers from impairments such as phase noise and amplifier nonlinearity. Such impairments are particularly pronounced in short-packet communication systems, where low-cost hardware components are widely employed. Nevertheless, the reliability performance of reconfigurable intelligent surface (RIS)-assisted multi-antenna cooperative non-orthogonal multiple access (NOMA) short-packet communication systems with hardware impairments has not been investigated. Furthermore, the impact of the number of base station (BS) antennas and RIS reflecting elements on reliability remains unexplored. Therefore, this paper investigates the reliability performance of double RIS-assisted multi-antenna cooperative NOMA short-packet communication systems, where one RIS facilitates communication between a multi-antenna BS and a near user, and the other RIS enhances communication between the near user and a far user.  Methods  Based on finite blocklength information theory, closed-form expressions for the average block error rate (BLER) of the near and far users are derived under the optimal antenna selection strategy. These expressions provide an efficient and convenient approach for evaluating the reliability of the considered system. Next, the effective throughput is formulated, and the optimal blocklength that maximizes it under reliability and latency constraints is derived.  Results and Discussions  The theoretical average BLER results show excellent agreement with Monte Carlo simulation results, confirming the validity of the derivations. The average BLER for the near and far users decreases as the transmit signal-to-noise ratio (SNR) increases. 
Moreover, for a given transmit SNR, increasing the blocklength significantly reduces the average BLER for the near and far users (Fig. 2). The reason for this improvement is that longer blocklengths decrease the transmission rate, thereby enhancing system reliability. The average BLER for the near user initially decreases before reaching a minimum and then increases as the power allocation coefficient increases (Fig. 3). This trend is due to the fact that increasing the power allocation coefficient reduces the BLER for decoding the near user's signal but increases the complexity of the successive interference cancellation (SIC) process. In contrast, the average BLER for the far user increases as the power allocation coefficient increases. The double RIS-assisted transmission scheme demonstrates superior performance compared to the single RIS-assisted and non-RIS-assisted transmission schemes (Fig. 4). Specifically, as the number of RIS reflecting elements increases, the performance advantage of the proposed scheme over these benchmark schemes becomes increasingly significant. The average BLER for the far user saturates as the number of BS antennas increases (Fig. 5). This is due to the fact that the relaying link becomes the dominant performance bottleneck when the number of BS antennas exceeds a certain value. As the blocklength increases, the effective throughput first reaches a maximum and then gradually decreases (Fig. 6). This is because when the blocklength is too small, a higher BLER results in poor effective throughput. When the blocklength is too large, a lower transmission rate also leads to poor effective throughput. As the quality of hardware improves, the optimal blocklength decreases. This can be justified by the fact that lower hardware impairments reduce decoding errors, meaning that shorter blocklengths can be used to reduce transmission latency while still satisfying reliability constraints.  
Conclusions  This paper investigates the performance of the double RIS-assisted multi-antenna cooperative NOMA short-packet communication system under hardware impairments. Closed-form expressions for the average BLER of the near and far users are derived under the optimal antenna selection strategy. Furthermore, the effective throughput is analyzed, and the optimal blocklength that maximizes the effective throughput under reliability and latency constraints is determined. Simulation results demonstrate that the double RIS-assisted transmission scheme achieves superior performance compared to the single RIS-assisted and non-RIS-assisted transmission schemes. In addition, increasing the number of BS antennas does not always improve the average BLER for the far user due to the relaying link constraint. Power allocation is critical for ensuring user reliability. The near user should carefully balance self-signal demodulation and SIC under a total power constraint. Superior hardware quality enhances short-packet communication efficiency by lowering the optimal blocklength. Future work will focus on developing RIS configuration schemes that simultaneously maximize energy efficiency (EE) and ensure user fairness in NOMA to address the needs of energy-constrained IoT devices.
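The finite-blocklength analysis rests on the normal approximation of the maximal achievable rate. A minimal sketch for a single AWGN-style link is given below; the paper's closed-form BLER expressions additionally average this quantity over the double-RIS fading and hardware impairments, which this toy omits.

```python
import math

# Normal-approximation block error rate for a single link with SNR g,
# blocklength n channel uses, and k information bits. The paper's results
# average such an expression over the double-RIS fading; this sketch does not.
def bler_normal_approx(snr_db: float, n: int, k: int) -> float:
    g = 10.0 ** (snr_db / 10.0)
    c = math.log2(1.0 + g)                                       # capacity (bits/use)
    v = (1.0 - 1.0 / (1.0 + g) ** 2) * math.log2(math.e) ** 2    # channel dispersion
    x = (c - k / n) * math.sqrt(n / v)
    return 0.5 * math.erfc(x / math.sqrt(2.0))                   # Gaussian Q-function
```

The two trends discussed above fall out directly: the BLER decreases with SNR, and for a fixed coding rate k/n it decreases as the blocklength n grows, which is exactly the rate-versus-reliability trade-off behind the optimal-blocklength analysis.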
A Review on Phase Rotation and Beamforming Scheme for Intelligent Reflecting Surface Assisted Wireless Communication Systems
XING Zhitong, LI Yun, WU Guangfu, XIA Shichao
Available online  , doi: 10.11999/JEIT250790
Abstract:
  Objective  With the large-scale commercial deployment of 5G networks since 2020 and the ongoing research into 6G technology, modern communication systems face the challenge of adapting to increasingly complex channel environments. These include ultra-high-density urban areas, remote oceanic regions, deserts, forests, and other locations with demanding propagation conditions. To address these challenges, there is a pressing need for low-energy solutions that can dynamically adjust and reconfigure wireless communication channels. Such advancements would not only enhance transmission performance—reducing latency, increasing data rates, and improving signal reception—but also facilitate more efficient deployment in challenging environments. Intelligent Reflecting Surface (IRS) has emerged as a promising technology for reshaping channel conditions. Unlike traditional active relays, IRS operates passively, introducing minimal additional energy consumption. When integrated with existing communication architectures such as Single Input Single Output (SISO), Multiple Input Single Output (MISO), and Multiple Input Multiple Output (MIMO), IRS can significantly improve transmission efficiency, reduce power consumption, and enhance adaptability in complex scenarios. This paper aims to provide a comprehensive review of IRS-assisted communication systems, focusing on signal transmission models, beamforming techniques, and phase-shift optimization strategies.  Methods  This review presents a systematic analysis of Intelligent Reflecting Surface (IRS) technology in modern communication systems through comprehensive examination of signal transmission models across three fundamental configurations. 
Beginning with IRS-assisted SISO (Single Input Single Output) systems, we investigate how IRS revolutionizes single-antenna communications by intelligently manipulating incident signals through sophisticated reflection and phase-shifting techniques, thereby overcoming traditional limitations in signal propagation. The analysis then progresses to more complex MISO (Multiple Input Single Output) and MIMO (Multiple Input Multiple Output) architectures, where we explore the critical interplay between IRS phase shifts and advanced MIMO precoding strategies to achieve maximal spectral efficiency. Building upon these transmission models, our study provides an in-depth review of joint optimization and precoding schemes specifically designed for IRS-enhanced MIMO systems. These optimization algorithms are systematically classified into four distinct yet complementary objectives that address diverse operational requirements: The first category focuses on power consumption minimization, developing strategies to reduce total energy expenditure while maintaining satisfactory communication quality - particularly valuable for energy-sensitive applications like IoT networks and sustainable green communications. The second category pursues energy efficiency maximization, optimizing the crucial ratio of achievable data rate to power consumption rather than simply reducing energy use, thus ensuring superior performance per energy unit. The third category targets sum-rate maximization, concentrating on boosting aggregate data throughput across all users to enhance overall system capacity - an essential consideration for high-density urban 5G/6G deployments. The fourth category emphasizes fairness-aware rate maximization, implementing sophisticated resource allocation mechanisms to guarantee equitable bandwidth distribution among users while maintaining high Quality of Service (QoS) standards in multi-user environments. 
Together, these optimization frameworks establish a comprehensive methodology for advancing IRS-assisted MIMO systems, enabling engineers and researchers to precisely balance performance metrics, energy efficiency, and user fairness according to specific application demands and operational scenarios, thereby unlocking the full potential of IRS technology in next-generation wireless networks.  Results and Discussions  This comprehensive review demonstrates that Intelligent Reflecting Surface (IRS)-assisted communication systems offer transformative capabilities for next-generation wireless networks through four key advantages. Firstly, IRS provides substantial performance enhancement by intelligently reconfiguring propagation environments, particularly improving signal strength and coverage in challenging non-line-of-sight scenarios such as urban canyons, indoor spaces, and remote areas, while also maintaining connectivity in high-mobility applications like vehicular communications. Secondly, the technology achieves remarkable energy efficiency through its passive operation, introducing minimal power overhead while significantly boosting spectral efficiency - a crucial feature for sustainable massive IoT deployments and green 6G networks that may even incorporate energy-harvesting capabilities. Thirdly, IRS exhibits exceptional adaptability through seamless integration with various communication architectures, including SISO systems for basic signal enhancement, MISO for optimized beamforming, and MIMO for spatial multiplexing gains, making it versatile for diverse environments from ultra-dense urban networks to remote and aerial communications. 
Finally, advanced beamforming and phase-shift optimization techniques enable maximized signal-to-noise ratio through coherent signal combining, effective interference suppression in multi-user scenarios, low-latency performance for critical applications, and increasingly sophisticated real-time optimization through machine learning approaches like deep reinforcement learning. These combined capabilities position IRS as a cornerstone technology for future 6G networks, promising to enable smart radio environments and ubiquitous connectivity, though further research is needed to address practical deployment challenges including channel estimation, scalability, and standardization efforts.  Conclusions  This review underscores the transformative potential of IRS in next-generation wireless communication systems. By enabling dynamic channel reconfiguration with minimal energy overhead, IRS can enhance the performance of SISO, MISO, and MIMO systems, making them more robust in complex environments. The reviewed signal transmission models and optimization techniques provide a foundation for further advancements in IRS-assisted communications. As the industry progresses toward 6G, IRS is expected to play a pivotal role in achieving ultra-reliable, low-latency, and energy-efficient global connectivity. Future work should focus on practical deployment challenges, including hardware design, real-time signal processing, and standardization efforts.
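For the IRS-assisted SISO model reviewed above, the received signal is commonly written as y = (h_d + gᵀΦf)x + n with Φ = diag(e^{jθ₁}, …, e^{jθ_N}), and the classic phase design co-phases every reflected path with the direct one. A small numeric check, using illustrative i.i.d. Rayleigh channel draws (an assumption, not a result from the review):

```python
import numpy as np

# Toy check of the IRS-assisted SISO model y = (h_d + g^T diag(e^{j*theta}) f) x + n:
# choosing theta_i = arg(h_d) - arg(g_i * f_i) makes all cascaded terms add
# coherently with the direct path. Channels are illustrative Rayleigh draws.
rng = np.random.default_rng(1)
N = 64                                            # number of reflecting elements

def cgauss(size=None):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.normal(size=size) + 1j * rng.normal(size=size)) / np.sqrt(2)

h_d = cgauss()          # direct BS -> user channel
f = cgauss(N)           # BS -> IRS channel
g = cgauss(N)           # IRS -> user channel
cascade = g * f         # per-element cascaded channel g_i * f_i

theta = np.angle(h_d) - np.angle(cascade)             # optimal co-phasing
h_eff = h_d + np.sum(np.exp(1j * theta) * cascade)    # effective channel
```

With this choice the effective channel magnitude reaches its upper bound |h_d| + Σᵢ|gᵢfᵢ|, so the received SNR gain scales with N² of the reflecting elements, which is the source of the passive beamforming gain emphasized throughout the review.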
Robust Resource Allocation Algorithm for Active Reconfigurable Intelligent Surface-Assisted Symbiotic Secure Communication Systems
MA Rui, LI Yanan, TIAN Tuanwei, LIU Shuya, DENG Hao, ZHANG Jinlong
Available online  , doi: 10.11999/JEIT250811
Abstract:
  Objective  The existing research on Reconfigurable Intelligent Surface (RIS)-assisted symbiotic radio systems has primarily focused on passive RIS. However, due to the severe double-fading effect, it is difficult to achieve significant capacity gains using passive RIS in communication scenarios with strong direct paths. The assistance of active RIS can effectively overcome this problem. Moreover, the signal amplification capability of active RIS enhances the signal-to-noise ratio of the secondary signal and improves the security of the primary signal. Additionally, by considering imperfect Successive Interference Cancellation (SIC), a penalty-based Successive Convex Approximation (SCA) algorithm utilizing alternating optimization is investigated.  Methods  The original optimization problem is challenging to solve directly due to its complex and non-convex constraints. Thus, the alternating optimization method is adopted to decouple the original optimization problem into two subproblems. These subproblems pertain to designing the transmit beamforming vector at the primary transmitter and the reflection coefficient matrix at the active RIS. Then, variable substitution, equivalent transformation, and the penalty-based SCA method are utilized for alternating iterative solutions. Specifically, for the beamforming design, the rank-one constraint is first equivalently transformed. The penalty-based SCA method is then applied to recover the rank-one optimal solution, and iterative optimization is finally employed to obtain the result. For the reflection coefficient matrix design, the problem is first reformulated. Auxiliary variables are then introduced to avoid feasibility check issues, after which a penalty-based SCA approach is used to handle the rank-one constraint. The solution is ultimately obtained using the CVX toolbox. Based on the above procedures, a penalty-based robust resource allocation algorithm using alternating optimization is proposed.  
Results and Discussions  The convergence curves of the proposed algorithm under different numbers of primary transmitter antennas (K) and RIS reflecting elements (N) are shown (Fig.3). The results indicate that the total power consumption of the system gradually decreases as the iterations proceed and converges within a finite number of steps. The relationship between the total power consumption of the system and the Signal-to-Interference-and-Noise Ratio (SINR) threshold of the secondary signal is depicted (Fig.4). As the SINR threshold increases, the system requires more power to maintain the minimum quality of service of the secondary signal, leading to a rise in the total power consumption. Moreover, as the imperfect interference cancellation factor decreases, the total power consumption of the system diminishes. To compare performance, three baseline algorithms are introduced (Fig.5), namely: the passive RIS, the active RIS with random phase shift, and the non-robust algorithm. The total system power consumption under the proposed algorithm is consistently lower than that of the passive RIS and the active RIS with random phase shift. Although additional power is consumed by the active RIS itself, the savings in transmit power outweigh this consumption, resulting in higher overall energy efficiency. When random phase shifts are applied, the active beamforming and amplification capabilities of the RIS are underutilized. This forces the primary transmitter to compensate on its own to meet the performance constraints, thereby increasing its power consumption. In addition, because imperfect SIC is considered in the proposed algorithm, a higher transmit power is required to compensate for residual interference and satisfy the secondary system’s minimum SINR constraint. As a result, the total power consumption remains higher than that of the non-robust algorithm. 
The influence of the primary signal’s secrecy rate threshold on the secure energy efficiency of the primary system under different N has been revealed (Fig.6). The results indicate that there exists an optimal secrecy rate threshold that maximizes the secure energy efficiency of the primary system. To investigate the impact of the active RIS deployment on the total power consumption of the system, the positions of each node are rearranged (Fig.7). As the active RIS is placed closer to the receiver, the fading effect it experiences weakens, and the total system power consumption is thus reduced.  Conclusions  This paper investigates the total power consumption of an active RIS-assisted symbiotic secure communication system under imperfect SIC. To improve the energy efficiency of the system, a system-wide total power minimization problem is formulated, subject to multiple constraints including the quality of service for both primary and secondary signals, as well as the power and phase shift constraints of the active RIS. To address this non-convex problem with uncertain disturbance parameters, techniques such as variable substitution, equivalent transformation, and the penalty-based SCA method are employed to convert the original problem into a convex optimization form. Simulation results validate the effectiveness of the proposed algorithm, demonstrating a significant reduction in the total system power consumption compared to benchmark schemes.
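The alternating-optimization structure described in the Methods, where one variable block is fixed while the other is solved, can be illustrated with a toy block-coordinate sketch. The objective and its closed-form block updates below are purely hypothetical stand-ins; the paper's actual subproblems are convexified by penalty-based SCA and solved with the CVX toolbox.

```python
import numpy as np

def f(x, y):
    """Toy non-convex objective standing in for total power consumption."""
    return (x * y - 1.0) ** 2 + 0.1 * x ** 2 + 0.1 * y ** 2

# Alternating optimization: fix one block, minimize over the other.
# Here each subproblem has an exact quadratic solution; in the paper each
# subproblem is instead convexified and solved by a penalty-based SCA step.
x, y = 2.0, -3.0
history = [f(x, y)]
for _ in range(50):
    x = y / (y * y + 0.1)   # argmin over x with y fixed
    y = x / (x * x + 0.1)   # argmin over y with x fixed
    history.append(f(x, y))

# Exact block minimization makes the objective non-increasing, mirroring
# the monotone convergence behavior reported for the proposed algorithm.
assert all(a >= b - 1e-12 for a, b in zip(history, history[1:]))
```

The non-increasing objective is the essential property that alternating optimization guarantees; convergence to a stationary point then follows under standard conditions.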
Research on Collaborative Reasoning Framework and Algorithms of Cloud-Edge Large Models for Intelligent Auxiliary Diagnosis Systems
HE Qian, ZHU Lei, LI Gong, YOU Zhengpeng, YUAN Lei, JIA Fei
Available online  , doi: 10.11999/JEIT250828
Abstract:
  Objective  The deployment of large language models (LLMs) in intelligent auxiliary diagnosis is constrained by two critical challenges: insufficient computing power for localized deployment in hospitals and significant privacy risks associated with medical data transmission and storage in cloud environments. Low-parameter local LLMs suffer from 20%-30% lower accuracy in medical knowledge Q&A and 15%-25% reduced medical knowledge coverage compared to full-parameter cloud LLMs, while cloud-based solutions face inherent data security and privacy protection issues. To address these dilemmas, this study aims to propose a cloud-edge LLM collaborative reasoning framework and corresponding algorithms for intelligent auxiliary diagnosis systems. The core objective is to develop a cloud-edge collaborative reasoning agent integrated with intelligent routing and dynamic semantic desensitization capabilities, enabling dynamic task allocation between edge (hospital-end) and cloud (regional cloud) sides. This framework seeks to balance diagnostic accuracy, data privacy security, and resource utilization efficiency, providing a viable technical paradigm for the advancement of medical artificial intelligence systems.  Methods  The proposed framework adopts a layered architectural design, consisting of a four-tier progressive architecture on the edge side and a four-tier service-oriented architecture on the cloud side (Fig. 1). The edge side encompasses resource, data, model, and application layers, with the model layer hosting lightweight medical LLMs and the cloud-edge collaborative agent. The cloud side includes AI IaaS, AI PaaS, AI MaaS, and AI SaaS layers, serving as a convergence center for computing power and advanced models. The collaborative reasoning process follows a structured business workflow (Fig. 2), starting with user input parsed by the agent to extract clinical key features, followed by reasoning node decision-making. 
Two core technologies underpin the agent: 1) Intelligent routing: This mechanism prioritizes edge-side processing by default and dynamically selects optimal reasoning paths (edge or cloud) through a dual-driven weight update strategy. It integrates semantic feature similarity (calculated via Chinese word segmentation and pre-trained medical language models) and historical decision data, with exponential moving average used to update feature libraries for adaptive optimization. 2) Dynamic semantic desensitization: Employing a three-stage architecture (sensitive entity recognition, semantic correlation analysis, and hierarchical desensitization decision-making), this technology identifies sensitive entities via a domain-enhanced named entity recognition (NER) model, calculates entity sensitivity and desensitization priority, and enforces a semantic similarity constraint to avoid excessive desensitization. Three desensitization strategies (complete deletion, general replacement, partial masking) are applied based on entity sensitivity. Experimental validation was conducted using two open-source Chinese medical knowledge graphs (CMeKG and CPubMedKG) covering over 2.7 million medical entities. The experimental environment (Fig. 3) deployed a qwen3:1.7b model on the edge and the Jiutian LLM on the cloud, with a 5,000-sample evaluation dataset divided into entity-level, relation-level, and subgraph-level questions. Performance was assessed using three core metrics: answer accuracy, average token consumption, and average response time.  Results and Discussions  Experimental results demonstrate that the proposed framework achieves remarkable performance across key evaluation dimensions. In terms of answer accuracy, the intelligent routing mechanism yields overall accuracy of 72.44% (CMeKG)(Fig. 4) and 66.20% (CPubMedKG) (Fig. 
5), which are significantly higher than those of the edge-side LLM alone (60.73% and 54.18%) and nearly comparable to the cloud LLM (72.68% and 66.49%). This confirms that the framework maintains diagnostic consistency with cloud-based solutions while leveraging edge-side capabilities. Regarding resource efficiency, the intelligent routing model reduces average token consumption to 61.27, accounting for only 45.63% of the cloud LLM’s token usage (131.68) (Fig. 6), resulting in substantial cost savings. In terms of response time, the edge-side LLM exhibits a latency exceeding 6s due to computing power limitations, while the cloud LLM achieves 0.44s latency via dedicated line access (8% of the 5.46s latency with internet access). The intelligent routing model’s average latency falls between the edge and cloud LLMs under both access modes (Fig. 7), aligning with expected performance trade-offs. The framework demonstrates strong applicability across typical medical scenarios (Table 1), including outpatient triage, chronic disease management, medical image analysis, intensive care, and health consultation, by combining local real-time processing advantages with cloud-based deep reasoning capabilities. However, limitations exist in emergency rescue scenarios with poor network conditions (due to latency constraints) and rare disease diagnosis (due to insufficient edge-side training samples and potential loss of individual features during desensitization). These results collectively validate that the cloud-edge collaborative reasoning mechanism effectively optimizes computing resource overhead while ensuring diagnostic result consistency.  Conclusions  This study successfully constructs a cloud-edge LLM collaborative reasoning framework for intelligent auxiliary diagnosis systems, addressing the key challenges of limited local computing power and cloud data privacy risks. 
By integrating intelligent routing, prompt engineering adaptation, and dynamic semantic desensitization technologies, the framework achieves a balanced optimization of diagnostic accuracy, data security, and resource economy. The experimental validation confirms that the framework’s accuracy is comparable to that of cloud-only LLMs while resource consumption is significantly reduced, providing a new technical path for the intelligent upgrading of medical services. Future research will focus on three directions: first, intelligent on-demand scheduling of computing and network resources to address latency issues caused by edge-side computing bottlenecks; second, collaborative deployment of localized LLMs with Retrieval-Augmented Generation (RAG) to enhance edge-side standalone accuracy to over 90%; and third, expansion of medical diagnostic evaluation indicators to establish a three-dimensional "scenario-node-indicator" system, incorporating sensitivity, specificity, and AUC for clinical-oriented validation.
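The routing mechanism described above combines semantic similarity with an exponential moving average update of the feature library. The sketch below is a toy illustration of that two-step loop; the `Router` class, its prototype vector, threshold, and decay factor are all hypothetical placeholders, not components of the paper's trained system.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class Router:
    """Toy edge/cloud router: decide by similarity to an edge-capability
    feature, then refresh that feature with an exponential moving average.
    All dimensions and hyperparameters are illustrative choices."""

    def __init__(self, dim=8, threshold=0.8, decay=0.9):
        self.edge_feature = np.ones(dim) / np.sqrt(dim)  # edge prototype
        self.threshold = threshold
        self.decay = decay

    def route(self, query_vec):
        # Edge-first policy: stay local when the query resembles cases the
        # edge model handles well, otherwise escalate to the cloud
        sim = cosine(query_vec, self.edge_feature)
        return "edge" if sim >= self.threshold else "cloud"

    def update(self, query_vec):
        # EMA update keeps the feature library adaptive to recent decisions
        self.edge_feature = (self.decay * self.edge_feature
                             + (1.0 - self.decay) * query_vec)

router = Router()
q_easy = np.ones(8)    # close to the edge prototype: stays local
q_hard = -np.ones(8)   # dissimilar: escalated to the cloud
print(router.route(q_easy), router.route(q_hard))  # edge cloud
```

In the paper the similarity is computed from pre-trained medical language-model embeddings and fused with historical decision data; the sketch keeps only the routing-by-threshold and EMA-refresh skeleton.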
IRS Deployment for Highly Time Sensitive Short Packet Communications: Distributed or Centralized Deployment?
ZHANG Yangyi, GUAN Xinrong, YANG Weiwei, CAO Kuo, WANG Meng, CAI Yueming
Available online  , doi: 10.11999/JEIT250720
Abstract:
  Objective  With the rapid advancement of the Industrial Internet of Things (IIoT), latency-sensitive applications—such as environmental monitoring and precision control—which primarily rely on short-packet communications, are placing increasingly stringent demands on the timeliness of information delivery. The Intelligent Reflecting Surface (IRS) has emerged as a promising technology to enhance both the reliability and timeliness of short-packet communications by dynamically adjusting reflection coefficients. However, existing research has predominantly focused on optimizing the phase shifts of IRS elements, overlooking the potential performance gains achievable through flexible deployment strategies. Indeed, optimizing the physical deployment of IRS can introduce new degrees of freedom for improving timeliness performance. Two typical deployment strategies are commonly considered: distributed IRS and centralized IRS, each creating distinct effective channel characteristics and resulting in different capacity behaviors. This paper systematically investigates and compares both deployment schemes in IRS-assisted short-packet communication systems. By evaluating their Age of Information (AoI) performance under practical channel estimation overheads, we provide insights into optimal IRS deployment strategies for achieving superior timeliness across diverse system conditions.  Methods  The paper investigates an IRS-assisted short-packet communication system in which multiple terminal devices transmit short packets to an access point (AP) via IRS reflection. Two typical IRS deployment schemes are considered: distributed and centralized IRS. In the distributed scheme, each device is assisted by a dedicated IRS with M reflecting elements deployed in its vicinity. In contrast, the centralized scheme collocates all IRS elements near the AP. 
To theoretically evaluate and compare the timeliness performance of both deployment strategies, the average AoI is adopted as the key performance metric. However, the complex distribution of the composite channel gain poses a challenge for deriving a closed-form average AoI expression. To overcome this, the moment matching (MM) approximation method is employed to approximate the distribution of the composite channel gain. Furthermore, by incorporating pilot overhead into the analysis, closed-form average AoI expressions are derived for both deployment schemes, thereby enabling a comprehensive performance comparison.  Results and Discussions  Simulation results show that the AoI performance of the distributed and centralized IRS deployments differs across system conditions. Specifically, the distributed IRS deployment achieves superior AoI performance when the IRS is equipped with a large number of reflecting elements (Fig. 4). Under high transmission power conditions, the centralized IRS configuration exhibits better AoI performance (Fig. 5). For scenarios with large AP-device distances, the distributed IRS scheme provides more favorable AoI outcomes (Fig. 6). Notably, as the system bandwidth increases, the centralized IRS architecture shows rapid AoI reduction, eventually outperforming its distributed counterpart (Fig. 7).  Conclusions  This paper presents a comparative investigation of the timeliness performance in IRS-assisted short-packet communication systems under two deployment strategies: distributed and centralized IRS. First, the MM method is employed to approximate the composite channel gain as a gamma distribution, enabling the derivation of an approximate expression for the average packet error rate. Subsequently, a closed-form expression for the average AoI is established, incorporating the impact of channel estimation overhead. 
Simulation-based comparisons between the two deployment schemes reveal distinct AoI performance advantages under different operational conditions. Specifically, the distributed IRS configuration achieves superior AoI performance when a large number of reflecting elements is deployed or when the AP-device distance is considerable. In contrast, the centralized IRS scheme yields better AoI performance under high transmission power or ample system bandwidth.
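The moment-matching step described above reduces to choosing a gamma shape and scale that reproduce the first two moments of the composite channel gain. A minimal sketch, using a stand-in positive random variable rather than the paper's actual cascaded IRS gain:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the composite channel gain: any positive random variable
# suffices to illustrate the moment-matching (MM) step. A squared Rayleigh
# variable is exponential, i.e. gamma-distributed with shape 1.
samples = rng.rayleigh(scale=1.0, size=50000) ** 2

# MM: pick gamma shape k and scale theta so the gamma distribution has
# the same mean and variance as the samples:
#   k * theta = mean,   k * theta^2 = variance
m, v = samples.mean(), samples.var()
k = m * m / v
theta = v / m

# For an exponential input, MM should recover a shape close to 1
print(abs(k - 1.0) < 0.1, abs(k * theta - m) < 1e-9)  # True True
```

By construction the matched gamma reproduces the sample mean and variance exactly, which is what makes the subsequent closed-form packet-error-rate and AoI derivations tractable.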
Data-Driven Secure Control for Cyber-Physical Systems under Denial-of-Service Attacks: An Online Mode-Dependent Switching-Q-Learning Strategy
ZHANG Ruifeng, YANG Rongni
Available online  , doi: 10.11999/JEIT250746
Abstract:
  Objective   The open network architecture of cyber-physical systems (CPSs) enables remarkable flexibility and scalability, but it also renders CPSs highly vulnerable to cyber-attacks. In particular, denial-of-service (DoS) attacks have emerged as one of the predominant threats, as they can cause packet loss and reduce system performance by directly jamming channels. On the other hand, CPSs under dormant and active DoS attacks can be regarded as dual-mode switched systems with stable and unstable subsystems, respectively. Therefore, it is worth exploring how to utilize switched system theory to design a secure control approach with high degrees of freedom and low conservatism. However, due to the influence of complex environments such as attacks and noises, it is difficult to model practical CPSs exactly. Although Q-learning-based control methods have demonstrated potential for handling unknown CPSs, a significant research gap exists for switched systems with unstable modes, particularly in establishing an evaluable stability criterion. Therefore, it remains to be investigated how to apply switched system theory to design a learning-based control algorithm and an evaluable security criterion for unknown CPSs under DoS attacks.   Methods   An online mode-dependent switching-Q-learning strategy is presented to study the data-driven evaluable criterion and secure control for unknown CPSs under DoS attacks. Initially, the CPSs under dormant and active DoS attacks are transformed into switched systems with stable and unstable subsystems, respectively. Subsequently, the optimal control problem of the value function is addressed for the model-based switched systems by designing a new generalized switching algebraic Riccati equation (GSARE) and obtaining the corresponding mode-dependent optimal security controller. Furthermore, the existence and uniqueness of the GSARE’s solution are proved. 
In what follows, with the help of the model-based results, a data-driven optimal security control law is proposed by developing a novel online mode-dependent switching-Q-learning control algorithm. Finally, by utilizing the learned control gain and parameter matrices from the above algorithm, a data-driven evaluable security criterion with the attack frequency and duration is established based on the switching constraints and subsystem constraints.   Results and Discussions   To verify the efficiency and advantages of the proposed methods, comparative experiments on a wheeled robot are presented in this work. First, the model-based result (Theorem 1) and the data-driven result (Algorithm 1) are compared: from the iterative process curves of the control gain and parameter matrices (Fig. 2 and Fig. 3), it can be observed that the optimal control gain and parameter matrices under threshold errors can all be successfully obtained from both the model-based GSARE and the data-driven algorithm. Meanwhile, the tracking errors of the CPSs converge to 0 under the above data-driven controller (Fig. 5), which ensures the exponential stability of the CPSs and verifies the efficiency of the proposed switching-Q-learning algorithm. Second, it is evident from the learning process curves (Fig.4) that although the initial value of the learned control gain is not stabilizable, the optimal control gain can still be successfully learned to stabilize the system via Algorithm 1. This result significantly reduces conservatism compared to existing Q-learning approaches, which take stabilizable initial control gains as the learning premise. 
Third, the data-driven evaluable security criterion in Theorem 2 is compared with existing criteria: although the switching parameters learned from Algorithm 1 do not satisfy the popular switching constraint used to obtain the mode dwell-time, by utilizing the evaluable security criterion proposed in this paper, the attack frequency and duration are obtained based on the new switching constraints and subsystem constraints. Furthermore, it is seen from the comparison of the evaluable security criteria (Tab.1) that the proposed evaluable security criterion is less conservative than existing evaluable criteria. Finally, the learned optimal controller and the obtained DoS attack constraints are applied to the tracking control experiment of a wheeled robot under DoS attacks, and the result is compared with existing results via Q-learning controllers. It is evident from the tracking trajectory comparisons of the robot (Fig.6 and Fig.7) that the robot achieves significantly faster and more accurate trajectory tracking with the help of the proposed switching-Q-learning controller. Therefore, the efficiency and advantages of the proposed algorithm and criterion are verified.   Conclusions   Based on the learning strategy and switched system theory, this study presents an online mode-dependent switching-Q-learning control algorithm and the corresponding evaluable security criterion for unknown CPSs under DoS attacks. The detailed results are as follows: (1) By representing the unknown CPSs under dormant and active DoS attacks as unknown switched systems with stable and unstable subsystems, respectively, the security problem of CPSs under DoS attacks is transformed into a stabilization problem of switched systems, which offers high design freedom and low conservatism. (2) A novel online mode-dependent switching-Q-learning control algorithm is developed for unknown switched systems with unstable modes. 
The comparative experiments show that the proposed switching-Q-learning algorithm effectively increases the design freedom of controllers and decreases conservatism relative to existing Q-learning algorithms. (3) A new data-driven evaluable security criterion with the attack frequency and duration is established based on the switching constraints and subsystem constraints. The comparison of criteria shows that the proposed criterion demonstrates significantly reduced conservatism over existing evaluable criteria based on single subsystem constraints and traditional mode dwell-time constraints.
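The core idea of mode-dependent Q-learning, keeping one value function per attack mode and updating it purely from observed data, can be sketched on a toy discrete system. The dynamics below are hypothetical and tabular, unlike the paper's continuous switched-system setting with a GSARE; only the one-Q-table-per-mode structure is carried over.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, n_modes = 2, 2, 2
alpha, gamma = 0.5, 0.9

# One Q-table per mode: separate value functions for the attack-dormant
# (mode 0) and attack-active (mode 1) subsystems.
Q = np.zeros((n_modes, n_states, n_actions))

def step(state, action, mode):
    # Toy dynamics: with the attack dormant the control acts normally;
    # with the attack active the input is jammed and the state persists.
    next_state = action if mode == 0 else state
    reward = 1.0 if next_state == 0 else 0.0  # state 0 is the target
    return next_state, reward

state = 1
for _ in range(2000):
    mode = rng.integers(n_modes)        # exogenous DoS on/off switching
    action = rng.integers(n_actions)    # exploratory behavior policy
    next_state, reward = step(state, action, mode)
    # Standard Q-learning update applied to the active mode's table
    Q[mode, state, action] += alpha * (
        reward + gamma * Q[mode, next_state].max() - Q[mode, state, action])
    state = next_state

# In the dormant mode the learned policy drives the state to the target
print(int(np.argmax(Q[0, 1])))  # 0
```

The mode-dependent tables let the learner exploit the stable subsystem while merely riding out the unstable one, which is the intuition behind combining the learned gains with attack frequency and duration constraints.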
Knowledge-Guided Few-Shot Earth Surface Anomalies Detection
JI Hong, GAO Zhi, CHEN Boan, AO Wei, CAO Min, WANG Qiao
Available online  , doi: 10.11999/JEIT251000
Abstract:
  Objective   Earth Surface Anomalies (ESAs), referring to sudden natural or human-induced disasters on the Earth’s surface, pose severe risks and widespread impacts. Timely and accurate earth surface anomalies detection is therefore crucial for social security and sustainable development. Remote sensing provides an effective means for earth surface anomalies detection. However, the performance of existing deep learning models remains constrained due to the scarcity of labeled data, the complexity of anomaly backgrounds, and the distribution shift across multi-source remote sensing imagery. To address these challenges, this paper proposes a knowledge-guided few-shot learning method. The method leverages large language models to generate abstract textual descriptions of normal and anomalous geospatial features, which are encoded and fused with visual prototypes to form a cross-modal joint representation. This integration improves prototype discriminability in few-shot settings and demonstrates the necessity of incorporating linguistic knowledge into earth surface anomalies detection, offering a promising direction for reliable disaster monitoring when annotated data are scarce.  Methods   The knowledge-guided few-shot learning method is built on a metric-based paradigm, where each episode consists of support and query sets and classification is achieved by comparing query features with class prototypes using distance-based similarity and cross-entropy optimization (Figure 1). To supplement limited visual prototypes, class-level textual descriptions are generated with ChatGPT through carefully designed prompts, producing semantic sentences that characterize the appearance, attributes, and contextual relations of both normal and anomalous categories (Figures 2 and 3). These descriptions encode domain-specific properties such as anomaly extent, morphology, and environmental impact, which are otherwise difficult to capture with scarce visual samples. 
The sentences are encoded with a CLIP (Contrastive Language–Image Pre-training) text encoder, and task-adaptive soft prompts are introduced by generating tokens from support features and concatenating them with static embeddings, yielding adaptive word embeddings. Encoded sentence vectors are then processed by a lightweight self-attention module to model dependencies across multiple descriptions, resulting in a coherent paragraph-level semantic representation (Figure 4). The obtained semantic prototypes are fused with the visual prototypes through weighted addition, producing cross-modal prototypes that combine visual grounding and linguistic abstraction. During training, query samples are compared with cross-modal prototypes, and optimization is guided by two objectives: a classification loss that enforces accurate query–prototype alignment, and a prototype regularization loss that ensures semantic prototypes are discriminative and well separated. The entire process is implemented in an episodic training framework (Algorithm 1).  Results and Discussions   The proposed method is evaluated under both cross-domain and in-domain few-shot settings. In the cross-domain case, models are trained on NWPU45 or AID and tested on ESAD to assess earth surface anomalies recognition. As shown in the comparisons (Table 2), traditional meta-learning methods such as MAML and Meta-SGD achieve accuracies below 50%, while metric-based baselines like ProtoNet and RelationNet are more stable but still limited. The proposed method reaches 61.99% on the NWPU45→ESAD setting and 59.79% on the AID→ESAD setting, outperforming ProtoNet by 4.72% and 2.67% respectively. In the in-domain setting, with training and testing on the same dataset, the method achieves 76.94% on NWPU45 and 72.98% on AID, consistently surpassing state-of-the-art baselines such as S2M2 and IDLN (Table 3). Ablation experiments further validate the contribution of each component. 
Using only visual prototypes produces accuracies of 57.74% and 72.16%, while progressively incorporating simple class names, task-oriented templates, and ChatGPT-generated descriptions improves results. The best performance is obtained by combining ChatGPT descriptions, learnable tokens, and the attention-based mechanism, reaching 61.99% and 76.94% (Table 4). Parameter sensitivity analysis confirms that an appropriate weight for language features (α = 0.2) and two learnable tokens yield optimal performance (Figure 5).  Conclusions   This paper addresses the task of earth surface anomalies detection in remote sensing imagery by introducing a knowledge-guided few-shot learning method. The method exploits large language models to automatically generate abstract textual descriptions for both anomaly categories and conventional remote sensing scenes, thereby constructing multimodal training and testing resources. These descriptions are encoded into semantic feature vectors through a pretrained text encoder. To extract task-specific knowledge, a dynamic token learning strategy is designed, in which a small number of learnable parameters are guided by visual samples within few-shot tasks to generate adaptive semantic vectors. An attention-based semantic knowledge module then models dependencies among language features, producing cross-modal semantic vectors for each class. By fusing these vectors with visual prototypes, the method forms joint multimodal representations that are used for query–prototype matching and network optimization. Experimental evaluations demonstrate that the proposed method effectively leverages prior knowledge from pretrained models, compensates for the limitations of scarce visual data, and enhances feature discriminability for anomalies recognition. 
Both cross-domain and in-domain results confirm that the method achieves consistent improvements over competitive baselines, highlighting its potential for reliable application in real-world remote sensing anomalies detection scenarios.
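The weighted fusion of visual and semantic prototypes described above can be sketched in a ProtoNet-style toy example. The feature vectors below are synthetic stand-ins for CLIP visual and text embeddings; the class labels, dimensions, and the α value are illustrative (α = 0.2 matches the reported sensitivity analysis, but the rest is hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, shots, alpha = 16, 5, 0.2  # alpha weights the language feature

# Toy support embeddings for two classes, standing in for visual features
# of the few-shot support set (e.g. an anomaly class vs a normal scene).
class_means = {0: np.full(dim, 1.0), 1: np.full(dim, -1.0)}
support = {c: mu + 0.1 * rng.standard_normal((shots, dim))
           for c, mu in class_means.items()}

# Toy semantic vectors, standing in for encoded textual descriptions.
text_feat = {0: np.full(dim, 0.8), 1: np.full(dim, -0.8)}

# Cross-modal prototype: weighted addition of visual mean and text feature
proto = {c: (1 - alpha) * support[c].mean(axis=0) + alpha * text_feat[c]
         for c in support}

def classify(query):
    # Metric-based matching: nearest cross-modal prototype in Euclidean
    # distance, as in the episodic query-prototype comparison
    return min(proto, key=lambda c: np.linalg.norm(query - proto[c]))

query = class_means[1] + 0.1 * rng.standard_normal(dim)
print(classify(query))  # 1
```

When support samples are scarce, the text term pulls each prototype toward a class-level semantic anchor, which is the mechanism credited for the improved prototype discriminability.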
A Deception Jamming Discrimination Algorithm Based on Phase Fluctuation for Airborne Distributed Radar Systems
LV Zhuoyu, YANG Chao, SUO Chengyu, WEN Cai
Available online  
Abstract:
Deception jamming not only prevents radars from distinguishing real targets from false ones but also significantly degrades the parameter estimation accuracy and tracking performance of real targets. To address the deception jamming identification issue in airborne distributed radar systems, this paper proposes a discrimination method based on phase fluctuation. The method first corrects synchronization errors affecting the echo signal phase in airborne distributed radar systems. Subsequently, refined processing is applied to the received multi-station echoes to extract multi-station target scattering phase vectors. Finally, leveraging differences in scattering characteristics between real and false targets, the fluctuation variance of phase vectors is utilized to discriminate between them. The proposed method enhances the anti-jamming performance of airborne distributed radar systems in complex electromagnetic environments, and simulation results validate its effectiveness.
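The discrimination statistic described above, the fluctuation variance of the multi-station phase vector, can be sketched with synthetic data. The toy model below (station count, threshold, and the phase models for real and false targets) is purely illustrative and not the paper's signal model: a real target's scattering phase decorrelates across widely separated stations, while a repeater-style false target echoes the same waveform and keeps a nearly constant phase everywhere.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stations = 8  # receiving stations in the distributed system (toy value)

def phase_fluctuation_variance(phases):
    """Fluctuation variance of the multi-station scattering phase vector."""
    return float(np.var(phases))

# Real target: widely spread phases stand in for decorrelated returns
real_phases = np.linspace(-np.pi, np.pi, n_stations)
# False target: nearly identical phase at every station (repeated waveform)
false_phases = 0.3 + 0.01 * rng.standard_normal(n_stations)

threshold = 0.1  # illustrative decision threshold on the phase variance
print(phase_fluctuation_variance(real_phases) > threshold,
      phase_fluctuation_variance(false_phases) > threshold)  # True False
```

In the paper this comparison only becomes meaningful after synchronization errors are corrected, since uncorrected phase offsets would inflate the variance for real and false targets alike.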
A Sparse-Reconstruction-Based Fast Localization Algorithm for Mixed Far-Field and Near-Field Sources
FU Shijian, QIU Longhao, LIANG Guolong
Available online  , doi: 10.11999/JEIT250165
Abstract:
  Objective  Source localization is a critical research area in array signal processing, with applications in radar, sonar, and wireless communications. Traditional localization methods, which are based on either far-field or near-field models individually, face significant challenges in effectively separating and localizing mixed far-field and near-field sources. Existing algorithms, such as subspace-based methods, suffer from high computational complexity, limited localization accuracy, and degraded performance in low Signal-to-Noise Ratio (SNR) scenarios. In addition, these methods assume that near-field sources are located within the Fresnel Region, leading to localization errors and a reduction in effective array aperture. Improved algorithms, such as Multiple Sparse Bayesian Learning for Far and Near-field sources (FN-MSBL), successfully overcome these limitations and achieve higher localization accuracy. However, the high computational cost of matrix inversion during each iteration restricts their real-time applicability. Therefore, this paper addresses these limitations by proposing a novel algorithm that not only develops a sparse representation model for mixed near-field and far-field sources based on the covariance domain but also integrates sparse reconstruction with the Generalized Approximate Message Passing (GAMP) and Variational Bayesian Inference (VBI) frameworks. The primary goal is to achieve high-precision localization of mixed sources while significantly reducing computational costs, thereby enabling real-time applicability.  Methods  The proposed algorithms, termed Covariance-based VBI for Far and Near-field sources (FN-CVBI) and Covariance-based GAMP-VBI for Far and Near-field sources (FN-GAMP-CVBI), are developed through several key methods. First, a unified sparse representation model for mixed far-field and near-field sources is constructed based on the covariance vector. 
This model leverages the improved SNR of the covariance vector compared to the original array output, enabling more accurate far-field Direction Of Arrival (DOA) estimation. Second, to mitigate the estimation errors in the sample covariance matrix, a pre-whitening operation is applied to the covariance vector. This step effectively minimizes the correlation between the elements of the covariance vector, thereby enhancing the robustness of the sparse reconstruction algorithm. Third, a hierarchical Bayesian model is established to enforce sparsity, and VBI is employed to estimate the parameters. The VBI framework iteratively updates the posterior distributions of the hidden variables, ensuring convergence to a near-optimal solution. Fourth, to address the significant computational complexity associated with traditional VBI methods, the GAMP algorithm is embedded into the VBI framework. GAMP replaces the computationally expensive matrix inversion operations in VBI, significantly reducing the computational burden. The detailed implementation steps of GAMP are provided in Table 1. In conclusion, by combining the advantages of sparse reconstruction, VBI, and GAMP, the proposed algorithm not only improves localization accuracy but also significantly reduces computational complexity, making it suitable for real-time applications.  Results and Discussions  The proposed algorithm FN-GAMP-CVBI demonstrates significant improvements in both localization accuracy and computational efficiency. Computational complexity analysis shows that the algorithm significantly reduces computational costs (Table 2). In terms of localization accuracy, the proposed algorithms, FN-CVBI and FN-GAMP-CVBI, both outperform competing methods such as LOFNS and FN-MSBL (Fig.3, Fig.4), particularly in low-SNR scenarios with sufficient snapshots (Fig.5, Fig.6), and demonstrate superior performance in resolving closely spaced far-field sources (Fig.7). 
Experimental validation using lake trial data further confirms the effectiveness of the proposed algorithms, as evidenced by sharper spectral peaks and minimal false peaks in the background noise of Bearing Time Recording (BTR) (Fig.9). In addition, FN-CVBI achieves the highest accuracy in far-field DOA estimation and near-field localization. The computational time of FN-GAMP-CVBI is reduced by up to 95% compared to FN-MSBL, making the algorithm highly efficient for real-time applications (Table 4).  Conclusions  This paper presents a novel approach to mixed far-field and near-field source localization by integrating sparse reconstruction with the GAMP-VBI framework. The proposed FN-GAMP-CVBI algorithm addresses the limitations of traditional methods, offering a balanced trade-off between computational efficiency and localization accuracy. The simulation results demonstrate superior performance, particularly in scenarios with sufficient snapshots and low SNR. Experimental validation further confirms the effectiveness and efficiency of the proposed algorithms. Overall, the proposed FN-GAMP-CVBI algorithm strikes an effective balance between localization accuracy and computational efficiency. Its ability to simultaneously handle both far-field and near-field sources, combined with its low computational complexity, positions it as a promising solution for real-time mixed source localization in complex environments.
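The covariance-domain model at the heart of FN-CVBI can be sketched in a few lines: form the sample covariance from array snapshots, vectorize it, and pre-whiten it before sparse reconstruction. This is a minimal illustration under invented dimensions, not the authors' implementation; `covariance_vector`, `prewhiten`, and the identity whitening matrix in the example are placeholders.

```python
import numpy as np

def covariance_vector(X):
    """Vectorize the sample covariance of array snapshots X
    (sensors x snapshots) into a single measurement vector."""
    M, T = X.shape
    R = (X @ X.conj().T) / T           # sample covariance matrix
    return R.reshape(-1, order="F")    # vec(R): columns stacked

def prewhiten(r, W):
    """Decorrelate the covariance-vector estimation errors with a
    whitening matrix W before sparse reconstruction (placeholder)."""
    return W @ r

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200))
r = covariance_vector(X)
print(r.shape)  # (16,)
```

In the paper, the whitening matrix is derived from the statistics of the covariance estimation errors; the identity matrix here only marks where that step fits in the pipeline.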
Application of WAM Data Set and Classification Method of Electromagnetic Wave Absorbing Materials
YUAN Yuyang, ZHANG Junhan, LI Dandan, SHA Jianjun
Available online  , doi: 10.11999/JEIT250166
Abstract:
The performance of electromagnetic radiation shielding and absorbing materials is mainly determined by thickness, maximum reflection loss, and effective absorption bandwidth. Research focuses on metal-organic frameworks, carbon-based, and ceramic absorbing materials, and weak artificial intelligence is used to analyze the WAM (Wave Absorption Materials) dataset. After dividing the dataset into training and testing sets, data augmentation, correlation analysis, and principal component analysis are conducted. The decision tree algorithm is used to establish classification indicators, and it is found that the reflection loss of MOFs (Metal Organic Frameworks) materials is better than that of carbon-based materials, and MOFs materials are more likely to achieve a maximum reflection loss below –45 dB. The generalization performance of the random forest algorithm is better than that of the decision tree algorithm, and its ROC-AUC value is higher. Neural networks are also used for classification, and the results show that the self-organizing map neural network performs better, while the probabilistic neural network performs poorly. After extending the binary classification problem to a three-class problem, nonlinear classification, clustering, and Boosting algorithms are applied, and the maximum reflection loss is found to be a key indicator. Further analysis shows that the WAM dataset is nonlinearly separable, and fuzzy clustering performs better. Artificial intelligence helps to reveal the relationship between material properties and absorbing performance, accelerate the development of new materials, and support the construction of the knowledge graph and knowledge base of absorbing materials.  Objective   Computational materials science, high-throughput experimentation, and the Materials Genome Initiative (MGI) have become prominent research frontiers in materials science. 
The Materials Genome Initiative serves as a strategic framework and developmental roadmap aimed at advancing materials research through artificial intelligence. Similar to gene sequencing in bioinformatics, its primary goal is to facilitate the discovery of novel material compositions and structures. Extracting valuable insights from large-scale datasets contributes significantly to cost reduction, efficiency improvement, interdisciplinary integration, and leapfrog advancements in materials development. Big data analytics, high-performance computing, and advanced algorithms constitute the foundational pillars of this initiative, providing critical support for the research and development of new materials. However, a prerequisite for discovering new material compositions and structures lies in the effective screening of candidate materials to identify those with outstanding properties that satisfy engineering application requirements. This necessitates the construction of comprehensive datasets, the development of robust classification algorithms, further enhancement of model generalization capabilities, and the advancement of associated application software.  Methods   This study was conducted using pattern recognition methods. First, a self-developed Wave-Absorbing Materials (WAM) dataset was constructed, comprising a test set and a validation set. Data preprocessing was carried out initially, which included data augmentation, data merging, and principal component analysis. Decision trees and random forests were employed to establish classification indicators and define the basis for classification. Self-Organizing Maps (SOM) and Probabilistic Neural Networks (PNN) were utilized for the classification task. Finally, the accuracy rates of different clustering algorithms were compared, revealing that the fuzzy clustering algorithm demonstrated relatively superior performance and was capable of achieving satisfactory results.  
Results and Discussions   It was found that the reflection loss of MOFs (Metal Organic Frameworks) materials is superior to that of carbon-based materials. Semantic segmentation algorithms are not applicable to the classification of the WAM dataset. The classification accuracy of SOM is better than that of PNN. The WAM dataset is not linearly separable, and the classification results depend on the data distribution characteristics of the dataset itself. The maximum reflection loss is the key indicator for classification.  Conclusions   A self-built WAM dataset was first constructed, filling the gap left by the absence of any reported dataset for pattern-recognition studies of absorbing materials. The performance of various algorithms was compared, and the optimal algorithm was determined according to the characteristics of the dataset. The traditional binary classification problem was extended to three classes, laying the groundwork for future multi-class research. The use of artificial intelligence algorithms improves the credibility and reliability of the research while saving time and human resources. This method can explore the relationship between material properties and absorbing performance, shorten the research and development cycle, assist in screening new materials, and support the construction of the knowledge base of absorbing materials. However, knowledge extraction from WAM is hampered by data sparsity, which imposes certain limitations on the artificial intelligence methods.
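The finding that maximum reflection loss is the key classification indicator can be pictured as a one-feature decision stump. The sketch below is illustrative only: the sample values are synthetic, not entries from the WAM dataset, and the stump stands in for a full decision tree; the –45 dB threshold echoes the criterion quoted in the abstract.

```python
# One-feature decision stump on maximum reflection loss (RL, in dB).
# Labels and values are invented for illustration.

def stump_predict(rl_db, threshold=-45.0):
    """Label a sample 'MOFs' if its maximum reflection loss is below
    the -45 dB criterion mentioned in the abstract, else 'carbon'."""
    return "MOFs" if rl_db < threshold else "carbon"

samples = [(-52.1, "MOFs"), (-61.3, "MOFs"), (-38.7, "carbon"), (-41.2, "carbon")]
accuracy = sum(stump_predict(rl) == label for rl, label in samples) / len(samples)
print(accuracy)  # 1.0 on these synthetic samples
```

A real decision tree would learn the threshold (and further splits on thickness and bandwidth) from the training split rather than fixing it by hand.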
Research on Non-cooperative Interference Suppression Technology for Dual Antennas without Channel Prior Information
YAN Cheng, LI Tong, PAN Wensheng, DUAN Baiyu, SHAO Shihai
Available online  , doi: 10.11999/JEIT250378
Abstract:
  Objective  In electronic countermeasures, friendly communication links are highly susceptible to interference from adversaries. To suppress non-cooperative interference signals, the auxiliary antenna scheme is commonly employed to extract reference signals for interference cancellation, thereby enhancing communication quality. However, the auxiliary antenna typically receives both interference and communication signals simultaneously, which can degrade the interference suppression capability. Typical methods for non-cooperative interference suppression include interference rejection combining and spatial domain adaptive filtering. These techniques leverage the uncorrelated nature of the interference and desired signals to achieve non-cooperative interference suppression. They also require channel information and interference noise information to support the suppression process, which can limit their applicability in certain scenarios.  Methods  This paper proposes the Fast ICA-based Simulated Annealing Algorithm for SINR Maximization (FSA) to address non-cooperative interference suppression in communication systems. Designed for scenarios where prior channel information is unavailable, FSA employs a weighted reconstruction cancellation technique implemented through a Finite Impulse Response (FIR) filter structure. The method operates in a dual-antenna system where one antenna handles communication while the other serves as an auxiliary antenna for interference reference. The core innovation lies in optimizing the weighted reconstruction coefficients using the Simulated Annealing algorithm while employing Fast Independent Component Analysis (Fast ICA) for SINR estimation. The FIR filter reconstructs interference from the auxiliary antenna signal using these optimized coefficients, then subtracts the reconstructed interference from the main received signal to enhance communication quality. 
Accurate SINR estimation in non-cooperative environments presents significant challenges due to mixed signal components. FSA addresses this through blind source separation principles inspired by Fast ICA, extracting sample signals of both communication and interference components. The SINR is calculated based on cross-correlation results between these separated signals and the signals after interference suppression. The Simulated Annealing algorithm serves as a probabilistic optimization technique that iteratively adjusts reconstruction coefficients to maximize output SINR. Starting with initial coefficients, the algorithm perturbs them while evaluating resulting SINR improvements. Using the Monte Carlo criterion, it accepts or rejects perturbations, enabling escape from local optima and convergence toward global optimum solutions. This continuous optimization cycle identifies optimal filter coefficients within the search range to maximize SINR performance. The integrated approach of FSA enables effective interference suppression without requiring prior channel knowledge. By combining Fast ICA's blind estimation capabilities with Simulated Annealing's robust optimization, the method achieves reliable performance in dynamic interference environments. The FIR-based implementation provides a practical framework for real-time interference cancellation, making FSA particularly suitable for electronic countermeasure applications where channel conditions are unknown and rapidly changing. This methodology represents a significant advancement over conventional techniques that depend on channel state information, offering improved adaptability in non-cooperative scenarios while maintaining computational efficiency through the synergistic combination of blind source separation and intelligent optimization algorithms.  
Results and Discussions  The performance of the proposed Fast ICA-based Simulated Annealing Algorithm for SINR Maximization (FSA) was evaluated through simulations and experiments. Results show that FSA significantly improves the output SINR under various conditions. In simulations, the method achieved up to 27.2 dB SINR improvement when the communication and auxiliary antennas had a large SINR difference and were placed farther apart (Fig. 5). However, performance degraded with increased channel correlation between the antennas. Experiments validated these findings, with an SINR improvement of 19.6 dB observed at a 2 m antenna separation (Fig. 7). The study concludes that FSA is highly effective for non-cooperative interference suppression without prior channel information, but its performance is sensitive to antenna configuration and channel correlation.  Conclusions  The proposed Fast ICA-based Simulated Annealing Algorithm for SINR Maximization (FSA) method provides an effective solution for non-cooperative interference suppression in communication systems. The method leverages weighted reconstruction cancellation, optimized by the Simulated Annealing algorithm, and Fast ICA-based SINR estimation to achieve significant improvements in communication quality without requiring prior channel information. The results from both simulations and experiments demonstrate the method's effectiveness across a range of conditions, highlighting its potential for practical applications in electronic warfare environments. The study concludes that the performance of the FSA method is highly dependent on the SINR difference and channel correlation between the communication and auxiliary antennas. Future research will focus on optimizing the algorithm for more complex scenarios and exploring the impact of various system parameters on its performance. 
The findings of this research contribute to the development of robust communication systems capable of operating effectively in challenging interference environments.
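A single-tap toy version of the weighted reconstruction cancellation in FSA can be sketched as follows. The paper uses an FIR filter and a Fast ICA-based blind SINR estimate; this sketch substitutes a one-coefficient weight and an oracle SINR score that knows the desired signal, and all gains and annealing constants are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
s = rng.standard_normal(N)              # desired communication signal
j = 3.0 * rng.standard_normal(N)        # non-cooperative interference
h = 0.8                                 # unknown interference gain at the main antenna
y = s + h * j                           # main-antenna mixture
x = j + 0.05 * rng.standard_normal(N)   # auxiliary antenna: interference reference

def out_sinr_db(w):
    # Score a candidate weight; the paper estimates SINR blindly via
    # Fast ICA, which this toy replaces with an oracle that knows s.
    resid = y - w * x - s
    return 10 * np.log10(np.mean(s**2) / np.mean(resid**2))

w_cur, s_cur = 0.0, out_sinr_db(0.0)
w_best, s_best = w_cur, s_cur
temp = 1.0
for _ in range(300):
    cand = w_cur + temp * rng.standard_normal()   # perturb the current weight
    score = out_sinr_db(cand)
    # Metropolis acceptance: always keep improvements, occasionally
    # accept worse moves so the search can leave local optima.
    if score > s_cur or rng.random() < np.exp((score - s_cur) / temp):
        w_cur, s_cur = cand, score
        if score > s_best:
            w_best, s_best = cand, score
    temp *= 0.98

print(round(float(w_best), 2))
```

With a blind SINR estimator in place of the oracle score, the same annealing loop applies to the non-cooperative setting; the recovered weight should approach the true interference gain.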
A Reliable Service Chain Option for Global Migration of Intelligent Twins in Vehicular Metaverses
QIU Xianyi, WEN Jinbo, KANG Jiawen, ZHANG Tao, CAI Chengjun, LIU Jiqiang, XIAO Ming
Available online  , doi: 10.11999/JEIT250612
Abstract:
  Objective  As an emerging paradigm arising from the integration and evolution of metaverses and intelligent transportation systems, vehicular metaverses are gradually becoming a key driving force for transforming the automotive industry. In this context, intelligent twins serve as digital copies that cover the entire lifecycle of vehicles and manage vehicular applications, providing users with immersive vehicular services. However, due to cybersecurity threats, particularly Distributed Denial of Service (DDoS) attacks, the seamless migration of intelligent twins across different RoadSide Units (RSUs) faces challenges such as excessive data transmission delays and data leakage. This paper proposes a globally optimized scheme for secure dynamic intelligent twin migration based on RSU chains, aimed at addressing data transmission latency and network security issues in vehicular metaverses, ensuring that intelligent twins can be reliably and securely migrated through RSU chains even under various types of DDoS attacks.  Methods  Firstly, a set of reliable RSU chains is constructed through an RSU communication interruption-free mechanism, enabling the rational deployment of intelligent twins for seamless RSU connectivity. This mechanism ensures continuous communication by dynamically adjusting RSU chain configurations based on real-time network conditions and vehicle movements. Then, the secure migration problem of intelligent twins along these RSU chains is modeled as a Partially Observable Markov Decision Process (POMDP). The POMDP framework captures dynamic network state variables such as RSU loads, available bandwidth, computational capacity, and attack types. These variables are continuously monitored to inform decision-making processes. The migration efficiency and security evaluation of RSU chains are based on the total migration delay and the number of DDoS attacks encountered, which are then used as reward functions to optimize decisions. 
Over time, the Deep Reinforcement Learning (DRL) agents learn from interactions with the environment, optimizing the selections of RSU chains for secure and efficient intelligent twin migration. Through this algorithm, the proposed scheme effectively addresses the issue of excessive data transmission delays in vehicular metaverses caused by network attacks, ensuring reliable and secure intelligent twin migration even under various types of DDoS attacks.  Results and Discussions  The proposed secure dynamic intelligent twin migration scheme is based on the Multi-Agent Deep Reinforcement Learning (MADRL) framework to select efficient and secure RSU chains in the POMDP. With an appropriately defined reward function, the efficiency and security of intelligent twin migration across RSU chains are evaluated by assessing the impact of varying RSU chain lengths and different attack scenarios on system performance. Simulation results demonstrate that the proposed scheme can effectively enhance the security of intelligent twin migration in vehicular metaverses. Specifically, shorter RSU chains achieve lower migration delays than longer chains due to fewer handovers and lower communication overhead (Fig. 2). Additionally, the total reward reaches its maximum value when the RSU chain length is 6 (Fig. 3). The Multi-Agent Deep Q-Network (MADQN) approach demonstrates strong defense capabilities against DDoS attacks. Under direct attacks, the MADQN approach yields final rewards that are 65.3% and 51.8% higher than those achieved by the random and greedy strategies, respectively. Against indirect attacks, MADQN improves upon other approaches by 9.3%. Under hybrid attack conditions, MADQN raises the final reward by 29% and 30.9% compared with the random and greedy strategies, respectively (Fig. 4), which shows the effectiveness and advantages of the DRL-based defense strategy in dealing with complex and dynamic attacks. Additionally, as indicated by experimental results (Figs. 
5-7), when compared with other DRL algorithms such as PPO, A2C, and QR-DQN, the MADQN algorithm demonstrates superior performance under direct, indirect, and hybrid DDoS attacks. In conclusion, the proposed scheme ensures reliable and efficient intelligent twin migration across RSUs, even under diverse security threats, thereby supporting high-quality interactions in vehicular metaverses.  Conclusions  This study addresses the challenge of ensuring secure and efficient global migration of intelligent twins in vehicular metaverses by integrating RSU chains with a POMDP-based optimization framework. By utilizing the MADQN algorithm, the proposed scheme enhances the efficiency and security of intelligent twin migration under various network conditions and attack scenarios. Simulation results show that the efficiency and security of intelligent twin migration have been significantly enhanced. On the one hand, under the same driving route, shorter RSU chains are associated with higher migration efficiency and stronger security defense capabilities. On the other hand, when facing various types of DDoS attacks, MADQN consistently outperforms other baseline algorithms. The results show that the MADQN algorithm achieves higher final rewards than random and greedy strategies in various attack scenarios. Compared with other DRL algorithms, MADQN increases the final reward by as much as 50.1%. It indicates that MADQN offers superior reward outcomes and greater adaptability in complex attack environments. For future work, we will focus on further improving the communication security of RSU chains, such as implementing authentication mechanisms to ensure that only authenticated vehicles can access RSU edge communication networks.
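The reward design above, which combines migration delay and DDoS exposure, can be illustrated with a minimal learning loop. The sketch below is a single-agent epsilon-greedy value update standing in for MADQN, with invented delay/attack figures for three hypothetical candidate chains.

```python
import random

# Each candidate RSU chain is summarized by (migration delay, attacks
# encountered); the reward penalizes both, echoing the paper's design.
# All numbers are invented for illustration.
def reward(chain):
    delay, attacks = chain
    return -1.0 * delay - 5.0 * attacks

chains = {"short": (2.0, 1), "medium": (3.5, 0), "long": (6.0, 0)}

random.seed(0)
q = {k: 0.0 for k in chains}            # learned value per chain
for _ in range(500):
    # epsilon-greedy selection, a minimal stand-in for MADQN
    k = random.choice(list(chains)) if random.random() < 0.1 else max(q, key=q.get)
    r = reward(chains[k]) + random.gauss(0, 0.5)   # noisy observed reward
    q[k] += 0.1 * (r - q[k])                       # incremental value update

print(max(q, key=q.get))  # the chain with the highest learned value
```

A full MADQN replaces the lookup table with per-agent neural networks over the POMDP observations, but the reward shaping and greedy exploitation follow the same pattern.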
LLM-based Data Compliance Checking for IoT Scenarios
LI Chaohao, WANG Haoran, ZHOU Shaopeng, YAN Haonan, ZHANG Feng, LU Tianyang, XI Ning, WANG Bin
Available online  , doi: 10.11999/JEIT250704
Abstract:
  Objective  The implementation of various regulations, including the Data Security Law of the People's Republic of China, the Personal Information Protection Law of the People's Republic of China, and the General Data Protection Regulation (GDPR) of the European Union, has led to the emergence of data compliance checking as a crucial mechanism for regulating data processing activities, ensuring data security, and safeguarding the legitimate rights and interests of individuals and organizations. However, the characteristics of the Internet of Things (IoT), including the abundance of diverse and heterogeneous devices and the dynamic, extensive, and variable nature of data, have increased the difficulty of data compliance checking. Specifically, the logs and traffic data generated by IoT devices are characterized by long text, unstructured formats, and ambiguous content, resulting in a high rate of false positives when employing traditional rule-matching methods. Moreover, the dynamic nature of business scenarios and user-defined compliance requirements further exacerbate the complexity of rule design, maintenance, and decision-making.  Methods  In response to the aforementioned challenges, this paper proposes a novel large language model-driven data compliance checking method for IoT scenarios. In the initial stage, a fast regular expression matching algorithm is employed to efficiently screen out all potentially non-compliant data against a comprehensive rule database. The output of this stage is structured preliminary checking results containing information such as the original non-compliant content and the type of non-compliance. The comprehensive rule database encompasses contemporary legislation and regulations, standard requirements, enterprise norms, and customized business requirements, exhibiting both flexibility and expandability. 
This stage successfully overcomes the challenge of reviewing massive amounts of long IoT text data by leveraging the efficiency of regular expression matching algorithms and extracting structured preliminary results to improve the accuracy of the subsequent large language model review. In the subsequent stage, a large language model (LLM) is employed to assess the precision of the initial detection outcomes from the preceding stage. For different types of violations, the large language model adaptively selects different prompts to achieve differentiated classification detection.  Results and Discussions  This paper collected data from 52 Internet of Things (IoT) devices in a running environment, including log and traffic data (Table 2), and established a compliance checking rule library for IoT devices in accordance with the provisions of the Cybersecurity Law, the Data Security Law, and other relevant laws and regulations, as well as internal information security regulations of enterprises. Subsequently, based on this database, the paper conducted the first phase of rule matching on the collected data, resulting in a false positive rate as high as 64.3%, with a total of 55,080 potentially non-compliant data points identified. The present study compares three aspects: different benchmark models, different prompt schemes, and different role prompts. (1) In the benchmark large language model comparison test, eight mainstream large language models were utilized to assess the detection outcomes (Table 5). These models encompassed Qwen2.5-32B-Instruct, DeepSeek-R1-70B, and DeepSeek-R1-0528, each with distinct parameter configurations. Following a thorough review and testing by the large language model, the original false positive rate was reduced to 6.9%, thereby effectively enhancing the quality of compliance testing. Additionally, the error rate of the large language model itself was less than 0.01%. 
(2) The prompt engineering method exerts a substantial influence on the review outcomes of large language models (Table 6). With general prompts, the final false positive rate remained as high as 59%. Using chain-of-thought prompts alone or few-shot prompts alone, the false positive rate fell to approximately 12% and 6%, respectively, while the error rate of the large language model itself dropped to around 30% and 13%. Combining these methods further reduced the error rate of the few-shot prompting scheme to 0.01%. (3) The impact of system role prompts on review accuracy is demonstrated in Table 7. The experimental results show that simple role prompts outperform the absence of role prompts in terms of accuracy and F1, whereas detailed role prompts offer a more pronounced overall advantage over simple role prompts. Furthermore, the paper examines the roles of rule classification and prompt engineering in compliance checking through ablation experiments (Table 8). The paper utilizes unique knowledge supplementation to mitigate the likelihood of mutual interference and misjudgment, thereby reducing the redundancy of prompts. This, in turn, contributes to a reduction in the false alarm rate of large language model reviews.  Conclusions  This paper proposes a novel large language model-driven data compliance checking method for IoT scenarios. This method addresses the problem of compliance checking for large-scale unstructured device data. The paper substantiates the feasibility of the solution through rationality analysis experiments, and the experimental results demonstrate the efficacy of the method in reducing the false positive rate of device data compliance checking. 
The original rule-based checking method exhibited an overall false positive rate of 64.3%, which was reduced to 6.9% through large language model review. Additionally, the error rate introduced by the model itself was controlled below 0.01%.
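The first, rule-matching stage can be sketched with standard regular expressions. The rule names and patterns below are illustrative placeholders, not the paper's rule database; the point is the structured preliminary result (line, rule, matched content) handed to the LLM review stage.

```python
import re

# Illustrative stage-one rules; real deployments would load these from
# the comprehensive rule database described in the abstract.
RULES = {
    "phone_number": re.compile(r"\b1[3-9]\d{9}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "id_keyword": re.compile(r"(?i)\b(password|secret|token)\b"),
}

def prescreen(lines):
    """Scan raw log lines and emit structured preliminary results."""
    hits = []
    for lineno, line in enumerate(lines, 1):
        for rule, pat in RULES.items():
            m = pat.search(line)
            if m:
                hits.append({"line": lineno, "rule": rule, "content": m.group()})
    return hits

logs = [
    "device boot ok",
    "user contact 13812345678 synced",
    "auth token=abc123 sent to 192.168.1.10",
]
for h in prescreen(logs):
    print(h)
```

Each emitted record carries the matched content and violation type, which is exactly the structure the second-stage LLM review consumes to confirm or dismiss the candidate violation.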
Complete Coverage Path Planning Algorithm Based on Rulkov-like Chaotic Mapping
LIU Sicong, HE Ming, LI Chunbiao, HAN Wei, LIU Chengzhuo, XIA Hengyu
Available online  , doi: 10.11999/JEIT250887
Abstract:
  Objective  This study proposes a novel complete coverage path planning (CCPP) algorithm based on a sine-constrained Rulkov-like hyper-chaotic (SRHC) mapping, addressing critical challenges in robotic path planning. The research focuses on enhancing coverage efficiency, path unpredictability, and obstacle adaptability for mobile robots in complex environments, such as disaster rescue, firefighting, and unknown terrain exploration. Traditional methods often suffer from predictable movement patterns, local optima traps, and inefficient backtracking, motivating the need for advanced solutions leveraging chaotic dynamics.  Methods  The SRHC-CCPP algorithm integrates three components. (1) SRHC mapping: a hyper-chaotic system with nonlinear coupling (Eq. 1) generates highly unpredictable trajectories, validated via Lyapunov exponent analysis (Fig. 3a–b). Phase-space diagrams (Fig. 1) and parameter sensitivity studies (Table 1) confirm chaotic behavior under conditions such as a = 0.01, b = 1.3. (2) Memory-driven exploration: a dynamic visitation grid prioritizes uncovered regions, reducing redundancy (Algorithm 1). (3) Obstacle handling: collision detection with normal-vector reflection minimizes oscillations in cluttered environments (Fig. 4). Simulations employed a Mecanum-wheel robot model (Eq. 2) for omnidirectional mobility.  Results and Discussions  (1) Efficiency: SRHC-CCPP achieved faster coverage and superior uniformity in both obstacle-free and obstructed scenarios (Fig. 5, Fig. 6). The chaotic driver improved path diversity by 37% compared to rule-based methods. (2) Robustness: the algorithm demonstrated initial-value sensitivity and adaptability to environmental noise (Table 2). (3) Scalability: low computational overhead enabled deployment in large-scale grids (>10^4 cells).  Conclusions  The SRHC-CCPP algorithm advances robotic path planning by (1) merging hyper-chaotic unpredictability with memory-guided efficiency, eliminating repetitive loops; (2) offering real-time obstacle negotiation via adaptive reflection mechanics; and (3) providing a versatile framework for applications requiring high coverage reliability and dynamic responsiveness. Future work may explore multi-agent extensions and 3D environments.
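The paper's SRHC map (its Eq. 1) is not reproduced in the abstract, so the sketch below uses the classic Rulkov map with commonly cited bursting parameters to illustrate the kind of bounded, non-repeating driver a chaotic coverage planner discretizes into grid waypoints; the parameters, initial state, and checks are all illustrative.

```python
def rulkov(x, y, alpha=4.1, mu=0.001, sigma=-1.0):
    """One step of the classic Rulkov map, standing in for the paper's
    sine-constrained SRHC map, which is not given in the abstract."""
    return alpha / (1.0 + x * x) + y, y - mu * (x - sigma)

x, y = -1.0, -2.9
traj = []
for _ in range(5000):
    x, y = rulkov(x, y)
    traj.append(x)

# A bounded yet non-repeating orbit is the raw material a chaotic
# coverage planner folds into grid waypoints.
bounded = max(abs(v) for v in traj) < 10
distinct = len({round(v, 9) for v in traj})
print(bounded, distinct > 1000)
```

The fast variable bursts irregularly while the slow variable drifts, which is why chaotic CCPP methods couple such a driver with a visitation memory: the map supplies unpredictability, and the memory steers it toward uncovered cells.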
Multi-Channel Switching Array DOA Estimation Algorithm Based on FRIDA
CHEN Tao, XI Haolin, ZHAN Lei, YU Yuwei
Available online  , doi: 10.11999/JEIT250350
Abstract:
  Objective  With the increasing complexity of the electromagnetic environment, the requirements for estimation accuracy in practical direction finding systems are rising. Enlarging the antenna array is an effective means of improving estimation accuracy, but it also significantly increases the complexity of the direction finding system. This paper aims to reduce the number of channels used while maintaining the DOA estimation performance of full-channel data. By combining the channel-reduction advantages of channel compression algorithms with the structure of time-modulated arrays that incorporate switches in the RF front-end, this paper proposes a multi-channel switching array DOA estimation algorithm based on FRIDA.  Methods  The algorithm first introduces a selection matrix, consisting of switches between the antenna array and the channels, which passes the signals received by the selected antennas into the channels; that is, a particular subarray is selected to receive data through the selection matrix. Different subarrays are switched in to collect multiple covariance matrices of reduced-channel received data. Common array elements prescribed in each subarray guarantee phase consistency among these covariance matrices, and a weighted summation of them restores the full dimensionality, yielding the total covariance matrix. The elements of the total covariance matrix corresponding to identical element spacings are then weighted and summed to obtain the full-channel received data vector. 
The FRI reconstruction model is then constructed using the full-channel received data vector, and the incident angles are finally estimated using the proximal gradient descent algorithm together with a parameter recovery algorithm.  Results and Discussions  Simulation results of DOA estimation for SA-FRI under multi-source incidence show that the full-channel received data vectors constructed from multiple reduced-channel covariance matrices can discriminate multi-source incident signals, with performance approximately equal to that of the full-channel received data (Fig. 2). Simulations of estimation accuracy versus the number of snapshots and SNR under different channel counts show that the accuracy of the proposed algorithm increases with both, and that more channels yield higher DOA estimation accuracy at the same SNR and snapshot number (Fig. 3, Fig. 4). Comparisons of four algorithms under different SNRs and snapshot numbers show that the estimation accuracy of all four increases with SNR and the number of snapshots, and that the proposed algorithm outperforms the others under the same conditions (Fig. 5, Fig. 6). The same conclusions are obtained from actual measured data (Fig. 9), further demonstrating the effectiveness of the proposed algorithm.  Conclusions  Aiming at reducing the number of channels used in practical DOA estimation systems, this paper proposes an array-switching DOA estimation algorithm based on proximal gradient descent. 
The algorithm first reduces the number of channels using the switching matrix, then obtains multiple covariance matrices by switching different subarrays into the channels over multiple acquisitions, uses these covariance matrices to recover and complete the covariance matrix of the full-channel received data, and finally obtains the DOA estimates of the incident signals from this matrix using the proximal gradient descent algorithm. Simulations verify that the proposed algorithm reduces channel usage while guaranteeing a certain estimation accuracy. In addition, DOA estimation on measured data collected with an actual DOA estimation system yields results similar to the simulations, further verifying the effectiveness of the algorithm.
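The subarray-switching covariance recovery described above can be sketched as follows. This is a simplified illustration with assumed array sizes, subarray layouts, and an averaging rule for overlapping entries, not the paper's FRIDA pipeline:

```python
import numpy as np

def subarray_covariance(x_sub):
    """Sample covariance of reduced-channel snapshots (channels x snapshots)."""
    n_snap = x_sub.shape[1]
    return x_sub @ x_sub.conj().T / n_snap

def recover_full_covariance(snapshots, subarrays, n_elements):
    """Accumulate each subarray's covariance into the full-array covariance.

    subarrays is a list of index arrays; overlapping (common) elements keep
    the recovered matrix phase-consistent. Entries never observed by any
    subarray are left at zero.
    """
    acc = np.zeros((n_elements, n_elements), dtype=complex)
    cnt = np.zeros((n_elements, n_elements))
    for idx in subarrays:
        r_sub = subarray_covariance(snapshots[idx, :])
        acc[np.ix_(idx, idx)] += r_sub
        cnt[np.ix_(idx, idx)] += 1.0
    cnt[cnt == 0] = 1.0           # avoid division by zero on unobserved entries
    return acc / cnt              # weighted (here: averaged) summation

rng = np.random.default_rng(0)
n, t = 8, 2000
a = np.exp(1j * np.pi * np.arange(n) * np.sin(np.deg2rad(20.0)))  # steering vector
s = (rng.standard_normal(t) + 1j * rng.standard_normal(t)) / np.sqrt(2)
x = np.outer(a, s) + 0.1 * (rng.standard_normal((n, t)) + 1j * rng.standard_normal((n, t)))

# three 4-channel subarrays sharing common elements
subs = [np.arange(0, 4), np.arange(2, 6), np.arange(4, 8)]
r_full = recover_full_covariance(x, subs, n)
print(r_full.shape)  # (8, 8)
```

In a real switching system only the selected rows of the snapshot matrix would be observable per acquisition; the shared elements across `subs` are what ties the partial covariances together.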
Available online, doi: 10.11999/JEIT250637
Abstract:
Kepler’s Laws Inspired Single Image Detail Enhancement Algorithm
JIANG He, SUN Mang, ZHENG Zhou, WU Peilin, CHENG Deqiang, ZHOU Chen
Available online, doi: 10.11999/JEIT250455
Abstract:
  Objective  In recent years, single-image detail enhancement algorithms based on residual learning have attracted extensive attention. These algorithms update the residual layer by leveraging the similarity between the residual layer and the detail layer, and then linearly combine it with the original image to enhance image details. However, this update process is a greedy algorithm, which is prone to trapping the system in local optima, thereby limiting overall performance. Inspired by Kepler's laws, this study analogizes the residual update to the dynamic adjustment of planetary positions. By applying Kepler's laws to calculate the globally optimal positions of the planets, precise updates of the residual layer are achieved.  Methods  The input image is divided into multiple blocks. For each block, its candidate blocks are regarded as “planets”, and the best matching block is treated as a “star”. The positions of the “planets” and the “star” are updated by calculating the differences between each “planet” and the original image block until the positions converge, thereby determining the location of the globally optimal matching block.  Results and Discussions  In this study, 16 algorithms are tested on three datasets at two magnification levels (Table 1). The results show that the proposed algorithm performs excellently in both PSNR and SSIM evaluations. During detail enhancement, the proposed algorithm demonstrates stronger edge preservation than other algorithms (Fig. 7). However, the proposed algorithm is not robust to noise (Fig. 8-Fig. 10), and the quality of the detail-enhanced images continues to decline as the noise intensity increases (Fig. 11). Both the initial gravitational constant and the gravitational attenuation rate constant show a fluctuating trend, first increasing and then decreasing (Fig. 12). 
When the gradient loss and texture loss weights are set to 0.001, the KLDE system achieves the best performance (Fig. 13).  Conclusions  This study proposes a single-image detail enhancement algorithm inspired by Kepler's laws. By analogizing the residual update process to the dynamic adjustment of planetary positions, the algorithm uses Kepler's laws to optimize the update of residual layers, alleviating the local-optimum problem caused by greedy search and achieving more precise image detail enhancement. Experimental results show that the algorithm outperforms existing methods in both visual effects and quantitative metrics and achieves natural enhancement. However, its running time is relatively long, with the computational bottleneck lying in the iterative update of candidate blocks and the calculation of parameters such as gravity. Future work will focus on optimizing the algorithm structure to reduce invalid searches and improve efficiency. Moreover, since the algorithm requires no training yet performs well, it shows potential and promotion value in scenarios such as high-precision offline image enhancement.
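A gravity-style block search of the kind described above can be sketched as follows. This is an illustrative toy, not the authors' exact KLDE formulation: candidate blocks ("planets") are pulled toward the current best match ("star") under a gravitational constant `g0` that decays at an assumed rate, which mitigates the local optima of a purely greedy nearest-block search:

```python
import numpy as np

def kepler_style_search(ref_block, candidates, g0=0.5, decay=0.9, n_iter=20):
    """Return the candidate position closest to ref_block after gravity updates.

    ref_block  : flattened reference image block
    candidates : (n_candidates, block_size) array of flattened blocks
    g0, decay  : assumed initial gravitational constant and its decay rate
    """
    planets = candidates.astype(float).copy()
    g = g0
    for _ in range(n_iter):
        errs = np.linalg.norm(planets - ref_block, axis=1)
        star = planets[np.argmin(errs)].copy()   # current best match
        planets += g * (star - planets)          # pull planets toward the star
        g *= decay                               # attenuate gravity over time
    errs = np.linalg.norm(planets - ref_block, axis=1)
    return planets[np.argmin(errs)]

rng = np.random.default_rng(1)
ref = rng.random(64)
cands = ref + 0.3 * rng.standard_normal((10, 64))
best = kepler_style_search(ref, cands)
# the converged position is never worse than the best initial candidate
print(np.linalg.norm(best - ref) <= np.linalg.norm(cands - ref, axis=1).min())
```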
Research on Data Recovery for the Power Grid Industry Based on Diffusion Models
YAN Yandong, LI Chenxi, LI Shijie, YANG Yang, GE Yuhao, HUANG Yu
Available online, doi: 10.11999/JEIT250435
Abstract:
  Objective  The construction of smart grids is driving the modernization of power systems. As the key link connecting the main grid to end-users, distribution networks rely on data management and analysis for stability, power quality, and efficiency, yet they generate massive multi-source heterogeneous data (user consumption, real-time meteorology, equipment status, marketing information). These data often become incomplete during collection and transmission due to noise, sensor failures, equipment aging, or bad weather. Missing data impairs real-time monitoring and critical tasks (load forecasting, fault diagnosis, health assessment, O&M decisions), while traditional methods (mean or regression imputation) and generative models (GANs, VAEs) fail to capture the temporal dependencies and complex distributions of grid data, limiting accuracy. This research therefore aims to develop a novel diffusion-model-based data augmentation method for distribution networks that effectively recovers missing data, preserves its semantic and statistical integrity, and improves data utility for smart grid stability and efficiency.  Methods  This paper proposes a novel power grid data augmentation method based on diffusion models. The core of the method is to map input Gaussian noise to the target distribution space of the missing data, enabling restoration of the data in accordance with its original underlying distribution. Furthermore, to minimize the semantic discrepancy between the recovered data and the actual data, the approach integrates time-series sequence embeddings as conditional information. This conditional input guides and optimizes the diffusion generation process, ensuring more contextually accurate imputation.  Results and Discussions  Experimental results demonstrate that the proposed diffusion-model-based data augmentation technique achieves state-of-the-art accuracy in recovering missing power grid data compared with conventional methods. 
This superior performance highlights the method's capability to significantly enhance the completeness and reliability of the datasets crucial to analytical applications and operational decision-making in smart grids.  Conclusions  This study introduces and validates a diffusion-model-based data augmentation method tailored to data missingness in power distribution networks. Unlike traditional restoration methods and conventional generative models, which struggle to capture the temporal dependencies and complex distribution characteristics of grid data, this method effectively leverages temporal sequence information as a conditional guide, which not only enables accurate imputation of missing values but also preserves the semantic integrity and statistical consistency of the original data. By addressing the long-standing problem of low accuracy in restoring missing distribution network data, the method offers a robust solution for improving data quality, thereby providing solid technical support for more stable and efficient smart grid operation.
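The diffusion mechanics the method builds on can be sketched in a few lines. This is a generic DDPM-style toy, not the paper's conditional model: the noise predictor here is an oracle computed from the true signal, where the paper would use a network conditioned on time-series embeddings; schedule values and signal shape are assumptions:

```python
import numpy as np

T = 100
betas = np.linspace(1e-4, 0.02, T)   # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(x0, t, eps):
    """Forward process: x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def p_step(xt, t, eps_hat, rng):
    """One reverse (denoising) step using a predicted noise eps_hat."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
    mean = (xt - coef * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

rng = np.random.default_rng(0)
x0 = np.sin(np.linspace(0, 2 * np.pi, 48))        # toy daily load profile
xt = q_sample(x0, T - 1, rng.standard_normal(x0.shape))

# With an oracle noise predictor, iterating reverse steps recovers x0 exactly.
for step in range(T - 1, -1, -1):
    eps_hat = (xt - np.sqrt(alpha_bar[step]) * x0) / np.sqrt(1.0 - alpha_bar[step])
    xt = p_step(xt, step, eps_hat, rng)

print(np.max(np.abs(xt - x0)) < 1e-6)
```

Replacing the oracle `eps_hat` with a learned, condition-aware predictor is what turns this sketch into an imputation model: the missing segment is denoised while observed values and sequence embeddings steer the reverse process.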
T3FRNet: A Cloth-Changing Person Re-identification via Texture-aware Transformer Tuning Fine-grained Reconstruction method
ZHUANG Jianjun, WANG Nan
Available online, doi: 10.11999/JEIT250476
Abstract:
  Objective  Compared to conventional person re-identification, Cloth-Changing person Re-Identification (CC Re-ID) requires moving beyond the reliance on the stability of a person’s appearance features over time, instead demanding models with greater robustness and generalization capabilities to address real-world application scenarios. Existing deep feature representation methods can leverage salient regions or attribute information to obtain discriminative features and mitigate the impact of clothing variations, yet their performance often degrades under changing environments. To address the challenges of effective feature extraction and limited training samples in CC Re-ID tasks, this paper proposes a novel Texture-aware Transformer Tuning Fine-grained Reconstruction Network (T3FRNet), which aims to fully exploit fine-grained information within person images, enhance the robustness of feature learning, and reduce the adverse impact of clothing changes on model performance, ultimately overcoming performance bottlenecks under scene variations.  Methods  To better compensate for the limitations of local receptive fields, T3FRNet incorporates a Transformer-based attention mechanism into the ResNet50 backbone, constructing a hybrid architecture named ResFormer50. This design facilitates spatial relational modeling on top of local features, enhancing the model’s perceptual capacity for feature extraction while maintaining a balance between efficiency and performance. The fine-grained Texture-Aware (TA) module concatenates processed texture features with deep semantic features, thereby improving the model’s recognition capability under clothing variations. Meanwhile, the Adaptive Hybrid Pooling (AHP) module performs channel-wise autonomous aggregation, enabling deeper and more refined mining of feature representations. This contributes to achieving a balance between global representation consistency and robustness to clothing changes. 
A novel Adaptive Fine-grained Reconstruction (AFR) strategy introduces adversarial perturbations and selective reconstruction at a fine-grained level. Without relying on explicit supervision, the AFR strategy significantly enhances the model's robustness and generalization against clothing changes and local detail perturbations, thereby improving recognition accuracy in real-world scenarios. Finally, a Joint Perception Loss (JP-Loss) is designed by integrating a fine-grained identity robustness loss, a texture feature loss, the widely used identity classification loss, and the triplet loss. This composite loss jointly supervises the model to learn robust fine-grained identity features, ultimately boosting its performance under challenging cloth-changing conditions.  Results and Discussions  To validate the effectiveness of the proposed model, extensive evaluations are conducted on three widely used CC Re-ID benchmarks, LTCC, PRCC, and Celeb-reID, as well as a large-scale dataset, DeepChange (Table 1). Under cloth-changing scenarios, the model achieves Rank-1/mAP scores of 45.6%/19.8% on LTCC, 70.6%/69.1% on PRCC (Table 2), 64.6%/18.4% on Celeb-reID (Table 3), and 58.0%/20.8% on DeepChange (Table 4), outperforming existing state-of-the-art approaches. The TA module effectively extracts latent local texture details within person images and, in conjunction with the AFR strategy, performs fine-grained adversarial perturbations and selective reconstruction. This enhances fine-grained feature representations, enabling the proposed method to also achieve 96.2% Rank-1 and 89.3% mAP on the clothing-consistent Market-1501 dataset (Table 5). The introduction of the JP-Loss further supports the TA module and AFR strategy by enabling fine-grained adaptive regulation and clustering of texture-sensitive identity features (Table 6). 
Furthermore, when the Transformer-based attention mechanism is integrated after stage 2 of ResNet50, the model achieves improved local structural perception and global context modeling with only a slight increase in computational overhead, thereby enhancing overall performance (Table 7). Additionally, setting the $ \beta $ parameter to 0.5 (Fig. 5) enables the JP-Loss to effectively balance global texture consistency and local fine-grained discriminability, thereby enhancing the overall robustness and accuracy of CC Re-ID. Finally, visualization experiments based on the PRCC dataset (Fig. 6) offer intuitive evidence of the model's superior feature extraction capability and highlight the significance of the Transformer-based attention mechanism. The top-10 retrieval results of the baseline model and T3FRNet in the cloth-changing scenario (Fig. 7) intuitively demonstrate that T3FRNet has better stability and accuracy.  Conclusions  This paper proposes a CC Re-ID method based on T3FRNet, composed of the ResFormer50 backbone, TA module, AHP module, AFR strategy, and JP-Loss. Extensive experiments conducted on four publicly available cloth-changing benchmarks and one clothing-consistent dataset demonstrate the effectiveness and superiority of the proposed approach. Under long-term scenarios, Rank-1/mAP on the LTCC and PRCC datasets achieve significant improvements of 16.8%/8.3% and 30.4%/32.9%, respectively. The ResFormer50 backbone facilitates spatial relationship modeling on top of local fine-grained features, while the TA module and AFR strategy enhance the expressiveness of fine-grained representations. The AHP module effectively balances the model's sensitivity to local textures with the stability of global features, thereby ensuring strong feature representation alongside robustness. 
JP-Loss assists the model in constraining fine-grained feature representations and performing adaptive regulation, thereby enhancing its generalization capability in diverse and challenging cloth-changing scenarios. Future work will focus on simplifying the model architecture to reduce computational complexity and latency, aiming to strike a better balance between high recognition accuracy and deployment efficiency.
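An "adaptive hybrid pooling" idea of the kind the AHP module describes can be sketched as a per-channel blend of global average and global max pooling. The gating mechanism, shapes, and names here are illustrative assumptions; the exact AHP module in T3FRNet may differ:

```python
import numpy as np

def adaptive_hybrid_pool(feat, mix_logits):
    """Blend average and max pooling channel-wise.

    feat       : (C, H, W) feature map
    mix_logits : (C,) learnable per-channel gate parameters (assumed)
    """
    c = feat.shape[0]
    avg = feat.reshape(c, -1).mean(axis=1)      # global average pooling
    mx = feat.reshape(c, -1).max(axis=1)        # global max pooling
    w = 1.0 / (1.0 + np.exp(-mix_logits))       # sigmoid gate per channel
    return w * avg + (1.0 - w) * mx             # channel-wise autonomous blend

rng = np.random.default_rng(0)
f = rng.random((2048, 24, 8))                   # ResNet50-like output shape
logits = np.zeros(2048)                         # gate = 0.5 for every channel
v = adaptive_hybrid_pool(f, logits)
print(v.shape)  # (2048,)
```

Because the gate is learned per channel, texture-sensitive channels can lean toward max pooling (local saliency) while identity-stable channels lean toward average pooling (global consistency), which matches the balance the abstract attributes to AHP.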
Stealthy Path Planning Algorithm for UAV Swarm Based on Improved APF-RRT* Under Dynamic Threat
ZHANG Xinrui, SHI Chenguang, WU Zhifeng, WEN Wen, ZHOU Jianjiang
Available online, doi: 10.11999/JEIT250554
Abstract:
  Objective  The efficient penetration and survivability of unmanned aerial vehicle (UAV) swarms in complex battlefield environments critically depend on robust trajectory planning. With the increasing deployment of advanced air defense systems featuring radar networks, anti-aircraft artillery, and dynamic no-fly zones, conventional planning methods struggle to meet simultaneous requirements for stealth, feasibility, and safety. Although prior studies have contributed valuable insights into UAV swarm path planning, they present several limitations: (1) Most research focuses on detection models for single radars and does not sufficiently incorporate the coupling between the UAV radar cross section (RCS) and stealth trajectory optimization; (2) UAV kinematic constraints are often treated independently of stealth characteristics; (3) Environmental threats are typically modeled as static and singular, overlooking real-time dynamic threats; (4) Stealth planning is predominantly studied for individual UAVs, with limited attention to swarm-level coordination. This work addresses these gaps by proposing a cooperative stealth trajectory planning framework that integrates real-time threat perception with swarm dynamics optimization, significantly enhancing survivability in contested airspace.  Methods  To overcome these challenges, this paper proposes a stealth path planning algorithm for UAV swarms based on an improved artificial potential field (APF) and rapidly-exploring random tree star (RRT*) framework under dynamic threats. First, a multi-threat environment model is constructed, incorporating radars, anti-aircraft artillery, and fixed obstacles. A comprehensive stealth cost function is developed by integrating UAV RCS characteristics and accounting for flight distance, radar detection probability, and artillery threat probability. 
Second, a stealth trajectory optimization model is formulated with the objective of minimizing the overall cost function, subject to strict constraints on UAV kinematics, swarm coordination, and path feasibility. To solve this model efficiently, an enhanced APF-RRT* algorithm is designed. A rolling-window strategy is introduced to facilitate continuous local replanning in response to dynamic threats, enabling real-time trajectory updates and improving responsiveness to sudden changes in the threat landscape. Furthermore, a target-biased sampling technique is applied to reduce sampling redundancy, thereby enhancing algorithmic convergence speed. By combining the global search capability of RRT* with the local adaptability of APF, the proposed approach enables UAV swarms to generate stealth-optimal paths in real time while maintaining high levels of safety and coordination in adversarial environments.  Results and Discussions  Simulation experiments validate the effectiveness of the proposed algorithm. During global path planning, some UAVs enter regions threatened by dynamic no-fly zones, radars, and artillery systems, while others successfully reach their destinations through unobstructed paths. In the local replanning phase, affected UAVs adaptively adjust their trajectories to minimize radar detection probability and overall stealth cost. When encountering mobile threats, UAVs perform lateral evasive maneuvers to avoid collisions and ensure mission completion. In contrast, the detection probabilities of the UAVs requiring replanning all exceed the specified threshold for networked radar detection under the comparison algorithms. This indicates that, in practical scenarios, the comparison algorithms fail to generate UAV swarm trajectories that meet platform safety requirements, rendering them ineffective. 
Comparative simulations demonstrate that the proposed method significantly outperforms existing approaches by reducing stealth costs and improving trajectory feasibility and swarm coordination. The algorithm achieves optimal swarm-level stealth and ensures safe and efficient penetration in dynamic environments.  Conclusions  This study addresses the problem of stealth trajectory planning for UAV swarms in dynamic threat environments by proposing an improved APF-RRT* algorithm. The following key findings are derived from extensive simulations across different contested scenarios (Section 5): (1) The proposed algorithm reduces the voyage distance by 11.1 km in Scene 1 and 66.9 km in Scene 2 compared with the baseline RRT* method (Tab. 3, Tab. 5), primarily owing to RCS-minimizing attitude adjustments through heading angle changes (Fig. 3, Fig. 6); (2) The networked radar detection probability remains below the 30% threshold for all UAVs (Fig. 4(a), Fig. 7(a)), whereas the comparison algorithms exceed the safety limit for up to 98% of the swarm members (Fig. 4(b), Fig. 7(b), Fig. 9(a), Fig. 9(b)); (3) The rolling-window replanning mechanism enables real-time avoidance of mobile threats such as dynamic no-fly zones and anti-aircraft artillery (Fig. 5, Fig. 8), while reducing the comprehensive trajectory cost by 9.0% in Scene 1 and 15.6% in Scene 2 compared with the baseline RRT method (Tab. 3, Tab. 5); (4) Cooperative constraints embedded in the planning algorithm maintain safe inter-UAV separation and jointly optimize swarm-level stealth performance (Fig. 2, Fig. 5, Fig. 8). These results collectively demonstrate the superiority of the proposed method in balancing stealth optimization, dynamic threat adaptation, and swarm kinematic feasibility. Future research will extend this framework to 3D complex terrain environments and integrate deep reinforcement learning to further enhance predictive threat response and battlefield adaptability.
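The target-biased sampling technique mentioned in the Methods can be sketched in a few lines. The bias probability and workspace bounds below are illustrative assumptions, not the paper's tuned values: with probability `p_bias` the sampler returns the goal instead of a uniform random point, steering RRT* tree growth toward the target and cutting redundant exploration:

```python
import random

def biased_sample(goal, bounds, p_bias=0.1, rng=random):
    """Return the goal with probability p_bias, else a uniform 2D sample.

    goal   : (x, y) target position
    bounds : ((xmin, xmax), (ymin, ymax)) workspace limits
    """
    if rng.random() < p_bias:
        return goal
    (xmin, xmax), (ymin, ymax) = bounds
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))

rng = random.Random(42)
goal = (90.0, 90.0)
bounds = ((0.0, 100.0), (0.0, 100.0))
samples = [biased_sample(goal, bounds, 0.2, rng) for _ in range(1000)]
hits = sum(s == goal for s in samples)
print(0 < hits < 1000)  # a fraction of samples is biased to the goal
```

In a full planner each sample would additionally be rejected if it falls inside a threat region, and the APF component would locally deform the tree's extension step around dynamic obstacles.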
Power Allocation for Downlink Short Packet Transmission with Superimposed Pilots in Cell-free Massive MIMO
SHEN Luyao, ZHOU Xingguang, XIA Wenchao, ZHU Hongbo
Available online, doi: 10.11999/JEIT250655
Abstract:
  Objective  With the advancement of 5th-generation mobile communication, the volume of communication service interactions has surged dramatically. To accommodate this substantial increase in communication demand, Cell-Free Massive Multiple-Input Multiple-Output (CF-mMIMO) has emerged as a key enabling technology. However, supporting multi-user access in CF-mMIMO systems introduces significant complexity in channel estimation. Conventional channel estimation methods based on regular pilots incur large overhead, which considerably reduces the number of symbols available for data transmission and thereby lowers transmission rates. This issue is particularly pronounced in short packet transmission scenarios. To address this challenge, this paper investigates a downlink short packet transmission scheme based on Superimposed Pilots (SP) in CF-mMIMO systems, aiming to improve short packet transmission performance.  Methods  This study addresses the SP-based downlink short packet transmission scenario in CF-mMIMO systems and proposes a power allocation algorithm. Considering the energy consumption and resource constraints of practical applications, a User-Centric (UC) approach is adopted. Based on the Maximum Ratio Transmission (MRT) precoding scheme, a closed-form expression for the downlink achievable rate is derived under imperfect Channel State Information (CSI). Because of the cross-interference between pilot and data signals, an iterative optimization algorithm based on Geometric Programming (GP) and Successive Convex Approximation (SCA) is further proposed. Specifically, under the minimum data rate requirement and the uplink and downlink power constraints, the objective is to optimize the power allocation between pilot and data signals. 
By employing logarithmic function approximation and the SCA method, the non-convex optimization problem is transformed into a GP problem, and an iterative algorithm is proposed to solve it. Furthermore, this paper compares the SP scheme with the Regular Pilots (RP) scheme to demonstrate the superiority of the SP scheme and the proposed algorithm.  Results and Discussions  Simulation results first confirm the accuracy of the closed-form expressions for the downlink sum-rate under both the SP and RP schemes (Fig. 2). To further demonstrate the superiority and effectiveness of the proposed algorithm, a comparative analysis of the weighted sum-rate is conducted among the proposed power allocation algorithm under the SP and RP schemes and fixed power allocation under the SP scheme, with the number of AP antennas (Fig. 3), the number of users (Fig. 4), the block length (Fig. 5), and the decoding error probability (Fig. 6) as variables. The results demonstrate that the weighted sum-rate achieved by the proposed power allocation algorithm under the SP scheme outperforms both the RP scheme and the fixed power allocation scheme.  Conclusions  This paper investigates the downlink power allocation problem under the SP scheme in CF-mMIMO systems for short packet transmission. First, the UC scheme is adopted to derive a closed-form expression for a lower bound on the downlink transmission rate under imperfect CSI and MRT precoding. Subsequently, the downlink weighted sum-rate maximization problem for the SP scheme is formulated, and the non-convex problem is transformed into a solvable GP problem using the SCA method. Finally, an iterative algorithm is employed to obtain the solution. Simulation results validate the correctness of the closed-form rate expression and demonstrate the superiority of the proposed power allocation algorithm.
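The short-packet rate model underlying this line of work can be illustrated with the standard finite-blocklength normal approximation, R ~ log2(1+snr) - sqrt(V/n) * Qinv(eps) / ln(2) with dispersion V = 1 - 1/(1+snr)^2. This is the generic Polyanskiy-style approximation, not the paper's exact closed-form expression under superimposed pilots; the SNR, blocklength, and error probability below are illustrative:

```python
import math
from statistics import NormalDist

def q_inv(eps):
    """Inverse Gaussian Q-function: Q^{-1}(eps) = Phi^{-1}(1 - eps)."""
    return NormalDist().inv_cdf(1.0 - eps)

def short_packet_rate(snr, n, eps):
    """Approximate achievable rate (bits/channel use) at blocklength n."""
    shannon = math.log2(1.0 + snr)
    v = 1.0 - 1.0 / (1.0 + snr) ** 2                    # channel dispersion
    penalty = math.sqrt(v / n) * q_inv(eps) / math.log(2.0)
    return max(shannon - penalty, 0.0)

r_short = short_packet_rate(snr=10.0, n=200, eps=1e-5)
r_long = short_packet_rate(snr=10.0, n=20000, eps=1e-5)
print(r_short < r_long < math.log2(11.0))  # penalty shrinks with blocklength
```

The penalty term is what couples the block length and decoding error probability sweeps (Fig. 5, Fig. 6) to the achievable sum-rate, and why pilot overhead is so costly at short blocklengths.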
Performance Optimization of UAV-RIS-Assisted Communication Networks Under No-Fly Zone Constraints
XU Junjie, LI Bin, YANG Jingsong
Available online, doi: 10.11999/JEIT250681
Abstract:
  Objective  Reconfigurable Intelligent Surfaces (RIS) mounted on Unmanned Aerial Vehicles (UAVs) have emerged as a promising solution to enhance wireless communication coverage and adaptability in complex or constrained environments. However, in practical deployments, two major challenges remain largely underexplored. First, the existence of No-Fly Zones (NFZs), such as airports, government facilities, and high-rise areas, significantly restricts the UAV’s flight trajectory and may lead to communication blind spots. Second, the continuous attitude variation of UAVs during flight causes dynamic misalignment between the RIS and the desired reflection direction, which significantly degrades signal strength and system throughput. To address these issues, this paper proposes a comprehensive UAV-RIS-assisted communication framework that simultaneously considers NFZ avoidance and UAV attitude adjustment. Specifically, this paper studies a quadrotor UAV with a bottom-mounted RIS, operating in an environment with multiple polygonal NFZs and a group of ground users (GUs). The objective is to jointly optimize the UAV trajectory, RIS phase shift, UAV attitude (expressed via Euler angles), and base station (BS) beamforming, with the aim of maximizing the system sum rate while ensuring complete obstacle avoidance and high-quality service for GUs located both inside and outside the NFZs.  Methods  To achieve this objective, a multi-variable coupled non-convex optimization problem is formulated, jointly capturing UAV trajectory, RIS configuration, UAV attitude, and BS beamforming under NFZ constraints. RIS phase shifts are dynamically adjusted based on the UAV’s orientation to maintain beam alignment, while UAV motion follows quadrotor dynamics and avoids polygonal NFZs. Owing to the high dimensionality and non-convexity, conventional optimization methods are computationally prohibitive and lack real-time adaptability. 
To overcome this, the problem is reformulated as a Markov Decision Process (MDP), enabling policy learning via deep reinforcement learning. Specifically, this paper adopts the Soft Actor-Critic (SAC) algorithm, which leverages entropy regularization for efficient exploration and stable convergence. The UAV-RIS agent iteratively interacts with the environment, updating actor and critic networks to determine the UAV position, RIS phases, and beamforming. Through continuous learning, the framework achieves higher throughput with guaranteed NFZ avoidance, outperforming the benchmarks.  Results and Discussions  As shown in (Fig. 3), the proposed SAC algorithm achieves higher communication rates than PPO, DDPG, and TD3 during training, benefiting from entropy-regularized exploration that mitigates premature convergence. While DDPG converges faster, it exhibits instability and inferior long-term performance. As illustrated in (Fig. 4), the UAV trajectories under different settings confirm the proposed algorithm's ability to achieve complete obstacle avoidance while maintaining reliable communication. Regardless of changes in initial UAV positions, BS locations, or NFZ configurations, the UAV consistently avoids all NFZs and adjusts its trajectory to serve users both inside and outside restricted zones, demonstrating the strong adaptability and scalability of the proposed model. (Fig. 5) shows that increasing the number of BS antennas enhances system performance; the proposed scheme significantly outperforms the fixed phase-shift, random phase-shift, and no-RIS baselines owing to improved beamforming flexibility.  Conclusions  This paper investigates a UAV-RIS-assisted wireless communication system in which a quadrotor UAV carries a RIS for signal reflection and NFZ avoidance. Unlike conventional methods that focus only on avoidance, a path-integral-based approach is proposed to ensure obstacle-free trajectories while maintaining reliable service for GUs inside and outside NFZs. 
To enhance generality, NFZs are modeled as prismatic obstacles with regular n-sided polygonal cross-sections. The system jointly considers UAV trajectory, RIS phase shifts, UAV attitude, and BS beamforming. A DRL framework with SAC is developed to optimize system efficiency. Simulations show that the proposed method achieves reliable avoidance and maximized sum rate, and it outperforms benchmarks in communication performance, scalability, and stability.
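A polygonal NFZ check of the kind this environment model requires can be sketched with a ray-casting point-in-polygon test against a prism's n-sided cross-section. The vertices below are illustrative, not from the paper's scenarios:

```python
def inside_nfz(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon's cross-section?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # toggle on each polygon edge that the horizontal ray from (x, y) crosses
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square_nfz = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(inside_nfz((5.0, 5.0), square_nfz))    # True
print(inside_nfz((15.0, 5.0), square_nfz))   # False
```

In the full problem this predicate becomes a hard constraint on every waypoint of the learned trajectory (plus an altitude check for the prism height), and its violation would be penalized in the MDP reward.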
Minimax Robust Kalman Filtering under Multistep Random Measurement Delays and Packet Dropouts
YANG Chunshan, ZHAO Ying, LIU Zheng, QIU Yuan, JING Benqin
Available online, doi: 10.11999/JEIT250741
Abstract:
  Objective  Networked control systems (NCSs) offer many advantages, including flexibility in installation and maintenance and lower cost, but they also introduce complexities such as random measurement delays and packet dropouts caused by the unreliability and limited bandwidth of the communication network. Meanwhile, the system noise variance may fluctuate significantly in environments with strong electromagnetic interference, and the time delay in an NCS is random and uncertain. When a group of Bernoulli-distributed random variables is used to describe multistep random measurement delays and packet dropouts, the fictitious-noise method in existing work leads to autocorrelation between different components, which makes the fictitious noise variances hard to compute and the robustness hard to prove. This research provides a solution to minimax robust Kalman filtering for systems with uncertain noise variances, multistep random measurement delays, and packet dropouts.  Methods  The main difficulties lie in the model transformation and the proof of robustness. With a group of Bernoulli-distributed random variables describing multistep random measurement delays and packet dropouts, a series of approaches is adopted to address the minimax robust Kalman filtering problem. First, a new model transformation method is presented based on the flexibility of the Hadamard product in multi-dimensional data processing, and the robust time-varying Kalman estimator is then designed in a unified form based on the minimax robust filtering principle. Second, matrix elementary transformations, strictly diagonally dominant matrices, the Geršgorin circle theorem, and the Hadamard product theorem are used to prove robustness based on the generalized Lyapunov equation method. 
Additionally, the Hadamard product is converted into a matrix product by a matrix factorization method, a sufficient condition for the existence of the steady-state estimator is obtained, and the robust steady-state Kalman estimator is then designed.  Results and Discussions  The proposed minimax robust Kalman filter extends the robust Kalman filtering method and provides new theoretical support for solving the robust fusion filtering problem of complex NCSs. The curves (Fig. 5) give the actual accuracy ${\text{tr}}{{\mathbf{\bar P}}^l}(N)$, $l = a,b,c,d$ versus $ 0.1 \le {\alpha _0},{\alpha _1},{\alpha _2} \le 1 $. Situation (a) has the highest robust accuracy, followed by situations (b) and (c), while situation (d) has the worst actual accuracy. This is because the measurements received by the estimator in situation (a) have a one-step random delay, whereas situation (d) has a higher packet loss rate. The curves (Fig. 5) show the reasonability and effectiveness of the proposed method. A second simulation considers a mass–spring–damper system. From the comparison between the proposed method and the optimal robust filtering method (Tab. 2, Fig. 7), it can be seen that although the proposed method guarantees that the actual prediction error variance has the minimum upper bound, its actual accuracy is slightly lower than the optimal prediction accuracy.  Conclusions  The minimax robust Kalman filtering problem is addressed for systems with uncertain noise variances, multistep random measurement delays, and packet dropouts. The system noise variances are uncertain but have known conservative upper bounds, and a group of Bernoulli-distributed random variables with known probabilities is used to describe the multistep random measurement delays and packet dropouts from sensor to estimator. 
The Hadamard product is used to improve the model transformation method, and the minimax robust time-varying Kalman estimator is then designed. The robustness is proved by matrix elementary transformations, the Geršgorin circle theorem, the Hadamard product theorem, matrix factorization, and the Lyapunov equation method. A sufficient condition under which the time-varying generalized Lyapunov equation has a unique positive semi-definite steady-state solution is given, and the robust steady-state estimator is then designed. The convergence in a realization between the time-varying and steady-state estimators is proved. Two simulation examples show the effectiveness of the proposed results. The proposed methods overcome the limitations of existing methods and provide theoretical support for solving the robust fusion filtering problem of complex NCSs.
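The bounding idea behind the minimax design (design the filter with conservative upper-bound noise variances; the actual error variance then never exceeds the designed bound) can be illustrated with a scalar sketch. This is only an analogy with hypothetical numbers, not the paper's multi-delay, Hadamard-product model:

```python
def steady_gain(a, c, q, r, iters=500):
    """Steady-state Kalman predictor gain for the scalar model
    x_{k+1} = a x_k + w_k, y_k = c x_k + v_k, designed with the assumed
    (conservative) variances q = Var(w), r = Var(v), via Riccati iteration."""
    s = q
    for _ in range(iters):
        s = a * a * s - (a * s * c) ** 2 / (c * c * s + r) + q
    return a * s * c / (c * c * s + r), s

def fixed_gain_error_var(a, c, K, q, r):
    """Steady error variance of the fixed-gain predictor under actual (q, r):
    solves the scalar Lyapunov equation P = (a - K c)^2 P + q + K^2 r."""
    f = a - K * c
    assert abs(f) < 1, "filter must be stable"
    return (q + K * K * r) / (1 - f * f)
```

Designing with upper bounds and running under any actual variances below them keeps the realized error variance under the designed value, which is the scalar analogue of the robustness the paper proves via the generalized Lyapunov equation.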
Short Packet Secure Covert Communication Design and Optimization
TIAN Bo, YANG Weiwei, SHA Li, SHANG Zhihui, CAO Kuo, LIU Changming
Available online  , doi: 10.11999/JEIT250800
Abstract:
  Objective  This paper addresses the dual security threats of eavesdropping and detection in Multiple-Input Single-Output (MISO) communication systems with short packet transmissions. We propose an integrated secure and covert transmission scheme that combines physical layer security with covert communication techniques. The objective is to overcome the limitations of conventional encryption in short packet scenarios, improve communication concealment, and ensure information confidentiality. Our ultimate goal is to maximize the Average Effective Secrecy and Covert Rate (AESCR) via the joint optimization of packet length and transmit power, thereby providing robust security for low-latency Internet of Things (IoT) applications.  Methods  We adopt a MISO system model employing Maximum Ratio Transmission (MRT) beamforming to leverage spatial degrees of freedom for enhancing security. Through rigorous theoretical analysis, we derive closed-form expressions for the warden’s (Willie’s) optimal detection threshold and minimum detection error probability. A statistical covertness constraint based on Kullback-Leibler (KL) divergence is established to transform intractable instantaneous requirements into a manageable average constraint. We introduce a novel performance metric, the AESCR, to comprehensively evaluate system performance in terms of covertness, secrecy, and reliability. The core of our optimization strategy lies in the joint design of packet length and transmit power. By exploiting the inherent coupling between these variables, we reformulate the original dual-variable maximization problem into a tractable form that can be efficiently solved via a one-dimensional search.  Results and Discussions  Simulation results validate the theoretical analyses, demonstrating close agreement between the derived expressions and Monte Carlo simulations for Willie’s detection error probability. 
The results indicate that multi-antenna configurations significantly improve the AESCR by concentrating signal energy toward the legitimate user and reducing eavesdropping risks. Notably, the proposed joint optimization of transmit power and packet length achieves a substantially higher AESCR compared to power-only optimization, particularly under strict covertness constraints. We also identify critical trade-offs: an optimal packet length exists that balances coding gain against exposure risk, and relaxed covertness constraints lead to consistent improvements in AESCR. Furthermore, multi-antenna technology proves essential in mitigating the inherent low-power limitations of covert communication.  Conclusions  This study presents an integrated framework for secure and covert communication in short packet MISO systems, achieving significant performance gains through the joint optimization of transmit power and packet length. The key contributions include: (i) a novel transmission architecture that integrates security and covertness, supported by closed-form solutions for the warden’s detection threshold and error probability under a KL divergence-based constraint; (ii) the introduction of the AESCR metric, which unifies the evaluation of secrecy, covertness, and reliability; and (iii) the formulation and efficient solution of the AESCR maximization problem. Simulations confirm that the proposed joint optimization scheme outperforms power-only optimization, especially under strict covertness conditions. The AESCR increases monotonically with the number of transmit antennas, and an optimal packet length exists to balance transmission efficiency and covertness.
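The reduction of the joint design to a one-dimensional search can be sketched as follows: for each blocklength, pin the transmit power to the KL covertness budget and evaluate a secrecy-rate proxy. This uses the standard normal-approximation short-packet rate and the per-use Gaussian KL term ρ − ln(1+ρ); the objective, channel gains `g_b`, `g_e`, and all numbers are illustrative stand-ins, not the paper's exact AESCR:

```python
import math
from statistics import NormalDist

Qinv = lambda x: NormalDist().inv_cdf(1 - x)   # Q^{-1}(x)

def short_packet_rate(snr, n, eps):
    """Normal-approximation achievable rate (bits per channel use)."""
    C = math.log2(1 + snr)
    V = (1 - 1 / (1 + snr) ** 2) * (math.log2(math.e)) ** 2
    return max(0.0, C - math.sqrt(V / n) * Qinv(eps))

def max_covert_power(n, delta, sigma_w=1.0):
    """Largest P with n*(rho - ln(1+rho)) <= 2*delta^2, rho = P/sigma_w^2
    (a KL-divergence covertness budget), found by bisection."""
    budget = 2 * delta ** 2 / n
    lo, hi = 0.0, 1.0
    while hi - math.log(1 + hi) < budget:       # grow the bracket
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid - math.log(1 + mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo * sigma_w ** 2

def joint_search(g_b, g_e, eps, delta, n_range):
    """1-D search over blocklength; power is pinned to the covertness bound."""
    best = (0.0, None, None)
    for n in n_range:
        P = max_covert_power(n, delta)
        rs = max(0.0, short_packet_rate(g_b * P, n, eps)
                 - math.log2(1 + g_e * P))      # crude secrecy-rate proxy
        if rs > best[0]:
            best = (rs, n, P)
    return best
```

The sketch reproduces the qualitative trade-off reported above: longer packets gain coding efficiency but tighten the per-use covertness budget, so an interior optimum exists.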
A Hybrid Beamforming Algorithm Based on Riemannian Manifold Optimization with Non-Monotonic Line Search
YAN Junrong, SHI Weitao, LI Pei
Available online  , doi: 10.11999/JEIT250396
Abstract:
  Objective  Fully digital beamforming architectures provide high spectral efficiency but demand one Radio-Frequency (RF) chain per antenna element, resulting in substantial cost, power consumption, and hardware complexity. These limitations hinder their practical deployment in large-scale antenna systems. Hybrid beamforming offers a feasible alternative by reducing hardware requirements while retaining much of the performance. In such systems, a reduced number of RF chains drives analog beamforming modules that control massive antenna arrays. Analog phase shifters are energy-efficient and cost-effective but subject to constant-modulus constraints imposed by the hardware, whereas digital processing offers flexible control over amplitude and phase. The central challenge is to approximate the spectral efficiency of fully digital systems while adhering to analog-domain constraints and minimizing energy and hardware demands. To overcome this challenge, this study proposes a novel hybrid beamforming algorithm that integrates Riemannian manifold optimization with a non-monotonic line search strategy (MO-NMLS). This approach achieves improved trade-offs among spectral efficiency, energy consumption, and hardware complexity.  Methods  The proposed methodology proceeds as follows. First, the joint matrix optimization problem for maximizing spectral efficiency in hybrid beamforming is decomposed into separate transmitter and receiver subproblems by formulating an appropriate objective function. This objective is then reformulated using a least squares approach, reducing the dimensionality of the search space from two matrix variables to one. To accommodate the constant-modulus constraints of analog beamforming, the problem is transformed into an unconstrained optimization on a Riemannian manifold. Both the Euclidean and Riemannian gradients of the modified objective function are derived analytically. 
Step sizes are adaptively determined by the non-monotonic line search, which incorporates historical gradient information to compute dynamic step factors. This mechanism guides the search direction while avoiding the convergence to suboptimal local minima that fixed step sizes can cause. Distinct update rules for the step factor are applied depending on whether the iteration count is odd or even. In each iteration, the current objective function value is compared with those from the preceding L iterations to decide whether to accept the new step and iteration point. After updating the step size, tangent vectors are retracted onto the manifold to generate new iterates until the convergence criteria are satisfied. Once the analog precoder is fixed based on the optimized search direction, the corresponding digital precoder is derived in closed form. The dynamic step factor is computed using gradient data from the current and preceding L iterations, allowing the objective function to be non-strictly monotonic within bounded ranges. This adaptive strategy yields faster convergence than conventional fixed-step methods.  Results and Discussions  The relationship between internal iteration count and Signal-to-Noise Ratio (SNR) for different beamforming algorithms is shown in Fig. 4. The MO-NMLS algorithm requires significantly fewer iterations than the conventional Conjugate Gradient (CG) method under both fully connected and overlapping subarray architectures. This improved efficiency arises from the use of Riemannian manifold optimization, which inherently satisfies the constant-modulus constraints without requiring computationally intensive Hessian matrix evaluations. Runtime performance is benchmarked in Fig. 5. The MO-NMLS algorithm reduces runtime by 75.3% relative to CG in the fully connected structure and by 79.2% in the overlapping subarray structure. 
Additionally, MO-NMLS achieves a further 21.1% reduction in runtime under the overlapping subarray architecture compared with the fully connected one, owing to simplified hardware requirements. Spectral efficiency as a function of SNR is presented in Fig. 6. In fully connected systems, MO-NMLS achieves a 0.64% improvement in spectral efficiency over CG while maintaining comparable stability in overlapping subarray architectures. This performance gain stems from the algorithm’s ability to avoid local optima, a key limitation of Orthogonal Matching Pursuit (OMP), which selects paths based solely on residual correlation. The scalability of MO-NMLS with respect to the number of antennas and data streams is demonstrated in Fig. 7. In fully connected systems, MO-NMLS outperforms CG by 1.94%, 2.16%, and 2.74% in spectral efficiency at antenna and data stream configurations of (32, 2), (64, 4), and (128, 8), respectively. While spectral efficiency increases across all algorithms as system scale grows, MO-NMLS exhibits the most substantial gains at higher scales. Energy efficiency improvements under the overlapping subarray architecture are shown in Fig. 8. Compared with the fully connected configuration, MO-NMLS yields energy efficiency gains of 1.2%, 10.9%, and 25.9% at subarray offsets of 1, 8, and 16, respectively. These improvements are attributed to the reduced number of required phase shifters and power amplifiers, which decreases total system power consumption as the subarray offset increases.  Conclusions  The proposed MO-NMLS algorithm achieves an effective balance among spectral efficiency, hardware complexity, and energy consumption in hybrid beamforming systems, while substantially reducing computational runtime. Moreover, the overlapping subarray architecture attains spectral efficiency comparable to that of fully connected systems, with significantly lower execution times. 
These results highlight the practical advantages of the proposed approach for large-scale antenna systems operating under resource constraints.
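The core iteration, Riemannian gradient descent on the constant-modulus (complex circle) manifold with a non-monotone last-L acceptance rule, can be sketched on a least-squares stand-in objective. The function name `ncm_min`, the Grippo-style acceptance test, and all parameters are illustrative; the paper's objective and odd/even step-factor updates are more elaborate:

```python
import numpy as np

def ncm_min(A, b, L=5, max_iter=200, c=1e-4, seed=0):
    """Minimize f(x) = ||Ax - b||^2 over the complex-circle manifold
    {x : |x_i| = 1} (the analog-precoder constraint) by Riemannian
    gradient descent with a non-monotone (last-L) Armijo line search."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.exp(1j * rng.uniform(0, 2 * np.pi, n))
    f = lambda z: np.linalg.norm(A @ z - b) ** 2
    hist = [f(x)]
    for _ in range(max_iter):
        eg = 2 * A.conj().T @ (A @ x - b)            # Euclidean gradient
        rg = eg - np.real(eg * np.conj(x)) * x       # project to tangent space
        g2 = np.linalg.norm(rg) ** 2
        if g2 < 1e-12:
            break
        ref = max(hist[-L:])                         # non-monotone reference
        t = 1.0
        while f((x - t * rg) / np.abs(x - t * rg)) > ref - c * t * g2 and t > 1e-12:
            t *= 0.5                                 # backtrack
        y = x - t * rg
        x = y / np.abs(y)                            # retraction onto manifold
        hist.append(f(x))
    return x, hist
```

Accepting any step below the maximum of the last L values (rather than the last value alone) is what lets the objective rise locally while still converging, which is the non-strict monotonicity described above.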
Flexible Network Modal Packet Processing Pipeline Construction Mechanism for Cloud-Network Convergence Environment
ZHU Jun, XU Qi, ZHANG Fujun, WANG Yongjie, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250806
Abstract:
  Objective  With the deep integration of information network technologies and vertical application fields, the demand for cloud-network convergence infrastructure has become increasingly prominent, and the boundaries between cloud computing and network technologies are becoming more blurred. The development of cloud-network convergence technologies has given rise to diverse network service requirements, further posing new challenges for the flexible processing of multi-modal network packets. The device-level network modal packet processing flexible pipeline construction mechanism is key to realizing an integrated environment that supports a variety of network technologies. This mechanism constructs a protocol packet processing flexible pipeline architecture that, based on different network modals and service demands, customizes a series of protocol packet processing operations, including packet parsing, packet editing, and packet forwarding, thus improving the adaptability of networks in cloud-network convergence environments. This flexible design allows the pipeline processing flow to be adjusted according to actual service demands, meeting the functional and performance requirements of different network transmission scenarios.  Methods  The construction of a device-level flexible pipeline faces two major challenges: (1) how to flexibly process diverse network modal packet protocols based on polymorphic network element devices, requiring coordination of various heterogeneous resources to quickly identify, parse, and correctly handle network modal packets in various formats; (2) how to ensure that the pipeline construction is flexible, providing a mechanism to dynamically generate and configure pipeline structures. This mechanism should not only adjust the number of stages in the pipeline but also allow customization of the specific functions of each stage. 
To address these challenges, this paper proposes a polymorphic network element abstraction model that integrates heterogeneous resources. At the device-level hardware architecture layer, it employs a hyper-converged approach that pairs high-performance switching ASIC chips with more programmable, though less performant, FPGA and CPU devices. Through the synergy of hardware and software, it meets the flexibility demands for unified support of custom network protocols. On the basis of the network element abstraction model, a protocol packet flexible processing compilation mechanism is further designed, constructing a flexible pipeline architecture that supports customizable configurations to accommodate different network service transmission requirements. The compilation mechanism adopts a three-stage front-end, mid-end, and back-end architecture. At the same time, to address the adaptation between differentiated network modal demands and heterogeneous resources, a flexible pipeline technology based on Intermediate Representation (IR) slicing is proposed. This approach precisely decomposes and reconstructs the integrated IR of multiple network modals into several IR subsets according to specific optimization methods while preserving the original functionality and semantics, thus enabling flexible customization of the network modal processing pipeline through the collaborative handling of heterogeneous resources. Utilizing an IR slicing algorithm, this mechanism decomposes and maps the hybrid processing logic of multiple network modals onto heterogeneous hardware resources such as ASICs, FPGAs, and CPUs, thereby constructing a custom-configurable flexible pipeline that adapts to various network service transmission requirements.  Results and Discussions  To demonstrate the construction effect of the flexible pipeline, this paper introduces a prototype verification system for polymorphic network elements. As shown in Fig. 
6, the system is equipped with Centec CTC8180 switch chips, multiple domestic FPGA chips, and domestic multi-core CPU chips. On this polymorphic network element prototype verification system, protocol processing pipelines for IPv4, GEO, and MF network modals were constructed, compiled, and deployed. As shown in Fig. 7, actual packet capture tests have verified that different network modals use different packet processing pipelines. To validate the core mechanism of network modal flexible pipeline construction, we compared the IR code size before and after slicing under the three network modals and network modal allocation strategies in Section 6.2. The integrated P4 code for the three network modals, after front-end compilation, produced an unsliced intermediate code of 32,717 lines. According to the modal allocation scheme, slicing was performed during the middle-end compilation stage, resulting in IR slices for ASIC, CPU, and FPGA with code sizes of 23,164, 23,282, and 22,772 lines, respectively. Finally, the performance of multi-modal protocol packet processing was evaluated, focusing on the impact of different traffic allocation schemes on the network modal protocol packet processing performance. According to the experimental results in Fig. 9, it can be observed that the average packet processing delay for Scheme 1 is significantly higher than the other schemes, reaching 4.237 milliseconds. In contrast, the average forwarding processing delay for Schemes 2, 3, and 4 decreased to 54.16 microseconds, 32.63 microseconds, and 15.48 microseconds, respectively. This shows that with changes in the traffic allocation strategy, especially the adjustment of CPU resources for GEO and MF modals, network packet processing bottlenecks can be effectively reduced, thereby significantly improving multi-modal network communication efficiency.  
Conclusions  Experimental evaluations confirm the superiority of the proposed flexible pipeline in terms of construction effects and functional fulfillment. The results show that the proposed method can effectively address complex network environments and diverse service demands, demonstrating strong performance. Future work will further optimize this architecture and expand its applicability, aiming to provide more powerful and flexible technical support for network packet processing in hyper-converged cloud-network environments.
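The order-preserving slicing idea can be illustrated with a toy partitioner: split a linear IR into per-target slices under a modal-to-resource assignment, such that merging the slices back in program order recovers the original semantics. The op names and assignment below are hypothetical; the real mechanism slices compiled P4 IR with semantic checks:

```python
def slice_ir(ops, assign):
    """Partition a linear IR (list of (op_name, modal) pairs) into
    per-target slices under a modal -> resource assignment, preserving
    the original program order inside every slice."""
    slices = {}
    for idx, (op, modal) in enumerate(ops):
        slices.setdefault(assign[modal], []).append((idx, op, modal))
    return slices

def merge_slices(slices):
    """Reassemble the sliced IR in original order (a semantics check)."""
    flat = [item for part in slices.values() for item in part]
    return [(op, modal) for _, op, modal in sorted(flat)]
```

The invariant checked by `merge_slices` (that the union of slices reconstructs the integrated IR) mirrors the requirement stated above that slicing must preserve the original functionality and semantics.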
An Overview of Integrated Sensing and Communication for the Low-Altitude Economy
ZHU Zhengyu, WEN Xinping, LI Xingwang, WEI Zhiqing, ZHANG Peichang, LIU Fan, FENG Zhiyong
Available online  , doi: 10.11999/JEIT250747
Abstract:
With the development of the Low-Altitude Internet of Things (IoT), the low-altitude economy has gradually become a national strategic emerging industry. Integrated Sensing and Communication (ISAC) for the low-altitude economy can perform more complex tasks in more complex environments, laying the foundation for improving the security, flexibility, and application scenarios of drones. This paper provides an overview of ISAC for the low-altitude economy. Firstly, it summarizes the theoretical foundations of ISAC and the low-altitude economy, and discusses the advantages of ISAC for the low-altitude economy. Then, it investigates the potential applications of key 6G technologies, such as covert communication and millimeter-wave (mmWave) communication, in ISAC for the low-altitude economy. Finally, the key future technical challenges of ISAC for the low-altitude economy are summarized.  Significance   The integration of UAVs with ISAC technology will offer considerable advantages in future developments. By implementing ISAC, the overall system payload can be minimized, greatly enhancing UAV maneuverability and operational freedom. This integration provides strong technical support for versatile UAV applications. Equipped with ISAC, low-altitude network systems can perform increasingly complex tasks in challenging environments. Unlike single-function UAV platforms, those incorporating ISAC benefit from synergistic improvements in both communication and sensing capabilities. As a result, ISAC-enabled drones are expected to see expanded use in fields such as aerial photography, agriculture, surveying, remote sensing, and telecommunications. This growth will further accelerate the advancement of relevant theoretical and technical frameworks while broadening the scope of ISAC applications.  
Progress   ISAC networks for the low-altitude economy provide efficient and flexible solutions for applications such as military reconnaissance, emergency disaster relief, and smart city management. However, the open aerial environment and dynamic deployment requirements introduce multiple challenges, including vulnerability to hostile interception due to limited stealth, signal obstruction in complex terrains, and the need for high bandwidth and low latency. In response, both the academic and industrial communities have been actively investigating technologies such as covert communication, intelligent reflecting surfaces, and millimeter-wave communication to enhance the reliability and intelligence of ISAC in low-altitude operational scenarios.  Conclusions  This paper provides a systematic overview of the current applications, critical technologies, and ongoing challenges associated with ISAC in low-altitude environments. It examines the synergistic integration of emerging 6G technologies, including covert communication, Reconfigurable Intelligent Surfaces (RIS), and millimeter-wave communications, within ISAC frameworks. In response to the highly dynamic and complex nature of low-altitude operations, the study also summarizes recent advances in UAV swarm power control algorithms and covert trajectory optimization based on deep reinforcement learning. Furthermore, it highlights key unresolved challenges such as spatiotemporal synchronization, multi-UAV resource allocation, and privacy preservation, offering valuable directions for future research.  Prospects   ISAC technology provides highly precise and reliable support for applications such as drone logistics, urban air mobility, and large-scale environmental monitoring in the low-altitude economy. Nevertheless, the large-scale deployment of ISAC systems in complex and dynamic low-altitude environments remains challenging. 
Key obstacles include: suboptimal coordination and resource allocation in UAV swarms, spatiotemporal synchronization among heterogeneous devices, conflicting objectives between sensing and communication functions, as well as growing concerns over privacy and security in open airspace. These challenges represent major impediments to the high-quality development of the low-altitude economy.
Coalition Formation Game based User and Networking Method for Status Update Satellite Internet of Things
GAO Zhixiang, LIU Aijun, HAN Chen, ZHANG Senbai, LIN Xin
Available online  , doi: 10.11999/JEIT250838
Abstract:
  Objective  Satellite communication has become a key focus in the development of next-generation wireless networks, thanks to advantages such as wide coverage, long communication distance, and high networking flexibility. Short packet communication is an important scenario in the Satellite Internet of Things (S-IoT). However, research on the status update problem for massive users remains insufficient. On one hand, user networking schemes must be designed reasonably to resolve the contradiction between the massive access demands of users and the limited communication resources. On the other hand, in the face of access demands from massive users, how to design user networking schemes with low complexity is a problem worthy of research. This paper provides a solution for status updates in the S-IoT with dynamic orthogonal access for massive users.  Methods  In the S-IoT, a status update model for orthogonal dual-layer user access is established. A dual-layer networking scheme is proposed in which users dynamically allocate bandwidth to access the base station, and the base station adopts time-slot polling to access the satellite. The closed-form expression of the average Age of Information (aAoI) of users is derived using short packet theory, and a simplified approximate expression is also derived under high signal-to-noise ratio conditions. Subsequently, based on the coalition formation game, a distributed dual-layer coalition formation game user-base station-satellite networking (DCFGUSSN) algorithm is proposed.  Results and Discussions  The approximate aAoI expression effectively reduces the computational complexity, and the exact potential game is used to prove that the proposed DCFGUSSN algorithm forms a stable network. The simulation results verify the correctness of the theoretical analysis of the user aAoI in the proposed status update model (Fig. 5). 
The proposed DCFGUSSN algorithm shows that as the number of iterations increases, the aAoI of the users gradually decreases and eventually converges (Fig. 6). Compared with other access schemes, the proposed dual-layer access scheme achieves a lower aAoI (Fig. 7, Fig. 8, Fig. 9).  Conclusions  This paper investigates the massive-user networking problem with the help of base stations in the status update S-IoT. Firstly, a dynamic dual-layer user access framework and the corresponding status update model are established. Then, the DCFGUSSN algorithm is proposed to reduce the aAoI of users. Finally, the theoretical value of the aAoI coincides closely with the simulated value, and the proposed algorithm achieves a significant performance improvement over traditional algorithms.
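As a sanity check on the aAoI notion, consider a toy single-link slotted model in which each slot's update is delivered with probability p and the age resets to 1 on delivery; its time-average AoI works out to 1/p (for geometric inter-delivery times T, the cycle average is (E[T²]+E[T])/(2E[T]) = 1/p). This is only an illustration of the metric, not the paper's dual-layer closed form:

```python
import random

def average_aoi(p, slots=200_000, seed=7):
    """Time-average Age of Information for a slotted status-update link:
    each slot succeeds with probability p; on success the age resets to 1,
    otherwise it grows by 1."""
    rng = random.Random(seed)
    age, total = 1, 0
    for _ in range(slots):
        total += age
        age = 1 if rng.random() < p else age + 1
    return total / slots
```

For p = 0.3 the simulation hovers around 1/0.3 ≈ 3.33, matching the closed form; the derived aAoI expressions in the paper play the analogous role of replacing long simulations with a formula.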
Belief Propagation-Ordered Statistics Decoding Algorithm with Parameterized List Structures
LIANG Jifan, WANG Qianfan, SONG Linqi, LI Lvzhou, MA Xiao
Available online  , doi: 10.11999/JEIT250552
Abstract:
  Objective  Traditional Belief Propagation–Ordered Statistics Decoding (BP-OSD) algorithms for quantum error-correcting codes often rely on a single normalization factor (\begin{document}$ \alpha $\end{document}) in the Belief Propagation (BP) stage, which restricts the search space and limits decoding performance. An enhanced BP-OSD algorithm is presented to address this limitation by employing a list of candidate \begin{document}$ \alpha $\end{document} values. The central idea is to perform BP decoding iteratively for multiple \begin{document}$ \alpha $\end{document} values, with the resulting posterior probabilities post-processed by Ordered Statistics Decoding (OSD). To balance performance gains with computational tractability, the multi-\begin{document}$ \alpha $\end{document} BP-OSD process is embedded within a two-stage framework: the more computationally intensive parameter-listed decoding is activated only when an initial BP decoding with a fixed \begin{document}$ {\alpha }_{0} $\end{document} fails. This design broadens the parameter search to improve decoding performance, while conditional activation ensures that computational complexity remains manageable, particularly at low physical error rates.  Methods  The proposed enhanced BP-OSD algorithm (Algorithm 1) introduces a two-stage decoding process. In the first stage, decoding is attempted using standard BP with a single predetermined normalization factor (\begin{document}$ {\alpha }_{0} $\end{document}), providing a computationally efficient baseline. If this attempt fails to produce a valid syndrome match, the second stage is activated. In the second stage, parameter listing is employed: BP decoding is executed independently across a predefined list of \begin{document}$ L $\end{document} distinct normalization factors \begin{document}$ \left\{{\alpha }_{1},{\alpha }_{2}, \cdots,{\alpha }_{L}\right\} $\end{document}. 
Each run generates a set of posterior probabilities corresponding to a different BP operational point. These posterior probabilities are then individually post-processed by an OSD module, forming a pool of candidate error patterns. The final decoded output is selected from this pool according to the maximum likelihood criterion, or the minimum Hamming weight criterion under a depolarizing channel. Complexity analysis shows that this conditional two-stage design ensures that the average computational cost remains comparable to that of standard BP decoding, particularly at low physical error rates where the first stage frequently succeeds.  Results and Discussions  The effectiveness of the proposed algorithm is evaluated through Monte Carlo simulations on both Surface codes ⟦\begin{document}$ 2{d}^{2}-2d+1,1,d $\end{document}⟧ and Quantum Low-Density Parity-Check (QLDPC) codes \begin{document}$ \left[\kern-0.15em\left[ {882,24} \right]\kern-0.15em\right] $\end{document} under a depolarizing channel. For Surface codes, the enhanced BP-OSD algorithm achieves a substantially lower logical error rate compared with both the Minimum-Weight Perfect Matching (MWPM) algorithm and the original BP algorithm (Fig. 4(a)). The error threshold is improved from approximately \begin{document}$ 15.5\% $\end{document} (MWPM) to about \begin{document}$ 18.3\% $\end{document} with the proposed method. The average decoding time comparison in Fig. 4(b) demonstrates that, particularly at low physical error rates, the proposed algorithm maintains a decoding speed comparable to the original BP algorithm. This efficiency results from the two-stage design, in which the more computationally intensive parameter-listed search is activated only when required. For QLDPC codes (Fig. 
5(a)), the proposed algorithm outperforms both the original BP and BP-OSD algorithms in terms of logical error rate, even when a smaller OSD candidate list per \begin{document}$\alpha$\end{document} value is employed. As shown in Table 3, increasing the parameter list size L (e.g., \begin{document}$ L=4,8,16 $\end{document}) improves decoding performance, although the gains diminish as L grows. This observation supports the choice of L = 16 as an effective balance between performance and complexity. Furthermore, the activation probability of the second stage (Table 2) decreases rapidly as the physical error rate declines, confirming the efficiency of the two-stage framework.  Conclusions  An enhanced BP-OSD algorithm for quantum error-correcting codes is presented, featuring a parameter-listing strategy for the normalization factor (\begin{document}$ \alpha $\end{document}) in the BP stage. Unlike conventional approaches that rely on a single \begin{document}$ \alpha $\end{document}, the proposed method explores multiple \begin{document}$ \alpha $\end{document} values, with the resulting posterior probabilities processed by the OSD module to select the most likely output. This systematic expansion of the search space improves decoding performance. To control computational overhead, a two-stage decoding mechanism is employed: the parameter-listed BP-OSD is activated only when an initial BP decoding with a fixed \begin{document}$ {\alpha }_{0} $\end{document} fails. Complexity analysis, supported by numerical simulations, shows that the average computational cost of the proposed algorithm remains comparable to that of standard BP decoding in low physical error rate regimes. Monte Carlo simulations further demonstrate its efficacy. For Surface codes, the enhanced BP-OSD achieves lower logical error rates than the MWPM algorithm and raises the error threshold from approximately 15.5% to 18.3%. 
For QLDPC codes, it exceeds both the original BP and BP-OSD algorithms in logical error rate performance, even with a reduced OSD candidate list size in the second stage. Overall, the proposed algorithm provides a promising pathway toward high-performance, high-threshold quantum error correction by balancing decoding power with operational efficiency, highlighting its potential for practical applications.
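The two-stage control flow can be written down independently of any particular decoder; in this sketch `bp_decode` and `osd_postprocess` are injected callables (stand-ins, not a real BP or OSD implementation), and candidate selection uses the minimum Hamming weight rule described for the depolarizing channel:

```python
def two_stage_bp_osd(syndrome, bp_decode, osd_postprocess, alpha0, alpha_list):
    """Two-stage skeleton: one cheap BP attempt with alpha0 first; only on
    failure, rerun BP over the alpha list, let OSD post-process each
    posterior, and keep the minimum-weight candidate."""
    err, posterior, converged = bp_decode(syndrome, alpha0)
    if converged:
        return err                        # stage 1: taken most of the time
    pool = []
    for alpha in alpha_list:              # stage 2: parameter-listed search
        _, posterior, _ = bp_decode(syndrome, alpha)
        pool.append(osd_postprocess(posterior, syndrome))
    return min(pool, key=sum)             # minimum Hamming weight
```

Because stage 2 only runs when stage 1 fails, the average cost stays near a single BP pass at low physical error rates, which is exactly the behaviour reported in Table 2.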
Dynamic Target Localization Method Based on Optical Quantum Transmission Distance Matrix Construction
ZHOU Mu, WANG Min, CAO Jingyang, HE Wei
Available online  , doi: 10.11999/JEIT250020
Abstract:
  Objective  In recent years, the integration of quantum mechanics with information science and computer science has sparked a surge in quantum information research. Based on principles such as quantum superposition and quantum entanglement, quantum information technology holds the potential to overcome the limitations of traditional technologies and solve problems that classical information technologies and conventional computers cannot address. As a key technology, space-based quantum information technology has rapidly developed, opening new avenues to break through the performance bottlenecks of traditional positioning systems. However, existing quantum positioning methods primarily target stationary objects, struggling to address the challenges posed by dynamic changes in the transmission channels of entangled photon pairs due to particles, scatterers, and noise photons in the environment. This leads to difficulties in detecting moving targets and increased positioning errors caused by reduced data acquisition at fixed locations due to target movement. Traditional wireless signal-based target localization techniques also face numerous challenges in dynamic target localization, including signal attenuation, multipath effects, and noise interference in complex environments. To overcome these issues, this paper proposes a dynamic target localization method based on the construction of an optical quantum transmission distance matrix. The method aims to achieve high-precision and high-robustness dynamic target localization, particularly addressing the demands of moving target localization in practical application scenarios. This method not only provides centimeter-level positioning accuracy but also significantly enhances the adaptability and stability of the system for moving targets, offering strong support for the future practical application of quantum dynamic localization technology.  
  Methods  To improve the accuracy of the dynamic target localization system, this paper first proposes a dynamic threshold optical quantum detection model based on background noise estimation, utilizing the characteristics of optical quantum echo signals. A dynamic target localization optical path is established, where two entangled optical signals are generated through the Spontaneous Parametric Down-Conversion (SPDC) process: one is retained as the reference signal at a local Single-Photon Detector (SPD), and the other is transmitted to the moving target position as the signal light. The optical quantum echo signals are then analyzed, and the background noise is estimated using the coincidence counting algorithm. The detection threshold is dynamically adjusted and compared with the signals from the detection unit, enabling rapid dynamic target detection. To better adapt to the variations in quantum echo signals caused by target movement, an adaptive optical quantum grouping method based on velocity measurement is introduced. The time pulse sequence is initially grouped coarsely, and the rough velocity of the target is calculated. The grouping size is then adjusted based on the target’s speed, updating the time grouping sequence and further optimizing the distance measurement accuracy, resulting in an updated velocity matrix. The photon transmission distance matrix is optimized using the relative velocity error matrix. By constructing a system of equations involving the light source position coordinates, the optical quantum transmission distance matrix, and the dynamic target coordinate sequence, the position of the dynamic target is estimated using the least squares method. This approach not only improves the localization accuracy but also effectively eliminates errors caused by target movement.  Results and Discussions  The effectiveness of the proposed method is validated through both simulations and the construction of a practical measurement platform. 
Experimental results show that the dynamic threshold detection method based on background noise estimation, as proposed in this paper, exhibits high-sensitivity detection performance (Fig. 7). When a moving target enters the detection range, rapid detection is achieved, enabling subsequent dynamic target localization. The adaptive grouping method based on velocity measurement significantly improves the performance of the quantum dynamic target localization system. By using grouped coincidence counting, the issue of blurred coincidence counting peaks due to target movement is addressed (Fig. 8), achieving high-precision velocity measurement (Table 1) and reducing localization errors caused by target movement. Centimeter-level positioning accuracy is achieved (Fig. 9). The study also constructs an entangled optical quantum experimental platform, emphasizing the measurement results at different velocities and positioning outcomes under various methods, further confirming the reliability and adaptability of the proposed method in enhancing distance measurement accuracy (Fig. 11).  Conclusions  This paper proposes a novel method for dynamic target localization in the field of entangled optical quantum dynamics, based on the construction of an optical quantum transmission distance matrix. The method enhances distance measurement accuracy and optimizes the overall positioning accuracy of the localization system through the use of a background noise estimation-based dynamic threshold detection method and a velocity measurement-based adaptive grouping method. By combining the optical quantum transmission distance matrix with the least squares optimization method, the approach provides a promising path for more precise quantum localization systems and demonstrates potential for real-time dynamic target tracking applications. 
This method not only improves the accuracy of dynamic target quantum localization systems but also expands the application potential of quantum localization technology in complex environments. In the future, it is expected to provide strong support for real-time applications of quantum dynamic target localization systems and have wide applications in areas such as intelligent health monitoring, the Internet of Things, and autonomous driving.
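The closing step of the Methods above, estimating the target position from distance measurements by least squares, can be illustrated with a generic 2D multilateration sketch. This is a simplified stand-in: the function name, the planar geometry, and the reference-anchor linearization are illustrative assumptions, and the paper's formulation additionally weights the distance matrix by a relative velocity error matrix.

```python
def trilaterate_lsq(anchors, dists):
    """Estimate a 2D position from distances to known anchor points.

    Linearizes ||x - a_i||^2 = d_i^2 against the last anchor as reference,
    then solves the resulting two-variable normal equations in closed form.
    """
    (rx, ry), dr = anchors[-1], dists[-1]
    rows, rhs = [], []
    for (ax, ay), d in zip(anchors[:-1], dists[:-1]):
        # 2*(a_i - a_ref) . x = d_ref^2 - d_i^2 + ||a_i||^2 - ||a_ref||^2
        rows.append((2 * (ax - rx), 2 * (ay - ry)))
        rhs.append(dr * dr - d * d + ax * ax + ay * ay - rx * rx - ry * ry)
    # Normal equations A^T A x = A^T b, solved directly for the 2x2 case
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    t1 = sum(r[0] * b for r, b in zip(rows, rhs))
    t2 = sum(r[1] * b for r, b in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)
```

With exact distances the linearized system is consistent, so the estimate is exact; with perturbed distances the same normal equations give the least-squares fit.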
A Polymorphic Network Backend Compiler for Domestic Switching Chips
TU Huaqing, WANG Yuanhong, XU Qi, ZHU Jun, ZOU Tao, LONG Keping
Available online  , doi: 10.11999/JEIT250132
Abstract:
  Objective  The P4 language and programmable switching chips provide a feasible solution for the deployment of polymorphic networks. However, polymorphic network packets written in P4 cannot be directly executed on the domestically produced TsingMa.MX programmable switching chip from Centec, necessitating the design of a specialized compiler to translate and deploy P4 programs onto this chip. Existing backend compilers are primarily designed and optimized for software-programmable switches such as BMv2, FPGAs, and Intel Tofino series chips, making them unsuitable for compiling polymorphic network programs on the TsingMa.MX chip. To address this issue, this paper proposes p4c-TsingMa, a backend compiler tailored for the TsingMa.MX switching chip. This compiler enables the translation of high-level network programming languages into executable formats for the TsingMa.MX chip, allowing it to simultaneously support the parsing and forwarding of multiple types of network modal packets.  Methods  p4c-TsingMa first employs a preorder traversal method to extract key information such as protocol types, protocol fields, and actions from the Intermediate Representation (IR). It then performs instruction translation, ultimately generating control commands for the TsingMa.MX chip. Additionally, p4c-TsingMa adopts a UDF entry merging method to consolidate matching instructions from different network modalities into a single lookup table, enabling the extraction of multiple modal matching entries in one operation and significantly improving the utilization of chip resources.  Results and Discussions  This paper implements the p4c-TsingMa compiler using C++, which maps network modal programs written in P4 language into chip configurations for the TsingMa.MX switching chip. A polymorphic network packet testing environment (Fig. 7) is established, where multiple types of network data packets are simultaneously sent to the same port of the switch. 
The chip, following the flow table configuration, successfully identifies polymorphic network data packets and forwards them to their corresponding ports (Fig. 9). Additionally, the table entry merging algorithm improves register resource utilization by 37.5% to 75%, enabling the chip to process more than two types of modal data packets in parallel.  Conclusions  This paper designs a polymorphic network backend compiler, p4c-TsingMa, specifically for a domestic switching chip. By leveraging the FlexParser and FlexEdit capabilities of the TsingMa chip, the compiler translates polymorphic network programs into TsingMa.MX chip commands, enabling the chip to parse and edit polymorphic data packets. Experimental results demonstrate that p4c-TsingMa achieves high compilation efficiency and improves register resource utilization by 37.5% to 75%.
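The IR-walking step described in the compiler's Methods can be sketched with a toy preorder traversal. The miniature tuple-based tree and the node names here are purely hypothetical stand-ins for the P4 IR that p4c-TsingMa consumes.

```python
def preorder_extract(node, out=None):
    """Collect protocol headers, fields, and actions in preorder
    (each node is visited before its children)."""
    if out is None:
        out = {"headers": [], "fields": [], "actions": []}
    kind, name, children = node
    if kind == "header":
        out["headers"].append(name)
    elif kind == "field":
        out["fields"].append(name)
    elif kind == "action":
        out["actions"].append(name)
    for child in children:          # recurse into children, left to right
        preorder_extract(child, out)
    return out

# Hypothetical miniature IR: (kind, name, children) tuples
ir = ("program", "main", [
    ("header", "ethernet", [("field", "dstAddr", []),
                            ("field", "etherType", [])]),
    ("header", "ipv4", [("field", "ttl", [])]),
    ("action", "forward", []),
])
```

Preorder visitation guarantees that a header is recorded before any of its fields, which mirrors the order in which match/translation instructions would be emitted.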
Finite-time Adaptive Sliding Mode Control of Servo Motors Considering Frictional Nonlinearity and Unknown Loads
TIANYU Zhang, QINXIA Guo, TINGKAI Yang, XIANGJI Guo, MING Ming
Available online  , doi: 10.11999/JEIT250521
Abstract:
  Objective  Ultra-fast laser processing with an infinite field of view demands exceptional tracking accuracy and robustness from servo motor systems. However, these systems are highly nonlinear and subject to coupled unknown load disturbances and complex friction, which limit the performance of existing controllers. While Sliding Mode Control (SMC) is inherently robust, conventional SMC schemes and observers struggle to achieve accurate, finite-time disturbance compensation under such nonlinearities, hindering fast, high-precision trajectory tracking. To overcome this bottleneck, this study proposes a novel finite-time adaptive SMC strategy that ensures rapid and precise angular position tracking within a finite time, meeting the stringent synchronization requirements of advanced laser processing.  Methods  In this paper, we propose a novel strategy that combines an adaptive disturbance observer fused with a Radial Basis Function Neural Network (RBFNN) and finite-time sliding mode control. First, the unknown load disturbance and the complex friction nonlinear dynamics in the system are innovatively integrated into a "lumped disturbance" term, which significantly enhances the universality of the model and the ability to describe the actual complex working conditions. Second, a finite-time adaptive disturbance observer is designed for this lumped disturbance. The core of the observer is to make full use of the universal approximation property of the RBF neural network to learn and approximate the dynamic characteristics of unknown disturbances online. At the same time, a finite-time adaptive law based on the form of the error norm is introduced to adjust the weights of the neural network in real time, which ensures that the observer can estimate the lumped disturbance quickly and accurately in finite time, and effectively reduces the dependence on accurate model parameters. Based on this, a finite-time sliding mode controller is designed. 
The controller uses the accurate disturbance estimation of the observer output as the feedforward compensation term, combines a carefully designed finite-time sliding mode surface and an equivalent control law, and introduces a saturation function to effectively suppress control input chattering. By constructing a suitable Lyapunov function and strictly applying finite-time stability theory, the practical finite-time convergence of the proposed adaptive observer and the closed-loop control system is rigorously proved, which ensures that the tracking error of the system can converge to a bounded neighborhood near the origin in finite time.  Results and Discussions  In order to verify the effectiveness and superiority of the proposed control strategy, a typical permanent magnet synchronous motor servo system model is built in the MATLAB environment, and a simulation scene containing desired trajectories of different frequencies is set up, which is comprehensively compared with the widely used PI control and the advanced method in reference [7]. Simulation results show that: 1. Tracking performance: Under a variety of reference trajectories, the controller designed in this paper can drive the system to accurately track the target trajectory, and its tracking error is significantly smaller than that of PI control. Compared with the method in reference [7], it also shows better smoothness and smaller residual error, which effectively avoids the obvious chattering phenomenon of the latter in some working conditions. 2. Disturbance rejection and robustness: The designed adaptive disturbance observer based on RBFNN can quickly and effectively learn and compensate the lumped disturbance composed of unknown load variation and friction nonlinearity. In the presence of such disturbances, the proposed controller can still maintain high-accuracy tracking performance, which proves its strong disturbance rejection capability and robustness to changes in system parameters. 3. 
Control input characteristics: Compared with the comparison methods, the control signal in this paper quickly becomes smooth after the initial stage, effectively suppressing the chattering caused by high-frequency switching, and the amplitude variation range of the control signal is reasonable, which is more conducive to application in practical actuators. 4. Comprehensive evaluation: Through a comprehensive comparison of multiple error performance indicators, namely the Integral Squared Error (ISE), Integral Absolute Error (IAE), Integral Time-weighted Absolute Error (ITAE), and Integral Time-weighted Squared Error (ITSE), the proposed controller is significantly better than PI control and the method in reference [7] on all indicators. This shows that the proposed method has comprehensive advantages in rapid suppression of transient error, reduction of overall error accumulation, improvement of long-term steady-state accuracy, and balance of response speed and noise suppression. 5. Observer performance: The RBFNN weight norm estimate quickly converges and stabilizes at a low level after the initial rapid adjustment, which verifies the effectiveness of the adaptation law and the efficiency of observer learning.  Conclusions  This paper proposes a finite-time sliding mode control strategy with an adaptive disturbance observer for servo systems in ultra-fast laser processing. The approach models unknown load and friction nonlinearities as lumped disturbances. An adaptive observer, combining an RBF neural network with a finite-time mechanism, estimates these disturbances accurately for online compensation. A finite-time SMC law is designed based on the observer, with theoretical proof of the closed-loop system's practical finite-time stability. Simulations on a permanent magnet synchronous motor platform demonstrate superior tracking performance, robustness, and control smoothness over traditional PI and existing advanced methods. 
This work provides an effective solution for high-precision control of nonlinear systems under strong disturbances.
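The observer's core idea, online approximation of the lumped disturbance by an RBF network whose weights follow an error-driven adaptive law, can be sketched as a toy one-dimensional approximator. The class name, Gaussian kernel layout, and plain gradient update below are illustrative assumptions, not the paper's exact finite-time adaptive law.

```python
import math

class OnlineRBF:
    """Minimal Gaussian-RBF approximator with an error-driven weight
    update -- a toy stand-in for an adaptive disturbance observer."""

    def __init__(self, centers, width, rate):
        self.centers = centers
        self.width = width          # Gaussian kernel width
        self.rate = rate            # adaptation gain (learning rate)
        self.weights = [0.0] * len(centers)

    def _phi(self, x):
        # Gaussian basis responses at input x
        return [math.exp(-((x - c) / self.width) ** 2) for c in self.centers]

    def predict(self, x):
        return sum(w * p for w, p in zip(self.weights, self._phi(x)))

    def update(self, x, target):
        # Move each weight along the approximation error, scaled by its basis
        err = target - self.predict(x)
        for i, p in enumerate(self._phi(x)):
            self.weights[i] += self.rate * err * p
        return err
```

Repeatedly feeding (state, measured-disturbance) pairs drives the weights toward a least-squares fit, which is the universal-approximation property the abstract relies on.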
A Test-Time Adaptive Method for Nighttime Image-Aided Beam Prediction
SUN Kunyang, YAO Rui, ZHU Hancheng, ZHAO Jiaqi, LI Xixi, HU Dianlin, HUANG Wei
Available online  , doi: 10.11999/JEIT250530
Abstract:
To address the high latency of traditional beam management methods in dynamic scenarios and the severe performance degradation of vision-aided beam prediction under adverse environmental conditions in millimeter-wave (mmWave) communication systems, this work proposes a nighttime image-assisted beam prediction method based on test-time adaptation (TTA). While mmWave communications rely on massive Multiple-Input Multiple-Output (MIMO) technology to achieve high-gain narrow beam alignment, conventional beam scanning mechanisms suffer from exponential complexity and latency bottlenecks, failing to meet the demands of high-mobility scenarios such as vehicular networks. Existing vision-assisted approaches employ deep learning models to extract image features and map them to beam parameters. However, in low-light, rainy, or foggy environments, the distribution shift between training data and real-time image features leads to a drastic decline in prediction accuracy. This work innovatively introduces a TTA mechanism, overcoming the limitations of conventional static inference paradigms. By performing a single step of gradient back propagation over the entire set of model parameters during inference on real-time low-quality images, the proposed method dynamically aligns cross-domain feature distributions without requiring prior collection or annotation of adverse-scenario data. In addition, an entropy minimization-based consistency learning strategy is designed to enforce prediction consistency between original and augmented views, driving model parameter updates toward maximizing prediction confidence and reducing uncertainty. Experimental results on real-world nighttime scenarios demonstrate that the proposed method achieves a top-3 beam prediction accuracy of 93.01%, outperforming static schemes by almost 20% and significantly surpassing traditional low-light enhancement approaches. 
Leveraging the cross-domain consistency of background semantics in fixed-base-station deployment scenarios, this lightweight online adaptation mechanism enhances model robustness, offering a novel pathway for efficient beam management in mmWave systems operating in complex open environments.  Objective  Millimeter-wave communication, a cornerstone of 5G and beyond, relies on massive multiple-input multiple-output (MIMO) architectures to mitigate severe path loss through high-gain narrow beam alignment. However, traditional beam management schemes, dependent on exhaustive beam scanning and channel measurement, incur exponential complexity and latency (hundreds of milliseconds), rendering them impractical for high-mobility scenarios like vehicular networks. Vision-aided beam prediction has emerged as a promising solution, leveraging deep learning to map visual features (e.g., user location, motion) to optimal beam parameters. Despite its daytime success (>90% accuracy), this approach suffers catastrophic performance degradation under low-light, rain, or fog due to domain shifts between training data (e.g., daylight images) and real-time degraded inputs. Existing solutions rely on costly offline data augmentation with limited generalization to unseen harsh environments. This work addresses these limitations by proposing a lightweight, online adaptation framework that dynamically aligns cross-domain features during inference, eliminating the need for pre-collected harsh-environment data. The necessity lies in enabling robust mmWave communications in unpredictable environments, a critical step toward practical deployment in autonomous driving and industrial IoT.  Methods  This TTA method operates in three stages. First, a pre-trained beam prediction model (ResNet-18 backbone) is initialized using daylight images and labeled beam indices. 
During inference, real-time low-quality nighttime images are fed into two parallel pipelines: (1) the original view and (2) a data-augmented view incorporating Gaussian noise. A consistency loss minimizes the prediction distance between these two views, enforcing robustness against local feature perturbations. Simultaneously, an entropy minimization loss sharpens the output probability distribution by penalizing high prediction uncertainty. These combined losses drive single-step gradient back propagation to update all model parameters. This process aligns feature distributions between the training (daylight) and testing (nighttime) domains without altering the global semantic understanding, as illustrated in Fig. 2. The system architecture integrates a roadside base station equipped with an RGB camera and a 32-element antenna array, capturing environmental data and executing real-time beam prediction.  Results and Discussions  Experiments on a real-world dataset demonstrate the method’s superiority. Under nighttime conditions, the proposed TTA framework achieves 93.01% top-3 beam prediction accuracy, outperforming static inference (71.25%) and traditional low-light enhancement methods (85.27%) (Table 3). Ablation studies confirm the effectiveness of both the online feature alignment method designed for small-batch data (Table 4) and the entropy minimization with multi-view consistency learning (Table 5). Figure 4 illustrates the continuous online adaptation performance during testing, revealing rapid convergence that enables base stations to swiftly recover performance after new environmental disturbances occur.  Conclusions  To address the insufficient robustness of existing visual-aided beam prediction methods in dynamically changing environments, this study introduces a test-time adaptation framework using nighttime image-aided beam prediction. 
First, a novel small-batch adaptive feature alignment strategy is developed to resolve feature mismatch in unseen domains while meeting real-time communication constraints. In addition, a joint optimization framework integrates classical low-light image enhancement with multi-view consistency learning, enhancing feature discrimination under complex lighting conditions. Experiments were conducted using real-scene data to validate the proposed algorithm. Results demonstrate that the method achieves over 20% higher Top-3 beam prediction accuracy compared to direct testing. This improvement highlights the method's effectiveness in dynamic environments. This approach provides new technical pathways for optimizing visual-aided communication systems in non-ideal conditions. Future work may extend to beam prediction under rain/fog and multi-modal perception-assisted communication systems.
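The test-time objective this abstract describes, prediction entropy plus consistency between the original and augmented views, can be sketched numerically. The loss weighting `lam` and the squared-distance consistency term are illustrative assumptions; the paper's exact formulation may differ.

```python
import math

def softmax(logits):
    m = max(logits)                      # shift for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def tta_loss(logits_orig, logits_aug, lam=1.0):
    """Entropy of the original-view prediction plus a squared-distance
    consistency penalty between original and augmented views."""
    p, q = softmax(logits_orig), softmax(logits_aug)
    consistency = sum((a - b) ** 2 for a, b in zip(p, q))
    return entropy(p) + lam * consistency
```

Backpropagating this scalar through the model (not shown) pushes parameters toward confident, view-consistent beam predictions, which is the adaptation signal used at inference time.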
Research on Skill-Aware Task Assignment Algorithm under Local Differential Privacy
FANG Xianjin, ZHEN Yaru, ZHANG Pengfei, HUANG Shanshan
Available online  , doi: 10.11999/JEIT250425
Abstract:
  Objective  With the proliferation of mobile smart devices and wireless networks, Spatial Crowdsourcing (SC) has emerged as a new paradigm for collaborative task execution. By leveraging workers’ real-time locations, SC platforms dynamically assign tasks to distributed participants. However, continuous exposure of location data creates privacy risks, including trajectory inference and identity disclosure, which reduce worker participation and threaten system sustainability. Existing privacy-preserving methods either rely on trusted third parties or apply traditional differential privacy mechanisms. The former incurs high costs and security vulnerabilities, whereas the latter struggles to balance the trade-off between noise magnitude and data utility, often reducing task matching accuracy. To address these challenges, this study proposes a skill-aware task assignment algorithm under Local Differential Privacy (LDP) that simultaneously enhances location privacy protection and task assignment performance. The algorithm is particularly effective in settings characterized by uneven skill distributions and complex task requirements.  Methods  To protect workers’ location privacy, a Clip–Laplace (CLP) mechanism is applied to perturb real-time location data under Local Differential Privacy (LDP), ensuring bounded noise while maintaining data utility. To mitigate mismatches between heterogeneous task requirements and imbalanced worker skills, an entropy-based metric is used to evaluate skill diversity. When entropy falls below a predefined threshold, a secondary screening strategy rebalances the distribution by suppressing common skills and prioritizing rare ones. A skill-aware Pruning Greedy task assignment algorithm (PUGR) is then developed. PUGR iteratively selects the worker–task pair with the highest marginal contribution to maximize skill coverage under spatiotemporal and budget constraints. 
To improve computational efficiency, three pruning strategies are integrated: time–distance pruning, high-reward pruning, and budget-infeasibility pruning. Finally, comparative and ablation experiments on three real-world datasets assess the method using multiple metrics, including Loss of Quality of Service (LQS), Average Remaining Budget Rate (ARBR), and Task Completion Rate (TCR).  Results and Discussions  Experimental results show that the CLP mechanism consistently achieves lower LQS than the traditional Laplace mechanism (LP) across different privacy budgets, effectively reducing errors introduced by noise (Fig. 2). For skill diversity, the entropy-based metric combined with secondary screening nearly doubles the average entropy of candidate workers on the TKY and NYC datasets, demonstrating its effectiveness in balancing skill distribution. In task assignment, the proposed PUGR algorithm completes most worker–task matchings within four iterations, thereby reducing redundant computation and accelerating convergence (Fig. 3). Regarding budget utilization, the ARBR under CLP remains close to the No Privacy (NoPriv) baseline, indicating efficient resource allocation (Fig. 4, Table 2). For task completion, the method achieves a TCR of up to 90% in noise-free settings and consistently outperforms Greedy, OE-ELA, and TsPY under CLP (Fig. 5). Ablation studies further validate the contributions of secondary screening and pruning strategies to overall performance improvement.  Conclusions  This study addresses two central challenges in spatial crowdsourcing: protecting workers’ location privacy and improving skill-aware task assignment. A task assignment framework is proposed that integrates the CLP mechanism with a skill-aware strategy under the LDP model. The CLP mechanism provides strong privacy guarantees while preserving data utility by limiting noise magnitude. 
An entropy-based metric combined with secondary screening ensures balanced skill distribution, substantially enhancing skill coverage and task execution success in multi-skill scenarios. The PUGR algorithm incorporates skill contribution evaluation with multiple pruning constraints, thereby reducing the search space and improving computational efficiency. Experiments on real-world datasets demonstrate the method’s superiority in terms of LQS, ARBR, and TCR, confirming its robustness, scalability, and effectiveness in balancing privacy protection with assignment performance. Future work will explore dynamic pricing mechanisms based on skill scarcity and personalized, adaptive incentives to foster fairness, long-term worker engagement, and the sustainable development of spatial crowdsourcing platforms.
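The clipping-then-perturbation idea behind the CLP mechanism can be sketched as follows. The bounding interval, the sensitivity choice, and the function name are illustrative assumptions rather than the paper's exact construction.

```python
import random

def clip_laplace(value, low, high, epsilon, rng=random):
    """Clip a location coordinate into [low, high], then add Laplace noise
    calibrated to the clipped range, so the reported value satisfies
    epsilon-LDP for that coordinate."""
    clipped = min(max(value, low), high)
    scale = (high - low) / epsilon       # sensitivity of the clipped value
    # Laplace(0, scale) sampled as the difference of two exponentials
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise
```

Clipping bounds the sensitivity, which is what keeps the Laplace noise magnitude limited and preserves utility relative to an unclipped mechanism.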
Joint Focus Measure and Context-Guided Filtering for Depth From Focus
JIANG Ying, DENG Huiping, XIANG Sen, WU Jin
Available online  , doi: 10.11999/JEIT250540
Abstract:
  Objective  Depth from Focus (DFF) seeks to determine scene depth by analyzing the focus variation of each pixel in an image. A key challenge in DFF is identifying the best-focused slice within the focal stack. However, focus variation in weakly textured regions is often subtle, making it difficult to detect focused areas, which adversely affects the accuracy of depth maps. To address this issue, this study proposes a depth estimation network that integrates focus measures and contextual information from the focal stack. The network accurately locates the best-focused pixels and generates a reliable depth map. By explicitly incorporating focus cues into a Convolutional Neural Network (CNN) and thoroughly considering spatial correlations within the scene, the approach facilitates comprehensive inference of focus states in weakly textured regions. This enables the network to capture both local focus-related details and global contextual information, thereby enhancing the accuracy and efficiency of depth estimation in challenging regions.  Methods  The proposed network consists of two main components. The first is focus region detection, which extracts focus-related features from the focal stack. A focus measure operator is introduced into the network during learning, yielding the maximum response when an image region is in sharp focus. After identifying the best-focused slices within the stack, the detected focus features are fused with those extracted by a 2D CNN. Because focus variations in weakly textured regions are often subtle, the representation of focus regions is enhanced to improve sensitivity to such changes. The second component comprises a semantic network and a semantic context module. A semantic context network is used to extract semantic cues, and semantic-guided filtering is then applied to the focus volume, integrating target features (focus volume) with guiding features (semantic context features). 
When local focus cues are indistinguishable, the global semantic context allows reliable inference of the focus state. This framework combines the strengths of deep learning and traditional methods while accounting for the specific characteristics of DFF and CNN architectures. Therefore, it produces robust and accurate depth maps, even in challenging regions.  Results and Discussions  The proposed architecture is evaluated through quantitative and qualitative comparisons on two public datasets. Prediction reliability is assessed using multiple evaluation metrics, including Mean Squared Error (MSE) and squared relative error (Sqr.rel.). Quantitative results (Tables 1 and 2) show that the proposed method consistently outperforms existing approaches on both datasets. The small discrepancy between predicted and ground-truth depths indicates precise depth estimation with reduced prediction errors. In addition, higher accuracy is achieved while computational cost remains within a practical range. Qualitative analysis (Figures 10 and 11) further demonstrates superior depth reconstruction and detail preservation, even when a limited number of focal stack slices is used. The generalization ability of the network is further examined on the unlabeled Mobile Depth dataset (Figure 12). The results confirm that depth can be reliably recovered in diverse unseen scenes, indicating effectiveness for real-world applications. Ablation studies (Table 3) validate the contribution of each proposed module. Optimal performance is obtained when both the Focus Measure (FM) and the Semantic Context-Guided Module (SCGM) are applied. Parameter count comparisons further indicate that the proposed approach achieves a balance between performance and complexity, delivering robust accuracy without excessive computational burden.  Conclusions  This study proposes a CNN–based DFF framework to address the challenge of depth estimation in weakly textured regions. 
By embedding focus measure operators into the deep learning architecture, the representation of focused regions is enhanced, improving focus detection sensitivity and enabling precise capture of focus variations. In addition, the introduction of semantic context information enables effective integration of local and global focus cues, further increasing estimation accuracy. Experimental results across multiple datasets show that the proposed model achieves competitive performance compared with existing methods. Visual results on the Mobile Depth dataset further demonstrate its generalization ability. Nonetheless, the model shows limitations in extremely distant regions. Future work could incorporate multimodal information or frequency-domain features to further improve depth accuracy in weakly textured scenes.
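A classical hand-crafted focus measure of the kind the network embeds is the modified Laplacian, whose response peaks on the best-focused slice. This standalone sketch (function names illustrative) shows the operator and the per-pixel argmax over a focal stack.

```python
def modified_laplacian(img, x, y):
    """Modified-Laplacian focus response at pixel (x, y): large for sharp
    (high local contrast) regions, near zero for defocused ones.
    `img` is a 2D list; (x, y) must be an interior pixel."""
    return (abs(2 * img[y][x] - img[y][x - 1] - img[y][x + 1])
            + abs(2 * img[y][x] - img[y - 1][x] - img[y + 1][x]))

def best_focused_slice(stack, x, y):
    """Index of the focal-stack slice with the maximal focus response,
    i.e. the depth hypothesis for this pixel."""
    responses = [modified_laplacian(img, x, y) for img in stack]
    return max(range(len(stack)), key=responses.__getitem__)
```

In weakly textured regions all responses are small and nearly equal, which is exactly the ambiguity the paper's semantic context-guided filtering is meant to resolve.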
Highly Dynamic Doppler Space Target Situation Awareness Algorithm for Spaceborne ISAR
ZHOU Yichen, WANG Yong, DING Wenjun
Available online  , doi: 10.11999/JEIT250667
Abstract:
  Objective  With the growing number of operational satellites in orbit, Space Situation Awareness (SSA) has become a critical capability for ensuring the safety of space operations. Traditional ground-based radar and optical systems face inherent limitations in tracking deep-space objects due to atmospheric interference and orbital obscuration. Therefore, spaceborne Inverse Synthetic Aperture Radar (ISAR) has emerged as a pivotal technology for on-orbit target characterization, offering all-weather, long-duration observation. However, higher-order Three-Dimensional (3D) spatial-variant range migration and phase errors, caused by the complex relative motion between a spaceborne ISAR platform and its target, can seriously degrade imaging quality. Meanwhile, conventional Two-Dimensional (2D) Range–Doppler (RD) imaging provides valuable intensity distributions of scattering points but remains a projection of the target’s 3D structure. The absence of geometric information limits accurate attitude estimation and collision risk assessment. To address these challenges and achieve more comprehensive SSA, this paper proposes a joint space target imaging and attitude estimation algorithm.  Methods  This paper proposes a joint space target imaging and attitude estimation algorithm composed of three main components: space target imaging characterization, high-resolution imaging, and attitude estimation. First, the imaging characteristics of satellite targets are analyzed to establish the mapping relationship between the image domain and the Doppler parameters of individual scattering points. Second, adaptive segmentation in the 2D image domain combined with high-precision regional compensation is applied to obtain high-resolution imaging results. 
Finally, the spatial distribution characteristics of the Doppler parameters are exploited to derive an explicit expression for the second-order Doppler parameters and to estimate the planar component attitude of the target, such as that of the solar wing.  Results and Discussions  The proposed SSA method achieves high-resolution imaging even in the presence of orbital error and complex 3D spatial-variant Doppler error. Moreover, target attitude estimation can be performed without the need for rectangular component extraction. The effectiveness of the algorithm is verified through three simulation experiments. When the target adopts different attitudes, the method successfully produces both high-resolution imaging results and accurate target attitude estimation (Fig. 7, Fig. 8). To further evaluate performance, comparative simulations are conducted (Fig. 9, Fig. 10). In addition, a method for estimating the long- and short-edge pointing of the satellite solar wing is presented in Section 3.3. The effectiveness of the proposed high-precision imaging algorithm for spinning targets is analyzed in Section 3.4, where the third simulation demonstrates the extended SSA capability of the algorithm (Fig. 11, Fig. 12).  Conclusions  This paper proposes a joint high-resolution imaging and attitude estimation algorithm to address the situational awareness requirements of highly dynamic Doppler space targets. First, the imaging characteristics of satellite targets and the mapping relationship between scattering points and higher-order Doppler parameters are derived. Second, an adaptive region segmentation algorithm is developed to compensate for 3D spatial-variant errors, thereby significantly enhancing imaging resolution. Meanwhile, an explicit correlation between Doppler parameters and satellite attitude is established based on the characteristics of planar components. Simulation results under different imaging conditions confirm the validity and reliability of the algorithm. 
Compared with conventional approaches, the proposed method achieves joint compensation of orbital and rotational errors. Furthermore, the attitude estimation process does not require rectangular component segmentation and remains effective even when rectangular components are partially obscured.
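As an illustration of the mapping between a scatterer's phase history and its higher-order Doppler parameters discussed above, the sketch below recovers first- and second-order Doppler terms by polynomial fitting of the unwrapped phase. This is a generic estimation sketch under a quadratic-phase assumption, not the paper's adaptive region segmentation algorithm; the names f_d and gamma are placeholders.

```python
import numpy as np

def estimate_doppler_params(t, phase):
    """Fit phase(t) ~= 2*pi*(phi0 + f_d*t + 0.5*gamma*t**2) by least squares."""
    a2, a1, _a0 = np.polyfit(t, phase / (2 * np.pi), deg=2)
    gamma = 2 * a2   # second-order Doppler parameter (rate), Hz/s
    f_d = a1         # first-order Doppler parameter (centroid), Hz
    return f_d, gamma

# Synthetic phase history of one scatterer: f_d = 120 Hz, gamma = 35 Hz/s
t = np.linspace(0.0, 1.0, 512)
phase = 2 * np.pi * (0.3 + 120.0 * t + 0.5 * 35.0 * t**2)
fd_hat, gamma_hat = estimate_doppler_params(t, phase)
```

In practice the phase history is noisy and spatially variant across the target, which is precisely why the paper segments the image into regions before fitting.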
Low Complexity Sequential Decoding Algorithm of PAC Code for Short Packet Communication
DAI Jingxin, YIN Hang, WANG Yuhuan, LV Yansong, YANG Zhanxin, LV Rui, XIA Zhiping
Available online  , doi: 10.11999/JEIT250533
Abstract:
  Objective  With the rise of the intelligent Internet of Things (IoT), short packet communication among IoT devices must meet stringent requirements for low latency, high reliability, and very short packet length, posing challenges to the design of channel coding schemes. As an advanced variant of polar codes, Polarization-Adjusted Convolutional (PAC) codes enhance the error-correction performance of polar codes at medium and short code lengths, approaching the dispersion bound in some cases. This makes them promising for short packet communication. However, the high decoding complexity required to achieve near-bound error-correction performance limits their practicality. To address this, we propose two low-complexity sequential decoding algorithms: Low Complexity Fano Sequential (LC-FS) and Low Complexity Stack (LC-S). Both algorithms effectively reduce decoding complexity with negligible loss in error-correction performance.  Methods  To reduce the decoding complexity of Fano-based sequential decoding algorithms, we propose the LC-FS algorithm. This method exploits special nodes to terminate decoding at intermediate levels of the decoding tree, thereby reducing the complexity of tree traversal. Special nodes are classified into two types according to decoder structure: low-rate nodes (Type-T nodes) and high-rate nodes [Rate-1 and Single Parity-Check (SPC) nodes]. This classification minimizes unnecessary hardware overhead by avoiding excessive subdivision of special nodes. For each type, a corresponding LC-FS decoder and node-movement strategy are developed. To reduce the complexity of stack-based decoding algorithms, we propose the LC-S algorithm. While preserving the low backtracking feature of stack-based decoding, this method introduces tailored decoding structures and node-movement strategies for low-rate and high-rate special nodes.
Therefore, the LC-S algorithm achieves significant complexity reduction without compromising error-correction performance.  Results and Discussions  The performance of the proposed LC-FS and LC-S decoding algorithms is evaluated through extensive simulations in terms of Frame Error Rate (FER), Average Computational Complexity (ACC), Maximum Computational Complexity (MCC), and memory requirements. Traditional Fano sequential, traditional stack, and Fast Fano Sequential (FFS) decoding algorithms are set as benchmarks. The simulation results show that the LC-FS and LC-S algorithms exhibit negligible error-correction performance loss compared with traditional Fano sequential and stack decoders (Fig. 5). Across different PAC codes, both algorithms effectively reduce decoding complexity. Specifically, as T increases, the reductions in ACC and MCC become more pronounced. For ACC, the LC-FS decoding algorithm (T = 4) achieves reductions of 13.77% (N = 256, K = 128), 11.42% (N = 128, K = 64), and 25.52% (N = 64, K = 32) on average compared with FFS (Fig. 6). The LC-S decoding algorithm (T = 4) reduces ACC by 56.48% (N = 256, K = 128), 47.63% (N = 128, K = 64), and 49.61% (N = 64, K = 32) on average compared with the traditional stack algorithm (Fig. 6).
For MCC, the LC-FS decoding algorithm (T = 4) achieves reductions of 29.71% (N = 256, K = 128), 21.18% (N = 128, K = 64), and 23.62% (N = 64, K = 32) on average compared with FFS (Fig. 7). The LC-S decoding algorithm (T = 4) reduces MCC by 67.17% (N = 256, K = 128), 49.33% (N = 128, K = 64), and 51.84% (N = 64, K = 32) on average compared with the traditional stack algorithm (Fig. 7). By exploiting low-rate and high-rate special nodes to terminate decoding at intermediate levels of the decoding tree, the LC-FS and LC-S algorithms also reduce memory requirements (Table 2). However, as T increases, the memory usage of LC-S rises because all extended paths of low-rate special nodes are pushed into the stack. The increase in T enlarges the number of extended paths, indicating its critical role in balancing decoding complexity and memory occupation (Fig. 8).  Conclusions  To address the high decoding complexity of sequential decoding algorithms for PAC codes, this paper proposes two low-complexity approaches: the LC-FS and LC-S algorithms. Both methods classify special nodes into low-rate and high-rate categories and design corresponding decoders and movement strategies. By introducing Type-T nodes, the algorithms further eliminate redundant computations during decoding, thereby reducing complexity.
Simulation results demonstrate that the LC-FS and LC-S algorithms substantially decrease decoding complexity while maintaining the error-correction performance of PAC codes at medium and short code lengths.
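The special-node classification underlying both algorithms can be illustrated by labeling subtrees of the decoding tree from their frozen-bit pattern. The Rate-1 and SPC definitions below are the standard ones from fast polar decoding; the Type-T rule (a low-rate node with at most T information bits) is an assumption made here for illustration, so the exact definition should be taken from the paper.

```python
def classify_node(frozen, T=4):
    """Label one decoding-tree subtree from its frozen-bit pattern.

    frozen: list of bools, True = frozen (non-information) bit.
    """
    n = len(frozen)
    k = frozen.count(False)          # number of information bits
    if k == n:
        return "Rate-1"              # high-rate: all bits carry information
    if k == n - 1 and frozen[0]:
        return "SPC"                 # high-rate: single parity-check node
    if k <= T:
        return f"Type-{T}"           # low-rate node (assumed definition)
    return "generic"

print(classify_node([False] * 8))               # Rate-1
print(classify_node([True] + [False] * 7))      # SPC
print(classify_node([True] * 6 + [False] * 2))  # Type-4
```

Terminating decoding at such nodes, instead of descending to the leaves, is what removes the redundant tree traversals described above.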
Multimodal Hypergraph Learning Guidance with Global Noise Enhancement for Sentiment Analysis under Missing Modality Information
HUANG Chen, LIU Huijie, ZHANG Yan, YANG Chao, SONG Jianhua
Available online  , doi: 10.11999/JEIT250649
Abstract:
  Objective  Multimodal Sentiment Analysis (MSA) has shown considerable promise in interdisciplinary domains such as Natural Language Processing (NLP) and Affective Computing, particularly by integrating information from ElectroEncephaloGraphy (EEG) signals, visual images, and text to classify sentiment polarity and provide a comprehensive understanding of human emotional states. However, in complex real-world scenarios, challenges including missing modalities, limited high-level semantic correlation learning across modalities, and the lack of mechanisms to guide cross-modal information transfer substantially restrict the generalization ability and accuracy of sentiment recognition models. To address these limitations, this study proposes a Multimodal Hypergraph Learning Guidance method with Global Noise Enhancement (MHLGNE), designed to improve the robustness and performance of MSA under conditions of missing modality information in complex environments.  Methods  The overall architecture of the MHLGNE model is illustrated in Fig. 2 and consists of the Adaptive Global Noise Sampling Module, the Multimodal Hypergraph Learning Guiding Module, and the Sentiment Prediction Target Module. A pretrained language model is first applied to encode the multimodal input data. To simulate missing modality conditions, the input data are constructed with incomplete modal information, where a modality m ∈ {e, v, t} is randomly absent. The adaptive global noise sampling strategy is then employed to supplement missing modalities from a global perspective, thereby improving adaptability and enhancing both robustness and generalization in complex environments. This design allows the model to handle noisy data and missing modalities more effectively.
The Multimodal Hypergraph Learning Guiding Module is further applied to capture high-level semantic correlations across different modalities, overcoming the limitations of conventional methods that rely only on feature alignment and fusion. By guiding cross-modal information transfer, this module enables the model to focus on essential inter-modal semantic dependencies, thereby improving sentiment prediction accuracy. Finally, the performance of MHLGNE is compared with that of State-Of-The-Art (SOTA) MSA models under two conditions: complete modality data and randomly missing modality information.  Results and Discussions  Three publicly available MSA datasets (SEED-IV, SEED-V, and DREAMER) are employed, with features extracted from EEG signals, visual images, and text. To ensure robustness, standard cross-validation is applied, and the training process is conducted with iterative adjustments to the noise sampling strategy, modality fusion method, and hypergraph learning structure to optimize sentiment prediction. Under the complete modality condition, MHLGNE is observed to outperform the second-best M2S model across most evaluation metrics, with accuracy improvements of 3.26%, 2.10%, and 0.58% on SEED-IV, SEED-V, and DREAMER, respectively. Additional metrics also indicate advantages over other SOTA methods. Under the random missing modality condition, MHLGNE maintains superiority over existing MSA approaches, with improvements of 1.03% in accuracy, 0.24% in precision, and 0.08 in Kappa score. The adaptive noise sampling module is further shown to effectively compensate for missing modalities. Unlike conventional models that suffer performance degradation under such conditions, MHLGNE maintains robustness by generating complementary information. 
In addition, the multimodal hypergraph structure enables the capture of high-level semantic dependencies across modalities, thereby strengthening cross-modal information transfer and offering clear advantages when modalities are absent. Ablation experiments confirm the independent contributions of each module. The removal of either the adaptive noise sampling or the multimodal hypergraph learning guiding module results in notable performance declines, particularly under high-noise or severely missing modality conditions. The exclusion of the cross-modal information transfer mechanism causes a substantial decline in accuracy and robustness, highlighting its essential role in MSA.  Conclusions  The MHLGNE model, equipped with the Adaptive Global Noise Sampling Module and the Multimodal Hypergraph Learning Guiding Module, markedly improves the performance of MSA under conditions of missing modalities and in tasks requiring effective cross-modal information transfer. Experiments on SEED-IV, SEED-V, and DREAMER confirm that MHLGNE exceeds SOTA MSA models across multiple evaluation metrics, including accuracy, precision, Kappa score, and F1 score, thereby demonstrating its robustness and effectiveness. Future work may focus on refining noise sampling strategies and developing more sophisticated hypergraph structures to further strengthen performance under extreme modality-missing scenarios. In addition, this framework has the potential to be extended to broader sentiment analysis tasks across diverse application domains.
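A minimal sketch of the missing-modality setup described above: one of the EEG/visual/text modalities (m ∈ {e, v, t}) is dropped at random and its features are replaced from a global noise distribution. The plain Gaussian fallback pooled over the remaining modalities stands in for the paper's adaptive global noise sampling module, whose exact form is not specified in the abstract.

```python
import numpy as np

def mask_and_fill(features, rng):
    """features: dict modality name -> 1-D feature vector; drops one at random."""
    missing = str(rng.choice(sorted(features)))     # randomly absent modality
    filled = dict(features)
    # pool statistics over the remaining modalities ("global" fallback)
    pool = np.concatenate([v for k, v in features.items() if k != missing])
    filled[missing] = rng.normal(pool.mean(), pool.std(),
                                 size=features[missing].shape)
    return missing, filled

rng = np.random.default_rng(0)
feats = {m: rng.normal(size=32) for m in ("e", "v", "t")}  # EEG, visual, text
missing, filled = mask_and_fill(feats, rng)
```

Training with such randomly masked inputs is what lets the model remain robust when a modality is genuinely absent at test time.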
Entropy Quantum Collaborative Planning Method for Emergency Path of Unmanned Aerial Vehicles Driven by Survival Probability
WANG Enliang, ZHANG Zhen, SUN Zhixin
Available online  , doi: 10.11999/JEIT250694
Abstract:
  Objective  Natural disaster emergency rescue places stringent requirements on the timeliness and safety of Unmanned Aerial Vehicle (UAV) path planning. Conventional optimization objectives, such as minimizing total distance, often fail to reflect the critical time-sensitive priority of maximizing the survival probability of trapped victims. Moreover, existing algorithms struggle with the complex constraints of disaster environments, including no-fly zones, caution zones, and dynamic obstacles. To address these challenges, this paper proposes an Entropy-Enhanced Quantum Ripple Synergy Algorithm (E2QRSA). The primary goals are to establish a survival probability maximization model that incorporates time decay characteristics and to design a robust optimization algorithm capable of efficiently handling complex spatiotemporal constraints in dynamic disaster scenarios.  Methods  E2QRSA enhances the Quantum Ripple Optimization framework through four key innovations: (1) information entropy–based quantum state initialization, which guides population generation toward high-entropy regions; (2) multi-ripple collaborative interference, which promotes beneficial feature propagation through constructive superposition; (3) entropy-driven parameter control, which dynamically adjusts ripple propagation according to search entropy rates; and (4) quantum entanglement, which enables information sharing among elite individuals. The model employs a survival probability objective function that accounts for time-sensitive decay, base conditions, and mission success probability, subject to constraints including no-fly zones, warning zones, and dynamic obstacles.  Results and Discussions  Simulation experiments are conducted in medium- and large-scale typhoon disaster scenarios. The proposed E2QRSA achieves the highest survival probabilities of 0.847 and 0.762, respectively (Table 1), exceeding comparison algorithms such as SEWOA and PSO by 4.2–16.0%. 
Although the paths generated by E2QRSA are not the shortest, they are the most effective in maximizing survival chances. The ablation study (Table 3) confirms the contribution of each component, with the removal of multi-ripple interference causing the largest performance decrease (9.97%). The dynamic coupling between search entropy and ripple parameters (Fig. 2) is validated, demonstrating the effectiveness of the adaptive control mechanism. The entanglement effect (Fig. 4) is shown to maintain population diversity. In terms of constraint satisfaction, E2QRSA-planned paths consume only 85.2% of the total available energy (Table 5), ensuring a safe return, and all static and dynamic obstacles are successfully avoided, as visually verified in the 3D path plots (Figs. 6 and 7).  Conclusions  E2QRSA effectively addresses the challenge of UAV path planning for disaster relief by integrating adaptive entropy control with quantum-inspired mechanisms. The survival probability objective captures the essential requirements of disaster scenarios more accurately than conventional distance minimization. Experimental validation demonstrates that E2QRSA achieves superior solution quality and faster convergence, providing a robust technical basis for strengthening emergency response capabilities.
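The survival-probability objective with time decay can be sketched as follows. The exponential decay model, the per-victim mission-success scaling, and the averaging over victims are illustrative assumptions; the paper's full objective also incorporates base conditions and the no-fly-zone, warning-zone, and dynamic-obstacle constraints.

```python
import math

def survival_objective(arrival_times, decay_rates, success_probs):
    """Mean survival probability over victims: p_i * exp(-lambda_i * t_i)."""
    terms = [p * math.exp(-lam * t)
             for t, lam, p in zip(arrival_times, decay_rates, success_probs)]
    return sum(terms) / len(terms)

# Visiting the more time-critical victim (higher decay rate) first pays off,
# even though both routes cover the same two waypoints:
fast_first = survival_objective([10, 40], [0.05, 0.01], [0.95, 0.95])
slow_first = survival_objective([40, 10], [0.05, 0.01], [0.95, 0.95])
```

This is why the E2QRSA paths are not the shortest in Table 1 yet score highest: the objective rewards reaching time-critical victims early rather than minimizing total distance.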
A Method for Named Entity Recognition in Military Intelligence Domain Using Large Language Models
LI Yongbin, LIU Lian, ZHENG Jie
Available online  , doi: 10.11999/JEIT250764
Abstract:
  Objective  Named Entity Recognition (NER) is a fundamental task in information extraction within specialized domains, particularly military intelligence. It plays a critical role in situation assessment, threat analysis, and decision support. However, conventional NER models face major challenges. First, the scarcity of high-quality annotated data in the military intelligence domain is a persistent limitation. Due to the sensitivity and confidentiality of military information, acquiring large-scale, accurately labeled datasets is extremely difficult, which severely restricts the training performance and generalization ability of supervised learning–based NER models. Second, military intelligence requires handling complex and diverse information extraction tasks. The entities to be recognized often possess domain-specific meanings, ambiguous boundaries, and complex relationships, making it difficult for traditional models with fixed architectures to adapt flexibly to such complexity or achieve accurate extraction. This study aims to address these limitations by developing a more effective NER method tailored to the military intelligence domain, leveraging Large Language Models (LLMs) to enhance recognition accuracy and efficiency in this specialized field.  Methods  To achieve the above objective, this study focuses on the military intelligence domain and proposes an NER method based on LLMs. The central concept is to harness the strong semantic reasoning capabilities of LLMs, which enable deep contextual understanding of military texts, accurate interpretation of complex domain-specific extraction requirements, and autonomous execution of extraction tasks without heavy reliance on large annotated datasets. To ensure that general-purpose LLMs can rapidly adapt to the specialized needs of military intelligence, two key strategies are employed. First, instruction fine-tuning is applied.
Domain-specific instruction datasets are constructed to include diverse entity types, extraction rules, and representative examples relevant to military intelligence. Through fine-tuning with these datasets, the LLMs acquire a more precise understanding of the characteristics and requirements of NER in this field, thereby improving their ability to follow targeted extraction instructions. Second, Retrieval-Augmented Generation (RAG) is incorporated. A domain knowledge base is developed containing expert knowledge such as entity dictionaries, military terminology, and historical extraction cases. During the NER process, the LLM retrieves relevant knowledge from this base in real time to support entity recognition. This strategy compensates for the limited domain-specific knowledge of general LLMs and enhances recognition accuracy, particularly for rare or complex entities.  Results and Discussions  Experimental results indicate that the proposed LLM–based NER method, which integrates instruction fine-tuning and RAG, achieves strong performance in military intelligence NER tasks. Compared with conventional NER models, it demonstrates higher precision, recall, and F1-score, particularly in recognizing complex entities and managing scenarios with limited annotated data. The effectiveness of this method arises from several key factors. The powerful semantic reasoning capability of LLMs enables a deeper understanding of contextual nuances and ambiguous expressions in military texts, thereby reducing missed and false recognitions commonly caused by rigid pattern-matching approaches. Instruction fine-tuning allows the model to better align with domain-specific extraction requirements, ensuring that the recognition results correspond more closely to the practical needs of military intelligence analysis. 
Furthermore, the incorporation of RAG provides real-time access to domain expert knowledge, markedly enhancing the recognition of entities that are highly specialized or morphologically variable within military contexts. This integration effectively mitigates the limitations of traditional models that lack sufficient domain knowledge.  Conclusions  This study proposes an LLM–based NER method for the military intelligence domain, effectively addressing the challenges of limited annotated data and complex extraction requirements encountered by traditional models. By combining instruction fine-tuning and RAG, general-purpose LLMs can be rapidly adapted to the specialized demands of military intelligence, enabling the construction of an efficient domain-specific expert system at relatively low cost. The proposed method provides an effective and scalable solution for NER tasks in military intelligence scenarios, enhancing both the efficiency and accuracy of information extraction in this field. It offers not only practical value for military intelligence analysis and decision support but also methodological insight for NER research in other specialized domains facing similar data and complexity constraints, such as aerospace and national security. Future research will focus on optimizing instruction fine-tuning strategies, expanding the domain knowledge base, and reducing computational cost to further improve model performance and applicability.
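The RAG step described above can be sketched as a dictionary lookup that injects retrieved domain knowledge into the extraction prompt before the LLM is queried. The entity dictionary contents and prompt wording below are hypothetical placeholders, and the actual LLM call is omitted; real systems would typically use embedding-based retrieval rather than substring matching.

```python
# Hypothetical domain entity dictionary (placeholder contents)
ENTITY_DICT = {
    "F-16": "aircraft",
    "Aegis": "combat system",
    "THAAD": "air-defense system",
}

def build_prompt(text):
    """Retrieve matching dictionary entries and prepend them to the prompt."""
    hits = {term: label for term, label in ENTITY_DICT.items() if term in text}
    context = "\n".join(f"- {t}: {l}" for t, l in sorted(hits.items()))
    return ("Known domain entities:\n" + context +
            "\n\nExtract all named entities from the following text:\n" + text)

prompt = build_prompt("Two F-16 squadrons were deployed alongside a THAAD battery.")
```

Only entries relevant to the input text are retrieved, so the prompt stays short while still supplying the domain knowledge a general-purpose LLM lacks.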
A 3D Underwater Target Tracking Algorithm with Integrated Grubbs-Information Entropy and Improved Particle Filter
CAI Fanglin, WANG Ji, QIU Haowei
Available online  , doi: 10.11999/JEIT250249
Abstract:
  Objective  To address the limited target tracking accuracy of traditional Particle Filter (PF) algorithms in three-dimensional Underwater Wireless Sensor Networks (UWSNs) under abnormal conditions, this study proposes a three-dimensional underwater target tracking algorithm (OGIE-IPF). The algorithm integrates an optimized Grubbs criterion–based information entropy-weighted data fusion with an Improved Particle Filter (IPF). Conventional PF algorithms often suffer from particle degeneracy and impoverishment, which restrict estimation accuracy. Although weight optimization strategies introduced during resampling can enhance particle diversity, existing approaches mainly rely on fixed weighting factors that cannot dynamically adapt to real-time operating conditions. Moreover, current anomaly detection methods for multi-source data fusion fail to effectively address data coupling and heteroscedasticity across dimensions. To overcome these challenges, a dynamic adaptive hierarchical weight optimization strategy is designed for the resampling phase, enabling adaptive particle weighting across hierarchy levels. Additionally, a Mahalanobis distance discrimination mechanism is incorporated into the Grubbs criterion-based anomaly detection method, achieving effective multi-dimensional anomaly detection through covariance-sensitive analysis.  Methods  The proposed OGIE-IPF algorithm enhances target tracking accuracy under underwater abnormal conditions through a distributed data processing framework that integrates multi-source data fusion and adaptive filtering. First, the Unscented Kalman Filter (UKF) is incorporated into the particle filtering framework to construct the importance density function, thereby alleviating particle degeneracy. Simultaneously, a dynamic adaptive hierarchical weight optimization mechanism is proposed during the resampling phase to improve particle diversity. 
Second, the Mahalanobis distance replaces the conventional standardized residual method in the standard Grubbs criterion for anomaly statistic construction. By incorporating the covariance matrix of multidimensional variables, the method achieves effective anomaly detection for multi-dimensional data. Finally, local target tracking is performed using the IPF combined with the optimized Grubbs criterion for anomaly detection and sensor credibility evaluation, whereas global state estimation is realized through an information entropy-weighted multi-source fusion algorithm.  Results and Discussions  The IPF developed in this study is designed to enhance particle set diversity through optimization of the importance density function and refinement of the resampling strategy. To evaluate algorithm performance, a comparative experimental group with a particle population of 100 is established. Simulation results indicate that the weight distribution variances of the IPF at specific time points and over the entire tracking period are reduced by approximately 98.27% and 97.26%, respectively, compared with the traditional PF (Figs. 2 and 3). These findings suggest that the improved strategy effectively regulates particles with varying weights, resulting in a balanced distribution across hierarchical weight levels. Sensor anomalies are simulated by introducing substantial perturbations in observation noise. The experimental data show that the OGIE-IPF algorithm maintains optimal error metrics throughout the operational period (Figs. 4 and 5), demonstrating superior capability in suppressing abnormal noise interference. To further assess algorithm robustness, two representative scenarios under low-noise and high-noise conditions are constructed for multi-algorithm comparison. 
The results indicate that OGIE-IPF achieves Root Mean Square Error (RMSE) reductions of 79.78%, 66.78%, and 56.41% compared with the PF, Extended Particle Filter (EPF), and Unscented Particle Filter (UPF) under low-noise conditions, and reductions of 83.41%, 70.38%, and 21.68% under high-noise conditions (Figs. 8 and 11).  Conclusions  The OGIE-IPF algorithm proposed in this study enhances target tracking accuracy in three-dimensional underwater environments through two synergistic mechanisms. First, tracking precision is improved by refining the PF framework to optimize the intrinsic accuracy of the filtering process. Second, data fusion reliability is strengthened via an anomaly detection framework that mitigates interference from erroneous sensor measurements. Simulation results confirm that the OGIE-IPF algorithm produces state estimations more consistent with ground truth trajectories than conventional PF, EPF, and UPF algorithms, achieving lower RMSE and maintaining stable tracking performance under limited particle populations and abnormal noise conditions. Future work will extend the model to incorporate dynamic marine environmental factors and address the effects of malicious node interference within underwater network security systems.
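A sketch of the covariance-sensitive anomaly test described above: the per-sample statistic is the Mahalanobis distance to the sample mean rather than the standardized residual of the classical Grubbs criterion, so coupling between dimensions is taken into account. The fixed threshold here is a placeholder for the Grubbs critical value at a chosen significance level.

```python
import numpy as np

def mahalanobis_grubbs(X, threshold=5.0):
    """X: (n, d) multi-sensor observations. Returns flagged index, or None."""
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))   # covariance-sensitive
    d = np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))
    worst = int(np.argmax(d))
    return worst if d[worst] > threshold else None

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
X[7] += 20.0                      # inject a gross multi-dimensional anomaly
flagged = mahalanobis_grubbs(X)   # sample 7 is flagged
```

Flagged sensors would then be down-weighted in the information entropy-weighted fusion step rather than contaminating the global state estimate.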
A Spatial-semantic Combine Perception for Infrared UAV Target Tracking
YU Guodong, JIANG Yichun, LIU Yunqing, WANG Yijun, ZHAN Weida, WANG Chunyang, FENG Jianghai, HAN Yueyi
Available online  , doi: 10.11999/JEIT250613
Abstract:
  Objective  In recent years, infrared image-based UAV target tracking technology has attracted widespread attention. In real-world scenarios, infrared UAV target tracking still faces significant challenges due to factors such as complex backgrounds, UAV target deformation, and camera movement. Siamese network-based tracking methods have made breakthroughs in balancing tracking accuracy and efficiency. However, existing approaches rely solely on high-level feature outputs from deep networks to predict target positions, neglecting the effective use of low-level features. This leads to the loss of spatial detail features of infrared UAV targets, severely affecting tracking performance. To efficiently utilize low-level features, some methods have incorporated Feature Pyramid Networks (FPN) into the tracking framework, progressively fusing cross-layer feature maps in a top-down manner, thereby effectively enhancing tracking performance for multi-scale targets. Nevertheless, these methods directly adopt traditional FPN channel reduction operations, which result in significant loss of spatial contextual information and channel semantic information. To address the above issues, a novel infrared UAV target tracking method based on spatial-semantic combine perception is proposed. By capturing spatial multi-scale features and channel semantic information, the proposed approach enhances the model’s capability to track infrared UAV targets in complex backgrounds.  Methods  The proposed method comprises four main components: a backbone network, multi-scale feature fusion, template-search feature interaction, and a detection head. Initially, template and search images containing infrared UAV targets are input into a weight-sharing backbone network to extract features. Subsequently, an FPN is constructed, within which a Spatial-semantic Combine Attention Module (SCAM) is integrated to efficiently fuse multi-scale features. 
Finally, a Dual-branch global Feature interaction Module (DFM) is employed to facilitate feature interaction between the template and search branches, and the final tracking results are obtained through the detection head. The proposed SCAM enhances the network’s focus on spatial and semantic information by jointly leveraging spatial and channel attention mechanisms, thereby mitigating the loss of spatial and semantic information in low-level features caused by channel dimensionality reduction in traditional FPN. SCAM primarily consists of two components: the Spatial Multi-scale Attention module (SMA) and the Global-local Channel Semantic Attention module (GCSA). The SMA captures long-range multi-scale dependencies efficiently through axial positional embedding and multi-branch grouped feature extraction, thereby improving the network’s perception of global contextual information. GCSA adopts a dual-branch design to effectively integrate global and local information across feature channels, suppress irrelevant background noise, and enable more rational channel-wise feature weighting. The proposed DFM treats the template branch features as the query source for the search branch and applies global cross-attention to capture more comprehensive features of infrared UAV targets. This enhances the tracking network’s ability to attend to the spatial location and boundary details of infrared UAV targets.  Results and Discussions  The proposed method has been validated on the infrared UAV benchmark dataset (Anti-UAV). Quantitative analysis (Table 1) demonstrates that, compared to 10 state-of-the-art methods, the proposed approach achieves the highest average normalized precision score of 76.9%, surpassing the second-best method, LGTrack, by 4.4%. Qualitative analysis (Figs. 6–8)
further confirms that the proposed method exhibits strong adaptability and robustness when addressing various typical challenges in infrared UAV tracking, such as out-of-view targets, distracting objects, and complex backgrounds. The collaborative design of the individual modules significantly enhances the model’s ability to perceive and represent small targets and dynamic scenes. In addition, qualitative experiments (Fig. 9) conducted on a self-constructed infrared UAV tracking dataset demonstrate the effectiveness and generalization capability of the proposed method in real-world tracking scenarios. Ablation studies (Tables 2–5) reveal that integrating any individual proposed module consistently improves tracking performance.  Conclusions  This paper conducts a systematic theoretical analysis and experimental validation addressing the issue of spatial and semantic information loss in infrared UAV target tracking. Focusing on the limitations of existing FPN-based infrared UAV tracking methods, particularly the drawbacks associated with channel reduction in multi-scale low-level features, a novel infrared UAV target tracking method based on spatial-semantic combine perception is proposed, which fully leverages the complementary advantages of spatial and channel attention mechanisms. This method enhances the network’s focus on spatial context and critical semantic information, thereby improving overall tracking performance. The following main conclusions are obtained: (1) The proposed SCAM combines SMA and GCSA: SMA captures spatial long-range feature dependencies through position coordinate embedding and one-dimensional convolution operations, ensuring the acquisition of multi-scale contextual information, while GCSA achieves more comprehensive semantic feature attention by enabling interaction between local and global channel features.
(2) The designed DFM realizes feature interaction between the search-branch and template-branch features through global cross-attention, enabling the dual-branch features to complement each other and enhancing tracking performance. (3) Extensive experimental results demonstrate that the proposed algorithm outperforms existing advanced methods in both quantitative evaluation and qualitative analysis, with an average state accuracy of 0.769, a success rate of 0.743, and a precision of 0.935, achieving more accurate tracking of infrared UAV targets. Although the algorithm has been optimized for efficient use of computing resources, further research is needed on deployment strategies for embedded and mobile devices to improve real-time performance and computational adaptability.
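The global cross-attention at the heart of the DFM, with template features as the query source and search features providing keys and values, can be sketched in a few lines of numpy. Dimensions are arbitrary, and the real module additionally includes learned projections, normalization layers, and the detection head.

```python
import numpy as np

def cross_attention(template, search):
    """template: (Nt, d) queries; search: (Ns, d) keys/values."""
    d = template.shape[1]
    scores = template @ search.T / np.sqrt(d)      # (Nt, Ns) similarity
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)              # softmax over search positions
    return w @ search                              # attended features, (Nt, d)

rng = np.random.default_rng(0)
template = rng.normal(size=(16, 64))               # template-branch features
search = rng.normal(size=(100, 64))                # search-branch features
out = cross_attention(template, search)
```

Each template query attends over every search position, which is what gives the interaction its "global" character compared with local correlation windows.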
Research Progress of Deep Learning Enabled Automatic Modulation Classification Technology
ZHENG Qinghe, LI Binglin, YU Zhiguo, JIANG Weiwei, ZHU Zhengyu, XU Chi, HUANG Chongwen, GUI Guan
Available online  , doi: 10.11999/JEIT250674
Abstract:
  Significance   With the advancement of Sixth-Generation (6G) wireless communication systems towards the terahertz frequency band and space–air–ground integrated networks, the communication environment is becoming increasingly heterogeneous and densely deployed. This evolution imposes stringent precision requirements at the sub-symbol period level for Automatic Modulation Classification (AMC). Under complex channel conditions, AMC faces several challenges: feature mixing and distortion caused by time-varying multipath channels, substantial degradation in recognition accuracy of traditional methods under low Signal-to-Noise Ratio (SNR) conditions, and elevated complexity in detecting mixed modulation signals introduced by Sparse Code Multiple Access (SCMA) techniques. To address these challenges, this paper first analyzes the fundamental constraints on AMC method design from the perspective of signal transmission characteristics in communication models. It then systematically reviews Deep Learning (DL)-based AMC approaches, summarizes the difficulties these methods encounter in different wireless communication scenarios, evaluates the performance of representative DL models, and concludes with a discussion of current limitations in AMC together with promising research directions.  Progress   Current research on AMC technology under complex channel conditions mainly focuses on three methodological categories: Likelihood-Based (LB), Feature-Based (FB), and DL, emphasizing both theoretical exploration and algorithmic innovation. Among these, end-to-end DL approaches have demonstrated superior performance in AMC tasks. By stacking multiple layers of nonlinear activation functions, DL models establish strong nonlinear fitting capabilities that allow them to uncover hidden patterns in radio signals. This enables DL to achieve high robustness and accuracy in complex environments. 
Convolutional Neural Networks (CNNs), leveraging their hierarchical local perception mechanism, can effectively capture amplitude and phase distortion characteristics of modulated signals, showing distinctive advantages in spatial feature extraction. Recurrent Neural Networks (RNNs), through the temporal memory function of gated units, exhibit theoretical superiority in modeling dynamic signal impairments such as inter-symbol interference, carrier frequency offset, carrier phase offset, and timing errors. More recently, Transformer architectures have achieved global feature association modeling through self-attention mechanisms, thereby enhancing the ability to identify key features and markedly improving AMC accuracy under low SNR conditions. The application potential of Transformers in AMC can be further extended by integrating multi-scale feature fusion, optimizing computational efficiency, and improving generalization.  Prospects   With the continuous growth of communication demands and the increasing complexity of application scenarios, the efficient and reliable management and utilization of wireless spectrum resources has become a central research focus. AMC enables mobile communication systems to achieve dynamic channel adaptation and heterogeneous network integration. Driven by the development of space–air–ground integrated networks, the application scope of AMC has expanded beyond traditional terrestrial cellular systems to emerging domains such as satellite communication and vehicular networking. DL-based AMC frameworks can capture dynamic channel responses through joint time–frequency domain representations, enhance transient feature extraction via attention mechanisms, and effectively decouple the coupling effects of multipath fading and Doppler shifts. 
By applying neural architecture search and model quantization–compression techniques, DL models can achieve low-complexity, real-time inference at the edge, thereby supporting end-to-end latency control in Vehicle-to-Everything (V2X) communication links. Furthermore, advanced DL architectures introduce feature enhancement mechanisms to preserve signal phase integrity, improving resilience against channel distortion. In dynamic optical network monitoring, feature extraction networks tailored to time-varying channels can adaptively capture the evolution of nonlinear phase shifts. Through implicit channel compensation, DL enables collaborative learning of time-domain and frequency-domain features. At present, AMC technology is progressing towards elastic architectures that support dynamic reconstruction of model parameters through online knowledge distillation and meta-learning frameworks, offering adaptive and lightweight solutions for Internet-of-Things (IoT) scenarios.  Conclusions  This paper systematically reviews the current research and challenges of AMC technology in the context of 6G networks. First, the applications of CNNs, RNNs, Transformers, and hybrid DL models in AMC are discussed in detail, with analysis of the technical advantages and limitations of each approach. Next, three representative application scenarios are examined: mobile communication, optical communication, and the IoT, highlighting the specific challenges faced by AMC technology. At present, the development of DL-driven AMC has moved beyond model design to include deployment and application challenges in real wireless communication environments. For example, constructing DL architectures with continuous learning capabilities is essential for adapting to dynamic communication conditions, while developing large-scale DL models provides an effective way to improve cross-scenario generalization. 
Future research should emphasize directions that integrate prior knowledge of the physical layer with DL architectures, strengthen feature fusion strategies, and advance hardware–algorithm co-design frameworks.
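As a point of contrast with the DL models this survey reviews, the classical FB route can be illustrated with a well-known higher-order statistic: the normalized fourth-order cumulant |C40/C21²| is approximately 2 for BPSK and 1 for QPSK, so a simple threshold separates the two. The noiseless symbol streams below are an illustrative sketch, not a method from the survey.

```python
import cmath, math, random

def c40_normalized(symbols):
    """|C40 / C21^2| for a zero-mean complex symbol stream.

    C40 = M40 - 3*M20^2 (fourth-order cumulant, no conjugations),
    C21 = E[|s|^2].  Theoretically: BPSK -> 2, QPSK -> 1.
    """
    n = len(symbols)
    m20 = sum(s * s for s in symbols) / n
    m40 = sum(s ** 4 for s in symbols) / n
    c21 = sum(abs(s) ** 2 for s in symbols) / n
    return abs(m40 - 3 * m20 * m20) / (c21 * c21)

random.seed(0)
# noiseless unit-energy constellations (illustrative)
bpsk = [complex(random.choice([-1, 1]), 0) for _ in range(4000)]
qpsk = [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * random.randrange(4)))
        for _ in range(4000)]
```

Under noise and channel distortion these statistics degrade quickly, which is precisely the regime where the DL approaches surveyed above retain their advantage.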
Secrecy Rate Maximization Algorithm for IRS Assisted UAV-RSMA Systems
WANG Zhengqiang, KONG Weidong, WAN Xiaoyu, FAN Zifu, DUO Bin
Available online  , doi: 10.11999/JEIT250452
Abstract:
  Objective  Under the stringent requirements of Sixth-Generation (6G) mobile communication networks for spectral efficiency, energy efficiency, low latency, and wide coverage, Unmanned Aerial Vehicle (UAV) communication has emerged as a key solution for 6G and beyond, leveraging its Line-of-Sight propagation advantages and flexible deployment capabilities. Functioning as aerial base stations, UAVs significantly enhance network performance by improving spectral efficiency and connection reliability, demonstrating irreplaceable value in critical scenarios such as emergency communications, remote area coverage, and maritime operations. However, UAV communication systems face dual challenges in high-mobility environments: severe multi-user interference in dense access scenarios that substantially degrades system performance, alongside critical physical-layer security threats resulting from the broadcast nature and spatial openness of wireless channels that enable malicious interception of transmitted signals. Rate-Splitting Multiple Access (RSMA) mitigates these challenges by decomposing user messages into common and private streams, thereby providing a flexible interference management mechanism that balances decoding complexity with spectral efficiency. This makes RSMA especially suitable for high-density user access scenarios. In parallel, Intelligent Reflecting Surfaces (IRS) have emerged as a promising technology to dynamically reconfigure wireless propagation through programmable electromagnetic unit arrays. IRS improves the quality of legitimate links while reducing the capacity of eavesdropping links, thereby enhancing physical-layer security in UAV communications. It is noteworthy that while existing research has predominantly centered on conventional multiple access schemes, the application potential of RSMA technology in IRS-assisted UAV communication systems remains relatively unexplored. 
Against this background, this paper investigates secure transmission strategies in IRS-assisted UAV-RSMA systems.  Methods  This paper investigates the effect of eavesdroppers on the security performance of UAV communication systems and proposes an IRS-assisted RSMA-based UAV communication model. The system comprises a multi-antenna UAV base station, an IRS mounted on a building, multiple single-antenna legitimate users, and multiple single-antenna eavesdroppers. The optimization problem is formulated to maximize the system secrecy rate by jointly optimizing precoding vectors, common secrecy rate allocation, IRS phase shifts, and UAV positioning. The problem is highly non-convex due to the strong coupling among these variables, rendering direct solutions intractable. To overcome this challenge, a two-layer optimization framework is developed. In the inner layer, with UAV position fixed, an alternating optimization strategy divides the problem into two subproblems: (1) joint optimization of precoding vectors and common secrecy rate allocation and (2) optimization of IRS phase shifts. Non-convex constraints are transformed into convex forms using techniques such as Successive Convex Approximation (SCA), relaxation variables, first-order Taylor expansion, and Semidefinite Relaxation (SDR). In the outer layer, the Particle Swarm Optimization (PSO) algorithm determines the UAV deployment position based on the optimized inner-layer variables.  Results and Discussions  Simulation results show that the proposed algorithm outperforms RSMA without IRS, NOMA with IRS, and NOMA without IRS in terms of secrecy rate. (Fig. 2) illustrates that the secrecy rate increases with the number of iterations and converges under different UAV maximum transmit power levels and antenna configurations. (Fig. 3) demonstrates that increasing UAV transmit power significantly enhances the secrecy rate for both the proposed and benchmark schemes. 
This improvement arises because higher transmit power strengthens the signal received by legitimate users, increasing their achievable rates and enhancing system secrecy performance. (Fig. 4) indicates that the secrecy rate grows with the number of UAV antennas. This improvement is due to expanded signal coverage and greater spatial degrees of freedom, which amplify effective signal strength in legitimate user channels. (Fig. 5) shows that both the proposed scheme and NOMA with IRS achieve higher secrecy rates as the number of IRS reflecting elements increases. The additional elements provide greater spatial degrees of freedom, improving channel gains for legitimate users and strengthening resistance to eavesdropping. In contrast, benchmark schemes operating without IRS assistance exhibit no performance improvement and maintain a constant secrecy rate. This result highlights the critical role of the IRS in enabling secure communications. Finally, (Fig. 6) demonstrates the optimal UAV position when $P_{\max} = 30\text{ dBm}$. Deploying the UAV near the center of legitimate users and adjacent to the IRS minimizes the average distance to users, thereby reducing path loss and fully exploiting IRS passive beamforming. This placement strengthens legitimate signals while suppressing the eavesdropping link, leading to enhanced secrecy performance.  Conclusions  This study addresses secure communication scenarios with multiple eavesdroppers by proposing an IRS-assisted secure resource allocation algorithm for UAV-enabled RSMA systems. An optimization problem is formulated to maximize the system secrecy rate under multiple constraints, including UAV transmit power, by jointly optimizing precoding vectors, common rate allocation, IRS configurations, and UAV positioning. Due to the non-convex nature of the problem, a hierarchical optimization framework is developed to decompose it into two subproblems. 
These are effectively solved using techniques such as SCA, SDR, Gaussian randomization, and PSO. Simulation results confirm that the proposed algorithm achieves substantial secrecy rate gains over three benchmark schemes, thereby validating its effectiveness.
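The outer-layer PSO step described above can be sketched as follows. The inner-layer solver (SCA/SDR over precoders, rate allocation, and phase shifts) is abstracted into a toy objective with a hypothetical optimum at the user centroid (30, 40), so everything except the standard PSO update rule is an illustrative assumption.

```python
import random

def pso(objective, bounds, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm maximizing `objective` over a box.

    In the paper's framework, `objective(position)` would run the full
    inner-layer alternating optimization for a candidate UAV position.
    """
    random.seed(1)
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp to the feasible deployment region
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]), bounds[d][1])
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# toy stand-in for the inner layer: secrecy surrogate peaking at the user centroid
surrogate = lambda p: -((p[0] - 30.0) ** 2 + (p[1] - 40.0) ** 2)
best, _ = pso(surrogate, bounds=[(0.0, 100.0), (0.0, 100.0)])
```

Because PSO only needs objective evaluations, it tolerates the non-convex, simulation-like dependence of the secrecy rate on UAV position that defeats gradient-based outer loops.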
BIRD1445: Large-scale Multimodal Bird Dataset for Ecological Monitoring
WANG Hongchang, XIAN Fengyu, XIE Zihui, DONG Miaomiao, JIAN Haifang
Available online  , doi: 10.11999/JEIT250647
Abstract:
  Objective  With the rapid advancement of Artificial Intelligence (AI) and growing demands in ecological monitoring, high-quality multimodal datasets have become essential for training and deploying AI models in specialized domains. Existing bird datasets, however, face notable limitations, including challenges in field data acquisition, high costs of expert annotation, limited representation of rare species, and reliance on single-modal data. To overcome these constraints, this study proposes an efficient framework for constructing large-scale multimodal datasets tailored to ecological monitoring. By integrating heterogeneous data sources, employing intelligent semi-automatic annotation pipelines, and adopting multi-model collaborative validation based on heterogeneous attention fusion, the proposed approach markedly reduces the cost of expert annotation while maintaining high data quality and extensive modality coverage. This work offers a scalable and intelligent strategy for dataset development in professional settings and provides a robust data foundation for advancing AI applications in ecological conservation and biodiversity monitoring.  Methods  The proposed multimodal dataset construction framework integrates multi-source heterogeneous data acquisition, intelligent semi-automatic annotation, and multi-model collaborative verification to enable efficient large-scale dataset development. The data acquisition system comprises distributed sensing networks deployed across natural reserves, incorporating high-definition intelligent cameras, custom-built acoustic monitoring devices, and infrared imaging systems, supplemented by standardized public data to enhance species coverage and modality diversity. 
The intelligent annotation pipeline is built upon four core automated tools: (1) spatial localization annotation leverages object detection algorithms to generate bounding boxes; (2) fine-grained classification employs Vision Transformer models for hierarchical species identification; (3) pixel-level segmentation combines detection outputs with SegGPT models to produce instance-level masks; and (4) multimodal semantic annotation uses Qwen large language models to generate structured textual descriptions. To ensure annotation quality and minimize manual verification costs, a multi-scale attention fusion verification mechanism is introduced. This mechanism integrates seven heterogeneous deep learning models, each with different feature perception capacities across local detail, mid-level semantic, and global contextual scales. A global weighted voting module dynamically assigns fusion weights based on model performance, while a prior knowledge-guided fine-grained decision module applies category-specific accuracy metrics and Top-K model selection to enhance verification precision and computational efficiency.  Results and Discussions  The proposed multi-scale attention fusion verification method dynamically assesses data quality based on heterogeneous model predictions, forming the basis for automated annotation validation. Through optimized weight allocation and category-specific verification strategies, the collaborative verification framework evaluates the effect of different model combinations on annotation accuracy. Experimental results demonstrate that the optimal verification strategy—achieved by integrating seven specialized models—outperforms all baseline configurations across evaluation metrics. Specifically, the method attains a Top-1 accuracy of 95.39% on the CUB-200-2011 dataset, exceeding the best-performing single-model baseline, which achieves 91.79%, thereby yielding a 3.60% improvement in recognition precision. 
The constructed BIRD1445 dataset, comprising 3.54 million samples spanning 1,445 bird species and four modalities, outperforms existing datasets in terms of coverage, quality, and annotation accuracy. It serves as a robust benchmark for fine-grained classification, density estimation, and multimodal learning tasks in ecological monitoring.  Conclusions  This study addresses the challenge of constructing large-scale multimodal datasets for ecological monitoring by integrating multi-source data acquisition, intelligent semi-automatic annotation, and multi-model collaborative verification. The proposed approach advances beyond traditional manual annotation workflows by incorporating automated labeling pipelines and heterogeneous attention fusion mechanisms as the core quality control strategy. Comprehensive evaluations on benchmark datasets and real-world scenarios demonstrate the effectiveness of the method: (1) the verification strategy improves annotation accuracy by 3.60% compared to single-model baselines on the CUB-200-2011 dataset; (2) optimal trade-offs between precision and computational efficiency are achieved using Top-K = 3 model selection, based on performance–complexity alignment; and (3) in large-scale annotation scenarios, the system ensures high reliability across 1,445 species categories. Despite its effectiveness, the current approach primarily targets species with sufficient data. Future work should address the representation of rare and endangered species by incorporating advanced data augmentation and few-shot learning techniques to mitigate the limitations posed by long-tail distributions.
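The global weighted voting with Top-K model selection used in the verification mechanism can be sketched as follows. The label names, the example weights, and the use of per-model validation accuracy as the fusion weight are illustrative assumptions, not values from the paper.

```python
def weighted_vote(predictions, weights, top_k=None):
    """Global weighted voting over heterogeneous model outputs.

    predictions: list of per-model probability dicts {label: prob}
    weights:     per-model fusion weights (e.g. validation accuracy)
    top_k:       if set, only the top_k highest-weighted models vote,
                 mirroring the Top-K model-selection step.
    """
    voters = sorted(range(len(weights)), key=lambda i: -weights[i])
    if top_k is not None:
        voters = voters[:top_k]
    scores = {}
    for i in voters:
        for label, p in predictions[i].items():
            scores[label] = scores.get(label, 0.0) + weights[i] * p
    return max(scores, key=scores.get)

# three hypothetical models disagreeing on a two-way fine-grained decision
preds = [
    {"magpie": 0.6, "crow": 0.4},
    {"magpie": 0.2, "crow": 0.8},
    {"magpie": 0.7, "crow": 0.3},
]
weights = [0.95, 0.60, 0.90]
label_all = weighted_vote(preds, weights)            # all three models vote
label_top2 = weighted_vote(preds, weights, top_k=2)  # only the two strongest
```

Restricting the vote to the Top-K strongest models for a given category is what trades a small amount of ensemble diversity for the computational-efficiency gain the paper reports at Top-K = 3.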
A Waveform Design for Integrated Radar and Jamming Based on Intra-Pulse and Inter-Pulse Multiple-Phase Modulation
ZHANG Shiyuan, LU Xingyu, YAN Huabin, YANG Jianchao, TAN Ke, GU Hong
Available online  , doi: 10.11999/JEIT250600
Abstract:
  Objective  An integrated radar-jamming waveform employing multiple-phase modulation both within pulses (intra-pulse) and between pulses (inter-pulse) is proposed. The design increases the degrees of freedom in waveform synthesis compared with existing integrated signals, thereby improving joint performance in detection and jamming. In detection, phase compensation and complementary synthesis of received echoes are used to reconstruct a Linear Frequency Modulation (LFM) waveform, preserving the range resolution and ambiguity characteristics of LFM. In jamming, multi-parameter control of phase both across and within pulses allows flexible adjustment of the jamming energy distribution in the adversary’s range-Doppler map, enabling targeted energy allocation and concealment strategies. Simulation and experimental results show that the proposed waveform enhances overall detection and jamming performance relative to conventional integrated designs.  Methods  An integrated waveform that combines intra-pulse and inter-pulse multi-phase modulation is proposed. Carefully designed inter-pulse phase perturbations are introduced to prevent jamming energy from concentrating at zero Doppler and to allow precise control of the Doppler distribution of the jamming signal. During echo processing, the inter-pulse perturbations are removed by phase compensation so that inter-pulse complementarity reconstructs a continuous LFM waveform, thereby preserving detection performance. Each pulse is encoded with a binary phase-coded sequence, and additional phase modulation is applied between pulses. The resulting waveform has multiple tunable parameters and increased degrees of freedom, achieves low-sidelobe detection comparable to LFM, and permits flexible allocation of jamming energy across the range-Doppler plane.  Results and Discussions  The proposed integrated waveform is evaluated through simulations and practical experiments. 
Detection performance is significantly enhanced, with the Signal-to-Clutter-Noise Ratio (SCNR) for moving-target detection reaching 63.46 dB, representing a 25.25 dB improvement over conventional integrated waveforms and only 3.57 dB lower than that of a reference LFM signal (67.03 dB). These findings demonstrate that phase compensation and inter-pulse complementarity effectively enhance target detectability. Jamming performance is governed by the range of inter-pulse random phase perturbations. When the perturbation range is 0°, jamming energy is concentrated in the zero-Doppler main lobe, resulting in limited target masking. Expanding the range to ±90° flattens the Doppler spectrum and substantially weakens the target signature. Further extending the range to ±180° eliminates the zero-frequency main peak and achieves near-uniform diffusion of jamming energy across the Doppler domain. Therefore, by varying the inter-pulse phase range, continuous adjustment between concentrated and distributed jamming energy allocation is achieved. Overall, the waveform maintains detection performance comparable to that of optimal LFM signals while enabling flexible, parameterized control of jamming energy distribution. This design provides an adaptable solution for integrated radar-jamming systems that achieves a balance between efficient detection and adaptive jamming capability.  Conclusions  This study is based on a previously proposed integrated radar-jamming waveform and focuses on solving the problem of uneven jamming energy distribution in the unoptimized design. An integrated radar-jamming waveform based on combined intra-pulse and inter-pulse multiple-phase modulation is proposed by introducing random phase modulation between pulses. The proposed waveform achieves detection performance comparable to that of LFM signals and provides flexible control of jamming effects through multiple adjustable parameters, offering high design freedom. 
Theoretical analysis shows that intra-pulse modulation alone is insufficiently adaptable. The addition of random inter-pulse phases with variable distribution ranges enables more precise regulation of jamming energy diffusion. Simulation results indicate that increasing the range of inter-pulse phase perturbation leads to progressively wider diffusion of jamming energy, while detection performance remains similar to that of LFM. Therefore, by adjusting the distribution range of inter-pulse phases, the jamming energy pattern can be flexibly shaped, providing greater degrees of freedom in waveform design. Experimental results verify that the proposed waveform exhibits good overall performance in both detection and jamming. However, its practical application remains limited by specific operational conditions, which will be addressed in future studies.
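Why phase compensation preserves detection while random inter-pulse phases spread jamming energy can be seen in a toy calculation: the cooperative receiver, which knows the ±180° perturbation sequence, removes it before coherent integration, whereas an uncompensated sum of the same pulses collapses like a random walk. Unit-amplitude echoes are an illustrative simplification.

```python
import cmath, math, random

random.seed(0)
n_pulses = 64
# random inter-pulse phase perturbations drawn from the full +/-180 deg range
phases = [random.uniform(-math.pi, math.pi) for _ in range(n_pulses)]
echoes = [cmath.exp(1j * p) for p in phases]  # one unit-amplitude echo per pulse

# adversary's view: no knowledge of the phases, so coherent integration
# collapses and the energy diffuses across the Doppler domain
uncomp = abs(sum(echoes))

# cooperative receiver: compensate each pulse with the known conjugate phase,
# restoring full coherent gain (magnitude n_pulses)
comp = abs(sum(e * cmath.exp(-1j * p) for e, p in zip(echoes, phases)))
```

The compensated sum reaches the full coherent gain of 64, while the uncompensated sum stays near the random-walk scale of roughly the square root of 64, which is the asymmetry the integrated waveform exploits.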
Modeling, Detection, and Defense Theories and Methods for Cyber-Physical Fusion Attacks in Smart Grid
WANG Wenting, TIAN Boyan, WU Fazong, HE Yunpeng, WANG Xin, YANG Ming, FENG Dongqin
Available online  , doi: 10.11999/JEIT250659
Abstract:
  Significance   Smart Grid (SG), the core of modern power systems, enables efficient energy management and dynamic regulation through cyber–physical integration. However, its high interconnectivity makes it a prime target for cyberattacks, including False Data Injection Attacks (FDIAs) and Denial-of-Service (DoS) attacks. These threats jeopardize the stability of power grids and may trigger severe consequences such as large-scale blackouts. Therefore, advancing research on the modeling, detection, and defense of cyber–physical attacks is essential to ensure the safe and reliable operation of SGs.  Progress   Significant progress has been achieved in cyber–physical security research for SGs. In attack modeling, discrete linear time-invariant system models effectively capture diverse attack patterns. Detection technologies are advancing rapidly, with physical-based methods (e.g., physical watermarking and moving target defense) complementing intelligent algorithms (e.g., deep learning and reinforcement learning). Defense systems are also being strengthened: lightweight encryption and blockchain technologies are applied to prevention, security-optimized Phasor Measurement Unit (PMU) deployment enhances equipment protection, and response mechanisms are being continuously refined.  Conclusions  Current research still requires improvement in attack modeling accuracy and real-time detection algorithms. Future work should focus on developing collaborative protection mechanisms between the cyber and physical layers, designing solutions that balance security with cost-effectiveness, and validating defense effectiveness through high-fidelity simulation platforms. This study establishes a systematic theoretical framework and technical roadmap for SG security, providing essential insights for safeguarding critical infrastructure.  
  Prospects   Future research should advance in several directions: (1) deepening synergistic defense mechanisms between the cyber and physical layers; (2) prioritizing the development of cost-effective security solutions; (3) constructing high-fidelity cyber–physical simulation platforms to support research; and (4) exploring the application of emerging technologies such as digital twins and interpretable Artificial Intelligence (AI).
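The FDIA threat discussed above has a well-known stealth condition under DC state estimation: an attack vector a = Hc lies in the column space of the measurement matrix H, so it leaves the bad-data residual unchanged while biasing the state estimate by exactly c. A minimal sketch with a toy 3-measurement, 2-state model (all matrices and numbers illustrative):

```python
def lstsq2(H, z):
    """Least-squares state estimate for a 2-state DC model:
    solve (H^T H) x = H^T z by hand (2x2 normal equations)."""
    a = sum(h[0] * h[0] for h in H); b = sum(h[0] * h[1] for h in H)
    c = b;                           d = sum(h[1] * h[1] for h in H)
    r0 = sum(h[0] * zi for h, zi in zip(H, z))
    r1 = sum(h[1] * zi for h, zi in zip(H, z))
    det = a * d - b * c
    return [(d * r0 - b * r1) / det, (a * r1 - c * r0) / det]

def residual_norm(H, z):
    """Norm of the measurement residual used by bad-data detection."""
    x = lstsq2(H, z)
    return sum((zi - (h[0] * x[0] + h[1] * x[1])) ** 2
               for h, zi in zip(H, z)) ** 0.5

H = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]   # toy measurement matrix
z = [1.02, 0.49, 0.52]                       # toy noisy measurements
c_vec = [0.3, -0.2]                          # attacker's chosen state bias
a = [h[0] * c_vec[0] + h[1] * c_vec[1] for h in H]   # stealthy FDIA: a = H c
z_attacked = [zi + ai for zi, ai in zip(z, a)]

r_clean = residual_norm(H, z)
r_attacked = residual_norm(H, z_attacked)    # identical residual: undetected
```

Because the residual is unchanged, residual-based detectors cannot see the attack, which is why the physical-side methods mentioned above (physical watermarking, moving target defense, secure PMU placement) are needed as complements.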
Multi-modal Joint Automatic Modulation Recognition Method Towards Low SNR Sequences
WANG Zhen, LIU Wei, LU Wanjie, NIU Chaoyang, LI Runsheng
Available online  , doi: 10.11999/JEIT250594
Abstract:
  Objective  The rapid evolution of data-driven intelligent algorithms and the rise of multi-modal data indicate that the future of Automatic Modulation Recognition (AMR) lies in joint approaches that integrate multiple domains, use multiple frameworks, and connect multiple scales. However, the embedding spaces of different modalities are heterogeneous, and existing models lack cross-modal adaptive representation, limiting their ability to achieve collaborative interpretation. To address this challenge, this study proposes a performance-interpretable two-stage deep learning–based AMR (DL-AMR) method that jointly models the signal in the time and transform domains. The approach explicitly and implicitly represents signals from multiple perspectives, including temporal, spatial, frequency, and intensity dimensions. This design provides theoretical support for multi-modal AMR and offers an intelligent solution for modeling low Signal-to-Noise Ratio (SNR) time sequences in open environments.  Methods  The proposed AMR network begins with a preprocessing stage, where the input signal is represented as an in-phase and quadrature (I–Q) sequence. After wavelet thresholding denoising, the signal is converted into a dual-channel representation, with one channel undergoing the Short-Time Fourier Transform (STFT). This preprocessing yields a dual-stream representation comprising both time-domain and transform-domain signals. The signal is then tokenized through time-domain and transform-domain encoders. In the first stage, explicit modal alignment is performed. The token sequences from the time and transform domains are input in parallel into a contrastive learning module, which explicitly captures and strengthens correlations between the two modalities in dimensions such as temporal structure and amplitude. The learned features are then passed into the feature fusion module. 
Bidirectional Long Short-Term Memory (BiLSTM) and local representation layers are employed to capture temporally sparse features, enabling subsequent feature decomposition and reconstruction. To refine feature extraction, a subspace attention mechanism is applied to the high-dimensional sparse feature space, allowing efficient capture of discriminative information contained in both high-frequency and low-frequency components. Finally, Convolutional Neural Network – Kolmogorov-Arnold Network (CNN-KAN) layers replace traditional multilayer perceptrons as classifiers, thereby enhancing classification performance under low SNR conditions.  Results and Discussions  The proposed method is experimentally validated on three datasets: RML2016.10a, RML2016.10b, and HisarMod2019.1. Under high SNR conditions (SNR > 0 dB), classification accuracies of 93.36%, 93.13%, and 93.37% are achieved on the three datasets, respectively. Under low SNR conditions, where signals are severely corrupted or blurred by noise, recognition performance decreases but remains robust. When the SNR ranges from –6 dB to 0 dB, overall accuracies of 78.36%, 80.72%, and 85.43% are maintained, respectively. Even at SNR levels below –6 dB, accuracies of 17.10%, 21.30%, and 29.85% are obtained. At particularly challenging low-SNR levels, the model still achieves 43.45%, 44.54%, and 60.02%. Compared with traditional approaches, and while maintaining a low parameter count (0.33–0.41 M), the proposed method improves average recognition accuracy by 2.12–7.89%, 0.45–4.64%, and 6.18–9.53% on the three datasets. The improvements under low SNR conditions are especially significant, reaching 4.89–12.70% (RML2016.10a), 2.62–8.72% (RML2016.10b), and 4.96–11.63% (HisarMod2019.1). 
The results indicate that explicit modeling of time–transform domain correlations through contrastive learning, combined with the hybrid architecture consisting of LSTM for temporal sequence modeling, CNN for local feature extraction, and KAN for nonlinear approximation, substantially enhances the noise robustness of the model.  Conclusions  This study proposes a two-stage AMR method based on time–transform domain multimodal fusion. Explicit multimodal alignment is achieved through contrastive learning, while temporal and local features are extracted using a combination of LSTM and CNN. The KAN is used to enhance nonlinear modeling, enabling implicit feature-level multimodal fusion. Experiments conducted on three benchmark datasets demonstrate that, compared with classical methods, the proposed approach improves recognition accuracy by 2.62–11.63% within the SNR range of –20 to 0 dB, while maintaining a similar number of parameters. The performance gains are particularly significant under low-SNR conditions, confirming the effectiveness of multimodal joint modeling for robust AMR.
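The explicit modal alignment stage can be illustrated with an InfoNCE-style contrastive loss between paired time-domain and transform-domain embeddings: the i-th pair from the same signal is the positive, all other batch pairings are negatives. The toy embeddings, the temperature, and the one-directional form are illustrative assumptions rather than the paper's exact formulation.

```python
import math

def info_nce(time_feats, tf_feats, temperature=0.1):
    """InfoNCE-style alignment loss between time-domain and transform-domain
    embeddings of the same batch (the i-th pair is the positive)."""
    def norm(v):
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]
    A = [norm(v) for v in time_feats]
    B = [norm(v) for v in tf_feats]
    n = len(A)
    loss = 0.0
    for i in range(n):
        # cosine similarities of time-token i against every transform token
        sims = [sum(x * y for x, y in zip(A[i], B[j])) / temperature
                for j in range(n)]
        m = max(sims)                                   # stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in sims))
        loss += -(sims[i] - log_z)   # -log softmax at the positive pair
    return loss / n

aligned = info_nce([[1, 0], [0, 1]], [[1, 0], [0, 1]])    # matched pairs
shuffled = info_nce([[1, 0], [0, 1]], [[0, 1], [1, 0]])   # mismatched pairs
```

Minimizing this loss pulls the two modality embeddings of the same signal together, which is the explicit cross-modal correlation the first stage is designed to strengthen before feature-level fusion.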
UAV-Assisted Intelligent Data Collection and Computation Offloading for Railway Wireless Sensor Networks
YAN Li, WANG Junkai, FANG Xuming, LIN Wei, LIANG Yiqun
Available online  , doi: 10.11999/JEIT250340
Abstract:
  Objective  Ensuring the safety and stability of train operations is essential in the advancement of railway intelligence. The growing maturity of Wireless Sensor Network (WSN) technology offers an efficient, reliable, low-cost, and easily deployable approach to monitoring railway operating conditions. However, in complex and dynamic maintenance environments, WSNs encounter several challenges, including weak signal coverage at monitoring sites, limited accessibility for tasks such as sensor node battery replacement, and the generation of large volumes of monitoring data. To address these issues, this study proposes a multi-Unmanned Aerial Vehicle (UAV)-assisted method for data collection and computation offloading in railway WSNs. This approach enhances overall system energy efficiency and data freshness, offering a more effective and robust solution for railway safety monitoring.  Methods  An intelligent data collection and computation offloading system is constructed for multi-UAV-assisted railway WSNs. UAV flight constraints within railway safety protection zones are considered, and wireless sensing services are prioritized to ensure preferential transmission for safety-critical tasks. To balance energy consumption and data freshness, the system optimization objective is defined as the weighted sum of UAV energy consumption, WSN energy consumption, and the Age of Information (AoI). A joint optimization algorithm based on Multi-Agent Soft Actor-Critic (MASAC) is proposed, which balances exploration and exploitation through entropy regularization and adaptive temperature parameters. This approach enables efficient joint optimization of UAV trajectories and computation offloading strategies.  Results and Discussions  (1) Compared with the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), MASAC-Greedy, and MASAC-AOU algorithms, the MASAC-based scheme converges more rapidly and demonstrates greater stability (Fig. 4), ultimately achieving the highest reward. 
In contrast, MADDPG exhibits slower learning and less stable performance. (2) The comparison of multi-UAV flight trajectories under different algorithms shows that the proposed MASAC algorithm enables effective collaboration among UAVs, with each responsible for monitoring distinct regions while strictly adhering to railway safety protection zone constraints (Fig. 5). (3) The MASAC algorithm yields the best objective function value across all evaluated algorithms (Fig. 6). (4) As the number of sensors and the AoI weight increase, UAV energy consumption rises for all algorithms; however, the MASAC algorithm consistently maintains the lowest energy consumption (Fig. 7). (5) In terms of sensor node energy consumption, MADDPG achieves the lowest value, but at the expense of information freshness (Fig. 8). (6) Regarding average AoI performance, the MASAC algorithm performs best across a range of sensor densities and AoI weight settings, with the greatest improvements observed under higher AoI weight conditions (Fig. 9). (7) The AoI performance comparison by sensor type (Table 2) confirms that the system effectively supports priority-based data collection services.  Conclusions  This study proposes a MASAC-based intelligent data collection and computation offloading scheme for railway WSNs supported by multiple UAVs, addressing critical challenges such as limited WSN battery life and the high real-time computational demands of complex railway environments. The proposed algorithm jointly optimizes UAV flight trajectories and computation offloading strategies by integrating considerations of UAV and WSN energy consumption, data freshness, sensing service priorities, and railway safety protection zone constraints. The optimization objective is to minimize the weighted sum of average UAV energy consumption, average WSN energy consumption, and average WSN AoI. 
Simulation results demonstrate that the proposed scheme outperforms baseline algorithms across multiple performance metrics. Specifically, it achieves faster convergence, efficient multi-UAV collaboration that avoids resource redundancy and spatial overlap, and superior results in UAV energy consumption, sensor node energy consumption, and average AoI.
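The weighted-sum objective at the core of this scheme (UAV energy, WSN energy, and AoI) can be illustrated with a minimal sketch. The per-slot AoI update rule, the weights, and all numbers below are hypothetical placeholders for illustration, not values or formulations from the paper:

```python
def update_aoi(aoi, collected):
    """Age of Information: resets to 1 for sensors whose data was
    collected this slot, otherwise grows by one slot."""
    return [1 if c else a + 1 for a, c in zip(aoi, collected)]

def weighted_cost(e_uav, e_wsn, aoi, w=(0.4, 0.3, 0.3)):
    """Weighted sum of average UAV energy, average WSN energy, and
    average AoI; an RL reward would be its negative."""
    avg = lambda xs: sum(xs) / len(xs)
    return w[0] * avg(e_uav) + w[1] * avg(e_wsn) + w[2] * avg(aoi)

# toy rollout over two slots for three sensors and two UAVs
aoi = [1, 1, 1]
aoi = update_aoi(aoi, [False, True, False])
aoi = update_aoi(aoi, [True, False, False])
cost = weighted_cost([120.0, 90.0], [0.5, 0.8, 0.6], aoi)
```

Minimizing this cost trades UAV flight energy against sensor energy and data freshness, which is the balance the MASAC agents learn jointly with trajectories and offloading decisions.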
Research and Design of a Ballistocardiogram-Based Heart Rate Variability (HRV) Monitoring Device Integrated into Pilot Helmets
ZHAO Yanpeng, LI Falin, LI Xuan, YU Haibo, CAO Zhengtao, ZHANG Yi
Available online  , doi: 10.11999/JEIT250342
Abstract:
  Objective  Conventional Heart Rate Variability (HRV) monitoring in aviation is limited by bulky wearable devices that require direct skin contact, are prone to electromagnetic interference during flight, and suffer from electrode displacement during high-G maneuvers. These constraints hinder continuous physiological monitoring, which is critical for flight safety. This study presents a non-contact monitoring approach integrated into pilot helmets, utilizing BallistoCardioGram (BCG) technology to detect cardiac mechanical activity via helmet-mounted inertial sensors. The objective is to establish a novel physiological monitoring paradigm that eliminates the need for skin–electrode interfaces while achieving measurement accuracy suitable for aviation operational standards.  Methods  Hardware Configuration: A patented BCG sensing module is embedded within the occipital stabilization system of flight protective helmets. Miniaturized, high-sensitivity inertial sensors interface with proprietary signal conditioning circuits that execute a three-stage physiological signal refinement process. First, primary analog amplification scales microvolt-level inputs to measurable voltage ranges. Second, a fourth-order Butterworth bandpass filter (0.5–20 Hz) isolates cardiac mechanical signatures. Third, analog-to-digital conversion quantizes the signals at a 250 Hz sampling rate. Physical integration complies with military equipment standards for helmet structural integrity and ergonomic performance, ensuring full compatibility with existing flight gear without compromising protection or pilot comfort during extended missions. Computational Framework: A multi-layer signal processing architecture is implemented to extract physiological features. Raw BCG signals undergo five-level discrete wavelet transformation using Daubechies-4 basis functions, effectively separating cardiac components from respiratory modulation and motion-induced artifacts. 
J-wave identification is achieved through dual-threshold detection: morphological amplitudes exceeding three times the local baseline standard deviation and temporal positioning within 200 ms sliding analysis windows. Extracted J–J intervals are treated as functional analogs of ElectroCardioGram (ECG)-derived R–R intervals. Time-domain HRV metrics are computed as follows: (1) Standard Deviation of NN intervals (SDNN), representing overall autonomic modulation; (2) Root Mean Square of Successive Differences (RMSSD), indicating parasympathetic activity; (3) Percentage of adjacent intervals differing by more than 50 ms (pNN50). Frequency-domain analysis applies Fourier transformation to quantify Low-Frequency (LF: 0.04–0.15 Hz) and High-Frequency (HF: 0.15–0.4 Hz) spectral powers. The LF/HF ratio is used to assess sympathetic–parasympathetic balance. The entire processing pipeline is optimized for real-time execution under in-flight operational conditions.  Results and Discussions  System validation is conducted under simulated flight conditions to evaluate physiological monitoring performance. Signal acquisition is found to be reliable across static, turbulent, and high-G scenarios, with consistent capture of BCG waveforms. Quantitative comparisons with synchronized ECG recordings show strong agreement between measurement modalities: (1) SDNN: 95.80%; (2) RMSSD: 94.08%; (3) LF/HF ratio: 92.86%. These results demonstrate that the system achieves physiological measurement equivalence to established clinical standards. Artifact suppression is effectively performed by the wavelet-based signal processing framework, which maintains waveform integrity under conditions of aircraft vibration and rapid gravitational transition—conditions where conventional ECG monitoring often fails. Among tested sensor placements, the occipital position exhibits the highest signal-to-noise ratio. 
Operational stability is maintained during continuous 6-hour monitoring sessions, with no observed signal degradation. This long-duration robustness indicates suitability for extended flight operations. Validation results indicate that the BCG-based approach addresses three primary limitations associated with ECG systems in aviation. The removal of electrode–skin contact mitigates the risk of contact dermatitis during prolonged wear. Non-contact sensing eliminates susceptibility to electromagnetic interference generated by radar and communication systems. Furthermore, mechanical coupling ensures signal continuity during abrupt gravitational changes, which typically displace ECG electrodes and cause signal dropout. The wavelet decomposition method is particularly effective in attenuating rotorcraft harmonic vibrations and turbulence-induced high-frequency noise. Autonomic nervous system modulation is reliably captured through pulse transit time variability, which aligns with neurocardiac regulation indices derived from ECG. Two operational considerations are identified. First, respiratory coupling under hyperventilation may introduce artifacts that require additional filtering. Second, extreme cervical flexion exceeding 45 degrees may degrade signal quality, indicating the potential benefit of redundant sensor configurations under such conditions.  Conclusions  This study establishes a functional, helmet-integrated BCG monitoring system capable of delivering medical-grade HRV metrics without compromising flight safety protocols. The technology represents a shift from contact-based to non-contact physiological monitoring in aviation settings. 
Future system development will incorporate: (1) Infrared eye-tracking modules to assess blink interval variability for objective fatigue evaluation; (2) Dry-contact electroencephalography sensors to quantify prefrontal cortex activity and assess cognitive workload; (3) Multimodal data fusion algorithms to generate unified indices of physiological strain. The integrated framework aims to enable real-time pilot state awareness during critical operations such as aerial combat maneuvers, hypoxia exposure, and emergency responses. Further technology maturation will prioritize operational validation across diverse aircraft platforms and environmental conditions. System implementation remains fully compliant with military equipment specifications and is positioned for future translation to commercial aviation and human factors research. Broader applications include astronaut physiological monitoring during spaceflight missions and enhanced safety systems in high-performance motorsports.
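The time-domain HRV metrics named above (SDNN, RMSSD, pNN50) follow standard definitions over the J–J intervals. A minimal sketch using the conventional 50 ms pNN threshold and sample standard deviation for SDNN; the interval values are hypothetical and the paper's exact estimator details are not reproduced here:

```python
import statistics

def hrv_time_domain(jj_ms):
    """Time-domain HRV metrics from J-J intervals (milliseconds),
    treated as analogs of ECG R-R intervals."""
    diffs = [b - a for a, b in zip(jj_ms, jj_ms[1:])]
    sdnn = statistics.stdev(jj_ms)                 # SD of all intervals
    rmssd = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    return sdnn, rmssd, pnn50

# hypothetical intervals around 72 bpm with two large beat-to-beat swings
intervals = [820, 835, 810, 900, 830, 845]
sdnn, rmssd, pnn50 = hrv_time_domain(intervals)
```

Higher RMSSD and pNN50 indicate stronger beat-to-beat (parasympathetic) variability, which is why these are the metrics compared against synchronized ECG in the validation above.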
Mutualistic Backscatter NOMA Method for Coordinated Direct and Relay Transmission System
XU Yao, HU Rongfei, JIA Shaobo, LI Bo, WANG Gang, ZHANG Zhizhong
Available online  , doi: 10.11999/JEIT250405
Abstract:
  Objective  The exponential growth in data traffic necessitates that cellular Internet of Things (IoT) systems achieve both ultra-high spectral efficiency and wide-area coverage to meet the stringent service requirements of vertical applications such as industrial automation and smart cities. The Non-Orthogonal Multiple Access-based Coordinated Direct and Relay Transmission (NOMA-CDRT) method can enhance both spectral efficiency and coverage by leveraging power-domain multiplexing and cooperative relaying, making it a promising approach to address these challenges. However, existing NOMA-CDRT frameworks are primarily designed for cellular communications and do not effectively support spectrum sharing or the deep integration of cellular and IoT transmissions. To overcome these limitations, this study proposes a Mutualistic Backscatter NOMA-CDRT (MB-NOMA-CDRT) method. This approach facilitates spectrum sharing and mutualistic coexistence between cellular users and IoT devices, while improving the system’s Ergodic Sum Rate (ESR).  Methods  The proposed MB-NOMA-CDRT method integrates backscatter modulation and power-domain superposition coding to develop a bidirectional communication strategy that unifies information transmission and cooperative assistance, enabling spectrum sharing and mutualistic coexistence between cellular users and IoT devices. Specifically, the base station uses downlink NOMA to serve the cellular center user directly and the cellular edge user via a relaying user. Simultaneously, IoT devices utilize cellular radio frequency signals and backscatter modulation to transmit their data to the base station, thereby achieving spectrum sharing. The backscattered IoT signals act as multipath gains, contributing to improved cellular communication quality. 
To rigorously characterize the system performance, the squared generalized-K distribution and Meijer-G functions are adopted to derive closed-form expressions for the ESR under both perfect and imperfect Successive Interference Cancellation (SIC). Building on this analytical foundation, a power allocation optimization scheme is developed using an enhanced Particle Swarm Optimization (PSO) algorithm to maximize system ESR. Finally, extensive Monte Carlo simulations are conducted to verify the ESR gains of the proposed method, confirm the theoretical analysis, and demonstrate the efficacy of the optimization scheme.  Results and Discussions  The performance advantage of the proposed MB-NOMA-CDRT method is demonstrated through comparisons of ESR with conventional NOMA-CDRT and Orthogonal Multiple Access (OMA) schemes (Fig. 2 and Fig. 3). The theoretical ESR results closely match the simulation data, confirming the validity of the analytical derivations. Under both perfect and imperfect SIC, the proposed method consistently achieves the highest ESR. This improvement arises from spectrum sharing between cellular users and IoT devices, where the IoT link contributes multipath gain to the cellular link, thereby enhancing overall system performance. To investigate the influence of power allocation, simulation results illustrate the three-dimensional relationship between ESR and power allocation coefficients (Fig. 4). A maximum ESR is observed under specific coefficient combinations, indicating that optimized power allocation can significantly improve system throughput. Furthermore, the proposed optimization scheme demonstrates rapid convergence, with ESR values stabilizing within a few iterations (Fig. 5), supporting its computational efficiency. Finally, ESR performance is compared among the proposed optimization scheme, exhaustive search, and fixed power allocation strategies (Fig. 6). 
The proposed scheme consistently yields higher ESR across both perfect and imperfect SIC scenarios, demonstrating its superiority in enhancing spectral efficiency while maintaining low computational complexity.  Conclusions  This study proposes an MB-NOMA-CDRT method that enables spectrum sharing between IoT devices and cellular users while improving cellular communication quality through the backscatter-assisted reflection link. To evaluate system performance, closed-form expressions for the ESR are derived under both perfect and imperfect SIC. Building on this analytical foundation, a power allocation optimization scheme based on PSO is developed to maximize the system ESR. Simulation results demonstrate that the proposed method consistently outperforms conventional NOMA-CDRT and OMA schemes in terms of ESR, under both perfect and imperfect SIC conditions. The optimization scheme also exhibits favorable convergence behavior and effectively improves system performance. Given its advantages in spectral efficiency and computational efficiency, the proposed MB-NOMA-CDRT method is well suited to cellular IoT scenarios. Future work will focus on exploring the mathematical conditions necessary to fully characterize and exploit the mutualistic transmission mechanism.
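The PSO step for power allocation can be sketched generically. The sum-rate function below is a simplified two-user placeholder, not the paper's ESR expression over generalized-K fading, and the channel gains and PSO hyperparameters are illustrative assumptions:

```python
import math
import random

def sum_rate(a, g1=8.0, g2=3.0):
    """Placeholder two-user downlink NOMA sum rate for power split
    a (center user) vs 1-a (edge user); stands in for the ESR model."""
    r_center = math.log2(1 + a * g1)
    r_edge = math.log2(1 + (1 - a) * g2 / (a * g2 + 1))
    return r_center + r_edge

def pso(objective, n=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm over one power-allocation coefficient."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]            # positions in (0, 1)
    v = [0.0] * n                                   # velocities
    pbest = x[:]                                    # personal bests
    gbest = max(x, key=objective)                   # global best
    for _ in range(iters):
        for i in range(n):
            v[i] = (w * v[i] + c1 * rng.random() * (pbest[i] - x[i])
                    + c2 * rng.random() * (gbest - x[i]))
            x[i] = min(max(x[i] + v[i], 0.0), 1.0)  # keep feasible
            if objective(x[i]) > objective(pbest[i]):
                pbest[i] = x[i]
        gbest = max(pbest, key=objective)
    return gbest

best_a = pso(sum_rate)
```

The paper's enhanced PSO optimizes several coefficients jointly against the derived closed-form ESR; this toy version only shows the exploration/exploitation mechanics.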
The Effects of ELF-MF on Aβ42 Deposition in AD Mice and SWM-Related Neural Oscillations
GENG Duyan, LIU Aoge, YAN Yuxin, ZHENG Weiran
Available online  , doi: 10.11999/JEIT241106
Abstract:
  Objective  Extremely Low-Frequency Magnetic Fields (ELF-MF) have shown beneficial effects in various diseases; however, their influence on Alzheimer’s Disease (AD) remains insufficiently understood. With global population aging, AD has become one of the most prevalent neurodegenerative disorders. Its complex pathogenesis is characterized by neuronal loss, extracellular Amyloid-β (Aβ) deposition, and intracellular neurofibrillary tangles. Cognitive decline, particularly Spatial Working Memory (SWM) impairment, is among its main clinical manifestations. As a crucial cognitive function for encoding and retaining spatial location information, SWM underpins the execution of complex cognitive tasks. Impairment of SWM not only affects daily functioning but also serves as a key indicator of AD progression. Although previous studies have suggested potential cognitive benefits of ELF-MF exposure, systematic investigations integrating pathological, behavioral, and electrophysiological analyses remain limited. This study aims to investigate whether 40 Hz ELF-MF exposure mitigates AD pathology by assessing Aβ42 deposition, SWM performance, and neural oscillatory activity in the hippocampal CA1 region, and to elucidate the relationships between electrophysiological modulation and behavioral improvement.  Methods  An integrated multidisciplinary approach combining immunofluorescence detection, behavioral assessment, and electrophysiological recording is employed. Transgenic AD model mice and Wild-Type (WT) controls are used and assigned to three groups: WT control (Con), AD model group (AD), and AD model group exposed to ELF-MF stimulation (ES). The ES group receives 40 Hz, 10 mT continuous pulse stimulation twice daily for 0.5 h per session over 14 consecutive days, whereas the AD and Con groups undergo sham stimulation during identical time periods. SWM is evaluated using the Object Location Task (OLT). 
Behavioral performance is quantitatively determined by calculating the Cognitive Index (CI), which reflects the animal’s capacity to recognize spatial novelty. During behavioral testing, Local Field Potential (LFP) signals are synchronously recorded from the hippocampal CA1 region via chronically implanted microelectrodes. Advanced signal processing techniques, including time-frequency distribution analysis and phase-amplitude coupling computation, are applied to characterize neural oscillations within the theta (4~13 Hz) and gamma (30~80 Hz) frequency bands. After completion of the experiments, brain tissues are collected for quantitative measurement of Aβ42 plaque deposition in hippocampal sections through immunofluorescence staining, using standardized imaging and quantification protocols. Statistical analyses are performed to evaluate correlations between behavioral indices and electrophysiological parameters, with the objective of identifying mechanistic relationships underlying the effects of ELF-MF exposure.  Results and Discussions  Exposure to 40 Hz ELF-MF produced significant therapeutic effects across all examined parameters. Pathological analysis revealed markedly reduced Aβ42 deposition in the hippocampal region of treated AD mice compared with untreated controls, supporting the amyloid cascade hypothesis, which identifies Aβ oligomers as critical triggers of neurodegeneration. This reduction suggests that ELF-MF may influence Aβ metabolic pathways, potentially through the regulation of mitochondrial dynamics, as reported in previous studies. Behavioral assessment indicated a pronounced improvement in SWM following ELF-MF exposure, reflected by significantly elevated CI scores in the OLT. Electrophysiological recordings revealed notable alterations in neural oscillatory activity, with treated animals exhibiting increased power spectral density in both theta (4~13 Hz) and gamma (30~80 Hz) bands during memory task performance. 
The temporal dynamics of theta oscillations also differed among groups: in Con and ES mice, peak theta power occurred approximately 0.5~1 seconds before the behavioral reference point, indicating anticipatory processing, whereas in AD mice, peaks appeared after the reference point, reflecting delayed cognitive responses. Cross-frequency coupling analysis further demonstrated enhanced theta-gamma phase-amplitude coupling strength in the hippocampal CA1 region of ELF-MF-exposed mice, with coupling peaks primarily observed in the lower theta and higher gamma frequencies. Correlation analyses revealed statistically significant positive relationships between behavioral cognitive indices and electrophysiological measures, particularly for theta power and theta-gamma coupling strength. These convergent findings across pathological, behavioral, and electrophysiological domains indicate that ELF-MF exposure may restore impaired neural synchronization mechanisms. Enhanced theta-gamma coupling is particularly relevant, as this neurophysiological mechanism is known to facilitate temporal coordination among neuronal assemblies during memory processing. Although the present study demonstrates clear benefits of ELF-MF stimulation, heterogeneity in previously reported results warrants consideration. The efficacy of ELF-MF appears highly dependent on key stimulation parameters such as frequency, intensity, duration, and exposure intervals. Previous studies have reported divergent effects, ranging from negligible or adverse outcomes to substantial cognitive enhancement under different experimental conditions. This parameter dependency presents challenges for clinical translation and highlights the need for systematic optimization in higher-order animal models.  
Conclusions  This study demonstrates that exposure to a 40 Hz ELF-MF effectively reduces Aβ42 deposition in the hippocampal region of AD mice, alleviates SWM deficits, and normalizes neural oscillatory activity in the hippocampal CA1 region. The observed cognitive improvements are closely linked to enhanced oscillations in the theta and gamma frequency bands and to strengthened theta-gamma cross-frequency coupling, indicating that neuromodulatory regulation of neural synchronization underlies behavioral recovery. These findings provide strong evidence supporting the potential of ELF-MF as a noninvasive therapeutic approach for AD, targeting both pathological markers and functional impairments. The study establishes a foundation for future work aimed at optimizing stimulation parameters and advancing translational applications, while highlighting the central role of neural oscillatory restoration as a therapeutic mechanism in neurodegenerative disorders. Further investigations should focus on refining exposure protocols and developing personalized stimulation strategies to accommodate individual variability in treatment responsiveness.
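The theta-versus-gamma band-power comparison underlying the oscillation analysis can be illustrated on synthetic data. Real LFP analysis would use FFT-based or multitaper spectral estimation rather than this naive DFT, and the 8 Hz/40 Hz test signal is purely illustrative:

```python
import math

def band_power(signal, fs, f_lo, f_hi):
    """Power within [f_lo, f_hi] Hz via a naive DFT (illustrative only)."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        f = k * fs / n
        if not (f_lo <= f <= f_hi):
            continue
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(-signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power += (re * re + im * im) / n ** 2
    return power

fs = 250                                  # Hz, a typical LFP sampling rate
t = [i / fs for i in range(fs)]           # 1 s of synthetic "LFP"
lfp = [math.sin(2 * math.pi * 8 * ti)     # strong 8 Hz theta component
       + 0.2 * math.sin(2 * math.pi * 40 * ti)  # weak 40 Hz gamma component
       for ti in t]
theta = band_power(lfp, fs, 4, 13)
gamma = band_power(lfp, fs, 30, 80)
```

The reported group differences amount to comparing such band powers (and their cross-frequency coupling) between Con, AD, and ES animals during the memory task.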
VCodePPA: A Large-scale Verilog Dataset with PPA Annotations
CHEN Xiyuan, JIANG Yuxuan, XIA Yingjie, HU Ji, ZHOU Yizhao
Available online  , doi: 10.11999/JEIT250449
Abstract:
  Objective  As a predominant hardware description language, the quality of Verilog code directly affects the Power, Performance, and Area (PPA) metrics of the resulting circuits. Current Large Language Model (LLM)-based approaches for generating hardware description languages face a central challenge: incorporating a design feedback mechanism informed by PPA metrics to guide model optimization, rather than relying solely on syntactic and functional correctness. The field faces three major limitations: the absence of PPA metric annotations in training data, which prevents models from learning the effects of code modifications on physical characteristics; evaluation frameworks that remain disconnected from downstream engineering needs; and the lack of systematic data augmentation methods to generate functionally equivalent code with differentiated PPA characteristics. To address these gaps, we present VCodePPA, a large-scale dataset that establishes precise correlations between Verilog code structures and PPA metrics. The dataset comprises 17 342 entries and provides a foundation for data-driven optimization paradigms in hardware design.  Methods  The dataset construction is initiated by collecting representative Verilog code samples from GitHub repositories, OpenCores projects, and standard textbooks. After careful selection, a seed dataset of 3 500 samples covering 20 functional categories is established. These samples are preprocessed through functional coverage optimization, syntax verification with Yosys, format standardization, deduplication, and complexity filtering. An automated PPA extraction pipeline is implemented in Vivado to evaluate performance characteristics, with metrics including LookUp Table (LUT) count, register usage, maximum operating frequency, and power consumption. 
To enhance dataset diversity while preserving functional equivalence, a multi-dimensional code transformation framework is applied, consisting of nine methods across three dimensions: architecture layer (finite state machine encoding, interface protocol reconstruction, arithmetic unit replacement), logic layer (control flow reorganization, operator rewriting, logic hierarchy restructuring), and timing layer (critical path cutting, register retiming, pipeline insertion or deletion). Efficient exploration of the transformation space is achieved through a Heterogeneous Verilog Mutation Search (HVMS) algorithm based on Monte Carlo Tree Search, which generates 5~10 PPA-differentiated variants for each seed code. A dual-task LLM training strategy with PPA-guided adaptive loss functions is subsequently employed, incorporating contrastive learning mechanisms to capture the relationship between code structure and physical implementation.  Results and Discussions  The VCodePPA dataset achieves broad coverage of digital hardware design scenarios, representing approximately 85%~90% of common design contexts. The multi-dimensional transformation framework generates functionally equivalent yet structurally diverse code variants, with PPA differences exceeding 20%, thereby exposing optimization trade-offs inherent in hardware design. Experimental evaluation demonstrates that models trained with VCodePPA show marked improvements in PPA optimization across multiple Verilog functional categories, including arithmetic, memory, control, and hybrid modules. In testing scenarios, VCodePPA-trained models produced implementations with superior PPA metrics compared with baseline models. The PPA-oriented adaptive loss function effectively overcame the traditional limitation of language model training, which typically lacks sensitivity to hardware implementation efficiency. 
By integrating contrastive learning and variant comparison loss mechanisms, the model achieved an average improvement of 17.7% across PPA metrics on the test set, influencing 32.4% of token-level predictions in code generation tasks. Notably, VCodePPA-trained models reduced on-chip resource usage by 10%~15%, decreased power consumption by 8%~12%, and shortened critical path delay by 5%~8% relative to baseline models.  Conclusions  This paper introduces VCodePPA, a large-scale Verilog dataset with precise PPA annotations, addressing the gap between code generation and physical implementation optimization. The main contributions are as follows: (1) construction of a seed dataset spanning 20 functional categories with 3 500 samples, expanded through systematic multi-dimensional code transformation to more than 17 000 entries with comprehensive PPA metrics; (2) development of an MCTS-based code augmentation scheme (the HVMS algorithm) employing nine transformation methods across architectural, logical, and timing layers to generate functionally equivalent code variants with significant PPA differences; and (3) design of a dual-task training framework with PPA-oriented adaptive loss functions, enabling models to learn PPA trade-off principles directly from data rather than relying on manual heuristics or single-objective constraints. Experimental results demonstrate that models trained on VCodePPA effectively capture PPA balancing principles and generate optimized hardware description code. Future work will extend the dataset to more complex design scenarios and explore advanced optimization strategies for specialized application domains.
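Generating "functionally equivalent variants with differentiated PPA characteristics" implies a Pareto comparison across the annotated metrics. A minimal sketch with hypothetical variant names and (LUT, power, delay) values, not drawn from the actual dataset:

```python
def dominates(a, b):
    """a dominates b if it is no worse in every PPA metric (fewer LUTs,
    lower power, shorter delay) and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(variants):
    """Keep only the PPA-Pareto-optimal variants of one seed design."""
    return {name: ppa for name, ppa in variants.items()
            if not any(dominates(o, ppa) for o in variants.values() if o != ppa)}

# hypothetical (LUTs, mW, ns) for functionally equivalent variants
variants = {
    "baseline":    (120, 35.0, 4.2),
    "retimed":     (128, 33.0, 3.6),  # trades area for delay and power
    "share_fu":    (104, 34.0, 4.9),  # trades delay for area
    "bad_rewrite": (130, 36.0, 4.8),  # worse everywhere than baseline
}
front = pareto_front(variants)
```

Variants surviving such a filter expose genuine optimization trade-offs; dominated ones (like `bad_rewrite` here) carry no trade-off information for training.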
Bayesian Optimization-driven Design Space Exploration Method for Coarse-Grained Reconfigurable Cipher Logic Array
JIANG Danping, DAI Zibin, LIU Yanjiang, ZHOU Zhaoxu, SONG Xiaoyu
Available online  , doi: 10.11999/JEIT250624
Abstract:
  Objective  Coarse-Grained Reconfigurable Cipher logic Arrays (CGRCAs) are widely employed in information security systems owing to their high flexibility, strong performance, and inherent security. Design Space Exploration (DSE) plays a critical role in evaluating and optimizing the performance of cryptographic algorithms deployed on CGRCAs. However, conventional DSE approaches require extensive computation time to locate optimal solutions in multi-objective optimization problems and often yield suboptimal performance. To overcome these limitations, this study proposes a Bayesian optimization-based DSE framework, termed Multi-Objective Bayesian optimization-based Exploration (MOBE), which enhances search efficiency and solution quality while effectively satisfying the complex design requirements of CGRCA architectures.  Methods  The high-dimensional characteristics and multi-objective optimization features of the CGRCA are analyzed, and its design space is systematically modeled. A DSE method based on Bayesian optimization is then proposed, comprising initial sampling design, rapid evaluation model construction, surrogate model development, and acquisition function optimization. A knowledge-aware unsupervised learning sampling strategy is introduced to integrate domain-specific knowledge with clustering algorithms, thereby improving the representativeness and diversity of the initial samples. A rapid evaluation model is established to estimate throughput, area overhead, and Function Unit (FU) utilization for each sample, effectively reducing the computational cost of performance evaluation. To enhance both search efficiency and generalizability, a greedy-based hybrid surrogate model is constructed by combining Gaussian Process with Deep Kernel Learning (DKL-GP), random forest, and neural network models. 
Moreover, an adaptive multi-acquisition function is designed by integrating Expected Hyper Volume Improvement (EHVI) and quasi-Monte Carlo Upper Confidence Bound (qUCB) to identify the most promising samples and maintain a balanced trade-off between exploration and exploitation. The weighting ratio between EHVI and qUCB is dynamically adjusted to accommodate the varying optimization requirements across different search phases.  Results and Discussions  The DSE method based on Bayesian optimization (Algorithm 2) includes initial sampling design, rapid evaluation model construction, surrogate model development, and acquisition function optimization to enhance solution quality and search efficiency. Simulation results show that the knowledge-aware unsupervised learning sampling strategy reduces the Average Distance from Reference Set (ADRS) by up to 28.2% and increases hypervolume by 15.1% compared with existing sampling approaches (Table 3). This improvement primarily arises from the integration of domain knowledge with clustering algorithms. Compared with single surrogate model-based DSE methods, the greedy-based hybrid surrogate model leverages the complementary advantages of multiple surrogate models across different optimization stages, prioritizing samples that contribute most to hypervolume expansion. The hybrid surrogate model achieves a reduction in ADRS of up to 31.7% and an improvement in hypervolume of 20.0% (Table 4). Furthermore, the proposed MOBE framework achieves a 34.9% reduction in ADRS and increases hypervolume by 28.7% relative to state-of-the-art DSE methods (Table 5). Regarding the average performance metrics of Pareto-front samples, MOBE enhances throughput by up to 29.9%, reduces area overhead by 6.0%, and improves FU utilization by 11.6% (Fig. 6), confirming its superiority in overall solution quality. 
Moreover, the MOBE method exhibits excellent cross-algorithm stability in both hypervolume and Normalized Overall Execution Time (NOET) (Table 6 and Fig. 7).  Conclusions  This study presents a multi-objective DSE method based on Bayesian optimization that enhances both solution quality and search efficiency for CGRCA. The proposed approach employs a knowledge-aware unsupervised learning sampling strategy to generate an initial sample set with high representativeness and diversity. A rapid evaluation model is subsequently developed to reduce the computational cost of performance assessments. Additionally, the integration of adaptive multi-acquisition functions with a greedy-based hybrid surrogate model further improves the efficiency and generalization capability of the DSE framework. Comparative experiments demonstrate the effectiveness of the proposed MOBE method: (1) the sampling strategy reduces the ADRS by up to 28.2% and increases hypervolume by 15.1% compared with existing methods; (2) the greedy-based hybrid surrogate model achieves up to a 31.7% reduction in ADRS and a 20.0% improvement in hypervolume relative to single surrogate model-based approaches; (3) the overall MOBE framework achieves a 34.9% reduction in ADRS and a 28.7% increase in hypervolume compared with state-of-the-art DSE techniques; (4) MOBE improves throughput by up to 29.9%, reduces area overhead by 6.0%, and increases FU utilization by 11.6% relative to existing methods; and (5) MOBE exhibits excellent cross-algorithm stability in hypervolume and NOET. MOBE is applicable to medium- and high-performance cryptographic application scenarios, including cloud platforms and desktop terminals. Nevertheless, two limitations remain. First, MOBE currently employs only traditional surrogate models, which may constrain feature learning efficiency and modeling accuracy. 
Second, its validation is confined to a CGRCA architecture previously developed by the research group, lacking verification across existing CGRCA architectures. Future work will address these limitations by incorporating emerging artificial intelligence techniques, such as large models, and conducting extensive experiments on diverse CGRCA architectures to further enhance the generalization and effectiveness of MOBE.
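The phase-dependent blending of the two acquisition functions described above can be sketched as follows; the candidate scores and the linear weighting schedule are illustrative assumptions, not the paper's exact adjustment rule.

```python
import numpy as np

# Hypothetical acquisition scores for five candidate samples; in practice
# these would be EHVI and qUCB values computed from the surrogate model.
ehvi_scores = np.array([0.10, 0.40, 0.25, 0.05, 0.20])
qucb_scores = np.array([0.30, 0.10, 0.35, 0.15, 0.10])

def adaptive_acquisition(ehvi, qucb, progress):
    # progress in [0, 1]: early search phases favour exploration-oriented
    # qUCB, late phases favour exploitation-oriented EHVI. The linear
    # schedule is an illustrative assumption.
    w = progress  # weight on EHVI grows as the search matures
    return w * ehvi + (1.0 - w) * qucb

# Early in the search the exploratory candidate wins; later the candidate
# with the largest expected hypervolume gain is preferred.
best_early = int(np.argmax(adaptive_acquisition(ehvi_scores, qucb_scores, 0.1)))
best_late = int(np.argmax(adaptive_acquisition(ehvi_scores, qucb_scores, 0.9)))
```

At `progress=1.0` the blend reduces to pure EHVI, so the schedule interpolates smoothly between the two selection criteria.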
A Review of Clutter Suppression Techniques in Ground Penetrating Radar: Mechanisms, Methods, and Challenges
LEI Wentai, WANG Yiming, ZHONG Jiwei, XU Qiguo, JIANG Yuyin, LI Cheng
Available online  , doi: 10.11999/JEIT250524
Abstract:
  Significance   Ground Penetrating Radar (GPR) is a widely adopted non-destructive subsurface detection technology, extensively applied in urban subsurface exploration, transportation infrastructure monitoring, geophysical surveys, and military operations. It is employed to detect underground pipelines, structural foundations, road voids, and concealed defects in roadbeds, railway tracks, and tunnels, as well as shallow geological formations and military targets such as unexploded ordnance. However, the presence of clutter—unwanted signals including direct coupling waves, ground reflections, and non-target echoes—severely degrades GPR data quality and complicates target detection, localization, imaging, and parameter estimation. Effective clutter suppression is therefore essential to enhance the accuracy and reliability of GPR data interpretation, making it a central research focus in improving GPR performance across diverse application domains.  Progress   Significant progress has been achieved in GPR clutter suppression, largely through two main approaches: signal model-based and neural network-based methods. Signal model-based techniques, such as time–frequency analysis, subspace decomposition, and dictionary learning, rely on physical modeling to distinguish clutter from target signals. These methods provide clear interpretability but are limited in addressing complex and non-linear clutter patterns. Neural network-based methods, employing architectures such as Convolutional Neural Networks, U-Net, and Generative Adversarial Networks, are more effective in capturing non-linear features through data-driven learning. Recent advances, including multi-scale convolutional autoencoders, attention mechanisms, and hybrid models, have further enhanced clutter suppression under challenging conditions. 
Quantitative metrics such as Mean Squared Error, Peak Signal-to-Noise Ratio, and Structural Similarity Index are commonly used for performance evaluation, often complemented by qualitative visual assessment.  Conclusion  The complexity and diversity of GPR clutter, originating from direct coupling, ground reflections, equipment imperfections, non-uniform media, and non-target scatterers, demand robust suppression strategies. Signal model-based methods provide strong theoretical foundations but are constrained by simplified assumptions, whereas neural network-based approaches offer greater adaptability at the expense of large data requirements and high computational cost. Hybrid approaches that integrate the strengths of both paradigms show considerable potential in addressing complex clutter scenarios. The selection of evaluation metrics plays a pivotal role in algorithm design, with quantitative measures offering objective assessment and qualitative analyses providing intuitive validation. Despite recent advances, significant challenges remain in suppressing non-linear clutter, enabling real-time processing, and reducing reliance on labeled data.  Prospect   Future research in GPR clutter suppression is likely to emphasize integrating the strengths of signal model-based and neural network-based methods to develop robust and adaptive solutions. Real-time processing and online learning will be prioritized to meet the requirements of dynamic applications. Self-supervised and unsupervised learning approaches are expected to reduce dependence on costly labeled datasets and improve model adaptability. Cross-task learning and multi-modal fusion, combining data from multiple sensors or frequencies, are expected to enhance robustness and precision. 
Furthermore, deeper integration of physical principles, including electromagnetic wave propagation and media properties, into algorithm design is expected to improve suppression accuracy and computational efficiency, advancing the development of more intelligent and effective GPR systems.
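As a concrete illustration of the subspace-decomposition family of signal model-based methods discussed above, the following sketch suppresses spatially flat clutter (such as a ground reflection) in a synthetic B-scan by zeroing the dominant singular component; the data and rank-1 clutter model are assumptions for demonstration.

```python
import numpy as np

# Synthetic B-scan: rows = time samples, columns = antenna positions.
# Clutter such as the ground reflection is nearly identical in every trace,
# so it concentrates in the first singular component of the data matrix.
rng = np.random.default_rng(0)
n_time, n_traces = 64, 32
clutter = 5.0 * np.outer(np.sin(np.linspace(0, 3, n_time)), np.ones(n_traces))
target = np.zeros((n_time, n_traces))
target[30, 10:20] = 1.0  # weak localized target echo
bscan = clutter + target + 0.01 * rng.standard_normal((n_time, n_traces))

def svd_clutter_removal(data, n_remove=1):
    """Suppress clutter by zeroing the dominant singular components."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    s[:n_remove] = 0.0  # dominant components ~ spatially flat clutter
    return u @ np.diag(s) @ vt

cleaned = svd_clutter_removal(bscan)

# Fraction of non-target energy remaining after clutter removal.
residual_clutter = np.linalg.norm(cleaned - target) / np.linalg.norm(bscan - target)
```

The residual is small here because the clutter is exactly rank-1; the survey's point is that real clutter is not, which is what motivates the more elaborate model-based and learned approaches.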
Microfabrication Method for Amorphous Wire GMI Magnetic Sensors
ZHANG Bo, WEN Xiaolong, WAN Yadong, ZHANG Chao, LI Jianhua
Available online  , doi: 10.11999/JEIT250338
Abstract:
  Objective  Compared with amorphous ribbons and thin films, amorphous wires exhibit superior Giant MagnetoImpedance (GMI) performance, making them promising materials for GMI magnetic sensors. Their flexible and heterogeneous morphology, however, complicates precise positioning during device fabrication. Additionally, the poor wettability of amorphous wires hinders control of contact resistance during soldering, often resulting in inconsistent device performance. This study proposes a microfabrication method for GMI magnetic sensors based on amorphous wires. Through-glass vias are employed as alignment markers, and auxiliary fixtures are used to accurately position and secure the wires on a glass wafer. Using photolithography and electroplating, bonding pads are fabricated to establish reliable electrical interconnections between the wires and the pads, enabling device-level processing and integration. A winding machine is then applied to wind the signal pickup coil on the device surface, completing fabrication of the GMI magnetic sensor. This approach avoids deformation and stress accumulation caused by direct coil winding on the amorphous wires, thereby improving manufacturability and ensuring stable performance of amorphous wire-based GMI magnetic sensors.  Methods  A glass wafer is employed as the substrate, owing to its high surface flatness and mechanical rigidity, which provide stable support for the flexible amorphous wire structure. To mitigate deformation caused by wire flexibility during winding, a microelectronics process integration scheme based on the glass wafer is implemented. A metal seed layer is first deposited by magnetron sputtering. Ultraviolet lithography and electroplating are then applied to form a high-precision array of electrical interconnection pads on the wafer surface. 
The ends of the amorphous wire are threaded through through-glass vias fabricated along the wafer edge by laser ablation and subsequently secured, ensuring accurate positioning over the bonding pad area while maintaining the natural straight form of the wire (Fig. 4). The amorphous wire is interconnected with the pads using electroplating. Standardized devices with an amorphous wire–glass substrate–interconnection structure are obtained by wafer dicing. After the microstructure of the amorphous wire and substrate is established, a winding machine is used to wind enameled wire onto the structure to form the signal pickup coil. The number of turns and spacing are precisely controlled according to the design. The sensor structure with the wound pickup coil is mounted on a Printed Circuit Board (PCB) with bonding pads. Finally, flip-chip bonding is performed to achieve secondary interconnection between the sensor structure and the PCB, completing fabrication of the sensor device.  Results and Discussions  The fabricated sensor device based on microelectronics processes is shown in Fig. 6(a). A 40 μm diameter enameled wire is uniformly wound on the substrate surface to form the signal pickup coil, with the number of turns and spacing precisely controlled by programmed parameters of the winding machine. As shown in the magnified view in Fig. 6(b), the bonding pad areas at both ends of the amorphous wire are completely covered by a copper layer. The copper plating defines the electrical connection area of the amorphous wire, while polyimide provides reliable fixation and surface protection on the substrate. The performance of five fabricated amorphous wire GMI magnetic sensors is presented in Fig. 13 and Table 1. The standard deviation of sensor output ranges from 0.0163 to 0.0272, and the sensors exhibit similar sensitivity, indicating good consistency. The output characteristic curves are shown in Fig. 14. 
Fitting analysis shows that both the Pearson correlation coefficient and the coefficient of determination are close to 1, demonstrating excellent linearity. When a 1 MHz excitation signal is applied to the amorphous wire, the output voltage exhibits a linear relationship with the external magnetic field within the range of –1 Oe to +1 Oe, with a sensitivity of 5.7 V/Oe. The magnetic noise spectrum, measured inside a magnetic shielding barrel, is shown in Fig. 15. The results indicate that the magnetic noise level of the sensor is approximately 55 pT/√Hz.  Conclusions  A fabrication method for amorphous wire-based GMI magnetic sensors is proposed using a glass substrate integration process. The sensor is constructed through microfabrication of a glass substrate–amorphous wire microstructure. The method is characterized by three features: (i) highly reliable interconnections between the amorphous wire and bonding pads are established by electroplating, yielding a 10 mm × 0.6 mm × 0.5 mm microstructure with fixed amorphous wires; (ii) a signal pickup coil is precisely wound on the microstructure surface with a winding machine, ensuring accurate control of coil turns and spacing; and (iii) electrical connection and circuit integration with a PCB are completed by flip-chip bonding. Compared with conventional amorphous wire GMI sensors, this approach provides two technical advantages. The microfabrication interconnection process reduces contact resistance fluctuations, addressing sensor performance dispersion. In addition, the combination of conventional winding and microelectronics techniques ensures device consistency while avoiding the high cost of full-process microfabrication. This method improves process compatibility and manufacturing repeatability, offering a practical route for engineering applications of GMI magnetic sensors.
Three-Dimensional Imaging Method for Concealed Human Targets Based on Array Stitching
QIU Chen, CHEN Jiahui, SHAO Fengzhi, LI Nian, XU Zihan, GUO Shisheng, CUI Guolong
Available online  , doi: 10.11999/JEIT250334
Abstract:
  Objective  Traditional Through-the-Wall Radar (TWR) systems based on planar multiple-input multiple-output arrays often face high hardware complexity, calibration challenges, and increased system cost. To overcome these limitations, we propose a Three-Dimensional (3D) imaging framework based on array stitching. The method uses either time-sequential or simultaneous operation of multiple small-aperture radar sub-arrays to emulate a large aperture. This strategy substantially reduces hardware complexity while maintaining accurate 3D imaging of concealed human targets.  Methods  The proposed framework integrates three core techniques: 3D weighted total variation (3DWTV) reconstruction, Lucy–Richardson (LR) deconvolution, and 3D wavelet transform (3DWT)-based fusion. Radar echoes are first collected from horizontally and vertically distributed sub-arrays that emulate a planar aperture. Each sub-array image is independently reconstructed using 3DWTV, which enforces spatial sparsity to suppress noise while preserving structural details. The horizontal and vertical images are then multiplicatively fused to jointly recover azimuth and elevation information. To reduce diffraction-induced blurring, LR deconvolution models system degradation through the Point Spread Function (PSF) and iteratively refines scene reflectivity, thereby enhancing cross-range resolution. Finally, 3DWT decomposes the images into multi-scale sub-bands (e.g., LLL, LLH, LHL), which are selectively fused using absolute-maximum and fuzzy-logic rules. The inverse wavelet transform is then applied to reconstruct the final 3D image, retaining both global and local features.  Results and Discussions  The proposed method is evaluated through both simulations and real-world experiments using a Stepped-Frequency Continuous-Wave (SFCW) radar operating from 1.6 to 2.2 GHz with a 2Tx–4Rx configuration. 
In simulations, compared with baseline algorithms such as Back-Projection (BP) and the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), the proposed method achieves better performance. Image Entropy (IE) decreases from 9.7125 for BP and 9.7065 for FISTA to 8.0711, which reflects improved image quality. Experimental tests conducted in indoor environments further confirm robustness. For both standing and sitting postures, IE is reduced from 9.9982 to 7.0030 and from 9.9947 to 6.2261, respectively.  Conclusions  This study presents a low-cost, high-resolution 3D imaging method for TWR systems based on array stitching. By integrating 3DWTV reconstruction, LR deconvolution, and 3DWT fusion, the method effectively reconstructs concealed human postures using a limited aperture. The approach simplifies hardware design, reduces system complexity, and preserves imaging quality under sparse sampling, thereby providing a practical solution for portable and scalable TWR systems.
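The Lucy-Richardson step described above iteratively refines scene reflectivity through the PSF; a minimal one-dimensional sketch of the iteration is given below (the method in the paper operates on 3D image volumes, and the PSF and scene here are hypothetical).

```python
import numpy as np

def lucy_richardson(observed, psf, n_iter=100):
    """1D Lucy-Richardson deconvolution: multiplicative updates that
    redistribute blurred energy back onto point-like scatterers."""
    estimate = np.full_like(observed, observed.mean())
    psf_mirror = psf[::-1]
    for _ in range(n_iter):
        blurred = np.convolve(estimate, psf, mode="same")
        ratio = observed / np.maximum(blurred, 1e-12)  # avoid divide-by-zero
        estimate = estimate * np.convolve(ratio, psf_mirror, mode="same")
    return estimate

# Hypothetical point-spread function and a two-scatterer scene.
psf = np.array([0.05, 0.25, 0.40, 0.25, 0.05])
scene = np.zeros(40)
scene[12], scene[25] = 1.0, 0.6
observed = np.convolve(scene, psf, mode="same")  # diffraction-blurred data

restored = lucy_richardson(observed, psf)
```

After the iterations, the energy smeared by the PSF re-concentrates at the scatterer positions, which is the cross-range sharpening effect exploited in the paper.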
Construction of Multi-Scroll Conservative Chaotic System and Its Application in Image Encryption
AN Xinlei, LI Zhifu, XUE Rui, XIONG Li, ZHANG Li
Available online  , doi: 10.11999/JEIT250432
Abstract:
  Objective  Existing conservative chaotic systems often suffer from structural simplicity and weak nonlinear characteristics, and research on complex dynamical behaviors such as multi-scroll structures remains limited, constraining their potential in engineering applications. To address security risks in face image transmission and the inefficiency of traditional global encryption methods, this study constructs a conservative chaotic system with multi-scroll characteristics, investigates its complex dynamical behavior, and designs a face-detection-based selective image encryption algorithm targeting sensitive regions. The work explores the practical application of conservative chaotic systems in image encryption.  Methods  A five-dimensional conservative hyperchaotic system is constructed on the basis of the generalized Hamiltonian system, and the controlled generation of multi-scroll chaotic flows is achieved through modulation of the Hamiltonian energy function. The Hessian matrix is used to analyze the stationary points of the Hamiltonian energy function, thereby revealing the relationship between scroll structures and stationary points. The spatial distribution of multi-scroll chaotic flows is further characterized by energy isosurfaces. The complex dynamical behaviors of the proposed system are investigated using Lyapunov exponent spectra and phase diagrams, while the sequence complexity is evaluated with the SE complexity algorithm. On this basis, an image encryption algorithm integrated with face detection technology is designed. The algorithm applies a diffusion–scrambling strategy to selectively encrypt facial regions. The security performance is evaluated through multiple indicators, including key space, pixel correlation, and information entropy.  Results and Discussions  Analysis of stationary points in the Hamiltonian energy function reveals a positive correlation between their number and scroll generation. 
Extreme points primarily drive scroll formation, whereas saddle points define transition zones, indicating that the scroll structure can be effectively regulated through the Hamiltonian energy function. The Lyapunov exponent spectrum of the multi-scroll conservative chaotic system is distributed symmetrically about the x-axis and exhibits an integer Lyapunov dimension, fully confirming the system’s volume-conserving property. Under different initial conditions, the system demonstrates diverse coexistence behaviors, including phase trajectories of varying types and scales. Complexity evaluation further shows that the multi-scroll conservative chaotic system achieves markedly higher spectral entropy complexity, supporting its potential for image encryption applications. Experimental validation demonstrates that the proposed algorithm can accurately detect faces and selectively encrypt sensitive regions, thereby avoiding the computational inefficiency of indiscriminate global encryption. Moreover, the algorithm exhibits strong performance across multiple security metrics.  Conclusions  A conservative chaotic system is constructed on the basis of the generalized Hamiltonian system, and its complex dynamical behavior and application in image encryption are investigated. The study provides theoretical references for the generation of multi-scroll conservative chaotic flows and offers practical guidance for the application of image encryption technology.
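The selective encryption of detected facial regions can be illustrated with a simple XOR-diffusion sketch; a logistic map stands in for the paper's five-dimensional conservative hyperchaotic sequence, and the detector output box is hypothetical.

```python
import numpy as np

def chaotic_keystream(length, x0=0.37, r=3.99):
    """Byte keystream from a logistic map, used here only as a stand-in
    for the conservative hyperchaotic sequence of the paper."""
    x, out = x0, np.empty(length, dtype=np.uint8)
    for i in range(length):
        x = r * x * (1.0 - x)
        out[i] = int(x * 256) % 256
    return out

def encrypt_region(image, box, x0=0.37):
    """XOR-diffuse only the pixels inside `box` = (top, left, height, width),
    mimicking selective encryption of a detected face region."""
    t, l, h, w = box
    out = image.copy()
    ks = chaotic_keystream(h * w, x0=x0).reshape(h, w)
    out[t:t+h, l:l+w] ^= ks
    return out

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
face_box = (8, 8, 12, 12)  # hypothetical face-detector output

cipher = encrypt_region(img, face_box)
plain_again = encrypt_region(cipher, face_box)  # XOR is its own inverse
```

Because only the boxed region is touched, pixels outside the detected face are left unchanged, which is the efficiency gain over global encryption noted above; the real scheme additionally scrambles pixel positions.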
Multi-modal Joint Distillation Optimization for Source Code Vulnerability Detection
ZHANG Xuejun, ZHANG Yiffan, LIU Cancan, JIA Xiaohong, CHEN Zhuo, ZHANG Lei
Available online  , doi: 10.11999/JEIT250453
Abstract:
  Objective  As software systems increase in scale and complexity, the frequency of security vulnerabilities in source code rises accordingly, threatening system reliability, data integrity, and user privacy. Conventional automated vulnerability detection methods typically depend on a narrow set of shallow features—such as API call sequences, opcode patterns, or syntactic heuristics—rendering them susceptible to learning spurious correlations and unable to capture the rich semantic and structural information essential for accurate detection. Moreover, most existing approaches either rely on single-modal representations or weakly integrate multiple modalities without explicitly addressing distribution mismatches across them. This often results in overfitting to seen datasets and limited generalization to unseen codebases, particularly across different projects or programming languages. Although recent advances in machine learning and deep learning have improved source code analysis, effectively modeling the complex interactions between code semantics and execution structures remains a major challenge. To overcome these limitations, this paper proposes multi-modal joint Distillation Optimization for Vulnerability Detection (mVulD-DO), a multi-modal framework that combines deep feature distillation with dynamic global feature alignment. The proposed method aims to enhance semantic comprehension, structural reasoning, and cross-modal consistency, which are critical for robust vulnerability detection. By enforcing both intra-modality refinement and inter-modality alignment, mVulD-DO addresses the semantic-structural gap that constrains traditional methods.  Methods  The mVulD-DO framework begins by extracting multiple semantic modalities from raw source code—function names, variable names, token_type attributes, and local code slices—using program slicing and syntactic parsing techniques. 
In parallel, a Program Dependency Graph (PDG) is constructed to capture both control-flow and data-flow relationships, generating a heterogeneous graph that represents explicit and implicit program behaviors. Each semantic modality is embedded using pretrained encoders and subsequently refined via a deep feature distillation module, which integrates multi-head self-attention and multi-scale convolutional layers to emphasize salient patterns and suppress redundant noise. To model the sequential dependencies intrinsic to program execution, a Bidirectional Long Short-Term Memory (BLSTM) network captures long-range contextual interactions. For structural representation, a Graph Attention Network (GAT) processes the PDG-C to produce topology-aware embeddings. To bridge the gap between modalities, adaptive dynamic Sinkhorn regularization is applied to globally align the distributions of semantic and structural embeddings. This approach mitigates modality mismatches while preserving flexibility by avoiding rigid one-to-one correspondences. Finally, the distilled and aligned multimodal features are fused and passed through a lightweight fully connected classifier for binary vulnerability prediction. The model is jointly optimized using both classification and alignment objectives, improving its ability to generalize across unseen codebases.  Results and Discussions  Comprehensive comparisons conducted on the mixed CVEFixes+SARD dataset—covering 25 common CWE vulnerability types with an 8:1:1 train:validation:test split—demonstrate that traditional source code detectors, which directly map code to labels, often rely on superficial patterns and show limited generalization, particularly for out-of-distribution samples. These methods achieve accuracies ranging from 55.41% to 85.84%, with recall typically below 80% (Table 1). 
In contrast, mVulD-DO leverages multi-layer feature distillation to purify and enhance core representations, while dynamic Sinkhorn alignment mitigates cross-modal inconsistencies. This results in accuracy of 87.11% and recall of 83.59%, representing absolute improvements of 1.27% and 6.26%, respectively, over the strongest baseline method (ReGVD). Although mVulD-DO reports a slightly higher False Positive Rate (FPR) of 3.54%—2.92% above that of ReGVD—it remains lower than that of most traditional detectors. This modest increase is considered acceptable in practice, given that failing to detect a critical vulnerability typically incurs greater cost than issuing additional alerts. Compared with instruction-tuned large language models (e.g., VulLLM), which maintain low FPRs below 10% but suffer from recall below 75%, mVulD-DO offers a more favorable trade-off between false alarms and coverage of true vulnerabilities. Ablation studies (Table 2) further validate the contribution of each component. Removing function name embeddings (unfunc) results in a 1.3% decrease in F1 score; removing variable name embeddings (unvar) causes a 1.3% drop; and omitting token_type attributes (untype) leads to a 3.35% reduction. The most substantial performance degradation—9.11% in F1—occurs when the deep feature distillation module is disabled (undis), highlighting the critical role of multi-scale semantic refinement and noise suppression. Additional evaluations on vulnerability-sensitive subsets—Call, OPS, Array, and PTR—demonstrate consistent benefits from Sinkhorn alignment. F1 score improvements over unaligned variants are observed as follows: 1.45% for Call, 4.22% for OPS, 1.38% for Array, and 0.36% for PTR (Table 3), confirming the generalization advantage across a broad spectrum of vulnerability types.  
Conclusions  Experimental results demonstrate that the proposed mVulD-DO framework consistently outperforms existing vulnerability detection methods in recall, F1-score, and accuracy, while maintaining a low FPR. The effectiveness of deep feature distillation, multi-scale semantic extraction, and dynamic Sinkhorn alignment is validated through extensive ablation and visualization analyses. Despite these improvements, the model incurs additional computational overhead due to multimodal distillation and Sinkhorn alignment, and shows sensitivity to hyperparameter settings, which may limit its suitability for real-time applications. Moreover, while strong performance is achieved on the mixed dataset, the model's generalization across unseen projects and programming languages remains an open challenge. Future work will focus on developing lightweight training strategies—such as knowledge distillation and model pruning—to reduce computational costs. Additionally, incorporating unsupervised domain adaptation and incremental alignment mechanisms will be critical to support dynamic code evolution and enhance cross-domain robustness. These directions aim to improve the scalability, adaptability, and practical deployment of multimodal vulnerability detection systems in diverse software environments.
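The Sinkhorn alignment at the core of mVulD-DO can be sketched with plain entropy-regularized optimal transport between two small embedding sets; the data, dimensions, and fixed regularization strength below are illustrative assumptions (the paper's adaptive dynamic variant adjusts the regularization during training).

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iter=200):
    """Entropy-regularized optimal transport between two uniform
    distributions (a sketch of the alignment step, not the paper's
    adaptive variant)."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / eps)
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # transport plan

rng = np.random.default_rng(0)
semantic = rng.standard_normal((5, 4))                       # e.g., semantic embeddings
structural = semantic + 0.05 * rng.standard_normal((5, 4))   # slightly shifted modality

# Pairwise squared Euclidean distances as the transport cost.
cost = ((semantic[:, None, :] - structural[None, :, :]) ** 2).sum(-1)
plan = sinkhorn(cost)
align_loss = (plan * cost).sum()  # Sinkhorn alignment objective
```

Minimizing `align_loss` during training pulls the two modality distributions together globally without forcing rigid one-to-one correspondences, which is the mismatch-mitigation property described above.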
Advances in Deep Neural Network Based Image Compression: A Survey
BAI Yuanchao, LIU Wenchang, JIANG Junjun, LIU Xianming
Available online  , doi: 10.11999/JEIT250567
Abstract:
  Significance   With the continuous advancement of information technology, digital images are evolving toward ultra-high-definition formats characterized by increased resolution, dynamic range, color depth, sampling rates, and multi-viewpoint support. In parallel, the rapid development of artificial intelligence is reshaping both the generation and application paradigms of digital imagery. As visual big data converges with AI technologies, the volume and diversity of image data expand exponentially, creating unprecedented challenges for storage and transmission. As a core technology in digital image processing, image compression reduces storage costs and bandwidth requirements by eliminating internal information redundancy, thereby serving as a fundamental enabler for visual big data applications. However, traditional image compression standards increasingly struggle to meet rising industrial demands due to limited modeling capacity, inadequate perceptual adaptability, and poor compatibility with machine vision tasks. Deep Neural Network (DNN)-based image compression methods, leveraging powerful modeling capabilities, end-to-end optimization mechanisms, and compatibility with both human perception and machine understanding, are progressively exceeding conventional coding approaches. These methods demonstrate clear advantages and broad potential across diverse application domains, drawing growing attention from both academia and industry.  Progress   This paper systematically reviews recent advances in DNN-based image compression from three core perspectives: signal fidelity, human visual perception, and machine analysis. First, in signal fidelity–oriented compression, the rate–distortion optimization framework is introduced, with detailed discussion of key components in lossy image compression, including nonlinear transforms, quantization strategies, entropy coding mechanisms, and variable-rate techniques for multi-rate adaptation. 
The synergistic design of these modules underpins the architecture of modern DNN-based image compression systems. Second, in perceptual quality–driven compression, the principles of joint rate–distortion–perception optimization models are examined, together with a comparative analysis of two major perceptual paradigms: Generative Adversarial Network (GAN)-based models and diffusion model–based approaches. Both strategies employ perceptual loss functions or generative modeling techniques to markedly improve the visual quality of reconstructed images, aligning them more closely with the characteristics of the human visual system. Finally, in machine analysis–oriented compression, a co-optimization framework for rate–distortion–accuracy trade-offs is presented, with semantic fidelity as the primary objective. From the perspective of integrating image compression with downstream machine analysis architectures, this section analyzes how current methods preserve essential semantic information that supports tasks such as object detection and semantic segmentation during the compression process.  Conclusions  DNN-based image compression shows strong potential across signal fidelity, human visual perception, and machine analysis. Through end-to-end jointly optimized neural network architectures, these methods provide comprehensive modeling of the encoding process and outperform traditional approaches in compression efficiency. By leveraging the probabilistic modeling and image generation capabilities of DNNs, they can accurately estimate distributional differences between reconstructed and original images, quantify perceptual losses, and generate high-quality reconstructions that align with human visual perception. Furthermore, their compatibility with mainstream image analysis frameworks enables the extraction of semantic features and the design of collaborative optimization strategies, allowing efficient compression tailored to machine vision tasks.  
Prospects   Despite significant progress in compression performance, perceptual quality, and task adaptability, DNN-based image compression still faces critical technical challenges and practical limitations. First, computational complexity remains high. Most high-performance models rely on deep and sophisticated architectures (e.g., attention mechanisms and Transformer models), which enhance modeling capability but also introduce substantial computational overhead and long inference latency. These limitations are particularly problematic for deployment on mobile and embedded devices. Second, robustness and generalization continue to be major concerns. DNN-based compression models are sensitive to input perturbations and vulnerable to adversarial attacks, which can lead to severe reconstruction distortions or even complete failure. Moreover, while they perform well on training data and similar distributions, their performance often degrades markedly under cross-domain scenarios. Third, the evaluation framework for perceptual- and machine vision–oriented compression remains immature. Although new evaluation dimensions have been introduced, no unified and objective benchmark exists. This gap is especially evident in machine analysis–oriented compression, where downstream tasks vary widely and rely on different visual models. Therefore, comparability across methods is limited and consistent evaluation metrics are lacking, constraining both research and practical adoption. Overall, DNN-based image compression is in transition from laboratory research to real-world deployment. Although it demonstrates clear advantages over traditional approaches, further advances are needed in efficiency, robustness, generalization, and standardized evaluation protocols. 
Future research should strengthen the synergy between theoretical exploration and engineering implementation to accelerate widespread adoption and continued progress in areas such as multimedia communication, edge computing, and intelligent image sensing systems.
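The three optimization regimes surveyed above admit a common schematic Lagrangian form; the notation below is an illustrative abstraction of the formulations discussed (rate $R$ of the quantized latent $\hat{y}$, distortion $D$, a distributional perception term, and a downstream task loss), not the objective of any single method.

```latex
% Signal fidelity: rate-distortion optimization
\mathcal{L}_{\mathrm{RD}} = R(\hat{y}) + \lambda_D\, D(x, \hat{x})

% Perceptual quality: joint rate-distortion-perception optimization,
% with d(.,.) a divergence between original and reconstructed image distributions
\mathcal{L}_{\mathrm{RDP}} = R(\hat{y}) + \lambda_D\, D(x, \hat{x})
                           + \lambda_P\, d\big(p_X, p_{\hat{X}}\big)

% Machine analysis: rate-distortion-accuracy co-optimization,
% with L_task the loss of the downstream analysis task on the reconstruction
\mathcal{L}_{\mathrm{RDA}} = R(\hat{y}) + \lambda_D\, D(x, \hat{x})
                           + \lambda_A\, \mathcal{L}_{\mathrm{task}}(\hat{x})
```

The choice of which extra term is weighted, and how, is what distinguishes the signal-fidelity, perception-driven, and machine-analysis branches of the literature reviewed here.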
Optimal Federated Average Fusion of Gaussian Mixture–Probability Hypothesis Density Filters
XUE Yu, XU Lei
Available online  , doi: 10.11999/JEIT250759
Abstract:
  Objective  To realize optimal decentralized fusion tracking of uncertain targets, this study proposes a federated average fusion algorithm for Gaussian Mixture–Probability Hypothesis Density (GM-PHD) filters, designed with a hierarchical structure. Each sensor node operates a local GM-PHD filter to extract multi-target state estimates from sensor measurements. The fusion node performs three key tasks: (1) maintaining a master filter that predicts the fusion result from the previous iteration; (2) associating and merging the GM-PHDs of all filters; and (3) distributing the fused result and several parameters to each filter. The association step decomposes multi-target density fusion into four categories of single-target estimate fusion. We derive the optimal single-target estimate fusion both in the absence and presence of missed detections. Information assignment applies the covariance upper-bounding theory to eliminate correlation among all filters, enabling the proposed algorithm to achieve the accuracy of Bayesian fusion. Simulation results show that the federated fusion algorithm achieves optimal tracking accuracy and consistently outperforms the conventional Arithmetic Average (AA) fusion method. Moreover, the relative reliability of each filter can be flexibly adjusted.  Methods  The multi-sensor multi-target density fusion is decomposed into multiple groups of single-target component merging through the association operation. Federated filtering is employed as the merging strategy, which achieves the Bayesian optimum owing to its inherent decorrelation capability. Section 3 rigorously extends this approach to scenarios with missed detections. To satisfy federated filtering’s requirement for prior estimates, a master filter is designed to compute the predicted multi-target density, thereby establishing a hierarchical architecture for the proposed algorithm. 
In addition, auxiliary measures are incorporated to compensate for the observed underestimation of cardinality.  Results and Discussions  Accurate association of components belonging to the same target is achieved using a modified Mahalanobis distance (Fig. 3). The precise association and the single-target decorrelation capability together ensure the theoretical optimality of the proposed algorithm, as illustrated in Fig. 2. Compared with conventional density fusion, the Optimal Sub-Pattern Assignment (OSPA) error is reduced by 8.17% (Fig. 4). The advantage of adopting a small average factor for the master filter is demonstrated in Figs. 5 and 6. The effectiveness of the measures for achieving cardinality consensus is also validated (Fig. 7). Another competitive strength of the algorithm lies in the flexibility of adjusting the average factors (Fig. 8). Furthermore, the algorithm consistently outperforms AA fusion across all missed detection probabilities (Fig. 9).  Conclusions  This paper achieves theoretically optimal multi-target density fusion by employing federated filtering as the merging method for single-target components. The proposed algorithm inherits the decorrelation capability and single-target optimality of federated filtering. A hierarchical fusion architecture is designed to satisfy the requirement for prior estimates. Extensive simulations demonstrate that: (1) the algorithm can accurately associate filtered components belonging to the same target, thereby extending single-target optimality to multi-target fusion tracking; (2) the algorithm supports flexible adjustment of average factors, with smaller values for the master filter consistently preferred; and (3) the superiority of the algorithm persists even under sensor malfunctions and high missed detection rates. Nonetheless, this study is limited to GM-PHD filters with overlapping Fields Of View (FOVs). Future work will investigate its applicability to other filter types and spatially non-overlapping FOVs.
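The covariance upper-bounding idea behind the merging strategy can be sketched numerically. The function below is a minimal illustration of information-weighted fusion of local Gaussian estimates with average factors, with names and interfaces of our own choosing; it is not the paper's full GM-PHD association pipeline.

```python
import numpy as np

def federated_fuse(estimates, betas):
    """Fuse local Gaussian estimates (x_i, P_i) with average factors beta_i.

    Covariance upper-bounding: each local covariance is inflated to P_i / beta_i,
    which removes unknown cross-correlations among the filters; the inflated
    estimates are then combined by standard information (convex) fusion.
    """
    assert abs(sum(betas) - 1.0) < 1e-9  # average factors must sum to one
    info = sum(b * np.linalg.inv(P) for (_, P), b in zip(estimates, betas))
    P_f = np.linalg.inv(info)
    x_f = P_f @ sum(b * np.linalg.inv(P) @ x for (x, P), b in zip(estimates, betas))
    return x_f, P_f
```

With equal average factors, two identical-covariance estimates fuse to their midpoint without unwarranted covariance shrinkage, which is exactly the behavior the decorrelation step is meant to guarantee.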
Research on Federated Unlearning Approach Based on Adaptive Model Pruning
MA Zhenguo, HE Zixuan, SUN Yanjing, WANG Bowen, LIU Jianchun, XU Hongli
Available online  , doi: 10.11999/JEIT250503
Abstract:
  Objective  The rapid proliferation of Internet of Things (IoT) devices and the enforcement of data privacy regulations, including the General Data Protection Regulation (GDPR) and the Personal Information Protection Act, have positioned Federated Unlearning (FU) as a critical mechanism to safeguard the “right to be forgotten” in Edge Computing (EC). Existing class-level unlearning approaches often adopt uniform model pruning strategies. However, because edge nodes vary substantially in computational capacity, storage, and network bandwidth, these methods suffer from efficiency degradation, leading to imbalanced training delays and decreased resource utilization. This study proposes FU with Adaptive Model Pruning (FunAMP), a framework that minimizes training time while reliably eliminating the influence of target-class data. FunAMP dynamically assigns pruning ratios according to node resources and incorporates a parameter correlation metric to guide pruning decisions. In doing so, it addresses the challenge of resource heterogeneity while preserving compliance with privacy regulations.  Methods  The proposed framework establishes a quantitative relationship among model training time, node resources, and pruning ratios, on the basis of which an optimization problem is formulated to minimize overall training time. To address this problem, a greedy algorithm (Algorithm 2) is designed to adaptively assign appropriate pruning ratios to each node. The algorithm discretizes the pruning ratio space and applies a binary search strategy to balance computation and communication delays across nodes. Additionally, a Term Frequency–Inverse Document Frequency (TF–IDF)-based metric is introduced to evaluate the correlation between model parameters and the target-class data. For each parameter, the TF score reflects its activation contribution to the target class, whereas the IDF score measures its specificity across all classes. 
Parameters with high TF–IDF scores are iteratively pruned until the assigned pruning ratio is satisfied, thereby ensuring the effective removal of target-class data.  Results and Discussions  Simulation results confirm the effectiveness of FunAMP in balancing training efficiency and unlearning performance under resource heterogeneity. Pruning granularity affects model accuracy (Fig. 2): fine granularity (e.g., 0.01) preserves model integrity, whereas coarse settings degrade accuracy due to excessive parameter removal. Under fixed training time, FunAMP consistently achieves higher accuracy than FunUMP and Retrain (Fig. 3), as adaptive pruning ratios reduce inter-node waiting delays. For instance, FunAMP attains 76.48% accuracy on LeNet and 83.60% on AlexNet with FMNIST, outperforming baseline methods by 5.91% and 4.44%, respectively. The TF–IDF-driven pruning mechanism fully removes contributions of target-class data, achieving 0.00% accuracy on the target data while maintaining competitive performance on the remaining data (Table 2). Robustness under varying heterogeneity levels is further verified (Fig. 4). Compared with baselines, FunAMP markedly reduces the training time required to reach predefined accuracy and delivers up to 11.8× speedup across four models. These results demonstrate FunAMP’s capability to harmonize resource utilization, preserve model performance, and ensure unlearning efficacy in heterogeneous edge environments.  Conclusions  To mitigate training inefficiency caused by resource heterogeneity in FU, this study proposes FunAMP, a framework that integrates adaptive pruning with parameter relevance analysis. A system model is constructed to formalize the relationship among node resources, pruning ratios, and training time. A greedy algorithm dynamically assigns pruning ratios to edge nodes, thereby minimizing global training time while balancing computational and communication delays.
Furthermore, a TF–IDF-driven metric quantifies the correlation between model parameters and target-class data, enabling the selective removal of critical parameters to erase target-class contributions. Theoretical analysis verifies the stability and reliability of the framework, while empirical results demonstrate that FunAMP achieves complete removal of target-class data and sustains competitive accuracy on the remaining classes. This work is limited to single-class unlearning, and extending the approach to scenarios requiring the simultaneous removal of multiple classes remains an important direction for future research.
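The TF–IDF-style relevance scoring described above can be sketched as follows. The activity threshold (per-channel mean across classes), the +1 smoothing in the IDF term, and the function interface are our assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def tfidf_prune_mask(act, target, ratio):
    """act: (C, K) mean channel activations per class.
    Prune the fraction `ratio` of channels with the highest TF-IDF
    score for class `target`; return a keep-mask over the K channels."""
    C, K = act.shape
    # TF: channel's share of total activation for the target class
    tf = act[target] / (act[target].sum() + 1e-12)
    # IDF: penalise channels that respond strongly to many classes
    df = (act > act.mean(axis=0, keepdims=True)).sum(axis=0)
    idf = np.log(C / (1.0 + df)) + 1.0
    score = tf * idf
    n_prune = int(round(ratio * K))
    pruned = np.argsort(score)[::-1][:n_prune]  # highest scores first
    mask = np.ones(K, dtype=bool)
    mask[pruned] = False
    return mask
```

A channel that fires almost exclusively for the target class receives a high TF and a low document frequency, so it is pruned first, which is the intended selective-removal behavior.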
Secure Beamforming Design for Multi-User Near-Field ISAC Systems
DENG Zhixiang, ZHANG Zhiwei
Available online  , doi: 10.11999/JEIT250462
Abstract:
  Objective  Integrated Sensing and Communication (ISAC) systems, a key enabling technology for 6G, achieve the joint realization of communication and sensing by sharing spectrum and hardware. However, radar targets may threaten the confidentiality of user communications, necessitating secure transmission against potential eavesdropping. At the same time, large-scale antenna arrays and high-frequency bands are expected to be widely deployed to meet future performance requirements, making near-field wireless transmission increasingly common. This trend creates a mismatch between existing ISAC designs that rely on the far-field assumption and the characteristics of real propagation environments. In this study, we design optimal secrecy beamforming for a multi-user near-field ISAC system to improve the confidentiality of user communications while ensuring radar sensing performance. The results show that distance degrees of freedom inherent in the near-field model, together with radar sensing signals serving as Artificial Noise (AN), provide significant gains in communication secrecy.  Methods  A near-field ISAC system model is established, in which multiple communication users and a single target, regarded as a potential eavesdropper, are located within the near-field region of a transmitter equipped with a Uniform Linear Array (ULA). Based on near-field channel theory, channel models are derived for all links, including the communication channels from the transmitter to the users, the transmitter to the target, and the radar echo-based sensing channel. The secrecy performance of each user is quantified as the difference between the achievable communication rate and the eavesdropping rate at the target, and the sum secrecy rate across all users is adopted as the metric for system-wide confidentiality. The sensing performance of the ISAC system is evaluated using the Cramér–Rao bound (CRB), obtained from the Fisher Information Matrix (FIM) for parameter estimation.
To enhance secrecy, a joint optimization problem is formulated for the beamforming vectors of communication and radar sensing signals, with the objective of maximizing the sum secrecy rate under base station transmit power and sensing performance constraints. As the joint optimization problem is inherently non-convex, an algorithm combining Semi-Definite Relaxation (SDR) and Weighted Minimum Mean Square Error (WMMSE) is developed. The equivalence between the MMSE-transformed problem and the original secrecy rate maximization problem is first established to handle non-convexity. The CRB constraint is then expressed in convex form using the Schur complement. Finally, SDR is applied to recast the problem into a convex optimization framework, which allows a globally optimal solution to be derived.  Results and Discussions  Numerical evaluations show that the proposed near-field ISAC secrecy beamforming design achieves clear advantages in communication confidentiality compared with far-field and non-AN schemes. Under the near-field channel model, the designed beams effectively concentrate energy on legitimate users while suppressing information leakage through radar sensing signals (Fig. 3b). Even when communication users and radar targets are angularly aligned, the secure beamforming scheme attains spatial isolation through distance-domain degrees of freedom, thereby maintaining positive secrecy rates (Fig. 3a). Joint optimization of communication beams and radar sensing signals significantly improves multi-user secrecy rates while satisfying the CRB constraint. Compared with conventional AN-assisted methods, the proposed solution exhibits superior trade-off performance between sensing and communication (Fig. 4). The number of antennas is directly correlated with beam focusing performance: increasing the antenna count produces more concentrated beam patterns.
In the near-field model, however, the incorporation of the distance dimension amplifies this effect, yielding larger performance gains than those observed in conventional far-field systems (Fig. 5). Raising the transmit power further improves the received signal quality at the users, which proportionally enhances system secrecy. The near-field scheme achieves more substantial gains than far-field baselines under higher transmit power conditions (Fig. 6). This paper also examines the effect of user population on secrecy performance. A larger number of users increases inter-user interference, which degrades overall secrecy (Fig. 7). Nevertheless, owing to the intrinsic interference suppression capability of the near-field scheme and the ability of AN to impair eavesdroppers’ decoding, the proposed method maintains stronger robustness against multi-user interference compared with conventional approaches.  Conclusions  This study investigates multi-user secure communication design in near-field ISAC systems and proposes a beamforming optimization scheme that jointly enhances sensing accuracy and communication secrecy. A non-convex optimization model is established to maximize the multi-user secrecy sum rate under base station transmit power and CRB constraints, where radar sensing signals are exploited as AN to impair potential eavesdroppers. To address the complexity of the problem, a joint optimization algorithm combining SDR and WMMSE is developed, which reformulates the original non-convex problem into a convex form solvable with standard optimization tools.
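The sum secrecy rate objective can be written out directly. The sketch below uses our own notation (deterministic channels, unit noise by default) and treats the radar sensing beam as artificial noise that appears as interference at both the users and the eavesdropping target; it is an illustration of the metric, not the paper's optimization algorithm.

```python
import numpy as np

def sum_secrecy_rate(H_u, h_e, W, r, sigma2=1.0):
    """H_u: (K, N) user channels; h_e: (N,) target/eavesdropper channel;
    W: (N, K) communication beamformers (column k serves user k);
    r: (N,) radar sensing beam treated as artificial noise."""
    K = H_u.shape[0]
    total = 0.0
    for k in range(K):
        # SINR at legitimate user k (other users' beams + radar beam interfere)
        sig_u = abs(H_u[k] @ W[:, k]) ** 2
        int_u = sum(abs(H_u[k] @ W[:, j]) ** 2 for j in range(K) if j != k) \
                + abs(H_u[k] @ r) ** 2
        # SINR at the eavesdropping target for user k's stream
        sig_e = abs(h_e @ W[:, k]) ** 2
        int_e = sum(abs(h_e @ W[:, j]) ** 2 for j in range(K) if j != k) \
                + abs(h_e @ r) ** 2
        R_u = np.log2(1 + sig_u / (int_u + sigma2))
        R_e = np.log2(1 + sig_e / (int_e + sigma2))
        total += max(0.0, R_u - R_e)  # per-user secrecy rate is non-negative
    return total
```

Steering the radar beam `r` so that it is strong at `h_e` but nearly orthogonal to each user channel raises this objective, which is the mechanism the AN-assisted design exploits.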
Detection and Localization of Radio Frequency Interference via Cross-domain Multi-feature from SAR Raw Data
FU Zewen, WEI Tingting, LI Ningning, LI Ning
Available online  , doi: 10.11999/JEIT250701
Abstract:
  Objective  The increasing congestion of the electromagnetic spectrum presents major challenges for Synthetic Aperture Radar (SAR) systems, where Radio Frequency Interference (RFI) can severely degrade imaging quality and compromise interpretation accuracy. Existing detection methods have critical limitations: time-domain approaches are insensitive to weak interference, whereas transform-domain methods perform poorly in characterizing broadband interference. This study develops a cross-domain framework that integrates complementary features from multiple domains, enabling robust RFI detection and accurate localization. The proposed approach addresses the deficiencies of single-domain methods and provides a reliable solution for operational SAR systems.  Methods  This study introduces two methodological innovations. First, a weighted feature fusion framework combines the first-order derivatives of time-domain kurtosis and skewness using Principal Component Analysis (PCA)-optimized weights, thereby capturing both global statistical distributions and local dynamic variations. Second, a differential time-frequency analysis technique applies the Short-Time Fourier Transform (STFT) with logarithmic ratio operations and adaptive thresholding to achieve sub-pulse interference localization. The overall workflow integrates K-means clustering for initial detection, STFT-based feature enhancement, binary region identification, and Inverse STFT (ISTFT) reconstruction. The proposed approach is validated against three state-of-the-art methods using both simulated data and Sentinel-1 datasets.  Results and Discussions  Experimental results demonstrate marked improvements across all evaluation metrics. For simulated data, the proposed method achieves a Signal Accuracy (SA) of 98.56% and a False Alarm (FA) rate of 0.65% (Table 2), representing a 3.13% gain in SA compared with conventional methods.
The Root Mean Square Error (RMSE) reaches 0.1902 (Table 3), corresponding to a 10.9% improvement over existing techniques. Visual analysis further confirms more complete interference detection (Fig. 2) and cleaner suppression results (Figs. 4 and 7), with target features preserved. For measured data, the method maintains robust performance, achieving a gray entropy of 0.7843 (Table 5), and effectively mitigating the severe FAs observed in traditional approaches (Fig. 8).  Conclusions  In complex and dynamic electromagnetic environments, traditional RFI detection methods often show inaccuracies or even fail when processing NarrowBand Interference (NBI) or WideBand Interference (WBI), limiting their operational applicability. To address this challenge, this study proposes an engineering-oriented interference detection method designed for practical SAR operations. By combining time-domain kurtosis with the first derivative of skewness, the approach significantly enhances detection accuracy and adaptability. Furthermore, a localization technique is introduced that precisely identifies interference positions. Using time-frequency domain analysis, the method calculates differential values between the time-frequency representations of echo signals with and without interference, and determines interference locations through threshold-based judgment. Extensive simulations and Sentinel-1 experiments confirm the universality and effectiveness of the proposed method in both detection and localization.
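The differential time-frequency localization step reduces, in essence, to a logarithmic ratio between spectrogram magnitudes followed by an adaptive threshold. The sketch below assumes a mean + k·std threshold and an interference-free reference spectrogram as inputs; both choices are our illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def rfi_mask(tf_rfi, tf_ref, k=3.0):
    """Localise RFI in the time-frequency plane.

    tf_rfi, tf_ref: complex or real spectrograms (same shape) of the
    interfered echo and an interference-free reference. Cells whose
    log-magnitude ratio exceeds an adaptive mean + k*std threshold
    are flagged as interference."""
    ratio = np.log10((np.abs(tf_rfi) + 1e-12) / (np.abs(tf_ref) + 1e-12))
    thr = ratio.mean() + k * ratio.std()
    return ratio > thr
```

Flagged cells would then be notched and the cleaned signal reconstructed via the inverse STFT, as in the workflow described above.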
Dynamic Inversion Algorithm for Rainfall Intensity Based on Dual-Mode Microwave Radar Combined Rain Gauge
ZHANG Qishuo, ZHANG Wenxin, GAO Mengyu, XIONG Fei
Available online  , doi: 10.11999/JEIT250535
Abstract:
  Objective  Microwave meteorological radar has broad application potential in rainfall detection due to its non-contact measurement, high spatiotemporal resolution, and multi-parameter retrieval capability. However, in the context of climate change, increasingly complex rainfall events require monitoring systems to deliver high-precision, multi-dimensional, real-time data to support disaster warning and climate research. Conventional single-mode radars, constrained by fixed functionalities, cannot fully meet these requirements, which has led to the development of multi-mode radar technology. The dual-mode radar examined in this study employs Frequency Modulated Continuous Wave (FMCW) and Continuous Wave (CW) modes. These modes adopt different algorithmic principles for raindrop velocity measurement: FMCW enables spatially stratified detection and strong anti-interference performance, whereas CW provides more accurate measurements of raindrop fall speed, yielding integral rainfall information in the vertical column. Despite these advantages, retrieval accuracy remains limited by the reliance of traditional algorithms on fixed empirical parameters, which restrict adaptability to regional climate variations and dynamic microphysical precipitation processes, and hinder real-time response to variations in rain Drop Size Distribution (DSD). Ground rain gauges, by contrast, provide near-true reference data through direct measurement of rainfall intensity. To address the above challenges, this paper proposes a dynamic inversion algorithm that integrates dual-mode (FMCW–CW) radar with rain gauge data, enhancing adaptability and retrieval accuracy for rainfall monitoring.  Methods  Two models are developed for the two radar modes. For the FMCW mode, which can retrieve DSD parameters, a fusion algorithm based on Attention integrated with a double-layer Long Short-Term Memory (LSTM) network (LSTM–Attention–LSTM) is proposed. 
The first LSTM extracts features from DSD data and rain gauge–measured rainfall intensity through its hidden state output, with a dropout layer applied to randomly discard neurons and reduce overfitting. The Attention mechanism calculates feature similarity using dot products and converts it into attention weights. The second LSTM then processes the time series and integrates the hidden-layer features, which are passed through a fully connected layer to generate the retrieval results. For the CW mode, which cannot directly retrieve DSD parameters and is constrained to the reflectivity factor–Rainfall rate (Z–R) relationship (Z = aR^b), an algorithm based on the Extended Kalman Filter (EKF) is proposed to optimize this relationship. The method dynamically models the Z–R parameters, computes the residual between predicted rainfall intensity and rain gauge observations, and updates the prior estimates accordingly. Physical constraints are applied to parameters a and b during state updates to ensure consistency with physical laws, thereby enabling accurate fitting of the Z–R relationship.  Results and Discussions  Experimental results show that both models enhance the accuracy of rainfall intensity retrieval. For the FMCW mode, the LSTM–Attention–LSTM model applied to the test dataset outperforms traditional physical models, single-layer LSTM, and double-layer LSTM. It effectively captures the temporal variation of rainfall intensity, with the absolute error relative to observed values remaining below 0.25 mm/h (Fig. 5). Compared with the traditional physical model, the LSTM–Attention–LSTM reduces the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) by 46% and 38%, achieving values of 0.1623 mm/h and 0.147 mm/h, respectively, and increases R2 by 14.5% to 0.95 (Table 2). For the CW mode, the Z–R relationship optimized by the EKF model provides the best fit for the Z and R distribution in the validation dataset (Fig. 6).
Rainfall intensity retrieved with this algorithm on the test set exhibits the smallest deviation from actual observations compared with convective cloud empirical formulas, Beijing plain area empirical formulas, and the dynamic Z–R method. The corresponding RMSE, MAE, and R2 reach 0.1076 mm/h, 0.094 mm/h, and 0.972, respectively (Fig. 7; Table 4).  Conclusions  This study proposes two multi-source data fusion schemes that integrate dual-mode radar with rain gauges for short-term rainfall monitoring. Experimental results confirm that both methods significantly improve the accuracy of rainfall intensity retrieval and demonstrate strong dynamic adaptability and robustness.
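One EKF iteration on the Z–R parameters, in the spirit described above, can be sketched as follows. The random-walk process model, the noise values, and the constraint box b ∈ [1, 3] are illustrative assumptions of ours; the measurement model inverts Z = aR^b to predict rainfall intensity and corrects the state against the rain gauge observation.

```python
import numpy as np

def ekf_zr_update(state, P, Z, R_gauge, Q=np.eye(2) * 1e-4, r_var=0.01):
    """One EKF step for the Z-R parameters x = [a, b] (Z = a * R**b).
    Measurement model: R_pred = (Z / a) ** (1 / b); residual vs. the gauge."""
    a, b = state
    P = P + Q                                   # random-walk prediction
    R_pred = (Z / a) ** (1.0 / b)
    # Jacobian of h(a, b) = (Z/a)^(1/b)
    dRda = -R_pred / (a * b)
    dRdb = -R_pred * np.log(Z / a) / b ** 2
    H = np.array([[dRda, dRdb]])
    S = H @ P @ H.T + r_var                     # innovation covariance
    K = P @ H.T / S                             # Kalman gain
    state = state + (K * (R_gauge - R_pred)).ravel()
    # physical constraints: a > 0, b kept in a plausible range
    state[0] = max(state[0], 1e-3)
    state[1] = min(max(state[1], 1.0), 3.0)
    P = (np.eye(2) - K @ H) @ P
    return state, P
```

A single update with a biased prior already shrinks the rainfall-intensity residual, and repeated updates over a rain event fit the Z–R curve to the gauge data.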
Efficient Storage Method for Real-Time Simulation of Wide-Range Multipath Delay Spread Channels
LI Weishi, ZHOU Hui, JIAO Xun, XU Qiang, TANG Youxi
Available online  , doi: 10.11999/JEIT250525
Abstract:
  Objective  The real-time channel emulator is a critical tool in wireless device research and development, enabling accurate and repeatable experiments in controlled laboratory environments. This capability reduces testing costs by avoiding extensive field trials and accelerates development cycles by allowing rapid iteration and validation of wireless devices under realistic conditions. With the rapid advancement of aerial platforms—including drones, High-Altitude Pseudo-Satellites (HAPS), and Unmanned Aerial Vehicles (UAVs)—for integrated sensing and communication, high-resolution imaging, and environmental reconstruction in complex wireless environments, the challenges of channel modeling have increased considerably. In particular, there is growing demand for real-time simulation of wide-range multipath delay spread channels. Existing simulation methods, although effective in traditional scenarios, face substantial limitations in hardware storage resources when handling such channels. This study addresses these limitations by proposing an efficient storage method for real-time emulation of wide-range multipath channels. The method reduces memory overhead while preserving high fidelity in channel reproduction, thereby offering a practical and optimized solution for next-generation wireless communication research.  Methods  In conventional real-time channel emulation, a combined simulation approach is adopted, employing cascaded common delay and multipath delay spread components. The common delay component is implemented using a single high-capacity memory module, whereas the multipath delay spread component is implemented using a Dense Tapped Delay Line (D-TDL). This design reduces storage resource requirements by multiplexing the common delay component, but the achievable multipath delay spread range remains limited. 
Moreover, the multipath delay is constrained by the common delay component, reducing flexibility and compromising the ability to emulate complex scenarios. The Sparse Tapped Delay Line (S-TDL) scheme is used in some algorithms to extend the multipath delay emulation range by cascading block memory modules. However, this method introduces inter-tap delay dependencies and cannot adapt to the requirements of wide-range multipath delay spread channels. Alternatively, Time-Division Multiplexing (TDM) is applied in other algorithms to improve the utilization efficiency of block memory modules and decouple multipath delay control. Despite this, TDM is constrained by the read/write bandwidth of memory, making it unsuitable for real-time channel emulation of large-bandwidth signals. To overcome the multi-tap delay coupling issue in the S-TDL algorithm, an Optimized Sparse Tapped Delay Line (OS-TDL) algorithm is proposed. By analyzing delay-dependent relationships among multipath taps, theoretical derivation establishes an analytical relationship between the number of multipaths and the delay spread range achievable under decoupling constraints. Redundant taps are introduced to eliminate inter-tap delay dependencies, enabling flexible configuration of arbitrary multipath delay combinations. The algorithm formulates a joint optimization model that balances hardware memory allocation and multipath delay spread fidelity, supports wide-range multipath scenarios without being limited by memory read/write bandwidth, and allows real-time emulation of large-bandwidth signals. The central innovation lies in dynamically constraining tap activation and sparsity patterns to reduce redundant memory while preserving wide-range multipath delay spread channel characteristics. Compared with conventional approaches, the proposed algorithm significantly enhances storage resource utilization efficiency in wide-range multipath channel emulation. 
On this basis, a concrete algorithmic procedure is developed, in which an input multipath delay sequence is computationally processed to derive delay configuration parameters and activation sequences for multiple cascaded memory units. Comprehensive validation procedures for the algorithm are presented in later sections.  Results and Discussions  Conventional S-TDL algorithms are constrained by inter-tap delay coupling, which limits their ability to achieve high-fidelity emulation of wide-range multipath delay variations. To overcome this limitation, a comparative simulation of three algorithms—the memory resource exclusive algorithm, the TDM memory resource algorithm, and the OS-TDL algorithm proposed herein—is systematically conducted. A controlled variable approach is employed to evaluate storage resource utilization efficiency across three key dimensions: signal sampling rate, number of emulated multipath components, and multipath delay spread range. Theoretical analysis and simulation results show that the proposed OS-TDL algorithm significantly reduces memory requirements compared with conventional methods, while maintaining emulation fidelity. Its effectiveness is further verified through experimental implementation on AMD’s Virtex UltraScale+ series high-performance Field-Programmable Gate Array (FPGA), using the XCVU13P verification platform. Comparative FPGA resource measurements under identical system specifications confirm the superiority of the proposed algorithm, demonstrating its ability to improve memory efficiency while accurately reproducing wide-range multipath delay spread channels.  Conclusions  This study addresses the challenge of storage resource utilization efficiency in real-time channel emulation for wide-range multipath delay spread by analyzing the inter-tap delay dependency inherent in conventional S-TDL algorithms. An OS-TDL algorithm is proposed to emulate wide-range multipath delay spread channels. 
Both simulation and hardware verification results demonstrate that the proposed algorithm substantially improves storage efficiency while accurately reproducing multipath wide-range delay spread characteristics. These findings confirm that the algorithm meets the design requirements of real-time channel emulators for increasingly complex verification scenarios.
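To make the tapped-delay-line idea concrete, the toy model below emulates a multipath delay-spread channel with sparse taps read from one shared sample buffer at arbitrary, mutually independent delays. It illustrates only the input-output behaviour that the OS-TDL hardware must reproduce; the memory-block mapping, activation sequences, and redundancy optimization of the actual algorithm are not modelled here.

```python
from collections import deque

class TappedDelayLine:
    """Toy multipath emulator: sparse taps at arbitrary, mutually
    decoupled sample delays, all reading one shared delay buffer."""

    def __init__(self, delays, gains):
        self.delays, self.gains = delays, gains
        depth = max(delays) + 1
        self.buf = deque([0.0] * depth, maxlen=depth)

    def step(self, x):
        # newest sample sits at index 0; tap d reads the sample delayed by d
        self.buf.appendleft(x)
        return sum(g * self.buf[d] for d, g in zip(self.delays, self.gains))
```

Feeding an impulse through the line reproduces each tap's gain at its configured delay, which is the reference behaviour a hardware emulator is validated against.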
Ultra-wideband Bonding Wire RF Characteristics Compensation IC and Circuit Design for Microwave Components
KONG Weidong, YAN Pengyi, LU Shaopeng, WANG Qiaonan, DENG Shixiong, LIN Peng, WANG Cong, YANG Guohui, ZHANG Kuang
Available online  , doi: 10.11999/JEIT250502
Abstract:
  Objective  In microwave modules, assembly gaps often occur between power amplifier chips and multilayer hybrid circuit boards or among different circuit units. These gaps form deep transition trenches that significantly degrade RF signal transmission quality, particularly at millimeter-wave frequencies. Bonding wires remain a critical solution for establishing electrical interconnections between RF chips and other structures. However, the inherent parasitic inductance of gold bonding wires adversely affects system performance. As RF modules increasingly operate in the Ka-band and W-band, the degradation caused by this parasitic inductance has become more pronounced. The problem is especially severe when the ground-signal return path is excessively long or when the bonding wires themselves are too long.  Methods  The impedance transformation paths of T-type and π-type matching networks are compared on the Smith chart. The analysis indicates that for a given parasitic inductance of bonding wires, the Q-circle of the π-type matching network is smaller, thereby enabling a broader matching bandwidth. A π-type matching network for chip-to-chip interconnection is realized by optimizing the bonding pad dimensions on the GaAs chip to provide capacitive loading. As the bonding pad size increases, more gold wires can be bonded to the chip, which simultaneously reduces the parasitic inductance of the wires. Additionally, a symmetric “Ground-Signal-Ground (GSG)” bonding pad structure is designed on the GaAs chip, which shortens the ground return path and further reduces the parasitic inductance of the bonding wires. By integrating these three design strategies, the proposed chip and transition structure are shown to substantially improve the performance of cross-deep-gap transitions between different circuit units in microwave modules.  
  Results and Discussions  The proposed chip and transition structure substantially improve the performance of cross-trench transitions between different circuit units in microwave modules (Fig. 7). Simulation results show that the interconnection architecture effectively mitigates the adverse effects of trench depth on RF characteristics. Experimental validation further confirms that the π-type matching network implemented with the designed chip achieves an ultra-wideband, high-performance cross-trench transition, with a return loss of ≥ 17 dB and an insertion loss of ≤ 0.7 dB over the DC to 40 GHz frequency range.  Conclusions  Comparative analysis of impedance transformation paths between T-type and π-type matching networks demonstrates that in gold-wire bonding interconnections, the π-type configuration is more effective in mitigating the effect of bonding wire parasitic inductance on matching bandwidth, making it suitable for ultra-wideband bonded interconnection circuits. To implement the π-type matching network using GaAs technology, the bonding pad area on the chip is enlarged to provide capacitive loading and to allow additional bonding wires, thereby further reducing parasitic inductance. A GSG structure is also designed on the GaAs chip surface to modify the reference ground return path of the bonded interconnections, leading to additional reduction in parasitic inductance. By integrating these features, an ultra-wideband compensation chip is developed and applied to cross-trench transition structures in microwave modules. Experimental results indicate that for a transition structure with a trench depth of 2 mm and a width of 0.2 mm, the proposed design achieves high-performance characteristics from DC to 40 GHz, with return loss ≥ 17 dB and insertion loss ≤ 0.7 dB. When applied to interconnections between RF chips and circuit boards in microwave modules, the chip also significantly enhances the RF matching performance of bonded interconnections.
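The benefit of the π-type configuration can be checked with a quick ABCD-matrix calculation. The sketch below compares |S11| of a series bond-wire inductance with and without shunt pad capacitance; the component values (0.2 nH wire, 40 fF pads, chosen so that 2C ≈ L/Z0²) are illustrative assumptions, not the paper's measured parameters.

```python
import numpy as np

def pi_network_s11(f, L, C, z0=50.0):
    """|S11| of a C-L-C pi network (series bond-wire inductance L shunted
    by two pad capacitors C) between z0 ports, via ABCD-matrix cascade."""
    w = 2 * np.pi * f
    yc = 1j * w * C                       # shunt pad admittance
    zl = 1j * w * L                       # series bond-wire impedance
    shunt = np.array([[1, 0], [yc, 1]])
    series = np.array([[1, zl], [0, 1]])
    A, B, C2, D = (shunt @ series @ shunt).ravel()
    # standard ABCD -> S11 conversion for equal reference impedances
    s11 = (A + B / z0 - C2 * z0 - D) / (A + B / z0 + C2 * z0 + D)
    return abs(s11)
```

At 10 GHz the pad capacitance absorbs the wire inductance into an approximately 50 Ω lumped line section, collapsing the reflection by roughly two orders of magnitude relative to the bare wire.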
Lightweight Incremental Deployment for Computing-Network Converged AI Services
WANG Qinding, TAN Bin, HUANG Guangping, DUAN Wei, YANG Dong, ZHANG Hongke
Available online  , doi: 10.11999/JEIT250663
Abstract:
  Objective   The rapid expansion of Artificial Intelligence (AI) computing services has heightened the demand for flexible access and efficient utilization of computing resources. Traditional Domain Name System (DNS) and IP-based scheduling mechanisms are constrained in addressing the stringent requirements of low latency and high concurrency, highlighting the need for integrated computing-network resource management. To address these challenges, this study proposes a lightweight deployment framework that enhances network adaptability and resource scheduling efficiency for AI services.  Methods   The AI-oriented Service IDentifier (AISID) is designed to encode service attributes into four dimensions: Object, Function, Method, and Performance. Service requests are decoupled from physical resource locations, enabling dynamic resource matching. AISID is embedded within IPv6 packets (Fig. 5), consisting of a 64-bit prefix for identification and a 64-bit service-specific suffix (Fig. 4). A lightweight incremental deployment scheme is implemented through hierarchical routing, in which stable wide-area routing is managed by ingress gateways, and fine-grained local scheduling is handled by egress gateways (Fig. 6). Ingress and egress gateways are incrementally deployed under the coordination of an intelligent control system to optimize resource allocation. AISID-based paths are encapsulated at ingress gateways using Segment Routing over IPv6 (SRv6), whereas egress gateways select optimal service nodes according to real-time load data using a weighted least-connections strategy (Fig. 8). AISID lifecycle management includes registration, query, migration, and decommissioning phases (Table 2), with global synchronization maintained by the control system. Resource scheduling is dynamically adjusted according to real-time network topology and node utilization metrics (Fig. 7).  
Results and Discussions   Experimental results show marked improvements over traditional DNS/IP architectures. The AISID mechanism reduces service request initiation latency by 61.3% compared to DNS resolution (Fig. 9), as it eliminates the need for round-trip DNS queries. Under 500 concurrent requests, network bandwidth utilization variance decreases by 32.8% (Fig. 10), reflecting the ability of AISID-enabled scheduling to alleviate congestion hotspots. Computing resource variance improves by 12.3% (Fig. 11), demonstrating more balanced workload distribution across service nodes. These improvements arise from AISID’s precise semantic matching in combination with the hierarchical routing strategy, which together enhance resource allocation efficiency while maintaining compatibility with existing IPv6/DNS infrastructure (Fig. 23). The incremental deployment approach further reduces disruption to legacy networks, confirming the framework’s practicality and viability for real-world deployment.  Conclusions   This study establishes a computing-network convergence framework for AI services based on semantic-driven AISID and lightweight deployment. The key innovations include AISID’s semantic encoding, which enables dynamic resource scheduling and decoupled service access, together with incremental gateway deployment that optimizes routing without requiring major modifications to legacy networks. Experimental validation demonstrates significant improvements in latency reduction, bandwidth efficiency, and balanced resource utilization. Future research will explore AISID’s scalability across heterogeneous domains and its robustness under dynamic network conditions.
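The weighted least-connections rule used by egress gateways to pick a service node can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the node names, weights, and connection counts below are hypothetical, and the actual gateways draw on real-time load telemetry.

```python
# Illustrative weighted least-connections selection at an egress gateway.
# Node ids, connection counts, and weights are hypothetical.

def pick_node(nodes):
    """Select the node with the lowest connections-to-weight ratio.

    nodes: dict mapping node id -> (active_connections, weight).
    A larger weight means more capacity, so the effective load is
    connections / weight; the least-loaded node wins.
    """
    return min(nodes, key=lambda n: nodes[n][0] / nodes[n][1])

# Hypothetical real-time load data reported to the gateway.
load = {
    "gpu-node-a": (8, 4.0),   # 8 active requests, weight 4 -> load 2.0
    "gpu-node-b": (3, 1.0),   # load 3.0
    "gpu-node-c": (5, 2.0),   # load 2.5
}
print(pick_node(load))  # -> gpu-node-a
```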
Multi-granularity Text Perception and Hierarchical Feature Interaction Method for Visual Grounding
CAI Hua, RAN Yue, FU Qiang, LI Junyan, ZHANG Chenjie, SUN Junxi
Available online, doi: 10.11999/JEIT250387
Abstract:
  Objective  Visual grounding requires effective use of textual information for accurate target localization. Traditional methods primarily emphasize feature fusion but often neglect the guiding role of text, which limits localization accuracy. To address this limitation, a Multi-granularity Text Perception and Hierarchical Feature Interaction method for Visual Grounding (ThiVG) is proposed. In this method, the hierarchical feature interaction module is progressively incorporated into the image encoder to enhance the semantic representation of image features. The multi-granularity text perception module is designed to generate weighted text with spatial and semantic enhancement, and a preliminary Hadamard product-based fusion strategy is applied to refine image features for cross-modal fusion. Experimental results show that the proposed method substantially improves localization accuracy and effectively alleviates the performance bottleneck arising from over-reliance on feature fusion modules in conventional approaches.  Methods  The proposed method comprises an image-text feature extraction network, a hierarchical feature interaction module, a multi-granularity text perception module, and an image-text cross-modal fusion and target localization network (Fig. 1). The image-text feature extraction network includes image and text branches for extracting their respective features (Fig. 2). In the image branch, text features are incorporated into the image encoder through the hierarchical feature interaction module (Fig. 3). This enables text information to filter and update image features, thereby strengthening their semantic expressiveness. The multi-granularity text perception module employs three perception mechanisms to fully extract spatial and semantic information from the text (Fig. 4). 
It generates weighted text, which is preliminarily fused with image features through a Hadamard product-based strategy, providing fine-grained image features for subsequent cross-modal fusion. The image-text cross-modal fusion module then deeply integrates image and text features using a Transformer encoder (Fig. 5), capturing their complex relationships. Finally, a Multilayer Perceptron (MLP) performs regression to predict the bounding box coordinates of the target location. This method not only achieves effective integration of image and text information but also improves accuracy and robustness in visual grounding tasks through hierarchical feature interaction and deep cross-modal fusion, offering a novel approach to complex localization challenges.  Results and Discussions  Comparison experiments demonstrate that the proposed method achieves substantial accuracy gains across five benchmark visual localization datasets (Tables 1 and 2), with particularly strong performance on the long-text RefCOCOg dataset. Although the model has a larger parameter size, comparisons of parameter counts and training-inference times indicate that its overall performance still exceeds that of traditional methods (Table 3). Ablation studies further verify the contribution of each key module (Table 4). The hierarchical feature interaction module improves the semantic representation of image features by incorporating textual information into the image encoder (Table 5). The multi-granularity text perception module enhances attention to key textual components through perception mechanisms and adaptive weighting (Table 6). By avoiding excessive modification of the text structure, it markedly strengthens the model’s capacity to process long text and complex sentences. 
Experiments on the number of encoder layers in the cross-modal fusion module show that a 6-layer deep fusion encoder effectively filters irrelevant background information (Table 7), yielding a more precise feature representation for the localization regression MLP. Generalization tests and visualization analyses further demonstrate that the proposed method maintains high adaptability and accuracy across diverse and challenging localization scenarios (Figs. 6 and 7).  Conclusions  This study proposes a visual grounding algorithm that integrates multi-granularity text perception with hierarchical feature interaction, effectively addressing the under-utilization of textual information and the reliance on single-feature fusion in existing approaches. Key innovations include the hierarchical feature interaction module in the image branch, which markedly enhances the semantic representation of image features; the multi-granularity text perception module, which fully exploits textual information to generate weighted text with spatial and semantic enhancement; and a preliminary Hadamard product-based fusion strategy, which provides fine-grained image representations for cross-modal fusion. Experimental results show that the proposed method achieves substantial accuracy improvements on classical vision datasets and demonstrates strong adaptability and robustness across diverse and complex localization scenarios. Future work will focus on extending this method to accommodate more diverse text inputs and further improving localization performance in challenging visual environments.
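The preliminary Hadamard product-based fusion amounts to an element-wise product between image features and a weighted text vector, applied channel-wise before deep cross-modal fusion. A minimal NumPy sketch, where the feature shapes are illustrative rather than ThiVG's actual dimensions:

```python
# Minimal sketch of Hadamard (element-wise) fusion between image features
# and a pooled, weighted text vector. Shapes/values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
img_feats = rng.standard_normal((196, 256))   # e.g. 14x14 patches, 256-d each
txt_weight = rng.standard_normal(256)         # weighted text vector, 256-d

# Broadcasting applies the text weights channel-wise to every patch,
# letting the text emphasize or suppress image channels.
fused = img_feats * txt_weight                # Hadamard product
print(fused.shape)  # -> (196, 256)
```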
Electromagnetic Signal Feature Matching Characterization for Constant False Alarm Detection
WANG Zixin, XIANG Houhong, TIAN Bo, MA Hongwei, WANG Yuhao, ZENG Xiaolu, WANG Fengyu
Available online, doi: 10.11999/JEIT250589
Abstract:
  Objective  Small targets such as unmanned aerial vehicles and unmanned vessels, which exhibit small Radar Cross Section (RCS) values and weak echoes, are difficult to detect due to their low observability. Traditional Constant False Alarm Rate (CFAR) detection is typically represented by the Cell-Averaged (CA) CFAR method, in which the detection threshold is determined by the statistical power parameter of the signal. However, its detection performance is constrained by the Signal-to-Noise Ratio (SNR). This study focuses on how to exploit and apply signal features beyond power parameters to achieve CFAR detection under lower SNR conditions.  Methods  After pulse compression, the envelope of a Linear Frequency Modulation (LFM) signal exhibits sinc characteristics, whereas noise retains its random nature. This difference can be used to distinguish target echoes from non-target signals. On this basis, we propose a constant false alarm detection method based on signal feature matching. First, both the ideal echo signal and the actual echo signal are processed with sliding windows of equal length to generate an ideal sample and a set of test samples. A dual-port fully connected neural network is then constructed to extract the deep feature matching degree between the ideal sample and the test samples. Finally, the constant false alarm threshold is obtained by numerically calculating the deep feature matching parameter from a large number of non-target samples compared with the standard sample.  Results and Discussions  Several sets of simulation experiments are carried out, and measured radar data from different frequency bands are applied to verify the effectiveness of the proposed method. The simulations first confirm that the method maintains stable constant false alarm characteristics (Table 1). The detection performance is then compared with traditional CA-CFAR detection, machine learning approaches, and other deep learning methods. 
The results indicate that, relative to CA-CFAR detection, the proposed method achieves a 2–5 dB gain in equivalent SNR across different false alarm probabilities (Fig. 4). Under mismatched SNR conditions, the method continues to demonstrate robust detection performance with strong generalization capability (Fig. 5). In the processing of measured X-band radar data, the proposed method detects targets that CA-CFAR fails to identify, extending the detection range to 740 distance units, compared with 562 distance units for CA-CFAR, corresponding to an improvement of approximately 28.72% in radar detection capability (Figs. 7 and 8). In the case of S-band radar data, the proposed method significantly reduces false alarms (Figs. 10 and 11).  Conclusions  This study exploits the difference between target and noise signal envelopes by introducing a feature extraction network that effectively enhances target detection performance. Comparative simulation experiments and the processing of measured radar data across different frequency bands demonstrate the following: (1) the proposed method markedly improves detection performance over traditional CA-CFAR detection, yielding a 2–5 dB gain in equivalent SNR; (2) under mismatched SNR conditions, the method shows strong generalization capability, achieving better detection performance than other deep learning and machine learning approaches; (3) in X-band radar data processing, the method increases detection capability by approximately 28.72%; and (4) in S-band radar data processing, it significantly reduces false alarms. Future work will focus on accelerating the detection process to further improve efficiency.
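For context, the CA-CFAR baseline against which the proposed detector is compared sets its threshold from the average power of training cells surrounding the cell under test, with guard cells excluded. A textbook 1-D sketch (the window sizes and scaling factor below are illustrative, not the paper's settings):

```python
# Textbook CA-CFAR sketch: threshold each cell by a scaled mean of the
# surrounding training cells, skipping guard cells next to the cell
# under test. Window sizes and alpha are illustrative.
import numpy as np

def ca_cfar(x, n_train=8, n_guard=2, alpha=5.0):
    """Return a boolean detection mask for a 1-D power signal x."""
    x = np.asarray(x, dtype=float)
    det = np.zeros_like(x, dtype=bool)
    half = n_train // 2 + n_guard
    for i in range(half, len(x) - half):
        left = x[i - half : i - n_guard]           # training cells, left side
        right = x[i + n_guard + 1 : i + half + 1]  # training cells, right side
        noise = np.mean(np.concatenate([left, right]))
        det[i] = x[i] > alpha * noise              # local power-based threshold
    return det

# A strong cell amid unit-power noise is flagged; flat noise is not.
noise = np.ones(64)
sig = noise.copy(); sig[32] = 50.0
print(ca_cfar(sig)[32], ca_cfar(noise).any())  # -> True False
```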
Recent Advances of Programmable Schedulers
ZHAO Yazhu, GUO Zehua, DOU Songshi, FU Xiaoyang
Available online, doi: 10.11999/JEIT250657
Abstract:
  Objective  In recent years, diversified user demands, dynamic application scenarios, and massive data transmissions have imposed increasingly stringent requirements on modern networks. Network schedulers play a critical role in ensuring efficient and reliable data delivery, enhancing overall performance and stability, and directly shaping user-perceived service quality. Traditional scheduling algorithms, however, rely largely on fixed hardware, with scheduling logic hardwired during chip design. These designs are inflexible, provide coarse and static scheduling granularity, and offer limited capability to represent complex policies. Therefore, they hinder rapid deployment, increase upgrade costs, and fail to meet the evolving requirements of heterogeneous and large-scale network environments. Programmable schedulers, in contrast, leverage flexible hardware architectures to support diverse strategies without hardware replacement. Scheduling granularity can be dynamically adjusted at the flow, queue, or packet level to meet varied application requirements with precision. Furthermore, they enable the deployment of customized logic through data plane programming languages, allowing rapid iteration and online updates. These capabilities significantly reduce maintenance costs while improving adaptability. The combination of high flexibility, cost-effectiveness, and engineering practicality positions programmable schedulers as a superior alternative to traditional designs. Therefore, the design and optimization of high-performance programmable schedulers have become a central focus of current research, particularly for data center networks and industrial Internet applications, where efficient, flexible, and controllable traffic scheduling is essential.  Methods  The primary objective of current research is to design universal, high-performance programmable schedulers. 
Achieving simultaneous improvements across multiple performance metrics, however, remains a major challenge. Hardware-based schedulers deliver high performance and stability but incur substantial costs and typically support only a limited range of scheduling algorithms, restricting their applicability in large-scale and heterogeneous network environments. In contrast, software-based schedulers provide flexibility in expressing diverse algorithms but suffer from inherent performance constraints. To integrate the high performance of hardware with the flexibility of software, recent designs of programmable schedulers commonly adopt First-In First-Out (FIFO) or Push-In First-Out (PIFO) queue architectures. These approaches emphasize two key performance metrics: scheduling accuracy and programmability. Scheduling accuracy is critical, as modern applications such as real-time communications, online gaming, telemedicine, and autonomous driving demand strict guarantees on packet timing and ordering. Even minor errors may result in increased latency, reduced throughput, or connection interruptions, compromising user experience and service reliability. Programmability, by contrast, enables network devices to adapt to diverse scenarios, supporting rapid deployment of new algorithms and flexible responses to application-specific requirements. Improvements in both accuracy and programmability are therefore essential for developing efficient, reliable, and adaptable network systems, forming the basis for future high-performance deployments.  Results and Discussions  The overall packet scheduling process is illustrated in (Fig. 1), where scheduling is composed of scheduling algorithms and schedulers. At the ingress or egress pipelines of end hosts or network devices, scheduling algorithms assign a Rank value to each packet, determining the transmission order based on relative differences in Rank. 
Upon arrival at the traffic manager, the scheduler sorts and forwards packets according to their Rank values. Through the joint operation of algorithms and schedulers, packet scheduling is executed while meeting quality-of-service requirements. A comparative analysis of the fundamental principles of FIFO and PIFO scheduling mechanisms (Fig. 2) highlights their differences in queue ordering and disorder control. At present, most studies on programmable schedulers build upon these two foundational architectures (Fig. 3), with extensions and optimizations primarily aimed at improving scheduling accuracy and programmability. Specific strategies include admission control, refinement of scheduling algorithms, egress control, and advancements in data structures and queue mechanisms. On this basis, the current research progress on programmable schedulers is reviewed and systematically analyzed. Existing studies are compared along three key dimensions: structural characteristics, expressive capability, and approximation accuracy (Table 1).  Conclusions  Programmable schedulers, as a key technology for next-generation networks, enable flexible traffic management and open new possibilities for efficient packet scheduling. This review has summarized recent progress in the design of programmable schedulers across diverse application scenarios. The background and significance of programmable schedulers within the broader packet scheduling process were first clarified. An analysis of domestic and international literature shows that most current studies focus on FIFO-based and PIFO-based architectures to improve scheduling accuracy and programmability. The design approaches of these two architectures were examined, the main technical methods for enhancing performance were summarized, and their structural characteristics, expressive capabilities, and approximation accuracy were compared, highlighting respective advantages and limitations. 
Potential improvements in existing research were also identified, and future development directions were discussed. Nevertheless, the design of a universal, high-performance programmable scheduler remains a critical challenge. Achieving optimal performance across multiple metrics while ensuring high-quality network services will require continued joint efforts from both academia and industry.
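A PIFO queue admits packets in arbitrary order but always dequeues the packet with the smallest Rank first. A minimal software model using a binary heap is shown below; the hardware designs surveyed above approximate this behavior at line rate, and the rank values here are hypothetical:

```python
# Minimal software model of a Push-In First-Out (PIFO) queue: push in any
# order, always pop the smallest rank. A sequence counter breaks ties so
# equal-rank packets leave in FIFO order.
import heapq
import itertools

class PIFO:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal ranks

    def push(self, rank, pkt):
        heapq.heappush(self._heap, (rank, next(self._seq), pkt))

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = PIFO()
q.push(30, "C"); q.push(10, "A"); q.push(20, "B")
print(q.pop(), q.pop(), q.pop())  # -> A B C
```

The FIFO counterpart drops the rank entirely and forwards in arrival order, which is what the approximation techniques above trade accuracy against.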
Collaborative Inference for Large Language Models Against Jamming Attacks
LIN Zhiping, XIAO Liang, CHEN Hongyi, XU Xiaoyu, LI Jieling
Available online, doi: 10.11999/JEIT250675
Abstract:
  Objective  Collaborative inference with Large Language Models (LLMs) is employed to enable mobile devices to offload multi-modal data, including images, text, video, and environmental information such as temperature and humidity, to edge servers. This offloading improves the performance of inference tasks such as human-computer question answering, logical reasoning, and decision support. Jamming attacks, however, increase transmission latency and packet loss, which reduces task completion rates and slows inference. A reinforcement learning-based collaborative inference scheme is proposed to enhance inference speed, accuracy, and task completion under jamming conditions. LLMs with different sparsity levels and quantization precisions are deployed on edge servers to meet heterogeneous inference requirements across tasks.  Methods  A reinforcement learning-based collaborative inference scheme is proposed to enhance inference accuracy, speed, and task completion under jamming attacks. The scheme jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and channels for data offloading, based on task type, data volume, channel gains, and received jamming power. A policy risk function is formulated to quantify the probability of inference task failure given offloading latency and packet loss rate, thereby reducing the likelihood of unsafe policy exploration. Each edge server deploys LLMs with varying sparsity rates and quantization precisions, derived from layer-wise unstructured pruning and model parameter quantization, to process token vectors of multi-modal data including images, text, video, and environmental information such as temperature and humidity. This configuration is designed to meet diverse requirements for inference accuracy and speed across different tasks. 
The LLM inference system is implemented with mobile devices offloading images and text to edge servers for human-computer question answering and driving decision support. The edge servers employ a vision encoder and tokenizer to transform the received sensing data into token vectors, which serve as inputs to the LLMs. Pruning and parameter quantization are applied to the foundation model LLaVA-1.5-7B, generating nine LLM variants with different sparsity rates and quantization precisions to accommodate heterogeneous inference demands.  Results and Discussions  Experiments are conducted with three vehicles offloading images (i.e., captured traffic scenes) and texts (i.e., user prompts) using a maximum transmit power of 100 mW on frequency channels in the 5.170–5.330 GHz band. The system is evaluated against a smart jammer that applies Q-learning to block one of the 20 MHz channels within this band. The results show consistent performance gains over benchmark schemes. Faster responses and more accurate driving advice are achieved, enabled by reduced offloading latency and lower packet loss in image transmission, which allow the construction of more complete traffic scenes. Over 20 repeated runs, inference speed is improved by 20.3%, task completion rate by 14.1%, and accuracy by 12.2%. These improvements are attributed to the safe exploration strategy, which prevents performance degradation and satisfies diverse inference requirements across tasks.  Conclusions  This paper proposed a reinforcement learning-based collaborative inference scheme that jointly selects the edge servers, sparsity rates and quantization levels of LLMs, as well as the transmit power and offloading channels, to counter jamming attacks. The inference system deploys nine LLM variants with different sparsity rates and quantization precisions for human-computer question answering and driving decision support, thereby meeting heterogeneous requirements for accuracy and speed. 
Experimental results demonstrate that the proposed scheme provides faster responses and more reliable driving advice. Specifically, it improves inference speed by 20.3%, task completion rate by 14.1%, and accuracy by 12.2%, achieved through reduced offloading latency and packet loss compared with benchmark approaches.
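The jammer in the experiment applies Q-learning to channel selection; a toy, stateless sketch of that style of learner is given below. The channel count, rewards, and hyperparameters are invented for illustration, and the paper's defense side uses a much richer state space plus a policy risk function to avoid unsafe exploration:

```python
# Toy Q-learning over channel choice against a fixed jammer. All numbers
# (channel count, jammed channel, rewards, hyperparameters) are hypothetical
# and only illustrate the kind of learner used in the evaluation.
import random

random.seed(0)
N_CH = 8           # available channels
JAMMED = 3         # channel the jammer occupies in this toy run
Q = [0.0] * N_CH
eps, lr, gamma = 0.1, 0.5, 0.9

for _ in range(500):
    # epsilon-greedy action selection over channels
    a = random.randrange(N_CH) if random.random() < eps \
        else max(range(N_CH), key=Q.__getitem__)
    r = -1.0 if a == JAMMED else 1.0          # offload succeeds unless jammed
    Q[a] += lr * (r + gamma * max(Q) - Q[a])  # Q-learning update

best = max(range(N_CH), key=Q.__getitem__)
print(best != JAMMED)  # -> True: the learned policy avoids the jammed channel
```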
Parametric Holographic MIMO Channel Modeling and Its Bayesian Estimation
YUAN Zhengdao, GUO Yabo, GAO Dawei, GUO Qinghua, HUANG Chongwen, LIAO Guisheng
Available online, doi: 10.11999/JEIT250436
Abstract:
  Objective  Holographic Multiple-Input Multiple-Output (HMIMO), based on continuous-aperture antennas and programmable metasurfaces, is regarded as a cornerstone of 6G wireless communication. Its potential to overcome the limitations of conventional massive MIMO is critically dependent on accurate channel modeling and estimation. Three major challenges remain: (1) oversimplified electromagnetic propagation models, such as far-field approximations, cause severe mismatches in near-field scenarios; (2) statistical models fail to characterize the coupling between channel coefficients, user positions, and random orientations; and (3) the high dimensionality of parameter spaces results in prohibitive computational complexity. To address these challenges, a hybrid parametric-Bayesian framework is proposed in which neural networks, factor graphs, and convex optimization are integrated. Precise channel estimation, user position sensing, and angle decoupling in near-field HMIMO systems are thereby achieved. The methodology provides a pathway toward high-capacity 6G applications, including Integrated Sensing And Communication (ISAC).  Methods  A hybrid channel estimation method is proposed to decouple the “channel-coordinate-angle” parameters and to enable joint estimation of channel coefficients, coordinates, and angles under random user orientations. A neural network is first employed to capture the nonlinear relationship between holographic channel characteristics and the relative coordinates of the base station and user. The trained network is then embedded into a factor graph, where global optimization is performed. The neural network is dynamically approximated through Taylor expansion, allowing bidirectional message propagation and iterative refinement of parameter estimates. To address random user orientations, Euler angle rotation theory is introduced. 
Finally, convex optimization is applied to estimate the rotation mapping matrix, resulting in the decoupling of coordinate and angle parameters and accurate channel estimation.  Results and Discussions  The simulations evaluate the performance of different algorithms under varying key parameters, including Signal-to-Noise Ratio (SNR), pilot length L, and base station antenna number M. Two performance metrics are considered: Normalized Mean Square Error (NMSE) of channel estimation and user positioning accuracy, with the Cramér–Rao Lower Bound (CRLB) serving as the theoretical benchmark. At an SNR of 10 dB, the proposed method achieves a channel NMSE below –40 dB, outperforming Least Squares (LS) estimation and approximate model-based approaches. Under high SNR conditions, the NMSE converges toward the CRLB, confirming near-optimal performance (Fig. 5a). The proposed channel model demonstrates superior performance over “approximate methods” due to its enhanced characterization of real-world channels. Moreover, the positioning error gap between the proposed method and the “parallel bound” narrows to nearly 3 dB at high SNR, confirming the accuracy of angle estimation and the effectiveness of parameter decoupling (Fig. 5b). In addition, the proposed method maintains performance close to the theoretical bounds when system parameters, such as user antenna number N, base station antenna number M, and pilot length L, are varied, demonstrating strong robustness (Figs. 6–8). These results also show that the Euler angle rotation-based estimation effectively compensates for coordinate offsets induced by random user orientations.  Conclusions  This study proposes a framework for HMIMO channel estimation by integrating neural networks, factor graphs, and convex optimization. The main contributions are threefold. 
First, Euler angles and coordinate mapping are incorporated into the parameterized channel model through factorization and factor graphs, enabling channel modeling under arbitrary user antenna orientations. Second, neural networks and convex optimization are embedded as factor nodes in the graph, allowing nonlinear function approximation and global optimization. Third, bidirectional message passing between neural network and convex optimization nodes is realized through Taylor expansion, thereby achieving joint decoupling and estimation of channel parameters, coordinates, and angles. Simulation results confirm that the proposed framework achieves higher accuracy, exceeding benchmarks by more than 3 dB, and demonstrates strong robustness across a range of scenarios. Future work will extend the method to multi-user environments, incorporate polarization diversity, and address hardware impairments such as phase noise, with the aim of supporting practical deployment in 6G systems.
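The Euler-angle rotation used to model a randomly oriented user array maps the local antenna coordinates into the global frame. A sketch under the common Z-Y-X (yaw-pitch-roll) convention, with hypothetical angles and element positions; the paper's estimator recovers the rotation via convex optimization rather than assuming it known:

```python
# Sketch of Euler-angle rotation of a user array: local antenna positions
# are mapped to the global frame by R = Rz(yaw) @ Ry(pitch) @ Rx(roll).
# Angles and coordinates are illustrative.
import numpy as np

def euler_to_rot(roll, pitch, yaw):
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Local positions of a small user array (meters, hypothetical).
local = np.array([[0.00, 0.00, 0.0],
                  [0.05, 0.00, 0.0],
                  [0.00, 0.05, 0.0]])
R = euler_to_rot(0.1, -0.2, 0.3)
global_pos = local @ R.T + np.array([2.0, 1.0, 1.5])  # rotate, then translate

# Rotation preserves inter-element distances, only the orientation changes.
print(np.isclose(np.linalg.norm(global_pos[1] - global_pos[0]), 0.05))  # -> True
```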
Multi-Mode Anti-Jamming for UAV Communications: A Cooperative Mode-Based Decision-Making Approach via Two-Dimensional Transfer Reinforcement Learning
WANG Shiyu, WANG Ximing, KE Zhenyi, LIU Dianxiong, LIU Jize, DU Zhiyong
Available online, doi: 10.11999/JEIT250566
Abstract:
  Objective  With the widespread application of Unmanned Aerial Vehicles (UAVs) in military reconnaissance, logistics, and emergency communications, ensuring the security and reliability of UAV communication systems has become a critical challenge. Wireless channels are highly vulnerable to diverse jamming attacks. Traditional anti-jamming techniques, such as Frequency-Hopping Spread Spectrum (FHSS), are limited in dynamic spectrum environments and may be compromised by advanced machine learning algorithms. Furthermore, UAVs operate under strict constraints on onboard computational power and energy, which hinders the real-time use of complex anti-jamming algorithms. To address these challenges, this study proposes a multi-mode anti-jamming framework that integrates Intelligent Frequency Hopping (IFH), Jamming-based Backscatter Communication (JBC), and Energy Harvesting (EH) to strengthen communication resilience in complex electromagnetic environments. A Multi-mode Transfer Deep Q-Learning (MT-DQN) method is further proposed, enabling two-dimensional transfer to improve learning efficiency and adaptability under resource constraints. By leveraging transfer learning, the framework reduces computational load and accelerates decision-making, thereby allowing UAVs to counter jamming threats effectively even with limited resources.  Methods  The proposed framework adopts a multi-mode anti-jamming architecture that integrates IFH, JBC, and EH to establish a comprehensive defense strategy of “avoiding, utilizing, and converting” interference. The system is formulated as a Markov Decision Process (MDP) to dynamically optimize the selection of anti-jamming modes and communication channels. To address the challenges of high-dimensional state-action spaces and restricted onboard computational resources, a two-dimensional transfer reinforcement learning framework is developed. 
This framework comprises a cross-mode strategy-sharing network for extracting common features across different anti-jamming modes (Fig. 3) and a parallel network for cross-task transfer learning to adapt to variable task requirements (Fig. 4). The cross-mode strategy-sharing network accelerates convergence by reusing experiences, whereas the cross-task transfer learning network enables knowledge transfer under different task weightings. The reward function is designed to balance communication throughput and energy consumption. It guides the UAV to select the optimal anti-jamming strategy in real time based on spectrum sensing outcomes and task priorities.  Results and Discussions  The simulation results validate the effectiveness of the proposed MT-DQN. The dynamic weight allocation mechanism exhibits strong cross-task transfer capability (Fig. 6), as weight adjustments enable rapid convergence toward the corresponding optimal reward values. Compared with conventional Deep Reinforcement Learning (DRL) algorithms, the proposed method achieves a 64% faster convergence rate while maintaining the probability of communication interruption below 20% in dynamic jamming environments (Fig. 7). The framework shows robust performance in terms of throughput, convergence rate, and adaptability to variations in jamming patterns. In scenarios with comb-shaped and sweep-frequency jamming, the proposed method yields higher normalized throughput and faster convergence, exceeding baseline DQN and other transfer learning-based approaches. The results also indicate that MT-DQN improves stability and accelerates policy optimization during jamming pattern switching (Fig. 7), highlighting its adaptability to abrupt changes in jamming patterns through transfer learning.  Conclusions  This study proposes a multi-mode anti-jamming framework that integrates IFH, JBC, and EH, thereby enhancing the communication capability of UAVs. 
The proposed solution shifts the paradigm from traditional jamming avoidance toward active jamming exploitation, repurposing jamming signals as covert carriers to overcome the limitations of conventional frequency-hopping systems. Simulation results confirm the advantages of the proposed method in throughput performance, convergence rate, and environmental adaptability, demonstrating stable communication quality even under complex electromagnetic conditions. Although DRL approaches are inherently constrained in handling completely random jamming without intrinsic patterns, this work improves adaptability to dynamic jamming through transfer learning and cross-modal strategy sharing. These findings provide a promising approach for countering complex jamming threats in UAV networks. Future work will focus on validating the proposed algorithm in hardware implementations and enhancing the robustness of DRL methods under highly non-stationary, though not entirely unpredictable, jamming conditions such as pseudo-random or adaptive interference.
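The "avoiding, utilizing, converting" trade-off behind mode selection can be caricatured with a hand-written reward that weighs throughput against energy cost. The scoring rule and every number below are invented for illustration and stand in for the trained MT-DQN policy, which learns this mapping from sensed spectrum state:

```python
# Toy scoring of the three anti-jamming modes given sensed jamming power.
# The reward trades throughput against energy; all values are hypothetical
# stand-ins for the learned MT-DQN policy.
def mode_reward(mode, jam_power, w_tp=1.0, w_en=0.5):
    if mode == "IFH":    # hop away: good when a clean channel exists
        tp, en = (1.0 if jam_power < 0.3 else 0.2), 0.6
    elif mode == "JBC":  # backscatter on the jammer: better when jamming is strong
        tp, en = min(jam_power, 1.0) * 0.8, 0.1
    else:                # "EH": harvest jamming energy, no throughput
        tp, en = 0.0, -0.4 * jam_power  # negative cost = energy gained
    return w_tp * tp - w_en * en

def best_mode(jam_power):
    return max(("IFH", "JBC", "EH"), key=lambda m: mode_reward(m, jam_power))

# Weak jamming favors hopping; strong jamming favors exploiting the jammer.
print(best_mode(0.1), best_mode(0.9))  # -> IFH JBC
```

Adjusting the weights `w_tp` and `w_en` mirrors the task-weighting dimension along which the cross-task transfer network adapts.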
Quasi-Vortex Electromagnetic Wave Radar Forward-Looking Imaging Based on Echo Phase Weighting
SHU Gaofeng, WEI Yixin, LI Ning
Available online, doi: 10.11999/JEIT250542
Abstract:
  Objective  Forward-looking radar imaging plays a critical role in multiple applications. Numerous algorithms have been proposed to enhance azimuth resolution; however, improvement remains difficult due to the limitations imposed by antenna aperture. Existing high-resolution techniques, including synthetic aperture radar and Doppler beam sharpening, rely on Doppler bandwidth and inevitably create blind spots in the forward-looking region. Vortex electromagnetic waves carrying orbital angular momentum offer potential in forward-looking scenarios because of the orthogonality between different orbital angular momentum modes. In conventional vortex electromagnetic wave imaging, a Uniform Circular Array (UCA) is used to generate and transmit multi-mode vortex electromagnetic waves. Yet, the UCA-generated waves suffer from main lobe divergence, which disperses energy and weakens echo signals, while multi-mode transmission increases system complexity. To address these issues, this paper proposes a Quasi-Circular Array (QCA) that reduces system complexity, produces vortex electromagnetic waves with more concentrated main lobes, and preserves phase linearity. In addition, a post-processing method based on echo phase weighting is introduced. By applying phase modulation to the single-mode echo received by each antenna element, a complete equivalent multi-mode echo is synthesized. The proposed method enhances azimuth resolution and exhibits strong anti-noise performance.  Methods  To obtain clear images under low Signal-to-Noise Ratio (SNR) conditions, a phase modulation echo post-processing method combined with a QCA is proposed. The QCA first generates a single-mode vortex electromagnetic wave to illuminate the region of interest. Each element of the array then receives and stores the echo. 
Phase modulation is subsequently applied to the stored echo to generate signals of specific modes, thereby synthesizing an equivalent multi-mode echo with enhanced amplitude that preserves target information. This approach demonstrates strong potential for practical applications in forward-looking radar imaging under low SNR conditions.  Results and Discussions  When noise is added to the echo and imaging is performed (Figure 11), the proposed method achieves superior results under noisy conditions. As noise intensity increases, a clear target can still be reconstructed at an SNR of –10 dB. Even when the SNR is reduced to –15 dB and the target is submerged in noise, the contour features of the reconstructed target remain distinguishable. These results demonstrate that the method has strong anti-noise performance. In addition, when imaging is performed within a smaller mode range, the azimuth resolution achieved by the proposed method improves by an average factor of 2.2 compared with the traditional method (Figure 9). The improvements in resolution and anti-noise performance can be attributed to two factors: (1) The vortex electromagnetic waves generated by the QCA experience reduced destructive interference due to the asymmetric spatial distribution of array elements, producing waves with more concentrated main lobes, lower side lobes, and higher radiation gain. (2) Applying phase modulation in echo processing reduces the pulse repetition frequency of the vortex electromagnetic wave at the transmitting end, thereby lowering system complexity.  Conclusions  This study proposes a method capable of effective imaging under low SNR conditions. The echo expression of the electric field generated by the QCA is derived, and the radiation gain and phase characteristics of the quasi-vortex electromagnetic wave are analyzed. In addition, an echo post-processing method based on phase modulation is introduced.
Simulation results demonstrate that, compared with the traditional UCA method, the proposed approach generates vortex electromagnetic waves with more concentrated main lobes, lower side lobes, and higher gain, while improving azimuth resolution by a factor of 2.2. Even at an SNR of –15 dB, the reconstructed imaging results remain distinguishable.
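The mode-synthesis step described above rests on the orthogonality of orbital-angular-momentum modes across a circular array. The toy sketch below is only an illustration of that principle: the idealized point-target echo and uniformly spaced element angles are assumptions for demonstration, not the paper's QCA signal model.

```python
import cmath
import math

def synthesize_mode(echoes, l):
    """Weight element n's stored single-mode echo by exp(-j*l*phi_n) and
    sum: the echo-phase-weighting step that builds one equivalent
    multi-mode echo from a single transmission."""
    n_elem = len(echoes)
    return sum(
        s * cmath.exp(-1j * l * (2 * math.pi * n / n_elem))
        for n, s in enumerate(echoes)
    )

# Toy target: the echo at element n carries the phase of OAM mode l0 = 2.
n_elem, l0 = 16, 2
echoes = [cmath.exp(1j * 2 * math.pi * l0 * n / n_elem) for n in range(n_elem)]

matched = abs(synthesize_mode(echoes, l0))         # coherent sum, ~n_elem
mismatched = abs(synthesize_mode(echoes, l0 + 1))  # orthogonal mode, ~0
```

Because the phase weights form a discrete Fourier basis over the element angles, a mismatched mode sums to nearly zero; this is what allows a single-mode transmission to be post-processed into an equivalent multi-mode echo.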
A Multi-class Local Distribution-based Weighted Oversampling Algorithm for Multi-class Imbalanced Datasets
TAO Xinmin, XU Annan, SHI Lihang, LI Junxuan, GUO Xinyue, ZHANG Yanping
Available online  , doi: 10.11999/JEIT250381
Abstract:
  Objective  Classification with imbalanced datasets remains one of the most challenging problems in machine learning. In addition to class imbalance, such datasets often contain complex factors including class overlap, small disjuncts, outliers, and low-density regions, all of which can substantially degrade classifier performance, particularly in multi-class settings. To address these challenges simultaneously, this study proposes the Multi-class Local Distribution-based Weighted Oversampling Algorithm (MC-LDWO).  Methods  The MC-LDWO algorithm first constructs hyperspheres centered on dynamically determined minority classes, with radii estimated from the distribution of each class. Within these hyperspheres, minority class samples are selected for oversampling according to their local distribution, and an adaptive weight allocation strategy is designed using local density metrics. This ensures that samples in low-density regions and near class boundaries are assigned higher probabilities of being oversampled. Next, a low-density vector is computed from the local distribution of both majority and minority classes. A random vector is then introduced and integrated with the low-density vector, and a cutoff threshold is applied to determine the generation sites of synthetic samples, thereby reducing class overlap during boundary oversampling. Finally, an improved decomposition strategy tailored for multi-class imbalance is employed to further enhance classification performance in multi-class imbalanced scenarios.  Results and Discussions  The MC-LDWO algorithm dynamically identifies the minority and combined majority class sample sets and constructs hyperspheres centered on each minority class sample, with radii determined by the distribution of the corresponding minority class. These hyperspheres guide the subsequent oversampling process. 
A trade-off parameter ($\beta$) is introduced to balance the influence of local densities between the combined majority and minority classes. Experimental results on KEEL datasets show that this approach effectively prevents class overlap during boundary oversampling while assigning higher oversampling weights to critical minority samples located near boundaries and in low-density regions. This improves boundary distribution and simultaneously addresses within-class imbalance. When the trade-off parameter is set to 0.5, MC-LDWO achieves a balanced consideration of both boundary distribution and the diverse densities present in minority classes due to data difficulty factors, thereby supporting improved performance in downstream classification tasks (Fig. 10).  Conclusions  Comparative results with other state-of-the-art oversampling algorithms demonstrate that: (1) The MC-LDWO algorithm effectively prevents overlap when strengthening decision boundaries by setting the cutoff threshold ($T$) and adaptively assigns oversampling weights according to two local density indicators for the minority and combined majority classes within the hypersphere. This approach addresses within-class imbalance caused by data difficulty factors and enhances boundary distribution. (2) By jointly considering density and boundary distribution, and setting the trade-off parameter to 0.5, the proposed algorithm can simultaneously mitigate within-class imbalance and reinforce the boundary information of minority classes. (3) When applied to highly imbalanced datasets characterized by complex decision boundaries and data difficulty factors such as outliers and small disjuncts, MC-LDWO significantly improves the boundary distribution of each minority class while effectively managing within-class imbalance, thereby enhancing the performance of subsequent classifiers.
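As a rough illustration of the density-weighted selection idea described above, the sketch below implements a simplified surrogate in plain Python: minority seeds are drawn with probability proportional to their local sparsity (so low-density regions are oversampled more), then interpolated toward a neighbour, SMOTE-style. MC-LDWO's hypersphere construction, low-density vector, and cutoff threshold are omitted; all names and constants here are illustrative, not the paper's algorithm.

```python
import math
import random

def mean_knn_dist(x, samples, k=3):
    """Mean distance to the k nearest other minority samples; large values
    flag sparse, hard-to-learn regions."""
    d = sorted(math.dist(x, s) for s in samples if s is not x)
    return sum(d[:k]) / min(k, len(d))

def oversample(minority, n_new, k=3, seed=0):
    """Draw seed samples with probability proportional to their k-NN
    sparsity, then interpolate each seed toward its nearest neighbour."""
    rng = random.Random(seed)
    weights = [mean_knn_dist(x, minority, k) for x in minority]
    out = []
    for _ in range(n_new):
        x = rng.choices(minority, weights=weights)[0]
        nn = min((s for s in minority if s is not x),
                 key=lambda s: math.dist(x, s))
        t = rng.random()
        out.append(tuple(a + t * (b - a) for a, b in zip(x, nn)))
    return out

minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]  # last point isolated
synthetic = oversample(minority, 5)
```

Interpolating only between existing minority samples keeps every synthetic point inside the minority region's bounding box, which is one simple way to limit overlap with the majority class.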
A Ku-band Circularly Polarized Leaky-wave Antenna Loaded with Parasitic Slots
HUANG Zhiyuan, ZHANG Yunhua, ZHAO Xiaowen
Available online  , doi: 10.11999/JEIT250347
Abstract:
This paper proposes a Ku-band circularly polarized Leaky-Wave Antenna (LWA) based on a Substrate Integrated Waveguide (SIW). A parasitic slot, with the same configuration as the main radiation slot but reduced in size, is employed to address the open-stopband problem and enhance impedance matching. The radiation slot excites Circularly Polarized (CP) waves, while the parasitic slot simultaneously broadens the Axial Ratio (AR) bandwidth and suppresses the open-stopband effect. A prototype antenna is designed, fabricated, and measured. The results demonstrate that the antenna achieves a 32% 3-dB AR bandwidth from 12.6 GHz to 17.4 GHz, with CP beam scanning from –49° to +14°. The simulated and measured results are in good agreement. In addition, the realized gain remains stable across the operating band. Compared with existing works, the proposed design achieves the widest scanning range.  Objective  Compared with traditional phased array antennas, frequency-scanning antennas have extensive applications in both military and civilian fields owing to their advantages of low profile, low cost, and lightweight design. CP waves offer superior anti-interference performance compared with linearly polarized waves. As a representative frequency-scanning antenna, the LWA has attracted sustained global research interest. This study focuses on the investigation of a Ku-band Circularly Polarized Leaky-Wave Antenna (CP-LWA), with emphasis on wide-bandwidth and wide-scanning techniques, as well as methods for achieving circular polarization. The aim is to provide potential design concepts for next-generation mobile communication and radar system antennas.  Methods   The fan-shaped slot is modified based on previous work, and an additional size-reduced parasitic slot of the same shape as the main slot is introduced. 
The parasitic slots cancel the reflected waves generated by the main radiating slot, thereby suppressing the Open-Stop-Band (OSB) effect, and they also enlarge the effective radiating aperture, which improves radiation efficiency and impedance matching. By exploiting the metallic boundary of the conductors, the parasitic slots enhance CP performance and broaden the AR bandwidth. To validate the proposed design, an antenna consisting of 12 main slots and 11 parasitic slots is designed, simulated, and measured.  Results and Discussions  A prototype is designed, fabricated, and measured in a microwave anechoic chamber to validate the proposed antenna. Both simulated and measured S11 values remain below –10 dB across the entire Ku-band. The measured S11 is slightly higher in the low-frequency range (12~13 GHz) and slightly lower in the high-frequency range (16~18 GHz), while maintaining an overall consistent trend with the simulations, except for a frequency shift of approximately 0.2 GHz toward lower frequencies. For the AR bandwidth, the simulated and measured 3-dB AR bandwidths are 32.7% (12.8~17.8 GHz) and 32.0% (12.6~17.4 GHz), respectively. The realized gains are on average 0.6 dB lower than the simulated values across the AR bandwidth, likely due to measurement system errors and fabrication tolerances. The simulated and measured peak gains reach 14.26 dB and 13.65 dB, respectively, with maximum gain variations of 2.91 dB and 2.85 dB. The measured AR and gain results therefore show strong agreement with the simulations. The measured sidelobe level increases on average by approximately 0.65 dB. The simulated CP scanning range extends from –47° to +17°, while the measured range narrows slightly to –49° to +14°. 
The frequency shift of the LWA is analyzed, and based on the simulated effect of variations in εr on the scanning patterns, the shift toward lower frequencies is attributed to the actual dielectric constant of the substrate being smaller than the nominal value of 2.2 specified by the manufacturer.  Conclusions  This paper proposes a Ku-band CP-LWA based on an SIW. The antenna employs etched slots consisting of fan-shaped radiation slots and size-reduced parasitic slots. The radiation slots excite circular polarization due to their inherent geometric properties, while the parasitic slots suppress the OSB effect and broaden the AR bandwidth. Measurements confirm that the proposed LWA achieves a wide 3-dB AR bandwidth of 12.6~17.4 GHz (32%) with a CP beam scanning range from –49° to +14°. Meanwhile, the antenna demonstrates stable gain performance across the entire AR bandwidth.
Research on ECG Pathological Signal Classification Empowered by Diffusion Generative Data
GE Beining, CHEN Nuo, JIN Peng, SU Xin, LU Xiaochun
Available online  , doi: 10.11999/JEIT250404
Abstract:
  Objective  ElectroCardioGram (ECG) signals are key indicators of human health. However, their complex composition and diverse features make visual recognition prone to errors. This study proposes a classification algorithm for ECG pathological signals based on data generation. A Diffusion Generative Network (DGN), also known as a diffusion model, progressively adds noise to real ECG signals until they approach a noise distribution, thereby facilitating model processing. To improve generation speed and reduce memory usage, a Knowledge Distillation-Diffusion Generative Network (KD-DGN) is proposed, which demonstrates superior memory efficiency and generation performance compared with the traditional DGN. This work compares the memory usage, generation efficiency, and classification accuracy of DGN and KD-DGN, and analyzes the characteristics of the generated data after lightweight processing. In addition, the classification effects of the original MIT-BIH dataset and an extended dataset (MIT-BIH-PLUS) are evaluated. Experimental results show that convolutional networks extract richer feature information from the extended dataset generated by DGN, leading to improved recognition performance of ECG pathological signals.  Methods  The generative network-based ECG signal generation algorithm is designed to enhance the performance of convolutional networks in ECG signal classification. The process begins with a Gaussian noise-based image perturbation algorithm, which obscures the original ECG data by introducing controlled randomness. This step simulates real-world variability, enabling the model to learn more robust representations. A diffusion generative algorithm is then applied to reconstruct and reproduce the data, generating synthetic ECG signals that preserve the essential characteristics of the original categories despite the added noise. 
This reconstruction ensures that the underlying features of ECG signals are retained, allowing the convolutional network to extract more informative features during classification. To improve efficiency, the approach incorporates knowledge distillation. A teacher-student framework is adopted in which a lightweight student model is trained from the original, more complex teacher ECG data generation model. This strategy reduces computational requirements and accelerates the data generation process, improving suitability for practical applications. Finally, two comparative experiments are designed to validate the effectiveness and accuracy of the proposed method. These experiments evaluate classification performance against existing approaches and provide quantitative evidence of its advantages in ECG signal processing.  Results and Discussions  The data generation algorithm yields ECG signals with a Signal-to-Noise Ratio (SNR) comparable to that of the original data, while presenting more discernible signal features. The student model constructed through knowledge distillation produces ECG samples with the same SNR as those generated by the teacher model, but with substantially reduced complexity. Specifically, the student model achieves a 50% reduction in size, 37.5% lower memory usage, and a 57% shorter runtime compared with the teacher model (Fig. 6). When the convolutional network is trained with data generated by the KD-DGN, its classification performance improves across all metrics compared with a convolutional network trained without KD-DGN. Precision reaches 95.7%, and the misidentification rate is reduced to approximately 3% (Fig. 9).  Conclusions  The DGN provides an effective data generation strategy for addressing the scarcity of ECG datasets. By supplying additional synthetic data, it enables convolutional networks to extract more diverse class-specific features, thereby improving recognition performance and reducing misidentification rates. 
Optimizing DGN with knowledge distillation further enhances efficiency, while maintaining SNR equivalence with the original DGN. This optimization reduces computational cost, conserves machine resources, and supports simultaneous task execution. Moreover, it enables the generation of new data without loss, allowing convolutional networks to learn from larger datasets at lower cost. Overall, the proposed approach markedly improves the classification performance of convolutional networks on ECG signals. Future work will focus on further algorithmic optimization for real-world applications.
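The noising stage that the diffusion network relies on can be sketched compactly. The snippet below implements the standard forward-diffusion corruption q(x_t | x_0) with a linear beta schedule in plain Python; the schedule constants and the toy sine "ECG" are assumptions for illustration, not the paper's configuration.

```python
import math
import random

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear noise schedule."""
    alpha_bar, prod = [], 1.0
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def add_noise(x0, t, alpha_bar, rng):
    """Forward diffusion q(x_t | x_0): scale the clean signal by
    sqrt(alpha_bar_t) and add sqrt(1 - alpha_bar_t) Gaussian noise."""
    a = alpha_bar[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in x0]

rng = random.Random(0)
alpha_bar = make_alpha_bar()
ecg = [math.sin(2 * math.pi * i / 50) for i in range(200)]  # toy waveform

x_early = add_noise(ecg, 10, alpha_bar, rng)   # still close to the signal
x_late = add_noise(ecg, 999, alpha_bar, rng)   # essentially pure noise
```

By t = T the cumulative coefficient is nearly zero, so the corrupted sample approaches an isotropic Gaussian; the generative (reverse) network, and in the distilled setting the student model, learns to undo this corruption step by step.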
Multi-target Behavior and Intent Prediction on the Ground Under Incomplete Perception Conditions
ZHU Xinyi, PING Peng, HOU Wanying, SHI Quan, WU Qi
Available online  , doi: 10.11999/JEIT250322
Abstract:
  Objective  Modern battlefield environments, characterized by complex and dynamically uncertain target behaviors combined with information asymmetry, present significant challenges for intent prediction. Conventional methods lack robustness in processing incomplete data, rely on oversimplified behavioral models, and fail to capture tactical intent semantics or adapt to rapidly evolving multi-target coordinated scenarios. These limitations restrict their ability to meet the demands of real-time recognition of high-value target intent and comprehensive ground target situational awareness. To address these challenges, this study proposes a Threat Field-integrated Gated Recurrent Unit model (TF-GRU), which improves prediction accuracy and robustness through threat field modeling, dynamic data repair, and multi-target collaboration, thereby providing reliable support for battlefield decision-making.  Methods  The TF-GRU framework integrates static and dynamic threat field modeling with a hybrid Particle Filtering (PF) and Dynamic Time Warping (DTW) strategy. Static threat fields quantify target-specific threats (e.g., tanks, armored vehicles, artillery) using five factors: enemy-friend distance, range, firepower, defense, and mobility. Gaussian and exponential decay models are employed to describe spatial threat diffusion across different target categories. Dynamic threat fields incorporate real-time kinematic variables (velocity, acceleration, orientation) and temporal decay, allowing adaptive updates of threat intensity. To address incomplete sensor data, a PF-DTW switching mechanism dynamically alternates between short-term PF (N = 1 000 particles) and long-term historical trajectory matching (DTW with β = 50). Collaborative PF introduces neighborhood angular constraints to refine multi-target state estimation.
The GRU architecture is further enhanced with Mish activation, adaptive Xavier initialization, and threat-adaptive gating, ensuring effective fusion of trajectory and threat features.  Results and Discussions  Experiments were conducted on a simulated dataset comprising 40 trajectories and 144,000 timesteps. Under complete data conditions, the TF-GRU model achieved the highest accuracy on both the training and test sets, reaching 94.7% and 92.9%, respectively, indicating strong fitting capability and generalization performance (Fig. 10). After integrating static and dynamic threat fields, model accuracy increased from 72% (trajectory-only input) to 83%, accompanied by substantial improvements in F1 scores and reductions in predictive uncertainty (Fig. 6). In scenarios with 30% missing data, TF-GRU maintained an accuracy of 86.2%, outperforming comparative models and demonstrating superior robustness (Fig. 10). These results confirm that the PF-DTW mechanism effectively reduces the adverse effects of both short-term and long-term data loss, while the collaborative PF strategy strengthens multi-target prediction through neighborhood synergy (η = 0.6). This combination enables robust threat field reconstruction and reliable intent inference (Figs. 8 and 9).  Conclusions  The TF-GRU model effectively addresses the challenges of intent prediction in complex battlefield environments with incomplete data through threat field modeling, the PF-DTW dynamic repair mechanism, and multi-target collaboration. It achieves high accuracy and robustness, providing reliable support for situational awareness and command decision-making. Future work will focus on applying the model to real-world datasets and enhancing computational efficiency to facilitate practical deployment.
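As a minimal sketch of the threat-field idea above, the snippet below models a single static factor with Gaussian spatial decay and modulates it with kinematics and temporal decay. The constants and the single-factor form are illustrative assumptions; the paper combines five static factors with per-class Gaussian and exponential models.

```python
import math

def static_threat(dist, firepower=1.0, sigma=2.0):
    """Gaussian spatial diffusion: a target's threat is strongest at its
    position and decays with the enemy-friend distance."""
    return firepower * math.exp(-dist ** 2 / (2 * sigma ** 2))

def dynamic_threat(static, speed, approaching, elapsed, tau=10.0, v_max=20.0):
    """Modulate the static field with kinematics and temporal decay:
    fast, approaching targets raise the threat; stale observations decay."""
    kinematic = 1.0 + (speed / v_max) * (1.0 if approaching else 0.25)
    return static * kinematic * math.exp(-elapsed / tau)

near = static_threat(1.0)   # close target: high threat
far = static_threat(8.0)    # distant target: near zero
fresh = dynamic_threat(near, speed=15.0, approaching=True, elapsed=0.0)
stale = dynamic_threat(near, speed=15.0, approaching=True, elapsed=30.0)
```

In a TF-GRU-style model, such field values would be concatenated with the trajectory features at each timestep so the gating can weigh threat context against raw kinematics.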
Low-complexity Ordered Statistic Decoding Algorithm Based on Skipping Mechanisms
WANG Qianfan, GUO Yangeng, SONG Linqi, MA Xiao
Available online  , doi: 10.11999/JEIT250447
Abstract:
  Objective  Ultra-Reliable Low-Latency Communication (URLLC) in 5G and the emerging Hyper-Reliable Low-Latency Communication (HRLLC) in 6G impose exceptionally stringent requirements on both reliability and end-to-end delay. These requirements create opportunities and challenges for short-length channel codes, particularly in scenarios where Maximum-Likelihood (ML) or near-ML decoding is desirable but computational complexity and latency are prohibitive. Ordered Statistic Decoding (OSD) is a universal near-ML decoding technique that can closely approach finite-length performance bounds. However, its re-encoding step suffers from combinatorial explosion, resulting in impractical complexity in high-throughput and low-latency systems. The excessive number of Test-Error-Pattern (TEP) re-encodings fundamentally restricts the deployment of OSD in URLLC and HRLLC contexts. To address this bottleneck, we design multiple efficient skip mechanisms that substantially reduce re-encoding operations while maintaining negligible performance degradation.  Methods  Three complementary skipping mechanisms are developed to prune the OSD re-encoding search: (1) Soft-information based skipping. Two criteria, Trivial and Dynamic Approximate Ideal (DAI), are introduced to compare the soft metric of each TEP against the minimum soft weight in the current list. Candidates with excessively large soft weights, which are unlikely to be correct, are skipped. Unlike prior work that evaluates only the first TEP at each Hamming weight increment, both criteria are applied to every candidate. The Trivial criterion ensures no performance loss by skipping only when a TEP’s soft metric exceeds the best-so-far. The DAI criterion incorporates an expected residual soft-weight compensation term over non-basis bits, enabling more aggressive skipping with minimal performance degradation. (2) Extra-parity skipping.
The search dimension is expanded from $k$ to $k+\delta$ by appending the $\delta$ most reliable non-basis bit positions to the test vector. Additional parity checks arising from the extended generator matrix eliminate invalid TEPs. Any candidate failing these extra parity constraints is bypassed. (3) Joint skipping. This approach integrates the two preceding mechanisms. Each partial TEP $(\boldsymbol{e}_{\mathrm{L}}, \boldsymbol{e}_{\delta}) \in \mathbb{F}_{2}^{k+\delta}$ is first tested using the DAI rule and then subjected to the extra-parity check. Only candidates passing both criteria are re-encoded.  Results and Discussions  Extensive simulations on extended BCH [128, 64] and BCH [127, 64] codes over the BPSK-AWGN channel demonstrate the efficacy of the proposed skipping mechanisms. Soft-information skipping: When compared with conventional OSD using maximum flipping order $t=4$, the Trivial rule is found to reduce average re-encodings by 50%~90% across the SNR range. The DAI rule achieves an additional 60%~99% reduction beyond the Trivial rule. At SNR = 3 dB, the average number of re-encodings decreases from approximately $6.7\times 10^{5}$ to $1.2\times 10^{3}$, with negligible degradation in Frame-Error Rate (FER) (Fig. 1). Extra-parity skipping: For $\delta=4$, over 90% of re-encodings are eliminated uniformly across SNR values, thereby reducing dependence on channel conditions. This reduction is achieved without significant FER loss (Fig. 2). Joint skipping: The combined mechanism demonstrates superior performance over individual schemes.
It reduces average re-encodings by a further approximately 40% compared with the DAI rule alone, and by more than 99.9% compared with extra-parity alone in high-SNR regimes. In this region, re-encodings decrease from approximately $6.7\times 10^{5}$ to fewer than 100, while FER remains nearly identical to that of baseline OSD (Fig. 3). The joint skipping mechanism is further evaluated on BCH codes with different rates, including [127, 36], [127, 64], and [127, 92]. In all cases, substantial reductions in re-encodings are consistently achieved with negligible performance degradation (Fig. 4). A comparative analysis with state-of-the-art schemes, including Probabilistic Sufficient/Necessary Conditions (PSC/PNC), Fast OSD (FOSD), and Order-Skipping OSD (OS-OSD), shows that the proposed joint skipping OSD with $\delta=4$ achieves the lowest re-encoding count. Up to two orders of magnitude fewer re-encodings are observed relative to OS-OSD at low SNR, and superiority over FOSD is maintained at moderate SNR, while error-correction performance is preserved across all tested SNRs (Fig. 5).  Conclusions  To address the stringent reliability and latency requirements of 5G URLLC and future 6G HRLLC, this work presents novel skipping mechanisms for OSD that substantially reduce re-encoding complexity. For offline pre-computed TEPs, the soft-information, extra-parity, and joint skipping rules eliminate more than 99% of redundant re-encodings in typical operating regimes with negligible degradation in Frame-Error Rate (FER).
In particular, the proposed joint skipping mechanism lowers the average re-encoding count from approximately $6.7\times 10^{5}$ to only tens in the high-SNR region, thereby meeting practical latency constraints while preserving near-ML performance. These findings demonstrate the potential of the proposed skipping framework to enable high-performance short-block decoding in next-generation HRLLC.
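The Trivial skipping rule can be sketched in a few lines: enumerate TEPs by flipping order and skip, without re-encoding, any pattern whose soft weight already exceeds the best full-codeword metric found so far, since re-encoding it cannot yield a better candidate. In the sketch below, `reencode_and_score` is a hypothetical stand-in for the actual re-encoding step, and the reliabilities are toy values rather than channel outputs.

```python
from itertools import combinations

def soft_weight(tep, reliab):
    """Soft metric of a test error pattern: sum of the reliabilities of
    the flipped most-reliable-basis positions."""
    return sum(reliab[i] for i in tep)

def osd_trivial_skip(reliab, k, t, reencode_and_score):
    """Enumerate TEPs up to flipping order t, skipping (without
    re-encoding) any TEP whose soft weight already exceeds the best
    full-codeword metric found so far."""
    best, best_tep, reencodings = float("inf"), None, 0
    for w in range(t + 1):
        for tep in combinations(range(k), w):
            if soft_weight(tep, reliab) > best:
                continue                       # Trivial rule: skip
            reencodings += 1
            metric = reencode_and_score(tep)   # soft weight of codeword
            if metric < best:
                best, best_tep = metric, tep
    return best_tep, reencodings

reliab = [0.1, 0.2, 0.3, 0.4]   # toy reliabilities of k = 4 basis bits
true_tep = (1,)

def reencode_and_score(tep):
    # Hypothetical stand-in: the true pattern yields a clean codeword,
    # every other TEP picks up extra soft weight on the parity bits.
    return soft_weight(tep, reliab) + (0.0 if tep == true_tep else 1.0)

best_tep, reencodings = osd_trivial_skip(reliab, 4, 2, reencode_and_score)
```

Even on this toy instance only 3 of the 11 TEPs up to order 2 are re-encoded; the DAI rule of the paper skips more aggressively by adding an expected residual soft-weight term before the comparison.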
Combine the Pre-trained Model with Bidirectional Gated Recurrent Units and Graph Convolutional Network for Adversarial Word Sense Disambiguation
ZHANG Chunxiang, SUN Ying, GAO Kexin, GAO Xueyao
Available online  , doi: 10.11999/JEIT250386
Abstract:
  Objective  In Word Sense Disambiguation (WSD), the Linguistically-motivated bidirectional Encoder Representation from Transformer (LERT) is employed to capture rich semantic representations from large-scale corpora, enabling improved contextual understanding of word meanings. However, several challenges remain. Current WSD models are not sufficiently sensitive to temporal and spatial dependencies within sequences, and single-dimensional features are inadequate for representing the diversity of linguistic expressions. To address these limitations, a hybrid network is constructed by integrating LERT, Bidirectional Gated Recurrent Units (Bi-GRU), and Graph Convolutional Network (GCN). This network enhances the modeling of structured text and contextual semantics. Nevertheless, generalization and robustness remain problematic. Therefore, an adversarial training algorithm is applied to improve the overall performance and resilience of the WSD model.  Methods  An adversarial WSD method is proposed based on a pre-trained model, combining Bi-GRU and GCN. First, word forms, parts of speech, and semantic categories of the neighboring words of an ambiguous term are input into the LERT model to obtain the CLS sequence and token sequence. Second, cross-attention is applied to fuse the global semantic information extracted by Bi-GRU from the token sequence with the local semantic information derived from the CLS sequence. Sentences, word forms, parts of speech, and semantic categories are then used as nodes to construct a disambiguation feature graph, which is subsequently input into GCN to update the feature information of the nodes. Third, the semantic category of the ambiguous word is determined through the interpolated prediction layer and semantic classification layer. Fourth, subtle continuous perturbations are generated by computing the gradient of the dynamic word vectors in the input. 
These perturbations are added to the original word vector matrix to create adversarial samples, which are used to optimize the LERT+Bi-GRU+CA+GCN (LBGCA-GCN) model. A cross-entropy loss function is applied to measure the performance of the LBGCA-GCN model on adversarial samples. Finally, the loss from the network is combined with the loss from Adversarial Training (AT) to optimize the LBGCA-GCN model.  Results and Discussions  When the Free Large-Batch (FreeLB) algorithm is applied, stronger adversarial perturbations are generated, and the LBGCA-GCN+AT (LBGCA-GCN-AT) model achieves the best performance (Table 2). As the number of perturbation steps increases, the strength of AT improves. However, when the number of steps exceeds a certain threshold, the LBGCA-GCN-AT model begins to overfit. The FreeLB algorithm demonstrates strong robustness with three perturbation steps (Table 3). The cross-attention mechanism, which fuses the token sequence with the CLS sequence, yields significant performance gains in complex semantic scenarios (Fig. 3). By incorporating AT, the LBGCA-GCN-AT model achieves notable improvements across multiple evaluation metrics (Table 4).  Conclusions  This study presents an adversarial WSD method based on a pre-trained model, integrating Bi-GRU and GCN to address the weak generalization ability and robustness of conventional WSD models. LERT is used to transform discriminative features into dynamic word vectors, while cross-attention fuses the global semantic information extracted by Bi-GRU from the token sequence with the local semantic information derived from the CLS sequence. This fusion generates more complete node representations for the disambiguation feature graph. A GCN is then applied to update the relationships among nodes within the feature graph. The interpolated prediction layer and semantic classification layer are used to determine the semantic category of ambiguous words.
To further improve robustness, the gradient of the dynamic word vector is computed and perturbed to generate adversarial samples, which are used to optimize the LBGCA-GCN model. The network loss is combined with the AT loss to refine the model. Experiments conducted on the SemEval-2007 Task #05 and HealthWSD datasets examine multiple factors affecting model performance, including adversarial algorithms, perturbation steps, and sequence fusion methods. Results demonstrate that introducing AT improves the model’s ability to handle real-world noise and perturbations. The proposed method not only enhances robustness and generalization but also strengthens the capacity of WSD models to capture subtle semantic distinctions.
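The perturbation step described above, computing the gradient of the dynamic word vectors and adding a small continuous perturbation along it, can be sketched in the FGM style. The quadratic toy loss below stands in for the model's cross-entropy, and epsilon and the embedding values are illustrative assumptions.

```python
import math

def fgm_perturb(embedding, grad, eps=0.5):
    """FGM-style adversarial step: move the embedding a distance eps
    along the normalized gradient of the loss."""
    norm = math.sqrt(sum(g * g for g in grad)) + 1e-12
    return [e + eps * g / norm for e, g in zip(embedding, grad)]

# Toy loss L(e) = ||e - target||^2 with analytic gradient 2*(e - target),
# standing in for the model's cross-entropy over dynamic word vectors.
target = [0.0, 0.0]
embedding = [1.0, 2.0]
grad = [2.0 * (e - t) for e, t in zip(embedding, target)]

def loss(e):
    return sum((a - b) ** 2 for a, b in zip(e, target))

adv = fgm_perturb(embedding, grad)   # adversarial sample
```

Training then minimizes the loss on both the clean and the perturbed embeddings, which is the clean-loss-plus-adversarial-loss combination the method uses; multi-step variants such as FreeLB repeat the perturbation several times per batch.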
Global–local Co-embedding and Semantic Mask-driven Aging Approach
LIU Yaohui, LIU Jiaxin, SUN Peng, SHEN Zhe, LANG Yubo
Available online  , doi: 10.11999/JEIT250430
Abstract:
  Objective  Facial age progression has become increasingly important in applications such as criminal investigation and digital identity authentication, making it a key research area in computer vision. However, existing mainstream facial age progression networks face two primary limitations. First, they tend to overemphasize the embedding of age-related features, often at the expense of preserving identity-consistent multi-scale attributes. Second, they fail to effectively eliminate interference from non-age-related elements such as hair and glasses, leading to suboptimal performance in complex scenarios. To address these challenges, this study proposes a global–local co-embedding and semantic mask-driven aging method. The global–local co-embedding strategy improves the accuracy of input portrait reconstruction while reducing computational cost during the embedding phase. In parallel, a semantic mask editing mechanism is introduced to remove non-age-related features—such as hair and eyewear—thereby enabling more accurate embedding of age-related characteristics. This dual strategy markedly enhances the model’s capacity to learn and represent age-specific attributes in facial imagery.  Methods  A Global–Local Collaborative Embedding (GLCE) strategy is proposed to achieve high-quality latent space mapping of facial images. Distinct learning objectives are assigned to separate latent subspaces, which enhances the representation of fine-grained facial features while preserving identity-specific information. Therefore, identity consistency is improved, and both training time and computational cost are reduced, increasing the efficiency of feature extraction. To address interference from non-age-related elements, a semantic mask-driven editing mechanism is employed. Semantic segmentation and image inpainting techniques are integrated to accurately remove regions such as hair and glasses that hinder precise age modeling. 
A differentiable generator, DsGAN, is introduced to align the transferred latent codes with the embedded identity-preserving codes. Through this alignment, the expression of age-related features is enhanced, and identity information is better retained during the age progression process.  Results and Discussions  Experimental results on benchmark datasets, including CCAD and CelebA, demonstrate that GLS-Age outperforms existing methods such as IPCGAN, CUSP, SAM, and LATS in identity confidence assessment. The age distributions of the generated portraits are also more closely aligned with those of the target age groups. Qualitative analysis further shows that, in cases with hair occlusion, GLS-Age produces more realistic wrinkle textures and enables more accurate embedding of age-related features compared with other methods. Simultaneously, it significantly improves the identity consistency of the synthesized facial images.  Conclusions  This study addresses core challenges in facial age progression, including identity preservation, inadequate detail modeling, and interference from non-age-related factors. A novel Global–Local collaborative embedding and Semantic mask-driven Aging method (GLS-Age) is proposed to resolve these limitations. By employing a differentiated latent space learning strategy, the model achieves hierarchical decoupling of structural and textural features. When integrated with semantic-guided portrait editing and a differentiable generator for latent space alignment, GLS-Age markedly enhances both the fidelity of age feature expression and the consistency of identity retention. The method demonstrates superior generalization and synthesis quality across multiple benchmark datasets, effectively reproducing natural wrinkle patterns and age-related facial changes. These results confirm the feasibility and advancement of GLS-Age in facial age synthesis tasks. 
Furthermore, this study establishes a compact, high-quality dataset focused on Asian facial portraits, supporting further research in image editing and face generation within this demographic. The proposed method not only contributes technical support to practical applications such as cold case resolution and missing person identification in public security but also offers a robust data and modeling framework for advancing human age-based simulation technologies. Future work will focus on enhancing controllable editing within latent spaces, improving anatomical plausibility in skull structure transformations, and strengthening model performance across extreme age groups, including infants and the elderly. These efforts aim to expand the application of facial age progression in areas such as forensic analysis, humanitarian family search, and social security systems.
Cross Modal Hashing of Medical Image Semantic Mining for Large Language Model
LIU Qinghai, WU Qianlin, LUO Jia, TANG Lun, XU Liming
Available online, doi: 10.11999/JEIT250529
Abstract:
  Objective  A novel cross-modal hashing framework driven by Large Language Models (LLMs) is proposed to address the semantic misalignment between medical images and their corresponding textual reports. The objective is to enhance cross-modal semantic representation and improve retrieval accuracy by effectively mining and matching semantic associations between modalities.  Methods  The generative capacity of LLMs is first leveraged to produce high-quality textual descriptions of medical images. These descriptions are integrated with diagnostic reports and structured clinical data using a dual-stream semantic enhancement module, designed to reinforce inter-modality alignment and improve semantic comprehension. A structural similarity-guided hashing scheme is then developed to encode both visual and textual features into a unified Hamming space, ensuring semantic consistency and enabling efficient retrieval. To further enhance semantic alignment, a prompt-driven attention template is introduced to fuse image and text features through fine-tuned LLMs. Finally, a contrastive loss function with hard negative mining is employed to improve representation discrimination and retrieval accuracy.  Results and Discussions  Experiments are conducted on a multimodal medical dataset to compare the proposed method with existing cross-modal hashing baselines. The results indicate that the proposed method significantly outperforms baseline models in terms of precision and Mean Average Precision (MAP) (Table 3; Table 4). On average, a 7.21% improvement in retrieval accuracy and a 7.72% increase in MAP are achieved across multiple data scales, confirming the effectiveness of the LLM-driven semantic mining and hashing approach.  Conclusions  This study proposes an LLM-driven cross-modal hashing framework that mines and matches semantic associations between medical images and their textual reports. The dual-stream semantic enhancement module, structural similarity-guided hashing scheme, and prompt-driven attention template jointly strengthen inter-modality alignment, while the contrastive loss with hard negative mining improves representation discrimination. Across multiple data scales, the method achieves average gains of 7.21% in retrieval accuracy and 7.72% in MAP over existing cross-modal hashing baselines, confirming its effectiveness for cross-modal medical image retrieval.
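The contrastive objective named in the Methods can be sketched as follows. This is a minimal numpy illustration of hard negative mining over matched image-text embeddings, not the paper's actual loss implementation; the triplet form and the margin value are assumptions.

```python
import numpy as np

def contrastive_loss_hard_neg(img, txt, margin=0.2):
    """Triplet-style contrastive loss with hard negative mining.

    img, txt: (n, d) embeddings; row i of each is a matched image-text pair.
    For every anchor, the most similar NON-matching sample is taken as the
    hard negative (the margin value is an assumption).
    """
    sim = img @ txt.T                          # (n, n) pairwise similarities
    pos = np.diag(sim)                         # matched-pair similarities
    masked = sim - np.eye(len(sim)) * 1e9      # exclude the positives
    hard_i2t = masked.max(axis=1)              # hardest text per image
    hard_t2i = masked.max(axis=0)              # hardest image per text
    return (np.maximum(0.0, margin + hard_i2t - pos).mean()
            + np.maximum(0.0, margin + hard_t2i - pos).mean())
```

With perfectly aligned pairs the loss vanishes; mismatched pairs are penalized by the margin plus the hard-negative similarity.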
Depression Screening Method Driven by Global-Local Feature Fusion
ZHANG Siyong, QIU Jiefan, ZHAO Xiangyun, XIAO Kejiang, CHEN Xiaofu, MAO Keji
Available online, doi: 10.11999/JEIT250035
Abstract:
  Objective  Depression is a globally prevalent mental disorder that poses a serious threat to the physical and mental health of millions of individuals. Early screening and diagnosis are essential to reducing severe consequences such as self-harm and suicide. However, conventional questionnaire-based screening methods are limited by their dependence on the reliability of respondents’ answers, their difficulty in balancing efficiency with accuracy, and the uneven distribution of medical resources. New auxiliary screening approaches are therefore needed. Existing Artificial Intelligence (AI) methods for depression detection based on facial features primarily emphasize global expressions and often overlook subtle local cues such as eye features. Their performance also declines in scenarios where partial facial information is obscured, for instance by masks, and they raise privacy concerns. This study proposes a Global-Local Fusion Axial Network (GLFAN) for depression screening. By jointly extracting global facial and local eye features, this approach enhances screening accuracy and robustness under complex conditions. A corresponding dataset is constructed, and experimental evaluations are conducted to validate the method’s effectiveness. The model is deployed on edge devices to improve privacy protection while maintaining screening efficiency, offering a more objective, accurate, efficient, and secure depression screening solution that contributes to mitigating global mental health challenges.  Methods  To address the challenges of accuracy and efficiency in depression screening, this study proposes GLFAN. For long-duration consultation videos with partial occlusions such as masks, data preprocessing is performed using OpenFace 2.0 and facial keypoint algorithms, combined with peak detection, clustering, and centroid search strategies to segment the videos into short sequences capturing dynamic facial changes, thereby enhancing data validity. 
At the model level, GLFAN adopts a dual-branch parallel architecture to extract global facial and local eye features simultaneously. The global branch uses MTCNN for facial keypoint detection and enhances feature extraction under occlusion using an inverted bottleneck structure. The local branch detects eye regions via YOLO v7 and extracts eye movement features using a ResNet-18 network integrated with a convolutional attention module. Following dual-branch feature fusion, an integrated convolutional module optimizes the representation, and classification is performed using an axial attention network.  Results and Discussions  The performance of GLFAN is evaluated through comprehensive, multi-dimensional experiments. On the self-constructed depression dataset, high accuracy is achieved in binary classification tasks, and non-depression and severe depression categories are accurately distinguished in four-class classification. Under mask-occluded conditions, a precision of 0.72 and a recall of 0.690 are obtained for depression detection. Although these values are lower than the precision of 0.87 and recall of 0.840 observed under non-occluded conditions, reliable screening performance is maintained. Compared with other advanced methods, GLFAN achieves higher recall and F1 scores. On the public AVEC2013 and AVEC2014 datasets, the model achieves lower Mean Absolute Error (MAE) values and shows advantages in both short- and long-sequence video processing. Heatmap visualizations indicate that GLFAN dynamically adjusts its attention according to the degree of facial occlusion, demonstrating stronger adaptability than ResNet-50. Edge device tests further confirm that the average processing delay remains below 17.56 ms per frame, and stable performance is maintained under low-bandwidth conditions.  Conclusions  This study proposes a depression screening approach based on edge vision technology. A lightweight, end-to-end GLFAN is developed to address the limitations of existing screening methods. The model integrates global facial features extracted via MTCNN with local eye-region features captured by YOLO v7, followed by effective feature fusion and classification using an Axial Transformer module. By emphasizing local eye-region information, GLFAN enhances performance in occluded scenarios such as mask-wearing. Experimental validation using both self-constructed and public datasets demonstrates that GLFAN reduces missed detections and improves adaptability to short-duration video inputs compared with existing models.
Grad-CAM visualizations further reveal that GLFAN prioritizes eye-region features under occluded conditions and shifts focus to global facial features when full facial information is available, confirming its context-specific adaptability. The model has been successfully deployed on edge devices, offering a lightweight, efficient, and privacy-conscious solution for real-time depression screening.
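The dual-branch design described above can be illustrated with a toy forward pass. The linear projections below are hypothetical stand-ins for the MTCNN-based global branch, the ResNet-18 eye branch, and the fusion/classification head; this is not the trained GLFAN network.

```python
import numpy as np

def branch(x, w):
    """Toy stand-in for a CNN feature branch: linear projection + ReLU."""
    return np.maximum(0.0, x @ w)

def glfan_forward(face, eyes, w_g, w_l, w_fuse, w_cls):
    """Minimal dual-branch forward pass: global facial and local eye
    features are extracted in parallel, concatenated, fused, and
    classified with a softmax head."""
    g = branch(face, w_g)                       # global facial features
    e = branch(eyes, w_l)                       # local eye-region features
    fused = np.maximum(0.0, np.concatenate([g, e], axis=-1) @ w_fuse)
    logits = fused @ w_cls
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)    # class probabilities
```

The key design point is that the two branches are computed independently and only merged after feature extraction, so either branch can dominate when the other is occluded.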
Edge Network Data Scheduling Optimization Method Integrating Improved Jaya and Cluster Center Selection Algorithm
YANG Wensheng, PAN Chengsheng
Available online, doi: 10.11999/JEIT250317
Abstract:
  Objective  The rapid advancement of technologies such as artificial intelligence and the Internet of Things has placed increasing strain on traditional centralized cloud computing architectures, which struggle to meet the communication and computational demands of large-scale data processing. Due to the physical separation between cloud servers and end-users, data transmission typically incurs considerable latency and energy consumption. Edge computing, which deploys computing and storage resources closer to users, has therefore emerged as a viable paradigm for supporting data-intensive and latency-sensitive applications. However, effectively addressing the challenges of data-intensive services in edge computing environments, such as efficient edge node clustering and resource scheduling, remains a key issue. This study proposes a data scheduling optimization method for edge networks that integrates an improved Jaya algorithm with a cluster center selection strategy. Specifically, for data-intensive services, the method partitions edge nodes into clusters and identifies optimal cluster centers. Data are first aggregated at these centers before being transmitted to the cloud. By leveraging cluster-based aggregation, the method facilitates more efficient data scheduling and improved resource management in edge environments.  Methods  The proposed edge network data scheduling optimization method comprises two core components: a shortest-path selection algorithm based on an improved Jaya algorithm and an optimal cluster center selection algorithm. The scheduling framework accounts for both the shortest communication paths among edge nodes and the availability of network resources. The improved Jaya algorithm incorporates a cosine-based nonlinear decay function and a multi-stage search strategy to dynamically optimize inter-node paths.
The nonlinear decay function modulates the variation of random factors across iterations, allowing adaptive adjustment of the algorithm’s exploration capacity. This mechanism helps prevent premature convergence and reduces the likelihood of becoming trapped in local optima during the later optimization stages. To further enhance performance, a multi-stage search strategy divides the optimization process into two phases: an exploration phase during early iterations, which prioritizes global search across the solution space, and an exploitation phase during later iterations, which refines solutions locally. This staged approach improves the trade-off between convergence speed and solution accuracy, increasing the algorithm’s robustness in complex edge network environments. Based on the optimized paths and available bandwidth, a criterion is established for selecting the initial cluster center. Subsequently, a selection scheme for additional cluster centers is formulated by evaluating inter-cluster center distances. Finally, a partitioning method assigns edge nodes to their respective clusters based on the optimized topology.  Results and Discussions  The simulation experiments comprise two parts: performance evaluation of the improved Jaya algorithm (Jaya*) and analysis of the cluster partitioning scheme. To assess convergence speed and optimization accuracy, three benchmark test functions are used to compare Jaya* with four existing algorithms: Simulated Annealing (SA), Genetic Algorithm (GA), Ant Colony Optimization (ACO), and the standard Jaya algorithm. Building on these results, two additional experiments—cluster center selection and cluster partitioning—are conducted to evaluate the feasibility and effectiveness of the proposed optimal cluster center selection algorithm for resource scheduling. 
A parameter sensitivity analysis using the multi-modal Rastrigin function is performed to investigate the effects of different population sizes and maximum iteration counts on optimization accuracy and stability (Table 2 and Table 3). The optimal configuration is determined to be pop_size = 50 and t_max = 500, which achieves a favorable balance between accuracy and computational efficiency. Subsequently, a multi-algorithm comparison experiment is carried out under consistent conditions. The improved Jaya algorithm outperforms the four alternatives in convergence speed and optimization accuracy across three standard functions: Sphere (Fig. 4), Rastrigin (Fig. 5), and Griewank (Fig. 6). The algorithm also demonstrates superior stability. Its convergence trajectory is characterized by a rapid initial decline followed by gradual stabilization in later stages. Based on these findings, the cluster center selection algorithm is applied to tactical edge networks of varying scales (25, 38, and 50 nodes) (Fig. 7). The parameter m_i is calculated (Fig. 8), and various numbers of cluster centers are set to complete center selection and cluster member assignment (Table 5). Evaluation using the Average Sum of Squared Errors (AvgSSE) under different cluster center counts reveals that the minimum AvgSSE for all three network sizes occurs when the number of cluster centers is 4 (Table 6), indicating that this configuration yields the optimal clustering outcome. Therefore, the proposed method effectively selects cluster centers and derives the optimal clustering configuration (Fig. 9), while maintaining low clustering error and enhancing the efficiency and accuracy of resource scheduling. Finally, in a 38-node edge network scenario with four cluster centers, a multi-algorithm cluster partitioning comparison is conducted (Table 7).
The improved Jaya algorithm achieves the best AvgSSE result of 16.22, significantly outperforming the four baseline algorithms. These results demonstrate its superiority in convergence precision and global search capability.  Conclusions  To address data resource scheduling challenges in edge computing environments, this study proposes an edge network data scheduling optimization method that integrates an improved Jaya algorithm with a cluster center selection strategy. The combined approach achieves high clustering accuracy, robustness, and generalization performance. It effectively enhances path planning precision and central node selection, leading to improved data transmission performance and resource utilization in edge networks.
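The mechanism described in the Methods can be sketched as a Jaya variant with a cosine decay on the random factors and an explore-then-exploit schedule. The abstract does not give the exact decay function or phase rules, so the specific forms below are illustrative assumptions around the standard Jaya update x' = x + r1·(best − |x|) − r2·(worst − |x|).

```python
import numpy as np

def improved_jaya(f, dim, bounds, pop_size=50, t_max=500, seed=0):
    """Jaya variant with a cosine decay on the random factors and a
    two-phase explore-then-exploit schedule (the specific decay and
    phase rules here are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, (pop_size, dim))
    fit = np.apply_along_axis(f, 1, pop)
    for t in range(t_max):
        w = 0.5 * (1.0 + np.cos(np.pi * t / t_max))   # cosine decay: 1 -> 0
        best, worst = pop[fit.argmin()], pop[fit.argmax()]
        r1 = rng.random((pop_size, dim)) * (0.5 + 0.5 * w)  # wide early
        r2 = rng.random((pop_size, dim)) * (0.5 + 0.5 * w)
        if t < t_max // 2:      # exploration phase: standard Jaya move
            cand = pop + r1 * (best - np.abs(pop)) - r2 * (worst - np.abs(pop))
        else:                   # exploitation phase: contract toward the best
            cand = pop + r1 * (best - pop)
        cand = np.clip(cand, lo, hi)
        cfit = np.apply_along_axis(f, 1, cand)
        better = cfit < fit                     # greedy replacement
        pop[better], fit[better] = cand[better], cfit[better]
    return pop[fit.argmin()], float(fit.min())
```

Because candidates only replace individuals when they improve, the best fitness is monotonically non-increasing, matching the rapid-decline-then-stabilize trajectory reported above.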
Optimized Design of Non-Transparent Bridge for Heterogeneous Interconnects in Hyper-converged Infrastructure
ZHENG Rui, SHEN Jianliang, LV Ping, DONG Chunlei, SHAO Yu, ZHU Zhengbin
Available online, doi: 10.11999/JEIT250272
Abstract:
  Objective  The integration of heterogeneous computing resource clusters into modern Hyper-Converged Infrastructure (HCI) systems imposes stricter performance requirements in latency, bandwidth, throughput, and cross-domain transmission stability. Traditional HCI systems primarily rely on the Ethernet TCP/IP protocol, which exhibits inherent limitations, including low bandwidth efficiency, high latency, and limited throughput. Existing PCIe Switch products typically employ Non-Transparent Bridges (NTBs) for conventional dual-system connections or intra-server communication; however, they do not meet the performance demands of heterogeneous cross-domain transmission within HCI environments. To address this limitation, a novel Dual-Mode Non-Transparent Bridge Architecture (D-MNTBA) is proposed to support dual transmission modes. D-MNTBA combines a fast transmission mode via a bypass mechanism with a stable transmission mode derived from the Traditional Data Path Architecture (TDPA), thereby aligning with the data characteristics and cross-domain streaming demands of HCI systems. Hardware-level enhancements in address and ID translation schemes enable D-MNTBA to support more complex mappings while minimizing translation latency. These improvements increase system stability and effectively support the cross-domain transmission of heterogeneous data in HCI systems.  Methods  To overcome the limitations of traditional single-pass architectures and the bypass optimizations of the TDPA, the proposed D-MNTBA incorporates both a fast transmission path and a stable transmission path. This dual-mode design enables the NTB to leverage the data characteristics of HCI systems for message-based streaming, thereby reducing dependence on intermediate protocols and data format conversions.
The stable transmission mode ensures reliable message delivery, while the fast transmission mode—enhanced through hardware-level optimizations in address and ID translation—supports cross-domain communication with stringent real-time requirements. This combination improves overall transmission performance by reducing both latency and system overhead. To meet the low-latency demands of the bypass transmission path, the architecture implements hardware-level enhancements to the address and ID conversion modules. The address translation module is expanded with a larger lookup table, allowing for more complex and flexible mapping schemes. This enhancement enables efficient utilization of non-contiguous and fragmented address spaces without compromising performance. Simultaneously, the ID conversion module is optimized through multiple conversion strategies and streamlined logic, significantly reducing the time required for ID translation.  Results and Discussions  Address translation in the proposed D-MNTBA is validated through emulation within a constructed HCI environment. The simulation log for indirect address translation shows no errors or deadlocks, and successful hits are observed on BAR2/3. During dual-host disk access, packet header addresses and payload content remain consistent, with no packet loss detected (Fig. 14), indicating that indirect address translation is accurately executed under D-MNTBA. ID conversion performance is evaluated by comparing the proposed architecture with the TDPA implemented in the PEX8748 chip. The switch based on D-MNTBA exhibits significantly shorter ID conversion times. A maximum reduction of approximately 34.9% is recorded, with an ID conversion time of 71 ns for a 512-byte payload (Fig. 15). These findings suggest that the ID function mapping method adopted in D-MNTBA effectively reduces conversion latency and enhances system performance. Throughput stability is assessed under sustained heavy traffic with payloads ranging from 256 to 2048 bytes.
The maximum throughputs of D-MNTBA, the Ethernet card, and PEX8748 are measured at 1.36 GB/s, 0.97 GB/s, and 0.9 GB/s, respectively (Fig. 16). Compared to PEX8748 and the Ethernet architecture, D-MNTBA improves throughput by approximately 51.1% and 40.2%, respectively, and shows the slowest degradation trend, reflecting superior stability in heterogeneous cross-domain transmission. Bandwidth comparison reveals that D-MNTBA outperforms TDPA and the Ethernet card, with bandwidth improvements of approximately 27.1% and 19.0%, respectively (Fig. 17). These results highlight the significant enhancement in cross-domain transmission performance achieved by the proposed architecture in heterogeneous environments.  Conclusions  This study proposes D-MNTBA, a dual-mode non-transparent bridge architecture, to address the challenges of heterogeneous interconnection in HCI systems. By integrating a fast transmission path enabled by a bypass architecture with the stable transmission path of the TDPA, D-MNTBA accommodates the specific data characteristics of cross-domain transmission in heterogeneous environments and enables efficient message routing. D-MNTBA enhances transmission stability while improving system-wide performance, offering robust support for real-time cross-domain transmission in HCI. It also reduces latency and overhead, thereby improving overall transmission efficiency. Compared with existing transmission schemes, D-MNTBA achieves notable gains in performance, making it a suitable solution for the demands of heterogeneous domain interconnects in HCI systems. However, the architectural enhancements, particularly the bypass design and associated optimizations, increase logic resource utilization and power consumption. Future work should focus on refining hardware design, layout, and wiring strategies to reduce logic complexity and resource consumption without compromising performance.
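The indirect, window-based address translation validated above can be illustrated abstractly. The table layout below is hypothetical; it is not the D-MNTBA register format or its BAR organization, only the general lookup-and-rebase behavior of an NTB translation module.

```python
def translate(addr, table):
    """Window-based indirect address translation sketch: 'table' maps
    (window_base, window_size) -> translated_base. A hit rebases the
    address within its window; a miss raises an error."""
    for (base, size), tbase in table.items():
        if base <= addr < base + size:
            return tbase + (addr - base)   # hit: rebase within the window
    raise ValueError("address outside all translation windows")
```

A larger lookup table, as in D-MNTBA, simply admits more (and non-contiguous) windows without changing this per-access logic.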
A Survey on System and Architecture Optimization Techniques for Mixture-of-Experts Large Language Models
WANG Zehao, ZHU Zhenhua, XIE Tongxin, WANG Yu
Available online, doi: 10.11999/JEIT250407
Abstract:
The Mixture-of-Experts (MoE) framework has become a pivotal approach for enhancing the knowledge capacity and inference efficiency of Large Language Models (LLMs). Conventional methods for scaling dense LLMs have reached significant limitations in training and inference due to computational and memory constraints. MoE addresses these challenges by distributing knowledge representation across specialized expert sub-networks, enabling parameter expansion while maintaining efficiency through sparse expert activation during inference. However, the dynamic nature of expert activation introduces substantial challenges in resource management and scheduling, necessitating targeted optimization at both the system and architectural levels. This survey focuses on the deployment of MoE-based LLMs. It first reviews the definitions and developmental trajectory of MoE, followed by an in-depth analysis of current system-level optimization strategies and architectural innovations tailored to MoE. The paper concludes by summarizing key findings and proposing prospective optimization techniques for MoE-based LLMs.  Significance   The MoE mechanism offers a promising solution to the computational and memory limitations of dense LLMs. By distributing knowledge representation across specialized expert sub-networks, MoE facilitates model scaling without incurring prohibitive computational costs. This architecture alleviates the bottlenecks associated with training and inference in traditional dense models, marking a notable advance in LLM research. Nonetheless, the dynamic expert activation patterns inherent to MoE introduce new challenges in resource scheduling and management. Overcoming these challenges requires targeted system- and architecture-level optimizations to fully harness the potential of MoE-based LLMs.  Progress   Recent advancements in MoE-based LLMs have led to the development of various optimization strategies. 
At the system level, approaches such as automatic parallelism, communication–computation pipelining, and communication operator fusion have been adopted to reduce communication overhead. Memory management has been improved through expert prefetching, caching mechanisms, and queue scheduling policies. To address computational load imbalance, both offline scheduling methods and runtime expert allocation strategies have been proposed, including designs that leverage heterogeneous CPU–GPU architectures. In terms of hardware architecture, innovations include dynamic adaptation to expert activation patterns, techniques to overcome bandwidth limitations, and near-memory computing schemes that improve deployment efficiency. In parallel, the open-source community has developed supporting tools and frameworks that facilitate the practical deployment and optimization of MoE-based models.  Conclusions  This survey presents a comprehensive review of system and architectural optimization techniques for MoE-based LLMs. It highlights the importance of reconciling parameter scalability with computational efficiency through the MoE framework. The dynamic nature of expert activation poses significant challenges in scheduling and resource management, which this survey systematically addresses. By evaluating current optimization techniques across both system and hardware layers, the paper offers key insights into the state of the field. It also proposes directions for future work, providing a reference for researchers and practitioners seeking to improve the performance and scalability of MoE-based models. The findings emphasize the need for continued innovation across algorithm development, system engineering, and architectural design to fully realize the potential of MoE in real-world applications.  Prospects   Future research on MoE-based LLMs is expected to advance the integration of algorithm design, system optimization, and hardware co-design. 
Key research directions include resolving load imbalance and maximizing resource utilization through adaptive expert scheduling algorithms, refining system frameworks to support dynamic sparse computation more effectively, and exploring hardware paradigms such as near-memory computing and hierarchical memory architectures. These developments aim to deliver more efficient and scalable MoE model deployments by fostering deeper synergy between software and hardware components.
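The sparse expert activation that drives these scheduling and memory-management challenges can be illustrated with a standard top-k gating function. This is a generic MoE router sketch, not tied to any specific model or framework surveyed.

```python
import numpy as np

def topk_gate(x, w_gate, k=2):
    """Sparse top-k expert routing as used in MoE layers: each token is
    dispatched to its k highest-scoring experts, with softmax weights
    renormalized over the selected experts only."""
    logits = x @ w_gate                            # (tokens, experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]     # indices of top-k experts
    sel = logits[np.arange(len(x))[:, None], topk]
    sel = np.exp(sel - sel.max(axis=-1, keepdims=True))
    gates = np.zeros_like(logits)
    gates[np.arange(len(x))[:, None], topk] = sel / sel.sum(axis=-1, keepdims=True)
    return gates, topk                             # dense gate matrix + routing
```

Because the selected expert indices depend on the input, the per-expert load is only known at runtime, which is exactly why the offline and runtime scheduling strategies discussed above are needed.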
Hybrid Far-Near Field Channel Estimation for XL-RIS Assisted Communication Systems
SHAO Kai, HUA Fanyu, WANG Guangyu
Available online, doi: 10.11999/JEIT250306
Abstract:
  Objective  With the rapid development of sixth-generation mobile communication, Extra-Large Reconfigurable Intelligent Surfaces (XL-RIS) have attracted significant attention due to their potential to enhance spectral efficiency, expand coverage, and reduce energy consumption. However, conventional channel estimation methods, primarily based on Far-Field (FF) or Near-Field (NF) models, face limitations in addressing the hybrid far-NF environment that arises from the coexistence of NF spherical waves and FF planar waves in XL-RIS deployments. These limitations restrict the intelligent control capability of RIS technology due to inaccurate channel modeling and reduced estimation accuracy. To address these challenges, this paper constructs a hybrid-field channel model for XL-RIS and proposes a robust channel estimation method to resolve parameter estimation challenges under coupled FF and NF characteristics, thereby improving channel estimation accuracy in complex propagation scenarios.  Methods  For channel estimation in XL-RIS-aided communication systems, several key challenges must be addressed, including the modeling of hybrid far-NF cascaded channels, separation of FF and NF channel components, and individual parameter estimation. To capture the hybrid-field effects of XL-RIS, a hybrid-field cascaded channel model is constructed. The RIS-to-User Equipment (UE) channel is modeled as a hybrid far-NF channel, whereas the Base Station (BS)-to-RIS channel is characterized under the FF assumption. A unified representation of FF and NF models is established by introducing equivalent cascaded angles for the angle of departure and angle of arrival on the RIS side. The XL-RIS hybrid-field cascaded channel is parameterized through BS arrival angles, RIS-UE cascaded angles, and distances. To reduce the computational complexity of joint parameter estimation, a Two-Stage Hybrid-Field (TS-HF) channel estimation scheme is proposed.
In the first stage, the BS arrival angle is estimated using the MUltiple SIgnal Classification (MUSIC) algorithm. In the second stage, a Hybrid-Field forward spatial smoothing Rank-reduced MUSIC (HF-RM) algorithm is proposed to estimate the parameters of the RIS-UE hybrid-field channel. The received signals are pre-processed using a forward spatial smoothing technique to mitigate multipath coherence effects. Subsequently, the Rank-reduced MUSIC (RM) algorithm is applied to separately estimate the FF and NF angle parameters, as well as the NF distance parameter. During this stage, a power spectrum comparison scheme is designed to distinguish FF and NF angles based on power spectral characteristics, thereby providing high-precision angular information to support NF distance estimation. Finally, channel attenuation is estimated using the least squares method. To validate the effectiveness of the proposed hybrid-field channel estimation scheme, comparative analyses are conducted against FF, NF, and the proposed TS-HF-RM schemes. The FF estimation approximates the hybrid-field channel using a FF channel model and estimates FF angle parameters with the MUSIC algorithm, referred to as the TS-FF-M scheme. The NF estimation applies a NF channel model to characterize the hybrid channel and estimates angle and distance parameters using the RM algorithm, referred to as the TS-NF-RM scheme. To further evaluate the estimation performance, additional benchmark schemes are considered, including the Two-Stage Near-Field Orthogonal Matching Pursuit (TS-NOMP) scheme, the Two-Stage Hybrid Orthogonal Matching Pursuit with Prior (TS-HOMP-P) scheme that requires prior knowledge of FF and NF quantities, and the Two-Stage Hybrid Orthogonal Matching Pursuit with No Prior (TS-HOMP-NP) scheme that operates without requiring such prior information.  
Results and Discussions  Compared with the TS-FF-M and TS-NF-RM schemes, the proposed TS-HF-RM approach achieves effective separation and accurate estimation of both FF and NF components by jointly modeling the hybrid-field channel. The method consistently demonstrates superior estimation accuracy across a wide range of Signal-to-Noise Ratio (SNR) conditions (Fig. 4). These results confirm both the necessity of hybrid-field channel modeling and the effectiveness of the proposed estimation scheme. Experimental findings show that the TS-HF-RM approach significantly improves channel estimation performance in XL-RIS-assisted communication systems. Further comparative analysis reveals that the TS-HF-RM scheme outperforms TS-NOMP and TS-HOMP-P by mitigating power leakage effects and overcoming limitations associated with unknown path numbers through distinct processing of FF and NF components. Without requiring prior knowledge of the propagation environment, the proposed method achieves lower Normalized Mean Square Error (NMSE) while demonstrating improved robustness and estimation precision (Fig. 5). Although TS-HOMP-NP also operates without prior field information, the TS-HF-RM scheme provides superior parameter resolution, attributed to its subspace decomposition principle. Additionally, both the TS-HF-RM and TS-HOMP-P schemes exhibit improved performance as the number of pilot signals increases. However, TS-HF-RM consistently outperforms TS-HOMP-P under low-SNR conditions (0 dB). At high SNR (10 dB) with a limited number of pilot signals (<280), TS-HOMP-P temporarily achieves better performance due to its higher sensitivity to SNR. Nevertheless, the proposed TS-HF-RM approach demonstrates greater stability and adaptability under low-SNR and resource-constrained conditions (Fig. 6).  
Conclusions  This study addresses the challenge of hybrid-field channel estimation for XL-RIS by constructing a hybrid-field cascaded channel model and proposing a two-stage estimation scheme. The HF-RM algorithm is specifically designed for accurate hybrid component estimation in the second stage. Theoretical analysis and simulation results demonstrate the following: (1) The hybrid-field model reduces inaccuracies associated with traditional single-field assumptions, providing a theoretical foundation for reliable parameter estimation in complex propagation environments; (2) The proposed TS-HF-RM algorithm enables high-resolution parameter estimation with effective separation of FF and NF components, achieving lower NMSE compared to hybrid-field OMP-based methods.
An Optimization Design Method for Zero-Correlation Zone Sequences Based on Newton’s Method
HU Enbo, LIU Tao, LI Yubo
Available online  , doi: 10.11999/JEIT250394
Abstract:
  Objective  Sequences with favorable correlation properties are widely applied in radar and communication systems. Sequence sets with zero or low correlation characteristics enhance radar resolution, target detection, imaging quality, and information acquisition, while also improving the omnidirectional transmission capability of massive Multiple-Input Multiple-Output (MIMO) systems. Designing aperiodic Zero Correlation Zone (ZCZ) sequence sets with excellent correlation performance is therefore critical for both wireless communication and radar applications. For example, aperiodic Z-Complementary Set (ZCS) sequence sets are often used in omnidirectional precoding for MIMO systems, whereas aperiodic ZCZ sequence sets are employed in integrated MIMO radar-communication systems. These ZCZ sequence sets are thus valuable across a range of system applications. However, most prior studies rely on analytical construction methods, which impose constraints on parameters such as sequence length and the number of sequences, thereby limiting design flexibility and practical applicability. This study proposes a numerical optimization approach for designing ZCS and aperiodic ZCZ sequence sets with improved correlation properties and greater parametric flexibility. The method minimizes the Complementary Peak Sidelobe Level (CPSL) and Weighted Peak Sidelobe Level (WPSL) using Newton’s method to achieve superior sequence performance.  Methods  This study proposes an optimization-based design method using Newton’s method to construct both aperiodic ZCS sequence sets and aperiodic ZCZ sequence sets with low sidelobe levels and flexible parameters. The optimization objective is first formulated using the CPSL and WPSL. The problem is then reformulated as an equivalent system of nonlinear equations, which is solved using Newton’s method. To reduce computation time, partial derivatives are approximated using numerical differentiation techniques.
A loop iteration strategy is employed to address multiple constraints during the optimization process. To ensure algorithmic convergence, Armijo’s rule is used for step size selection, promoting stable descent of the objective function along the defined search direction.  Results and Discussions  The aperiodic ZCS sequence set is constructed using Newton’s method. As the number of sequences increases, the CPSL progressively decreases, falling below –300 dB when $M \geqslant 2$. The proposed method yields better sidelobe performance than the improved Iterative Twisted Approximation (ITROX) algorithm (Fig. 1). The performance of ZCS sequences generated by both methods is evaluated under different ZCZ conditions. While both approaches achieve low CPSL, Newton’s method yields sidelobe levels closer to the ideal value (Fig. 2). Convergence behavior is assessed using CPSL and the number of iterations. The improved ITROX algorithm typically requires around 20,000 iterations to converge, with the number of iterations increasing as the ZCZ size grows. In contrast, Newton’s method achieves rapid convergence within approximately 10 iterations (Figs. 3 and 4). The aperiodic ZCZ sequence set constructed using Newton’s method exhibits autocorrelation and cross-correlation peak sidelobe levels below –300 dB within the ZCZ. Moreover, Newton’s method achieves the lowest WPSL, offering the best overall performance among all tested methods (Fig. 5). The smooth convergence curves further confirm the algorithm’s stability when applied to aperiodic ZCZ sequence construction (Fig. 6).  Conclusions  This study proposes an optimization-based algorithm for designing aperiodic ZCS and aperiodic ZCZ sequence sets using Newton’s method, aiming to address the limitations of fixed parameters and high peak sidelobe levels found in existing approaches. Two optimization problems are formulated by minimizing the WPSL and CPSL, respectively.
To simplify computation, the optimization tasks are converted into systems of nonlinear equations, which are solved using Newton’s method. The Jacobian matrix is computed via numerical differentiation to reduce computational cost. A loop iteration strategy is introduced to meet multiple constraints in the construction of aperiodic ZCZ sequences. Simulation results confirm that the proposed method yields sequence sets with excellent correlation properties and flexible parameter configurations. By tuning the weighting coefficients, low sidelobe levels can be achieved in specific regions of interest, accommodating different application requirements. The combination of flexible design parameters and favorable correlation performance makes the proposed sequences suitable for a wider range of practical scenarios.
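The numerical core described above (a nonlinear system solved by Newton iterations, a finite-difference Jacobian, and Armijo step-size selection) can be sketched generically. The toy two-equation system below stands in for the actual CPSL/WPSL equation systems, which are not reproduced in the abstract:

```python
import numpy as np

def numerical_jacobian(F, x, eps=1e-6):
    """Forward-difference approximation of the Jacobian of F at x."""
    f0 = F(x)
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (F(xp) - f0) / eps
    return J

def newton_armijo(F, x0, tol=1e-10, max_iter=50, beta=0.5, sigma=1e-4):
    """Newton's method for F(x) = 0 with Armijo backtracking on 0.5*||F||^2."""
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        f = F(x)
        merit = 0.5 * (f @ f)
        if np.sqrt(2 * merit) < tol:
            break
        J = numerical_jacobian(F, x)
        step = np.linalg.solve(J, -f)           # Newton direction
        # Armijo rule: shrink the step until the merit decreases sufficiently
        t = 1.0
        while 0.5 * np.sum(F(x + t * step) ** 2) > merit + sigma * t * (f @ (J @ step)):
            t *= beta
        x = x + t * step
    return x

# Toy system with root (1, 1): x^2 + y^2 - 2 = 0 and x - y = 0
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 2.0, v[0] - v[1]])
root = newton_armijo(F, np.array([2.0, 0.5]))
```

Armijo backtracking accepts the first step length for which the merit function 0.5·‖F‖² decreases sufficiently along the Newton direction, which is what stabilizes descent in the paper’s setting as well.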
Continuous Federation of Noise-resistant Heterogeneous Medical Dialogue Using the Trustworthiness-based Evaluation
LIU Yupeng, ZHANG Jiang, TANG Shichen, MENG Xin, MENG Qingfeng
Available online  , doi: 10.11999/JEIT250057
Abstract:
  Objective   To address the key challenges of client model heterogeneity, data distribution heterogeneity, and text noise in medical dialogue federated learning, this paper proposes a trustworthiness-based, noise-resistant heterogeneous medical dialogue federated learning method, termed FedRH. FedRH enhances robustness by improving the objective function, aggregation strategy, and local update process, among other components, based on credibility evaluation.  Methods   Model training is divided into a local training stage and a heterogeneous federated learning stage. During local training, text noise is mitigated using a symmetric cross-entropy loss function, which reduces the risk of overfitting to noisy text. In the heterogeneous federated learning stage, an adaptive aggregation mechanism incorporates clean, noisy, and heterogeneous client texts by evaluating their quality. Local parameter updates consider both local and global parameters simultaneously, enabling continuous adaptive updates that improve resistance to both random and structured (syntax/semantic) noise and model heterogeneity. The main contributions are threefold: (1) A local noise-resistant training strategy that uses symmetric cross-entropy loss to prevent overfitting to noisy text during local training; (2) A heterogeneous federated learning approach based on client trustworthiness, which evaluates each client’s text quality and learning effectiveness to compute trust scores. These scores are used to adaptively weight clients during model aggregation, thereby reducing the influence of low-quality data while accounting for text heterogeneity; (3) A local continuous adaptive aggregation mechanism, which allows the local model to integrate fine-grained global model information. This approach reduces the adverse effects of global model bias caused by heterogeneous and noisy text on local updates.  
  Results and Discussions   The effectiveness of the proposed model is systematically validated through extensive, multi-dimensional experiments. The results indicate that FedRH achieves substantial improvements over existing methods in noisy and heterogeneous federated learning scenarios (Table 2, Table 3). The study also presents training process curves for both heterogeneous models (Fig. 3) and homogeneous models (Fig. 6), supplemented by parameter sensitivity analysis, ablation experiments, and a case study.  Conclusions   The proposed FedRH framework significantly enhances the robustness of federated learning for medical dialogue tasks in the presence of heterogeneous and noisy text. The main conclusions are as follows: (1) Compared with baseline methods, FedRH achieves superior performance in client-side models under heterogeneous and noisy text conditions. It demonstrates improvements across multiple metrics, including precision, recall, and factual consistency, and converges more rapidly during training. (2) Ablation experiments confirm that both the symmetric cross-entropy-based local training strategy and the credibility-weighted heterogeneous aggregation approach contribute to performance gains.
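For readers unfamiliar with the local noise-resistant strategy, a minimal sketch of a symmetric cross-entropy loss follows. The weighting coefficients and the log-clip value are illustrative assumptions (following the convention of the original symmetric cross-entropy literature), not FedRH’s actual settings:

```python
import numpy as np

def symmetric_cross_entropy(logits, labels, alpha=0.1, beta=1.0, log_clip=-4.0):
    """Symmetric cross-entropy: standard CE plus a reverse CE term.

    The reverse term uses log(one_hot) with log(0) clipped to `log_clip`,
    which bounds the penalty a confidently wrong (possibly mislabeled)
    sample can impose -- this is what limits overfitting to noisy labels.
    """
    logits = logits - logits.max(axis=1, keepdims=True)   # stable softmax
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    n, k = probs.shape
    one_hot = np.eye(k)[labels]
    # Forward cross-entropy: -log p(correct class)
    ce = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))
    # Reverse cross-entropy: -sum_k p_model(k) * log(one_hot_k)
    log_one_hot = np.where(one_hot > 0, 0.0, log_clip)
    rce = -np.mean((probs * log_one_hot).sum(axis=1))
    return alpha * ce + beta * rce

# Confident correct predictions vs. the same predictions with flipped labels
loss_clean = symmetric_cross_entropy(np.array([[5.0, 0.0], [0.0, 5.0]]), [0, 1])
loss_noisy = symmetric_cross_entropy(np.array([[5.0, 0.0], [0.0, 5.0]]), [1, 0])
```

Because the reverse term is bounded by the clip value, mislabeled samples contribute a capped loss instead of the unbounded penalty of plain cross-entropy.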
Precise Hand Joint Motion Analysis Driven by Complex Physiological Information
YAN Jiaqing, LIU Gengchen, ZHOU Qingqi, XUE Weiqi, ZHOU Weiao, TIAN Yunzhi, WANG Jiaju, DONG Zhekang, LI Xiaoli
Available online  , doi: 10.11999/JEIT250033
Abstract:
  Objective  The human hand is a highly dexterous organ essential for performing complex tasks. However, dysfunction due to trauma, congenital anomalies, or disease substantially impairs daily activities. Restoring hand function remains a major challenge in rehabilitation medicine. Virtual Reality (VR) technology presents a promising approach for functional recovery by enabling hand pose reconstruction from surface ElectroMyoGraphy (sEMG) signals, thereby facilitating neural plasticity and motor relearning. Current sEMG-based hand pose estimation methods are limited by low accuracy and coarse joint resolution. This study proposes a new method to estimate the motion of 15 hand joints using eight-channel sEMG signals, offering a potential improvement in rehabilitation outcomes and quality of life for individuals with hand impairment.  Methods  The proposed method, termed All Hand joints Posture Estimation (AHPE), incorporates a continuous denoising network that combines sparse attention and multi-channel attention mechanisms to extract spatiotemporal features from sEMG signals. A dual-decoder architecture estimates both noisy hand poses and the corresponding correction ranges. These outputs are subsequently refined using a Bidirectional Long Short-Term Memory (BiLSTM) network to improve pose accuracy. Model training employs a composite loss function that integrates Mean Squared Error (MSE) and Kullback-Leibler (KL) divergence to enhance joint angle estimation and capture inter-joint dependencies. Performance is evaluated using the NinaproDB8 and NinaproDB5 datasets, which provide sEMG and hand pose data for single-finger and multi-finger movements, respectively.  Results and Discussions  The AHPE model outperforms existing methods—including CNN-Transformer, DKFN, CNN-LSTM, TEMPOnet, and RPC-Net—in estimating hand poses from multi-channel sEMG signals. 
In within-subject validation (Table 1), AHPE achieves a Root Mean Squared Error (RMSE) of 2.86, a coefficient of determination (R2) of 0.92, and a Mean Absolute Deviation (MAD) of 1.79° for MetaCarPophalangeal (MCP) joint rotation angle estimation. In between-subject validation (Table 2), the model maintains high accuracy with an RMSE of 3.72, an R2 of 0.88, and an MAD of 2.36°, demonstrating strong generalization. The model’s capacity to estimate complex hand gestures is further confirmed using the NinaproDB5 dataset. Estimated hand poses are visualized with the Mano Torch hand model (Fig. 4, Fig. 5). The average R2 values for finger joint extension estimation are 0.72 (thumb), 0.692 (index), 0.696 (middle), 0.689 (ring), and 0.696 (little finger). Corresponding RMSE values are 10.217°, 10.257°, 10.290°, 10.293°, and 10.303°, respectively. A grid error map (Fig. 6) highlights prediction accuracy, with red regions indicating higher errors.  Conclusions  The AHPE model offers an effective approach for estimating hand poses from sEMG signals, addressing key challenges such as signal noise, high dimensionality, and inter-individual variability. By integrating mixed attention mechanisms with a dual-decoder architecture, the model enhances both accuracy and robustness in multi-joint hand pose estimation. Results confirm the model’s capacity to reconstruct detailed hand kinematics, supporting its potential for applications in hand function rehabilitation and human-machine interaction. Future work will aim to improve robustness under real-world conditions, including sensor noise and environmental variation.
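The abstract specifies a composite loss of MSE plus KL divergence but not how the KL term is formed over joint angles. The sketch below assumes softmax-normalized pose vectors as the compared distributions and an illustrative weight `lam`; both are assumptions for illustration, not the AHPE model’s documented choices:

```python
import numpy as np

def composite_loss(pred, target, lam=0.1, eps=1e-8):
    """MSE on joint angles plus a KL term between softmax-normalized
    pose vectors, standing in for the paper's inter-joint dependency term."""
    mse = np.mean((pred - target) ** 2)

    def to_dist(x):
        # Normalize each pose vector into a distribution over joints
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    p, q = to_dist(target), to_dist(pred)
    kl = np.mean(np.sum(p * np.log((p + eps) / (q + eps)), axis=-1))
    return mse + lam * kl

rng = np.random.default_rng(0)
target = rng.uniform(0.0, 90.0, size=(4, 15))   # batch of 4 poses, 15 joints
perfect = composite_loss(target, target)        # exact prediction
off = composite_loss(target + 5.0, target)      # uniformly biased prediction
```

The MSE term drives per-joint accuracy, while the KL term penalizes distortions of the relative angle pattern across joints.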
Breakthrough in Solving NP-Complete Problems Using Electronic Probe Computers
XU Jin, YU Le, YANG Huihui, JI Siyuan, ZHANG Yu, YANG Anqi, LI Quanyou, LI Haisheng, ZHU Enqiang, SHI Xiaolong, WU Pu, SHAO Zehui, LENG Huang, LIU Xiaoqing
Available online  , doi: 10.11999/JEIT250352
Abstract:
This study presents a breakthrough in addressing NP-complete problems using a newly developed Electronic Probe Computer (EPC60). The system employs a hybrid serial–parallel computational model and performs large-scale parallel operations through seven probe operators. In benchmark tests on 3-coloring problems in graphs with 2,000 vertices, EPC60 achieves 100% accuracy, outperforming the mainstream solver Gurobi, which succeeds in only 6% of cases. Computation time is reduced from 15 days to 54 seconds. The system demonstrates high scalability and offers a general-purpose solution for complex optimization problems in areas such as supply chain management, finance, and telecommunications.  Objective   NP-complete problems pose a fundamental challenge in computer science. As problem size increases, the required computational effort grows exponentially, making it infeasible for traditional electronic computers to provide timely solutions. Alternative computational models have been proposed, with biological approaches, particularly DNA computing, demonstrating notable theoretical advances. However, DNA computing systems continue to face major limitations in practical implementation.  Methods  Computational Model: EPC is based on a non-Turing computational model in which data are multidimensional and processed in parallel. Its database comprises four types of graphs, and the probe library includes seven operators, each designed for specific graph operations. By executing parallel probe operations, EPC efficiently addresses NP-complete problems. Structural Features: EPC consists of four subsystems: a conversion system, input system, computation system, and output system.
The conversion system transforms the target problem into a graph coloring problem; the input system allocates tasks to the computation system; the computation system performs parallel operations via probe computation cards; and the output system maps the solution back to the original problem format. EPC60 features a three-tier hierarchical hardware architecture comprising a control layer, an optical routing layer, and a probe computation layer. The control layer manages data conversion, format transformation, and task scheduling. The optical routing layer supports high-throughput data transmission, while the probe computation layer conducts large-scale parallel operations using probe computation cards.  Results and Discussions  EPC60 successfully solved 100 instances of the 3-coloring problem for graphs with 2,000 vertices, achieving a 100% success rate. In comparison, the mainstream solver Gurobi succeeded in only 6% of cases. Additionally, EPC60 rapidly solved two 3-coloring problems for graphs with 1,500 and 2,000 vertices, which Gurobi failed to resolve after 15 days of continuous computation on a high-performance workstation. Using an open-source dataset, we identified 1,000 3-colorable graphs with 1,000 vertices and 100 3-colorable graphs with 2,000 vertices. These correspond to theoretical complexities of O(1.3289^n) in both cases. The test results are summarized in Table 1. Currently, EPC60 can directly solve 3-coloring problems for graphs with up to n vertices, with a theoretical complexity of at least O(1.3289^n). On April 15, 2023, a scientific and technological achievement appraisal meeting organized by the Chinese Institute of Electronics was held at Beijing Technology and Business University. A panel of ten senior experts conducted a comprehensive technical evaluation and Q&A session. The committee reached the following unanimous conclusions: (1) the probe computer represents an original breakthrough in computational models; (2) the system architecture design demonstrates significant innovation; (3) the technical complexity reaches internationally leading levels; and (4) it provides a novel approach to solving NP-complete problems. Experts at the appraisal meeting stated, “This is a major breakthrough in computational science achieved by our country, with not only theoretical value but also broad application prospects.” In cybersecurity, EPC60 has also demonstrated remarkable potential. Supported by the National Key R&D Program of China (2019YFA0706400), Professor Xu Jin’s team developed an automated binary vulnerability mining system based on a function call graph model. Evaluation of the system using the Modbus Slave software showed over 95% vulnerability coverage, far exceeding the 75 vulnerabilities detected by conventional depth-first search algorithms. The system also discovered a previously unknown flaw, the “Unauthorized Access Vulnerability in Changyuan Shenrui PRS-7910 Data Gateway” (CNVD-2020-31406), highlighting EPC60’s efficacy in cybersecurity applications. The high efficiency of EPC60 derives from its unique computational model and hardware architecture. Given that all NP-complete problems can be polynomially reduced to one another, EPC60 provides a general-purpose solution framework. It is therefore expected to be applicable in a wide range of domains, including supply chain management, financial services, telecommunications, energy, and manufacturing.  Conclusions   The successful development of EPC offers a novel approach to solving NP-complete problems. As technological capabilities continue to evolve, EPC is expected to demonstrate strong computational performance across a broader range of application domains. Its distinctive computational model and hardware architecture also provide important insights for the design of next-generation computing systems.
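For context, graph 3-coloring (the benchmark problem above) is easy to state but exponentially hard to solve in general. A conventional sequential backtracking baseline, of the kind EPC60’s parallel probe operations are designed to outpace, looks like this:

```python
def three_colorable(n, edges):
    """Backtracking test for 3-colorability of an undirected graph on
    vertices 0..n-1. Exponential in the worst case -- the classical
    sequential baseline that probe computation parallelizes away from."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [-1] * n

    def assign(v):
        if v == n:                      # every vertex colored consistently
            return True
        for c in range(3):
            # Try color c if no already-colored neighbor uses it
            if all(color[w] != c for w in adj[v]):
                color[v] = c
                if assign(v + 1):
                    return True
        color[v] = -1                   # backtrack
        return False

    return assign(0)

# K4 (complete graph on 4 vertices) needs 4 colors; C5 (odd cycle) needs 3
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
k4_colorable = three_colorable(4, k4)
c5_colorable = three_colorable(5, c5)
```

On adversarial instances this search degrades exponentially with the vertex count, which is why the 2,000-vertex benchmarks cited above are out of reach for naive sequential methods.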
Research on an EEG-based Neurofeedback System for the Auxiliary Intervention of Post-Traumatic Stress Disorder
TAN Lize, DING Peng, WANG Fan, LI Na, GONG Anmin, NAN Wenya, LI Tianwen, ZHAO Lei, FU Yunfa
Available online  , doi: 10.11999/JEIT250093
Abstract:
  Objective  The ElectroEncephaloGram (EEG)-based Neurofeedback Regulation (ENR) system is designed for real-time modulation of dysregulated stress responses to reduce symptoms of Post-Traumatic Stress Disorder (PTSD) and anxiety. This study evaluates the system’s effectiveness and applicability using a series of neurofeedback paradigms tailored for both PTSD patients and healthy participants.  Methods  Employing real-time EEG monitoring and feedback, the ENR system targets the regulation of alpha wave activity to alleviate mental health symptoms associated with dysregulated stress responses. The system integrates MATLAB and Unity3D to support a complete workflow for EEG data acquisition, processing, storage, and visual feedback. Experimental validation includes both PTSD patients and healthy participants to assess the system’s effects on neuroplasticity and emotional regulation. Primary assessment indices include changes in alpha wave dynamics and self-reported reductions in stress and anxiety.  Results and Discussions  Compared with conventional therapeutic methods, the ENR system shows significant potential in reducing symptoms of PTSD and anxiety. During functionality tests, the system effectively captures and regulates alpha wave activity, enabling real-time and efficient neurofeedback. Dynamic adjustment of feedback thresholds and task paradigms allows participants to improve stress responses and emotional states following training. Quantitative data indicate clear enhancements in EEG pattern modulation, while qualitative assessments reflect improvements in participants’ self-reported stress and anxiety levels.  Conclusions  This study presents an effective and practical EEG-based neurofeedback regulation system that proves applicable and beneficial for both individuals with PTSD and healthy participants.
The successful implementation of the system provides a new technological approach for mental health interventions and supports ongoing personalized neuroregulation strategies. Future research should explore broader applications of the system across neurological conditions to fully assess its efficacy and scalability.
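As a sketch of the kind of signal measure such a system feeds back, the snippet below computes relative alpha-band (8–12 Hz) power of a single EEG epoch via a periodogram. The sampling rate, epoch length, and band edges are conventional illustrative choices, not parameters taken from the paper:

```python
import numpy as np

def alpha_band_power(eeg, fs, band=(8.0, 12.0)):
    """Relative alpha-band power of a 1-D EEG epoch via the periodogram."""
    x = eeg - eeg.mean()                       # remove DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2          # raw periodogram
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return psd[in_band].sum() / psd.sum()      # fraction of total power

# Synthetic check: a 10 Hz rhythm in light noise should dominate the band
fs = 250
t = np.arange(0.0, 2.0, 1.0 / fs)              # 2-second epoch
rng = np.random.default_rng(1)
sig = np.sin(2 * np.pi * 10.0 * t) + 0.2 * rng.standard_normal(t.size)
rel = alpha_band_power(sig, fs)
```

A neurofeedback loop would compute such a measure per epoch in real time and map it onto the visual feedback threshold.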
Personalized Federated Learning Method Based on Coalition Game and Knowledge Distillation
SUN Yanhua, SHI Yahui, LI Meng, YANG Ruizhe, SI Pengbo
Available online  , doi: 10.11999/JEIT221203
Abstract:
To overcome the limitations of Federated Learning (FL) when both the data and models of clients are heterogeneous, and to improve accuracy, a personalized Federated learning algorithm with Coalition game and Knowledge distillation (pFedCK) is proposed. First, each client uploads its soft predictions on a public dataset and downloads the k most correlated soft predictions. The method then applies the Shapley value from coalition game theory to measure the multi-wise influence among clients and to quantify each client’s marginal contribution to the personalized learning performance of the others. Finally, each client identifies its optimal coalition, distills the knowledge into its local model, and trains on its private dataset. The results show that, compared with state-of-the-art algorithms, this approach achieves superior personalized accuracy, with improvements of about 10%.
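A brief sketch of the Shapley-value computation underlying the coalition step may help. The toy value function below (two complementary clients plus one weak client) is invented for illustration; pFedCK itself derives coalition values from distillation gains on soft predictions:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, coalition_value):
    """Exact Shapley value of each player, given a value function
    v(frozenset) -> float. Exponential cost; fine for a handful of clients."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                S = frozenset(S)
                # Weight of coalition S in the Shapley average
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (coalition_value(S | {i}) - coalition_value(S))
        phi[i] = total
    return phi

def v(S):
    """Toy coalition value: clients 0 and 1 are complementary; 2 adds little."""
    base = {0: 0.5, 1: 0.5, 2: 0.1}
    bonus = 0.4 if {0, 1} <= S else 0.0
    return sum(base[i] for i in S) + bonus

phi = shapley_values([0, 1, 2], v)
```

By the efficiency property, the values sum to v of the grand coalition, and a client whose presence adds no synergy (client 2 here) receives exactly its standalone contribution, which is the fairness argument for using Shapley values to weight coalition membership.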
The Range-angle Estimation of Target Based on Time-invariant and Spot Beam Optimization
Wei CHU, Yunqing LIU, Wenyug LIU, Xiaolong LI
Available online  , doi: 10.11999/JEIT210265
Abstract:
The application of Frequency Diverse Array and Multiple Input Multiple Output (FDA-MIMO) radar to the range-angle estimation of targets has attracted increasing attention. The FDA simultaneously provides degrees of freedom in both angle and range for the transmit beam pattern. However, its performance is degraded by the periodicity and time-varying nature of the beam pattern. Therefore, an improved Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm is proposed to estimate the target’s parameters, based on a new waveform synthesis model of the Time Modulation and Range Compensation FDA-MIMO (TMRC-FDA-MIMO) radar. Finally, the proposed method is compared with the identical-frequency-increment FDA-MIMO radar system, the logarithmically increasing frequency-offset FDA-MIMO radar system, and the MUltiple SIgnal Classification (MUSIC) algorithm in terms of the Cramér-Rao lower bound and the root mean square error of range and angle estimation, and the excellent performance of the proposed method is verified.
Satellite Navigation
Research on GRI Combination Design of eLORAN System
LIU Shiyao, ZHANG Shougang, HUA Yu
Available online  , doi: 10.11999/JEIT201066
Abstract:
To solve the problem of Group Repetition Interval (GRI) selection in the construction of enhanced LORAN (eLORAN) supplementary transmitting stations, a screening algorithm based on the Cross Interference Rate (CRI) is proposed, mainly from a mathematical point of view. First, the method considers the requirement of carrying second-of-day information and, on this basis, conducts an initial screening by comparing the mutual CRI with adjacent Loran-C stations in neighboring countries. Second, a further screening is conducted through permutation and pairwise comparison. The optimal GRI combination scheme is then given by considering the data-rate requirements and system specifications. Finally, in view of the high-precision timing requirements of the new eLORAN system, an optimized selection is made among the multiple optimal combinations. The analysis results show that the average interference rate of the optimal combination scheme obtained by this algorithm is comparable to that between the current navigation chains while meeting the timing requirements, providing referential suggestions and a theoretical basis for the construction of a high-precision ground-based timing system.