Journal of Electronics & Information Technology (电子与信息学报)

2026, 48(1)

2026, 48(1): 1-4.

Intelligent Unmanned Aerial Vehicles for Low-altitude Economy: A Review of the Technology Framework and Future Prospects

QIAN Zhihong, WANG Yijun

2026, 48(1): 1-33. doi: 10.11999/JEIT251246

Abstract:
Significance The deep integration of new quality productive forces with the digital economy accelerates the development of the low-altitude economy and positions it as an emerging driver of global economic growth. Operating in airspace typically below 3 000 m, this industrial system supports diverse applications, including Unmanned Aerial Vehicle (UAV) logistics, Urban Air Mobility (UAM), industrial inspection, and public safety. Intelligent UAVs, characterized by cost efficiency, scalability, and autonomous capability, function as the core technical enabler of this ecosystem. Their deployment promotes a transition in aviation from centralized and isolated operation modes toward distributed, intelligent, and service-oriented aerial utilization. From a strategic perspective, intelligent UAVs contribute to industrial upgrading, urban infrastructure improvement, airspace security assurance, and regional economic development. Therefore, a systematic review and structured construction of an intelligent UAV technology framework is necessary to support future research, clarify key challenges, and promote sustained development of the low-altitude economy. Progress A holistic technology framework for intelligent UAVs is constructed, organized hierarchically from foundational technologies to application-oriented systems. The framework integrates four interrelated domains. Intelligent perception and navigation emphasize stable operation in complex environments through tightly coupled multi-sensor fusion and advanced state estimation methods, such as visual-inertial odometry, supported by multi-source adaptive positioning in Global Navigation Satellite System (GNSS)-denied scenarios. Wireless communication networks focus on reliable Beyond-Visual-Line-Of-Sight (BVLOS) connectivity by combining cellular network access, self-organizing flying ad hoc networks (FANETs) with intelligent topology control, and UAV-assisted edge computing for efficient resource scheduling. 
Autonomous decision-making and cooperative control evolve from classical rule-based approaches toward learning-based paradigms, where multi-agent reinforcement learning enables coordinated swarm behavior and adaptive task execution. Low-altitude security and airspace management provide essential system support through integrated detection and countermeasure technologies, supplemented by UAV cloud platforms and Unmanned aircraft system Traffic Management (UTM) for coordinated airspace operation. Conclusions The review indicates that UAVs are transitioning from isolated platforms to interconnected intelligent nodes embedded within the low-altitude economy system. Although substantial progress has been achieved across multiple technological domains, several critical challenges remain. Major technical constraints include maintaining communication reliability in complex low-altitude channels, addressing perception degradation in cluttered or deceptive environments, achieving robust autonomous cooperation under uncertainty, and overcoming the inherent limitations of existing energy and power technologies. These technical issues coexist with non-technical barriers, such as the establishment of adaptive regulatory and airspace governance frameworks, the formation of scalable and sustainable business models, and the enhancement of public acceptance. The analysis suggests that addressing these challenges requires deep integration of enabling technologies. A closed-loop evolution paradigm of “challenge-driven → technology fusion → system construction → feedback iteration” is proposed to describe the intrinsic iterative logic of technological development and to provide methodological guidance for future research and engineering practice. Prospects Future intelligent UAV development is expected to concentrate on several strongly coupled directions. 
Intelligent holistic communication will advance through deep integration of air-ground-space networks and Integrated Sensing And Communication (ISAC), forming a proactive data environment that supports predictive resource management and resilient connectivity. Cognitive swarm intelligence will promote the transformation of UAV clusters into cooperative cognitive systems by combining large language models for task comprehension with multi-agent reinforcement learning for decentralized decision-making, enabling emergent collective intelligence. High-assurance autonomous security will rely on formal verification of artificial intelligence models, explainable decision mechanisms, and extensive application of digital twins for virtual validation and certification, thereby strengthening operational trust. In parallel, green and sustainable technologies will influence the full lifecycle of UAV systems, encouraging advances in high-energy-density power solutions, including solid-state batteries and hydrogen fuel cells, the use of environmentally friendly materials, and artificial intelligence-based optimization of energy consumption and acoustic performance, which together support the long-term sustainability of the low-altitude economy.

Physical Layer Authentication for Large Language Models in Maritime Communications

CHEN Qiaoxin, XIAO Liang, WANG Pengcheng, LI Jieling, YAO Jinqing, XU Xiaoyu

2026, 48(1): 34-44. doi: 10.11999/JEIT250804

Abstract:
Objective PHYsical (PHY)-layer authentication exploits channel state information to detect spoofing attacks. However, for smart ocean applications supported by Large Language Models (LLMs), authentication accuracy and speed remain limited because of insufficient channel estimation and rapidly time-varying channels in short-packet communications with constrained preamble length. An environment perception-aware PHY-layer authentication scheme is therefore proposed for LLM edge inference in maritime applications. A hypothesis-testing-based multi-mode authentication framework is designed to evaluate channel state information and packet arrival interval. Application types and environmental indicators inferred by the LLM are used in reinforcement learning to optimize the authentication mode and test threshold, thereby improving authentication accuracy and speed. Methods An environment perception-aware PHY-layer authentication scheme is developed for LLM edge inference in maritime wireless networks. Hypothesis-testing-based multi-mode authentication is used to jointly evaluate channel state information and packet arrival interval for spoofing detection. Reinforcement learning is adopted to optimize the authentication mode and test threshold according to application types and environmental indicators inferred by a multimodal LLM fed with images and prompts. A multi-level policy risk function is formulated to quantify miss-detection risk and to reduce exploration probability for unsafe policies. A Benna-Fusi synapse-based continual learning mechanism is proposed to obtain multi-scale optimization experience across multiple maritime scenarios, such as deck and cabin environments, and to replay identical cases to accelerate policy optimization. Results and Discussions Simulations are conducted using four legitimate devices and a shipborne server with maritime channel data collected in the Xiamen Pearl Harbor area. 
A spoofing attacker moving at 1.5 m/s transmits false data packets to the server with a maximum power of 100 mW. The results demonstrate clear performance gains over benchmark methods. Compared with RLPA, the proposed scheme achieves an 84.2% reduction in false alarm rate and an 82.3% reduction in miss-detection rate. These gains are attributed to the use of LLM-derived environmental indicators and a safe exploration mechanism that avoids high-risk authentication policies leading to increased miss detection. Conclusions A PHY-layer authentication scheme is proposed for LLM-enabled intelligent maritime wireless networks, in which both the authentication mode and test threshold are optimized to counter spoofing attacks. By jointly using LLM-derived environmental indicators, channel state information, and packet arrival interval, a safe exploration mechanism is applied to improve authentication accuracy and efficiency. Simulation results confirm that the proposed scheme reduces the false alarm rate by 84.2% and the miss-detection rate by 82.3% compared with the benchmark RLPA.
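
The hypothesis test described in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the test statistic (normalized squared distance between channel estimates), the three mode names, and all parameter names are assumptions chosen for illustration. In the paper, the mode and threshold are selected by reinforcement learning, whereas here they are passed in by hand.

```python
def csi_statistic(h_now, h_ref):
    """Normalized squared distance between the current and the reference channel estimate."""
    num = sum(abs(a - b) ** 2 for a, b in zip(h_now, h_ref))
    den = sum(abs(b) ** 2 for b in h_ref)
    return num / den


def authenticate(h_now, h_ref, interval, expected_interval,
                 threshold, interval_tol, mode="both"):
    """Accept the packet (True) or flag it as spoofed (False).

    mode is a hypothetical switch: "csi" tests only the channel,
    "interval" tests only the packet arrival interval, "both" requires both.
    """
    csi_ok = csi_statistic(h_now, h_ref) <= threshold
    interval_ok = abs(interval - expected_interval) <= interval_tol
    if mode == "csi":
        return csi_ok
    if mode == "interval":
        return interval_ok
    return csi_ok and interval_ok


# Illustrative values only (not taken from the paper).
h_ref = [1 + 1j, 0.5 - 0.2j]
h_legit = [1.01 * h for h in h_ref]      # small channel drift
h_spoof = [-1 + 0.3j, 0.1 + 0.9j]        # channel seen from another location

accept_legit = authenticate(h_legit, h_ref, 0.10, 0.10, threshold=0.05, interval_tol=0.02)
accept_spoof = authenticate(h_spoof, h_ref, 0.10, 0.10, threshold=0.05, interval_tol=0.02)
```

A lower threshold reduces the miss-detection rate at the cost of more false alarms, which is the trade-off the learned policy has to balance.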

Multi-Matrix Representative Ordered Statistics Decoding

WANG Yiwen, WANG Qianfan, LIANG Jifan, SONG Linqi, MA Xiao

2026, 48(1): 45-56. doi: 10.11999/JEIT250854

Abstract:
Objective Representative Ordered Statistics Decoding (ROSD) is a class of efficient decoding algorithms originally proposed for staircase matrix codes. ROSD supports parallel Gaussian Elimination (GE), enabling low-latency implementations. This paper extends ROSD to general linear block codes using the Minimum-Weight Staircase Generator Matrix (MWSGM) construction, which produces staircase-structured matrices for arbitrary linear codes. Based on this construction, a Multi-Matrix Representative Ordered Statistics Decoding (MM-ROSD) framework is proposed. MM-ROSD exploits the diversity of multiple candidate staircase matrices to improve decoding performance and reduce decoding complexity. For performance evaluation, a saddlepoint-approximation-based analytical framework is developed to predict the upper bound of the Frame Error Rate (FER) and to estimate the required average number of searches. Methods The proposed MM-ROSD algorithm consists of two main components. (1) Multi-matrix construction and selection strategy: In the construction phase, the first $M$ minimum-weight candidate codewords are retained as the first row, that is, the first staircase. For each candidate, the remaining rows are searched independently. This process generates $M$ staircase generator matrices with enhanced basis diversity. In the decoding phase, the optimal staircase matrix is selected according to the sum of reliabilities of the available re-encoding bases within each candidate matrix. ROSD is then applied to the selected staircase matrix. (2) Saddlepoint-based performance analysis: A saddlepoint approximation method is used to estimate the FER upper bound and the required average number of searches. This analysis provides guidance for complexity-performance trade-offs and parameter tuning. Results and Discussions Extensive simulations are performed over binary phase-shift keying modulated additive white Gaussian noise channels using 5G CA-polar codes $\mathcal{C}[128,64]$ concatenated with an 11-bit Cyclic Redundancy Check (CRC). The main results are summarized as follows. Accuracy of saddlepoint approximation: The predicted FER upper bound closely matches the simulation results over the entire signal-to-noise ratio range. It also tightly approaches both the maximum-likelihood lower bound and the random coding union bound. The estimated average number of searches agrees well with simulation results in the medium and high signal-to-noise ratio regions, validating the accuracy of the analytical framework. Effect of multi-matrix diversity: Increasing the number of pre-stored staircase matrices $M$ improves basis quality and decoding performance. For example, with $M\in\{1,2,8\}$ and a limited maximum number of searches $\ell_{\max}\in\{10^{4},10^{5},10^{6}\}$, the FER performance improves significantly and approaches the finite-length capacity and the ML lower bounds (Fig. 2(a)). Under a limited search list (e.g., $\ell_{\max}=10^{4}$), both the FER and the average number of searches decrease substantially as $M$ increases. This improvement mainly results from the higher quality of the re-encoding basis enabled by the multi-matrix strategy. Under larger search budgets (e.g., $\ell_{\max}=10^{6}$), increasing $M$ primarily reduces the average number of searches. Conclusions This work extends ROSD to general linear block codes and proposes an efficient MM-ROSD framework based on the MWSGM construction. By leveraging the diversity of multiple candidate staircase matrices and the low-latency property of parallel GE, the proposed approach improves decoding performance and reduces the average number of searches. The saddlepoint-based analytical framework accurately predicts both the FER and the average number of searches, providing theoretical support for practical system design. Simulation results show that, under identical maximum search constraints, MM-ROSD achieves notable FER gains and substantial reductions in the average number of searches compared with single-matrix ROSD. These results indicate that MM-ROSD is a promising decoding framework for short-block codes in ultra-reliable low-latency communication and hyper-reliable low-latency communication scenarios.
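
The matrix selection step in the decoding phase (pick the candidate whose re-encoding basis has the largest sum of reliabilities) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each pre-stored matrix contributes exactly one basis, given as a list of bit positions, and it uses the magnitude of the log-likelihood ratio as the reliability. The function name `select_matrix` and the sample values are hypothetical.

```python
def select_matrix(llrs, candidate_bases):
    """Return (index of the best candidate, list of all scores).

    llrs            : received log-likelihood ratios, one per code bit
    candidate_bases : one list of bit positions per pre-stored matrix
                      (assumption: a single re-encoding basis per matrix)
    """
    reliabilities = [abs(x) for x in llrs]
    scores = [sum(reliabilities[i] for i in basis) for basis in candidate_bases]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example with M = 3 candidate matrices for a length-6 code.
llrs = [2.5, -0.3, 1.1, -4.0, 0.2, 3.3]
bases = [[0, 1, 2], [0, 3, 5], [1, 2, 4]]
best, scores = select_matrix(llrs, bases)
```

A more reliable basis makes it more likely that the correct codeword is found early in the search, which is consistent with the reported drop in the average number of searches as $M$ grows.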

Power Allocation for Downlink Short Packet Transmission with Superimposed Pilots in Cell-free Massive MIMO

SHEN Luyao, ZHOU Xingguang, XU Zile, WANG Yihang, XIA Wenchao, ZHU Hongbo

2026, 48(1): 57-66. doi: 10.11999/JEIT250655

Abstract:
Objective With the advancement of 5th Generation mobile communication, the volume of communication service interactions increases rapidly. To meet this growth in demand, Cell-Free Massive Multiple-Input Multiple-Output (CF-mMIMO) is regarded as a key technology. Multi-user access in CF-mMIMO systems creates complexity in channel estimation. Conventional methods based on Regular Pilots (RP) generate high overhead, which reduces the number of symbols available for data transmission. This reduction lowers the transmission rate, and the effect is stronger in short packet transmission. This study examines a downlink short packet transmission scheme based on Superimposed Pilots (SP) in CF-mMIMO systems to improve short packet transmission performance. Methods This study examines an SP-based downlink short packet transmission scenario in CF-mMIMO systems and proposes a power allocation algorithm. Considering energy consumption and resource constraints in practical settings, a User-Centric (UC) approach is used. Based on the Maximum Ratio Transmission (MRT) precoding scheme, a closed-form expression for the downlink achievable rate is derived under imperfect Channel State Information (CSI). Because pilot signals and data signals create cross-interference, an iterative optimization algorithm based on Geometric Programming (GP) and Successive Convex Approximation (SCA) is developed. The objective is to optimize the power allocation between pilot signals and data signals under the minimum data rate requirement and uplink and downlink power constraints. Using logarithmic function approximation and SCA, the non-convex optimization problem is converted into a GP problem, then an iterative algorithm is designed to obtain the solution. This study also compares the SP scheme with the RP scheme to show the superiority of the SP scheme and the proposed algorithm. 
Results and Discussions Simulation results confirm the accuracy of the closed-form expressions for the downlink sum rate under both SP and RP schemes (Fig. 2). To assess the effectiveness of the proposed algorithm, a comparative analysis of weighted sum rate is conducted. The comparison considers the proposed power allocation algorithm under both the SP and RP schemes, as well as fixed power allocation under the SP scheme. The number of antennas of APs (Fig. 3), the number of UEs (Fig. 4), block length (Fig. 5), and decoding error probability (Fig. 6) are treated as variables. The results show that the weighted sum rate achieved with the proposed power allocation algorithm under the SP scheme is higher than that achieved with the RP scheme and the fixed power allocation scheme. Conclusions This paper investigates the downlink power allocation problem under the SP scheme in CF-mMIMO systems for short packet transmission. The UC scheme is adopted to derive a closed-form expression for the lower bound of the downlink transmission rate under imperfect CSI and MRT precoding. The downlink weighted sum-rate maximization problem for the SP scheme is then formulated, and the non-convex problem is converted into a solvable GP problem through the SCA method. An iterative algorithm is employed to obtain the solution. Simulation results confirm the correctness of the closed-form expression for the transmission rate and show the superiority of the proposed power allocation algorithm.
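
Block length and decoding error probability enter short-packet rate expressions through a finite-blocklength penalty. The abstract does not give the paper's closed-form expression, so the sketch below uses the standard normal approximation for a complex Gaussian channel as a stand-in: $R \approx \log_2(1+\gamma) - \sqrt{V/n}\,Q^{-1}(\varepsilon)\log_2 e$ with $V = 1-(1+\gamma)^{-2}$. The function name and the sample values are assumptions.

```python
import math
from statistics import NormalDist


def short_packet_rate(snr, blocklength, error_prob):
    """Approximate achievable rate (bits per channel use) at finite blocklength.

    snr         : linear signal-to-interference-plus-noise ratio (not dB)
    blocklength : number of channel uses carrying data
    error_prob  : target decoding error probability
    """
    capacity = math.log2(1.0 + snr)
    dispersion = 1.0 - 1.0 / (1.0 + snr) ** 2
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)   # inverse Gaussian Q-function
    penalty = math.sqrt(dispersion / blocklength) * q_inv * math.log2(math.e)
    return max(capacity - penalty, 0.0)


# Illustrative values: 10 dB SINR, error probability 1e-5.
rate_200 = short_packet_rate(10.0, 200, 1e-5)
rate_400 = short_packet_rate(10.0, 400, 1e-5)
```

The penalty grows as the number of data symbols shrinks, which is why the symbols consumed by regular pilots cost more in short packets than in long ones.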

Short-packet Covert Communication Design for Minimizing Age of Information under Non-ideal Channel Conditions

ZHU Kaiji, MA Ruiqian, LIN Zhi, MA Yue, WANG Yong, GUAN Xinrong, CAI Yueming

2026, 48(1): 67-77. doi: 10.11999/JEIT250836

Abstract:
Objective With the rapid development of mobile communication technologies and the widespread adoption of smart devices, the security and timeliness of information transmission are critical. Most existing studies on covert communication assume ideal channel conditions and long packet lengths, which are impractical for delay-sensitive applications. This paper addresses the problem of minimizing the average Covert Age of Information (CAoI) under non-ideal channel conditions caused by limited pilot symbols. The objective is to improve both timeliness and security in short-packet covert communication systems. Methods A system model is considered in which a transmitter sends short packets to a legitimate receiver under the surveillance of a warden. The effects of pilot length and transmit power on channel estimation error are characterized. Based on this analysis, closed-form expressions for the detection error probability and the average CAoI are derived. A joint optimization problem is then formulated to determine the optimal transmit power, total blocklength, and pilot-to-data ratio. This problem is solved using a golden-section search algorithm. Results and Discussions Numerical results show that an optimal total packet length and an optimal pilot-to-data ratio exist for minimizing the average CAoI (Fig. 3). The proposed joint optimization strategy significantly outperforms fixed-ratio schemes (Fig. 4). As the covertness constraint becomes stricter, the transmit power decreases, which requires longer pilot sequences to preserve channel estimation accuracy (Fig. 6(a)). The optimal total packet length is also shown to decrease as the covertness constraint is relaxed (Fig. 6(b)). Additionally, increasing the distance between the transmitter (Alice) and the legitimate receiver (Bob) degrades the average CAoI performance due to poorer channel conditions (Fig. 5). Conclusions This study optimizes the average CAoI in short-packet covert communication systems with imperfect channel estimation. 
Closed-form expressions for covertness and CAoI are obtained, and a golden-section search method is applied to dynamically adjust the packet structure to minimize the average CAoI. Numerical results confirm that the optimized design outperforms fixed-allocation methods. The results further show that stricter covertness constraints require longer pilot sequences to compensate for reduced transmit power, providing useful design guidance for latency-sensitive covert wireless systems.
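
The golden-section search named in this abstract is a generic one-dimensional minimizer for unimodal functions. The sketch below shows the method on a toy objective. The objective `toy_cost` is a hypothetical stand-in, not the paper's CAoI expression: it only mimics the trade-off in which too few pilot symbols hurt channel estimation and too many leave too little room for data.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-6):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0          # about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def toy_cost(pilot_fraction):
    """Hypothetical cost: estimation penalty plus data-shortage penalty."""
    return 1.0 / pilot_fraction + 1.0 / (1.0 - pilot_fraction)


best_fraction = golden_section_min(toy_cost, 0.01, 0.99)
```

Each iteration needs only one new function evaluation, which makes the method inexpensive when the objective is a closed-form expression.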

Optimization of Short Packet Communication Resources for UAV Assisted Power Inspection

CHU Hang, DONG Zhihao, CAO Jie, SHI Huaifeng, ZENG Haiyong, ZHU Xu

2026, 48(1): 78-85. doi: 10.11999/JEIT250852

Abstract:
Objective In Unmanned Aerial Vehicle (UAV)-assisted power grid inspection, the real-time acquisition and transmission of multi-modal data (key parameters, images, and videos) are essential for secure grid operation. These tasks require heterogeneous communication conditions, including ultra-reliable low-latency transmission and high-bandwidth data delivery. The limited wireless communication resources and UAV energy constraints restrict the ability to meet these conditions and reduce data timeliness and task performance. The present study is designed to establish a collaborative optimization framework for transmission scheduling and communication resource allocation, ensuring minimal system overhead while meeting task performance and reliability requirements. Methods To address the challenges mentioned above, a collaborative optimization framework is established for data transmission scheduling and communication resource allocation. Data transmission scheduling is formulated as a Markov Decision Process (MDP), in which communication consumption is incorporated into the decision cost. At the resource allocation level, Non-Orthogonal Multiple Access (NOMA) technology is applied to increase spectral efficiency. This approach reduces communication cost, maintains transmission reliability, and supports heterogeneous data transmission requirements in UAV-assisted power inspection. Results and Discussions The effectiveness of the proposed framework is verified through comprehensive simulations. A scenario is established in which the UAV is required to collect data from multiple distributed power towers within a designated area. A trade-off is observed between reliability and transmission speed (Fig. 3). At the same transmission rate, the bit error rate is reduced by approximately one order of magnitude. 
When a minimum long-packet signal-to-noise ratio threshold of 7 dB is applied, the optimized transmission system reduces the bit error rate from the $10^{-3}$ level to the $10^{-5}$ level while requiring only about a 0.4 Mbps decrease in transmission rate. After algorithm optimization, a lower effective signal-to-noise ratio is needed to achieve the same bit error rate; under the same signal-to-noise ratio, the short-packet error performance is improved, indicating more stable system behavior and higher transmission efficiency (Fig. 4). Conclusions This study presents a collaborative optimization framework that addresses the challenges posed by limited communication resources and heterogeneous data transmission requirements in UAV power inspection. By integrating MDP-based adaptive scheduling with NOMA-based joint resource allocation, the framework maintains an appropriate balance between communication performance and system overhead. The findings provide a theoretical and practical foundation for efficient, low-cost, and reliable data transmission in future intelligent autonomous aerial systems.

Low Complexity Sequential Decoding Algorithm of PAC Code for Short Packet Communication

DAI Jingxin, YIN Hang, WANG Yuhuan, LÜ Yansong, YANG Zhanxin, LÜ Rui, XIA Zhiping

2026, 48(1): 86-97. doi: 10.11999/JEIT250533

Abstract:
Objective With the rise of the intelligent Internet of Things (IoT), short packet communication among IoT devices must meet stringent requirements for low latency, high reliability, and very short packet length, posing challenges to the design of channel coding schemes. As an advanced variant of polar codes, Polarization-Adjusted Convolutional (PAC) codes enhance the error-correction performance of polar codes at medium and short code lengths, approaching the dispersion bound in some cases. This makes them promising for short packet communication. However, the high decoding complexity required to achieve near-bound error-correction performance limits their practicality. To address this, we propose two low complexity sequential decoding algorithms: Low Complexity Fano Sequential (LC-FS) and Low Complexity Stack (LC-S). Both algorithms effectively reduce decoding complexity with negligible loss in error-correction performance. Methods To reduce the decoding complexity of Fano-based sequential decoding algorithms, we propose the LC-FS algorithm. This method exploits special nodes to terminate decoding at intermediate levels of the decoding tree, thereby reducing the complexity of tree traversal. Special nodes are classified into two types according to decoder structure: low-rate nodes (Type-$\mathrm{T}$ node) and high-rate nodes [Rate-1 and Single Parity-Check (SPC) nodes]. This classification minimizes unnecessary hardware overhead by avoiding excessive subdivision of special nodes. For each type, a corresponding LC-FS decoder and node-movement strategy are developed. To reduce the complexity of stack-based decoding algorithms, we propose the LC-S algorithm. While preserving the low backtracking feature of stack-based decoding, this method introduces tailored decoding structures and node-movement strategies for low-rate and high-rate special nodes. Therefore, the LC-S algorithm achieves significant complexity reduction without compromising error-correction performance. Results and Discussions The performance of the proposed LC-FS and LC-S decoding algorithms is evaluated through extensive simulations in terms of Frame Error Rate (FER), Average Computational Complexity (ACC), Maximum Computational Complexity (MCC), and memory requirements. Traditional Fano sequential, traditional stack, and Fast Fano Sequential (FFS) decoding algorithms are set as benchmarks. The simulation results show that the LC-FS and LC-S algorithms exhibit negligible error-correction performance loss compared with traditional Fano sequential and stack decoders (Fig. 5). Across different PAC codes, both algorithms effectively reduce decoding complexity. Specifically, as $T$ increases, the reductions in ACC and MCC become more pronounced. For ACC, the LC-FS decoding algorithm ($T = 4$) achieves reductions of 13.77% ($N = 256$, $K = 128$), 11.42% ($N = 128$, $K = 64$), and 25.52% ($N = 64$, $K = 32$) on average compared with FFS (Fig. 6). The LC-S decoding algorithm ($T = 4$) reduces ACC by 56.48% ($N = 256$, $K = 128$), 47.63% ($N = 128$, $K = 64$), and 49.61% ($N = 64$, $K = 32$) on average compared with the traditional stack algorithm (Fig. 6). For MCC, the LC-FS decoding algorithm ($T = 4$) achieves reductions of 29.71% ($N = 256$, $K = 128$), 21.18% ($N = 128$, $K = 64$), and 23.62% ($N = 64$, $K = 32$) on average compared with FFS (Fig. 7). The LC-S decoding algorithm ($T = 4$) reduces MCC by 67.17% ($N = 256$, $K = 128$), 49.33% ($N = 128$, $K = 64$), and 51.84% ($N = 64$, $K = 32$) on average compared with the traditional stack algorithm (Fig. 7). By exploiting low-rate and high-rate special nodes to terminate decoding at intermediate levels of the decoding tree, the LC-FS and LC-S algorithms also reduce memory requirements (Table 2). However, as $T$ increases, the memory usage of LC-S rises because all extended paths of low-rate special nodes are pushed into the stack. The increase in $T$ enlarges the number of extended paths, indicating its critical role in balancing decoding complexity and memory occupation (Fig. 8). Conclusions To address the high decoding complexity of sequential decoding algorithms for PAC codes, this paper proposes two low complexity approaches: the LC-FS and LC-S algorithms. Both methods classify special nodes into low-rate and high-rate categories and design corresponding decoders and movement strategies. By introducing Type-$\mathrm{T}$ nodes, the algorithms further eliminate redundant computations during decoding, thereby reducing complexity. Simulation results demonstrate that the LC-FS and LC-S algorithms substantially decrease decoding complexity while maintaining the error-correction performance of PAC codes at medium and short code lengths.
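
Locating the high-rate special nodes in the decoding tree can be sketched as below. The sketch uses the definitions common in the fast polar decoding literature (Rate-1: every leaf carries information; SPC: only the first leaf is frozen) and assumes the same patterns apply to the PAC rate profile. The abstract does not define Type-$\mathrm{T}$ nodes, so they are not detected here and fall under "other". Function names and the sample rate profile are hypothetical.

```python
def classify_node(info_mask):
    """Classify one subtree from its leaves (True = information bit, False = frozen)."""
    if all(info_mask):
        return "Rate-1"
    if len(info_mask) > 1 and not info_mask[0] and all(info_mask[1:]):
        return "SPC"
    return "other"


def find_special_nodes(info_mask, start=0):
    """Walk the binary tree top-down and stop at the first special node on each branch.

    Returns (start index, size, kind) tuples; assumes len(info_mask) is a power of two.
    """
    kind = classify_node(info_mask)
    if kind != "other" or len(info_mask) == 1:
        return [(start, len(info_mask), kind)]
    half = len(info_mask) // 2
    return (find_special_nodes(info_mask[:half], start)
            + find_special_nodes(info_mask[half:], start + half))


# Toy rate profile for a length-8 code with 4 information bits.
profile = [False, False, False, True, False, True, True, True]
nodes = find_special_nodes(profile)
```

Stopping at a special node replaces the traversal of its whole subtree with one dedicated decoding step, which is the source of the complexity savings described in the abstract.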

Research on GFRA Preamble Design and Active Device Detection Technology for Short-Packet Communication in LEO Satellite IoT

DAI Jianmei, ZHANG Mengchen, LI Keying, SU Qi, CHENG Ying, WANG Xianpeng, XU Rong

2026, 48(1): 98-106. doi: 10.11999/JEIT250609

Abstract:
Objective This study addresses preamble collision and high detection complexity in massive device random access for Low-Earth Orbit Satellite Internet of Things (LEO-IoT) short-packet communication. It aims to overcome the limitations of traditional random access schemes in preamble pool capacity and detection efficiency, thereby enabling highly reliable access for massive devices. Methods A Grant-Free Random Access (GFRA) scheme is adopted, and a three-pilot superimposed preamble structure with a cyclic prefix is constructed. The proposed preamble structure preserves time-frequency resource efficiency and further expands the pilot code pool capacity. To satisfy the detection requirements of superimposed preambles, a dynamic detection algorithm based on idle preamble search is proposed. This algorithm reduces computational complexity and improves detection accuracy. Results and Discussions Under the GFRA mode, a three-pilot superimposed preamble structure with a cyclic prefix is constructed (Fig. 3). The pilot code pool capacity is increased to 3.2 times that of traditional schemes, whereas time-frequency resource efficiency is maintained (Fig. 4, Fig. 5, Fig. 6). For superimposed preamble detection, a dynamic detection algorithm based on idle preamble search is proposed (Algorithm 1). Compared with the traditional exhaustive search method, the proposed algorithm reduces computational complexity to 18.7% of the original scheme while maintaining a detection accuracy of 99.5% (Fig. 7). Theoretical analysis shows that the proposed scheme achieves a Signal-to-Interference-plus-Noise Ratio (SINR) gain of 3.8 dB at a Bit Error Rate (BER) of $10^{-5}$. Simulation results indicate that the miss detection rate remains below 2% when the device activation rate exceeds 80% (Fig. 10). Compared with compressed sensing methods, the proposed algorithm provides a more favorable balance between detection accuracy and computational complexity. 
Its polynomial-level complexity improves practicality for real LEO-IoT systems (Fig. 13, Fig. 14). Conclusions The proposed superimposed preamble structure and dynamic detection algorithm effectively mitigate preamble collision, significantly reduce detection complexity, and achieve a clear SINR gain with a low miss detection rate. The scheme shows strong performance and robustness under high-load and asynchronous LEO-IoT access conditions, supporting its suitability for practical deployment.

IRS Deployment for Highly Time Sensitive Short Packet Communications: Distributed or Centralized Deployment?

ZHANG Yangyi, GUAN Xinrong, YANG Weiwei, CAO Kuo, WANG Meng, CAI Yueming

2026, 48(1): 107-115. doi: 10.11999/JEIT250720

[Abstract](228) [FullText HTML] (79) [PDF 2610KB](21)

Abstract:
Objective The rapid advancement of the Industrial Internet of Things (IIoT) creates latency-sensitive applications such as environmental monitoring and precision control, which depend on short-packet communications and require strict timeliness of information delivery. An Intelligent Reflecting Surface (IRS) is regarded as a feasible method to enhance the reliability and timeliness of these communications because its reflection coefficients can be dynamically adjusted. Previous work has mainly focused on optimizing the phase shifts of IRS elements, whereas the potential gains associated with flexible IRS deployment have not been fully examined. Adjusting the physical placement of IRS units provides additional degrees of freedom that can improve timeliness performance. Two representative deployment strategies, distributed IRS and centralized IRS, form different effective channels and result in different capacity characteristics. This study investigates and compares these deployment modes in IRS-assisted short-packet communication systems. By assessing their Age of Information (AoI) performance under practical channel estimation overheads, the analysis offers guidance on selecting deployment strategies that achieve superior timeliness under diverse system conditions. Methods The paper investigates an IRS-assisted short-packet communication system in which multiple terminal devices transmit short packets to an Access Point (AP) through IRS reflection. Two deployment strategies are considered: distributed and centralized IRS. In the distributed scheme, each device is supported by a dedicated IRS with M reflecting elements positioned nearby. In the centralized scheme, all IRS elements are placed near the AP. The average AoI is used as the performance metric to compare the timeliness of these strategies. The complex distribution of the composite channel gain makes closed-form average AoI analysis difficult. 
To address this issue, the Moment Matching (MM) approximation is employed to estimate the distribution of the composite channel gain. By incorporating pilot overhead into the analytical model, closed-form expressions for the average AoI of both deployment schemes are obtained, enabling a thorough performance comparison. Results and Discussions Simulation results show that the AoI performance of distributed and centralized IRS deployments differs under varying system conditions. When the IRS carries a large number of reflecting elements, the distributed configuration yields better AoI performance (Fig. 4). Under high transmission power, the centralized configuration presents improved AoI performance (Fig. 5). For scenarios with long AP-device distances, the distributed deployment produces more favorable AoI results (Fig. 6). As the system bandwidth increases, the centralized architecture shows a rapid decrease in AoI and eventually performs better than the distributed configuration (Fig. 7). Conclusions This study provides a comparative analysis of timeliness performance in IRS-assisted short-packet communication systems under distributed and centralized deployment strategies. The MM method is employed to approximate the composite channel gain with a gamma distribution, which supports the derivation of an approximate expression for the average packet error rate. A closed-form expression for the average AoI is then developed by accounting for channel estimation overhead. Simulation results show that the two deployment strategies exhibit different AoI advantages under varying operating conditions. The distributed configuration achieves better AoI performance when a large number of reflecting elements is used or when the AP-device distance is long. The centralized configuration provides improved AoI performance under high transmission power or wide system bandwidth.
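
The Moment Matching step described above can be illustrated with a minimal sketch. It fits a gamma distribution to a positive random quantity by equating the first two moments, using the standard relations shape = mean²/variance and scale = variance/mean. The function names and the toy cascaded-channel model (`toy_composite_gain`, with 16 elements and ideal phase alignment) are assumptions made for illustration only and are not taken from the paper.

```python
import math
import random


def gamma_moment_match(mean, var):
    """Return (shape, scale) of the gamma distribution with the given mean and variance."""
    if mean <= 0 or var <= 0:
        raise ValueError("mean and variance must be positive")
    shape = mean * mean / var   # k = E[X]^2 / Var[X]
    scale = var / mean          # theta = Var[X] / E[X]
    return shape, scale


def sample_mean_var(samples):
    """Sample mean and unbiased sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, var


def toy_composite_gain(num_elements, rng):
    """Hypothetical cascaded gain: squared sum of products of two Rayleigh magnitudes."""
    sigma = math.sqrt(0.5)  # each complex coefficient has unit average power
    total = 0.0
    for _ in range(num_elements):
        h = math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        g = math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        total += h * g  # assumes ideal phase alignment at every element
    return total ** 2


rng = random.Random(0)  # fixed seed so the run is repeatable
gains = [toy_composite_gain(16, rng) for _ in range(5000)]
mean, var = sample_mean_var(gains)
shape, scale = gamma_moment_match(mean, var)
print(f"fitted gamma: shape={shape:.3f}, scale={scale:.3f}")
```

By construction the fitted gamma distribution reproduces the sample mean and variance exactly; how well it tracks the tails of the true composite gain depends on the channel model and would need to be checked separately.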

Group-based Sparse Vector Codes for Short-Packet Communications

ZHANG Xuewan, ZHANG Di, GU Bo

2026, 48(1): 116-125. doi: 10.11999/JEIT251143

[Abstract](172) [FullText HTML] (68) [PDF 4040KB](13)

Abstract:
Objective Sparse Vector Codes (SVC) aim to construct sparse underdetermined linear systems and have attracted wide interest for short-packet Ultra-Reliable and Low-Latency Communications (URLLC) because of their simple implementation and reliable transmission. To guarantee system performance, short sparse vectors that can be transmitted using small-size random spreading codebooks are required. However, most existing sparse transformation schemes based on index modulation adopt a global selection strategy, where nonzero positions, to which transmission bits are mapped, are selected directly from the entire set of available positional resources in the sparse vector. Under high coding efficiency requirements, this strategy often leads to excessively long sparse vectors and a sharp degradation in transmission performance. To address this issue, a Group-based Sparse Vector Code (GSVC) scheme is proposed. Unlike the conventional global sparse mapping approach, GSVC divides index bits into groups and sequentially determines the nonzero positions for each group within a predefined sparse vector. This design enables positional resource sharing among all groups and generates compressed sparse vectors with higher positional resource utilization, thereby achieving better Block Error Rate (BLER) performance than conventional SVC schemes. Methods The proposed GSVC scheme partitions the total number of nonzero positions N into V groups. Within a single predefined sparse vector, each group sequentially selects its N/V nonzero positions through index modulation. To prevent position selection conflicts among groups, a resource supplementation and elimination mechanism is applied. This mechanism ensures that the selected positions are mutually exclusive and that each group maintains the same number of available positional resources throughout the selection process. Given the sparsity of the constructed vector, a low-complexity sparse recovery algorithm is employed at the receiver. 
Accordingly, a GSVC decoder based on the Multipath Matching Pursuit (MMP) algorithm is designed. To enable accurate identification of the group affiliation associated with each nonzero position, GSVC adopts a multi-constellation mapping strategy for the nonzero elements. The receiver performs constellation matching by exploiting the unique characteristics of each constellation, thereby determining group affiliation and ensuring a high probability of successful decoding. Results and Discussions By enabling different groups to share positional resources through group-based nonzero position selection, GSVC effectively compresses the sparse vector and improves transmission reliability. Simulation results show that the GSVC decoder based on MMP significantly outperforms the decoder based on the Orthogonal Approximate Message Passing (OAMP) algorithm (Fig. 3). At lower modulation orders, GSVC achieves better BLER performance than existing schemes, including enhanced SVC, multi-rotation constellation-based SVC, and index-redefined SVC (Fig. 4 and Fig. 5). When the number of Orthogonal Frequency Division Multiplexing (OFDM) subcarriers is large, GSVC provides the best BLER performance among all compared schemes (Fig. 6). In addition, for a fixed number of nonzero entries per group, the BLER performance advantage of GSVC increases as the number of groups increases. A performance gain exceeding 1 dB over the second-best SVC scheme is observed at a BLER of 10^–5 (Fig. 7). Compared with polar codes (Fig. 8), GSVC achieves better BLER performance without Cyclic Redundancy Check (CRC) assistance and even outperforms CRC-aided polar codes. Conclusions This paper proposes a GSVC scheme to address the excessive sparse vector length encountered in conventional index modulation-based SVC systems. 
The central feature of GSVC is a grouped nonzero position selection mechanism that enables multiple groups to share positional resources within a predefined sparse vector, thereby reducing the overall vector length. A dedicated multi-constellation mapping design, together with well-defined resource allocation rules, ensures conflict-free and efficient utilization of positional resources. Simulation results demonstrate that (1) the GSVC decoder implemented using MMP significantly outperforms decoders based on the OAMP algorithm; (2) GSVC achieves superior BLER performance compared with enhanced SVC, multi-rotation constellation-based SVC, and index-redefined SVC schemes, particularly at lower modulation orders and with a large number of OFDM subcarriers; and (3) GSVC surpasses the BLER performance of CRC-aided polar codes without requiring CRC. Future work will focus on optimizing the grouping strategy and examining the transmission performance of SVC under imperfect channel estimation to improve robustness in practical communication systems.

Performance Analysis of Double RIS-Assisted Multi-Antenna Cooperative NOMA with Short-Packet Communication

SONG Wenbin, CHEN Dechuan, ZHANG Xingang, WANG Zhipeng, SUN Xiaolin, WANG Baoping

2026, 48(1): 126-134. doi: 10.11999/JEIT250761

[Abstract](236) [FullText HTML] (103) [PDF 1278KB](30)

Abstract:
Objective Existing studies on short-packet communication systems usually assume ideal transceiver hardware, although actual radio-frequency devices experience hardware impairments such as phase noise and amplifier nonlinearity. These impairments are more evident in short-packet communication because low-cost components are commonly used. The reliable performance of Reconfigurable Intelligent Surface (RIS)-assisted Multi-Antenna Cooperative Non-Orthogonal Multiple Access (NOMA) short-packet communication systems under hardware impairments has not been investigated. Furthermore, the impact of the number of Base Station (BS) antennas and RIS reflecting elements on reliable performance remains unclear. Therefore, this study examines reliable performance for a double RIS-assisted Multi-Antenna Cooperative NOMA short-packet communication system in which one RIS supports communication between a Multi-Antenna BS and a near user, and the other RIS strengthens communication between the near user and a far user. Methods Based on finite-blocklength information theory, closed-form expressions for the average Block Error Rate (BLER) of the near user and far user are derived under the optimal antenna-selection strategy. These expressions provide an efficient and convenient way to assess system reliability. The effective throughput is then formulated, and the optimal blocklength that maximizes this throughput under reliability and latency constraints is obtained. Results and Discussions The theoretical average BLER matches the Monte Carlo simulation results, confirming the validity of the derivations. The average BLER of the near user and far user decreases as the transmit Signal-to-Noise Ratio (SNR) increases. For a given transmit SNR, increasing the blocklength markedly reduces the average BLER for both users (Fig. 2) because longer blocklengths lower the transmission rate, which enhances system reliability. 
The double RIS-assisted transmission scheme achieves superior performance compared with the single RIS-assisted and non-RIS-assisted schemes (Fig. 3). As the number of RIS reflecting elements increases, the performance advantage of the proposed scheme becomes more evident. The average BLER of the far user saturates as the number of BS antennas increases (Fig. 4) because the relaying link becomes the dominant reliability bottleneck once the BS antenna count exceeds a certain value. As the blocklength increases, the effective throughput first reaches a maximum and then decreases (Fig. 5). When the blocklength is too small, higher BLER results in poor effective throughput. When the blocklength is too large, the reduced transmission rate also leads to poor effective throughput. As hardware quality improves, the optimal blocklength decreases because lower hardware impairments reduce decoding errors, allowing shorter blocklengths to be used to reduce latency while maintaining required reliability. Conclusions This paper investigates the performance of a double RIS-assisted Multi-Antenna Cooperative NOMA short-packet communication system under hardware impairments. Closed-form expressions for the average BLER of the near user and far user are derived under the optimal antenna-selection strategy. The effective throughput is analyzed, and the optimal blocklength that maximizes this throughput under reliability and latency constraints is determined. Simulation results show that the double RIS-assisted transmission scheme achieves superior performance compared with the single RIS-assisted and non-RIS-assisted schemes. Increasing the number of BS antennas does not always improve the average BLER of the far user because the relaying link becomes the limiting factor. Improved hardware quality enhances short-packet communication efficiency by reducing the optimal blocklength. 
Future work will explore RIS-configuration strategies that maximize energy efficiency and ensure user fairness in NOMA to support energy-constrained IoT devices.
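
The blocklength trade-off described in this abstract can be illustrated with a minimal sketch based on the widely used normal approximation from finite-blocklength information theory for a single AWGN link. It does not reproduce the paper's closed-form average BLER, which also accounts for fading, RIS reflection, NOMA, and hardware impairments. The function names, the 0 dB SNR, the 80-bit payload, and the definition of effective throughput as rate times (1 − BLER) are assumptions made for illustration.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bler_normal_approx(snr_linear, blocklength, info_bits):
    """Normal approximation of the block error rate on an AWGN link."""
    capacity = math.log2(1.0 + snr_linear)                          # bits per channel use
    dispersion = (1.0 - (1.0 + snr_linear) ** -2) * math.log2(math.e) ** 2
    rate = info_bits / blocklength                                  # bits per channel use
    return q_function((capacity - rate) * math.sqrt(blocklength / dispersion))


def effective_throughput(snr_linear, blocklength, info_bits):
    """Assumed metric: rate of correctly delivered bits per channel use."""
    rate = info_bits / blocklength
    return rate * (1.0 - bler_normal_approx(snr_linear, blocklength, info_bits))


snr = 1.0        # 0 dB, hypothetical operating point
payload = 80     # information bits per packet, hypothetical

# A longer block lowers the rate and hence the BLER for a fixed payload.
for n in (80, 100, 200):
    print(n, bler_normal_approx(snr, n, payload))

# Throughput is low for very short blocks (high BLER) and for very long
# blocks (low rate), so a finite maximizer exists in between.
best_n = max(range(60, 401), key=lambda n: effective_throughput(snr, n, payload))
print("blocklength maximizing effective throughput:", best_n)
```

In the paper's setting the search would additionally be restricted by the stated reliability and latency constraints, which this sketch omits.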

Short Packet Secure Covert Communication Design and Optimization

TIAN Bo, YANG Weiwei, SHA Li, SHANG Zhihui, CAO Kuo, LIU Changming

2026, 48(1): 135-144. doi: 10.11999/JEIT250800

[Abstract](271) [FullText HTML] (152) [PDF 2245KB](40)

Abstract:
Objective The study addresses the dual security threats of eavesdropping and detection in Multiple-Input Single-Output (MISO) communication systems under short packet transmission conditions. An integrated secure and covert transmission scheme is proposed, combining physical layer security with covert communication techniques. The approach aims to overcome the limitations of conventional encryption in short packet scenarios, enhance communication concealment, and ensure information confidentiality. The optimization objective is to maximize the Average Effective Secrecy and Covert Rate (AESCR) through the joint optimization of packet length and transmit power, thereby providing robust security for low-latency Internet of Things (IoT) applications. Methods An MISO system model employing MRT beamforming is adopted to exploit spatial degrees of freedom for improved security. Through theoretical analysis, closed-form expressions are derived for the warden’s (Willie’s) optimal detection threshold and minimum detection error probability. A statistical covertness constraint based on Kullback-Leibler (KL) divergence is formulated to convert intractable instantaneous requirements into a tractable average constraint. A new performance metric, the AESCR, is proposed to comprehensively assess system performance in terms of covertness, secrecy, and reliability. The optimization strategy centers on the joint design of packet length and transmit power. By utilizing the inherent coupling between these variables, the original dual-variable maximization problem is reformulated into a tractable form solvable through an efficient one-dimensional search. Results and Discussions Simulation results confirm the theoretical analysis, showing close consistency between the derived expressions and Monte Carlo simulations for Willie’s detection error probability. 
The findings indicate that multi-antenna configurations markedly enhance the AESCR by directing signal energy toward the legitimate receiver and reducing eavesdropping risk. The proposed joint optimization of transmit power and packet length achieves a substantially higher AESCR than power-only optimization, particularly under stringent covertness constraints. The study further reveals key trade-offs: an optimal packet length exists that balances coding gain and exposure risk, while relaxed covertness constraints yield continuous improvements in AESCR. Moreover, multi-antenna technology is shown to be crucial for mitigating the inherent low-power limitations of covert communication. Conclusions This study presents an integrated framework for secure and covert communication in short packet MISO systems, achieving notable performance gains through the joint optimization of transmit power and packet length. The main contributions include: (1) a transmission architecture that combines security and covertness, supported by closed-form solutions for the warden’s detection threshold and error probability under a KL divergence-based constraint; (2) the introduction of the AESCR metric, which unifies the assessment of secrecy, covertness, and reliability; and (3) the formulation and efficient resolution of the AESCR maximization problem. Simulation results verify that the proposed joint optimization strategy exceeds power-only optimization, particularly under stringent covertness conditions. The AESCR increases monotonically with the number of transmit antennas, and an optimal packet length is identified that balances transmission efficiency and covertness.

Age of Information for Energy Harvesting-Driven LoRa Short-Packet Communication Networks

XIAO Shuyu, SUN Xinghua, YUAN Anshan, ZHAN Wen, CHEN Xiang

2026, 48(1): 145-156. doi: 10.11999/JEIT250814

[Abstract](239) [FullText HTML] (115) [PDF 5386KB](10)

Abstract:
Objective In short-packet communication scenarios for the Industrial Internet of Things (IIoT), devices operate under stringent energy constraints, whereas certain applications require timely data delivery, which makes real-time performance difficult to guarantee. To address this issue, this study analyzes information freshness in Energy Harvesting (EH) networks and examines the effects of energy storage capacity, random access strategies, and packet block length on the Age of Information (AoI). The objective is to provide effective optimization guidelines for the design of practical IIoT communication systems. Methods An accurate system model is established based on short-packet communication theory, random access mechanisms, and EH models. The charging and discharging processes of the energy queue are characterized as a Markov chain, from which the steady-state distribution of energy states is derived, followed by a general expression for the average AoI. A mathematical optimization problem is then formulated to minimize the average AoI. To improve practical applicability, two extreme battery-capacity scenarios are considered. For the minimum battery capacity case, a closed-form analytical solution for the optimal packet generation probability is obtained. For the ideal infinite battery capacity case, the packet generation probability and packet block length are jointly optimized, yielding closed-form optimal solutions for both parameters. Extensive simulations are conducted to evaluate the average AoI under different network parameter settings and to verify the effectiveness of the proposed optimization strategies. Results and Discussions An analytical expression for the average AoI is derived, and its optimization is investigated under two extreme battery-capacity conditions. For the minimum battery capacity case, the optimal packet generation probability balances update frequency and channel collision (Fig. 5). 
As the network size increases, the optimal packet generation probability decreases, which significantly improves the average AoI (Theorem 1; Fig. 6). For the ideal infinite battery capacity case, both packet block length and packet generation probability affect the average AoI (Fig. 7). With a fixed packet generation probability, optimizing the packet block length reduces the AoI, which indicates the existence of an optimal block length that balances transmission reliability and energy consumption. When the packet block length is fixed, a low packet generation probability leads to infrequent updates and increased delay, whereas a high probability increases collision in the Energy-Sufficient Regime (ESR) but enables more efficient utilization of energy and channel resources in the Energy-Limited Regime (ELR). Joint optimization of the packet block length and packet generation probability is consistent with the solution obtained via exhaustive search (Theorem 2; Fig. 8). The optimal packet block length increases with network size. In the ELR, the optimal packet generation probability remains equal to one, whereas in the ESR it decreases with network size to balance update frequency and collision risk (Fig. 9, Fig. 10). In addition, the average AoI varies with the energy arrival rate, which reveals the effects of battery capacity and packet generation probability on overall system performance (Fig. 11). Conclusions For the minimum battery capacity case, the average AoI is minimized when the packet generation probability is set to its theoretical optimal value. Under ideal infinite battery capacity, both the packet generation probability and the packet block length must be jointly configured to their respective theoretical optimal values to achieve the minimum average AoI. Theoretical analysis shows that the selection of the optimal packet block length requires a trade-off between decoding error probability and energy consumption. 
In the ELR, when the packet block length is preconfigured to its optimal value, an energy buffer supporting a single transmission is sufficient, which allows network nodes to adapt effectively to external energy supply limitations. Network nodes should actively access the channel to fully utilize harvested energy and maintain timely information updates, thereby achieving the optimal average AoI. In contrast, under abundant energy conditions or in large-scale networks, network nodes should adjust the packet generation probability to balance channel collision and update frequency. Simulation results further confirm the proposed optimization strategy and demonstrate that the optimized LoRa network significantly improves information timeliness, which provides theoretical guidance for the design of low-power short-packet communication systems.
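
The balance between update frequency and channel collision can be illustrated with a deliberately simplified sketch. It assumes a slotted collision channel in which each of `num_nodes` nodes transmits a fresh update with probability `q` in every slot, a transmission succeeds only if no other node transmits, and energy harvesting and decoding errors are ignored. This is not the paper's model, and the names below are hypothetical. Under these assumptions a node succeeds with probability q(1 − q)^(n−1) per slot, and the age, reset to one slot on each success, has a time-average of the reciprocal of that probability.

```python
import math


def success_probability(q, num_nodes):
    """Probability that a given node transmits alone in a slot."""
    return q * (1.0 - q) ** (num_nodes - 1)


def average_aoi(q, num_nodes):
    """Time-average age in slots when the age resets to 1 on each success."""
    p = success_probability(q, num_nodes)
    return math.inf if p == 0.0 else 1.0 / p


def best_generation_probability(num_nodes, grid=1000):
    """Grid search for the generation probability that minimizes the average age."""
    candidates = [i / grid for i in range(1, grid + 1)]
    return min(candidates, key=lambda q: average_aoi(q, num_nodes))


for n in (5, 10, 50):
    q_star = best_generation_probability(n)
    print(n, q_star, average_aoi(q_star, n))
```

In this toy model the minimizer is q = 1/n, so the best generation probability shrinks as the network grows, which is consistent with the qualitative trend reported for the energy-sufficient case. The closed-form optima in the paper additionally depend on battery capacity, energy arrival rate, and packet block length.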

Coalition Formation Game based User and Networking Method for Status Update Satellite Internet of Things

GAO Zhixiang, LIU Aijun, HAN Chen, ZHANG Senbai, LIN Xin

2026, 48(1): 157-167. doi: 10.11999/JEIT250838

[Abstract](175) [FullText HTML] (151) [PDF 3002KB](15)

Abstract:
Objective Satellite communication has become a major focus in the development of next-generation wireless networks due to its advantages of wide coverage, long communication distance, and high flexibility in networking. Short-packet communication represents a critical scenario in the Satellite Internet of Things (S-IoT). However, research on the status update problem for massive users remains limited. It is necessary to design reasonable user-networking schemes to address the contradiction between massive user access demands and limited communication resources. In addition, under the condition of large-scale user access, the design of user-networking schemes with low complexity remains a key research challenge. This study presents a solution for status updates in S-IoT based on dynamic orthogonal access for massive users. Methods In the S-IoT, a state update model for user orthogonal dual-layer access is established. A dual-layer networking scheme is proposed in which users dynamically allocate bandwidth to access the base station, and the base station adopts time-slot polling to access the satellite. The closed-form expression of the average Age of Information (aAoI) for users is derived based on short-packet communication theory, and a simplified approximate expression is further obtained under high signal-to-noise ratio conditions. Subsequently, a distributed Dual-layer Coalition Formation Game User-base Station-Satellite Networking (DCFGUSSN) algorithm is proposed based on the coalition formation game framework. Results and Discussions The approximate aAoI expression effectively reduces computational complexity. The exact potential game is used to demonstrate that the proposed DCFGUSSN algorithm achieves stable networking formation. Simulation results verify the correctness of the theoretical analysis of user aAoI in the proposed state update model (Fig. 5). 
The results further indicate that with an increasing number of iterations, the user’s aAoI gradually decreases and eventually converges (Fig. 6). Compared with other access schemes, the proposed dual-layer access scheme achieves a lower aAoI (Figs. 7–9). Conclusions This study investigates the networking problem of massive users assisted by base stations in the status update S-IoT. A dynamic dual-layer user access framework and the corresponding status update model are first established. Based on this framework, the DCFGUSSN algorithm is proposed to reduce user’s aAoI. Theoretical and simulation results show strong consistency, and the proposed algorithm demonstrates significant performance improvement compared with traditional algorithms.

A Review of Compressed Sensing Technology for Efficient Receiving and Processing of Communication Signal

CHENG Yiting, DONG Tao, SU Yuwei, WEN Xiaojie, YANG Taojun, LI Yibo

2026, 48(1): 168-182. doi: 10.11999/JEIT250855

[Abstract](309) [FullText HTML] (164) [PDF 8960KB](56)

Abstract:
Significance (1) Lower data acquisition and storage costs: By exploiting signal sparsity and designing effective dictionary and measurement matrices, compressed sensing enables reconstruction below the Nyquist sampling rate, making it suitable for resource-constrained environments; (2) Smaller pilot overhead: With sparse prior information and optimized observation design, compressed sensing reduces pilot overhead compared with traditional schemes. This saving releases spectrum resources and improves transmission efficiency; (3) Higher signal processing efficiency: Compressed sensing enhances channel estimation performance by approximately 3–5 dB under the same data volume and achieves linear computational complexity, which is markedly lower than that of conventional super-linear approaches. Progress Between 2006 and 2009, compressed sensing progressed rapidly. Candès established the theoretical basis by converting zero-norm sparsity into a convex one-norm formulation under the Restricted Isometry Property (RIP). Aharon et al. then introduced dictionary matrices to strengthen sparse representation, and Needell et al. applied greedy algorithms to speed up reconstruction. From 2010 to 2020, research shifted toward engineering application and algorithm refinement. Wu et al. proposed more robust recovery strategies to improve adaptability, and Zayyani et al. later advanced AI-based dictionary learning. Since 2020, compressed sensing has integrated with deep learning for data-driven sparse modelling and reconstruction. Liu’s work in Integrated Sensing-And-Communication (ISAC) systems demonstrates this trend and supports deployment in next-generation communication networks. Conclusion This paper reviews compressed sensing for efficient receiving and processing of communication signals across three dimensions: current progress, key technical challenges, and future directions. It highlights three main research pathways, including dictionary matrix design, measurement matrix development, and reconstruction strategies. The review also shows that compressed sensing is moving toward greater adaptiveness, lightweight design, and intelligence. Current challenges are also summarized, including high computational cost, limited adaptability, and reduced performance under non-ideal conditions. These observations provide guidance for further study. Prospects (1) Research on relaxed sparse condition: Existing sparsity assumptions remain strict and constrain the use of compressed sensing in high-dimensional or non-stationary scenarios where ideal sparse representations are difficult to obtain. 
Loosening sparse requirements is therefore essential. Present work explores adaptive dictionary learning, structured sparse priors, and neural-network-driven relaxation, yet issues persist, such as dependence on prior assumptions, insufficient interpretability, and lack of theoretical convergence. Future work may refine optimization objectives, develop neural models with clear mathematical interpretation, and establish sparse representation methods that do not rely on rigid sparsity priors. (2) Research on algorithm complexity: Further complexity reduction is required in non-stationary time-varying channels, high-dimensional processing, and long-sequence reconstruction. Promising directions include pre-trained dictionary models, deep-learning-based structured measurement matrices, and robust deep reconstruction networks. (3) Research on algorithm adaptability: Practical systems face noise, spectrum fragmentation, fading, and multipath propagation, with stronger effects in cognitive radio and integrated sensing applications. Adaptive strategies should therefore be prioritized. Possible solutions include dynamic sliding-window modelling or optimized regularization for adaptive dictionaries, structured measurement matrices with tunable parameters, and semi-supervised reconstruction algorithms. (4) Research on non-cooperative user detection: Spectrum scarcity heightens the need for efficient sensing to manage uncoordinated users and prevent high-frequency occupancy. Future research may integrate deep learning with statistical models or embed time-frequency information in online dictionary learning to enhance generalization. Multi-objective design of adaptive measurement matrices may further support reliable detection of non-cooperative users.

Optimization of Energy Consumption in Semantic Communication Networks for Image Recovery Tasks

CHEN Yang, MA Huan, JI Zhi, LI Yingqi, LIANG Jiayu, GUO Lan

2026, 48(1): 183-190. doi: 10.11999/JEIT250915

[Abstract](186) [FullText HTML] (96) [PDF 4909KB](13)

Abstract:
Objective With the rapid development of semantic communication and the increasing demand for high-fidelity image recovery, high computational and transmission energy consumption remains a key factor limiting network deployment. Existing resource management strategies are largely static and show limited adaptability to dynamic wireless environments and user mobility. To address these issues, a robust energy optimization strategy driven by a modified Multi-Agent Proximal Policy Optimization (MAPPO) algorithm is proposed. By jointly optimizing communication and computing resources, the total network energy consumption is minimized while strictly satisfying multi-dimensional constraints, including latency and image recovery quality. Methods First, a theoretical model of the semantic communication network is constructed, and a closed-form expression for the user Symbol Error Rate (SER) is derived through asymptotic analysis of the uplink Signal-to-Interference-plus-Noise Ratio (SINR). Subsequently, the coupling relationships among semantic extraction rate, transmit power, computing resources, and network energy consumption are quantified. On this basis, a joint optimization model is formulated to minimize total energy consumption under constraints of delay, accuracy, and reliability. To solve this mixed-integer nonlinear programming problem, a modified MAPPO algorithm is designed. The algorithm integrates Long Short-Term Memory (LSTM) networks to capture temporal dynamics of user positions and channel states, and introduces a noise mechanism into the global state and advantage function to improve policy exploration and robustness. Results and Discussions Simulation results show that the proposed algorithm consistently outperforms baseline methods, including standard MAPPO, NOISE-MAPPO, LSTM-MAPPO, MADDPG, and greedy algorithms. The proposed strategy accelerates training convergence by 66.7% to 80% relative to the benchmarks. 
In dynamic environments, network energy consumption stability is improved by approximately 50%, and user latency stability is enhanced by more than 96%. Additionally, the average SER is reduced by 4% to 16.33% without degrading final image recovery performance, demonstrating an effective balance between energy efficiency and task reliability. Conclusions This study addresses energy optimization in semantic communication networks by combining theoretical modeling with a modified deep reinforcement learning framework. The proposed decision-making approach enhances the standard MAPPO algorithm through LSTM-based temporal feature extraction and noise-assisted robust exploration. Simulation results in dynamic single-cell and multi-cell scenarios show that the method improves convergence efficiency and system stability, and achieves a favorable trade-off between energy consumption and service quality. These results provide a theoretical basis and an efficient resource management framework for future energy-constrained semantic communication systems.

A Sparse-Reconstruction-Based Fast Localization Algorithm for Mixed Far-Field and Near-Field Sources

FU Shijian, QIU Longhao, LIANG Guolong

2026, 48(1): 191-201. doi: 10.11999/JEIT250165

[Abstract](368) [FullText HTML] (150) [PDF 3920KB](27)

Abstract:
Objective Source localization is a key research topic in array signal processing, with applications in radar, sonar, and wireless communications. Conventional localization methods based solely on far-field or near-field models face clear limitations when separating and localizing mixed far-field and near-field sources. Existing approaches, such as subspace-based methods, often show high computational complexity, limited localization accuracy, and degraded performance under low Signal-to-Noise Ratio (SNR) conditions. In addition, many methods assume that near-field sources lie strictly within the Fresnel region, which leads to localization errors and a reduced effective array aperture. Improved algorithms, such as Multiple Sparse Bayesian Learning for Far- and Near-Field Sources (FN-MSBL), overcome part of these limitations and achieve higher localization accuracy. However, their reliance on iterative matrix inversion leads to high computational cost and restricts real-time applicability. Therefore, this study aims to address these issues by proposing a novel algorithm that develops a sparse representation model for mixed far-field and near-field sources in the covariance domain and integrates sparse reconstruction with the Generalized Approximate Message Passing (GAMP) and Variational Bayesian Inference (VBI) frameworks. The objective is to achieve high-precision localization of mixed sources while substantially reducing computational cost. Methods Two algorithms, termed Covariance-Based VBI for Far- and Near-Field Sources (FN-CVBI) and Covariance-Based GAMP-VBI for Far- and Near-Field Sources (FN-GAMP-CVBI), are developed. First, a unified sparse representation model for mixed far-field and near-field sources is constructed based on the covariance vector. This representation benefits from the improved SNR of the covariance vector relative to the original array output, which improves far-field Direction of Arrival (DOA) estimation. 
Second, to reduce estimation errors in the sample covariance matrix, a pre-whitening operation is applied to the covariance vector to minimize inter-element correlation and improve robustness. Third, a hierarchical Bayesian model is established to impose sparsity, and VBI is employed to estimate model parameters through iterative posterior updates. Fourth, to reduce the computational burden associated with conventional VBI, GAMP is embedded into the VBI framework to replace matrix inversion operations. The detailed implementation of GAMP is given in Algorithm 1. By combining sparse reconstruction, VBI, and GAMP, the proposed approach improves localization accuracy while markedly reducing computational complexity. Results and Discussions The proposed FN-GAMP-CVBI algorithm shows clear improvements in both localization accuracy and computational efficiency. Complexity analysis indicates a substantial reduction in computational cost (Table 1). In terms of localization performance, FN-CVBI and FN-GAMP-CVBI outperform comparative methods, including LOFNS and FN-MSBL (Fig. 3, Fig. 4), particularly under low SNR conditions and with sufficient snapshots (Fig. 5, Fig. 6). The proposed methods also show strong capability in resolving closely spaced far-field sources (Fig. 7). Experimental validation using lake trial data confirms these findings, as reflected by sharper spectral peaks and fewer false peaks in the background noise of the Bearing Time Recording (BTR) results (Fig. 9). FN-CVBI achieves the highest accuracy in far-field DOA estimation and near-field localization. The computational time of FN-GAMP-CVBI is reduced by up to 95% compared with FN-MSBL (Table 3), demonstrating its suitability for real-time applications. Conclusions A sparse-reconstruction-based approach for mixed far-field and near-field source localization is presented by integrating sparse reconstruction with the GAMP-VBI framework.
The proposed FN-GAMP-CVBI algorithm addresses the limitations of existing methods and achieves a balanced trade-off between localization accuracy and computational efficiency. Simulation results confirm superior performance, especially under low SNR conditions with sufficient snapshots, and experimental results further support the effectiveness of the approach. The low computational complexity and ability to handle mixed-source scenarios indicate that the proposed algorithm is well suited for real-time localization in complex environments.
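
The distinction between far-field and near-field sources that the abstract relies on can be illustrated with a minimal sketch. It compares a plane-wave steering vector with an exact spherical-wave steering vector (no Fresnel approximation) for a uniform linear array. The geometry, the sign convention, the choice of element 0 as the phase reference, and all function names are assumptions made for illustration; this is not the FN-CVBI or FN-GAMP-CVBI algorithm.

```python
import cmath
import math


def far_field_steering(num_sensors, theta_rad, spacing_wl=0.5):
    """Plane-wave steering vector; element 0 is the phase reference.

    Lengths are expressed in wavelengths (assumed convention).
    """
    return [cmath.exp(2j * math.pi * m * spacing_wl * math.sin(theta_rad))
            for m in range(num_sensors)]


def near_field_steering(num_sensors, theta_rad, range_wl, spacing_wl=0.5):
    """Spherical-wave steering vector (phase only, amplitude decay ignored)."""
    vec = []
    for m in range(num_sensors):
        x = m * spacing_wl  # element position along the array axis
        # Exact source-to-element distance from the law of cosines.
        r_m = math.sqrt(range_wl ** 2 + x ** 2
                        - 2.0 * range_wl * x * math.sin(theta_rad))
        vec.append(cmath.exp(-2j * math.pi * (r_m - range_wl)))
    return vec


def max_abs_difference(u, v):
    """Largest element-wise magnitude of the difference of two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


theta = math.radians(20.0)                                   # hypothetical bearing
far = far_field_steering(8, theta)
near_close = near_field_steering(8, theta, range_wl=5.0)     # source close to the array
near_distant = near_field_steering(8, theta, range_wl=1.0e6) # source very far away
```

As the range grows, the spherical-wave vector approaches the plane-wave vector, which is why a single dictionary indexed by both angle and range can, in principle, represent both source types.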

Robust Adaptive Beamforming for Sparse Arrays

FAN Xuhui, WANG Yuyi, WANG Anyi, XU Yanhong, CUI Can

2026, 48(1): 202-211. doi: 10.11999/JEIT250952

[Abstract](389) [FullText HTML] (252) [PDF 3643KB](25)

Abstract:
Objective The rapid development of modern communication technologies, such as 5G networks and Internet of Things (IoT) applications, increases the complexity of signal processing in wireless communication and radar systems. Adaptive beamforming is widely used because it extracts the signal of interest in the presence of interference and noise. Traditional robust adaptive beamforming methods address steering vector mismatch, which may result from environmental nonstationarity, Direction-Of-Arrival (DOA) estimation errors, imperfect array calibration, antenna deformation, and local scattering. However, they do not leverage the advantages of the Sparse Array (SA), which reduces hardware complexity and system cost. They also often fail to suppress SideLobe Levels (SLLs) under interference conditions, limiting their effectiveness in complex electromagnetic environments. To address these issues, a robust adaptive beamforming algorithm is proposed that incorporates SA and low-SLL constraints. Methods Unlike conventional sparse approaches that place the l₀ norm penalty in the objective function, the proposed method introduces the l₀ norm into the constraint. This formulation ensures that the optimized array configuration meets the pre-specified number of active sensors and avoids the uncertainty associated with sparse-weight tuning in multi-objective optimization models. In addition to the sparsity constraint, an SLL suppression constraint is incorporated to impose an upper bound on array response in interference and clutter directions. By integrating these constraints into a unified optimization framework, the method achieves a robust Minimum Variance Distortionless Response (MVDR) beamforming scheme that exhibits sparsity, adaptivity, and robustness. To address the nonconvexity of the formulated optimization problem, a convex relaxation strategy is used to convert the nonconvex constraint into a convex one. 
Based on this formulation, robust adaptive beamforming methods are developed to generate a sparse weight solution from a Uniform Linear Array (ULA). Although the method is derived from a ULA, the sparse weight solution provides practical advantages. By assigning zero weights to selected sensors, the number of active elements is reduced, lowering hardware cost and computational burden while preserving desirable beamforming performance. The main contribution of this work lies in establishing a unified framework that enables collaborative optimization of robustness, beam performance, SLL, and array sparsity. Results and Discussions A series of simulation experiments were conducted to evaluate the performance of the proposed sparse robust beamforming algorithm under multiple scenarios, including multi-interference environments, steering vector mismatch, Angle-Of-Arrival (AOA) mismatch, low Signal-to-Noise Ratio (SNR) conditions, and complex electromagnetic environments based on practical antenna arrays. The results show that the algorithm maintains stable mainlobe gain in the desired signal direction while forming deep nulls in interference directions. First, in the presence of steering vector mismatch, conventional MVDR beamformers often exhibit reduced mainlobe gain or beam pointing deviation, which compromises desired-signal reception. By contrast, the proposed method maintains a stable, distortionless mainlobe direction under mismatch conditions, ensuring high gain in the desired signal direction (Fig. 2(a), Fig. 3(a)). Second, with the introduction of an SLL constraint, clutter is suppressed effectively and peak SLLs are reduced markedly (Fig. 2(b)). Third, under low-SNR conditions, the method shows strong noise resistance. 
Even in heavily noise-contaminated scenarios, it maintains effective interference suppression and achieves high output Signal-to-Interference-plus-Noise Ratio (SINR), demonstrating adaptability to weak-target detection and cluttered environments. Moreover, the optimized SA configuration achieves beamforming performance close to that of a ULA while activating only part of the sensors (Fig. 2). Finally, experimental validation using real antenna arrays further confirms the method’s effectiveness (Fig. 3). Stable performance is maintained, and high gain is achieved in the desired direction even under AOA estimation mismatch (Fig. 4). Overall, the results indicate that the proposed method enhances robustness and hardware efficiency and provides reliable performance in complex electromagnetic environments. Conclusions A robust adaptive beamforming algorithm for sparse arrays is proposed. The central innovation is the construction of a joint optimization model that integrates array sparsity, robustness to steering vector mismatch, and low SLL constraints within a unified framework. Compared with approaches such as MVDR, which emphasizes interference suppression, Covariance Matrix Reconstruction (CMR), which enhances robustness, and Non-Adjacent Constrained Sparsity (NACS), which achieves array sparsity, the proposed method attains a balanced improvement across these dimensions. Simulation results show that in scenarios featuring steering vector errors, AOA estimation mismatches, and low-SNR conditions, the method maintains satisfactory beamforming performance with reduced hardware cost, demonstrating strong practical engineering utility and application potential.
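
The idea of obtaining a sparse array by assigning zero weights to selected sensors of a ULA can be sketched as follows. The example uses plain phase-steered (delay-and-sum) weights on a hypothetical subset of active sensors, so it only illustrates the zero-weight mechanism and the array response computation; it is not the proposed constrained MVDR optimization, and the sensor selection, function names, and angles are assumptions.

```python
import cmath
import math


def steering(num_sensors, theta_deg, spacing_wl=0.5):
    """Far-field ULA steering vector with half-wavelength spacing by default."""
    phase = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * m * phase) for m in range(num_sensors)]


def response(weights, theta_deg):
    """Magnitude of the array response |w^H a(theta)| on the ULA grid."""
    a = steering(len(weights), theta_deg)
    return abs(sum(w.conjugate() * x for w, x in zip(weights, a)))


def sparse_weights(active, num_sensors, look_deg):
    """Phase-steer only the active sensors; inactive sensors get zero weight."""
    a = steering(num_sensors, look_deg)
    return [a[m] / len(active) if m in active else 0j
            for m in range(num_sensors)]


active_sensors = {0, 1, 3, 6, 7, 9}   # hypothetical selection: 6 of 10 sensors
w = sparse_weights(active_sensors, 10, look_deg=10.0)
mainlobe_gain = response(w, 10.0)     # distortionless (unit) response at the look angle
active_count = sum(1 for x in w if x != 0)
```

Evaluating `response(w, angle)` over a grid of angles gives the beampattern, on which a sidelobe upper bound of the kind described in the abstract could be checked.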

Band-Limited Signal Compression Enabled Computationally Efficient Software-Defined Radio for Two-Way Satellite Time and Frequency Transfer

CHENG Long, DONG Shaowu, WU Wenjun, GONG Jianjun, WANG Weixiong, GAO Zhe

2026, 48(1): 212-221. doi: 10.11999/JEIT250705

[Abstract](171) [FullText HTML] (126) [PDF 3586KB](51)

Abstract:
Objective This study addresses key challenges in Two-Way Satellite Time and Frequency Transfer (TWSTFT) systems, with emphasis on the computational inefficiency and high resource consumption of Software-Defined Radio (SDR) receivers. Although TWSTFT provides excellent long-term stability and time-transfer precision, conventional hardware implementations exhibit significant diurnal effects. Existing mitigation approaches, such as fusion with GPS Precise Point Positioning, depend on auxiliary link quality and lack unified algorithms across international networks. SDR receivers reduce diurnal effects and improve accuracy; however, high sampling rates and multi-correlator processing impose excessive computational burdens that limit real-time multi-station operation. The objective is to develop a band-limited signal compression approach that preserves measurement resolution while substantially improving computational efficiency, thereby enabling scalable and high-performance time transfer across international timing laboratories. Methods A band-limited signal compression method tailored to TWSTFT is proposed by accounting for the distortion of Pseudo-Random Noise (PRN) code square-wave characteristics under bandwidth constraints. Bandwidth-matched filtering is first applied to the local PRN code replica to align its spectrum with the effective bandwidth of the received signal and suppress out-of-band noise. For received signals with different bandwidths, n groups (e.g., n = 1, 2, or 20) of phase-diversified, equally spaced PRN code subsequences are generated. The number of subsequence groups n satisfies n × R_chip ≥ 2 × Band_signal, where R_chip denotes the sampling rate of the subsequences and Band_signal represents the signal bandwidth. After bandpass filtering, the received signal undergoes parallel correlation with the phase-diversified PRN subsequences.
The full correlation function is reconstructed by a linear combination of the n independent correlation outputs, each scaled by N_chip/n, where N_chip is the number of samples per PRN chip. Adaptive sampling-rate adjustment and resource-allocation strategies are applied to achieve efficient processing with preserved accuracy. Results and Discussions Experimental validation is performed on a TWSTFT platform at the National Time Service Center using TWSTFT links (NTSC-NIM, NTSC-SU, NTSC-PTB) and SATRE local-loop tests. Data from MJD 60 742 to MJD 60 749 are collected in accordance with ITU-R TF.1153-4. In local-loop tests, the proposed method provides the most stable Time of Arrival measurements while maintaining a high signal-to-noise ratio (Table 2). Time deviation outperforms traditional multi-correlator and conventional compression methods over all averaging times (Fig. 9). For operational links, superior short-term stability is observed across different baseline lengths (Fig. 10 and Fig. 11). With n = 1 and n = 2, processing speed increases by 795% and 707%, respectively, while GPU memory usage decreases by 89.77% and 84.65% (Table 4). The method supports up to 102 concurrent channels (n = 1), exceeding the 11-channel capacity of conventional approaches (Table 5). Increasing n beyond these values yields no further precision improvement but increases resource consumption, confirming an optimal trade-off between accuracy and efficiency. Conclusions A band-limited signal compression method is presented to address the computational constraints of TWSTFT SDR receivers. Parallel short-correlation processing combined with bandwidth-aware sampling achieves substantial gains in precision and efficiency. Experimental results confirm improved short-term stability across signal bandwidths and baseline lengths relative to conventional multi-correlator methods.
The approach delivers large efficiency gains, with processing speed increases of 795% (n = 1) and 707% (n = 2) and GPU memory reductions of 89.77% and 84.65%, respectively. System scalability is markedly enhanced, supporting up to 102 concurrent channels. These results demonstrate an effective balance between performance and resource utilization for TWSTFT applications.
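
The sampling condition n × R_chip ≥ 2 × Band_signal stated in the abstract fixes a lower bound on the number of subsequence groups. A minimal sketch of that bound follows; the function name and the numeric values in the usage lines are hypothetical and are not taken from the paper.

```python
def min_subsequence_groups(subsequence_rate_hz: int, signal_bandwidth_hz: int) -> int:
    """Smallest integer n with n * R_chip >= 2 * Band_signal.

    Both arguments are integers in Hz so that the ceiling division is exact.
    """
    if subsequence_rate_hz <= 0 or signal_bandwidth_hz <= 0:
        raise ValueError("rate and bandwidth must be positive")
    # Integer ceiling division avoids floating-point rounding at exact multiples.
    return -(-2 * signal_bandwidth_hz // subsequence_rate_hz)


# Hypothetical values: a 1 MHz subsequence rate and three signal bandwidths.
n_narrow = min_subsequence_groups(1_000_000, 500_000)
n_medium = min_subsequence_groups(1_000_000, 1_000_000)
n_wide = min_subsequence_groups(1_000_000, 10_000_000)
```

Choosing n at this bound, rather than above it, is consistent with the reported observation that larger n increases resource consumption without improving precision.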

A Family of Linear Codes and Their Subfield Codes

CHAI Ye, ZHU Shixin, KAI Xiaoshan

2026, 48(1): 222-229. doi: 10.11999/JEIT250775

[Abstract](267) [FullText HTML] (170) [PDF 529KB](33)

Abstract:
Objective The study of weight distributions of linear codes is fundamental in both theory and applications. Weight distributions indicate the error-correcting capability of a code and allow the calculation of error probabilities for detection and correction. Linear codes with few weights also find applications in secret sharing, strongly regular graphs, association schemes, and authentication codes. Therefore, the construction of linear codes with few weights has attracted sustained attention. Subfield codes of linear codes over finite fields have recently received considerable interest because they can yield optimal codes with potential applications in data storage systems and communication systems. In recent years, subfield codes of linear codes over finite fields with good parameters have been widely studied. Motivated by these constructions, a different defining set is selected to extend existing results. The objectives of this paper are to study the weight distributions and dual codes of this class of linear codes and their punctured codes, and to investigate their subfield codes to obtain linear codes with few weights. Methods The selection of the defining set is a key step in the analysis. The calculation of weight distributions relies on decomposing elements of finite fields into their subfields and applying the first four Pless power moments. Using known results on Kloosterman sums over finite fields, the lengths and weight distributions of this class of linear codes admit closed-form expressions and are completely determined in the binary case. The parameters of their dual codes are also determined and are optimal or almost optimal in the binary case. Trace representations of the subfield codes of this class of codes and their punctured codes are derived. Properties of characters over finite fields are then used to determine the parameters, weight distributions, and dualities of these subfield codes. 
Results and Discussions By selecting an appropriate defining set and using Kloosterman sums over finite fields, the parameters and weight distributions of a family of q-ary linear codes with few weights and their punctured codes are completely determined. Their dual codes and subfield codes are also examined and are shown to be length-optimal and dimension-optimal with respect to the Sphere-packing bound. A class of eight-weight linear codes and their punctured codes is constructed. The corresponding dual codes are all AMDS linear codes, and they are length-optimal and dimension-optimal linear codes with respect to the Sphere-packing bound (see Theorems 1 and 2, and Tables 1 and 2). The parameters and weight distributions of their subfield codes and the corresponding dual codes are provided (see Theorem 3 and Table 3). In addition, the subfield codes of the punctured codes are studied, and the weight distributions and duality of these codes are determined (see Theorem 4 and Table 4). All results are verified using Magma through two examples. Conclusions A family of q-ary linear codes with few weights and their punctured codes is studied. Based on Kloosterman sums over finite fields, the weight distributions and parameters of the codes and their dual codes are determined, yielding optimal linear codes with respect to the Sphere-packing bound. The weight distributions of their subfield codes and the parameters of the corresponding dual codes are also determined, resulting in few-weight binary linear codes.
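
For readers unfamiliar with weight distributions, a minimal brute-force sketch is given below. It enumerates all codewords of a small binary linear code and counts them by Hamming weight. The example code is the standard [7,4] Hamming code, chosen only as a well-known illustration; it is not one of the codes constructed in the paper, and exhaustive enumeration is practical only for small dimensions.

```python
from itertools import product


def weight_distribution(generator_rows):
    """Return [A_0, A_1, ..., A_n] for the binary code spanned by the rows."""
    n = len(generator_rows[0])
    counts = [0] * (n + 1)
    # Every message vector selects a subset of rows to add over GF(2).
    for coeffs in product((0, 1), repeat=len(generator_rows)):
        codeword = [0] * n
        for c, row in zip(coeffs, generator_rows):
            if c:
                codeword = [(x + y) % 2 for x, y in zip(codeword, row)]
        counts[sum(codeword)] += 1
    return counts


# Systematic generator matrix of the [7,4] Hamming code.
HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

dist = weight_distribution(HAMMING_7_4)   # [1, 0, 0, 7, 7, 0, 0, 1]
nonzero_weights = [w for w, a in enumerate(dist) if a and w > 0]
```

A code is called a few-weight code when the list of nonzero weights is short; here it has three entries (3, 4, and 7).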

Edge-Cloud Collaborative Searchable Attribute-Based Signcryption Approach for Internet of Vehicles

YU Huifang, WANG Qinggui, WANG Zihao

2026, 48(1): 230-238. doi: 10.11999/JEIT250750

[Abstract](167) [FullText HTML] (126) [PDF 1854KB](27)

Abstract:
Objective The dynamic and open environment of the Internet of Vehicles (IoV) poses substantial challenges to data security and real-time performance. Large-scale data interactions are vulnerable to eavesdropping, tampering, forgery, and replay attacks. Conventional cloud computing architectures exhibit inherent latency and cannot satisfy millisecond-level real-time requirements in IoV applications, which results in inefficient data transmission and an increased risk of traffic accidents. Therefore, balancing data security and real-time performance represents a critical bottleneck for large-scale IoV deployment. Methods An edge-cloud collaborative searchable attribute-based signcryption method is proposed for IoV applications. A multi-layer architecture is constructed, consisting of cloud servers, edge servers, and in-vehicle terminal devices. Access control is enforced through a hybrid key-policy and ciphertext-policy mechanism derived from attribute-based signcryption and a Linear Secret Sharing Scheme (LSSS). To reduce local decryption overhead, bilinear pairing operations are outsourced to edge nodes. SM9 is adopted for trapdoor generation and signature authentication. The proposed method provides data confidentiality, signature unforgeability, and trapdoor unforgeability. Results and Discussions The proposed method demonstrates superior performance in an IoV edge-cloud collaborative architecture for searchable attribute-based signcryption (Tables 1 to 5). Functional characteristics are summarized in Table 1. Fig. 2 illustrates the variation in total computation time as the number of attributes increases. Although the total time increases slightly, the growth rate remains low. By offloading computation-intensive tasks to edge nodes, the local computational burden on user terminals is substantially reduced. This optimization is quantified by an outsourcing efficiency exceeding 96% (Table 4, Fig. 5).
Instantaneous retrieval is achieved by reducing the search complexity to O(1) through a hash-based index (Fig. 4). End-to-end search latency is maintained within an acceptable range for IoV applications (Table 5), which confirms suitability for real-time data access. As shown in Fig. 3, as the number of attributes increases, the ciphertext size variation of the proposed method remains the smallest among the compared schemes. Conclusions The proposed method achieves fine-grained access control, data confidentiality, data integrity, and unforgeability, while maintaining advantages in computational and communication efficiency. Through a computation offloading mechanism, the method effectively addresses resource constraints of on-board devices in dynamic, resource-sensitive, and real-time IoV environments.
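
The O(1) search claim rests on a hash-based index. A minimal sketch of such an index is shown below, assuming that each keyword is mapped to a fixed-length tag and that the server stores ciphertext identifiers in a hash table keyed by that tag. HMAC-SHA256 is used here purely as a stand-in for the tag function; the SM9-based trapdoor generation of the proposed scheme is not modeled, and all names and values are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def keyword_tag(keyword: str, key: bytes) -> str:
    """Keyed tag for a keyword (stand-in for a real trapdoor)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()


class HashIndex:
    """Maps keyword tags to lists of ciphertext identifiers."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def add(self, tag: str, ciphertext_id: str) -> None:
        self._buckets[tag].append(ciphertext_id)

    def search(self, tag: str) -> list:
        # A dictionary lookup takes constant time on average,
        # independent of how many ciphertexts are indexed.
        return list(self._buckets.get(tag, []))


secret = b"hypothetical-shared-key"
index = HashIndex()
index.add(keyword_tag("road-closure", secret), "ct-001")
index.add(keyword_tag("road-closure", secret), "ct-007")
index.add(keyword_tag("icy-bridge", secret), "ct-002")

hits = index.search(keyword_tag("road-closure", secret))
misses = index.search(keyword_tag("unknown-keyword", secret))
```

The lookup cost does not grow with the number of stored ciphertexts, which is the property the abstract refers to; the cost of returning the matching identifiers still scales with the number of matches.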

Continuous Federation of Noise-resistant Heterogeneous Medical Dialogue Using the Trustworthiness-based Evaluation

LIU Yupeng, ZHANG Jiang, TANG Shichen, MENG Xin, MENG Qingfeng

2026, 48(1): 239-252. doi: 10.11999/JEIT250057

[Abstract](401) [FullText HTML] (264) [PDF 4494KB](35)

Abstract:
Objective To address the key challenges of client model heterogeneity, data distribution heterogeneity, and text noise in medical dialogue federated learning, this paper proposes a trustworthiness-based, noise-resistant heterogeneous medical dialogue federated learning method, termed FedRH. FedRH enhances robustness by improving the objective function, aggregation strategy, and local update process, among other components, based on credibility evaluation. Methods Model training is divided into a local training stage and a heterogeneous federated learning stage. During local training, text noise is mitigated using a symmetric cross-entropy loss function, which reduces the risk of overfitting to noisy text. In the heterogeneous federated learning stage, an adaptive aggregation mechanism incorporates clean, noisy, and heterogeneous client texts by evaluating their quality. Local parameter updates consider both local and global parameters simultaneously, enabling continuous adaptive updates that improve resistance to both random and structured (syntax/semantic) noise and model heterogeneity. The main contributions are threefold: (1) A local noise-resistant training strategy that uses symmetric cross-entropy loss to prevent overfitting to noisy text during local training; (2) A heterogeneous federated learning approach based on client trustworthiness, which evaluates each client’s text quality and learning effectiveness to compute trust scores. These scores are used to adaptively weight clients during model aggregation, thereby reducing the influence of low-quality data while accounting for text heterogeneity; (3) A local continuous adaptive aggregation mechanism, which allows the local model to integrate fine-grained global model information. This approach reduces the adverse effects of global model bias caused by heterogeneous and noisy text on local updates. 
Results and Discussions The effectiveness of the proposed model is systematically validated through extensive, multi-dimensional experiments. The results indicate that FedRH achieves substantial improvements over existing methods in noisy and heterogeneous federated learning scenarios (Table 2, Table 3). The study also presents training process curves for both heterogeneous models (Fig. 3) and isomorphic models (Fig. 6), supplemented by parameter sensitivity analysis, ablation experiments, and a case study. Conclusions The proposed FedRH framework significantly enhances the robustness of federated learning for medical dialogue tasks in the presence of heterogeneous and noisy text. The main conclusions are as follows: (1) Compared to baseline methods, FedRH achieves superior performance in client-side models under heterogeneous and noisy text conditions. It demonstrates improvements across multiple metrics, including precision, recall, and factual consistency, and converges more rapidly during training. (2) Ablation experiments confirm that both the symmetric cross-entropy-based local training strategy and the credibility-weighted heterogeneous aggregation approach contribute to performance gains.
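
The symmetric cross-entropy loss used in the local training stage can be sketched for a single sample as follows. The sketch assumes the commonly used formulation that adds a reverse cross-entropy term to the standard cross-entropy, with log 0 replaced by a negative constant. The weights `alpha` and `beta`, the constant `log_zero`, and the clipping threshold are hypothetical defaults, not values reported for FedRH.

```python
import math


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0, log_zero=-4.0):
    """alpha * CE + beta * RCE for one sample with a one-hot label.

    probs: predicted class probabilities; label: index of the labeled class.
    """
    eps = 1e-7
    clipped = [min(max(p, eps), 1.0) for p in probs]
    # Standard cross-entropy: -log of the probability given to the label.
    ce = -math.log(clipped[label])
    # Reverse cross-entropy: -sum_k p_k * log q_k, where q is the one-hot label,
    # log q_k = 0 for the labeled class and log_zero for every other class.
    rce = -log_zero * sum(p for k, p in enumerate(clipped) if k != label)
    return alpha * ce + beta * rce


loss_agree = symmetric_cross_entropy([1.0, 0.0, 0.0], label=0)
loss_disagree = symmetric_cross_entropy([0.1, 0.8, 0.1], label=0)
```

Because the reverse term is bounded by `-log_zero`, a confidently predicted sample whose label disagrees with the model contributes a limited penalty through that term, which is the usual argument for its tolerance to label noise.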

AoI-prioritized Multi-UAV Deployment and Resource Allocation Method in Scenarios with Differentiated User Requirements

JIN Feihong, ZHANG Jing, XIE Yaqin

2026, 48(1): 253-263. doi: 10.11999/JEIT251062

[Abstract](291) [FullText HTML] (146) [PDF 6575KB](46)

Abstract:
Objective In emergency scenarios such as natural disasters, ground-based fixed base stations are often damaged and may not be restored promptly. Because Unmanned Aerial Vehicles (UAVs) provide flexibility and low cost, UAV-assisted emergency communication has gained growing attention from academia and industry. However, existing studies on bandwidth and power allocation often overlook the heterogeneity of traffic demands among different Ground Users (GUs). They also do not fully address the effect of Age of Information (AoI) on the timeliness of emergency decision-making. Given differentiated traffic requirements and the direct effect of AoI on emergency response, this study proposes an AoI-based joint UAV deployment and resource allocation method for emergency communication. The objectives are: (1) to determine the minimum number of UAVs required while meeting the total GU traffic demand, and (2) to jointly optimize bandwidth, power, and Three-Dimensional (3D) UAV positions to minimize the system’s average AoI. Methods A two-stage approach that combines the Multiple UAV Deployment (MUD) algorithm and the Bandwidth, Power, and 3D Location (BPL) algorithm is proposed. For UAV quantity determination, the Particle Swarm Optimization (PSO) algorithm calculates the traffic density of each uncovered GU. The GU with the highest traffic density is selected as the core, and its adjacent GUs form a cluster. PSO optimizes the cluster position to maximize covered traffic volume while meeting UAV service constraints and determines the minimum number of UAVs required. For joint resource and position optimization, the BPL algorithm allocates bandwidth, power, and 3D locations. Bandwidth allocation uses an improved relaxation adjustment method in which weights are assigned based on GU data transmission time, and subchannels are allocated dynamically to balance transmission time. Power allocation follows the same structure. 
For 3D position optimization, the Whale Optimization Algorithm (WOA) is applied. After fixing the UAV’s horizontal position, the minimum height needed for coverage is derived using ellipse characteristics to reduce energy consumption. This converts the 3D search into a 2D search for the optimal position. Results and Discussions Simulation results confirm the effectiveness of the method. In a scenario with 100 GUs distributed randomly in a 1 km × 1 km area, 7 UAVs are required to achieve a 90% coverage rate (Fig. 2). The system’s average AoI under this deployment meets basic real-time communication requirements. Compared with benchmark algorithms such as Weighted K-Means (WKM) and Minimum Degree Prior (MDP), the MUD algorithm consistently uses fewer UAVs under different conditions of area size, GU quantity, and UAV service capability (Fig. 3). As the maximum GU traffic demand increases, data transmission time increases, which raises the required UAV count, whereas UAV climbing time decreases because cluster radii are smaller. Therefore, the average AoI shows a slight decrease (Fig. 4). The improved allocation method yields better performance than average allocation. It reduces the maximum GU data transmission time by 26.35% (Fig. 5a) and assigns 16.7% more bandwidth and power to high-traffic GUs (Fig. 5b). This leads to more balanced transmission times and higher resource use efficiency. When compared with NBPL (no Bandwidth-Power and Location optimization), OL (Only Location optimization), and OBP (Only Bandwidth-Power optimization), the full BPL (Bandwidth-Power and Location optimization) algorithm achieves the lowest average AoI under different GU quantities. When the GU count is large, the BPL algorithm reduces the average AoI by about 21.1% compared with NBPL (Fig. 6a). The method also reaches the lowest total energy consumption per UAV among all compared schemes (Fig. 6b). Its computational complexity remains suitable for practical emergency deployment. 
Conclusions This study proposes an AoI-prioritized multi-UAV deployment and resource allocation method for emergency communication scenarios characterized by differentiated user traffic demands. The method integrates a PSO-enhanced MUD algorithm to determine the minimum UAV quantity and a BPL algorithm that jointly optimizes bandwidth, power, and 3D UAV positions using WOA and an improved allocation method. It meets three objectives: reducing UAV use, minimizing average AoI to maintain information freshness, and lowering energy consumption. Simulation results confirm advantages in deployment efficiency, AoI performance, and energy efficiency. Future work includes extending the method to non-LoS channel conditions, designing lower-complexity heuristic methods for larger-scale tasks, developing distributed optimization frameworks, and studying online joint trajectory and resource optimization methods for dynamic environments.
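
The goal of balancing transmission time when allocating subchannels can be illustrated with a simple greedy sketch: every user first receives one subchannel, and each remaining subchannel goes to the user whose current transmission time is largest. This is a simplified stand-in for illustration, not the improved relaxation adjustment method of the BPL algorithm; the equal per-subchannel rate and all numeric values are assumptions.

```python
import heapq


def balance_subchannels(demands_bits, num_subchannels, rate_per_subchannel_bps):
    """Greedy min-max allocation of identical subchannels to users."""
    users = len(demands_bits)
    if num_subchannels < users:
        raise ValueError("need at least one subchannel per user")
    alloc = [1] * users
    # Max-heap on transmission time, emulated by negating the time.
    heap = [(-d / rate_per_subchannel_bps, i) for i, d in enumerate(demands_bits)]
    heapq.heapify(heap)
    for _ in range(num_subchannels - users):
        _, i = heapq.heappop(heap)          # user with the longest transmission time
        alloc[i] += 1
        new_time = demands_bits[i] / (alloc[i] * rate_per_subchannel_bps)
        heapq.heappush(heap, (-new_time, i))
    return alloc


demands = [8_000_000, 2_000_000, 2_000_000]   # hypothetical traffic demands in bits
allocation = balance_subchannels(demands, 12, 1_000_000.0)
times = [d / (k * 1_000_000.0) for d, k in zip(demands, allocation)]
equal_split_max_time = max(d / (4 * 1_000_000.0) for d in demands)
```

In this toy case the high-traffic user ends up with most of the subchannels and the maximum transmission time falls below that of an equal split, mirroring the qualitative behavior described for the improved allocation method.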

A Fault Diagnosis Method for Flight Control Systems Combining Pose-Invariant Features and a Semi-Supervised RDC-GAN Model

ZHANG Jingsen, HOU Biao, LI Zhijie, BI Wenping, WU Zitong

2026, 48(1): 264-276. doi: 10.11999/JEIT250964

[Abstract](220) [FullText HTML] (120) [PDF 8312KB](15)

Abstract:
Objective In recent years, China has actively promoted the development of the low-altitude economy, leading to the increasingly widespread application of drones across multiple industries. As highly complex aerial systems, Unmanned Aerial Vehicles (UAVs) are susceptible to various failures during operation. The flight control system, which serves as the core of UAV flight operations, may develop faults that are less evident than physical damage to components such as motors or propellers. However, such faults can directly cause flight instability or complete loss of control. Fault diagnosis of UAV flight control systems faces two major challenges. First, as an emerging aerial platform, UAVs have far fewer effectively accumulated training samples than traditional diagnostic targets such as bearings, resulting in data scarcity. Second, owing to strong maneuverability, UAVs exhibit substantial variations in data distribution under different flight attitudes, which limits the diagnostic accuracy of most existing models under rapidly changing operating conditions. Therefore, the development of an effective fault diagnosis method for UAV flight control systems is of both academic interest and practical engineering value. Methods A fault diagnosis method for flight control systems based on pose-invariant features and a semi-supervised Reloaded Dense Generative Adversarial Classification Network (RDC-GAN) is proposed. The overall framework is illustrated in Fig. 1. Flight logs collected from the UAV are used as raw diagnostic data. After data cleaning, a differential flatness-based data selection method is applied to separate the flight data into pose-dependent data and pose-independent data. For pose-dependent data, Empirical Mode Decomposition-Squeeze Excitation Network (EMD-SENet) is adopted to extract pose-invariant features, as shown in Fig. 3. 
An adaptive feature fusion module is then used to perform weighted fusion of pose-independent data, pose-invariant features, and pose-dependent data, as illustrated in Fig. 4. The fused features are subsequently input into a semi-supervised RDC-GAN diagnostic model, whose architecture is presented in Fig. 4. Model training is conducted in two stages. In the first stage, unsupervised training is performed to initialize the network parameters using a large set of unlabeled samples. In the second stage, supervised training is carried out with a small number of labeled samples, enabling accurate fault diagnosis under limited labeling conditions. Results and Discussions The proposed method is first validated on the publicly available RflyMad dataset, which contains magnetometer fault, accelerometer fault, gyroscope fault, Global Navigation Satellite System (GNSS) fault, and no-fault data under five flight attitude modes. Fig. 5 and Fig. 6 illustrate the pose-invariant features extracted by EMD-SENet and the synthetic samples generated by the RDC-GAN generator, respectively. Diagnostic performance is evaluated using Overall Accuracy (OA), Average Accuracy (AA), and the Kappa coefficient, in addition to class-wise accuracy for each fault category. The results on the RflyMad dataset are summarized in Table 3. The proposed method achieves 95.71% OA, 95.32% AA, and a Kappa coefficient of 95.41%, exceeding the second-best comparative method by 2.17%, 2.42%, and 2.40%, respectively. For real-flight experiments, a fault injection approach based on a redundant positioning system is designed. A motion capture system and an Ultra-WideBand (UWB) four-base-station positioning system are employed to ensure experimental reliability and operational safety. The experimental setup is shown in Fig. 11. Online real-flight diagnostic results are presented in Fig. 13, with an OA of 92.78%. Fault diagnosis time is reported in Table 5, and false alarm statistics are provided in Table 6. 
Conclusions A fault diagnosis method for flight control systems that integrates pose-invariant features with a semi-supervised RDC-GAN model is presented to address data scarcity and flight attitude-induced distribution variation in UAV diagnostics. Differential flatness-based data selection is used to distinguish pose-dependent data from pose-independent data, and pose-invariant features are extracted using EMD-SENet. An adaptive feature fusion strategy is applied to balance heterogeneous features, and phased semi-supervised training of the RDC-GAN model enables high diagnostic accuracy with a limited number of labeled samples. Experimental validation on the RflyMad dataset and real UAV flight scenarios confirms the effectiveness of the proposed method.
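
The three evaluation metrics named in this abstract (OA, AA, and the Kappa coefficient) can all be derived from a confusion matrix. The following is a minimal sketch, assuming that Kappa refers to Cohen's kappa and that AA is the mean of the per-class accuracies; the function name and the toy matrix are illustrative and are not taken from the paper.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall Accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average Accuracy: mean of the per-class accuracies (per-class recall).
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(per_class) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Hypothetical 3-class confusion matrix (not data from the paper).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
oa, aa, kappa = classification_metrics(toy)
```

The abstract reports Kappa as a percentage, so the value returned here would be multiplied by 100 for a like-for-like comparison.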

Bimodal Emotion Recognition Method Based on Dual-stream Attention and Adversarial Mutual Reconstruction

LIU Jia, ZHANG Yangrui, CHEN Dapeng, MAO Die, LU Guorui

2026, 48(1): 277-286. doi: 10.11999/JEIT250424

[Abstract](327) [FullText HTML] (157) [PDF 2800KB](34)

Abstract:
Objective This paper proposes a bimodal emotion recognition method that integrates ElectroEncephaloGraphy (EEG) and speech signals to address noise sensitivity and inter-subject variability that limit single-modality emotion recognition systems. Although substantial progress has been achieved in emotion recognition research, cross-subject recognition accuracy remains limited, and performance is strongly affected by noise. For EEG signals, physiological differences among subjects lead to large variations in emotion classification performance. Speech signals are likewise sensitive to environmental noise and data loss. This study aims to develop a dual-modality recognition framework that combines EEG and speech signals to improve robustness, stability, and generalization performance. Methods The proposed method utilizes two independent feature extractors for EEG and speech signals. For EEG, a dual feature extractor integrating time-frame-channel joint attention and state-space modeling is designed to capture salient temporal and spectral features. For speech, a Bidirectional Long Short-Term Memory (Bi-LSTM) network with a frame-level random masking strategy is adopted to improve robustness to missing or noisy speech segments. A modality refinement fusion module is constructed using gradient reversal and orthogonal projection to enhance feature alignment and discriminability. In addition, an adversarial mutual reconstruction mechanism is applied to enforce consistent emotion feature reconstruction across subjects within a shared latent space. Results and Discussions The proposed method is evaluated on multiple benchmark datasets, including MAHNOB-HCI, EAV, and SEED. Under cross-subject validation on the MAHNOB-HCI dataset, the model achieves accuracies of 81.09% for valence and 80.11% for arousal, outperforming several existing approaches. 
In five-fold cross-validation, accuracies increase to 98.14% for valence and 98.37% for arousal, demonstrating strong generalization and stability. On the EAV dataset, the proposed model attains an accuracy of 73.29%, which exceeds the 60.85% achieved by conventional Convolutional Neural Network (CNN)-based methods. In single-modality experiments on the SEED dataset, an accuracy of 89.33% is obtained, confirming the effectiveness of the dual-stream attention mechanism and adversarial mutual reconstruction for improving cross-subject generalization. Conclusions The proposed dual-stream attention and adversarial mutual reconstruction framework effectively addresses challenges in cross-subject emotion recognition and multimodal fusion for affective computing. The method demonstrates strong robustness to individual differences and noise, supporting its applicability in real-world human–computer interaction systems.
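
The frame-level random masking strategy mentioned for the speech branch can be illustrated as follows. This is only a sketch under stated assumptions: masked frames are replaced by zeros, the masking ratio is a free parameter, and the names `mask_frames` and `mask_ratio` are hypothetical. The abstract does not specify how masked frames are filled or how many frames are masked.

```python
import random


def mask_frames(frames, mask_ratio, rng):
    """Zero out a random subset of frames.

    frames: list of per-frame feature vectors (lists of floats).
    mask_ratio: fraction of frames to mask (assumed hyperparameter).
    rng: a random.Random instance, passed in for reproducibility.
    Returns (masked_frames, masked_indices).
    """
    n = len(frames)
    n_mask = int(round(mask_ratio * n))
    masked_idx = set(rng.sample(range(n), n_mask))
    masked = [
        [0.0] * len(f) if i in masked_idx else list(f)
        for i, f in enumerate(frames)
    ]
    return masked, masked_idx


# Ten hypothetical frames with four features each.
frames = [[float(i + 1)] * 4 for i in range(10)]
masked, idx = mask_frames(frames, mask_ratio=0.3, rng=random.Random(0))
```

Training on inputs masked in this way exposes the sequence model to missing segments, which is the robustness goal the abstract describes.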

Multi-code Deep Fusion Attention Generative Adversarial Networks for Text-to-Image Synthesis

GU Guanghua, SUN Wenxing, YI Boyu

2026, 48(1): 287-296. doi: 10.11999/JEIT250516

[Abstract](281) [FullText HTML] (142) [PDF 8657KB](31)

Abstract:
Objective Text-to-image synthesis is a core task in multimodal artificial intelligence and aims to generate photorealistic images that accurately correspond to natural language descriptions. This capability supports a wide range of applications, including creative design, education, data augmentation, and human-computer interaction. However, simultaneously achieving high visual fidelity and precise semantic alignment remains challenging. Most existing Generative Adversarial Network (GAN)-based methods condition image generation on a single latent noise vector, which limits the representation of diverse visual attributes described in text. As a result, generated images often lack fine textures, subtle color variations, or detailed structural characteristics. In addition, although attention mechanisms enhance semantic correspondence, many approaches rely on single-focus attention, which is insufficient to capture the complex many-to-many relationships between linguistic expressions and visual regions. These limitations result in an observable discrepancy between textual descriptions and synthesized images. To address these issues, a novel GAN architecture, termed Multi-code Deep Feature Fusion Attention Generative Adversarial Network (mDFA-GAN), is proposed. The objective is to enhance text-to-image synthesis by enriching latent visual representations through multiple noise codes and strengthening semantic reasoning through a multi-head attention mechanism, thereby improving detail accuracy and textual faithfulness. Methods The generator of mDFA-GAN incorporates three main components. First, a multi-noise input strategy is adopted, in which multiple independent noise vectors are used instead of a single latent noise vector, allowing different noise codes to capture different visual attributes such as structure, texture, and color. Second, a Multi-code Prior Fusion Module is designed to integrate these latent representations. 
This module operates on intermediate feature maps and applies learnable channel-wise weights to perform adaptive weighted summation, producing a unified and detail-rich feature representation. Third, a Multi-head Attention Module is embedded in the later stages of the generator. This module computes attention between visual features and word embeddings across multiple attention heads, enabling each image region to attend to multiple semantically relevant words and improving fine-grained cross-modal alignment. Training is conducted using a unidirectional discriminator with a conditional hinge loss combined with a Matching-Aware zero-centered Gradient Penalty (MA-GP) to enhance training stability and enforce text-image consistency. In addition, a multi-code fusion loss is introduced to reduce variance among features derived from different noise codes, thereby promoting spatial and semantic coherence. Results and Discussions The proposed mDFA-GAN is evaluated on the CUB-200-2011 and MS COCO datasets. Qualitative results, as illustrated in Fig. 6 and Fig. 7, indicate that the proposed method generates images with accurate colors, fine-grained details, and coherent complex scenes. Subtle textual attributes, such as specific plumage patterns and object shapes, are effectively captured. Quantitative evaluation demonstrates state-of-the-art performance. An Inception Score (IS) of 4.82 is achieved on the CUB-200-2011 dataset (Table 1), reflecting improved perceptual quality and semantic consistency. Moreover, the lowest Fréchet Inception Distance (FID) values of 13.45 on CUB-200-2011 and 16.50 on MS COCO are obtained (Table 2), indicating that the generated images are statistically closer to real samples. Ablation experiments confirm the contribution of each component. Performance degrades when either the Multi-code Prior Fusion Module or the Multi-head Attention Module is removed (Table 3). 
Further analysis identifies three noise codes as the optimal configuration (Table 4). In terms of efficiency, the model achieves an inference time of 0.8 seconds per image (Table 5), maintaining the efficiency advantage of GAN-based methods. Conclusions A novel text-to-image synthesis framework, mDFA-GAN, is proposed to address limited fine-grained detail representation and insufficient semantic alignment in existing GAN-based methods. By decomposing the latent space into multiple noise codes and adaptively fusing them, the model enhances its capacity to generate detailed visual content. The integration of multi-head cross-modal attention enables more accurate and context-aware semantic grounding. Experimental results on benchmark datasets demonstrate that mDFA-GAN achieves state-of-the-art performance, as evidenced by improved IS and FID scores and high-quality visual results. Ablation studies further validate the necessity and complementary effects of the proposed components. The framework provides both an effective solution for text-to-image synthesis and useful architectural insights for future research in multimodal representation learning.
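
The adaptive weighted summation performed by the Multi-code Prior Fusion Module can be pictured with a small sketch. It assumes that the channel-wise weights are normalized with a softmax across noise codes, which the abstract does not state; spatial dimensions are omitted, and the names `fuse_codes` and `logits` are hypothetical.

```python
import math


def fuse_codes(features, logits):
    """Fuse per-code features with channel-wise weights.

    features[k][c]: channel c of the feature derived from noise code k.
    logits[k][c]: weight logit for code k and channel c (learnable in a
                  real model; fixed numbers here).
    Returns fused[c], a weighted sum over the noise codes.
    """
    n_codes = len(features)
    n_channels = len(features[0])
    fused = []
    for c in range(n_channels):
        # Softmax across codes so that each channel's weights sum to 1
        # (an assumption of this sketch, not a detail from the paper).
        m = max(logits[k][c] for k in range(n_codes))
        exps = [math.exp(logits[k][c] - m) for k in range(n_codes)]
        z = sum(exps)
        fused.append(sum(exps[k] / z * features[k][c] for k in range(n_codes)))
    return fused


# Three hypothetical noise codes, two channels each.
features = [[1.0, 0.0], [3.0, 4.0], [5.0, 8.0]]
logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
fused = fuse_codes(features, logits)
```

With equal logits the fusion reduces to a plain average over the codes; training would move the logits so that each channel draws more heavily on the code that represents it best.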

Bionic Behavior Modeling Method for Unmanned Aerial Vehicle Swarms Empowered by Deep Reinforcement Learning

HE Ming, WU Jingjing, HAN Wei, LIU Sicong, PAN Fan, XIA Hengyu

2026, 48(1): 297-310. doi: 10.11999/JEIT251103

[Abstract](211) [FullText HTML] (190) [PDF 4950KB](27)

Abstract:
Significance Unmanned Aerial Vehicle (UAV) swarm technology is a core driver of low-altitude economic development and intelligent unmanned system evolution, yielding cooperative effects greater than the sum of individual UAVs in disaster response, environmental monitoring, and logistics distribution. As mission scenarios shift toward dynamic heterogeneity, strong interference, and large-scale deployment, traditional centralized control architectures, although theoretically feasible, do not achieve practical implementation and remain a major constraint on engineering application. Bionic Swarm Intelligence (BSI), a distributed intelligent paradigm that simulates the self-organization, elastic reconfiguration, and cooperative behavior of biological swarms, offers a path to overcoming these limitations. The integration of Deep Reinforcement Learning (DRL) enables a transition from static behavior simulation to adaptive autonomous learning and decision-making. The combined BSI-DRL framework allows UAV swarms to optimize cooperative strategies through data-driven interaction, addressing the limited adaptability of manually designed bionic rules. Clarifying the progress and challenges of UAV swarm modeling based on BSI-DRL is essential for supporting engineering transformation and improving practical system performance. Progress The progress of BSI-DRL-driven UAV swarm behavior modeling is summarized from four aspects. (1) BSI’s concept and core characteristics: BSI, a biology-oriented subset of Swarm Intelligence (SI), is defined by four characteristics: distributed control without dependence on a central command, self-organization through spontaneous disorder-to-order transition, robustness through functional maintenance under disturbances, and adaptability through dynamic strategy optimization in complex environments. 
(2) Three-stage paradigm transition of BSI: (a) Before 2010 (rule transplantation stage): work centered on applying fixed bionic algorithms such as particle swarm optimization and biological models (e.g., Boids, Vicsek) to UAV path planning, with SI dependent on preset rules (Fig. 2). (b) From 2010 to 2020 (systematic decentralized control stage): studies shifted toward systematic design and decentralized control theory, enabling a transition from simulation to physical verification but showing limited adaptability under dynamic conditions (Fig. 2). (c) Since 2020 (AI-enhanced autonomous learning stage): integration of DRL enabled a transition to autonomous learning and decision-making, allowing UAV swarms to develop advanced cooperative strategies when facing unknown environments (Fig. 2). (3) Typical biological swarm mechanisms and bionic mapping: Four representative biological mechanisms provide bionic prototypes. (a) Pigeon flock hierarchy, characterized by a three-tier coupled structure, supports formation control and cooperation under interference. (b) Wolf pack hunting, structured as four-stage dynamic collaboration, enables efficient task division. (c) Fish school self-repair through decentralized topology adjustment enhances swarm robustness. (d) Honeybee colony division of labor, based on decentralized decision-making and dynamic role assignment, improves task efficiency. Bionic mapping proceeds through three steps: decomposition of the biological prototype and extraction of behavioral features using dynamic mode decomposition, social interaction filtering, and group state classification (Fig. 5); abstraction of behavior rules and mathematical modeling using approaches such as differential equations and graph theory; and algorithmic adaptation and intelligent enhancement by converting mathematical models into executable rules and integrating DRL. (4) Core BSI-DRL modeling directions: Three main technical paths are summarized with horizontal comparison (Table 1). 
(a) Bionic-rule parameterization with DRL optimization (shallow fusion): DRL is used to optimize key parameters of bionic models, such as attraction-repulsion weights in Boids, preserving biological robustness but exhibiting instability during large-swarm training. (b) Generative bionic-rule multi-agent reinforcement learning (middle fusion): bio-inspired reward functions guide the autonomous emergence of cooperative rules, improving adaptability but reducing interpretability due to “black-box” characteristics. (c) Dynamic role assignment with hierarchical DRL (deep fusion): a three-tier architecture comprising global planning, group role assignment, and individual execution reduces decision-making complexity in heterogeneous swarms and strengthens multi-task adaptability, although multi-level coordination remains challenging. A scenario-adaptation logic based on swarm scale, environmental dynamics, and task heterogeneity, together with a multi-method fusion strategy, is also proposed. Conclusions This study clarifies the theoretical framework and research progress of BSI-DRL-based UAV swarm behavior modeling. BSI addresses limitations of traditional centralized control, including scale expandability, dynamic adaptability, and system credibility, by simulating biological swarm mechanisms. DRL further enables a shift toward autonomous learning. Horizontal comparison indicates complementary strengths across the three core directions: parameterization optimization maintains basic robustness, generative methods enhance dynamic adaptability, and hierarchical collaboration improves performance in heterogeneous multi-task settings. The proposed scenario-adaptation logic, which applies parameterization to small-to-medium and static scenarios, generative methods to medium-to-large and dynamic scenarios, and hierarchical collaboration to heterogeneous multi-task missions, together with the multi-method fusion strategy, offers feasible engineering pathways. 
Key engineering bottlenecks are also identified, including inconsistent environmental perception, unbalanced multi-objective decision-making, and limited system interpretability, providing a basis for targeted technical advancement. Prospects Future work focuses on five directions to enhance the capacity of BSI-DRL for complex UAV swarm tasks. (1) Cross-species biological mechanism integration: combining advantages of different biological prototypes to construct adaptive hybrid systems. (2) BSI-DRL closed-loop collaborative evolution: establishing a bidirectional interaction framework in which BSI provides initial strategies and safety boundaries, while DRL refines bionic rules online. (3) Bird-swarm-like phase-transition control and DRL fusion: using phase-transition order parameters as DRL observation indicators to improve parameter interpretability. (4) Digital-twin and hardware-in-the-loop training and verification: building high-fidelity digital-twin environments to narrow simulation–reality gaps. (5) Real-scenario performance evaluation and field deployment: conducting field tests to assess algorithm effectiveness and guide theoretical refinement.
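
The shallow-fusion path described above treats the attraction-repulsion weights of a Boids model as the parameters that DRL tunes. The sketch below shows a generic 2D Boids update with those weights exposed as arguments. It is a textbook-style illustration rather than the formulation used in any of the reviewed works; the neighborhood radius, time step, and function names are assumptions, and no learning loop is included.

```python
def boids_step(positions, velocities, w_cohesion, w_separation, w_alignment,
               radius=5.0, dt=0.1):
    """One synchronous update of a 2D Boids flock.

    positions, velocities: lists of (x, y) tuples.
    w_*: rule weights, i.e. the quantities a DRL agent could adjust.
    """
    new_positions, new_velocities = [], []
    for i, ((px, py), (vx, vy)) in enumerate(zip(positions, velocities)):
        neighbors = [
            j for j, (qx, qy) in enumerate(positions)
            if j != i and (qx - px) ** 2 + (qy - py) ** 2 <= radius ** 2
        ]
        ax = ay = 0.0
        if neighbors:
            n = len(neighbors)
            # Cohesion (attraction): steer toward the neighbors' centroid.
            cx = sum(positions[j][0] for j in neighbors) / n
            cy = sum(positions[j][1] for j in neighbors) / n
            ax += w_cohesion * (cx - px)
            ay += w_cohesion * (cy - py)
            # Separation (repulsion): steer away from each neighbor.
            for j in neighbors:
                ax += w_separation * (px - positions[j][0])
                ay += w_separation * (py - positions[j][1])
            # Alignment: move toward the neighbors' mean velocity.
            mvx = sum(velocities[j][0] for j in neighbors) / n
            mvy = sum(velocities[j][1] for j in neighbors) / n
            ax += w_alignment * (mvx - vx)
            ay += w_alignment * (mvy - vy)
        nvx, nvy = vx + dt * ax, vy + dt * ay
        new_velocities.append((nvx, nvy))
        new_positions.append((px + dt * nvx, py + dt * nvy))
    return new_positions, new_velocities


def gap(positions):
    """Distance between the two agents of a two-agent flock."""
    (x0, y0), (x1, y1) = positions
    return ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5


# Two hypothetical agents at rest, 2 units apart.
start = [(0.0, 0.0), (2.0, 0.0)]
rest = [(0.0, 0.0), (0.0, 0.0)]
closer, _ = boids_step(start, rest, w_cohesion=1.0, w_separation=0.0, w_alignment=0.0)
farther, _ = boids_step(start, rest, w_cohesion=0.0, w_separation=1.0, w_alignment=0.0)
```

In a parameterization approach of this kind, the policy's action would be the weight triple, while the rule structure itself stays fixed, which is why the biological robustness of the base model is retained.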

An Interpretable Vulnerability Detection Method Based on Graph and Code Slicing

GAO Wenchao, SUO Jianhua, ZHANG Ao

2026, 48(1): 311-320. doi: 10.11999/JEIT250363

[Abstract](274) [FullText HTML] (123) [PDF 2013KB](39)

Abstract:
Objective Deep learning technology is widely applied to source code vulnerability detection. Existing approaches are mainly sequence-based or graph-based. Sequence-based models convert structured code into linear sequences, which leads to the loss of syntactic and structural information and often results in a high false positive rate. Graph-based models capture structural features but cannot represent execution order, and their detection granularity is usually limited to the function level. Both types of methods lack interpretability, which restricts the ability of developers to locate vulnerability sources. Large Language Models (LLMs) show progress in code understanding; however, they still exhibit high computational cost, hallucination risk in security analysis, and insufficient modeling of complex program logic. To address these issues, an interpretable vulnerability detection method based on Graph and Slicing Vulnerability Detection (GSVD) is proposed. Structural semantics and sequential information are integrated, and fine-grained, line-level explanations are provided for prediction results. Methods The proposed method consists of four modules: code graph feature extraction, code sequence feature extraction, feature fusion, and an interpreter module (Fig. 1). First, the source code is normalized, and the Joern static analysis tool is applied to generate multiple code graphs, including the Abstract Syntax Tree (AST), Data Dependency Graph (DDG), and Control Dependency Graph (CDG). These graphs represent program structure, data flow, and control flow, respectively. Node features are initialized using CodeBERT embeddings combined with one-hot encodings of node types. Based on the adjacency matrix of each graph, a Gated Graph Convolutional Network (GGCN) with a self-attention pooling layer is employed to extract deep structural semantic features. A code slicing algorithm based on taint analysis (Algorithm 1) is then designed. 
In this algorithm, taint sources are identified, and taints are propagated according to data and control dependencies to generate concise slices associated with potential vulnerabilities. Unrelated code is removed, and the resulting slices are processed using a Bidirectional Long Short-Term Memory (BiLSTM) network to capture long-range sequential dependencies. After graph and sequence features are extracted, a gating mechanism is applied for feature fusion. The fused feature vectors are further processed using a Gated Recurrent Unit (GRU), which learns dependencies between structural and sequential representations through dynamic state updates. To support vulnerability detection and localization, a Vulnerability Detection Explainer (VDExplainer) is designed. Inspired by the Hyperlink-Induced Topic Search (HITS) algorithm, node “authority” and “hub” scores are iteratively computed under an edge-mask constraint to estimate node importance and provide node-level interpretability. Results and Discussions The effectiveness of the GSVD model is evaluated through comparative experiments on the Devign dataset (FFmpeg + Qemu), as shown in Table 2. GSVD is compared with several baseline models and achieves the highest accuracy and F1-score, reaching 64.57% and 61.89%, respectively. Recall increases to 62.63%, indicating improved vulnerability detection capability and fewer missed vulnerabilities. To evaluate the GRU-based fusion module, three fusion strategies are compared: feature concatenation, weighted sum, and attention mechanism (Table 3). GSVD achieves the best overall performance. Although its precision (61.17%) is slightly lower than that of the weighted sum method (63.33%), accuracy, recall, and F1-score exhibit more balanced performance. Ablation studies (Tables 4 and 5) further demonstrate the contribution of the slicing algorithm. 
The taint propagation-based slicing method reduces the average number of code lines from 51.98 to 17.30, corresponding to a reduction of 66.72%, and lowers the data redundancy rate to 6.42%. In comparison, VulDeePecker and SySeVR report redundancy rates of 19.58% and 22.10%, respectively. This reduction in noise yields a 1.53% improvement in F1-score, confirming that the slicing module enhances focus on critical code segments. The interpretability of GSVD is validated on the Big-Vul dataset using the VDExplainer module (Table 6). Compared with the standard Graph Neural Network Explainer (GNNExplainer), higher localization accuracy is achieved at all evaluation thresholds. When 50% of the nodes are selected, localization accuracy increases by 7.65%, demonstrating the advantage of VDExplainer in node-level vulnerability explanation. Conclusions The GSVD model overcomes the limitations of single-modal methods by integrating graph structures with taint-based code slicing. Both detection accuracy and interpretability are improved. The VDExplainer enables node-level and line-level localization, enhancing practical applicability. Experimental results confirm the advantages of the proposed method in vulnerability detection and explanation.
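
VDExplainer is described as inspired by HITS, with authority and hub scores computed iteratively under an edge-mask constraint. The sketch below shows the standard HITS iteration with a fixed per-edge weight standing in for the mask. It is not the paper's explainer: how the mask is learned and how the scores map to code lines are not given in the abstract, and the names `hits_scores` and `edge_mask` are hypothetical.

```python
import math


def hits_scores(n_nodes, edges, edge_mask=None, iterations=50):
    """Masked HITS on a directed graph.

    edges: list of (src, dst) node index pairs.
    edge_mask: optional per-edge weights in [0, 1]; 0 removes an edge.
    Returns (authority, hub) score lists, each L2-normalized.
    """
    if edge_mask is None:
        edge_mask = [1.0] * len(edges)
    hub = [1.0] * n_nodes
    auth = [1.0] * n_nodes
    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes that point in.
        new_auth = [0.0] * n_nodes
        for (s, d), m in zip(edges, edge_mask):
            new_auth[d] += m * hub[s]
        # Hub update: sum the authority scores of nodes pointed to.
        new_hub = [0.0] * n_nodes
        for (s, d), m in zip(edges, edge_mask):
            new_hub[s] += m * new_auth[d]
        # Normalize; fall back to 1.0 to avoid dividing by zero.
        na = math.sqrt(sum(a * a for a in new_auth)) or 1.0
        nh = math.sqrt(sum(h * h for h in new_hub)) or 1.0
        auth = [a / na for a in new_auth]
        hub = [h / nh for h in new_hub]
    return auth, hub


# Hypothetical 4-node graph: nodes 0 and 1 point to 2, and 2 points to 3.
edges = [(0, 2), (1, 2), (2, 3)]
auth_full, hub_full = hits_scores(4, edges)
auth_masked, _ = hits_scores(4, edges, edge_mask=[0.0, 0.0, 1.0])
top_full = max(range(4), key=lambda i: auth_full[i])
top_masked = max(range(4), key=lambda i: auth_masked[i])
```

Masking the two edges into node 2 shifts the highest authority score to node 3, which illustrates how an edge mask changes the node ranking that an explainer would report.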

Depression Screening Method Driven by Global-Local Feature Fusion

ZHANG Siyong, QIU Jiefan, ZHAO Xiangyun, XIAO Kejiang, CHEN Xiaofu, MAO Keji

2026, 48(1): 321-334. doi: 10.11999/JEIT250035

[Abstract](567) [FullText HTML] (336) [PDF 3274KB](31)

Abstract:
Objective Depression is a globally prevalent mental disorder that poses a serious threat to the physical and mental health of millions of individuals. Early screening and diagnosis are essential to reducing severe consequences such as self-harm and suicide. However, conventional questionnaire-based screening methods are limited by their dependence on the reliability of respondents’ answers, their difficulty in balancing efficiency with accuracy, and the uneven distribution of medical resources. New auxiliary screening approaches are therefore needed. Existing Artificial Intelligence (AI) methods for depression detection based on facial features primarily emphasize global expressions and often overlook subtle local cues such as eye features. Their performance also declines in scenarios where partial facial information is obscured, for instance by masks, and they raise privacy concerns. This study proposes a Global-Local Fusion Axial Network (GLFAN) for depression screening. By jointly extracting global facial and local eye features, this approach enhances screening accuracy and robustness under complex conditions. A corresponding dataset is constructed, and experimental evaluations are conducted to validate the method’s effectiveness. The model is deployed on edge devices to improve privacy protection while maintaining screening efficiency, offering a more objective, accurate, efficient, and secure depression screening solution that contributes to mitigating global mental health challenges. Methods To address the challenges of accuracy and efficiency in depression screening, this study proposes GLFAN. For long-duration consultation videos with partial occlusions such as masks, data preprocessing is performed using OpenFace 2.0 and facial keypoint algorithms, combined with peak detection, clustering, and centroid search strategies to segment the videos into short sequences capturing dynamic facial changes, thereby enhancing data validity. 
At the model level, GLFAN adopts a dual-branch parallel architecture to extract global facial and local eye features simultaneously. The global branch uses MTCNN for facial keypoint detection and enhances feature extraction under occlusion using an inverted bottleneck structure. The local branch detects eye regions via YOLO v7 and extracts eye movement features using a ResNet-18 network integrated with a convolutional attention module. Following dual-branch feature fusion, an integrated convolutional module optimizes the representation, and classification is performed using an axial attention network. Results and Discussions The performance of GLFAN is evaluated through comprehensive, multi-dimensional experiments. On the self-constructed depression dataset, high accuracy is achieved in binary classification tasks, and non-depression and severe depression categories are accurately distinguished in four-class classification. Under mask-occluded conditions, a precision of 0.72 and a precision of 0.690 are obtained for depression detection. Although these values are lower than the precision of 0.87 and precision of 0.840 observed under non-occluded conditions, reliable screening performance is maintained. Compared with other advanced methods, GLFAN achieves higher recall and F1 scores. On the public AVEC2013 and AVEC2014 datasets, the model achieves lower Mean Absolute Error (MAE) values and shows advantages in both short- and long-sequence video processing. Heatmap visualizations indicate that GLFAN dynamically adjusts its attention according to the degree of facial occlusion, demonstrating stronger adaptability than ResNet-50. Edge device tests further confirm that the average processing delay remains below 56.14 milliseconds per frame, and stable performance is maintained under low-bandwidth conditions. Conclusions This study proposes a depression screening approach based on edge vision technology. 
A lightweight, end-to-end GLFAN is developed to address the limitations of existing screening methods. The model integrates global facial features extracted via MTCNN with local eye-region features captured by YOLO v7, followed by effective feature fusion and classification using an Axial Transformer module. By emphasizing local eye-region information, GLFAN enhances performance in occluded scenarios such as mask-wearing. Experimental validation using both self-constructed and public datasets demonstrates that GLFAN reduces missed detections and improves adaptability to short-duration video inputs compared with existing models. Grad-CAM visualizations further reveal that GLFAN prioritizes eye-region features under occluded conditions and shifts focus to global facial features when full facial information is available, confirming its context-specific adaptability. The model has been successfully deployed on edge devices, offering a lightweight, efficient, and privacy-conscious solution for real-time depression screening.

Tensor-Train Decomposition for Lightweight Liver Tumor Segmentation

MA Jinlin, YANG Jipeng

2026, 48(1): 335-345. doi: 10.11999/JEIT250293

[Abstract](216) [FullText HTML] (84) [PDF 3347KB](27)

Abstract:
Objective Convolutional Neural Networks (CNNs) have recently achieved notable progress in medical image segmentation. Their conventional convolution operations, however, remain constrained by locality, which reduces their ability to capture global contextual information. Researchers have pursued two main strategies to address this limitation. Hybrid CNN-Transformer architectures use self-attention to model long-range dependencies, and this markedly improves segmentation accuracy. State-space models such as the Mamba series reduce computational cost and retain global modeling capacity, and they also show favorable scalability. However, CNN-Transformer models remain computationally demanding for real-time use, and Mamba-based approaches still face challenges such as boundary blur and parameter redundancy when segmenting small targets and low-contrast regions. Lightweight network design has therefore become a research focus. Existing lightweight methods, however, still show limited segmentation accuracy for liver tumor targets with very small sizes and highly complex boundaries. This paper proposes an efficient lightweight method for liver tumor segmentation that aims to meet the combined requirements of high accuracy and real-time performance for small targets with complex boundaries. Methods The proposed method integrates three strategies. A Tensor-Train Multi-Scale Convolutional Attention (TT-MSCA) module is designed to improve segmentation accuracy for small targets and complex boundaries. This module optimizes multi-scale feature fusion through a TT_Layer and employs tensor decomposition to integrate feature information across scales, which supports more accurate identification and segmentation of tumor regions in challenging images. A feature extraction module with a multi-branch residual structure, termed the IncepRes Block, strengthens the model’s capacity to capture global contextual information. 
Its parallel multi-branch design processes features at several scales and enriches feature representation at a relatively low computational cost. All standard 3×3 convolutions are then decoupled into two consecutive strip convolutions. This reduces the number of parameters and the computational cost while preserving feature extraction capacity. The combination of these modules allows the method to improve segmentation accuracy and maintain high efficiency, and it demonstrates strong performance for small targets and blurry boundary regions. Results and Discussions Experiments on the LiTS2017 and 3Dircadb datasets show that the proposed method reaches Dice coefficients of 98.54% and 97.95% for liver segmentation, and 94.11% and 94.35% for tumor segmentation. Ablation studies show that the TT-MSCA module and the IncepRes Block improve segmentation performance with only a modest computational cost, and the SC Block reduces computational cost while accuracy is preserved (Table 2). When the TT-MSCA module is inserted into the reduced U-Net on the LiTS2017 dataset, the tumor Dice and IoU reach 93.73% and 83.60%. These values are second only to the final model. On the 3Dircadb dataset, adding the SC Block after TT-MSCA produces a slight accuracy decrease but reduces GFLOPs by a factor of 4.15. Compared with the original U-Net, the present method improves liver IoU by 3.35% and tumor IoU by 5.89%. The TT-MSCA module also consistently exceeds the baseline MSCA module. It increases liver and tumor IoU by 2.59% and 1.95% on LiTS2017, and by 2.03% and 3.13% on 3Dircadb (Table 5). These results show that the TT_Layer strengthens global context perception and fine-detail representation through multi-scale feature fusion. The proposed network contains 0.79 M parameters and 1.43 GFLOPs, which represents a 74.9% reduction in parameters compared with CMUNeXt (3.15 M). 
Real-time performance evaluation records 156.62 fps, more than three times the 50.23 fps of the vanilla U-Net (Table 6). Although accuracy decreases slightly in a few isolated metrics, the overall accuracy-compression balance is improved, and the method demonstrates strong practical value for lightweight liver tumor segmentation. Conclusions This paper proposes an efficient liver tumor segmentation method that improves segmentation accuracy and meets real-time requirements. The TT-MSCA module enhances recognition of small targets and complex boundaries through the integration of spatial and channel attention. The IncepRes Block strengthens the network’s perception of liver tumors of different sizes. The decoupling of standard 3×3 convolutions into two consecutive strip convolutions reduces the parameter count and computational cost while preserving feature extraction capacity. Experimental evidence shows that the method reduces errors caused by complex boundaries and small tumor sizes and can satisfy real-time deployment needs. It offers a practical technical option for liver tumor segmentation. The method requires many training iterations to reach optimal data fitting, and future work will address improvements in convergence speed.
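
As a rough illustration of why decoupling a 3×3 convolution into two strip convolutions saves parameters, the sketch below counts weights for a single layer and re-derives two ratios quoted in the abstract. The channel width of 64, the 3×1-then-1×3 ordering, and the function names are assumptions made for illustration and are not taken from the paper; biases are ignored.

```python
def conv_params(c_in: int, c_out: int, kh: int, kw: int) -> int:
    """Weight count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * kh * kw


def strip_pair_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x 1 convolution followed by a 1 x k convolution."""
    return conv_params(c_in, c_out, k, 1) + conv_params(c_out, c_out, 1, k)


def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


channels = 64  # hypothetical layer width
full = conv_params(channels, channels, 3, 3)
strips = strip_pair_params(channels, channels)
print(full, strips, round(relative_reduction(full, strips), 3))  # 36864 24576 0.333

# Figures quoted in the abstract: 0.79 M vs 3.15 M parameters, 156.62 vs 50.23 fps
print(round(100 * relative_reduction(3.15, 0.79), 1))  # 74.9
print(round(156.62 / 50.23, 2))  # 3.12
```

With equal input and output widths, the strip pair needs 6C² weights instead of 9C², a one-third saving per layer. The overall reduction reported in the abstract also reflects the other architectural choices.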

A Fake Attention Map-Driven Multi-Task Deepfake Video Detection Model

LIU Pengyu, ZHENG Tianyang, DONG Min

2026, 48(1): 346-358. doi: 10.11999/JEIT250926

Abstract:
Objective Deepfake detection is a major challenge in multimedia forensics and information security as synthetic media generation advances. Most high-quality detection methods rely on supervised binary classification models with implicit attention mechanisms. Although these models learn discriminative features and reveal manipulation traces, their performance decreases when confronted with unseen forgery techniques. The absence of explicit guidance during feature fusion reduces sensitivity to subtle artifacts and weakens cross-domain generalization. To address these issues, a detection framework named F-BiFPN-MTLNet is proposed. The framework is designed to achieve high detection accuracy and strong generalization by introducing an explicit forgery-attention-guided multi-scale feature fusion mechanism and a multi-task learning strategy. This research strengthens the interpretability and robustness of deepfake detection models, particularly in real-world settings where forgery methods are diverse and continuously changing. Methods The proposed F-BiFPN-MTLNet contains two components: a Forgery-attention-guided Bidirectional Feature Pyramid Network (F-BiFPN) and a Multi-Task Learning Network (MTLNet). The F-BiFPN (Fig. 1) is designed to provide explicit guidance for fusing multi-scale feature representations from different backbone layers. Instead of using simple top-down and bottom-up fusion, a forgery-attention map is applied to supervise the fusion process. This map highlights potential manipulation regions and assigns adaptive weights to each feature level, ensuring that both semantic and spatial details are retained and redundant information is reduced. This attention-guided fusion strengthens the sensitivity of the network to fine-grained forged traces and improves the quality of the resulting representations. Results and Discussions Experiments are conducted on multiple benchmark datasets, including FaceForensics++, DFDC, and Celeb-DF (Table 1). 
The proposed F-BiFPN-MTLNet shows consistent gains over state-of-the-art methods in both Area Under the Curve (AUC) and Average Precision (AP) metrics (Table 1). The findings show that attention-guided fusion strengthens the detection of subtle manipulations, and the multi-task learning structure stabilizes performance across different forgery types. Ablation analyses (Table 2) confirm the complementary effects of the two modules. Removing F-BiFPN reduces sensitivity to local artifacts, whereas omitting the self-consistency branch reduces robustness under cross-dataset evaluation. Visualization results (Fig. 8) show that F-BiFPN-MTLNet consistently focuses on forged regions and produces interpretable attention maps that align with actual manipulation areas. The framework achieves a balanced improvement in accuracy, generalization, and transparency, while maintaining computational efficiency suitable for practical forensic applications. Conclusions In this study, a forgery-attention-guided weighted bidirectional feature pyramid network combined with a multi-task learning framework is proposed for robust and interpretable deepfake detection. The F-BiFPN provides explicit supervision for multi-scale feature fusion through forgery-attention maps, reducing redundancy and emphasizing informative regions. The MTLNet introduces a learnable mask branch and a self-consistency branch, jointly strengthening localization accuracy and cross-domain robustness. Experimental results show that the proposed model exceeds existing baselines in AUC and AP metrics while retaining strong interpretability through visualized attention maps. Overall, F-BiFPN-MTLNet achieves a balanced improvement in fine-grained localization, detection reliability, and generalization ability. Its explicit attention and multi-task strategies offer a new direction for developing interpretable and resilient deepfake detection systems. 
Future work will examine the extension of the framework to weakly supervised and unsupervised settings, reduce dependence on pixel-level annotations, and explore adversarial training strategies to strengthen adaptability against evolving forgery methods.
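
For readers unfamiliar with the AUC metric reported above, the following minimal sketch computes it as the probability that a randomly chosen fake sample receives a higher score than a randomly chosen real one. The labels and scores are made-up toy values, not results from the paper, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: fraction of (fake, real) pairs ranked correctly.

    labels: 1 for fake, 0 for real; ties between scores count as half a win.
    Assumes at least one sample of each class is present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


toy_labels = [1, 0, 1, 1, 0, 0]               # toy ground truth
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # toy detector outputs
print(round(roc_auc(toy_labels, toy_scores), 4))
```

The pairwise form is quadratic in the number of samples, which is fine for an illustration; practical evaluations usually rely on a sorted, rank-based computation.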

Precise Hand Joint Motion Analysis Driven by Complex Physiological Information

YAN Jiaqing, LIU Gengchen, ZHOU Qingqi, XUE Weiqi, ZHOU Weiao, TIAN Yunzhi, WANG Jiaju, DONG Zhekang, LI Xiaoli

2026, 48(1): 359-369. doi: 10.11999/JEIT250033

Abstract:
Objective The human hand is a highly dexterous organ essential for performing complex tasks. However, dysfunction due to trauma, congenital anomalies, or disease substantially impairs daily activities. Restoring hand function remains a major challenge in rehabilitation medicine. Virtual Reality (VR) technology presents a promising approach for functional recovery by enabling hand pose reconstruction from surface ElectroMyoGraphy (sEMG) signals, thereby facilitating neural plasticity and motor relearning. Current sEMG-based hand pose estimation methods are limited by low accuracy and coarse joint resolution. This study proposes a new method to estimate the motion of 15 hand joints using eight-channel sEMG signals, offering a potential improvement in rehabilitation outcomes and quality of life for individuals with hand impairment. Methods The proposed method, termed All Hand joints Posture Estimation (AHPE), incorporates a continuous denoising network that combines sparse attention and multi-channel attention mechanisms to extract spatiotemporal features from sEMG signals. A dual-decoder architecture estimates both noisy hand poses and the corresponding correction ranges. These outputs are subsequently refined using a Bidirectional Long Short-Term Memory (BiLSTM) network to improve pose accuracy. Model training employs a composite loss function that integrates Mean Squared Error (MSE) and Kullback-Leibler (KL) divergence to enhance joint angle estimation and capture inter-joint dependencies. Performance is evaluated using the NinaproDB8 and NinaproDB5 datasets, which provide sEMG and hand pose data for single-finger and multi-finger movements, respectively. Results and Discussions The AHPE model outperforms existing methods—including CNN-Transformer, DKFN, CNN-LSTM, TEMPOnet, and RPC-Net—in estimating hand poses from multi-channel sEMG signals. 
In within-subject validation (Table 1), AHPE achieves a Root Mean Squared Error (RMSE) of 2.86, a coefficient of determination (R²) of 0.92, and a Mean Absolute Deviation (MAD) of 1.79° for MetaCarPophalangeal (MCP) joint rotation angle estimation. In between-subject validation (Table 2), the model maintains high accuracy with an RMSE of 3.72, an R² of 0.88, and an MAD of 2.36°, demonstrating strong generalization. The model’s capacity to estimate complex hand gestures is further confirmed using the NinaproDB5 dataset. Estimated hand poses are visualized with the Mano Torch hand model (Fig. 4, Fig. 5). The average R² values for finger joint extension estimation are 0.72 (thumb), 0.692 (index), 0.696 (middle), 0.689 (ring), and 0.696 (little finger). Corresponding RMSE values are 10.217°, 10.257°, 10.290°, 10.293°, and 10.303°, respectively. A grid error map (Fig. 6) highlights prediction accuracy, with red regions indicating higher errors. Conclusions The AHPE model offers an effective approach for estimating hand poses from sEMG signals, addressing key challenges such as signal noise, high dimensionality, and inter-individual variability. By integrating mixed attention mechanisms with a dual-decoder architecture, the model enhances both accuracy and robustness in multi-joint hand pose estimation. Results confirm the model’s capacity to reconstruct detailed hand kinematics, supporting its potential for applications in hand function rehabilitation and human-machine interaction. Future work will aim to improve robustness under real-world conditions, including sensor noise and environmental variation.
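
The three evaluation metrics quoted above can be sketched as follows. This assumes that MAD denotes the mean absolute difference between estimated and measured joint angles, which the abstract does not spell out; the angle values are toy numbers rather than data from NinaproDB8 or NinaproDB5.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between measured and estimated angles."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mad(y_true, y_pred):
    """Mean absolute deviation of the estimate from the measurement (assumed definition)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


angles_true = [10.0, 20.0, 30.0, 40.0]   # toy joint angles in degrees
angles_pred = [12.0, 18.0, 33.0, 39.0]   # toy estimates in degrees
print(rmse(angles_true, angles_pred), r_squared(angles_true, angles_pred), mad(angles_true, angles_pred))
```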

T³FRNet: A Cloth-Changing Person Re-identification via Texture-aware Transformer Tuning Fine-grained Reconstruction Method

ZHUANG Jianjun, WANG Nan

2026, 48(1): 370-381. doi: 10.11999/JEIT250476

Abstract:
Objective Compared with conventional person re-identification, Cloth-Changing Person Re-Identification (CC Re-ID) requires moving beyond reliance on the temporal stability of appearance features and instead demands models with stronger robustness and generalization to meet real-world application requirements. Existing deep feature representation methods leverage salient regions or attribute information to obtain discriminative features and mitigate the effect of clothing variations; however, their performance often degrades under changing environments. To address the challenges of effective feature extraction and limited training samples in CC Re-ID tasks, a Texture-Aware Transformer Tuning Fine-Grained Reconstruction Network (T³FRNet) is proposed. The method aims to exploit fine-grained information in person images, enhance the robustness of feature learning, and reduce the adverse effect of clothing changes on recognition performance, thereby alleviating performance bottlenecks under scene variations. Methods To compensate for the limitations of local receptive fields, a Transformer-based attention mechanism is integrated into a ResNet50 backbone, forming a hybrid architecture referred to as ResFormer50. This design enables spatial relationship modeling on top of local features and improves perceptual capacity for feature extraction while maintaining a balance between efficiency and performance. A fine-grained Texture-Aware (TA) module concatenates processed texture features with deep semantic features, improving recognition capability under clothing variations. An Adaptive Hybrid Pooling (AHP) module performs channel-wise autonomous aggregation, allowing deeper mining of feature representations and balancing global representation consistency with robustness to clothing changes. An Adaptive Fine-Grained Reconstruction (AFR) strategy introduces adversarial perturbations and selective reconstruction at the fine-grained level. 
Without explicit supervision, this strategy enhances robustness and generalization against clothing changes and local detail perturbations. In addition, a Joint Perception Loss (JP-Loss) is constructed by integrating fine-grained identity robustness loss, texture feature loss, identity classification loss, and triplet loss. This composite loss jointly supervises the learning of robust fine-grained identity features under cloth-changing conditions. Results and Discussions Extensive evaluations are conducted on LTCC, PRCC, Celeb-reID, and the large-scale DeepChange dataset (Table 1). Under cloth-changing scenarios, the proposed method achieves Rank-1/mAP scores of 45.6%/19.8% on LTCC, 70.6%/69.1% on PRCC (Table 2), 64.6%/18.4% on Celeb-reID (Table 3), and 58.0%/20.8% on DeepChange (Table 4), outperforming existing state-of-the-art approaches. The TA module effectively captures latent local texture details and, when combined with the AFR strategy, enables fine-grained adversarial perturbation and selective reconstruction. This improves fine-grained feature representation and allows the method to achieve 96.2% Rank-1 and 89.3% mAP on the clothing-consistent Market-1501 dataset (Table 5). The JP-Loss further supports the TA module and AFR strategy by enabling fine-grained adaptive regulation and clustering of texture-sensitive identity features (Table 6). When the Transformer-based attention mechanism is inserted after stage 2 of ResNet50, improved local structural perception and global context modeling are obtained with only a slight increase in computational overhead (Table 7). Setting the $\beta$ parameter to 0.5 (Fig. 5) enables effective balancing of global texture consistency and local fine-grained discriminability. Visualization results on PRCC (Fig. 6a) and top-10 retrieval comparisons (Fig. 6b) provide intuitive evidence of improved stability and accuracy in cloth-changing scenarios. Conclusions A CC Re-ID method based on T³FRNet is proposed, consisting of the ResFormer50 backbone, TA module, AHP module, AFR strategy, and JP-Loss. Experimental results on four cloth-changing benchmarks and one clothing-consistent dataset confirm the effectiveness of the proposed approach. Under long-term scenarios, Rank-1/mAP improvements of 16.8%/8.3% on LTCC and 30.4%/32.9% on PRCC are achieved. The ResFormer50 backbone supports spatial relationship modeling over local fine-grained features, while the TA module and AFR strategy enhance feature expressiveness. The AHP module balances sensitivity to local textures and stability of global features, and JP-Loss strengthens adaptive regulation of fine-grained representations. Future work will focus on simplifying the architecture to reduce computational complexity and latency while maintaining high recognition accuracy.
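
The Rank-1 and mAP figures quoted above are standard retrieval metrics. A minimal sketch of how they are commonly computed is given below. It assumes the gallery has already been ranked for each query and ignores benchmark-specific filtering rules (for example, excluding same-clothes or same-camera gallery images), which differ between datasets and are not described in the abstract. The function name and toy rankings are hypothetical.

```python
def rank1_and_map(ranked_matches):
    """Compute Rank-1 accuracy and mean average precision.

    ranked_matches: one list per query; each list holds booleans over the ranked
    gallery, True where the gallery identity equals the query identity.
    Assumes every query has at least one true match in the gallery.
    """
    rank1_hits, ap_sum = 0, 0.0
    for matches in ranked_matches:
        rank1_hits += int(matches[0])
        hits, precision_sum = 0, 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank  # precision at this match
        ap_sum += precision_sum / hits
    n = len(ranked_matches)
    return rank1_hits / n, ap_sum / n


toy_rankings = [[True, False, True], [False, True, False]]  # two toy queries
rank1, mean_ap = rank1_and_map(toy_rankings)
print(rank1, round(mean_ap, 4))
```

Rank-1 only checks the top result, while mAP rewards placing all true matches early, which is why the two can differ widely on the same benchmark.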

Research on Segmentation Algorithm of Oral and Maxillofacial Panoramic X-ray Images under Dual-domain Multiscale State Space Network

LI Bing, HU Weijie, LIU Xia

2026, 48(1): 382-393. doi: 10.11999/JEIT250639

Abstract:
Objective To address significant morphological variability, blurred boundaries between teeth and gingival tissues, and overlapping grayscale distributions in periodontal regions of oral and maxillofacial panoramic X-ray images, a state space model based on Mamba, a recently proposed neural network architecture, is adopted. The model preserves the advantage of Convolutional Neural Networks (CNNs) in local feature extraction while avoiding the high computational cost associated with Transformer-based methods. On this basis, a Dual-Domain Multiscale State Space Network (DMSS-Net)-based segmentation algorithm for oral and maxillofacial panoramic X-ray images is proposed, resulting in notable improvements in segmentation accuracy and computational efficiency. Methods An encoder-decoder architecture is adopted. The encoder consists of dual branches to capture global contextual information and local structural features, whereas the decoder progressively restores spatial resolution. Skip connections are used to transmit fused feature maps from the encoding path to the decoding path. During decoding, fused features gradually recover spatial resolution and reduce channel dimensionality through deconvolution combined with upsampling modules, finally producing a two-channel segmentation map. Results and Discussions Ablation experiments are conducted to validate the contribution of each module to overall performance, as shown in Table 1. The proposed model demonstrates clear performance gains. The Dice score increases by 5.69 percentage points to 93.86%, and the 95th percentile Hausdorff distance (HD95) decreases by 2.97 mm to 18.73 mm, with an overall accuracy of 94.57%. In terms of efficiency, the model size is 81.23 MB with 90.1 million parameters, which is substantially smaller than that of the baseline model, enabling simultaneous improvement in segmentation accuracy and reduction in parameter count. 
Comparative experiments with seven representative medical image segmentation models under identical conditions, as reported in Table 2, show that the DMSS-Net achieves superior segmentation accuracy while maintaining a model size comparable to, or smaller than, Transformer-based models of similar scale. Conclusions A DMSS-Net-based segmentation algorithm for oral and maxillofacial panoramic X-ray images is proposed. The algorithm is built on a dual-domain fusion framework that strengthens long-range dependency modeling in dental images and improves segmentation performance in regions with indistinct boundaries. The spatial-domain design effectively supports long-range contextual representation under dynamically varying dental arch morphology. Moreover, enhancement in the feature domain improves sensitivity to low-contrast structures and increases robustness against image interference.
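
The Dice score reported above measures overlap between a predicted mask and the ground truth. The sketch below computes Dice and IoU for flattened binary masks; the masks are toy values, and the HD95 boundary metric is deliberately left out because its exact definition varies between toolkits.

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two equal-length sequences of 0/1 labels.

    Assumes at least one foreground pixel exists in pred or target.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    union = pred_sum + target_sum - inter
    dice = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


toy_pred = [1, 1, 0, 0, 1, 0]     # toy predicted mask, flattened
toy_target = [1, 0, 0, 1, 1, 0]   # toy ground-truth mask, flattened
print(dice_and_iou(toy_pred, toy_target))
```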

A Multi-step Channel Prediction Method Based on Pseudo-3D Convolutional Neural Network with Attention Mechanism

TAO Jing, HOU Meng, PENG Wei, ZHANG Guoyan, DAI Jiaming, LIU Weiming, WANG Haidong, WANG Zhen

2026, 48(1): 394-403. doi: 10.11999/JEIT251090

Abstract:
Objective With the rapid growth in connections and data traffic in Fifth Generation (5G) mobile networks, massive Multiple-Input Multiple-Output (MIMO) has become a key technology for improving network performance. The spectral efficiency and energy efficiency of massive MIMO transmission depend on accurate Channel State Information (CSI). However, the non-stationary characteristics of wireless channels, terminal processing delay, and the use of ultra-high-frequency bands intensify CSI aging, which necessitates channel prediction. Most mainstream prediction schemes are designed for generalized stationary channels and rely on single-step prediction. In non-stationary environments, CSI obtained through single-step prediction is likely to become outdated, and frequent single-step prediction greatly increases pilot overhead. To address these challenges, a multi-step channel prediction method based on a Pseudo-Three-Dimensional Convolutional Neural Network (P3D-CNN) and an attention mechanism is proposed. The method learns the joint time-frequency characteristics of CSI, leverages high frequency-domain correlation to mitigate the effect of lower time-domain correlation in multi-step prediction, and improves prediction performance. Methods In this study, the uplink model of a massive MIMO system is constructed (Fig. 1). CSI is obtained through channel estimation, using an Inverse Fast Fourier Transform (IFFT) at the transmitter and a Fast Fourier Transform (FFT) at the receiver. Actual channel measurements provide a CSI dataset with time-frequency dimensions, and autocorrelation analyses are performed in both domains. A multi-step channel prediction network, termed P3D-CNN with the Convolutional Block Attention Module (CBAM) (Fig. 10), is designed. 
The P3D-CNN structure replaces the traditional Three-Dimensional Convolutional Neural Network (3D-CNN) by decomposing the three-dimensional convolution into a two-dimensional convolution in the frequency domain and a one-dimensional convolution in the time domain, which greatly reduces computational complexity. The CBAM-based hybrid attention mechanism is incorporated to extract global information in the frequency and channel domains, further improving channel prediction accuracy. Results and Discussions Based on the measured CSI dataset, prediction methods using an AutoRegressive (AR) model, Fully Connected Long Short-Term Memory (FC-LSTM), and the proposed P3D-CNN-CBAM are compared under different prediction steps. Simulation results show that the average Normalized Mean Square Error (NMSE) of the proposed P3D-CNN-CBAM method is lower than that of the other two methods (Fig. 15). As the prediction step increases from 1 to 10, the prediction error of the AR model and FC-LSTM rises sharply because both rely solely on time-domain correlation. When the prediction step is 10, the average NMSE of these two methods reaches 0.5868 and 0.7648, respectively, whereas the P3D-CNN-CBAM method yields an average NMSE of only 0.3078, maintaining strong prediction performance. The improvement brought by integrating CBAM into the P3D-CNN network is also verified (Fig. 16). Finally, through transfer learning, the proposed method is extended from single-day datasets to multi-day scenarios. Conclusions Based on the measured CSI dataset, a multi-step prediction method addressing CSI aging in massive MIMO systems is proposed. The method applies P3D-CNN with CBAM to improve multi-step prediction accuracy. By replacing full three-dimensional convolution with pseudo-three-dimensional convolution, time-frequency CSI information is effectively extracted, and the CBAM mechanism enhances the learning of global features. 
Experimental results show that: (1) the proposed method achieves clear performance advantages over AR- and FC-LSTM-based approaches; and (2) through transfer learning, multi-step prediction is extended from single-antenna to multi-antenna scenarios.
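
Two quantities in this abstract lend themselves to a short sketch: the NMSE used to score predictions, and the weight saving from replacing a full 3D kernel with a 2D kernel plus a 1D kernel. The code assumes the common NMSE definition (error energy divided by true channel energy) and a kernel size of 3 in every dimension. Both are illustrative assumptions, and the complex channel samples are toy values.

```python
def nmse(h_true, h_pred):
    """Normalized mean square error between true and predicted complex CSI samples."""
    err = sum(abs(t - p) ** 2 for t, p in zip(h_true, h_pred))
    power = sum(abs(t) ** 2 for t in h_true)
    return err / power


def full_3d_weights(k: int) -> int:
    """Weights in one k x k x k kernel (per input/output channel pair)."""
    return k ** 3


def pseudo_3d_weights(k: int) -> int:
    """Weights in a 1 x k x k kernel followed by a k x 1 x 1 kernel."""
    return k * k + k


toy_true = [1 + 1j, 2 - 1j]       # toy channel coefficients
toy_pred = [1 + 0.5j, 1.5 - 1j]   # toy predictions
print(round(nmse(toy_true, toy_pred), 4))
print(full_3d_weights(3), pseudo_3d_weights(3))  # 27 12
```

For a kernel size of 3, the decomposed form uses 12 weights per channel pair instead of 27, which is the source of the complexity reduction described in the Methods.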

Photosensing Model and Circuit Design of Rod Cells Based on Memristors

SUN Jingru, MA Wenjing, WANG Chunhua, XUE Xiaoyong

2026, 48(1): 404-416. doi: 10.11999/JEIT250901

Abstract:
Objective Visual perception plays a critical role in artificial intelligence, robotics, and the Internet of Things. Although existing visual perception devices have achieved substantial progress, the widespread use of conventional CMOS circuit architectures still results in limitations such as slow sensing speed, complex structures, and high power consumption. In contrast, biological visual perception systems exhibit high response speed, low power consumption, and strong stability. Therefore, designing optical perception circuits inspired by biological visual systems has become an active research direction. Existing biologically inspired optical perception circuits are mainly based on the Leaky Integrate-and-Fire (LIF) model, which enables rapid and low-cost conversion of light intensity signals into spike signals. However, the LIF model only supports basic signal conversion and cannot adequately reproduce the working mechanisms and computational characteristics of biological visual neurons. Therefore, practical applications suffer from limited imaging quality, slow response, and weak adaptability. To address these issues, the structure and operating mechanism of human visual perception cells are investigated, a corresponding photosensing circuit is designed, and spiking camera schemes are proposed to achieve high-speed, low-power, and stable imaging. Methods The biological visual system provides valuable inspiration for bionic photosensing circuits due to its fast response, low power consumption, high stability, and strong adaptability. The biological mechanism of photoreceptor cells in the human visual system is analyzed from the perspective of ionic flow, and a mathematical photosensitivity model of rod cells is derived following the construction approach of the Hodgkin-Huxley (HH) model. Based on the closed states of ionic channels in rod cells, a memristor model is designed. 
Using the proposed memristor model and the mathematical model of photoreceptor cells, a rod-cell photosensing circuit is developed. Its adaptability, conversion speed, stability, and dynamic range are evaluated through simulation to verify effectiveness and bionic characteristics, and the results are compared with those of a photosensing circuit based on the LIF model. To further demonstrate practicality, the proposed rod-cell photosensing circuit is applied to a spiking camera, and its adaptability, speed, power consumption, error, and dynamic range are analyzed and compared with a spiking camera based on a simplified neuron photosensing circuit. Results and Discussions Based on the operating principles of photoreceptor cells in the human visual system, a photoreceptor cell model is proposed. Sodium-ion memristors and calcium-ion memristors are introduced to simulate sodium and calcium ion channels in photoreceptor cells, respectively, where the sodium-ion memristor is implemented as a tri-valued memristor. Using the proposed memristor model, a rod-cell photosensing circuit is designed. Under strong illumination, the circuit adapts to light intensity through resistance transitions of the sodium-ion memristor, reducing sensitivity and suppressing the influence of extreme illumination on normal lighting conditions, while maintaining fast conversion speed and a wide dynamic range. The rod-cell photosensing circuit is further combined with the signal conversion circuit to implement a spiking camera. Simulation results show that, compared with spiking cameras based on simplified neuron photosensing circuits and CMOS circuits, the imaging speed increases by 20% and 150%, respectively, while automatic adaptation to extreme illumination, low power consumption, high accuracy, and strong stability are achieved. 
Conclusions Inspired by the operating mechanisms of photoreceptor cells in the visual system, a mathematical model of rod cells and a corresponding memristor model are proposed, and a rod-cell photosensing circuit based on memristors is designed. The circuit reproduces the hyperpolarization and adaptive processes observed in rod-cell photosensing. Through capacitor charge-discharge behavior and memristor resistance transitions, optical signals are converted into voltage signals whose amplitudes vary with light intensity, with higher illumination producing higher voltage amplitudes. Automatic amplitude regulation under strong illumination is achieved, thereby suppressing the influence of extreme light conditions. Compared with simplified neuron photosensing circuits, the proposed rod-cell photosensing circuit provides faster conversion speed, a wide dynamic range from 50 to 5 000 lx, self-adaptation, and improved stability. An intelligent optical sensor array is further constructed, and a spiking camera is implemented by combining the photosensing circuit with a signal conversion circuit and a time-window function. Simulation results confirm clearer imaging under strong background illumination and effective high-speed imaging for both stationary and rapidly moving objects. Compared with spiking cameras based on simplified neuron photosensing circuits and CMOS circuits, imaging speed is improved by 20% and 150%, respectively, while low power consumption, small error, and strong anti-interference capability are maintained.
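
For context, the Leaky Integrate-and-Fire (LIF) baseline that the abstract contrasts with can be sketched in a few lines. This is a generic textbook LIF neuron driven by a constant input standing in for light intensity, not the proposed rod-cell circuit. All parameter values (time constant, threshold, resistance, step size) are hypothetical.

```python
def lif_spike_count(current, t_end=1.0, dt=1e-4, tau=0.02, resistance=1.0, v_th=1.0):
    """Count spikes of a leaky integrate-and-fire neuron under constant input.

    Integrates dV/dt = (-V + R * I) / tau with forward Euler steps and resets
    the membrane voltage to zero whenever it reaches the threshold.
    """
    v, spikes = 0.0, 0
    for _ in range(int(round(t_end / dt))):
        v += dt * (-v + resistance * current) / tau
        if v >= v_th:
            spikes += 1
            v = 0.0
    return spikes


# Stronger input (brighter light) yields a higher spike rate; weak input never fires.
for intensity in (0.5, 2.0, 4.0):
    print(intensity, lif_spike_count(intensity))
```

The sketch shows the limitation noted in the abstract: the LIF model performs a plain intensity-to-rate conversion and has no built-in mechanism for adapting its sensitivity under extreme illumination.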

Modeling and Dynamic Analysis of Controllable Multi-double Scroll Memristor Hopfield Neural Network

LIU Song, LI Zihan, QIU Da, LUO Min, LAI Qiang

2026, 48(1): 417-428. doi: 10.11999/JEIT250972

Abstract:
Objective The human brain is a complex neural system capable of integrated information storage, computation, and parallel processing. The collective activity of neuronal populations processes and coordinates sensory inputs, producing highly nonlinear dynamics. Developing artificial neural network models and analyzing them with nonlinear dynamics theory is therefore of considerable scientific and practical interest. As a brain-inspired model, the Hopfield Neural Network (HNN) exhibits more diverse dynamics when a Memristor Hopfield Neural Network (MHNN) is formed by introducing a memristor into its structure. Among such systems, networks that generate Multi-Double Scroll (MDS) attractors are advantageous because their richer dynamical behavior and more complex topological structure offer strong potential for applications such as image encryption. Methods A memristor model based on an arctangent-function series is proposed and introduced into a fully connected HNN. This forms an MHNN that incorporates electromagnetic radiation effects and memristive synaptic weights. The mechanism responsible for generating MDS chaotic attractors is examined through equilibrium-point analysis. Dynamical characteristics, including the effects of memristive synaptic coupling strength and initial offset boosting, are evaluated using bifurcation diagrams, Lyapunov-exponent spectra, and attraction basins. The system is then implemented on an FPGA platform. Results and Discussions The MHNN generates an arbitrary number of multi-directional MDS chaotic attractors (Figs. 4, 5, 6). Adjusting the memristive synaptic coupling strength yields distinct coexisting attractor types (Figs. 7, 8). Multiple coexisting MDS chaotic attractors also emerge from modifications of the initial values (Figs. 9, 10, 11, 12). Hardware implementation on an FPGA (Figs. 13, 14) confirms the correctness and feasibility of the system. 
Conclusions The proposed MHNN generates unidirectional, bidirectional, and tridirectional MDS chaotic attractors in phase space. The number of scrolls is tuned by the memristor control parameter. The system also shows initial offset boosting, and the number of coexisting attractors is regulated by this parameter. Higher-dimensional networks can be constructed by increasing the number of memristive synapses, demonstrating the broad generality of the model. Owing to its complex topology and rich dynamics, the network offers promising potential for engineering applications.

Research on Load Modulation Enhancement of Quasi-Ideal Doherty Power Amplifier with Equivalent Transconductance Compensation

HUA Jun, XU Gaoming, CHEN Jinghao, LU Siyang, YOU Leiyuan, LÜ Yan, LI Gang, SHI Weimin, LIU Taijun

2026, 48(1): 429-435. doi: 10.11999/JEIT250789

Abstract:
Objective Modern wireless communication systems require efficient dynamic-range performance in RF power amplifiers. The Doherty Power Amplifier (DPA), which uses dynamic load modulation between the main and auxiliary paths, achieves high efficiency at power backoff. It is widely applied in multi-carrier 4G and 5G macro base stations. Research on DPAs generally focuses on improving backoff efficiency, backoff range, and bandwidth. However, the architecture has a structural limitation because the auxiliary amplifier, biased in Class C, exhibits weak current output compared with the main amplifier biased in Class AB. The low conduction level and short turn-on period of the auxiliary path create nonlinear imbalance and reduce overall performance. Methods The study addresses insufficient load modulation caused by the weak current output capability of the auxiliary amplifier. An equivalent transconductance compensation theory is proposed. It compensates the current of the auxiliary amplifier under Class C bias by injecting a compensatory current into the branch. A load-modulation-enhanced quasi-ideal high-performance DPA is developed to resolve the inherent current deficiency in the auxiliary path of traditional configurations. Results and Discussions A load-modulation-enhanced DPA was designed and fabricated using the GaN HEMT device CG2H40010F for the 1.3 to 1.8 GHz band. Measurements show that the saturated output power ranges from 43.7 to 44.5 dBm and that the Drain Efficiency (DE) exceeds 69.1%. At a 6 dB backoff, the DE remains between 62.9% and 69.4% and the gain ranges from 9.7 to 10.5 dB. At a 9 dB backoff, the DE ranges from 49.5% to 57% and the gain ranges from 10.3 to 11.5 dB. The equivalent transconductance compensation theory resolves the load modulation bottleneck of traditional DPA structures through the current-injection mechanism. It provides meaningful guidance for broadband RF power-amplifier design with high backoff efficiency. Conclusions The study proposes an equivalent transconductance compensation method by adding a third compensation branch to the traditional DPA structure. This mechanism corrects the weak auxiliary-amplifier current caused by Class C bias and its short turn-on period, thereby achieving a quasi-ideal load-modulation-enhanced DPA. A device operating from 1.3 to 1.8 GHz was designed to validate the method. The measured saturated DE exceeds 69.1%. The DE ranges from 62.9% to 69.4% at a 6 dB backoff and from 49.5% to 57% at a 9 dB backoff. The linearized Adjacent Channel Leakage Ratio (ACLR) is lower than –49 dBc. These results verify the feasibility of the method and show strong application potential.
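
To put the reported power levels into linear units, the sketch below converts dBm to watts and evaluates the output power at the two backoff levels. The helper names are hypothetical. Only the 43.7 and 44.5 dBm saturation figures and the 6 dB and 9 dB backoff values come from the abstract.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm to watts (0 dBm equals 1 mW)."""
    return 10 ** (p_dbm / 10) / 1000


def backoff_power_dbm(p_sat_dbm: float, backoff_db: float) -> float:
    """Output power at a given backoff below saturation, in dBm."""
    return p_sat_dbm - backoff_db


for p_sat in (43.7, 44.5):  # saturated output power range from the abstract
    print(p_sat, "dBm =", round(dbm_to_watts(p_sat), 2), "W")
    for backoff in (6.0, 9.0):
        p_bo = backoff_power_dbm(p_sat, backoff)
        print("  at", backoff, "dB backoff:", round(dbm_to_watts(p_bo), 2), "W")
```

The saturated output therefore corresponds to roughly 23 to 28 W, and a 6 dB backoff reduces the output power to about one quarter of that.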

Research on Snow Depth Measurement Technology Based on Dual-Band Microwave Open Resonant Cavity

LI Mengyao, ZHANG Pengfei, FENG Hao, MA Zhongfa

2026, 48(1): 436-446. doi: 10.11999/JEIT250724

Abstract:
Objective Large-scale winter snowfall poses a significant threat to the safety of outdoor infrastructure, including power transmission and communication systems. Real-time monitoring of snow depth within the range of 1 to 30 mm is required for accurate early warning and effective snow removal scheduling. Satellite- and radar-based techniques are mainly applied to snow depths exceeding 10 cm, but their large size and limited spatial resolution restrict their applicability to near-surface measurements. Although recently developed planar resonant sensors based on the resonance principle improve measurement accuracy, their effective measurement range remains limited. To resolve the trade-off between measurement range and accuracy, a rectangular microwave open resonant cavity featuring a dual-cavity, dual-feed, and dual-frequency-band configuration is proposed in combination with a data inversion algorithm. This scheme achieves a wide dynamic range of 1 to 30 mm while maintaining a measurement accuracy of 1 mm. The proposed device meets the monitoring requirements for snow depth corresponding to six snowfall intensity grades, ranging from light snow to heavy snowstorms. Methods The research methodology consists of four main stages. First, the phase-matching condition of the resonator formed by the open-ended waveguide and the snow layer is used to derive an analytical relationship between resonant frequency and snow depth, thereby verifying the feasibility of the measurement principle. Subsequently, a single-cavity model with coaxial feed is designed and simulated to evaluate its sensitivity to snow depths from 1 to 25 mm and to determine the corresponding operating frequency band. To further extend the measurement range, a dual-cavity, dual-feed model is constructed using either a metal plate or a Frequency Selective Surface (FSS) as a separator. 
A segmented measurement strategy is adopted, in which the large cavity and small cavity are responsible for different snow thickness intervals, enabling stable measurements with a precision of 1 mm over the full 1 to 30 mm range under different snow conditions. Finally, an optimal data inversion scheme is selected and implemented to further improve measurement accuracy. Results and Discussions A snow depth measurement technique based on a dual-band open-ended microwave resonant cavity is demonstrated. The dynamic measurement range is extended from 1 to 25 mm (Fig. 4) for the single-cavity configuration to 1 to 30 mm (Fig. 9) for the dual-cavity configuration. Simulation results show that the dual-cavity model maintains stable performance under variations in snow physical properties (Figs. 10 to 13). As snow depth increases, the resonant frequency exhibits a regular shift toward lower frequencies (Fig. 9(a)), whereas the attenuation remains below –10 dB (Fig. 9(b)), achieving a measurement precision of 1 mm. Experimental results show trends consistent with the simulations (Fig. 15). When combined with the data inversion scheme, the inversion error is less than 0.16 mm (Table 5), satisfying the requirements for both wide dynamic range and high measurement accuracy. Conclusions A dual-cavity, dual-feed, and dual-frequency snow depth measurement method employing either a metal plate or an FSS plate as a cavity separator is proposed. The limited dynamic range of conventional single-cavity designs is addressed through the constructed dual-cavity architecture. Measurement resolution is improved by assigning different snow thickness ranges to the two frequency bands and applying a data inversion algorithm. Experimental results demonstrate that the proposed method enables segmented measurement of snow depth from 1 to 30 mm, with an inversion accuracy of 0.16 mm and a measured precision better than 1 mm. 
The effects of variations in snow density and snow moisture content on resonant frequency and attenuation are analyzed. For future research, machine learning methods are suggested to associate measurement parameters with meteorological parameters, thereby improving measurement accuracy and extending the early-warning capability of the system.
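
The segmented measurement and data inversion described in this abstract can be pictured with a minimal Python sketch. It is not the authors' inversion scheme, which the abstract does not specify. It assumes a plain piecewise-linear interpolation over a calibration table of resonant frequency against snow depth, with one table per frequency band. All calibration values, the segment boundary, and the names `SEGMENT_SHALLOW`, `SEGMENT_DEEP`, `invert_depth`, and `estimate_depth` are hypothetical placeholders, not data from the paper.

```python
from bisect import bisect_left

# Hypothetical calibration tables: (snow depth in mm, resonant frequency in GHz).
# In each band the frequency falls as depth grows, matching the trend in the abstract.
SEGMENT_SHALLOW = [(1.0, 5.80), (5.0, 5.60), (10.0, 5.35), (15.0, 5.10)]
SEGMENT_DEEP = [(15.0, 2.90), (20.0, 2.75), (25.0, 2.62), (30.0, 2.50)]


def invert_depth(freq_ghz, calibration):
    """Map a measured resonant frequency to depth by linear interpolation."""
    points = sorted((f, d) for d, f in calibration)  # ascending frequency
    freqs = [f for f, _ in points]
    if not freqs[0] <= freq_ghz <= freqs[-1]:
        raise ValueError("frequency outside the calibrated band")
    i = bisect_left(freqs, freq_ghz)
    if freqs[i] == freq_ghz:
        return points[i][1]
    (f0, d0), (f1, d1) = points[i - 1], points[i]
    t = (freq_ghz - f0) / (f1 - f0)
    return d0 + t * (d1 - d0)


def estimate_depth(freq_ghz, segments=(SEGMENT_SHALLOW, SEGMENT_DEEP)):
    """Pick the segment whose calibrated band contains the frequency."""
    for calibration in segments:
        try:
            return invert_depth(freq_ghz, calibration)
        except ValueError:
            continue
    raise ValueError("frequency not covered by any segment")


if __name__ == "__main__":
    for f in (5.70, 5.35, 2.75, 2.56):
        print(f"{f:.2f} GHz -> {estimate_depth(f):.2f} mm")
```

Keeping one table per band mirrors the idea that each cavity handles its own depth interval. A real system would replace the placeholder tables with measured calibration data and could substitute any better-fitting inversion model.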

Research on an EEG-based Neurofeedback System for the Auxiliary Intervention of Post-Traumatic Stress Disorder

TAN Lize, DING Peng, WANG Fan, LI Na, GONG Anmin, NAN Wenya, LI Tianwen, ZHAO Lei, FU Yunfa

2026, 48(1): 447-458. doi: 10.11999/JEIT250093


Abstract:
Objective The ElectroEncephaloGram (EEG)-based Neurofeedback Regulation (ENR) system is designed for real-time modulation of dysregulated stress responses to reduce symptoms of Post-Traumatic Stress Disorder (PTSD) and anxiety. This study evaluates the system’s effectiveness and applicability using a series of neurofeedback paradigms tailored for both PTSD patients and healthy participants. Methods Using real-time EEG monitoring and feedback, the ENR system regulates alpha wave activity to alleviate mental health symptoms associated with dysregulated stress responses. The system integrates MATLAB and Unity3D to support a complete workflow for EEG data acquisition, processing, storage, and visual feedback. Experimental validation includes both PTSD patients and healthy participants to assess the system’s effects on neuroplasticity and emotional regulation. Primary assessment indices include changes in alpha wave dynamics and self-reported reductions in stress and anxiety. Results and Discussions Compared with conventional therapeutic methods, the ENR system shows significant potential in reducing symptoms of PTSD and anxiety. During functionality tests, the system effectively captures and regulates alpha wave activity, enabling real-time and efficient neurofeedback. Dynamic adjustment of feedback thresholds and task paradigms allows participants to improve stress responses and emotional states following training. Quantitative data indicate clear enhancements in EEG pattern modulation, while qualitative assessments reflect improvements in participants’ self-reported stress and anxiety levels. Conclusions This study presents an effective and practical EEG-based neurofeedback regulation system that proves applicable and beneficial for both individuals with PTSD and healthy participants. 
The successful implementation of the system provides a new technological approach for mental health interventions and supports ongoing personalized neuroregulation strategies. Future research should explore broader applications of the system across neurological conditions to fully assess its efficacy and scalability.
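
The alpha-wave feedback loop with dynamically adjusted thresholds can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ENR system's implementation, which the abstract says is built on MATLAB and Unity3D. The alpha band limits (8 to 13 Hz), the 1 to 30 Hz reference band, the hit-rate rule, and the names `band_power`, `relative_alpha_power`, and `update_threshold` are all assumptions introduced here.

```python
import cmath
import math


def band_power(samples, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins within [f_lo, f_hi] Hz (naive DFT)."""
    n = len(samples)
    total = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)
            )
            total += abs(coeff) ** 2
    return total


def relative_alpha_power(samples, fs):
    """Alpha-band power as a fraction of 1-30 Hz power (band limits are assumed)."""
    broadband = band_power(samples, fs, 1.0, 30.0)
    if broadband == 0.0:
        return 0.0
    return band_power(samples, fs, 8.0, 13.0) / broadband


def update_threshold(threshold, recent_hits, target_rate=0.6, step=0.05):
    """Hypothetical rule: raise the threshold when trials succeed too often, lower it otherwise."""
    rate = sum(recent_hits) / len(recent_hits)
    if rate > target_rate:
        return threshold * (1.0 + step)
    if rate < target_rate:
        return threshold * (1.0 - step)
    return threshold


if __name__ == "__main__":
    fs, n = 128.0, 128  # one-second window, synthetic test signal
    alpha_like = [math.sin(2 * math.pi * 10.0 * t / fs) for t in range(n)]
    print(f"relative alpha power: {relative_alpha_power(alpha_like, fs):.3f}")
    print(f"new threshold: {update_threshold(0.40, [1, 1, 1, 0, 1]):.3f}")
```

Adapting the threshold to the recent success rate keeps the task neither trivial nor discouraging, which is one common motivation for dynamic thresholds in neurofeedback training.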

Current Issue 2026 Vol. 48, No. 1
