Articles in press have been peer-reviewed and accepted. They have not yet been assigned to volumes/issues, but are citable by Digital Object Identifier (DOI).
Available online
doi: 10.11999/JEIT240087
Abstract:
To improve the accuracy of emotion recognition models and address insufficient emotional feature extraction, this paper investigates bimodal emotion recognition based on audio and facial imagery. For the audio modality, a Multi-branch Convolutional Neural Network (MCNN) feature extraction model incorporating a channel-spatial attention mechanism is proposed, which extracts emotional features from speech spectrograms across the temporal, spatial, and local feature dimensions. For the facial image modality, a Residual Hybrid Convolutional Neural Network (RHCNN) feature extraction model is introduced, which further establishes a parallel attention mechanism that concentrates on global emotional features to enhance recognition accuracy. The emotional features extracted from audio and facial imagery are then classified through separate classification layers, and a decision fusion technique is used to combine the classification results. The experimental results indicate that the proposed bimodal fusion model achieves recognition accuracies of 97.22%, 94.78%, and 96.96% on the RAVDESS, eNTERFACE’05, and RML datasets, respectively. These accuracies represent improvements over single-modality audio recognition of 11.02%, 4.24%, and 8.83%, and over single-modality facial image recognition of 4.60%, 6.74%, and 4.10%, respectively. Moreover, the proposed model outperforms related methods applied to these datasets in recent years. This demonstrates that the proposed bimodal fusion model can effectively focus on emotional information, thereby enhancing the overall accuracy of emotion recognition.
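For readers unfamiliar with decision-level fusion, the minimal sketch below illustrates the general idea of combining per-class scores from an audio classifier and a facial-image classifier; the equal weights and the class count are illustrative assumptions, not the fusion rule used in the paper.

```python
import numpy as np

def decision_fusion(p_audio, p_face, w_audio=0.5, w_face=0.5):
    """Fuse per-class probabilities from the audio and facial classifiers.

    p_audio, p_face: 1-D arrays of softmax scores over the same emotion classes.
    The weights are hypothetical; in practice they would be tuned on a
    validation set or learned.
    """
    p_audio = np.asarray(p_audio, dtype=float)
    p_face = np.asarray(p_face, dtype=float)
    fused = w_audio * p_audio + w_face * p_face
    fused /= fused.sum()              # renormalize to a probability vector
    return int(np.argmax(fused)), fused

# Example with 8 emotion classes (as in RAVDESS); the scores are made up.
audio_scores = np.array([0.05, 0.60, 0.05, 0.05, 0.10, 0.05, 0.05, 0.05])
face_scores  = np.array([0.10, 0.40, 0.05, 0.05, 0.25, 0.05, 0.05, 0.05])
label, probs = decision_fusion(audio_scores, face_scores)
print(label, probs)
```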
Available online
doi: 10.11999/JEIT240113
Abstract:
Multi-exposure image fusion is used to enhance the dynamic range of images, resulting in higher-quality outputs. However, for blurred long-exposure images captured in fast-motion scenes, such as autonomous driving, the image quality achieved by directly fusing them with short-exposure images using general-purpose fusion methods is often suboptimal, and end-to-end methods for fusing long- and short-exposure images affected by motion blur are currently lacking. To address this issue, a Deblur Fusion Network (DF-Net) is proposed to fuse long- and short-exposure images with motion blur in an end-to-end manner. A residual module combined with the wavelet transform is proposed for constructing the encoder and decoder: a single encoder is designed for feature extraction from short-exposure images, a multilevel encoder-decoder structure is built for feature extraction from blurred long-exposure images, a residual mean excitation fusion module is designed to fuse the long- and short-exposure features, and the image is finally reconstructed by the decoder. Owing to the lack of a benchmark dataset, a multi-exposure fusion dataset with motion blur is created from the SICE dataset for model training and testing. Finally, the proposed model is compared, both qualitatively and quantitatively, with state-of-the-art step-by-step pipelines that combine image deblurring and multi-exposure fusion, verifying its superiority for multi-exposure image fusion with motion blur. Validation is also conducted on a multi-exposure dataset acquired from a moving vehicle, and the results demonstrate the effectiveness of the proposed method in solving practical problems.
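The wavelet side of such an encoder/decoder can be pictured with a short PyWavelets sketch; the Haar wavelet and single decomposition level below are illustrative assumptions, and how the sub-bands feed the residual and fusion modules is specific to DF-Net.

```python
import numpy as np
import pywt

# Toy "image": a blurred long-exposure frame would go here.
img = np.random.rand(256, 256).astype(np.float32)

# One level of 2-D discrete wavelet transform: a low-frequency approximation
# (cA) plus horizontal/vertical/diagonal high-frequency detail bands.
cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
print(cA.shape, cH.shape, cV.shape, cD.shape)   # each 128 x 128

# In a wavelet-residual encoder, cA would typically be processed by the next
# (coarser) level while the detail bands carry edge and texture information.
# The inverse transform reconstructs the input, which is what a
# wavelet-based decoder stage relies on.
rec = pywt.idwt2((cA, (cH, cV, cD)), "haar")
print(np.allclose(rec, img, atol=1e-5))
```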
Available online
doi: 10.11999/JEIT240090
Abstract:
The modular high-voltage power supply, characterized by high efficiency, reliability, and reconfigurability, has found widespread application in high-power high-voltage devices. Among such designs, the input-series output-series topology based on the series-parallel resonant converter is suited to high-frequency high-voltage operating environments, offering advantages such as reduced power losses and winding dielectric losses and the ability to exploit the parasitic parameters of multi-stage transformers, and therefore has broad application prospects. Current research on this topology focuses mainly on theoretical analysis and efficiency optimization; in practical high-voltage environments, the high-voltage isolation issues between the windings of multi-stage transformers have not been effectively addressed. In this paper, a shared primary winding design for multi-stage transformers is proposed to simplify the high-voltage isolation issues inherent in traditional single-stage winding methods. However, this winding scheme can lead to non-uniform voltage distribution and voltage divergence in multi-stage transformers. Therefore, exploiting the parasitic parameters of the transformers and of the diodes in the voltage-doubling rectifier circuits, an improved topology is proposed to effectively address the uneven voltage distribution. Simulations and experiments were conducted, and the results confirm the effectiveness of the proposed high-voltage isolation structure with shared primary windings and of the improved topology.
Available online
doi: 10.11999/JEIT240253
Abstract:
Sea surface temperature is one of the key elements of the marine environment and is of great significance to marine dynamic processes and air-sea interaction. Buoys are a commonly used means of sea surface temperature observation. However, because buoys are irregularly distributed in space, the sea surface temperature data they collect are also spatially irregular. In addition, buoy failures are inevitable, so the collected data are sometimes incomplete. Reconstructing such incomplete, irregular sea surface temperature data is therefore of great significance. In this paper, the sea surface temperature data are modeled as a time-varying graph signal, and graph signal processing is used to address the reconstruction of missing sea surface temperature data. Firstly, a sea surface temperature reconstruction model is constructed by exploiting the low-rank property of the data and the joint variation characteristics of the time domain and graph domain. Secondly, a time-varying graph signal reconstruction method based on Low Rank and Joint Smoothness (LRJS) constraints is proposed, the resulting optimization problem is solved within the framework of the Alternating Direction Method of Multipliers (ADMM), and the computational complexity and the theoretical limit of the estimation error of the method are analyzed. Finally, sea surface temperature data from the South China Sea and the Pacific Ocean are used to evaluate the effectiveness of the method. The results show that the proposed LRJS method improves reconstruction accuracy compared with existing missing-data reconstruction methods.
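A representative form of a low-rank plus joint-smoothness reconstruction objective of this kind is sketched below; the exact terms, weights, and ADMM splitting used in the paper may differ.

```latex
\min_{\mathbf{X}\in\mathbb{R}^{N\times T}}\;
\frac{1}{2}\bigl\|\mathbf{M}\circ(\mathbf{X}-\mathbf{Y})\bigr\|_F^2
+\lambda_1\|\mathbf{X}\|_*
+\lambda_2\,\mathrm{tr}\!\bigl(\mathbf{X}^{\mathsf{T}}\mathbf{L}\mathbf{X}\bigr)
+\lambda_3\bigl\|\mathbf{X}\mathbf{D}\bigr\|_F^2
```

Here the rows of X index the N buoys (graph vertices) and the columns index T time samples, Y holds the observed temperatures, M is the 0/1 sampling mask and the circle denotes the Hadamard product, the nuclear norm promotes low rank, L is the graph Laplacian enforcing graph-domain smoothness, and D is a temporal difference operator enforcing time-domain smoothness; problems of this form are commonly solved with ADMM.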
Available online
doi: 10.11999/JEIT231394
Abstract:
To address the insufficient multi-scale feature expression ability and the insufficient utilization of shallow features in memory network algorithms, a Video Object Segmentation (VOS) algorithm based on multi-scale feature enhancement and global-local feature aggregation is proposed in this paper. Firstly, a multi-scale feature enhancement module fuses feature information at different scales from the reference mask branch and the reference RGB branch to enhance the expression ability of multi-scale features. At the same time, a global-local feature aggregation module is established, which extracts features using convolution operations with receptive fields of different sizes and adaptively fuses the features of global and local regions through a feature aggregation module; this fusion better captures the global features and detailed information of the target and improves segmentation accuracy (a generic sketch of such aggregation is given below). Finally, a cross-layer fusion module is designed to improve mask segmentation accuracy by exploiting the spatial details of shallow features; fusing shallow features with deep features better captures the details and edge information of the target. The experimental results show that on the public DAVIS 2016, DAVIS 2017, and YouTube-VOS 2018 datasets, the comprehensive performance of the proposed algorithm reaches 91.8%, 84.5%, and 83.0%, respectively, and that it runs in real time on both single- and multi-object segmentation tasks.
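As a rough illustration of aggregating a large-receptive-field branch with a local branch, a generic gated fusion block is sketched below; the dilated convolution and sigmoid gate are assumptions and do not reproduce the authors' aggregation module.

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    """Adaptively fuse features from branches with different receptive fields."""
    def __init__(self, channels):
        super().__init__()
        # Local branch: small receptive field captures fine details.
        self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Global branch: dilated convolution enlarges the receptive field.
        self.global_ = nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=3, dilation=3)
        # Gate predicts a per-pixel mixing weight from both branches.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        f_local = self.local(x)
        f_global = self.global_(x)
        g = self.gate(torch.cat([f_global, f_local], dim=1))
        return g * f_global + (1.0 - g) * f_local   # adaptive aggregation

feat = torch.randn(1, 64, 32, 32)
print(GlobalLocalFusion(64)(feat).shape)   # torch.Size([1, 64, 32, 32])
```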
Available online
doi: 10.11999/JEIT240188
Abstract:
Central symmetry of the virtual array is a necessary fundamental assumption for the structure transformation of Uniform Circular Arrays (UCAs). In this paper, the virtual signal model of circular arrays is used for an eigen-analysis, and an efficient two-dimensional direction-finding algorithm is proposed for arbitrary UCAs and Non-Uniform Circular Arrays (NUCAs) that avoids the structure transformation to linear arrays. The Forward/Backward averaging of the Array Covariance Matrix (FBACM) and a sum-difference transformation that separates the real and imaginary parts are both used to obtain a manifold and a real-valued subspace with matching dimensions. Moreover, the linear relationship between the obtained real-valued subspace and the original complex-valued subspace is revealed, so that the spatial spectrum is reconstructed without spurious targets. The proposed method generalizes to NUCAs, enhancing the adaptability of real-valued algorithms to circular array structures. Numerical simulations demonstrate that, with significantly reduced complexity, the proposed method provides similar performance and better angular resolution compared with traditional UCA methods based on the mode-step. Meanwhile, the proposed method demonstrates high robustness to amplitude and phase errors in practical scenarios.
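Forward/backward averaging of a sample covariance matrix is a standard real-valued preprocessing step and, in conventional notation, reads as follows; how it is combined with the virtual-array model and the sum-difference transformation is specific to this paper.

```latex
\hat{\mathbf{R}}_{\mathrm{FB}}=\frac{1}{2}\left(\hat{\mathbf{R}}+\mathbf{J}\,\hat{\mathbf{R}}^{*}\,\mathbf{J}\right),
\qquad
\hat{\mathbf{R}}=\frac{1}{K}\sum_{k=1}^{K}\mathbf{x}(k)\,\mathbf{x}^{\mathsf{H}}(k)
```

Here J is the exchange (anti-identity) matrix, the asterisk denotes complex conjugation, x(k) is the k-th array snapshot, and K is the number of snapshots.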
Available online
doi: 10.11999/JEIT240049
Abstract:
As a new generation of flow-based microfluidics, Fully Programmable Valve Array (FPVA) biochips have become a popular biochemical experimental platform offering high flexibility and programmability. Due to environmental and human factors, however, physical faults such as channel blockage and leakage often arise in the manufacturing process and can affect the results of bioassays. In addition, as the primary stage of architecture synthesis, high-level synthesis directly affects the quality of the subsequent design. The fault-tolerance problem in the high-level synthesis stage of FPVA biochips is addressed for the first time in this paper, and dynamic fault-tolerant techniques, including a cell function conversion method, a bidirectional redundancy scheme, and a fault mapping method, are presented, providing a technical guarantee for efficient fault-tolerant design. By integrating these techniques into the high-level synthesis stage, a high-quality fault-tolerance-oriented high-level synthesis algorithm for FPVA biochips is further realized, including a fault-aware real-time binding strategy and a fault-aware priority scheduling strategy, which lays a solid foundation for the robustness of the chip architecture and the correctness of assay outcomes. Experimental results confirm that the proposed algorithm obtains a high-quality, fault-tolerant high-level synthesis scheme for FPVA biochips, providing a strong guarantee for the subsequent realization of a fault-tolerant physical design.
Available online
doi: 10.11999/JEIT240257
Abstract:
To address the limited receptive field and insufficient feature interaction in vision-language tracking, a framework combining Bi-level routing Perception and Scattering Visual Transformation (BPSVTrack) is proposed in this paper. First, a Bi-level Routing Perception Module (BRPM) is designed, which combines Efficient Additive Attention (EAA) and a Dual Dynamic Adaptive Module (DDAM) in parallel to enable bidirectional interaction and expand the receptive field, enhancing the model’s ability to efficiently integrate features across windows of different sizes and thus to perceive objects in complex scenes. Second, a Scattering Vision Transform Module (SVTM) based on the Dual-Tree Complex Wavelet Transform (DTCWT) is introduced to decompose the image into low-frequency and high-frequency components, capturing the target structure and fine-grained details and thereby improving the robustness and accuracy of the model in complex environments. The proposed framework achieves accuracies of 86.1%, 64.4%, and 63.2% on the TNL2K, LaSOT, and OTB99 tracking datasets, respectively. Moreover, it attains an accuracy of 70.21% on the RefCOCOg dataset, and its tracking and localization performance surpasses that of the baseline model.
Available online
doi: 10.11999/JEIT240300
Abstract:
Physical Unclonable Functions (PUFs) and Exclusive OR (XOR) operations both play an important role in the field of information security. To break through the functional barrier between PUFs and logic operations, an integrated design of a PUF and a multi-bit parallel XOR operation circuit, based on the random process variation of cascaded Differential Cascode Voltage Switch Logic (DCVSL) XOR gate units, is proposed by studying the working mechanisms of PUFs and DCVSL. By adding a pre-charge transistor at the differential output of the DCVSL XOR gate and a control gate at the ground end, the circuit can switch freely among three operating modes: PUF feature extraction, XOR/Negated Exclusive OR (XNOR) operation, and power control. Meanwhile, to address the PUF response stability problem, an unstable-bit hybrid screening technique based on labeling at extreme and golden operating points is proposed. Based on a 65 nm TSMC process, a fully customized layout with an area of 38.76 μm² is designed for a circuit with a 10-bit input width. The experimental results show that a 1024-bit output response can be generated in PUF mode, and a stable key of more than 512 bits with good randomness and uniqueness can be obtained after hybrid screening. In the operation mode, 10-bit parallel XOR and XNOR operations can be performed simultaneously, with power consumption and delay of 2.67 μW and 593.52 ps, respectively. In power control mode, the standby power consumption is only 70.5 nW. The proposed method provides a novel way to break the functional barrier of the PUF.
Available online
doi: 10.11999/JEIT240316
Abstract:
Currently, traditional Simultaneous Localization And Mapping (SLAM) systems based on explicit scene representations discretize the scene and are not suitable for continuous scene reconstruction. An RGB-D SLAM system based on a hybrid scene representation with Neural Radiance Fields (NeRF) is proposed in this paper. An extended explicit octree Signed Distance Function (SDF) prior is used to coarsely represent the scene, and multi-resolution hash encoding is used to represent the scene at different levels of detail, enabling fast initialization of the scene geometry and making it easier to learn. In addition, an appearance color decomposition method separates color into a diffuse component and a view-dependent specular component to achieve lighting-consistent reconstruction, making the results more realistic. Experiments on the Replica and TUM RGB-D datasets show that the scene reconstruction completion rate on the Replica dataset reaches 93.65%, and that the positioning accuracy surpasses that of Vox-Fusion by an average of 87.50% on the Replica dataset and 81.99% on the TUM RGB-D dataset.
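In standard NeRF-style volume rendering notation, the view-dependent appearance decomposition described above can be written roughly as below; the exact parameterization of the specular term in the paper may differ.

```latex
\hat{C}(\mathbf{r})=\sum_{i=1}^{N} T_i\bigl(1-e^{-\sigma_i\delta_i}\bigr)
\bigl[\mathbf{c}^{\mathrm{diff}}_i+\mathbf{c}^{\mathrm{spec}}_i(\mathbf{d})\bigr],
\qquad
T_i=\exp\Bigl(-\sum_{j<i}\sigma_j\delta_j\Bigr)
```

Here sigma_i and delta_i are the density and sample spacing along camera ray r, the diffuse color depends only on position, and the specular color additionally depends on the viewing direction d.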
Available online
doi: 10.11999/JEIT240342
Abstract:
In modern electronic countermeasures, networking multiple joint radar and communication systems can improve detection efficiency and collaborative detection capability compared with a single joint radar and communication system. However, owing to the high peak-to-average power ratio of the joint radar and communication signal, the signal is easily intercepted and the survivability of the system is seriously threatened. To improve the Low Probability of Intercept (LPI) performance of the joint radar and communication signal, a time-frequency structure for a networked LPI joint radar and communication signal is proposed within the filter bank multicarrier framework, with grouped power optimization of the communication subcarriers and interleaved equal-power optimization of the radar subcarriers. The performance assessment metrics of the system are then unified from an information-theoretic perspective. On this basis, minimizing the information divergence intercepted by the interceptor is taken as the optimization objective, and an LPI optimization model for the networked joint radar and communication signal is established; this model is converted into a convex optimization problem and solved using the Karush-Kuhn-Tucker conditions. The simulation results show that the networked LPI joint radar and communication signal designed in this paper achieves inter-node radar interference as low as nearly –60 dB when detecting moving targets, that the communication bit error rate remains on the order of 10^-6, and that the signal-to-noise ratio of the intercepted signal is effectively reduced.
Available online
doi: 10.11999/JEIT240161
Abstract:
Most existing lipreading models use a combination of a single-layer 3D convolution and 2D convolutional neural networks to extract spatio-temporal joint features from lip video sequences. However, given the limitations of single-layer 3D convolutions in capturing temporal information and the restricted capability of 2D convolutional neural networks in exploring fine-grained lipreading features, a Multi-Scale Lipreading Network (MS-LipNet) is proposed to improve lipreading performance. In this paper, 3D spatio-temporal convolutions replace the traditional two-dimensional convolutions in the Res2Net network to better extract spatio-temporal joint features, and a spatio-temporal coordinate attention module is proposed to make the network focus on task-relevant important regional features. The effectiveness of the proposed method is verified through experiments on the LRW and LRW-1000 datasets.
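A spatio-temporal residual unit of the kind described above can be sketched in PyTorch as follows; the kernel sizes, channel count, and input resolution are illustrative and are not taken from MS-LipNet.

```python
import torch
import torch.nn as nn

class SpatioTemporalBlock(nn.Module):
    """Residual unit with 3-D convolutions over (time, height, width)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=(3, 3, 3),
                               padding=(1, 1, 1), bias=False)
        self.bn1 = nn.BatchNorm3d(channels)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=(3, 3, 3),
                               padding=(1, 1, 1), bias=False)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                    # x: (batch, C, T, H, W)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)            # residual connection

clip = torch.randn(2, 64, 29, 22, 22)        # e.g. 29-frame lip crops
print(SpatioTemporalBlock(64)(clip).shape)   # torch.Size([2, 64, 29, 22, 22])
```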
Available online
doi: 10.11999/JEIT240242
Abstract:
Ground Penetrating Radar (GPR) is a non-destructive method usable for the identification of underground targets. Existing methods often struggle with variable target sizes, complex image recognition, and precise target localization. To address these challenges, an innovative method is introduced that leverages a dual YOLOv8-pose model for the detection and precise localization of hyperbolic keypoints. This method, termed Dual YOLOv8-pose Keypoint Localization (DYKL), offers a sophisticated solution to the challenges inherent in GPR-based target identification and positioning. The proposed model comprises two stages: first, a YOLOv8-pose model is employed for the preliminary detection of GPR targets, identifying regions that are likely to contain them; second, building upon the training weights established in the first stage, the YOLOv8-pose network is further refined toward the precise detection of keypoints within the candidate target features, thereby enabling the automated identification and exact localization of underground targets with enhanced accuracy. Compared with four advanced deep-learning models, namely Cascade Region-based Convolutional Neural Networks (Cascade R-CNN), Faster Region-based Convolutional Neural Networks (Faster R-CNN), Real-Time Models for object Detection (RTMDet), and You Only Look Once v7 (YOLOv7-face), the proposed DYKL model achieves an average recognition accuracy of 98.8%, surpassing these models. The results demonstrate the DYKL model’s high recognition accuracy and robustness, serving as a benchmark for the precise localization of subterranean targets.
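A two-stage cascade of pose models can be sketched with the Ultralytics API as below; the weight files, the input image name, and the way stage-one outputs are reused are placeholders and do not reproduce the authors' DYKL pipeline.

```python
from ultralytics import YOLO

# Stage 1: coarse detection of candidate hyperbola regions in a B-scan image.
stage1 = YOLO("yolov8n-pose.pt")        # pretrained pose weights (placeholder)
results = stage1("gpr_bscan.png")       # hypothetical GPR B-scan image
boxes = results[0].boxes.xyxy           # candidate target regions

# Stage 2: a second pose model, initialized from the stage-1 weights, refines
# the keypoints (e.g. the hyperbola apex) inside the candidate regions.
stage2 = YOLO("stage1_best.pt")         # weights carried over from stage 1 (placeholder)
refined = stage2("gpr_bscan.png")
keypoints = refined[0].keypoints.xy     # (num_targets, num_keypoints, 2)
print(boxes.shape, keypoints.shape)
```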
Available online
doi: 10.11999/JEIT240210
Abstract:
With the advancement of robot autonomous navigation technology, software-based path planning algorithms can no longer satisfy the needs of many real-time applications; fast and efficient hardware customization of the algorithm is required to achieve low-latency acceleration. In this work, High-Level Synthesis (HLS) of the classic A* algorithm is studied. Hardware-oriented data structures, function-level optimizations, and varying design constraints are explored to select a suitable architecture, which is then synthesized for an FPGA. Experimental results show that, compared with the conventional Register Transfer Level (RTL) method, the HLS-based FPGA implementation of the A* algorithm achieves better productivity, improved hardware performance, and higher resource utilization efficiency, demonstrating the advantages of high-level synthesis for hardware customization in algorithm-centric applications.
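The A* core that such an HLS flow would synthesize (typically rewritten in C/C++ with fixed-size arrays and explicit memory interfaces) is summarized by the plain software sketch below; the grid, heuristic, and data layout are generic choices rather than the paper's hardware-oriented ones.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid; grid[r][c] == 1 marks an obstacle."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    open_set = [(h(start), 0, start, None)]                   # (f, g, node, parent)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:             # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:                  # reconstruct the path back to start
            path = [node]
            while came_from[node] is not None:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), node))
    return None                           # no path found

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
```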
Available online
doi: 10.11999/JEIT240201
Abstract:
The Multi-Model Gaussian Mixture Probability Hypothesis Density (MM-GM-PHD) filter is widely used in uncertain maneuvering target tracking, but it does not maintain parallel estimates under different models, so the model-related likelihood lags behind unknown target maneuvers. To solve this issue, a Joint Multi-Gaussian Mixture PHD (JMGM-PHD) filter is proposed and applied to bearings-only multi-target tracking in this paper. Firstly, a JMGM model is derived, in which each single-target state estimate is described by a set of parallel Gaussian functions with model probabilities, and the probability of this state estimate is characterized by a nonnegative weight. The weights, model-related probabilities, means, and covariances are collectively called JMGM components, and their updating method is derived according to the Bayesian rule. Then, the multi-target PHD is approximated using the JMGM model, and the interacting, prediction, and estimation methods of the JMGM components are derived according to the Interacting Multiple Model (IMM) rule. When addressing Bearings-Only Tracking (BOT), a method based on the chain rule for composite functions is derived to compute the linearized observation matrix for observers that simultaneously translate and rotate. The proposed JMGM-PHD filter preserves the form of the regular single-model PHD filter but can adaptively track uncertain maneuvering targets. Simulations show that the proposed algorithm overcomes the likelihood-lag issue and outperforms the MM-GM-PHD filter in terms of tracking accuracy and computational cost.
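The interacting (mixing) step referred to above follows the standard IMM recursion; with model transition probabilities p_ij and model probabilities mu_i it reads as follows, while the way these quantities attach to the weights of the JMGM components is specific to the paper.

```latex
\bar{c}_j=\sum_i p_{ij}\,\mu_i(k-1),\qquad
\mu_{i|j}(k-1)=\frac{p_{ij}\,\mu_i(k-1)}{\bar{c}_j},\qquad
\hat{\mathbf{x}}_{0j}=\sum_i \mu_{i|j}(k-1)\,\hat{\mathbf{x}}_i,
```
```latex
\mathbf{P}_{0j}=\sum_i \mu_{i|j}(k-1)\Bigl[\mathbf{P}_i+\bigl(\hat{\mathbf{x}}_i-\hat{\mathbf{x}}_{0j}\bigr)\bigl(\hat{\mathbf{x}}_i-\hat{\mathbf{x}}_{0j}\bigr)^{\mathsf{T}}\Bigr]
```

Here the mixed mean and covariance for model j then feed the model-matched prediction step.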
2024, 46(10): 3827-3848.
doi: 10.11999/JEIT240155
Abstract:
With the development of artificial intelligence technology, Synthetic Aperture Radar (SAR) target recognition based on deep neural networks has received widespread attention. However, the SAR imaging mechanism leads to a strong correlation between image characteristics and imaging parameters, so the accuracy of deep learning algorithms is easily disturbed by this imaging-parameter sensitivity, which has become a major obstacle to deploying advanced intelligent algorithms in practical engineering applications. Firstly, this paper reviews the development of SAR image target recognition technology and related datasets, and analyzes in depth the influence of imaging parameters on image characteristics from three aspects: imaging geometry, radar parameters, and noise interference. Then, the existing literature on the robustness and generalization of deep learning techniques with respect to imaging-parameter sensitivity is summarized along the three dimensions of model, data, and features, and the experimental results of typical methods are summarized and analyzed. Finally, research directions through which deep learning technology may overcome imaging-parameter sensitivity in the future are discussed.
2024, 46(10): 3849-3878.
doi: 10.11999/JEIT240095
Abstract:
Biological organisms in nature are required to continuously learn from and adapt to the environment throughout their lifetime. This ongoing learning capacity serves as the fundamental basis for the biological learning systems. Despite the significant advancements in deep learning methods for computer vision and natural language processing, these models often encounter a serious issue, known as catastrophic forgetting, when learning tasks sequentially. This refers to the model’s tendency to discard previously acquired knowledge when acquiring new information, which greatly hampers the practical application of deep learning models. Thus, the exploration of continual learning is paramount for enhancing and implementing artificial intelligence systems. This paper provides a comprehensive survey of continual learning with deep models. Firstly, the definition and typical settings of continual learning are introduced, followed by the key aspects of the problem. Secondly, existing methods are categorized into four main groups: regularization-based, replay-based, gradient-based and structure-based approaches, with an outline of the strengths and weaknesses of each group. Meanwhile, the paper highlights and summarizes the theoretical progress in continual learning, establishing a crucial nexus between theory and methodology. Additionally, commonly used datasets and evaluation metrics are provided to facilitate fair comparisons among these methods. Finally, the paper addresses current issues, challenges and outlines future research directions in deep continual learning, taking into account its potential applications across diverse fields.
2024, 46(10): 3879-3889.
doi: 10.11999/JEIT231064
Abstract:
To address the catastrophic forgetting problem in Class Incremental Learning (CIL), a class incremental learning algorithm with dual separation of the data flows and feature spaces of different classes is proposed in this paper. The Dual Separation (S2) algorithm comprises two stages within an incremental task. In the first stage, the network is trained under the joint constraint of a classification loss, a distillation loss, and a contrastive loss. The data flows from different classes are separated according to module functions in order to enhance the network’s ability to recognize new classes, and the contrastive loss increases the distance between different classes in the feature space to prevent the feature space of the old classes, whose samples are incomplete, from being eroded by the new classes. In the second stage, the imbalanced dataset is subjected to dynamic balanced sampling to provide a balanced dataset for dynamic fine-tuning of the new network. A high-resolution range profile incremental learning dataset of aircraft targets was created using observed and simulated data. The experimental results demonstrate that the proposed algorithm outperforms other algorithms in overall performance and stability while maintaining high plasticity.
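A generic form of the first-stage objective combining the three losses named above is shown below; the concrete loss definitions and trade-off weights used by the S2 algorithm are not reproduced here.

```latex
\mathcal{L}=\mathcal{L}_{\mathrm{cls}}+\lambda_1\,\mathcal{L}_{\mathrm{distill}}+\lambda_2\,\mathcal{L}_{\mathrm{con}}
```

Here the classification loss supervises the new-class data, the distillation term keeps the new network's outputs on old classes close to those of the previous model, the contrastive term pushes apart features of different classes, and lambda_1, lambda_2 are trade-off weights.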
2024, 46(10): 3890-3907.
doi: 10.11999/JEIT240138
Abstract:
Existing Synthetic Aperture Radar (SAR) target recognition methods are mostly limited to the closed-set assumption, which considers that the target categories in the training template library cover all the categories to be tested; this assumption is not suitable for open environments in which both known and unknown classes are present. To solve the problem of SAR target recognition when the target categories in the training template library are incomplete, an open-set SAR target recognition method that combines unknown feature generation with classification score modification is proposed in this paper. Firstly, a prototype network is exploited to achieve high recognition accuracy on known classes, and potential unknown features are then generated based on prior knowledge to enhance the discrimination between known and unknown classes. After the prototype network is updated, the boundary features of each known class are selected and the distance of each boundary feature to the corresponding class prototype, i.e., the maximum distance, is calculated. The distribution of the maximum distances of each known class is then fitted probabilistically using extreme value theory. In the testing phase, closed-set classification scores are first predicted by measuring the distance between the test sample features and each known class prototype; the probability of each distance under the fitted maximum-distance distribution of the corresponding known class is then calculated, and the closed-set classification scores are corrected to automatically determine the rejection probability. Experiments on the measured MSTAR dataset show that the proposed method can effectively represent the distribution of unknown-class features and enhance the discriminability of known- and unknown-class features in the feature space, thus achieving accurate recognition of both known-class and unknown-class targets.
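Extreme-value-based score correction of this kind is often implemented by fitting a Weibull distribution to the per-class maximum distances; an illustrative form of the resulting rejection probability for a test distance d to the prototype of known class c is

```latex
P_{\mathrm{reject}}(d\mid c)=1-\exp\!\left[-\left(\frac{d}{\lambda_c}\right)^{k_c}\right]
```

where k_c and lambda_c are the shape and scale parameters fitted on the maximum distances of class c, and the closed-set score for class c is down-weighted accordingly; the paper's exact fitting and correction procedure may differ from this common choice.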
2024, 46(10): 3908-3917.
doi: 10.11999/JEIT231426
Abstract:
In open, dynamic environments where the range of object categories continually expands, the challenge of remote sensing object detection is to detect a known set of object categories while simultaneously identifying unknown objects. To this end, a remote sensing open-set object detection network based on adaptive pre-screening is proposed. Firstly, an adaptive pre-screening module is proposed for object region proposals. Based on the coordinates of the selected region proposals, queries with rich semantic information and spatial features are generated and passed to the decoder. Subsequently, a pseudo-label selection method is devised based on object edge information, and loss functions are constructed for open-set classification to enhance the network's ability to learn knowledge of unknown classes. Finally, the Military Aircraft Recognition (MAR20) dataset is used to simulate various dynamic environments. Extensive comparative and ablation experiments show that the proposed method achieves reliable detection of both known and unknown objects.
2024, 46(10): 3918-3927.
doi: 10.11999/JEIT240217
Abstract:
To ensure that a Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) system can quickly adapt to new application environments, it must possess the ability to rapidly learn new classes. Currently, SAR ATR systems require retraining on all old-class samples when learning new classes, which wastes storage resources and prevents the recognition model from being updated quickly. Preserving a small number of old-class exemplars for subsequent incremental training is therefore crucial for incremental recognition. To address this issue, Exemplar Selection based on Maximizing Non-overlapping Volume (ESMNV), an exemplar selection algorithm that emphasizes the non-overlapping volume of the distribution, is proposed in this paper. ESMNV transforms the exemplar selection problem for each known class into an asymptotic growth problem of the non-overlapping volume of the distribution, aiming to maximize the non-overlapping volume of the distribution of the selected exemplars. ESMNV uses the similarity between distributions to represent differences in volume. Firstly, ESMNV uses a kernel function to map the distribution of the target class into a Reproducing Kernel Hilbert Space (RKHS) and employs higher-order moments to represent the distribution. It then uses the Maximum Mean Discrepancy (MMD) to compute the difference between the distribution of the target class and that of the selected exemplars. Combined with a greedy algorithm, ESMNV progressively selects exemplars that minimize this distributional difference, ensuring the maximum non-overlapping volume with a limited number of exemplars.
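The greedy MMD-driven selection can be sketched as follows; this is a simplified illustration with an RBF kernel and hypothetical inputs, not the exact ESMNV procedure.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    """Squared Maximum Mean Discrepancy between two samples."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())

def greedy_exemplar_selection(features, budget, gamma=1.0):
    """Greedily pick exemplars whose empirical distribution stays closest
    (in MMD) to the full class distribution, one exemplar at a time."""
    selected, remaining = [], list(range(len(features)))
    for _ in range(budget):
        best_i = min(remaining,
                     key=lambda i: mmd2(features,
                                        features[selected + [i]], gamma))
        selected.append(best_i)
        remaining.remove(best_i)
    return selected
```

In practice the kernel evaluations would be cached rather than recomputed at every greedy step.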
2024, 46(10): 3928-3935.
doi: 10.11999/JEIT240140
Abstract:
Improving generalization under limited sample conditions is an important research direction in Synthetic Aperture Radar Automatic Target Recognition (SAR ATR). Addressing this fundamental problem, a causal model for SAR ATR is established in this paper, which shows that interferences in SAR images, such as background and speckle, can be neglected when samples are sufficient. Under limited sample conditions, however, these factors become confounding variables that introduce spurious correlations into the extracted SAR image features and harm the generalization of SAR ATR. To accurately identify and eliminate these spurious effects in the features, a limited-sample SAR ATR method via dual consistency is proposed, comprising an intra-class feature consistency mask and an effect-consistency loss. Firstly, based on the principle that discriminative features should exhibit intra-class consistency and inter-class differences, the intra-class feature consistency mask captures the consistent discriminative features of the target, subtracts the confounded part of the target features, and identifies the spurious effects introduced by interferences. Secondly, based on the idea of invariant risk minimization, the effect-consistency loss transforms the data requirement of empirical risk minimization into a need for labeling the similarity among the effects of different samples, reducing the data demand for eliminating spurious effects and removing them from the features. The proposed method thus achieves causal feature extraction and accurate recognition. Experiments on two benchmark datasets validate its effectiveness, showing superior SAR target recognition performance with limited samples.
2024, 46(10): 3936-3948.
doi: 10.11999/JEIT231470
Abstract:
To equip Deep-Learning (DL) based Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) systems with the capability of learning new-class targets incrementally and rapidly in open, dynamic, non-cooperative situations, the problem of Few-Shot Class-Incremental Learning (FSCIL) for SAR ATR is studied and a Self-supervised Decoupled Dynamic Classifier (SDDC) is proposed. To address both the intrinsic catastrophic-forgetting and overfitting dilemma of FSCIL and the domain challenges of SAR ATR, a self-supervised learning task based on Scattering Component Mixup and Rotation (SCMR) is designed to improve the generalizability and stability of target representations, leveraging the partiality and azimuth dependence of target information in SAR imagery. Meanwhile, a Class-Imprinting Cross-Entropy (CI-CE) loss and a Parameter Decoupled Learning (PDL) strategy are designed to fine-tune the network dynamically so that old and new targets are identified evenly. Experiments on various FSCIL scenarios constructed from the MSTAR and SAR-AIRcraft-1.0 datasets, covering diverse target categories, observation environments, and imaging payloads, verify the method's adaptability to an open, dynamic world.
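The class-imprinting idea underlying the CI-CE loss can be illustrated generically: new-class classifier weights are initialized from L2-normalized class-mean embeddings of the few-shot samples. The sketch below is a standard weight-imprinting recipe with hypothetical names, not the exact SDDC procedure.

```python
import numpy as np

def imprint_new_classes(weight_matrix, embeddings, labels):
    """Append one imprinted weight row per new class: the L2-normalized
    mean of that class's (normalized) few-shot embeddings."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    new_rows = []
    for c in np.unique(labels):
        proto = emb[labels == c].mean(axis=0)
        new_rows.append(proto / np.linalg.norm(proto))
    return np.vstack([weight_matrix, np.stack(new_rows)])
```

A cosine classifier then scores normalized features against both the old rows and the newly imprinted rows.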
2024, 46(10): 3949-3956.
doi: 10.11999/JEIT240050
Abstract:
The Radio Environment Map (REM) is one of the effective ways to represent the electromagnetic situation. Considering that the actually observed, incomplete spectrum map is corrupted by impulses and noise, the incomplete radio environment map is reconstructed and specific emitter identification is performed on the reconstructed maps. First, the spectrum map of the complex electromagnetic environment is modeled as a high-dimensional spectrum tensor, and the incomplete spectrum tensor is initially completed by linear interpolation in preprocessing. Then, a vision transformer model is employed to solve a semantic segmentation problem that identifies spectrum semantic regions, in each of which the power of only one emitter dominates, so that the low-rank property of each semantic tensor is preserved. To reconstruct the REM, a compressed tensor decomposition algorithm is proposed, and the expected signal spectrum and the impulses are recovered within the semantic regions using the Alternating Direction Method of Multipliers (ADMM). Finally, the locations of unknown emitters are detected on the reconstructed spectrum map. The proposed approach leverages the low-rank property of spectrum data and works well in wide-area electromagnetic scenarios involving multiple emitters. Simulation results demonstrate that it outperforms the comparison approach in reconstruction performance, requires fewer observation samples to achieve the same spectrum-map recovery accuracy, and can accurately detect emitters.
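The linear-interpolation preprocessing of an incomplete spectrum-map slice might look like the sketch below. It operates on a hypothetical 2-D slice with a boolean observation mask; the paper itself works on higher-order spectrum tensors.

```python
import numpy as np
from scipy.interpolate import griddata

def complete_spectrum_slice(power_map, observed_mask):
    """Fill unobserved grid points of one spectrum-map slice by linear
    interpolation over the observed samples, with a nearest-neighbour
    fallback where linear interpolation is undefined (outside the hull)."""
    rows, cols = np.indices(power_map.shape)
    pts = np.column_stack([rows[observed_mask], cols[observed_mask]])
    vals = power_map[observed_mask]
    grid = np.column_stack([rows.ravel(), cols.ravel()])
    filled = griddata(pts, vals, grid, method="linear")
    nn = griddata(pts, vals, grid, method="nearest")
    filled = np.where(np.isnan(filled), nn, filled)
    return filled.reshape(power_map.shape)
```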
2024, 46(10): 3957-3965.
doi: 10.11999/JEIT240100
Abstract:
Intelligent jamming is a technique that utilizes environmental feedback and autonomous learning of jamming strategies to effectively disrupt an adversary's communication links. However, most existing research on intelligent jamming assumes that the jammer can directly access feedback on communication quality indicators, such as bit error rate or packet loss rate. This assumption is difficult to satisfy in practical adversarial environments, which limits the applicability of intelligent jamming. To address this issue, the communication jamming problem is modeled as a Markov Decision Process (MDP), and by considering both the fundamental principles of jamming and the dynamic behavior of the communication targets, an Improved Policy Hill-Climbing (IPHC) algorithm is proposed. The algorithm follows an “Observe-Orient-Decide-Act” (OODA) loop: it continuously observes changes in the communication targets in real time, flexibly adjusts its jamming strategy, and executes jamming using mixed-strategy decision-making. Simulation results demonstrate that when the communication targets adopt a deterministic evasion strategy, the proposed algorithm quickly converges to the optimal jamming strategy, with a convergence time at least two-thirds shorter than that of Q-learning. When the communication targets switch evasion strategies, the algorithm adaptively relearns the optimal jamming strategy. When the communication targets use mixed evasion strategies, the proposed algorithm also converges quickly and obtains superior jamming effects.
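For context, the tabular Policy Hill-Climbing core that IPHC builds on couples a Q-learning update with a small step of probability mass toward the greedy action. The sketch below shows that textbook baseline under hypothetical state and action spaces; it does not include the paper's OODA-loop improvements.

```python
import numpy as np

class PolicyHillClimbing:
    """Tabular PHC: Q-learning plus a stochastic policy nudged toward the
    greedy action by step delta (mixed-strategy jamming decisions)."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, delta=0.05):
        self.Q = np.zeros((n_states, n_actions))
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)
        self.alpha, self.gamma, self.delta = alpha, gamma, delta

    def act(self, s, rng=np.random):
        return rng.choice(self.pi.shape[1], p=self.pi[s])

    def update(self, s, a, r, s_next):
        # Standard Q-learning target.
        td_target = r + self.gamma * self.Q[s_next].max()
        self.Q[s, a] += self.alpha * (td_target - self.Q[s, a])
        # Move probability mass toward the current greedy action.
        greedy = self.Q[s].argmax()
        n = self.pi.shape[1]
        step = min(self.delta / (n - 1), self.pi[s].min())
        self.pi[s] -= step
        self.pi[s, greedy] += step * n
        self.pi[s] = np.clip(self.pi[s], 0.0, 1.0)
        self.pi[s] /= self.pi[s].sum()
```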
2024, 46(10): 3966-3978.
doi: 10.11999/JEIT240171
Abstract:
Due to the coupling of emitter distortion and receiver distortion, the actually received signal carries information about both the current emitter and the receiving system, which prevents Radio Frequency Fingerprinting (RFF) from generalizing across receiving systems. To eliminate the effect of the receiver, a universal RFF method across receiving systems based on receiving-domain separation is proposed in this paper, which treats the influence of the receiver as a separate domain. Through a dual-label multi-channel fusion feature and a domain-separation adversarial reconstruction method, the proposed method, after being trained with multi-receiver data in the source domain, can separate the transmitting and receiving domains and extract emitter fingerprint information, which improves the generalization of RFF in cross-receiving-system and cross-platform scenarios. Compared with existing cross-receiver RFF methods, the proposed method adapts to the practical unsupervised scenario, and the domain adaptation effect improves as more source-domain receivers participate in training. It can be applied directly to a new receiving system without retraining, which gives it high practical value.
2024, 46(10): 3979-4001.
doi: 10.11999/JEIT240172
Abstract:
The significant advancement of deep learning has facilitated the emergence of high-precision interpretation models for remote sensing images. However, a notable drawback is that the majority of interpretation models are trained independently on static datasets, rendering them incapable of adapting to open environments and dynamic demands. This limitation poses a substantial obstacle to the widespread and long-term application of remote sensing interpretation models. Incremental learning, which empowers models to continuously learn new knowledge while retaining previous knowledge, has recently been utilized to drive the evolution of interpretation models and improve their performance. This paper provides a comprehensive investigation of incremental learning methods for multi-modal remote sensing data and diverse interpretation tasks. Existing research efforts are organized and reviewed in terms of mitigating catastrophic forgetting and facilitating interpretation model evolution. Drawing on this progress, future research directions for incremental learning in remote sensing are discussed, with the aim of advancing research on model evolution for remote sensing image interpretation.
2024, 46(10): 4002-4008.
doi: 10.11999/JEIT240584
Abstract:
Massive Machine-Type Communication (mMTC) is one of the typical scenarios of fifth-generation mobile communication systems, in which nearly one million devices per square kilometer can be connected. A Reconfigurable Intelligent Surface (RIS) is applied to grant-free uplink transmission owing to the complexity of the propagation environment under massive connectivity. A cascaded channel is thereby formed, i.e., the link between the devices and the RIS concatenated with the link between the RIS and the Base Station (BS), so that the quality of wireless signal transmission can be controlled effectively. On this basis, a denoising learning system is designed using the principle of turbo-decoding message passing, and the RIS-aided cascaded Channel State Information (CSI) is learned and estimated from a large amount of training data. In addition, a statistical analysis of RIS-assisted mMTC channel estimation is performed to verify the accuracy of the proposed scheme. Numerical simulation results and theoretical analyses show that the proposed technique is superior to other compressed-sensing-type methods.
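The cascaded channel described above can be written compactly: with RIS-to-BS matrix G and device-to-RIS vector h_r, the received pilot direction is G diag(theta) h_r = (G diag(h_r)) theta, so the cascaded matrix G diag(h_r) is the quantity to be estimated. A minimal numerical sketch with hypothetical dimensions and a narrowband Rayleigh model follows.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 16, 64          # BS antennas, RIS elements (assumed sizes)
G = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
h_r = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# Cascaded device-RIS-BS channel: y = G @ diag(theta) @ h_r = (G @ diag(h_r)) @ theta,
# so the columns of H_cascaded can be estimated jointly from pilots.
H_cascaded = G @ np.diag(h_r)                        # M x N cascaded channel
theta = np.exp(1j * rng.uniform(0, 2 * np.pi, N))    # RIS reflection pattern
y = H_cascaded @ theta                               # noiseless received pilot direction
```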
2024, 46(10): 4009-4016.
doi: 10.11999/JEIT240048
Abstract:
The data collection problem in an Unmanned Aerial Vehicle (UAV)-assisted wireless sensor network is addressed. Firstly, an initial Sensor Node (SN) clustering strategy based on the mean shift algorithm is proposed, and an SN switching algorithm is then designed to achieve load balancing between clusters. Based on the obtained clustering strategy, the UAV data collection and trajectory planning problem is formulated as a system energy consumption minimization problem. Since the formulated problem is non-convex and difficult to solve directly, it is decoupled into two subproblems: a data scheduling subproblem and a UAV trajectory planning subproblem. For the data scheduling subproblem, a multi-slot time-frequency resource scheduling strategy based on the Kuhn-Munkres algorithm is proposed. For the UAV trajectory planning subproblem, the problem is modeled as a Markov decision process and a deep Q-network-based algorithm is proposed. Simulation results verify the effectiveness of the proposed algorithm.
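The two classical building blocks named above, mean-shift clustering of sensor nodes and Kuhn-Munkres (Hungarian) assignment of per-slot resources, are available off the shelf. The sketch below uses hypothetical node positions and an assumed cost matrix; it is an illustration of the building blocks, not the paper's full scheduling strategy.

```python
import numpy as np
from sklearn.cluster import MeanShift
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
sn_positions = rng.uniform(0, 1000, size=(60, 2))   # hypothetical SN coordinates (m)

# Step 1: initial clustering of sensor nodes via mean shift.
clustering = MeanShift(bandwidth=200).fit(sn_positions)
labels, centers = clustering.labels_, clustering.cluster_centers_

# Step 2: per-slot assignment of clusters to channels minimizing an assumed
# cost (e.g., energy per bit); Kuhn-Munkres solves it optimally.
cost = rng.uniform(size=(len(centers), 4))           # clusters x channels (assumed cost)
row_idx, col_idx = linear_sum_assignment(cost)
print(list(zip(row_idx, col_idx)))                   # cluster -> channel pairs
```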
2024, 46(10): 4017-4023.
doi: 10.11999/JEIT230879
Abstract:
A rateless coding scheme based on a Bernoulli random construction is proposed for communication environments with strong interference, which differs from traditional Luby Transform (LT) rateless codes. The scheme uses the Locally Constrained Ordered Statistic Decoding (LC-OSD) algorithm at the receiver to effectively combat strong interference noise and achieve adaptive, ultra-reliable transmission. To reduce communication resource consumption at both the transmitter and the receiver, three effective decoding criteria are proposed: (1) a startup criterion based on the Random Code Union (RCU) bound, which initiates decoding only when the number of received symbols exceeds a threshold derived from the RCU bound; (2) an early stopping criterion based on soft weights, which terminates decoding early when the soft weight exceeds a preset threshold; and (3) a skipping criterion based on comparing the codeword with the hard-decision sequence, which skips the current decoding round when the hard decision of the newly received sequence satisfies the recoding check. Simulation results show that the performance of the rateless random codes is significantly better than that of LT codes over a channel with block erasures and additive noise. Moreover, because rateless codes adapt to channel quality, their performance is also significantly better than that of fixed-rate codes. The simulation results further show that the proposed startup, early stopping, and skipping criteria effectively reduce transmission resources and computational complexity for both the sender and the receiver.
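The Bernoulli random construction and the recoding-check skipping criterion can be pictured over GF(2) as follows. This is a sketch with hypothetical parameters; the actual scheme couples such an encoder with LC-OSD decoding and the RCU-based startup rule.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 64                               # information length (assumed)

def next_coded_bits(info_bits, n_new, p=0.5):
    """Emit n_new rateless coded bits: each is the GF(2) inner product of the
    info bits with a fresh Bernoulli(p) combination row (the random code)."""
    G_new = (rng.random((n_new, k)) < p).astype(np.uint8)
    return G_new, (G_new @ info_bits) % 2

def recoding_check(G_received, hard_decisions, info_estimate):
    """Skipping criterion: if re-encoding the current information estimate
    already reproduces the hard-decision sequence, skip the decoding round."""
    return np.array_equal((G_received @ info_estimate) % 2, hard_decisions)
```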
2024, 46(10): 4024-4034.
doi: 10.11999/JEIT240075
Abstract:
Information freshness is measured by the Age of Information (AoI) of each sensor in Wireless Sensor Networks (WSN). A UAV optimizes its flight trajectory and speed to assist WSN data collection, guaranteeing that the data offloaded to the base station meets the AoI limit of each sensor. However, inappropriate flight strategies cause unnecessary energy consumption due to excessive flight distance and speed, which may result in failure of the data collection mission. In this paper, a mathematical model is first developed for the UAV energy consumption minimization trajectory planning problem under AoI-constrained data collection. A novel deep reinforcement learning algorithm, the Cooperation Hybrid Proximal Policy Optimization (CH-PPO) algorithm, is then proposed to jointly schedule the UAV's access sequence, hovering positions, and flight speeds to the sensor nodes or the base station, so as to minimize the UAV's energy consumption under the data timeliness constraint of each sensor node. Meanwhile, a loss function that integrates the discrete policy and the continuous policy is designed to increase the rationality of hybrid actions and improve training effectiveness. Numerical results demonstrate that CH-PPO outperforms the three reinforcement learning algorithms in the comparison group in terms of UAV energy consumption and its influencing factors, and the convergence, stability, and robustness of the proposed algorithm are also verified.
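A loss that integrates a discrete policy (e.g., which node or the base station to visit next) with a continuous policy (e.g., hovering position and speed) can be formed by summing the two log-probabilities before building the clipped PPO ratio. The sketch below is a generic hybrid-action surrogate with hypothetical inputs, not the exact CH-PPO objective.

```python
import numpy as np

def hybrid_ppo_loss(logp_disc_new, logp_disc_old,
                    logp_cont_new, logp_cont_old,
                    advantages, clip_eps=0.2):
    """Clipped PPO surrogate where the joint action log-probability is the
    sum of the discrete part (categorical) and the continuous part (Gaussian)."""
    logp_new = logp_disc_new + logp_cont_new
    logp_old = logp_disc_old + logp_cont_old
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -np.minimum(unclipped, clipped).mean()
```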
2024, 46(10): 4035-4043.
doi: 10.11999/JEIT240013
Abstract:
Traditional methods for multi-target bias registration in networked radar systems typically assume that the data association relationship is known. However, when the platform maneuvers, radar measurement biases and platform attitude angle biases coexist, and the radar observation process is prone to clutter interference, making data association difficult. To address this issue, a multi-target mobile radar bias registration method based on the Bernoulli filter is proposed. Firstly, the measurement and state equations of the system biases are established, and the system biases are modeled as a Bernoulli random finite set. Recursive estimation of the system biases under the Bernoulli filtering framework is achieved using the original measurements in a common coordinate system, effectively avoiding data association. Additionally, to make full use of multi-target measurement information, a modified greedy measurement partitioning method is proposed to select, at each filtering step, the optimal measurement subset corresponding to the system biases; centralized fusion estimation of the system biases is then performed using the measurements in this subset, improving estimation accuracy and convergence speed. Simulation experiments show that the proposed method can effectively estimate radar measurement biases and platform attitude angle biases in multi-target, cluttered scenarios with unknown data association, and it demonstrates strong adaptability when the platform attitude angle variation rate is low.
2024, 46(10): 4044-4052.
doi: 10.11999/JEIT231348
Abstract:
To address the difficulty of extracting fingerprint features from communication emitters and the low recognition rate of single features, and considering the nonlinear, non-stationary nature of the subtle features of communication emitters, this paper proposes an individual identification method for communication emitters based on improved Variational Mode Decomposition (VMD) and multiple features. Firstly, to obtain the optimal combination of the number of decomposition levels and the penalty factor, the VMD of the emitter's symbol waveform signals is improved with the whale optimization algorithm, using sequence complexity as the stopping criterion, so that each symbol waveform is adaptively decomposed into several high-frequency components containing nonlinear fingerprint features and low-frequency components carrying the data information. Then, according to the relevant threshold, the high-frequency components that best represent the nonlinear characteristics of the emitter are selected, and their fuzzy entropy, permutation entropy, Higuchi dimension, and Katz dimension are extracted to form a multi-domain joint feature vector. Finally, the communication emitters are recognized and classified with a convolutional neural network, and recognition and classification experiments are conducted on the public Oracle dataset. The experimental results show that the method achieves high recognition accuracy and good noise immunity.
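Of the features listed above, permutation entropy is the simplest to spell out. The sketch below is a standard normalized permutation-entropy implementation applied to one decomposed component (hypothetical order and delay); fuzzy entropy and the Higuchi/Katz dimensions follow analogous per-component recipes.

```python
import numpy as np
from math import factorial

def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy of a 1-D signal component: build the
    histogram of ordinal patterns of length `order`, then take the Shannon
    entropy and normalize by log2(order!)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    patterns = np.array([np.argsort(x[i:i + order * delay:delay]) for i in range(n)])
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p)) / np.log2(factorial(order))
```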
2024, 46(10): 4053-4061.
doi: 10.11999/JEIT240067
Abstract:
Navigation signal authentication services are still at an initial stage, and the ground coverage multiplicity of authentication signals cannot yet meet the requirement of independent positioning and timing. Existing research has paid little attention to spoofing detection methods based on partially trusted signals at this stage. To address this situation, and following the principle of spoofing attacks, a spoofing detection method based on the pseudorange residuals of the authentication signals is proposed, the spoofing detection model for this scenario is established, and the factors that affect the detection performance of the proposed method are analyzed. Simulations show that the average spoofing detection probability of the algorithm reaches 0.96 when the positioning deviation is 100 m, the positioning accuracy is about 10 m, and the number of trusted satellites is 3. In addition, the blind area of the algorithm is analyzed, showing that the algorithm is effective for most spoofing positions.
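One plausible form of such a residual test is sketched below: a least-squares position/clock fix is computed from all (possibly spoofed) pseudoranges, and the residuals of the authenticated pseudoranges alone are then thresholded. This is a hedged illustration with hypothetical geometry and names, not the paper's exact model, which additionally analyzes the detection blind area.

```python
import numpy as np
from scipy.optimize import least_squares

def solve_fix(sat_pos, pseudoranges, x0=None):
    """Least-squares position and clock-bias fix from pseudoranges (meters)."""
    x0 = np.zeros(4) if x0 is None else x0
    def resid(x):
        return np.linalg.norm(sat_pos - x[:3], axis=1) + x[3] - pseudoranges
    return least_squares(resid, x0).x

def trusted_residual_test(all_sat_pos, all_pr, trusted_idx, threshold_m):
    """Spoofing flag: fix the position from all signals, then check the
    residuals of the authenticated (trusted) pseudoranges only."""
    fix = solve_fix(all_sat_pos, all_pr)
    pred = np.linalg.norm(all_sat_pos[trusted_idx] - fix[:3], axis=1) + fix[3]
    residuals = all_pr[trusted_idx] - pred
    return np.linalg.norm(residuals) > threshold_m, residuals
```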
2024, 46(10): 4062-4071.
doi: 10.11999/JEIT231025
Abstract:
Gait recognition is susceptible to external factors such as camera viewpoint, clothing, and carrying conditions, which can degrade performance. To address these issues, non-rigid point set registration is introduced into gait recognition: the deformation field between adjacent gait frames is used to represent the displacement of the human contour during walking, improving the dynamic perception of changes in human body shape. Accordingly, GaitDef, a dual-branch convolutional neural network exploiting the human contour deformation field, is proposed in this paper; it consists of a deformation-field branch and a gait-silhouette branch. In addition, a multi-scale feature extraction module is designed for the sparsity of the deformation-field data, to obtain multi-level spatial structure information of the deformation field. A dynamic difference capture module and a context information augmentation module are proposed to capture the changing characteristics of dynamic regions in gait silhouettes and to enhance the gait representation by exploiting context information. The output features of the two branches are fused to obtain the final gait representation. Extensive experimental results verify the effectiveness of GaitDef, whose average Rank-1 accuracy reaches 93.5% and 68.3% on the CASIA-B and CCPG datasets, respectively.
2024, 46(10): 4072-4080.
doi: 10.11999/JEIT240082
Abstract:
Aggressive scaling of CMOS technologies can cause reliability issues in circuits. Two highly reliable Radiation Hardened By Design (RHBD) 10T and 12T Static Random-Access Memory (SRAM) cells that can protect against Single Node Upsets (SNUs) and Double Node Upsets (DNUs) are presented in this paper. The 10T cell mainly consists of two cross-coupled input-split inverters and robustly keeps stored values through a feedback mechanism among its internal nodes; because it uses only a few transistors, its area and power costs are low. Based on the 10T cell, a 12T cell with four parallel access transistors is proposed, which reduces the read/write access time while offering the same soft-error tolerance as the 10T cell. Simulation results demonstrate that the proposed cells can recover from SNUs and from part of the DNUs. Moreover, compared with state-of-the-art hardened SRAM cells, the proposed RHBD 12T cell reduces write access time by 16.8%, read access time by 56.4%, and power dissipation by 10.2% on average, at the cost of 5.32% more silicon area.