2023 Vol. 45, No. 6
2023, 45(6): 1911-1920.
doi: 10.11999/JEIT220561
Abstract:
DNA storage offers a new way to tackle massive data storage and use, owing to its high density, long durability, and low maintenance cost. To meet massive storage demand, DNA storage must solve the problem of how to organize, access, and manipulate data files, that is, the design of a file system. In this paper, a model of a future DNA storage file system and its characteristics are studied with reference to the computer file system model. The research progress on DNA storage file systems is then systematically reviewed. Finally, perspectives on future research directions for DNA storage file systems are discussed.
2023, 45(6): 1921-1932.
doi: 10.11999/JEIT220500
Abstract:
Hardware Trojan attacks have become a serious threat to Integrated Circuits (ICs). Because hardware Trojans are hidden and rarely triggered, and the data sets of Trojan benchmarks are unbalanced, a hardware Trojan detection method that performs static analysis on the gate-level netlist is presented. A path feature based on the principle of design-for-test is proposed to simplify feature analysis. Based on the path features extracted from a circuit, the nets are classified into two groups with a Support Vector Machine (SVM). A double-weighting method is applied to the training set to improve the performance of the classifier. Experimental results demonstrate that this method can detect the suspicious nets in circuits, with ACCuracy (ACC) of up to 99.85%. The static weighting method improves the performance of the classifier, raising accuracy by up to 5.58%. Compared with the existing reference, the feature size is only 36%, the True Positive Rate (TPR) is decreased by 1.07%, the True Negative Rate (TNR) is increased by 2.74%, and ACC is increased by 2.92%. This work verifies the efficiency of path features and SVM for hardware Trojan detection and clarifies the relationship between data-set balance and detection performance.
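The three metrics quoted above follow directly from confusion-matrix counts, and a small sketch shows why accuracy alone can mislead on unbalanced Trojan benchmarks. The counts below are hypothetical, and the inverse-frequency weighting is only an assumed, commonly used scheme; the abstract does not specify the paper's double-weighting method.

```python
def detection_rates(tp, fn, tn, fp):
    """Return (TPR, TNR, ACC) from confusion-matrix counts."""
    tpr = tp / (tp + fn)                    # share of Trojan nets that are flagged
    tnr = tn / (tn + fp)                    # share of normal nets that are cleared
    acc = (tp + tn) / (tp + fn + tn + fp)   # share of all nets classified correctly
    return tpr, tnr, acc


def inverse_frequency_weights(n_trojan, n_normal):
    """Per-class weights n / (2 * n_class). Assumed scheme, not the paper's."""
    total = n_trojan + n_normal
    return {"trojan": total / (2 * n_trojan), "normal": total / (2 * n_normal)}


# Hypothetical netlist: 10 Trojan nets among 1000, and a classifier that flags nothing.
tpr, tnr, acc = detection_rates(tp=0, fn=10, tn=990, fp=0)
weights = inverse_frequency_weights(n_trojan=10, n_normal=990)
print(tpr, tnr, acc, weights)
```

In this made-up case the classifier reaches 99% accuracy while detecting no Trojan net at all, which is why TPR and TNR are reported alongside ACC and why the rare class is usually weighted more heavily during training.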
2023, 45(6): 1933-1943.
doi: 10.11999/JEIT220130
Abstract:
To meet the performance and power requirements of Ultimate Edge Computing (UEC) scenarios, a Convolutional Neural Network (CNN) accelerator architecture with a 16-bit quantized model that does not rely on external memory is proposed. The basic structure of the proposed architecture is a multi-core, fully pipelined CNN accelerator on a Field Programmable Gate Array (FPGA). On this basis, intra-layer mapping and inter-layer fusion of the accelerator are optimized. Computing and memory resources are then evaluated theoretically by building the corresponding models. Guided by these models, resource utilization and computing efficiency are maximized through design space exploration, and the peak computing power of the accelerator is fully exploited under limited resource constraints. Finally, taking fast human detection on a nano Unmanned Aerial Vehicle (UAV) as an example, the architecture is verified and analyzed experimentally. Experimental results show that, for inference of a human detection neural network based on the Single Shot multibox Detector (SSD), the accelerator achieves frame rates of 137 and 34 at 100 MHz and 25 MHz, with corresponding power of 0.514 W and 0.263 W, respectively, which meets the performance and power requirements of real-time image processing in typical UEC scenarios such as autonomous computing on nano-UAVs.
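The two reported operating points can be compared on an energy-per-frame basis with simple arithmetic. The sketch below assumes that the frame rates are in frames per second and that the quoted power is the average power during inference; neither assumption is stated explicitly in the abstract.

```python
def energy_per_frame_mj(power_w, frames_per_second):
    """Energy spent on one inference frame, in millijoules."""
    return 1000.0 * power_w / frames_per_second


# Clock in MHz -> (frame rate, power in W), taken from the abstract.
operating_points = {100: (137, 0.514), 25: (34, 0.263)}

energy = {mhz: energy_per_frame_mj(power, fps)
          for mhz, (fps, power) in operating_points.items()}
for mhz, millijoules in energy.items():
    print(f"{mhz} MHz: {millijoules:.2f} mJ per frame")
```

Under these assumptions the 100 MHz point costs about 3.75 mJ per frame and the 25 MHz point about 7.74 mJ per frame, so the faster clock draws more power but spends less energy per processed frame.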
2023, 45(6): 1944-1951.
doi: 10.11999/JEIT220610
Abstract:
As the core positioning component of a motor, the angle sensor has an important impact on the motor's positioning accuracy. In this paper, a bipolar inductive absolute angle sensor is designed. The sensor measures the angle through the periodic variation of the voltage induced in the coils. The sensitive structure consists mainly of a rotor and a stator and can be integrated with the motor spindle. The rotor is composed of inner single-cycle and outer multi-cycle fan-shaped copper foils in a bipolar layout, and the stator is composed of an excitation coil, receiving coils, and the subsequent processing circuit. There are two groups of receiving coils in the stator. One group consists of 8 loops, corresponding to the multi-cycle sector copper foil on the outer edge, and the other consists of 2 loops, corresponding to the 180° sector (semicircular) copper foil at the center. The two groups of coils are independent and do not affect each other. When the rotor rotates above the receiving coils, the eddy current generated in the rotor makes the induced voltages of two adjacent receiving coils vary as periodic sine and cosine signals. The 8-loop coil offers high measurement accuracy, but its signal repeats several times within 360°, so it cannot provide the absolute position on its own. This problem is solved by identifying the cycle numbers of coil 1 and coil 2. The sine and cosine signals are identified and solved by the algorithm, and the prototype is tested on a high-precision turntable. The results show that the measurement error of the sensor can reach 0.04°, which meets the accuracy requirements of motor position control and verifies the feasibility of the scheme.
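One common way to combine a coarse single-cycle channel with a fine multi-cycle channel is sketched below. It is an illustration under stated assumptions, not the paper's solving algorithm: the function names are hypothetical, and the number of fine electrical periods per revolution (`periods`) is a free parameter because the abstract does not give it.

```python
import math


def electrical_angle(sin_v, cos_v):
    """Angle in degrees, in [0, 360), recovered from a sine/cosine voltage pair."""
    return math.degrees(math.atan2(sin_v, cos_v)) % 360.0


def absolute_angle(coarse_deg, fine_deg, periods):
    """Use the coarse angle to pick the fine period, then trust the fine reading."""
    span = 360.0 / periods              # mechanical width of one fine period
    fine_mech = fine_deg / periods      # position inside the current fine period
    index = round((coarse_deg - fine_mech) / span) % periods
    return (index * span + fine_mech) % 360.0


# Hypothetical reading: true angle 200 deg, 4 fine periods, coarse channel off by 3 deg.
fine = electrical_angle(math.sin(math.radians(80.0)), math.cos(math.radians(80.0)))
angle = absolute_angle(coarse_deg=203.0, fine_deg=fine, periods=4)
print(angle)
```

The coarse channel only needs to be accurate to within half a fine period for the period index to be identified correctly, so the final accuracy is set by the fine channel.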
2023, 45(6): 1952-1958.
doi: 10.11999/JEIT220591
Abstract:
Perfect Gaussian Integer Sequences (PGISs) have been widely used in Code Division Multiplexing (CDM) systems and Orthogonal Frequency Division Multiplexing (OFDM) systems because of their good anti-interference performance, high transmission rate, and high spectrum utilization. In this paper, a Gaussian Integer Sequence (GIS) is decomposed into a real part sequence and an imaginary part sequence, and second-order and third-order PGISs are then constructed by second-order cyclotomy of the real and imaginary part sequences. A new method of extending odd-length PGISs to even-length PGISs is proposed. The energy efficiency of most PGISs constructed in this paper is higher than 95%, and the constructions expand the address selection space of spread spectrum communication systems, which is of great significance to engineering practice.
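A sequence is called perfect when its periodic autocorrelation is zero at every nonzero shift. The following sketch checks that property for a Gaussian integer sequence; the length-4 test sequences are small illustrative examples chosen here, not sequences constructed in the paper.

```python
def periodic_autocorrelation(seq, shift):
    """R(shift) = sum over n of s(n) * conj(s((n + shift) mod N))."""
    n = len(seq)
    return sum(seq[i] * complex(seq[(i + shift) % n]).conjugate() for i in range(n))


def is_perfect(seq, tol=1e-9):
    """True if every out-of-phase periodic autocorrelation value vanishes."""
    return all(abs(periodic_autocorrelation(seq, s)) < tol for s in range(1, len(seq)))


perfect = [1, 1j, 1, -1j]        # Gaussian integers: real and imaginary parts are integers
not_perfect = [1, 1j, -1, -1j]
print(is_perfect(perfect), is_perfect(not_perfect))
```

The in-phase value `periodic_autocorrelation(seq, 0)` is the sequence energy, which is the quantity that energy-efficiency measures are built on.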
2023, 45(6): 1959-1969.
doi: 10.11999/JEIT211285
Abstract:
A traditional Generative Adversarial Network (GAN) ignores the representation and structural information of the original features when the feature map is large, and there is no long-range correlation between the pixels of the generated images, so the resulting image quality is low. To further improve the quality of the generated images, a data generation method based on a Generative Adversarial Network with Spatial Features (SF-GAN) is proposed. Firstly, a spatial pyramid network is added to the generator and discriminator to better capture important descriptive information such as image edges. Then the features of the generator and discriminator are strengthened to model the long-range correlation between pixels. Experiments are performed on small-scale benchmarks (CelebA, SVHN, and CIFAR-10). Compared with improved training of Wasserstein GANs (WGAN-GP) and Self-Attention Generative Adversarial Networks (SAGAN) through qualitative evaluation and quantitative evaluation by Inception Score (IS) and Frechet Inception Distance (FID), the proposed method generates higher quality images. The experiments also show that the generated images can further improve the training of classification models.
2023, 45(6): 1970-1980.
doi: 10.11999/JEIT220585
Abstract:
To address the problem that existing network defense decision-making methods are challenged by error interference and real-time response requirements, a novel network defense decision-making method based on an Improved Evolutionary Game Model (IEGM) is proposed. Firstly, drawing on the classical servo system model, the short-term prediction of the attack strategy by the defense side is quantified by a differential hypothesis to accelerate the convergence of the model and improve the efficiency of defense decisions. Secondly, the mechanism of error generation in the attack-defense game is analyzed, the observation error in network defense is defined quantitatively, and an improved replicator dynamics equation is proposed to strengthen the tolerance of the model to information deviation. On this basis, an improved evolutionary game model is established, and the corresponding stability analysis and mathematical proof are given to show that the model converges to the $ \varepsilon $-neighborhood of the Nash equilibrium solution. Theoretical analysis and simulation results show that the proposed model can overcome the influence of observation error, and the optimal pure defense strategy with a deviation on the order of 0.01% is given. In addition, in a jamming environment, the response speed of defense decision-making is improved by 64.06% compared with three other decision models. The improved model and decision-making method effectively improve the timeliness of defense decisions and the adaptability to observation error.
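For orientation, the classical replicator equation that evolutionary game models start from is $ \dot{x}_i = x_i (f_i - \bar{f}) $, where $ x_i $ is the share of strategy $ i $, $ f_i $ its payoff, and $ \bar{f} $ the population average. The sketch below integrates this classical single-population form with Euler steps. It does not reproduce the paper's improved equation (the prediction and observation-error terms are not given in the abstract), and the payoff matrix is a made-up example.

```python
def replicator_step(shares, payoff, dt):
    """One Euler step of the classical replicator dynamics."""
    n = len(shares)
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    average = sum(shares[i] * fitness[i] for i in range(n))
    updated = [shares[i] + dt * shares[i] * (fitness[i] - average) for i in range(n)]
    total = sum(updated)                 # renormalize to guard against rounding drift
    return [value / total for value in updated]


def evolve(shares, payoff, dt=0.01, steps=5000):
    """Iterate the dynamics and return the final strategy shares."""
    for _ in range(steps):
        shares = replicator_step(shares, payoff, dt)
    return shares


# Hypothetical payoffs in which strategy 0 always earns more than strategy 1.
final_shares = evolve([0.1, 0.9], [[2.0, 2.0], [1.0, 1.0]])
print(final_shares)
```

With these payoffs the population moves toward the dominant strategy, which is the kind of convergence behavior that stability analysis of such models examines.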
2023, 45(6): 1981-1989.
doi: 10.11999/JEIT220529
Abstract:
To relieve the negative impact of non-Independent and Identically Distributed (non-IID) data across clients in federated learning, a spectral clustering-based Fourier personalized federated learning mechanism is proposed to overcome the performance drop caused by data heterogeneity. Specifically, a cloud-edge-end collaborative personalized federated learning model for image recognition is constructed, and, to make full use of the knowledge learned by similar clients, the clients are divided into multiple clusters by spectral clustering under cloud-edge collaboration. Next, a local federated learning method based on edge-end collaboration is proposed, in which an agent model restores and re-updates the personalized local model at each client to recover the local knowledge lost during aggregation. Furthermore, a cloud-edge collaborative Fourier personalized federated learning method is proposed to adapt the global model to each distributed client. In this method, the cloud server converts the local model parameters to the frequency domain through the Fourier transform for aggregation and customizes a high-quality personalized local model for each edge node. Finally, experimental results demonstrate that the proposed algorithm achieves a competitive convergence speed compared with existing representative works, with accuracy 3%~13% higher.
2023, 45(6): 1990-1998.
doi: 10.11999/JEIT220547
Abstract:
In-band full-duplex is an efficient technology for alleviating the shortage of spectrum resources in wireless communication systems. To ensure the information security of a full-duplex communication system in which a Full Duplex Access Point (FD-AP) receives information from uplink users and transmits information to downlink users simultaneously, an Intelligent Reflecting Surface (IRS) assisted physical layer security scheme is proposed in this paper. A non-convex optimization problem with coupled variables is formulated to maximize the secrecy rate of downlink users, subject to constraints on the maximum transmit power, the minimum Signal-to-Interference and Noise Ratio (SINR) required at the AP, and the unit modulus of the IRS phase shifts. To solve this problem, the Alternating Optimization (AO) algorithm is adopted to optimize the AP transmit beamforming and the IRS reflection phase shifts iteratively, and a Riemannian manifold optimization method based on the exact penalty method is proposed to handle the unit modulus constraint and transform the phase shift optimization sub-problem into an unconstrained minimization problem on a Riemannian manifold. Simulation results show that the proposed scheme can significantly improve the security performance of the full-duplex communication system. In addition, compared with the SemiDefinite Relaxation (SDR) algorithm, the proposed algorithm has much lower computational complexity.
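The unit modulus constraint confines each IRS coefficient to the complex unit circle, which is why a manifold method can treat the sub-problem as unconstrained. The sketch below shows one generic gradient step on that manifold: project the Euclidean gradient onto the tangent space, move, then retract by renormalizing. The quadratic cost, the vector `a`, and the target value are toy assumptions for illustration; the paper's secrecy-rate objective and exact penalty terms are not reproduced.

```python
def cost(theta, a, target):
    """Toy cost |target - sum_i a_i * theta_i|^2 (hypothetical, not the paper's)."""
    residual = target - sum(ai * ti for ai, ti in zip(a, theta))
    return abs(residual) ** 2


def euclidean_gradient(theta, a, target):
    """Gradient of the toy cost with respect to the conjugate of theta."""
    residual = target - sum(ai * ti for ai, ti in zip(a, theta))
    return [-ai.conjugate() * residual for ai in a]


def riemannian_step(theta, grad, step):
    """Tangent-space projection followed by retraction onto the unit circle."""
    updated = []
    for t, g in zip(theta, grad):
        tangent = g - (g * t.conjugate()).real * t   # remove the radial component
        moved = t - step * tangent
        updated.append(moved / abs(moved))           # retraction keeps |theta_i| = 1
    return updated


a = [1 + 0j, 1j, -1 + 0j]
target = 3 + 0j
theta = [1 + 0j, 1 + 0j, 1 + 0j]
initial_cost = cost(theta, a, target)
for _ in range(50):
    theta = riemannian_step(theta, euclidean_gradient(theta, a, target), step=0.1)
final_cost = cost(theta, a, target)
print(initial_cost, final_cost)
```

Every iterate satisfies the constraint exactly, so no relaxation or rank-one recovery step is needed, which is one reason manifold methods tend to be cheaper than SDR-based approaches.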
2023, 45(6): 1999-2006.
doi: 10.11999/JEIT220530
Abstract:
Time-sensitive networking is one of the core technologies of future smart factories, which carry multiple service flows with different requirements. To ensure the performance of critical traffic flows and improve network bandwidth utilization, a Time Slot-Aware Cyclic Queuing and Forwarding (TSA-CQF) mechanism is proposed. TSA-CQF improves bandwidth utilization by inserting low-priority traffic into the remaining available time slots of the CQF queues. The mechanism comprises two parts: slot-aware insertion of low-priority traffic and global traffic planning for low-priority traffic. The first part inserts low-priority traffic into the remaining time slots of the CQF queues. The second part, global traffic planning, is modeled as a multi-conditional objective optimization problem and solved by the simulated annealing algorithm to accept as many flows as possible and thereby increase bandwidth utilization. Simulation results show that TSA-CQF improves bandwidth utilization by 11.29% on average compared with the traditional CQF mechanism under mixed traffic conditions.
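The planning step can be pictured as assigning each low-priority flow to one time slot, or rejecting it, so that no slot exceeds its remaining capacity while the number of accepted flows is maximized. The following sketch applies textbook simulated annealing to that simplified model. The one-slot-per-flow model, the flow sizes, the capacities, and the cooling schedule are all assumptions for illustration, not the paper's formulation.

```python
import math
import random


def accepted_count(assignment):
    """Number of flows that were given a slot (-1 means rejected)."""
    return sum(1 for slot in assignment if slot >= 0)


def is_feasible(assignment, flow_sizes, slot_capacity):
    """Check that no slot carries more than its remaining capacity."""
    load = [0] * len(slot_capacity)
    for flow, slot in enumerate(assignment):
        if slot >= 0:
            load[slot] += flow_sizes[flow]
    return all(used <= cap for used, cap in zip(load, slot_capacity))


def anneal(flow_sizes, slot_capacity, iterations=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over flow-to-slot assignments."""
    rng = random.Random(seed)
    current = [-1] * len(flow_sizes)
    best = list(current)
    temperature = t0
    for _ in range(iterations):
        candidate = list(current)
        flow = rng.randrange(len(flow_sizes))
        candidate[flow] = rng.randrange(-1, len(slot_capacity))  # new slot or reject
        if is_feasible(candidate, flow_sizes, slot_capacity):
            delta = accepted_count(candidate) - accepted_count(current)
            # Always take improvements; take worse moves with Boltzmann probability.
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                current = candidate
                if accepted_count(current) > accepted_count(best):
                    best = list(current)
        temperature *= cooling
    return best


flow_sizes = [3, 3, 2, 2, 4]      # hypothetical low-priority flow sizes
slot_capacity = [5, 5]            # hypothetical remaining capacity per slot
plan = anneal(flow_sizes, slot_capacity)
print(plan, accepted_count(plan))
```

Occasionally accepting a worse plan lets the search drop one flow to make room for two smaller ones, which a purely greedy insertion cannot do.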
2023, 45(6): 2007-2015.
doi: 10.11999/JEIT220513
Abstract:
Traditional optimization algorithms are difficult to apply to the long-term dynamic deployment problem of Unmanned Aerial Vehicles (UAVs) because of their high complexity and their difficulty in adapting to dynamic environments. To address these shortcomings, a dynamic pre-deployment strategy for UAVs based on Multi-Agent Deep Reinforcement Learning (MADRL) is proposed. Firstly, a deep spatio-temporal network model is used to predict the expected rate demand of users in the coverage area and thus capture dynamic environment information. The concept of user satisfaction is defined to describe fairness among users. An optimization problem is formulated with the goal of maximizing the long-term overall user satisfaction while minimizing the movement and radio energy consumption of the UAVs. Secondly, the problem is transformed into a Partially Observable Markov Game (POMG). An H-MADDPG algorithm based on MADRL is proposed to find the optimal decisions on trajectory design, user association, and power allocation. The H-MADDPG algorithm uses a hybrid network structure to extract the features of multi-modal inputs and adopts a centralized training and distributed execution mechanism to achieve efficient training and decision execution. Finally, the effectiveness of the algorithm is verified by simulation experiments.
2023, 45(6): 2016-2023.
doi: 10.11999/JEIT220573
Abstract:
Access Point (AP) selection in cell-free massive Multiple-Input Multiple-Output Non-Orthogonal Multiple Access (MIMO-NOMA) systems has a great impact on reducing the backhaul overhead and improving the downlink achievable rate of users. In this paper, an expression for the downlink average rate of a user is derived for the cell-free massive MIMO-NOMA system with AP selection. Then, a novel AP selection strategy based on Quantum Bacterial Foraging Optimization (QBFO) is proposed, which encodes the connections between APs and users as qubits. An adaptive quantum rotation gate is used to simulate the chemotaxis of bacteria. By measuring the quantum bacterial population, the set of AP-user selection solutions is obtained, and a dispersal operation is introduced to keep the algorithm from falling into local optima. Numerical results show that the proposed scheme can significantly improve the downlink average rate of users while relieving the backhaul burden. Compared with the schemes based on received power and on channel estimation mean square error, the proposed scheme performs better in reducing inter-user interference and improving the total system throughput.
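In quantum-inspired optimizers of this kind, each AP-user link is typically held as a qubit $ (\alpha, \beta) $ with $ \alpha^2 + \beta^2 = 1 $, where $ \beta^2 $ is the probability that the link is selected on measurement. The sketch below shows the standard rotation gate and a measurement step. The fixed rotation angle, the first-quadrant clamping, and the function names are simplifying assumptions; the paper's adaptive angle rule is not given in the abstract.

```python
import math
import random


def rotate(qubit, theta):
    """Apply the rotation gate [[cos, -sin], [sin, cos]] to (alpha, beta)."""
    alpha, beta = qubit
    return (math.cos(theta) * alpha - math.sin(theta) * beta,
            math.sin(theta) * alpha + math.cos(theta) * beta)


def chemotaxis_step(qubit, target_bit, delta):
    """Rotate toward the bit of the best known solution (assumed update rule)."""
    theta = delta if target_bit == 1 else -delta
    alpha, beta = rotate(qubit, theta)
    if alpha < 0.0:          # clamp so amplitudes stay in the first quadrant
        return (0.0, 1.0)
    if beta < 0.0:
        return (1.0, 0.0)
    return (alpha, beta)


def measure(qubits, rng):
    """Collapse each qubit to 1 (link selected) with probability beta squared."""
    return [1 if rng.random() < beta * beta else 0 for _, beta in qubits]


start = (math.sqrt(0.5), math.sqrt(0.5))             # equal chance of 0 and 1
moved = chemotaxis_step(start, target_bit=1, delta=0.1 * math.pi)
saturated = start
for _ in range(10):
    saturated = chemotaxis_step(saturated, target_bit=1, delta=0.1 * math.pi)
bits = measure([saturated] * 5, random.Random(1))
print(moved, saturated, bits)
```

Because rotation preserves the norm, the qubit always remains a valid probability encoding, and repeated rotation toward the best solution gradually sharpens the selection.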
2023, 45(6): 2024-2033.
doi: 10.11999/JEIT220673
Abstract:
In this paper, the joint optimization of transmission power, modulation mode, and channel code rate is studied for wireless communication systems with Energy Harvesting (EH) when prior information on energy harvesting and channel state is unknown. The optimization target is to maximize the actual achievable transmission rate. Based on the Lyapunov optimization framework, the long-term energy constraint is transformed into a stability requirement on a virtual energy queue, and the maximization of the long-term average achievable transmission rate is transformed into the minimization, in each time slot, of an upper bound on the “drift-plus-penalty” that depends only on the current system state, such as channel fading and battery level. The optimization is solved by an efficient numerical algorithm. In addition, an adaptive adjustment method based on sliding-window K-means clustering is given for the two parameters in the “drift-plus-penalty”, namely the weight and the virtual queue offset. The performance of the proposed algorithm is compared with that of baseline algorithms under different stochastic energy arrival models by computer simulation. The results show that the proposed algorithm achieves a higher long-term average rate under various energy arrival models. The correctness and effectiveness of the adaptive parameter adjustment are verified by comparing the performance of the algorithm with the optimal parameters against that with the adaptively adjusted parameters.
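A generic per-slot drift-plus-penalty decision can be sketched as follows. With battery level $ B $, offset $ \theta $, and weight $ V $, taking the Lyapunov function $ (B - \theta)^2 / 2 $ leads to choosing the power $ p $ that maximizes $ V r(p) + (B - \theta) p $ in each slot. This is the textbook form under assumptions made here: the rate is modeled as $ \log_2(1 + h p) $, power is picked from a small discrete set, and modulation and coding are ignored, so it is not the paper's exact formulation.

```python
import math


def choose_power(levels, battery, gain, weight, offset):
    """Pick the feasible power level that maximizes V * r(p) + (B - theta) * p."""
    shifted = battery - offset                      # virtual queue value, usually negative
    feasible = [p for p in levels if p <= battery]  # cannot spend more than is stored
    return max(feasible,
               key=lambda p: weight * math.log2(1.0 + gain * p) + shifted * p)


def update_battery(battery, power, harvested, capacity):
    """Battery evolution for one slot, capped at the storage capacity."""
    return min(battery - power + harvested, capacity)


levels = [0.0, 1.0, 2.0]          # hypothetical power levels; must include 0
full = choose_power(levels, battery=10.0, gain=1.0, weight=1.0, offset=10.0)
low = choose_power(levels, battery=1.0, gain=1.0, weight=1.0, offset=10.0)
print(full, low)
```

The decision uses only the current channel gain and battery level, with no statistics of future energy arrivals. A larger weight favors rate, while a larger offset makes the transmitter more conservative when the battery is low, which is why these two parameters are the ones worth tuning adaptively.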
2023, 45(6): 2034-2044.
doi: 10.11999/JEIT220521
Abstract:
The information-theoretic limits of Orthogonal Frequency Division Multiplexing (OFDM) Visible Light Communication (VLC) and Radio Frequency (RF) aggregation systems with finite-alphabet inputs are still unknown. To address this, the achievable rate, which has no closed-form expression, and a closed-form lower bound are derived for the OFDM VLC-RF aggregation system, and the maximization of Spectral Efficiency (SE) based on the achievable rate and on its lower bound is studied under constraints on the average optical power and the total electrical power. The relationship between mutual information and Minimum Mean Square Error (MMSE) is used to handle the partial derivatives of the rate, and a double water-filling algorithm is proposed to solve the SE maximization problem. Because the non-closed form of the SE makes the double water-filling algorithm computationally expensive, the closed-form SE maximization problem is further studied and solved with the interior point method. Simulation results show that the aggregation system outperforms a single link in communication performance, and that the SE based on the lower bound of the achievable rate is a good low-complexity approximation of the SE based on the achievable rate.
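For orientation, the classic single-constraint water-filling rule that the double water-filling algorithm builds on can be sketched as below. This is the textbook Gaussian-input version, not the paper's finite-alphabet algorithm, which additionally uses the MMSE and handles two power constraints. The function name and the bisection search on the water level are illustrative choices.

```python
def water_filling(noise_levels, total_power, tol=1e-9):
    """Return powers p_i = max(mu - n_i, 0) whose sum equals total_power."""
    # The water level mu lies between the lowest noise floor (no power used)
    # and the highest floor plus the whole budget (at least the budget used).
    lo, hi = min(noise_levels), max(noise_levels) + total_power
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        used = sum(max(mu - n, 0.0) for n in noise_levels)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return [max(mu - n, 0.0) for n in noise_levels]
```

With noise levels `[1, 2, 5]` and a budget of 3, the water level settles at 3, so the two cleaner subchannels receive 2 and 1 units and the noisiest one is switched off.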
2023, 45(6): 2045-2053.
doi: 10.11999/JEIT220531
Abstract:
To deal with the effects of limited spectrum, fading, and multipath in wireless communication, a new joint source-channel code modulation scheme is proposed. The scheme consists of a Variable-Length Error-Correcting (VLEC) code and doping modulation. With the aid of EXtrinsic Information Transfer (EXIT) chart analysis of the iterative decoding characteristics, the parameters of the VLEC code and the doping modulation are designed. The design has two parts: a variable-length code with a large free distance is constructed to provide error correction capability; and the doping code and the modulation mapping are optimized so that the EXIT curve of the doping modulation matches that of the VLEC code, which reduces the Signal-to-Noise Ratio (SNR) required for iterative decoding to converge. Simulation results show that over the AWGN channel and the Rayleigh fading channel, the proposed scheme achieves an SNR gain of more than 1 dB over separated source-channel code modulation and performs best among the compared joint source-channel code modulation schemes. At a symbol error rate of 10⁻⁴, the proposed system is 0.7 dB and 1.0 dB away from the Shannon limit on the two channels, respectively.
2023, 45(6): 2054-2062.
doi: 10.11999/JEIT220554
Abstract:
To ensure security and reduce energy consumption for data acquisition in Wireless Sensor Networks (WSNs), an Unmanned Aerial Vehicle (UAV) swarm-aided energy consumption optimization algorithm for data acquisition is proposed. The algorithm reduces the total energy consumption of the system by optimizing the number of UAVs, their height, and the number of data transmissions in the WSN. Firstly, for data acquisition in the WSN, a Reputation baseD Data Dual Compression (RDDC) algorithm is proposed, which divides sensors into clusters according to geographic location. Each cluster has one cluster head, which is responsible for model selection, aggregation, and reputation updates, and several cluster members, which train the prediction model and send it to the cluster head. Secondly, a UAV deployment optimization algorithm is proposed to minimize the energy consumption of the UAV swarm; the deployment problem is transformed into a circle packing problem and solved by dynamically adjusting the number of UAVs. Moreover, a private blockchain is enabled in the UAV swarm to improve the security of the data acquisition process. Finally, the proposed method is verified on the Berkeley Research Laboratory dataset, and simulation results show that it optimizes the deployment of UAVs and achieves small error, low energy consumption, and high security.
2023, 45(6): 2063-2070.
doi: 10.11999/JEIT220457
Abstract:
To address the potential security issue that arises because the transmitted beam in phased-array-assisted wireless communication systems depends only on angle, as well as the high computational complexity of traditional iterative algorithms, a secure transmission scheme with a 3D secure region assisted by a Random Frequency Diverse Array (RFDA) and Deep Learning (DL) is proposed in this paper. Firstly, the requirements for secure communication with the desired user within the 3D secure region are derived. Based on these requirements, an optimization problem is formulated to maximize the lower bound of the secrecy rate of the considered system. Then, a deep-learning-based optimization scheme is proposed to design the beamforming vector and the Artificial Noise (AN) vector so as to reduce the computational complexity. Simulation results show that even when the eavesdropper is located at the edge of the desired user’s secure region, the proposed scheme achieves 3D secure transmission and protects the confidential information received within the secure region.
2023, 45(6): 2071-2080.
doi: 10.11999/JEIT220624
Abstract:
To address small-sample signal modulation recognition, the theoretical feasibility of using a Support Vector Machine (SVM) for modulation recognition is first investigated. Secondly, based on statistical learning theory, a theoretical analysis is conducted of using data generated by Generative Adversarial Networks (GANs) to enhance the classification ability of the SVM. Finally, a Deep Convolutional Generative Adversarial Network based on Layer normalization (LDCGAN) is constructed. After mapping to a high-dimensional space, its generated data has more distinct features than data generated by a Deep Convolutional Generative Adversarial Network (DCGAN), so it is more conducive to SVM classification. Experiments verify that LDCGAN-generated data effectively enhances the classification ability of the SVM under small-sample conditions.
2023, 45(6): 2081-2088.
doi: 10.11999/JEIT220659
Abstract:
The Intelligent Reflecting Surface (IRS) has been widely used in beyond-5G and 6G systems to increase communication efficiency by adjusting the wireless transmission channel in real time. In this paper, the use of distributed IRSs to maximize the secrecy rate is investigated. Considering the power constraint, the constant-modulus constraint, and the correlation between the IRS links, a secrecy rate maximization problem is formulated over the active beamforming of the base station and the phase shifts of the IRSs. An efficient algorithm based on fractional programming and manifold optimization is proposed to solve this non-convex problem. Finally, simulation results confirm that, compared with conventional methods, the proposed algorithm significantly improves the secrecy rate. Furthermore, the results reveal that the distributed IRS scheme achieves a significant security performance improvement over the centralized IRS scheme.
2023, 45(6): 2089-2097.
doi: 10.11999/JEIT220627
Abstract:
Quantum Imaging (QI) is an important research direction in quantum optics because of its anti-reconnaissance capability, anti-interference capability, and high resolution. To address the image quality degradation caused by abnormal coincidence count values due to ambient light in practical quantum imaging, a photon quantum imaging method based on coincidence count filter optimization is proposed in this paper. Firstly, a three-level Discrete Wavelet Transform (DWT) is applied to the original coincidence count values to obtain the corresponding wavelet coefficients. Secondly, Gaussian filtering is applied to denoise the high-frequency components of the wavelet coefficients, and the denoised coincidence count values are obtained through the inverse wavelet transform. Finally, a linear mapping of these coincidence count values is used to form the quantum image of the target. The influence of the number of image pixels, the single-pixel exposure time, and the coincidence gate width on the imaging results is analyzed by simulation, and a physical quantum imaging optical path is built to verify the simulation analysis.
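The three processing steps in this abstract (multi-level DWT, Gaussian filtering of the high-frequency coefficients, inverse transform) can be sketched on a one-dimensional sequence of counts. The abstract does not name the wavelet or the filter parameters, so the Haar wavelet, the kernel radius, the border clamping, and all function names below are assumptions made for illustration only.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling

def haar_forward(signal):
    """One Haar level: pairwise averages (approx) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) * S, (a - d) * S))
    return out

def gaussian_smooth(values, sigma=1.0, radius=2):
    """Smooth a sequence with a normalised Gaussian kernel, clamping at borders."""
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    norm = sum(kernel)
    n = len(values)
    return [
        sum(w * values[min(max(i + k, 0), n - 1)] for k, w in zip(offsets, kernel)) / norm
        for i in range(n)
    ]

def denoise_counts(counts, levels=3, sigma=1.0):
    """Decompose, smooth only the detail (high-frequency) bands, reconstruct."""
    if len(counts) % (2 ** levels):
        raise ValueError("length must be divisible by 2**levels")
    approx, details = list(counts), []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(gaussian_smooth(detail, sigma))
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx
```

Because only the detail bands are filtered, the approximation band, and with it the total count, is left unchanged by this sketch.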
2023, 45(6): 2098-2104.
doi: 10.11999/JEIT220541
Abstract:
Orthogonal Time Sequency Multiplexing (OTSM) multiplexes information symbols in the delay-sequency domain through cascaded time-division and Walsh-Hadamard multiplexing. Because the Walsh-Hadamard Transform (WHT) requires no complex multiplications during modulation and demodulation, OTSM has lower modulation complexity than Orthogonal Time-Frequency Space (OTFS). In this paper, a two-stage equalizer is proposed for OTSM systems in high-mobility environments. First, low-complexity Minimum Mean Square Error (MMSE) detection is performed block by block in the time domain by exploiting the sparsity and banded structure of the channel matrix. Then, Gauss-Seidel (GS) iterative detection further removes residual symbol interference. Simulation results show that, compared with the GS iterative detection algorithm based on single-tap frequency-domain equalization, the proposed algorithm achieves a performance gain of 1.8 dB at a bit error rate of 10⁻⁴ with 16QAM modulation.
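The claim that the WHT needs no multiplications can be seen in a fast transform built purely from additions and subtractions. The sketch below uses natural (Hadamard) ordering and no normalisation; OTSM itself is usually described with a sequency-ordered, normalised WHT, so the ordering and the function name here are simplifying assumptions.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, natural order, unnormalised."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: only one addition and one subtraction per pair.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Applying `fwht` twice returns the input scaled by its length, so the same routine serves for both modulation and demodulation up to a constant factor.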
2023, 45(6): 2105-2114.
doi: 10.11999/JEIT220493
Abstract:
Polarimetric Inverse Synthetic Aperture Radar (ISAR), which offers full-polarization measurement and high-resolution imaging, has become an important sensor for space awareness. As typical man-made targets, space targets exhibit diverse scattering characteristics that are sensitive to the relative angle between the target orientation and the radar’s line of sight. This scattering diversity makes polarimetric ISAR data difficult to interpret, but it also carries rich polarimetric scattering information. To improve the interpretation of space targets, a scattering structure recognition method is proposed that mines polarimetric rotation domain information along the radar’s line of sight. The method has three main steps. Firstly, polarimetric rotation domain analysis along the radar’s line of sight is performed on the polarimetric ISAR data, and a set of polarimetric correlation pattern features is derived. Secondly, the polarimetric correlation pattern characteristics of canonical structures are analyzed, and their polarimetric feature coding vectors are given. Finally, the target scattering structure is recognized from the distance between polarimetric feature coding vectors. Simulation experiments are carried out with typical space target components, namely a solar panel and a reflector antenna. Compared with the traditional Cameron decomposition, the proposed method shows superior and more robust performance.
2023, 45(6): 2115-2122.
doi: 10.11999/JEIT220520
Abstract:
To achieve high-performance beam scanning of tri-polarized planar arrays over a wide angular range, this study presents a method based on tri-polarized elements that forms a half-space scanning beam for planar arrays. The beam scanning characteristics of the phased array are also analyzed. Assuming that the three polarizations of the element have consistent characteristics, and starting from the required direction of the element’s synthesized beam, the unit vector of the synthesized electric field, which is orthogonal to the beam direction, is decomposed along the three orthogonal polarization directions to obtain the phase and amplitude of the three polarization electric field components. Subsequently, because power is proportional to the square of the polarization electric field amplitude, the excitation expressions required for polarization synthesis in four polarization forms are derived, and their accuracy is verified through simulation. The classical method of controlling the array factor is then used: by changing the phase difference between array elements, the scanning angle of the array factor is adjusted to coincide with the beam direction of the antenna element, forming half-space array scanning beams in any direction. Finally, the scanning characteristics of the array under the four polarization forms are simulated without considering mutual coupling between array elements. The results show that the array has excellent half-space scanning characteristics, verifying the effectiveness of the beamforming method.
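The array-factor step, steering the beam by setting the phase difference between adjacent elements, can be illustrated with the textbook uniform planar array. The sketch assumes isotropic elements on a rectangular grid with equal spacing in x and y and no mutual coupling; it does not model the tri-polarized element or the polarization synthesis, and the function names are hypothetical.

```python
import cmath
import math

def steering_phases(theta0, phi0, spacing_wl):
    """Progressive phase shifts (rad) that point the array factor at (theta0, phi0)."""
    kd = 2.0 * math.pi * spacing_wl  # spacing given in wavelengths
    beta_x = -kd * math.sin(theta0) * math.cos(phi0)
    beta_y = -kd * math.sin(theta0) * math.sin(phi0)
    return beta_x, beta_y

def array_factor(theta, phi, nx, ny, spacing_wl, beta_x, beta_y):
    """Magnitude of the array factor of an nx-by-ny uniform planar array."""
    kd = 2.0 * math.pi * spacing_wl
    psi_x = kd * math.sin(theta) * math.cos(phi) + beta_x
    psi_y = kd * math.sin(theta) * math.sin(phi) + beta_y
    total = sum(
        cmath.exp(1j * (m * psi_x + n * psi_y))
        for m in range(nx)
        for n in range(ny)
    )
    return abs(total)
```

At the steered direction every element contribution adds in phase, so the magnitude reaches `nx * ny`; away from it the contributions partially cancel.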
2023, 45(6): 2123-2133.
doi: 10.11999/JEIT220503
Abstract:
Microwave Frequency Shift (MFS) technology is widely used in electronic countermeasures, satellite communications, and frequency diverse array radar. Photonics-based MFS methods offer large bandwidth and a pure spectrum. To evaluate their performance, three MFS methods based on an Acousto-Optic Frequency Shifter (AOFS), Sawtooth Phase Modulation (SPM), and I/Q modulation are compared in this paper. The principles of the three methods are explained, and corresponding verification systems are built for experiments and analysis. The results show that all three methods achieve accurate MFS with spurious suppression ratios greater than 30 dB. However, each method has its own limitations: the operating frequency, bandwidth, and frequency shift direction of the AOFS are relatively fixed, so its tunability is low; the methods based on SPM and I/Q modulation place strict requirements on the input driving signal, which leads to poor stability.
2023, 45(6): 2134-2143.
doi: 10.11999/JEIT220687
Abstract:
Against the background of the epidemic, contactless human-computer interaction has great application prospects in the medical and health field. In particular, using gesture recognition to realize non-contact instrument control is becoming a research hotspot. To improve robustness and accuracy, a digit gesture recognition method based on dual-view sequential feature fusion of millimeter-wave radars is proposed in this paper. Firstly, time-series echo data of gestures for the digits 0~9 are collected synchronously from the front and side views. Secondly, the datasets from the two views are preprocessed with clutter suppression and data compression. Furthermore, the Attention embedded Dual View Fusion Network (ADVFNet) is constructed based on the intrinsic correlation of temporal features. Finally, the collected dataset is used to train the network, fuse the sequential features, and recognize the digit gestures. Experimental results show that the recognition accuracy of the proposed method is about 95%, and that it offers faster network convergence and better model generalization than several existing methods. Moreover, the method could provide a new idea for future human-computer interaction based on millimeter-wave radars.
2023, 45(6): 2144-2152.
doi: 10.11999/JEIT220579
Abstract:
The Rapidly-exploring Random Tree (RRT) algorithm has some shortcomings, including low computational efficiency and a lack of asymptotic optimality. An Improved RRT (IRRT) algorithm based on search rules and cross-entropy optimization is presented in this paper. During the path search, the search step size and search direction are adjusted according to the current node position and the search rules to achieve efficient and rapid initial path planning. Then, cross-entropy theory is applied to optimize the initial path so that the path is asymptotically optimal. The first simulation experiment demonstrates the effectiveness and convergence of the proposed method. In the second simulation experiment, the proposed algorithm is compared with several RRT variants, and the results show that it maintains computational efficiency while making the path asymptotically optimal.
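As a reference point for the improvements described here, the baseline RRT loop (sample, find the nearest node, extend by a fixed step) can be sketched as follows. This is the plain algorithm in an obstacle-free unit square with a simple goal bias, not the IRRT of the paper: the adaptive step size, the search rules, and the cross-entropy path optimization are all omitted, and the parameter names are illustrative.

```python
import math
import random

def rrt(start, goal, step=0.05, goal_bias=0.2, max_iters=5000, rng=None):
    """Grow a basic RRT in the obstacle-free unit square; return a path or None."""
    rng = rng or random.Random()
    nodes, parent = [start], {0: None}
    for _ in range(max_iters):
        # Sample the goal with probability goal_bias, else a uniform random point.
        sample = goal if rng.random() < goal_bias else (rng.random(), rng.random())
        near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        d = math.dist(nodes[near], sample)
        if d == 0.0:
            continue
        # Extend from the nearest node towards the sample by at most one step.
        scale = min(step, d) / d
        new = (nodes[near][0] + scale * (sample[0] - nodes[near][0]),
               nodes[near][1] + scale * (sample[1] - nodes[near][1]))
        nodes.append(new)
        parent[len(nodes) - 1] = near
        if math.dist(new, goal) <= step:
            # Walk the parent links back to the root to recover the path.
            path, i = [goal], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

The returned path is feasible but generally jagged and longer than necessary, which is the gap that a post-optimization stage such as the cross-entropy step in the abstract is meant to close.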
2023, 45(6): 2153-2161.
doi: 10.11999/JEIT220558
Abstract:
The Finite Rate of Innovation (FRI) theory can realize sub-Nyquist sampling of pulse stream signals at a sampling rate much lower than the Nyquist rate. Most classical FRI reconstruction algorithms operate on Fourier coefficients and require many singular value decompositions of complex matrices, which reduces their efficiency. To solve this problem, an FRI sampling and reconstruction method based on the real part of the Fourier coefficients is proposed in this paper. Firstly, the discrete cosine transform is used to obtain the real part of the Fourier coefficients from the low-rate samples of the pulse stream signal, and the Toeplitz matrix of the real part is used in the reconstruction algorithm to improve the efficiency of the Singular Value Decomposition (SVD). Secondly, to improve the robustness of the classical annihilating filter algorithm, a covariance matrix decomposition algorithm and a null space searching algorithm are proposed based on the rotation invariance and the null space property of the real covariance matrix. Both methods use the discrete cosine transform to estimate the characteristic parameters of the pulse stream signal. For the conjugate root problem, a new deconjugation method based on the alternating direction method of multipliers is proposed. The simulation results show that using the real part of the Fourier coefficients can greatly improve the efficiency of the algorithm while ensuring the accuracy of parameter estimation when the rate of innovation of the signal is high.
2023, 45(6): 2162-2170.
doi: 10.11999/JEIT220667
Abstract:
Repeated searches are required to optimize coherent accumulation. However, due to randomness and time variability, it is difficult to find the optimal transformation order. To solve this problem, singular value decomposition from matrix theory is used to extract features of the Fractional Fourier Transform (FRFT) spectrum at each transformation order, a feature detector is designed, and a sea clutter suppression and target detection method based on singular values in the FRFT domain is proposed. The method avoids the search for the optimal transformation order while making greater use of the shape information of the maneuvering target in the FRFT domain. In evaluations on simulated data with Gaussian white noise, the proposed method achieves a detection probability of 60% when the SNR is –2.5 dB. Verification on measured data shows that the method can stably complete target detection when the SNR is 4.7 dB, has good detection performance, and is easy to implement in engineering.
2023, 45(6): 2171-2179.
doi: 10.11999/JEIT220688
Abstract:
In distributed sensor networks, different sensors produce inconsistent estimates of state parameters such as the azimuth and axis lengths of the same extended target, which makes extended target estimate association difficult and poses challenges for the subsequent density information fusion. Compared with the posterior density of a point target, the posterior density of an extended target contains both centroid state and shape information. Accordingly, the Ellipse Distance (ED) is proposed based on the Euclidean distance between centroids and a non-Euclidean size-shape metric between shape matrices. The ellipse distance considers both the centroid state and the shape information of the extended target, and better associates the posterior densities of the same extended target across different sensors. In addition, the approximate Gamma Gaussian Inverse Wishart (GGIW) distribution of the fused spatial density is derived under the Arithmetic Average (AA) fusion rule, realizing AA fusion of the posterior information of the same extended target from different sensors. Simulation results show that the proposed algorithm can effectively track multiple extended targets in distributed sensor networks.
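The idea behind arithmetic average fusion can be sketched in a much simpler setting than the GGIW case treated in the paper. The example below assumes scalar Gaussian posteriors from several sensors, forms their weighted arithmetic average (a mixture), and approximates that mixture by a single Gaussian through moment matching. The function name and the scalar simplification are assumptions made for illustration, not the paper's derivation.

```python
def aa_fuse_gaussians(means, variances, weights):
    """Moment-matched single Gaussian for a weighted average of Gaussians.

    The arithmetic average of the densities is a mixture; its mean is the
    weighted mean of the component means, and its variance adds the spread
    of the component means to the weighted component variances.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalise the fusion weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    var = sum(wi * (vi + (mi - mean) ** 2)
              for wi, mi, vi in zip(w, means, variances))
    return mean, var


# Two hypothetical sensors that disagree on the target position.
fused_mean, fused_var = aa_fuse_gaussians(
    means=[0.0, 2.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

Note that the fused variance grows when the sensors disagree, which is the conservative behaviour usually associated with AA fusion.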
2023, 45(6): 2180-2187.
doi: 10.11999/JEIT220693
Abstract:
A three-dimensional parameter estimation algorithm for a helicopter in constant-speed flight is proposed, which uses underwater acoustic data from a single hydrophone and extends the traditional two-dimensional flight parameter estimation. Firstly, the helicopter line spectrum is used as the exciting sound source, and its three-dimensional Doppler propagation model in a two-layer air-water medium, including the altitude, speed and deviation distance of the helicopter, is established. The asymmetry of the Doppler frequency curve and of its first- and second-order derivatives is related to the three-dimensional motion parameters of the helicopter, which can therefore be estimated from the received data. Finally, the measured data verify the rationality of the three-dimensional Doppler shift flight model. Compared with the short-time Fourier instantaneous frequency estimation algorithm, the APP-LMS algorithm retrieves flight parameters such as the natural frequency, velocity, altitude and yaw distance of the helicopter more accurately.
2023, 45(6): 2188-2196.
doi: 10.11999/JEIT220551
Abstract:
Low Dose CT (LDCT) can significantly reduce the X-ray radiation dose, but the resulting images contain a lot of noise that affects doctors' diagnosis. Deep Image Prior (DIP) is an unsupervised deep learning algorithm that uses a random tensor as the input of the neural network and iterates with a single LDCT image as the target. However, DIP needs thousands of iterations to obtain the best denoised result, which makes it slow. Therefore, a DIP acceleration method for target offset in low-dose CT images is proposed, which aims to improve the running speed while maintaining the quality of the denoised image. Exploiting the similarity of LDCT slice images of an organ (such as the lungs), the algorithm links independent networks whose target images are different slices by parameter inheritance: the network parameters for the current slice are updated starting from the network parameters for the previous slice, and in turn serve as the starting point for the network of the next slice. In addition, since the input of the DIP network is a fixed random tensor that differs greatly from the target image, this paper uses the LDCT image preprocessed by traditional models as the network input to further improve the iteration speed. Experiments show that the proposed acceleration algorithm can improve the iteration speed by 10.45% compared with the original DIP network without traditional model preprocessing. When the LDCT image preprocessed by the Relative Total Variation (RTV) model is used as the network input, the peak signal-to-noise ratio of the image reaches 29.13 and the total iteration speed is increased by 94.31%. Therefore, this algorithm can greatly improve the running speed while maintaining the denoising quality of DIP, and the speed-up is most pronounced when the CT image preprocessed by the RTV model is used as the network input.
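Why parameter inheritance saves iterations can be seen in a deliberately tiny stand-in. The sketch below is not a DIP network: it assumes a single scalar parameter fitted by gradient descent to a sequence of hypothetical, similar "slice" targets, and compares restarting from scratch for every slice with starting each fit from the previous slice's result. All names and values are illustrative assumptions.

```python
def fit(target, start, lr=0.1, tol=1e-3, max_iters=10_000):
    """Gradient descent on 0.5 * (theta - target) ** 2.

    Returns the fitted parameter and the number of update steps used.
    """
    theta = start
    for iteration in range(max_iters):
        if abs(theta - target) < tol:
            return theta, iteration
        theta -= lr * (theta - target)
    return theta, max_iters


def total_iterations(targets, inherit):
    """Fit every target in order, optionally inheriting the last result."""
    theta, total = 0.0, 0
    for target in targets:
        start = theta if inherit else 0.0  # warm start vs. cold start
        theta, used = fit(target, start)
        total += used
    return total


# Hypothetical targets for adjacent slices: close to each other, far from 0.
slice_targets = [5.0, 5.1, 5.2, 5.15]
cold = total_iterations(slice_targets, inherit=False)
warm = total_iterations(slice_targets, inherit=True)
```

Because adjacent targets are similar, the inherited starting point is already near the next optimum, so `warm` comes out smaller than `cold`. The actual savings for a real network depend on the architecture and data and are not predicted by this toy.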
2023, 45(6): 2197-2204.
doi: 10.11999/JEIT220603
Abstract:
As the core algorithm of deep learning technology, the deep neural network is prone to misjudging adversarial examples with imperceptible perturbations, which brings new challenges to the security of deep learning models. The resistance of a deep learning model to adversarial examples is called robustness. To improve the robustness of models trained by adversarial training, an adversarial training algorithm for deep learning models based on the information bottleneck is proposed. The information bottleneck describes the deep learning process in terms of information theory, so that the deep learning model can converge faster. The proposed algorithm uses conclusions derived from an optimization objective based on information bottleneck theory: it adds the tensor that is input to the linear classification layer of the model to the loss function, and uses sample cross-training to align the high-level features obtained when clean samples and adversarial samples are input to the model, so that the model better learns the relationship between the input samples and their true labels during training and ultimately achieves good robustness to adversarial samples. Experimental results show that the proposed algorithm is robust to a variety of adversarial attacks and generalizes across different datasets and models.
2023, 45(6): 2205-2215.
doi: 10.11999/JEIT220413
Abstract:
To address the imbalanced overall performance of current deep learning single-stage detection algorithms and the difficulty of deploying them on embedded devices, a high-performance object detection algorithm for embedded platforms is proposed in this paper. The algorithm is based on the You Only Look Once v5 (YOLOv5) network. Firstly, in the backbone, the original focus module and the original Cross Stage Partial Darknet are replaced by a designed space stem block and an improved ShuffleNetv2, respectively, and the kernel size of Spatial Pyramid Pooling (SPP) is reduced to lighten the backbone network. Secondly, in the neck, an Enhanced Path Aggregation Network (EPAN) based on the Path Aggregation Network (PAN) is adopted and a P6 large target output layer is added, improving the feature extraction ability of the network. Then, in the head, an Adaptive-Atrous Spatial Feature Fusion (A-ASFF) module based on Adaptive Spatial Feature Fusion (ASFF) replaces the original detection head, which solves the object scale change problem and greatly improves the detection accuracy with a small amount of additional overhead. Finally, in the function section, the Complete Intersection over Union (CIoU) loss function is replaced by the Efficient Intersection over Union (EIoU) loss, and the HardSwish activation function is replaced by the Sigmoid weighted Linear Unit (SiLU), improving the overall capability of the model. The experimental results show that, compared with YOLOv5-S, the mAP@.5 and mAP@.5:.95 of the corresponding version of the proposed algorithm are increased by 4.6% and 6.3%, while the number of parameters and the computational complexity are reduced by 43.5% and 12.0%, respectively. In speed evaluations on the Jetson Nano platform using the original model and the TensorRT-accelerated model, the inference latency is reduced by 8.1% and 9.8%, respectively.
The proposed algorithm surpasses many excellent object detection networks in overall performance and in friendliness to embedded platforms, and is of practical significance.
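For readers unfamiliar with the two activation functions named in the abstract, the sketch below writes out their standard textbook definitions in plain Python. It only illustrates the functions themselves; how they are wired into the network is not shown, and the sample points are arbitrary.

```python
import math


def silu(x):
    """Sigmoid weighted Linear Unit: x * sigmoid(x)."""
    # Written as x / (1 + e^-x); fine for moderate |x| as used here.
    return x / (1.0 + math.exp(-x))


def hard_swish(x):
    """HardSwish: x * ReLU6(x + 3) / 6, a piecewise approximation."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


# Compare the two functions on a few arbitrary sample points.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  silu={silu(x):+.4f}  hard_swish={hard_swish(x):+.4f}")
```

HardSwish avoids the exponential and is exactly zero for inputs at or below -3, while SiLU is smooth everywhere. Which of the two is cheaper or more accurate on a given embedded target is an empirical question that this sketch does not settle.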
2023, 45(6): 2216-2225.
doi: 10.11999/JEIT220645
Abstract:
In this paper, a Transfer Fuzzy C-Means clustering algorithm based on Maximum Mean Discrepancy (TFCM-MMD) is proposed. TFCM-MMD addresses the problem that the transfer learning effect of the transfer fuzzy C-means clustering algorithm weakens when the data distributions of the source domain and the target domain differ greatly. The algorithm measures inter-domain differences with the maximum mean discrepancy criterion, and reduces the difference between the source and target data distributions in a common subspace by learning projection matrices for the source domain and the target domain, thereby improving the effect of transfer learning. Finally, experiments on synthetic datasets and medical image segmentation datasets further verify the effectiveness of the TFCM-MMD algorithm in solving transfer clustering problems with large inter-domain differences.
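The maximum mean discrepancy criterion has a particularly simple form when a linear kernel is assumed: the squared MMD between two samples reduces to the squared Euclidean distance between their mean vectors. The sketch below computes that quantity for two small hypothetical point sets. The kernel choice, function names and data are assumptions for illustration; the abstract does not state which kernel or estimator the paper uses, and the learned projection matrices are not modelled here.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def linear_mmd_squared(source, target):
    """Squared MMD under a linear kernel: ||mean(source) - mean(target)||^2."""
    mean_s = mean_vector(source)
    mean_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))


# Hypothetical 2-D samples from a source domain and a target domain.
source_points = [[0.0, 0.0], [2.0, 0.0]]
target_points = [[1.0, 3.0], [1.0, 5.0]]
gap = linear_mmd_squared(source_points, target_points)
```

A value of zero means the two sample means coincide. A method that projects both domains into a common subspace would aim to make this kind of discrepancy small after projection.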
2023, 45(6): 2226-2235.
doi: 10.11999/JEIT220563
Abstract:
Adversarial sample generation is a technique that makes a neural network produce misjudgments by adding small perturbations, and it can be used to test the robustness of text classification models. At present, sample generation methods for Chinese text mainly rely on traditional character substitution and homophone substitution, which suffer from large perturbation amplitude and low quality of the generated samples. Polyphonic characters Generation Adversarial Sample (PGAS), a character-level adversarial sample generation approach, is proposed in this paper. It generates high-quality adversarial samples with minor perturbation by replacing polyphonic characters. First, a polyphonic word dictionary is constructed to label polyphonic words. Then, polyphonic word substitution is applied to the input text. Finally, an adversarial sample attack experiment is conducted in the black-box setting. Experiments on multiple sentiment classification datasets verify the effectiveness of the proposed method against a variety of the latest classification models.
2023, 45(6): 2236-2245.
doi: 10.11999/JEIT220644
Abstract:
Factors such as scale variation, occlusion and complex backgrounds make crowd counting in crowded scenes a challenging task. To cope with the scale variation in crowd images and with the scope limitation and feature similarity problems of existing multi-column networks, a Multi-Scale Interactive Attention crowd counting Network (MSIANet) is proposed in this paper. Firstly, a multi-scale attention module is designed, which uses four branches with different receptive fields to extract features at different scales and lets the scale features extracted by the branches interact. At the same time, an attention mechanism is used to mitigate the feature similarity problem of the multi-column network. Secondly, a semantic information fusion module is designed on top of the multi-scale attention module; it lets different levels of semantic information from the backbone network interact and stacks multi-scale attention modules layer by layer to make full use of the multi-layer semantic information. Finally, a multi-scale interactive attention crowd counting network is constructed from the multi-scale attention module and the semantic information fusion module, which makes full use of multi-level semantic information and multi-scale information to generate high-quality crowd density maps. The experimental results show that, compared with existing representative crowd counting methods, the proposed MSIANet effectively improves the accuracy and robustness of the crowd counting task.
2023, 45(6): 2246-2255.
doi: 10.11999/JEIT220684
Abstract:
To address the low detection accuracy, slow model convergence and heavy computation of current panoramic image saliency detection methods, a U-Net with Robust vision transformer and Multiple attention modules (URMNet) is proposed. Sphere convolution is used to extract multi-scale features of panoramic images while reducing the distortion of panoramic images after equirectangular projection. The robust vision transformer module is used to extract the salient information contained in the feature maps at four scales, and convolutional embedding is used to reduce the resolution of the feature maps and enhance the robustness of the model. The multiple attention module selectively integrates multi-dimensional attention according to the relationship between spatial attention and channel attention. Finally, the multi-layer features are gradually fused to form the panoramic image saliency map. A latitude-weighted loss function is used to speed up the convergence of the model. Experiments on two public datasets show that, owing to the robust vision transformer module and the multiple attention module, the proposed model outperforms 6 other advanced methods and further improves the saliency detection accuracy of panoramic images.
2023, 45(6): 2256-2263.
doi: 10.11999/JEIT220601
Abstract:
Person Re-IDentification (ReID) aims to retrieve specific pedestrian targets across surveillance cameras. To aggregate the multi-granularity features of pedestrian images and further address the problem of deep feature mapping correlation, Person Re-Identification based on CNN and TransFormer Multi-scale learning (CTM) is proposed. The CTM network is composed of a global branch, a deep aggregation branch and a feature pyramid branch. The global branch extracts global features of pedestrian images as well as hierarchical features at different scales. The deep aggregation branch recursively aggregates the hierarchical features of the CNN and extracts multi-scale features. The feature pyramid branch is a two-way pyramid structure that, combined with the attention module and orthogonal regularization, significantly improves the performance of the network. Experiments on three large-scale datasets show the effectiveness of CTM. On the Market1501, DukeMTMC-reID and MSMT17 datasets, mAP/Rank-1 reach 90.2%/96.0%, 82.3%/91.6% and 63.2%/83.7%, respectively, outperforming other existing methods.
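The mAP and Rank-1 figures quoted above are standard retrieval metrics. As a simplified sketch, they can be computed from per-query ranking lists as follows. It assumes each query has already been reduced to a list of booleans ordered by descending similarity, and it omits the same-camera filtering used by the usual ReID evaluation protocols. The names `average_precision` and `evaluate` are illustrative.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True if the i-th ranked gallery image
    shows the same identity as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0

def evaluate(all_ranked_matches):
    """Return (mAP, Rank-1) over a list of per-query match lists."""
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(m) for m in all_ranked_matches) / n
    # Rank-1: fraction of queries whose top-ranked gallery image is correct.
    rank1 = sum(1 for m in all_ranked_matches if m and m[0]) / n
    return mean_ap, rank1
```

Rank-1 only looks at the top result, whereas mAP rewards placing every correct gallery image early, which is why the two numbers can differ widely on a hard dataset such as MSMT17.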
2023, 45(6): 2264-2272.
doi: 10.11999/JEIT221252
Abstract:
Automatic and accurate segmentation of 3D kidney CT images is of great significance for reducing the workload of doctors and improving the efficiency of computer-aided diagnosis. However, due to the structural complexity of the kidney and the gray-level similarity of adjacent tissues, accurate 3D kidney segmentation is still challenging. Exploiting the simple structure and few parameters of the Simplified Pulse Coupled Neural Network (SPCNN), and combining it with the Fuzzy Connectedness (FC) algorithm, an automatic segmentation algorithm for three-dimensional kidney CT images is proposed in this paper. The main contributions are as follows: (1) The 2D SPCNN is extended to a 3D SPCNN, which can make full use of the inter-layer information of 3D CT images. (2) An automatic 3D seed point generation strategy based on the centroid of the region of interest is proposed, which effectively improves the efficiency of automatic segmentation. (3) An effective coupling of the 3D FC response map and the 3D SPCNN is realized. The proposed algorithm is validated on self-made and public datasets, and the results show that it performs better than the existing mainstream algorithms. The average Dice coefficient, accuracy, sensitivity, volume error and average symmetric surface distance reach 0.9095, 0.9969, 0.8517, 0.1749 and 0.8536, respectively.
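Two of the reported metrics, the Dice coefficient and sensitivity, have standard definitions that can be illustrated briefly. The sketch below assumes the predicted and ground-truth segmentations are flattened into equal-length sequences of 0/1 voxel labels. The function names are illustrative, and the behaviour for empty masks (returning 1.0) is a convention chosen here, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for flat sequences of 0/1 labels."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention (assumption): two empty masks count as a perfect match.
    return 2.0 * intersection / total if total else 1.0

def sensitivity(pred, truth):
    """Sensitivity = TP / (TP + FN), the fraction of true voxels recovered."""
    true_positives = sum(p * t for p, t in zip(pred, truth))
    positives = sum(truth)
    return true_positives / positives if positives else 1.0
```

Because background voxels dominate a CT volume, accuracy can stay close to 1 even when part of the kidney is missed, so Dice and sensitivity are the more informative of the reported values.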
2023, 45(6): 2273-2283.
doi: 10.11999/JEIT220596
Abstract:
The 3D Convolutional Neural Network (3D CNN) has been a hot topic in deep learning research over the last few years and has made great achievements in computer vision. Despite years of research and abundant results, a comprehensive and detailed review of this topic is still lacking. In this paper, the 3D convolutional neural network is introduced from the following aspects. Firstly, the rationale and model structure of the 3D convolutional neural network are presented. Then, improvements to the 3D convolutional neural network are summarized in terms of network structure, network interior and optimization methods. After that, the application of the 3D convolutional neural network to video understanding is explained. Finally, the paper is summarized and future development directions are discussed. This paper provides a systematic review of the latest research progress of 3D convolutional neural networks and their applications in video understanding, which is of positive significance to the research and development of 3D convolutional neural networks.
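The operation that distinguishes a 3D CNN from its 2D counterpart is that the kernel also slides along a third axis (depth, or time for video). As a minimal single-channel sketch, assuming no padding, stride 1, and nested Python lists indexed as [depth][row][col], it can be written as below. The function name `conv3d_valid` is illustrative; like most deep learning frameworks, it computes cross-correlation without flipping the kernel.

```python
def conv3d_valid(volume, kernel):
    """Naive 3D cross-correlation with no padding and stride 1."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):          # slide along depth / time
        plane = []
        for i in range(H - kh + 1):      # slide along rows
            row = []
            for j in range(W - kw + 1):  # slide along columns
                acc = 0.0
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[d + a][i + b][j + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out
```

Each output value mixes information from `kd` consecutive slices or frames, which is what lets a 3D CNN capture motion in video rather than processing frames independently.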
2023, 45(6): 2284-2292.
doi: 10.11999/JEIT220532
Abstract:
To deal with the weight degradation and sample impoverishment problems of the particle filter, a Particle Filter based on Harris Hawks Optimization improved by Encircling strategy (EHHOPF) is designed. Firstly, the global search strategy in Harris Hawks Optimization is replaced by an encircling prey strategy to fit the filtering environment. Additionally, the Sigmoid function is introduced to construct a nonlinear prey escaping energy that balances exploration and exploitation. Lastly, a selection scale factor is proposed to simplify the selection mechanism of the search strategies, and a nonlinear dynamic prey jump strength is constructed to guarantee convergence efficiency. The simulation results show that, compared with the standard particle filter and particle filters optimized by the krill herd algorithm, bat algorithm, cuckoo search algorithm and grey wolf optimizer, the proposed particle filter effectively improves state estimation accuracy, filtering stability and real-time performance.
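The two problems named in the abstract can be made concrete with the standard particle filter, independent of the EHHOPF method itself. The sketch below is not the paper's algorithm: it shows the usual effective sample size diagnostic for weight degeneracy and a systematic resampling step, the conventional remedy whose repeated duplication of particles leads to sample impoverishment. Weights are assumed to be normalized, and the function names are illustrative.

```python
import random

def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2); values far below len(weights) signal degeneracy."""
    return 1.0 / sum(w * w for w in weights)

def systematic_resample(weights, rng=random):
    """Return particle indices drawn by systematic resampling.

    A single random offset is combined with n evenly spaced positions
    on the cumulative weight axis.
    """
    n = len(weights)
    start = rng.random() / n
    indices = []
    cumulative = weights[0]
    i = 0
    for k in range(n):
        position = start + k / n
        # Advance to the particle whose cumulative weight covers this position.
        while position >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices
```

When a few particles hold most of the weight, the returned indices repeat them many times, so diversity is lost. Optimization-based filters such as the one described above aim to move particles toward high-likelihood regions instead of merely copying them.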