2021 Vol. 43, No. 6
2021, 43(6): 1485-1497.
doi: 10.11999/JEIT210076
Abstract:
The range resolution and maximum working distance of a millimeter-wave radar are usually limited by its RF bandwidth and transmitted power. A front-end chip with wide bandwidth, high transmitted power, high sensitivity and high-precision phase control is therefore crucial for a millimeter-wave radar system to achieve high performance. The main design difficulties of millimeter-wave radar chips are impedance matching, noise reduction, transmitted power enhancement and phase control. This article discusses and summarizes the key techniques for overcoming these difficulties in millimeter-wave radar front-end chips.
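As a rough illustration of the bandwidth limit mentioned in this abstract, the textbook relation ΔR = c/(2B) links range resolution to RF sweep bandwidth. The sketch below is not taken from the article; the function name and the example bandwidths are assumptions chosen for illustration only.

```python
# Speed of light in vacuum, m/s.
C = 299_792_458.0


def range_resolution(bandwidth_hz: float) -> float:
    """Theoretical range resolution in metres for a given RF bandwidth in Hz.

    Uses the standard relation delta_R = c / (2 * B).
    """
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return C / (2.0 * bandwidth_hz)


# Illustrative bandwidths (assumed values, not reported in the article).
for b in (1e9, 4e9):
    print(f"B = {b / 1e9:.0f} GHz -> resolution ~ {range_resolution(b) * 100:.2f} cm")
```

A wider sweep bandwidth gives a finer resolution, which is why wideband front-end design is listed among the key requirements.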
2021, 43(6): 1498-1509.
doi: 10.11999/JEIT201102
Abstract:
Because of its low power consumption, fast response, nanoscale size, non-volatility and other characteristics, the memristor shows great potential for realizing non-von Neumann computing architectures. High-density crossbar arrays based on memristors can be used to build logic circuits and brain-like computing circuits that integrate data storage and parallel computing. In addition, nanosensors can be integrated with memristors so that the collected signals are computed and stored in the memristor array. Chip technology integrating sensing, storage and computing has thus become a new research focus. This paper reviews research on memristor-based storage-computing integrated technology and sensing-storage-computing integrated technology, and gives an outlook on future research.
2021, 43(6): 1510-1517.
doi: 10.11999/JEIT210002
Abstract:
Lightweight neural networks deployed on low-power platforms have proven to be effective solutions for Artificial Intelligence (AI) and Internet of Things (IoT) applications such as Unmanned Aerial Vehicle (UAV) detection and unmanned driving. However, with limited resources it is very challenging to build a Deep Neural Network (DNN) accelerator with both high precision and low latency. In this paper, a series of efficient hardware optimization strategies is proposed. A stackable shared Processing Engine (PE) balances the inconsistency of data reuse and memory access patterns across different convolutions. Regulable loop parallelism and channel augmentation effectively increase the access bandwidth between the accelerator and external memory, and also improve the computing efficiency of the shallow layers of the DNN. Pre-Workflow is applied to improve the overall parallelism of the heterogeneous system. Verified on a Xilinx Ultra96 V2 board, the proposed strategies effectively improve DNN accelerator designs such as iSmart3-SkyNet and SkrSkr-SkyNet. The results show that the optimized accelerator processes 78.576 frames per second, with an energy consumption of 0.068 J per picture.
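The two reported figures can be combined into an implied average power. The sketch below assumes that the per-picture figure is an energy and that the accelerator runs continuously at the reported frame rate; the function name is hypothetical and the result is a back-of-the-envelope estimate, not a number stated in the abstract.

```python
FRAMES_PER_SECOND = 78.576   # throughput reported in the abstract
ENERGY_PER_FRAME_J = 0.068   # energy per picture reported in the abstract


def average_power_w(fps: float, energy_per_frame_j: float) -> float:
    """Average power in watts = frames per second * joules per frame."""
    return fps * energy_per_frame_j


power = average_power_w(FRAMES_PER_SECOND, ENERGY_PER_FRAME_J)
print(f"implied average power ~ {power:.2f} W")
```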
2021, 43(6): 1518-1524.
doi: 10.11999/JEIT200979
Abstract:
With the growing abundance of data-intensive applications, the memory wall has become a bottleneck for computing efficiency. A novel Floating Point (FP) computing infrastructure that embeds Ferroelectric Field Effect Transistor (FeFET) based Ternary Content Addressable Memories (TCAMs) is proposed for energy-efficient computing. With an ultra-dense TCAM implementation following the proposed design guidelines, the infrastructure can replace unnecessary Floating Point Unit (FPU) executions with more efficient TCAM searches, thereby reducing overall energy consumption. With the proposed execution flow, the infrastructure achieves up to 33% energy savings compared to regular FPUs.
2021, 43(6): 1525-1532.
doi: 10.11999/JEIT210012
Abstract:
The level set algorithm is widely used for image segmentation due to its high accuracy. In addition, compared to deep learning-based image segmentation methods, the level set algorithm can be implemented without training data, which significantly reduces the labeling effort. However, the level set algorithm is still usually implemented in software, involving complex computation over a large number of pixels and iterations and causing long processing time and high power consumption. In this work, an FPGA-based level set hardware accelerator is proposed for image segmentation. The proposed hardware accelerator contains four design components: task-level parallel processing, image splitting processing, a fully-pipelined processing architecture, and a time-multiplexed gradient and divergence processing engine. Experimental results show that the proposed hardware accelerator achieves up to 10.7 times acceleration compared to the level set algorithm executing on a CPU, with only 2.2 W power consumption.
2021, 43(6): 1533-1540.
doi: 10.11999/JEIT201108
Abstract:
As a new type of electronic component, the memristor features small size, fast read and write speed, non-volatility and easy compatibility with CMOS circuits. It is one of the most promising technologies for realizing non-volatile memory. However, existing multi-value storage crossbar arrays suffer from complex circuit structure, sneak paths and low storage density, which limit their practicability. In this paper, a multi-value memory crossbar array based on heterogeneous memristors is proposed, in which each memory cell is composed of one Transistor and two heterogeneous Memristors (1T2M) with different threshold voltages and Ron resistance values. A single voltage signal completes the four-value read and write operation, which reduces the current paths and simplifies the circuit structure. PSpice simulation shows that, compared with existing work, the proposed 1T2M multi-value memory crossbar array has a simpler circuit structure, higher storage density and faster read and write speed, and better overcomes the leakage current problem.
2021, 43(6): 1541-1549.
doi: 10.11999/JEIT210004
Abstract:
This paper establishes an energy efficiency probability model for a dedicated cryptographic processor and uses it to guide the design of the processor. The design space exploration problem of the processor is formulated as the problem of positioning the "1" values in a configuration matrix. A probability matrix is introduced to transform the positioning problem into an optimal configuration probability problem. Based on the idea of machine learning, a probability model for the highest energy efficiency of a dedicated cryptographic processor is proposed. Experiments show that the energy efficiency probability model outputs the final result after 2300 iterations on average, and the prediction accuracy reaches 92.7%. According to the model, a collection of computing units that meet high energy efficiency requirements is obtained and integrated into the open source general-purpose 64-bit RISC-V processor core Ariane to build a dedicated energy-efficient cryptographic processor. The processor is synthesized in a 55 nm CMOS process. The results show that, compared with Ariane, the area of the proposed cryptographic processor increases by 426874 μm², the critical-path delay increases by 0.51 ns, and the sum of the increasing total time area of the cryptographic algorithm is 0.46; the energy efficiency of common cryptographic algorithms is within the range of 1.6~35.16 Mbps/mW.
2021, 43(6): 1550-1558.
doi: 10.11999/JEIT210001
Abstract:
Sub-threshold circuits are an important direction in low-power design. As the supply voltage is reduced, the performance of standard cell circuits provided by foundries becomes susceptible to noise and process variations, which has become a bottleneck restricting sub-threshold chips. Highly robust sub-threshold standard cells are proposed in this work. The Schmitt Trigger (ST) and the Inverse Narrow Width Effect (INWE) are used to improve the performance, leakage and robustness of the logic gates. Then, the INWE minimum-width sizing and finger layout methods are used to increase the switching threshold of the circuit and the drive current of the transistors. Finally, the standard cell library is designed and verified in a TSMC 65 nm process. The experimental results show that the power of the designed standard cells is reduced by about 7.2%~15.6%, the noise margin is improved by about 11.5%~15.3%, and the average power of the ISCAS test circuits is reduced by about 15.8%.
2021, 43(6): 1559-1564.
doi: 10.11999/JEIT210071
Abstract:
A 130 GHz active Vector-Modulation (VM) phase shifter based on a 55 nm CMOS process is presented for millimeter-wave phased array radar applications. A wideband quadrature generator, three-stage variable gain amplifiers and a Gilbert-based summator are employed in the proposed phase shifter. To improve the phase resolution and accuracy of the phase shifter, multi-stage wide-gain-range variable gain amplifiers, which consist of stacked common-gate amplifiers and a cascode amplifier based on capacitance neutralization, are employed. In addition, a Digital Controlled Artificial Dielectric (DiCAD) structure is adopted to cover the phase gap caused by the VM structure. Full-wave electromagnetic simulation results show that the average gain of the proposed phase shifter is above 1 dB from 125 to 135 GHz. The phase shifting range covers the full 360° with a 5.6° phase step, and the RMS phase error is less than 4° over the operating frequency range. The area of the phase shifter is 1100 μm×600 μm, and the power consumption is 33 mW.
2021, 43(6): 1565-1573.
doi: 10.11999/JEIT210060
Abstract:
A low-jitter multi-phase clock generation circuit is designed based on a globally shared Delay Locked Loop (DLL) and a two-dimensional H-shaped clock tree network for Geiger-mode avalanche focal plane array applications. The DLL adopts an eight-phase voltage-controlled delay chain, a double-edge-triggered phase detector and a start-up reset module. A differential charge pump structure is introduced to reduce the current mismatch between charging and discharging and to lower the clock timing jitter. An H-shaped clock tree structure is used to diminish the phase variation induced by asymmetric transmission routes in large-scale integrated circuits, ensuring equal delay of the multi-channel split-phase clock signals to the pixel units. A locking frequency range of 150~400 MHz, phase noise below -127 dBc/Hz at 1 MHz offset, RMS timing jitter below 2.5 ps and static phase error below 65 ps are achieved in a 0.18 µm mixed-signal CMOS technology.
2021, 43(6): 1574-1586.
doi: 10.11999/JEIT210010
Abstract:
In-Memory Computing (IMC) architectures have attracted much attention recently and are regarded as promising candidates for breaking the von Neumann bottleneck. IMC architectures can bring significant performance and energy-efficiency improvements, especially in data-intensive computation. Among the emerging IMC architectures, SRAM-based ones have been extensively researched and applied to many scenarios. In this paper, IoT applications are explored on an SRAM-based generic IMC architecture platform named DM-IMCA. Specifically, the algorithms of several lightweight data-intensive IoT applications, including information security, Binary Neural Networks (BNN) and image processing, are analyzed, decomposed and then mapped to the SRAM macros of DM-IMCA to accelerate their computation. Experimental results indicate that DM-IMCA offers up to 24 times speed-up over a baseline system with a conventional von Neumann architecture when running lightweight data-intensive IoT applications.
2021, 43(6): 1587-1595.
doi: 10.11999/JEIT210008
Abstract:
To solve the problems of low mapping performance and long compilation time of cipher algorithms on Coarse-grained Reconfigurable Cipher Logic Arrays (CRCLA), a description of cryptographic algorithms and hardware resources is proposed that displays the occupancy of each resource more intuitively during the mapping process. Then, by analyzing the inherent relationship between the operation characteristics of cryptographic algorithms and the CRCLA hardware structure, an Edge-centric Cipher Logical array Mapping (ECLMap) algorithm is proposed with the goal of reducing the critical path delay. Edge mappings are used to guide the mapping of nodes, and, combined with the relevant mapping strategies, a backtracking mechanism is introduced to improve the mapping success rate. Compared with other common mapping algorithms, the proposed algorithm has the best mapping performance, with an average improvement of about 20% in algorithm energy efficiency and about 25% in compilation time. Efficient mapping of cipher algorithms is thus realized.
2021, 43(6): 1596-1602.
doi: 10.11999/JEIT201104
Abstract:
As a strategic emerging industry, the Internet of Things (IoT) has become a national development focus, but it also faces various security threats in practical applications. Ensuring the security of data transmission, processing and storage in resource-constrained IoT systems has become a research hotspot. In this paper, a high steady-state Physical Unclonable Function (PUF) generator scheme is proposed by studying PUFs and the process deviations in sensor preparation. Firstly, Electrostatic Spray Deposition (ESD) is used to generate nanofibers with high specific surface area, and is combined with high-temperature calcination to prepare Pd-SnO2 gas sensors. Secondly, the response data of the gas sensors to formaldehyde gas are collected under different gas concentrations, ambient temperatures and heating voltages. Then, a random resistance multi-bit balance algorithm is used to compare the response data of different clusters of gas sensors and generate multiple high steady-state PUF data. Finally, the security and reliability of the designed PUF generator are evaluated. Experimental results show that the randomness of the PUF generator is 97.03%, the reliability is 97.85%, and the uniqueness is 49.04%, so it can be widely used in the IoT security field.
2021, 43(6): 1603-1608.
doi: 10.11999/JEIT200957
Abstract:
A wide-Intermediate-Frequency (IF) down-conversion mixer operating in the millimeter-wave band is proposed. The mixer is based on a passive double-balanced structure integrating Radio-Frequency (RF) and Local-Oscillator (LO) baluns. To optimize the Conversion Gain (CG), bandwidth and isolation of the mixer, the gate-inductive technique is employed. The measured results show that the mixer features a wide IF bandwidth from 0.5 to 12 GHz. A measured CG of –8.5~–5.5 dB is achieved within this IF band at an LO power (PLO) of 4 dBm and an LO frequency (fLO) of 30 GHz. The proposed mixer also achieves a CG from –7.9 to –5.9 dB, a ripple of 2 dB, over a wide RF band (fRF) from 25 to 45 GHz at a PLO of 4 dBm and a fixed IF frequency (fIF) of 0.5 GHz. The measured LO-to-IF, LO-to-RF and RF-to-IF isolations are better than 42, 50 and 43 dB, respectively. The chip is fabricated in a TSMC 90 nm CMOS process with an area of 0.4 mm².
2021, 43(6): 1609-1616.
doi: 10.11999/JEIT201033
Abstract:
Escape routing is an important part of integrated circuit physical design. To solve the problem that parallel escape routing is slow and gives unsatisfactory results, an ordered escape routing algorithm that combines an improved A* algorithm with the rip-up and reroute method is proposed. Firstly, the routing sequence of the pins is determined by a cost estimation function, and the improved A* algorithm is used to initialize the ordered escape routing. Secondly, routing paths of the same length are optimized and routing paths in crowded areas are adjusted. Finally, the A* algorithm and breadth-first search are employed for rip-up and reroute. The experimental results show that this method achieves 100% escape routing for all given test cases, and that the feasible solution of the ordered escape paths is close to the optimal solution. Compared to the Boolean Satisfiability Problem (SAT) algorithm and the MMCF algorithm, the proposed algorithm reduces CPU time by 95.6% and 97.8%, respectively, and shortens the overall wire length. The proposed method thus reduces the time required to find a feasible solution and optimizes wire routing.
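For readers unfamiliar with the baseline technique, the following is a minimal sketch of plain A* search on a routing grid. It is not the improved A* described in the abstract, and it omits pin ordering and rip-up and reroute; the grid encoding (0 = free, 1 = blocked), the unit step cost and the Manhattan heuristic are all assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells.

    Returns the path as a list of (row, col) tuples, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    came_from = {}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


# Toy grid: the only opening in the middle row is at column 2.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
print(path)
```

In an ordered escape router, a search like this would be run once per pin in the chosen order, with previously routed wires marked as blocked cells.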
2021, 43(6): 1617-1621.
doi: 10.11999/JEIT200564
Abstract:
A multi-octave power amplifier is proposed based on an improved simplified real frequency method. The optimal impedance of the transistor is analyzed using load-pull technology. By improving the optimization objective and error function of the simplified real frequency method, the optimal impedance at multiple frequency points in the band is analyzed and the output matching circuit of the power amplifier is designed. The broadband matching circuit is optimized, and the working bandwidth of the amplifier is extended. Test results show that the saturated output power reaches 42.5 dBm and the saturated drain efficiency is 64%~75% within the frequency band of 0.5~2.7 GHz.
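The reported figures can be put in more familiar units with two standard conversions; the helper names below are illustrative only. A level of 42.5 dBm corresponds to roughly 17.8 W, and the 0.5~2.7 GHz band spans about 2.4 octaves, which is consistent with the "multi-octave" description.

```python
import math

def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10 ** (p_dbm / 10) / 1000

def octaves(f_low, f_high):
    """Number of octaves between two frequencies given in the same unit."""
    return math.log2(f_high / f_low)

print(round(dbm_to_watts(42.5), 2))   # saturated output power in watts
print(round(octaves(0.5, 2.7), 2))    # width of the 0.5~2.7 GHz band in octaves
```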
2021, 43(6): 1622-1629.
doi: 10.11999/JEIT190337
Abstract:
LCLC resonant converters are widely applied in space Travelling-Wave Tube Amplifiers (TWTA) to boost the input voltage. The resonant tank of an LCLC resonant converter contains several resonant components, including the leakage inductance, the series resonant capacitor, the magnetizing inductance and the parallel resonant capacitor, which complicates the optimal design of the converter with respect to total power loss. To solve this problem, an optimal design method for the LCLC resonant converter based on the Particle Swarm Optimization (PSO) algorithm is proposed. First, the total power loss of the LCLC resonant converter is derived from circuit analysis. Then, the total power loss is optimized by the PSO algorithm and the optimal resonant parameters are calculated. Finally, the optimal LCLC resonant converter is built using the optimal resonant parameters. The proposed optimal design method is validated by experiments.
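The optimization step can be pictured with a generic particle swarm loop. The sketch below is a textbook global-best PSO, not the paper's implementation: the power-loss expression is not given in the abstract, so `toy_loss` is a hypothetical stand-in, and the inertia and acceleration coefficients are common default choices rather than values from the paper.

```python
import random

def pso(loss, bounds, n_particles=30, n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize loss(x) over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_loss(x):
    # Hypothetical placeholder for the converter's total power loss.
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2

best, value = pso(toy_loss, [(-5.0, 5.0), (-5.0, 5.0)])
```

For a real design, each coordinate would be one resonant parameter and the bounds would come from component limits.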
2021, 43(6): 1630-1636.
doi: 10.11999/JEIT200392
Abstract:
A novel f-letter waveguide slot with small size and low profile is presented in this paper. The height of the circular polarizer is only λ0/6, and the width is about 2λ0/5. Based on the new circular polarizer, a circularly polarized ridge-waveguide slot linear array antenna with low profile and high aperture efficiency is designed. Using the linear array as the antenna cell, a lightweight, high-efficiency 16-cell Ka-band circularly polarized wide-scan ridge-waveguide slot phased-array antenna is designed, simulated and fabricated to achieve ±60° wide-angle scanning for space applications. The simulated and measured results show that the phased-array antenna achieves one-dimensional wide-angle beam scanning over a ±60° scan range. Across the scan coverage, the axial ratio is less than 4.1 dB and the gain variation is less than 4.3 dB. The measured antenna gain is higher than 35.9 dBi at the 0° scan angle, with a corresponding efficiency of nearly 85%.
2021, 43(6): 1637-1643.
doi: 10.11999/JEIT200286
Abstract:
A novel non-contact AC voltage detector based on an electric field probe with a concentric double-layer spherical shell structure is presented. The concentric double-layer spherical shell structure is similar to a differential structure and can eliminate the influence of common-mode interference noise. A theoretical model of the electric field distribution of the double-layer spherical shell structure is established, and the induced charge density on the outer spherical shell surface is analyzed. The sensitivity expression of the electric field probe is then obtained. Furthermore, an equivalent circuit model of the electric field probe is proposed and the interface circuit is designed. Finally, a prototype of the non-contact AC voltage detector is developed. Test results show a good linear relationship between the prototype output and the applied electric field, with a linearity of 0.66%, and the test results agree well with the calculated results. Additionally, when the prototype rotates within the range of 0~45°, the output voltage is reduced by at most 4.0%, which indicates that small-angle rotation of the AC voltage detector does not affect the accuracy of voltage detection. Moreover, the closer the prototype is to the transmission line, the faster its output voltage increases, and the threshold is easy to identify, which makes it easier to verify the presence of voltage.
2021, 43(6): 1644-1652.
doi: 10.11999/JEIT200219
Abstract:
To address data damage in faint radar signals, a radar signal reconstruction method based on Variational Mode Decomposition and Compressed Sensing (VMD-CS) is proposed. First, Variational Mode Decomposition is used to decompose and denoise the collected data. Second, the observation matrix and sparse representation matrix are constructed by the compressed sensing method. Then the sparse representation vector is reconstructed using the Orthogonal Matching Pursuit (OMP) algorithm. On this basis, the discrete cosine transform is used to reconstruct the damaged radar signal. Simulation experiments are carried out on actually collected Linear Frequency Modulation (LFM) radar signals for two cases: continuous data loss and random data loss. The experimental results show that the proposed method reconstructs the radar signal well and closely approximates the original signal in the time domain, frequency domain and instantaneous frequency when the continuous data loss rate does not exceed 30% or the random data loss rate does not exceed 60%.
2021, 43(6): 1653-1658.
doi: 10.11999/JEIT200296
Abstract:
Two-Dimensional (2D) direction finding with traditional parallel arrays suffers from problems such as low degrees of freedom, low resolution and large estimation errors with few snapshots. To address these problems, a low-complexity 2D Direction Of Arrival (DOA) estimation algorithm based on the parallel coprime virtual array is proposed in this paper. In the proposed algorithm, a virtual array is generated by expanding two mutually parallel linear arrays. Then an extended matrix with high degrees of freedom for the 2D angles is constructed from the autocovariance matrix and cross-covariance matrix. Finally, the automatically matched 2D-DOA estimate is obtained by Singular Value Decomposition (SVD) and Estimating Signal Parameters via Rotational Invariance Techniques (ESPRIT). Compared with traditional 2D DOA estimation methods, the proposed algorithm uses more information from the received array data and can distinguish more incident signals with high resolution. Meanwhile, the proposed algorithm needs no 2D linear search or angular parameter matching and performs well under low Signal-to-Noise Ratio (SNR) and with few snapshots. Simulation results demonstrate the correctness and validity of the algorithm.
2021, 43(6): 1659-1666.
doi: 10.11999/JEIT200259
Abstract:
In a complex electromagnetic environment, radar returns may contain interference components, which degrades detection performance. This paper considers the adaptive detection problem in which the cell under test and a portion of the reference data are contaminated by rank-one interference of the first order, constrained to a known subspace. Based on the 2-Step Generalized Likelihood Ratio Test (2SGLRT) criterion, a Subspace Constrained 2SGLRT (SC-2SGLRT) detector is proposed. Furthermore, using a Modified 2SGLRT (M2SGLRT), an SC-M2SGLRT detector is proposed, which has better detection performance than the 2SGLRT. Finally, using the so-called 3SGLRT criterion, an SC-3SGLRT detector is proposed, whose detection performance is similar to that of the 2SGLRT but with a very small computational load. Computer simulation results show that making full use of all reference data and a priori information about the interference helps to improve detection performance.
2021, 43(6): 1667-1675.
doi: 10.11999/JEIT200274
Abstract:
To solve the problems of inner structure damage and high computational load caused by vectorizing or matricizing 3-D sparse data, a 3-D signal model is established in tensor space for downward-looking sparse linear array three-dimensional SAR. Based on this signal model, a three-dimensional SAR sparse imaging algorithm is proposed in this paper. First, the missing data are recovered by tensor completion under the assumption that the echo tensor is essentially low rank. Then, with the recovered fully sampled data tensor, well-focused 3-D images can be obtained by any Fourier transform-based 3-D imaging algorithm. The proposed algorithm achieves not only high resolution and low side-lobe levels but also favorable computational cost and memory consumption, which is verified by several numerical simulations and multiple comparative studies on real data.
2021, 43(6): 1676-1682.
doi: 10.11999/JEIT190946
Abstract:
Two key factors limiting the height-finding performance for low-elevation targets at Very High Frequency (VHF) are the wide receive beamwidth and complex multipath signals. A T-shaped interferometric array is proposed to extend the receive aperture and increase the Degrees Of Freedom (DOF), thereby improving the angular resolution. A robust height-finding algorithm based on Fractional Low Order Moments (FLOM) is proposed. Because of the non-Gaussian diffuse component, it is demonstrated theoretically that the Covariation Matrix (CM) formed from the fractional lower order moments preserves the array manifold. Then the generalized signal covariation matrix is decorrelated with spatial smoothing and a real-valued transform. Robust low-elevation altitude estimation is achieved by the two-dimensional Unitary ESPRIT algorithm based on the dual-size spatial shift invariance of the interferometric array. The proposed method increases the resolution between the direct signal and the specular multipath. A three-region baseline method is also proposed theoretically. Simulation results demonstrate the validity of the interferometric structure and the robust height-finding method, as well as the theoretical correctness of the three-region baseline method.
2021, 43(6): 1683-1690.
doi: 10.11999/JEIT200086
Abstract:
Three-dimensional imaging of sea surface altitude is a technology realized with the launch of Tiangong-2. Phase unwrapping is a key step in the elevation inversion of a three-dimensional radar imaging altimeter. To improve the branch-cut algorithm proposed by Goldstein, shorten the total length of branch cuts in the interferogram and improve the accuracy of phase unwrapping, a method that generates branch cuts using the Jonker-Volgenant-Castanon (JVC) globally optimal linear assignment algorithm is proposed in this paper. First, all residual points (residues) in the interferogram are found, and the distances between all pairs of opposite polarity are calculated. Then, by comparing the distance between each pair of residual points with the sum of their distances to the nearest boundary, it is determined whether to place branch cuts directly between the residual points and the boundary or to use the JVC algorithm. In this way, the shortest total length of balanced branch cuts is obtained. Unwrapping experiments are carried out on interferograms of simulated three-dimensional imaging altimeter sea surface elevation and of the Etna volcano area. Comparison with three other algorithms shows that the error between the unwrapping result of the proposed algorithm and the true phase is relatively small. The method also effectively avoids the “islanding phenomenon”.
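The first step, locating residual points and their polarity, follows the usual branch-cut convention: sum the wrapped phase differences around each 2×2 pixel loop and read off the integer number of 2π cycles. The sketch below shows only that step, under the assumption that the interferogram is a list of rows of wrapped phases in radians; the JVC pairing and the boundary comparison are not reproduced, and the function names are illustrative.

```python
import math

def wrap(x):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi

def residues(phase):
    """Return a (rows-1) x (cols-1) map of residue charges (+1, -1 or 0).

    Each entry is the closed-loop sum of wrapped phase differences around
    one 2x2 block of pixels, divided by 2*pi.
    """
    rows, cols = len(phase), len(phase[0])
    out = [[0] * (cols - 1) for _ in range(rows - 1)]
    for i in range(rows - 1):
        for j in range(cols - 1):
            loop = (wrap(phase[i][j + 1] - phase[i][j])
                    + wrap(phase[i + 1][j + 1] - phase[i][j + 1])
                    + wrap(phase[i + 1][j] - phase[i + 1][j + 1])
                    + wrap(phase[i][j] - phase[i + 1][j]))
            out[i][j] = round(loop / (2 * math.pi))
    return out
```

Entries of opposite sign are the opposite-polarity pairs between which branch cuts would be placed.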
2021, 43(6): 1691-1697.
doi: 10.11999/JEIT200149
Abstract:
Hypersonic technology is a development trend for future space vehicles. It also poses new challenges for the fast acquisition capability of communication platforms in ultra-high dynamic, low signal-to-noise ratio environments. To overcome the sensitivity of the classic acquisition algorithm to frequency offset, a signal acquisition algorithm based on the Multi-sample Serial Fast Fourier Transform (MS-FFT) is proposed. The proposed algorithm serially executes the FFT of multiple samples and performs peak searching after non-coherent combining to obtain the acquisition result. The influence of frequency offset on acquisition performance is avoided without increasing complexity. By deriving the theoretical formula for the Peak Signal-to-Noise Ratio (PSNR), it is shown that the frequency offset adaptation range of the MS-FFT depends on the sampling rate and can exceed that of the classical algorithm as the sampling capability of digital-analog conversion devices continues to improve. Finally, the theoretical derivation is verified by simulation, which shows that the proposed algorithm is better suited to ultra-high dynamic application scenarios.
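Non-coherent combining followed by a peak search can be sketched generically as follows. This is not the MS-FFT algorithm itself, whose sample arrangement is not described in the abstract; it only shows why summing per-segment power spectra keeps a spectral peak even when the phase changes from segment to segment. A naive DFT stands in for the FFT so that the example needs only the standard library, and the function names are assumptions.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def noncoherent_peak(signal, seg_len):
    """Split signal into segments, sum |DFT|^2 per bin, return (peak_bin, power)."""
    n_seg = len(signal) // seg_len
    power = [0.0] * seg_len
    for s in range(n_seg):
        spectrum = dft(signal[s * seg_len:(s + 1) * seg_len])
        for k, value in enumerate(spectrum):
            power[k] += abs(value) ** 2     # phase is discarded before summing
    peak_bin = max(range(seg_len), key=lambda k: power[k])
    return peak_bin, power

# Four 16-sample segments of a tone in bin 3, each with a different phase.
seg_len, tone_bin = 16, 3
phases = [0.0, cmath.pi, 0.7, 2.1]
signal = [cmath.exp(1j * (2 * cmath.pi * tone_bin * t / seg_len + ph))
          for ph in phases for t in range(seg_len)]
peak_bin, power = noncoherent_peak(signal, seg_len)
```

Because only magnitudes are accumulated, the phase jumps between segments do not cancel the peak, which is the property exploited when the frequency offset makes coherent integration over the whole record unreliable.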
2021, 43(6): 1698-1705.
doi: 10.11999/JEIT200223
Abstract:
Object detection and tracking are essential for navigation, obstacle avoidance and other tasks of Unmanned Surface Vehicles (USV). However, the environment on the water is complex, with problems such as object scale variation, occlusion, illumination variation and camera shake. This paper proposes a visual object detection and tracking method for USVs based on spatial-temporal information fusion. In the spatial domain, deep learning detection is used to extract single-frame deep semantic features; in the temporal domain, correlation filter tracking is used to calculate the correlation of oriented gradient features between frames. Temporal and spatial information are combined through feature comparison to achieve continuous, stable and robust object detection and tracking in real time. Experimental results demonstrate an average detection and tracking accuracy of 0.83 at an average running speed of 15 fps, which indicates improved accuracy at high speed.
2021, 43(6): 1706-1714.
doi: 10.11999/JEIT200327
Abstract:
To address the spectrum efficiency and energy efficiency of Heterogeneous Cloud Radio Access Networks (H-CRAN), an energy efficiency optimization algorithm based on Power Domain Non-Orthogonal Multiple Access (PD-NOMA) is proposed. First, the algorithm takes queue stability and forward link capacity as constraints, jointly optimizes user association, power allocation and resource block allocation, and establishes a joint optimization model of network energy efficiency and user fairness. Second, because the state space and action space of the system are both high-dimensional and continuous, the problem is an NP-hard problem in the continuous domain, so the Trust Region Policy Optimization (TRPO) algorithm is introduced to solve it efficiently. Finally, because the standard solution procedure for the TRPO algorithm is computationally too expensive, the Proximal Policy Optimization (PPO) algorithm is used to optimize the solution. The PPO algorithm not only retains the reliability of the TRPO algorithm but also effectively reduces its computational complexity. Simulation results show that the proposed algorithm further improves the energy efficiency of the network while ensuring user fairness.
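The reason PPO is cheaper than TRPO is that it replaces the trust-region constraint with a clipped surrogate objective. The sketch below evaluates that standard per-sample term, min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the probability ratio between the new and old policies and A is the advantage estimate. The function name and the default ε = 0.2 are illustrative choices, not values taken from the paper.

```python
def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)

def ppo_objective(ratios, advantages, clip_eps=0.2):
    """Average of the clipped terms over a batch; this quantity is maximized."""
    terms = [ppo_clipped_term(r, a, clip_eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

With a positive advantage, the term stops growing once the ratio exceeds 1 + ε, so a single update cannot push the policy arbitrarily far from the old one.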
2021, 43(6): 1715-1723.
doi: 10.11999/JEIT200322
Abstract:
The rapid development of cross-technology communication promotes the transformation from single networks to heterogeneous wireless networks, which greatly improves the efficient coexistence and collaboration of heterogeneous wireless devices but also brings challenges to data distribution in heterogeneous wireless networks. Traditional data distribution schemes are limited by the communication range of a single node and by conflicts between different network devices, resulting in a continuous decline in data distribution efficiency. They are also unsuited to the unique network model of heterogeneous networks. To solve these problems, a data distribution method based on multi-protocol parallel data transmission in heterogeneous wireless networks is proposed. The key idea is to use a Parallel Multi-protocol Communication (PMC) node as the transmitting node of the ZigBee network and to define a new system COST function that measures the delay and energy penalty of the system. By adaptively adjusting the trade-off coefficient in the function, data transmission under various requirements can be described. Based on the system COST function, the paper proposes a distribution strategy with delayed packet reception using beacon control, which allows ZigBee to choose the appropriate time to receive data in a heterogeneous network. Furthermore, the paper proves the rationality of the COST function and then derives the optimal values of the overall energy penalty and time delay of the system using dynamic programming. Comprehensive evaluation shows that, with respect to the two design requirements of time delay and energy penalty, the proposed method outperforms traditional data distribution methods.
2021, 43(6): 1724-1732.
doi: 10.11999/JEIT200297
Abstract:
To address the Service Function Chain (SFC) placement optimization problem caused by the dynamic arrival of network service requests under the Network Function Virtualization/Software Defined Network (NFV/SDN) architecture, a Virtual Network Function (VNF) placement optimization algorithm based on improved deep reinforcement learning is proposed. Firstly, a stochastic optimization model based on a Markov Decision Process (MDP) is established to jointly optimize SFC placement cost and delay cost, subject to constraints on SFC delay, the Central Processing Unit (CPU) resources of general-purpose servers, and physical link bandwidth. Secondly, VNF placement and resource allocation suffer from an excessively large state space, a high-dimensional action space, and unknown state transition probabilities. A VNF intelligent placement algorithm based on deep reinforcement learning is therefore proposed to obtain an approximately optimal VNF placement and resource allocation strategy. Finally, because action exploration and exploitation by a deep reinforcement learning agent using the ε-greedy strategy leads to low learning efficiency and slow convergence, an exploration and exploitation method based on the difference of value functions is proposed, and a dual experience replay pool is adopted to address the low utilization of experience samples. Simulation results show that the algorithm converges quickly and optimizes both SFC placement cost and SFC end-to-end delay.
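For reference, the standard ε-greedy rule that the abstract names as its baseline can be sketched as follows. This shows only the textbook strategy, not the paper's value-function-difference method; the function name and the example Q-values are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Textbook epsilon-greedy action selection.

    With probability epsilon pick a uniformly random action (explore);
    otherwise pick the action with the highest estimated value (exploit).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Illustrative Q-values for three candidate placement actions.
q = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q, epsilon=0.0)    # always exploits
random_action = epsilon_greedy(q, epsilon=1.0)    # always explores
```

Because the exploration probability is fixed and ignores how far apart the action values are, the rule keeps sampling clearly inferior actions, which is consistent with the slow convergence the abstract attributes to it.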
2021, 43(6): 1733-1741.
doi: 10.11999/JEIT200287
Abstract:
To address the difficulty of obtaining global network information and the slice resource allocation optimization problem caused by User Equipment (UE) mobility and dynamic packet arrival in radio access network slices, a Service Function Chain (SFC) resource allocation algorithm based on Asynchronous Advantage Actor-Critic (A3C) learning is proposed. Firstly, a resource management mechanism based on blockchain technology is established, which can credibly share and update global network information as well as supervise and record the SFC resource allocation process. Then, a delay minimization model based on the joint allocation of radio, computing and bandwidth resources is built for moving UEs and time-varying packet arrival, and is further transformed into a Markov Decision Process (MDP) problem. Finally, the A3C learning method is adopted to obtain the optimal resource allocation strategy for this MDP. Simulation results show that the proposed algorithm utilizes resources more efficiently to optimize system delay while guaranteeing the requirements of each UE.
2021, 43(6): 1742-1749.
doi: 10.11999/JEIT200369
Abstract:
The Partial Transmit Sequence (PTS) algorithm is affected by symbol overlap in Filter Bank MultiCarrier with Offset Quadrature Amplitude Modulation (FBMC-OQAM) systems, which causes peak regeneration and leads to a high Peak-to-Average Power Ratio (PAPR) and high computational complexity. In this paper, a PTS algorithm based on Double Optimization (DO-PTS) is proposed, which performs a two-layer phase factor search over signal data blocks to obtain better PAPR suppression. The first layer takes full account of the overlap characteristics and combines the preceding overlapping data blocks for initial optimization; the search range of the phase factor is reduced in this layer to lower the computational complexity. The second layer groups the signals and, within each group, optimizes only the data blocks that have the greatest impact on the peak value, reducing the number of data blocks involved in the phase factor search. Complexity analysis and simulation results show that, compared with other mainstream PTS optimization methods, the algorithm offers good PAPR reduction with low computational complexity while maintaining the transmission data rate of the system.
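The quantity being minimized here, PAPR, is the ratio of the peak instantaneous power of a signal to its mean power, usually expressed in dB. A minimal sketch of that standard definition follows; the function name and sample values are illustrative, and no FBMC-OQAM or PTS processing is modelled.

```python
import math


def papr_db(samples):
    """Peak-to-Average Power Ratio of a complex sample sequence, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    mean_power = sum(powers) / len(powers)
    if mean_power == 0.0:
        raise ValueError("signal has zero power")
    return 10.0 * math.log10(max(powers) / mean_power)


# A constant-envelope sequence has equal peak and mean power.
flat = papr_db([1, 1j, -1, -1j])
# A single spike among zeros concentrates all power in one sample.
spiky = papr_db([2, 0, 0, 0])
```

A PTS scheme searches over phase factors applied to sub-blocks and keeps the combination for which a measure like this is smallest.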
2021, 43(6): 1750-1755.
doi: 10.11999/JEIT200203
Abstract:
Based on the DNA origami technique, a method for solving the graph vertex coloring problem via the self-assembly of DNA origami structures is proposed. Using the DNA origami technique, different DNA origami structures with specific shapes are constructed. These structures encode the information of a graph’s vertices and edges, and because they have sticky ends, they can assemble through specific molecular hybridization into higher-order structures that represent different candidate solutions of the graph vertex coloring problem. Using DNA-nanoparticle conjugation, electrophoresis and other experimental methods, the correct solutions of the graph vertex coloring problem can be detected. This method is highly parallel and can greatly reduce the complexity of solving the graph vertex coloring problem.
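As a conventional point of comparison, the sketch below enumerates candidate colorings one at a time and keeps those in which no edge joins two vertices of the same color. It is a sequential software analogue of the "generate all candidates, then detect the correct ones" idea, not the molecular procedure itself; the function names and the example graph are illustrative.

```python
from itertools import product


def is_proper_coloring(edges, coloring):
    """True if no edge connects two vertices with the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def proper_colorings(vertices, edges, colors):
    """Yield every proper coloring by exhaustive enumeration."""
    for assignment in product(colors, repeat=len(vertices)):
        coloring = dict(zip(vertices, assignment))
        if is_proper_coloring(edges, coloring):
            yield coloring


# Example graph: a triangle, which needs three colors.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
with_three = list(proper_colorings(vertices, edges, ["red", "green", "blue"]))
with_two = list(proper_colorings(vertices, edges, ["red", "green"]))
```

The enumeration visits k^n assignments for n vertices and k colors, which is the exponential search that a massively parallel self-assembly step is meant to carry out simultaneously.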
2021, 43(6): 1756-1763.
doi: 10.11999/JEIT191029
Abstract:
Most of the existing two-party password-based Authenticated Key Exchange (2PAKE) protocols from lattices are proven secure using the indistinguishable common reference string model or the Bellare-Pointcheval-Rogaway model. This paper proposes a two-party password-based authenticated key exchange protocol based on the Ring Learning With Errors (RLWE) problem and proves its security under the Universally Composable (UC) framework. Compared with similar protocols, the new protocol achieves a higher level of security and efficiency.
2021, 43(6): 1764-1771.
doi: 10.11999/JEIT200331
Abstract:
The performance of crowd counting methods is degraded because the commonly used Euclidean loss ignores the local correlation of images and because models have limited ability to cope with multi-scale information. A crowd counting method based on a Multi-Scale Enhanced Network (MSEN) is proposed to address these problems. Firstly, an embedded Generative Adversarial Network (GAN) module with a multi-branch generator and a regional discriminator is designed to generate initial crowd density maps and optimize their local correlation. Then, a scale enhancement module is connected after the embedded GAN module to further extract local features of different scales from different regions, which strengthens the generalization ability of the model. Extensive experimental results on three challenging public datasets demonstrate that the proposed method effectively improves the accuracy and robustness of the prediction.
2021, 43(6): 1772-1780.
doi: 10.11999/JEIT200226
Abstract:
To address the fluctuation of single-sample measurements and the mutual interference between signals in indoor environments, this paper proposes an indoor positioning system based on a partitioned MultiVariate Gaussian Mixture Model (MVGMM). According to the Access Point (AP) positions and the indoor spatial structure, the system uses Support Vector Machine (SVM) classification in “one-against-all” form to partition the target area in order to predict the subarea in which the signal changes. An MVGMM that accounts for the mutual interference between signals is established using the coupling relationship between multiple communication devices in each partition, which improves the positioning accuracy otherwise degraded by signal fluctuation. When the indoor environment changes, an adaptive updating algorithm based on the partitioned MVGMM tests the reliability of the fingerprint data in each partition. Moreover, it updates the model parameters in partitions with large signal fluctuation to strengthen the coupling between the model and the current environment. Experimental results demonstrate that the proposed algorithm can build a stable and maintainable indoor signal distribution model from a relatively small number of samples, and that its positioning accuracy is improved to a certain extent compared with other algorithms.
2021, 43(6): 1781-1788.
doi: 10.11999/JEIT191035
Abstract:
To enhance the cognitive emotional computing ability of robots, a robot cognitive emotional interaction model based on reinforcement learning is proposed, which combines immediate feedback and long-term trends in the PAD (Pleasure-Arousal-Dominance) emotional space. Firstly, according to the psychological theory of interpersonal communication, the human emotion generation process is simulated to generate human-like emotions, and three influencing factors are extracted: similarity, positivity and empathy. Secondly, the relationship between the response emotion state and the contextual long-term emotion state is established using the global coordination feature of reinforcement learning, so as to model the robot emotion generation process. Then, the three factors are incorporated into the model's reward mechanism to evaluate the interactive emotion state, update the model and obtain the optimal emotional strategy. Finally, the optimal emotional state corresponding to the optimal emotional strategy is used to update the robot's emotional state transition probability, and, based on the sentiment values of the six basic emotional states in the space, the states are mapped to the continuous emotional space to obtain the optimal response emotional value of the robot. Subjective and objective comparison experiments show that the proposed model effectively increases the delicacy, continuity, positivity and empathy of the robot's emotional expression and reduces the robot's dependence on external emotional stimuli, making human-computer interaction more harmonious and friendly.
2021, 43(6): 1789-1802.
doi: 10.11999/JEIT200267
Abstract:
Action recognition using joints has attracted the attention of scholars at home and abroad because it is not easily affected by appearance and is more robust to noise. However, there are few systematic reviews in this field. This paper summarizes recent deep-learning-based methods for action recognition using joints. According to the main network used, the methods are divided into those based on the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), the graph convolution network and hybrid networks. The joint data representations best suited to the CNN, the RNN and the graph convolution network are the pseudo-image, the vector sequence and the topological graph, respectively. This paper also summarizes the current datasets for action recognition using joints at home and abroad, and discusses the challenges and future research directions of the field. While maintaining high accuracy, the speed and practicality of action recognition still need further improvement.