Resilient Semantic Communication for Space-Air-Ground Networks
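Summary: To address challenges such as limited bandwidth and channel impairments in image transmission over space-air-ground networks, this paper proposes a resilient semantic communication scheme. Building on Information Bottleneck (IB) theory, the scheme constructs an enhanced Rate-Distortion (RD) function, achieves dynamic rate adaptation through the Gumbel-Softmax method and a variable-rate network, and designs a weighted multi-asymmetric Gaussian distribution to characterize the probability densities of different semantic features. Architecturally, the scheme uses attention mechanisms and residual learning, and adaptively selects network modules according to Signal-to-Noise Ratio (SNR) requirements to achieve the best trade-off between computational efficiency and transmission reliability. Experiments show that, compared with conventional schemes, the proposed scheme delivers significant improvements in both Channel Bandwidth Ratio (CBR) and reconstruction quality, and shows stronger robustness and higher image fidelity, particularly under challenging channel conditions.
Abstract: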
Objective: Space-air-ground networks represent a fundamental evolution in wireless communication infrastructure, extending conventional terrestrial networks into airspace and outer space through the integration of satellites, unmanned aerial vehicles, and ground-based nodes. These networks have become a cornerstone technology for future sixth-generation wireless communications, providing wide-area coverage and flexible networking capabilities that overcome the limitations of geographically constrained terrestrial systems. However, image transmission over space-air-ground networks faces considerable challenges due to inherent bandwidth constraints, severe channel impairments, and highly dynamic propagation environments. Traditional Separate Source-Channel Coding (SSCC) methods, although theoretically sound, perform poorly under these adverse conditions, particularly near information-theoretic limits. The rapid expansion of multimedia applications and service demands requires new approaches to information extraction and transmission that meet ultra-reliable low-latency communication requirements. Conventional image coding algorithms, such as JPEG and JPEG2000, experience significant performance degradation in satellite communication systems due to channel impairments. These limitations highlight the need for semantic communication frameworks that move beyond conventional bit-level transmission and prioritize preserving semantic content under complex wireless conditions.
Methods: A resilient semantic communication framework is proposed for image transmission in space-air-ground networks, based on enhanced information bottleneck theory. The key innovation is an augmented rate-distortion function that jointly optimizes system capacity, pixel-level reconstruction fidelity, semantic feature preservation, and perceptual quality.
The framework combines deep learning with information-theoretic principles to ensure reliable performance under varying channel conditions. A Gumbel-Softmax approach is integrated with variable rate networks to enable dynamic rate adaptation in response to fluctuating channel states. The system employs a weighted multi-asymmetric Gaussian distribution model to characterize the probability density functions of heterogeneous semantic features, providing accurate entropy estimation in the latent space. The architecture incorporates attention mechanisms and residual learning structures to enhance feature extraction and semantic preservation. A reconfigurable neural network architecture is introduced, featuring adaptive module selection driven by real-time signal-to-noise ratio assessments and service quality requirements. The encoder subsystem consists of semantic extractor networks that identify and isolate critical image features, joint source-channel encoders that perform integrated compression and error protection, and adaptive networks that generate binary mask vectors for intelligent symbol selection. The semantic extractor combines convolutional neural networks with fully connected layers to capture hierarchical feature representations. The joint source-channel coding architecture integrates residual convolution blocks, attention feature blocks, and residual transpose convolution blocks to optimize rate-distortion performance. The adaptive network produces dynamic masks through Gumbel-Softmax sampling, controlling the transmission of semantic symbols based on their relevance and channel state. A two-stage training strategy is implemented. First, end-to-end optimization of the joint source-channel encoders and decoders is performed using mean squared error loss. This is followed by full system training based on a composite loss function that jointly considers transmission rate, pixel-level distortion, semantic distortion, and perceptual quality. 
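The mask-generation step described above (an adaptive network that samples binary masks through Gumbel-Softmax) can be illustrated with a minimal, standard-library sketch. It is not the paper's network: the function names, the fixed drop logit of 0, the temperature value, and the 0.5 threshold used to harden the mask are all assumptions made for illustration. It only shows how Gumbel noise, a temperature, and a two-class softmax turn per-symbol scores into a keep/drop decision.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    # Clamp u away from 0 so that log(u) stays finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax_keep_prob(logit_keep, logit_drop, tau, rng):
    """Relaxed (soft) probability of keeping one symbol.

    Two-class Gumbel-Softmax: perturb both logits with Gumbel noise,
    divide by the temperature tau, then apply a softmax.
    """
    a = (logit_keep + gumbel_noise(rng)) / tau
    b = (logit_drop + gumbel_noise(rng)) / tau
    m = max(a, b)  # subtract the max for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb)


def sample_mask(keep_logits, tau=0.5, seed=0):
    """Return (soft, hard) masks for a list of per-symbol keep logits.

    The drop logit is fixed at 0.0 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    soft = [gumbel_softmax_keep_prob(l, 0.0, tau, rng) for l in keep_logits]
    hard = [1 if p > 0.5 else 0 for p in soft]  # hardened binary mask
    return soft, hard


# Hypothetical relevance scores for four semantic symbols.
soft, hard = sample_mask([4.0, 0.0, -4.0, 2.5])
print(hard)
```

In a trainable system the soft values would carry the gradient while the hard mask selects which symbols are transmitted; lowering `tau` pushes the soft values closer to 0 or 1.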
Results and Discussions: Comprehensive experimental validation is conducted on the CIFAR-10, CIFAR-100, and Kodak24 datasets to assess the effectiveness of the proposed framework. Performance is evaluated using multiple metrics, including Peak Signal-to-Noise Ratio (PSNR) for objective image quality assessment and the USC metric for overall system efficiency. Comparison with conventional JPEG combined with LDPC and QAM shows substantial performance gains, particularly under low Signal-to-Noise Ratio (SNR) conditions (Fig. 3, Fig. 4). The proposed framework consistently outperforms conventional methods across different compression ratios, resisting channel impairments and mitigating the cliff effect observed in traditional SSCC systems. Compared with deep learning-based approaches such as ADJSCC and DeepJSCC-V, the proposed method achieves significant improvements in both PSNR and transmission efficiency (Fig. 5, Fig. 6). Efficiency evaluation using the USC metric shows that the proposed framework achieves higher utility values across various compression settings, with the advantage becoming more evident as SNR decreases (Fig. 7). Further analysis of different network configurations demonstrates the adaptability of the architecture in balancing computational complexity and transmission performance (Fig. 8, Fig. 9). Configurations with five network units consistently provide higher PSNR than three- and four-unit designs. However, four-unit configurations achieve the best efficiency under high SNR conditions, indicating effective resource allocation in response to varying channel states. Visual quality assessment on the Kodak dataset confirms improved reconstruction with reduced channel bandwidth requirements, supporting the practical feasibility of the proposed approach (Fig. 10).
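For readers unfamiliar with the two quantities reported above, the following sketch computes PSNR and a channel bandwidth ratio. The PSNR formula is the standard one for a given peak value. The CBR definition used here (channel symbols divided by source pixels times colour channels) follows the convention common in deep joint source-channel coding work and is an assumption, since this page does not state the definition. Function and parameter names are illustrative.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def cbr(channel_symbols, height, width, channels=3):
    """Channel bandwidth ratio k / n (assumed definition, see lead-in)."""
    return channel_symbols / (height * width * channels)


# Toy 4-pixel example: every pixel is off by 10, so the MSE is 100.
print(psnr([100, 100, 100, 100], [110, 90, 110, 90]))
# A 32 x 32 RGB image (CIFAR size) sent with 512 channel symbols.
print(cbr(512, 32, 32))
```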
Computational complexity analysis shows that the five-unit configuration maintains complexity comparable to existing benchmarks, while three- and four-unit designs significantly reduce computational demands, supporting deployment on resource-constrained platforms (Table 4). These experimental results demonstrate that the proposed framework provides superior reconstruction quality, transmission efficiency, and adaptive capability across diverse and challenging wireless environments.
Conclusions: This study presents a resilient semantic communication framework that addresses key challenges in image transmission for space-air-ground networks. Experimental validation confirms that the proposed method achieves substantial improvements over conventional approaches in reconstruction quality, transmission efficiency, and adaptability.
Algorithm 1 Training procedure of the JSC encoder
Input: training dataset $\mathcal{X}$, batch size $B$, learning rate $\text{lr}$, maximum number of intermediate "units" $L$
Output: JSC encoder and decoder parameter sets $(\boldsymbol{\theta}_{\text{C}}^*, \boldsymbol{\varphi}_{\text{C}}^*)$, SNR-$l$ mapping vector $\boldsymbol{\varLambda}$
(1) Randomly draw a data batch $\boldsymbol{X} = [\boldsymbol{x}_1, \boldsymbol{x}_2, \cdots, \boldsymbol{x}_B]$ from $\mathcal{X}$
(2) For each $l \in \{3, 4, \cdots, L\}$:
(3)   For each data sample $\boldsymbol{x}_i \in \boldsymbol{X}$ in the batch:
(4)     Generate the SNR $\gamma_i$
(5)     Randomly generate the compression rate $R_i \sim \mathcal{U}(0.05, 0.4)$
(6)     Extract the semantic features $\boldsymbol{z}_i = Q(E_{\text{C}}(E_{\text{S}}(\boldsymbol{x}_i), \gamma_i))$
(7)     Generate additive white Gaussian noise with SNR $\gamma_i$: $\boldsymbol{n}_i \sim \mathcal{CN}(0, \sigma_i^2 \boldsymbol{I})$
(8)     Compute the received compressed signal $\tilde{\boldsymbol{z}}_i$ and reconstruct $\hat{\boldsymbol{y}}_i$
(9)     Reconstruct the image with the decoder: $\hat{\boldsymbol{x}}_i = D_{\text{S}}(D_{\text{C}}(\hat{\boldsymbol{y}}_i, \gamma_i))$
(10)   End loop
(11)   Compute the average loss $\mathcal{L}$
(12)   Update the mapping vector $\boldsymbol{\varLambda}$
(13)   Update the model parameters $(\boldsymbol{\theta}_{\text{C}}^*, \boldsymbol{\varphi}_{\text{C}}^*)$ by gradient descent

Table 1 Utility metric parameters for semantic communication
Metric                      Value in the image reconstruction task
$\text{ACC}_{\min}$         0
$\text{ACC}_{\text{th}}$    0.99
$\text{TIM}_{\text{th}}$    100 ms
$\delta$                    6

Table 2 Key parameter settings for training
Dataset                     CIFAR-100    WHU
Initial learning rate       1e-4         5e-5
Batch size                  128          256
Training epochs             400          400
Optimizer                   Adam (both datasets)
Learning rate scheduler     poly (both datasets)
Weight parameters           $\varepsilon = 0.015$, $\eta = 0.5$, $\lambda = 1$ (both datasets)

Table 3 Simulation parameter configuration
Parameter                                Value
Satellite altitude                       35 786 km
Equivalent satellite antenna aperture    22 m
3 dB beamwidth                           0.4011°
Satellite antenna transmit gain          51 dBi
Center frequency                         2 GHz
Total bandwidth                          30 MHz
Ground device receive gain               39.7 dBi
Noise temperature                        290 K
Noise figure                             7 dB
Noise power spectral density             $1.38 \times 10^{-23} \times 290$ W/Hz

Table 4 Complexity comparison
Scheme               FLOPs    Parameters (M)
Proposed, 5 units    5.797    11.700
Proposed, 4 units    3.942    8.160
Proposed, 3 units    2.088    4.620
DeepJSCC-V           5.797    11.700
ADJSCC               5.527    11.102
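The noise-related entries of Table 3 and the noise-generation step (7) of Algorithm 1 can be tied together with a short sketch. It assumes the receiver noise figure is simply added in dB on top of the thermal noise $kTB$, and that the noise variance for a target SNR is set from the measured average symbol power; neither choice is confirmed by this page. The function names and the QPSK-like test symbols are illustrative.

```python
import math
import random

BOLTZMANN = 1.38e-23  # J/K, the value used in Table 3


def noise_power_dbm(temperature_k, bandwidth_hz, noise_figure_db=0.0):
    """Thermal noise power k*T*B in dBm, plus an optional noise figure in dB."""
    n0 = BOLTZMANN * temperature_k   # noise power spectral density, W/Hz
    power_w = n0 * bandwidth_hz      # noise power over the band, W
    return 10.0 * math.log10(power_w * 1e3) + noise_figure_db


def awgn(symbols, snr_db, rng):
    """Add circularly symmetric complex Gaussian noise at a target SNR."""
    signal_power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    sigma2 = signal_power / (10.0 ** (snr_db / 10.0))  # total noise variance
    std = math.sqrt(sigma2 / 2.0)                      # per real/imag component
    return [s + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for s in symbols]


# Table 3 values: 290 K, 30 MHz total bandwidth, 7 dB noise figure.
print(round(noise_power_dbm(290.0, 30e6, 7.0), 1))  # prints -92.2

# Unit-power test symbols sent through a 10 dB SNR channel.
rng = random.Random(0)
tx = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)] * 1000
rx = awgn(tx, snr_db=10.0, rng=rng)
measured = sum(abs(r - t) ** 2 for r, t in zip(rx, tx)) / len(tx)
print(measured)  # close to 0.1, the noise variance for unit power at 10 dB
```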