Resilient Semantic Communication for Space-Air-Ground Networks

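WANG Wenyuan (王文远); ZHOU Mingyu (周明宇); WANG Chaowei (王朝炜); XU Jisong (许霁松); ZHANG Yunze (张云泽); PANG Mingliang (庞明亮); JIANG Fan (江帆); XU Lexi (徐乐西); ZHANG Zhi (张治)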

doi: 10.11999/JEIT250077 cstr: 32379.14.JEIT250077

WANG Wenyuan¹,
ZHOU Mingyu²,
WANG Chaowei^{1, 3},
XU Jisong¹,
ZHANG Yunze¹,
PANG Mingliang¹,
JIANG Fan⁴,
XU Lexi⁵,
ZHANG Zhi^{3, 6}

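1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Beijing Science and Technology Project Manager Management Corporation, Ltd., Beijing 100083, China
3. Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing 100876, China
4. School of Communications and Information Engineering & School of Artificial Intelligence, Xi’an University of Posts & Telecommunications, Xi’an 710121, China
5. Research Institute of China United Network Communications Co., Ltd., Beijing 100048, China
6. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China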

Funds: The National Natural Science Foundation of China (62471052)

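Author biographies:
WANG Wenyuan: male, PhD candidate. Research interests: wireless communication and semantic communication.

ZHOU Mingyu: male, PhD. Research interests: wireless communication.

WANG Chaowei: male, PhD, associate professor. Research interests: next-generation mobile communication technology, wireless sensor and Internet of Things technology, etc.

XU Jisong: male, master's student. Research interests: semantic communication and cell-free networks.

ZHANG Yunze: male, master's student. Research interests: semantic communication and wireless communication.

PANG Mingliang: male, PhD candidate. Research interests: satellite communication, multiple access technology, resource management, etc.

JIANG Fan: female, PhD, professor. Research interests: artificial-intelligence-based edge computing and caching, D2D communication, radio resource management in 5G ultra-dense heterogeneous networks, etc.

XU Lexi: male, PhD, senior engineer. Research interests: big data algorithms and their industry applications.

ZHANG Zhi: male, professor. Research interests: wireless communication and semantic communication.

Corresponding author:
WANG Chaowei, wangchaowei@bupt.edu.cn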

CLC number: TN927
Publication history
- Received: 2025-02-12
- Revised: 2025-07-02
- Published online: 2025-07-14
- Issue date: 2025-10-10

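Abstract

To address the challenges of limited bandwidth and channel impairments in image transmission over space-air-ground networks, this paper proposes a resilient semantic communication scheme. Based on Information Bottleneck (IB) theory, the scheme constructs an enhanced Rate-Distortion (RD) function, uses the Gumbel-Softmax method and a variable-rate network to achieve dynamic rate adaptation, and designs a weighted multi-asymmetric Gaussian distribution to characterize the probability densities of different semantic features. In terms of architecture, the scheme employs attention mechanisms and residual learning and adaptively selects network modules according to the Signal-to-Noise Ratio (SNR) requirement, achieving the best trade-off between computational efficiency and transmission reliability. Experiments show that, compared with conventional schemes, the proposed scheme achieves significant improvements in both Channel Bandwidth Ratio (CBR) and reconstruction quality, and exhibits stronger robustness and higher image fidelity, particularly under challenging channel conditions.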
Abstract:

Objective: Space-air-ground networks represent a fundamental evolution in wireless communication infrastructure by extending conventional terrestrial networks into airspace and outer space through the integration of satellites, unmanned aerial vehicles, and ground-based nodes. These networks have become a cornerstone technology for future sixth-generation wireless communications, providing wide-area coverage and flexible networking capabilities that overcome the limitations of geographically constrained terrestrial systems. However, image transmission over space-air-ground networks faces considerable challenges due to inherent bandwidth constraints, severe channel impairments, and highly dynamic propagation environments. Traditional Separate Source-Channel Coding (SSCC) methods, although theoretically sound, exhibit poor performance under these adverse conditions, particularly near information-theoretic limits. The rapid expansion of multimedia applications and service demands requires new approaches to information extraction and transmission that meet ultra-reliable low-latency communication requirements. Conventional image coding algorithms, such as JPEG and JPEG2000, experience significant performance degradation in satellite communication systems due to channel impairments. These limitations highlight the need for advanced semantic communication frameworks that move beyond conventional bit-level transmission and prioritize the preservation of semantic content integrity under complex wireless conditions.

Methods: A resilient semantic communication framework is proposed for image transmission in space-air-ground networks, based on enhanced information bottleneck theory. The key innovation is the development of an augmented rate-distortion function that jointly optimizes system capacity, pixel-level reconstruction fidelity, semantic feature preservation, and perceptual quality. The framework combines deep learning with information-theoretic principles to ensure reliable performance under varying channel conditions. A Gumbel-Softmax approach is integrated with variable rate networks to enable dynamic rate adaptation in response to fluctuating channel states. The system employs a weighted multi-asymmetric Gaussian distribution model to characterize the probability density functions of heterogeneous semantic features, providing accurate entropy estimation in the latent space. The architecture incorporates attention mechanisms and residual learning structures to enhance feature extraction and semantic preservation. A reconfigurable neural network architecture is introduced, featuring adaptive module selection driven by real-time signal-to-noise ratio assessments and service quality requirements. The encoder subsystem consists of semantic extractor networks that identify and isolate critical image features, joint source-channel encoders that perform integrated compression and error protection, and adaptive networks that generate binary mask vectors for intelligent symbol selection. The semantic extractor combines convolutional neural networks with fully connected layers to capture hierarchical feature representations. The joint source-channel coding architecture integrates residual convolution blocks, attention feature blocks, and residual transpose convolution blocks to optimize rate-distortion performance. The adaptive network produces dynamic masks through Gumbel-Softmax sampling, controlling the transmission of semantic symbols based on their relevance and channel state. A two-stage training strategy is implemented. First, end-to-end optimization of the joint source-channel encoders and decoders is performed using mean squared error loss. This is followed by full system training based on a composite loss function that jointly considers transmission rate, pixel-level distortion, semantic distortion, and perceptual quality.

Results and Discussions: Comprehensive experimental validation is conducted using the CIFAR-100 and Kodak24 datasets to assess the effectiveness of the proposed framework. Performance is evaluated using multiple metrics, including Peak Signal-to-Noise Ratio (PSNR) for objective image quality assessment and the USC metric for overall system efficiency. Comparative analysis with conventional JPEG combined with LDPC and QAM schemes shows substantial performance gains, particularly under low Signal-to-Noise Ratio (SNR) conditions (Fig. 3, Fig. 4). The proposed resilient semantic communication framework consistently outperforms conventional methods across different compression ratios, demonstrating robust resistance to channel impairments and effectively mitigating the cliff effect observed in traditional SSCC systems. When compared with advanced deep learning-based approaches such as ADJSCC and DeepJSCC-V, the proposed method achieves significant improvements in both PSNR and transmission efficiency (Fig. 5, Fig. 6). Efficiency evaluation using the USC metric shows that the proposed framework achieves higher utility values across various compression settings, with performance advantages becoming more evident as SNR decreases (Fig. 7). Further analysis of different network configurations demonstrates the adaptability of the architecture in balancing computational complexity and transmission performance (Fig. 8, Fig. 9). Configurations with five network units consistently provide higher PSNR values than three- and four-unit designs. However, four-unit configurations achieve optimal efficiency under high SNR conditions, indicating effective resource allocation in response to varying channel states. Visual quality assessment using the Kodak24 dataset confirms improved reconstruction performance with reduced channel bandwidth requirements, supporting the practical feasibility of the proposed approach (Fig. 10). Computational complexity analysis shows that the five-unit configuration maintains complexity comparable to existing benchmarks, while three- and four-unit designs significantly reduce computational demands, supporting deployment on resource-constrained platforms (Table 4). These experimental results demonstrate that the proposed framework provides superior reconstruction quality, transmission efficiency, and adaptive capability across diverse and challenging wireless environments.

Conclusions: This study presents a resilient semantic communication framework that addresses key challenges in image transmission for space-air-ground networks. Experimental validation confirms that the proposed method achieves substantial improvements over conventional approaches in reconstruction quality, transmission efficiency, and adaptability.
Keywords: Space-air-ground networks; Semantic communication; Channel adaptive; Enhanced information bottleneck theory; Joint source-channel coding
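
The Methods state that the adaptive network produces binary masks through Gumbel-Softmax sampling to decide which semantic symbols are transmitted. A minimal NumPy sketch of that sampling step is given here for illustration. It assumes two logits per symbol (column 0 = drop, column 1 = keep); the function name `gumbel_softmax_mask`, the temperature `tau`, and the logit layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def gumbel_softmax_mask(logits, tau=1.0, rng=None):
    """Sample soft probabilities and a hard keep/drop mask from logits of shape (n, 2)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound avoids log(0).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)            # numerical stability
    soft = np.exp(y) / np.exp(y).sum(axis=-1, keepdims=True)
    # Assumption: column 1 means "keep the symbol", column 0 means "drop it".
    hard = (soft.argmax(axis=-1) == 1).astype(np.float64)
    return soft, hard

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 2))          # hypothetical scores for 8 semantic symbols
soft, mask = gumbel_softmax_mask(logits, tau=0.5, rng=rng)
symbols = rng.normal(size=8)
transmitted = symbols[mask == 1]          # only kept symbols consume channel uses
print("mask:", mask.astype(int), "kept fraction:", mask.mean())
```

In a training framework the hard mask would typically be paired with the soft probabilities through a straight-through estimator so that gradients can flow; that part is omitted from this sketch.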

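Fig. 1 System model of the image semantic communication scheme

Fig. 2 Encoder structure of resilient semantic communication

Fig. 3 Performance comparison between resilient semantic communication and SSCC on the CIFAR-100 dataset

Fig. 4 Performance comparison between resilient semantic communication and SSCC on the Kodak24 dataset

Fig. 5 Performance comparison between resilient semantic communication and DL-based schemes on the CIFAR-100 dataset

Fig. 6 Performance comparison between resilient semantic communication and DL-based schemes on the Kodak24 dataset

Fig. 7 Efficiency comparison between resilient semantic communication and DeepJSCC-V

Fig. 8 Performance comparison of resilient semantic communication with different numbers of "units"

Fig. 9 Efficiency comparison of resilient semantic communication with different numbers of "units"

Fig. 10 Comparison of image reconstruction results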

Algorithm 1 Training procedure of the JSC encoder

Input: training dataset $\mathcal{X}$, batch size $B$, learning rate $\text{lr}$, maximum number of intermediate "units" $L$
Output: JSC encoder and decoder parameter sets $(\boldsymbol{\theta}_{\text{C}}, \boldsymbol{\varphi}_{\text{C}})$, SNR-to-$l$ mapping vector $\boldsymbol{\varLambda}$
(1) Randomly draw a data batch $\boldsymbol{X} = [\boldsymbol{x}_1, \boldsymbol{x}_2, \cdots, \boldsymbol{x}_B]$ from $\mathcal{X}$
(2) For each $l \in \{3, 4, \cdots, L\}$:
(3)   For each data sample $\boldsymbol{x}_i \in \boldsymbol{X}$ in the batch:
(4)     Generate the SNR $\gamma_i$
(5)     Randomly generate the compression rate $R_i \sim \mathcal{U}(0.05, 0.4)$
(6)     Extract the semantic features $\boldsymbol{z}_i = Q(E_{\text{C}}(E_{\text{S}}(\boldsymbol{x}_i), \gamma_i))$
(7)     Generate additive white Gaussian noise $\boldsymbol{n}_i \sim \mathcal{CN}(0, \sigma_i^2 \boldsymbol{I})$ corresponding to SNR $\gamma_i$
(8)     Compute the received compressed signal $\tilde{\boldsymbol{z}}_i$ and reconstruct $\hat{\boldsymbol{y}}_i$
(9)     Reconstruct the image with the decoder: $\hat{\boldsymbol{x}}_i = D_{\text{S}}(D_{\text{C}}(\hat{\boldsymbol{y}}_i, \gamma_i))$
(10) End loop
(11) Compute the average loss $\mathcal{L}$
(12) Update the mapping vector $\boldsymbol{\varLambda}$
(13) Update the model parameters $(\boldsymbol{\theta}_{\text{C}}, \boldsymbol{\varphi}_{\text{C}})$ by gradient descent
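
Steps (4) to (8) of the algorithm can be illustrated with a small NumPy sketch of the channel part only. The encoder output is replaced by random complex symbols, the SNR sampling range of 0 to 20 dB is an assumption (the listing does not state it), and treating $R_i$ as the ratio of channel uses to source dimensions is likewise an assumption. The helper name `awgn_channel` is hypothetical.

```python
import numpy as np

def awgn_channel(z, snr_db, rng):
    """Pass complex symbols through an AWGN channel at the requested SNR (dB)."""
    # Normalize to unit average power so the noise variance alone sets the SNR.
    z = z / np.sqrt(np.mean(np.abs(z) ** 2))
    sigma2 = 10.0 ** (-snr_db / 10.0)                # noise variance for unit signal power
    noise = np.sqrt(sigma2 / 2.0) * (
        rng.standard_normal(z.shape) + 1j * rng.standard_normal(z.shape)
    )                                                # n ~ CN(0, sigma2 * I)
    return z + noise, z, sigma2

rng = np.random.default_rng(0)
n_source = 3 * 32 * 32                               # CIFAR-sized RGB image
snr_db = rng.uniform(0.0, 20.0)                      # step (4); range is an assumption
rate = rng.uniform(0.05, 0.4)                        # step (5): R_i ~ U(0.05, 0.4)
k = int(round(rate * n_source))                      # channel uses, assuming R_i = k / n
# Stand-in for z_i = Q(E_C(E_S(x_i), gamma_i)); no real encoder is modeled here.
z = rng.standard_normal(k) + 1j * rng.standard_normal(k)
received, sent, sigma2 = awgn_channel(z, snr_db, rng)   # steps (7) and (8)
measured = 10 * np.log10(
    np.mean(np.abs(sent) ** 2) / np.mean(np.abs(received - sent) ** 2)
)
print(f"target SNR {snr_db:.2f} dB, measured {measured:.2f} dB, k = {k}")
```

Normalizing the transmit power before adding noise is a common convention in deep joint source-channel coding experiments; whether the paper applies the constraint per image or per batch is not stated in this listing.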

Table 1 Parameters of the semantic communication utility metric

Metric	Value in the image reconstruction task
$\text{ACC}_{\text{min}}$	0
$\text{ACC}_{\text{th}}$	0.99
$\text{TIM}_{\text{th}}$	100 ms
$\delta$	6

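Table 2 Key parameter settings for training

Dataset	CIFAR-100	WHU
Initial learning rate	1e-4	5e-5
Batch size	128	256
Training epochs	400	400
Optimizer	Adam
Learning rate scheduler	poly
Weight parameters	$\varepsilon$ = 0.015, $\eta$ = 0.5, $\lambda$ = 1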
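
Table 2 names a "poly" learning rate scheduler without giving its formula. One common form is polynomial decay, sketched here with the initial learning rates and epoch count from the table. The exponent `power=0.9` and the per-epoch update are assumptions, and `poly_lr` is a hypothetical helper, not the authors' code.

```python
def poly_lr(base_lr, epoch, max_epochs, power=0.9):
    """Polynomial decay: base_lr * (1 - epoch / max_epochs) ** power."""
    if not 0 <= epoch <= max_epochs:
        raise ValueError("epoch must lie in [0, max_epochs]")
    return base_lr * (1.0 - epoch / max_epochs) ** power

# Initial learning rates and the 400-epoch budget are taken from Table 2.
for name, base_lr in {"CIFAR-100": 1e-4, "WHU": 5e-5}.items():
    schedule = [poly_lr(base_lr, e, 400) for e in (0, 100, 200, 300, 400)]
    print(name, ["%.2e" % lr for lr in schedule])
```

With this form the learning rate starts at the initial value and reaches zero at the final epoch; a different exponent only changes how quickly it falls in between.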

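Table 3 Simulation parameter configuration

Parameter	Value
Satellite altitude	35 786 km
Equivalent satellite antenna aperture	22 m
3 dB beamwidth	0.4011°
Satellite antenna transmit gain	51 dBi
Center frequency	2 GHz
Total bandwidth	30 MHz
Ground device receive gain	39.7 dBi
Noise temperature	290 K
Noise figure	7 dB
Noise power spectral density	$1.38 \times 10^{-23} \times 290$ W/Hz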
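
As a rough sanity check on the values in Table 3, the sketch below computes the receiver noise power over the full bandwidth and the free-space path loss. Two assumptions are made that the table does not state: the noise figure is added on top of the 290 K thermal noise, and the path length equals the satellite altitude (a nadir link, which gives the minimum path loss). The function names are hypothetical.

```python
import math

BOLTZMANN = 1.38e-23            # J/K, rounded as in Table 3
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def noise_power_dbm(temp_k, bandwidth_hz, noise_figure_db):
    """Thermal noise power k*T*B in dBm, raised by the receiver noise figure."""
    n0 = BOLTZMANN * temp_k                          # W/Hz, the last row of Table 3
    return 10 * math.log10(n0 * bandwidth_hz) + 30 + noise_figure_db

def fspl_db(distance_m, freq_hz):
    """Free-space path loss 20*log10(4*pi*d*f/c) in dB."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)

noise = noise_power_dbm(temp_k=290.0, bandwidth_hz=30e6, noise_figure_db=7.0)
loss = fspl_db(distance_m=35_786e3, freq_hz=2e9)
print(f"noise power: {noise:.1f} dBm, free-space path loss: {loss:.1f} dB")
```

Under these assumptions the noise floor is about -92 dBm and the path loss about 190 dB; the actual slant range, and therefore the loss, grows as the elevation angle decreases.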

Table 4 Complexity comparison

Scheme	FLOPs (G)	Parameters (M)
Proposed scheme, 5 units	5.797	11.700
Proposed scheme, 4 units	3.942	8.160
Proposed scheme, 3 units	2.088	4.620
DeepJSCC-V	5.797	11.700
ADJSCC	5.527	11.102
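
The savings implied by Table 4 can be made explicit with a few lines of arithmetic. The sketch below uses only the numbers in the table and reports each scheme's reduction relative to the five-unit configuration; the helper `reduction_percent` is a hypothetical name introduced for illustration.

```python
# (FLOPs in G, parameters in M), copied from Table 4.
table4 = {
    "Proposed scheme, 5 units": (5.797, 11.700),
    "Proposed scheme, 4 units": (3.942, 8.160),
    "Proposed scheme, 3 units": (2.088, 4.620),
    "DeepJSCC-V": (5.797, 11.700),
    "ADJSCC": (5.527, 11.102),
}

def reduction_percent(value, reference):
    """Relative saving of `value` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - value / reference)

ref_flops, ref_params = table4["Proposed scheme, 5 units"]
for name, (flops, params) in table4.items():
    print(f"{name}: {reduction_percent(flops, ref_flops):.1f}% fewer FLOPs, "
          f"{reduction_percent(params, ref_params):.1f}% fewer parameters")

# Cost of one additional unit, from the differences between adjacent configurations.
per_unit_flops = table4["Proposed scheme, 5 units"][0] - table4["Proposed scheme, 4 units"][0]
per_unit_params = table4["Proposed scheme, 5 units"][1] - table4["Proposed scheme, 4 units"][1]
print(f"one unit costs about {per_unit_flops:.3f} GFLOPs and {per_unit_params:.2f} M parameters")
```

The four-unit and three-unit configurations come out at roughly 32% and 64% fewer FLOPs than the five-unit one, and the nearly constant step between configurations suggests that the cost scales linearly with the number of units.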
