Citation: WANG Zhen, LIU Wei, LU Wanjie, NIU Chaoyang, LI Runsheng. Multi-modal Joint Automatic Modulation Recognition Method Towards Low SNR Sequences[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250594

Multi-modal Joint Automatic Modulation Recognition Method Towards Low SNR Sequences

doi: 10.11999/JEIT250594 cstr: 32379.14.JEIT250594
  • Received Date: 2025-06-24
  • Rev Recd Date: 2025-10-14
  • Available Online: 2025-10-20
  • Objective  The rapid evolution of data-driven intelligent algorithms and the rise of multi-modal data indicate that the future of Automatic Modulation Recognition (AMR) lies in joint approaches that integrate multiple domains, frameworks, and scales. However, the embedding spaces of different modalities are heterogeneous, and existing models lack cross-modal adaptive representation, which limits collaborative interpretation. To address this challenge, this study proposes a performance-interpretable, two-stage deep learning-based AMR (DL-AMR) method that jointly models the signal in the time and transform domains. The approach represents signals both explicitly and implicitly from multiple perspectives, including the temporal, spatial, frequency, and intensity dimensions. This design provides theoretical support for multi-modal AMR and offers an intelligent solution for modeling low Signal-to-Noise Ratio (SNR) time sequences in open environments.
  • Methods  The proposed AMR network begins with a preprocessing stage in which the input signal is represented as an in-phase and quadrature (I/Q) sequence. After wavelet-thresholding denoising, the signal is converted into a dual-channel representation, with one channel undergoing a Short-Time Fourier Transform (STFT). This preprocessing yields a dual-stream representation comprising time-domain and transform-domain signals, which are then tokenized by time-domain and transform-domain encoders. In the first stage, explicit modal alignment is performed: the token sequences from the two domains are fed in parallel into a contrastive learning module, which explicitly captures and strengthens cross-modal correlations in dimensions such as temporal structure and amplitude. The learned features are then passed to the feature fusion module, where Bidirectional Long Short-Term Memory (BiLSTM) and local representation layers capture temporally sparse features, enabling subsequent feature decomposition and reconstruction. To refine feature extraction, a subspace attention mechanism is applied to the high-dimensional sparse feature space, allowing efficient capture of the discriminative information contained in both high-frequency and low-frequency components. Finally, Convolutional Neural Network-Kolmogorov-Arnold Network (CNN-KAN) layers replace the traditional multilayer perceptron classifier, enhancing classification performance under low SNR conditions. A minimal, hedged sketch of this first stage is given below.
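As an illustration of the first stage, the minimal Python/PyTorch sketch below shows one plausible realization of the preprocessing and explicit alignment described above: wavelet-threshold denoising of the I/Q sequence, an STFT magnitude stream, and an InfoNCE-style contrastive loss that aligns pooled time-domain and transform-domain embeddings. The wavelet family (db4), the universal soft threshold, the STFT parameters (n_fft=64, hop_length=16), and the temperature tau are assumptions; the abstract does not specify the paper's exact choices.

import numpy as np
import pywt                          # PyWavelets, for wavelet-threshold denoising
import torch
import torch.nn.functional as F

def wavelet_denoise(x, wavelet="db4", level=3):
    # Soft-threshold the detail coefficients with the universal threshold
    # (assumed defaults; the paper's wavelet and threshold rule are not given).
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # noise scale estimate
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(x)]

def dual_stream(iq):
    # iq: (2, N) I/Q array -> (denoised time stream, STFT magnitude stream).
    iq_dn = np.stack([wavelet_denoise(ch) for ch in iq])
    sig = torch.tensor(iq_dn[0] + 1j * iq_dn[1], dtype=torch.complex64)
    spec = torch.stft(sig, n_fft=64, hop_length=16,
                      window=torch.hann_window(64), return_complex=True).abs()
    return torch.tensor(iq_dn, dtype=torch.float32), spec

def info_nce(z_time, z_freq, tau=0.1):
    # Matching time/transform pairs are positives; all other pairs in the
    # batch are negatives. The paper's actual contrastive objective may differ.
    z_time = F.normalize(z_time, dim=1)
    z_freq = F.normalize(z_freq, dim=1)
    logits = z_time @ z_freq.t() / tau                      # (B, B) similarities
    labels = torch.arange(z_time.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)

# Example: build the dual streams for one 128-sample I/Q sequence and align
# a batch of 8 pooled 128-dimensional embeddings from the two encoders.
t_stream, f_stream = dual_stream(np.random.randn(2, 128))
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))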
  • Results and Discussions  The proposed method is validated experimentally on three datasets: RML2016.10a, RML2016.10b, and HisarMod2019.1. Under high SNR conditions (SNR > 0 dB), classification accuracies of 93.36%, 93.13%, and 93.37% are achieved on the three datasets, respectively. Under low SNR conditions, where signals are severely corrupted or blurred by noise, recognition performance decreases but remains robust: for SNRs from –6 dB to 0 dB, overall accuracies of 78.36%, 80.72%, and 85.43% are maintained, and even below –6 dB, accuracies of 17.10%, 21.30%, and 29.85% are obtained. Across the full low-SNR span (–20 dB to 0 dB), average accuracies of 43.45%, 44.54%, and 60.02% are still achieved. Compared with traditional approaches, and while keeping the parameter count low (0.33–0.41 M), the proposed method improves average recognition accuracy by 2.12–7.89%, 0.45–4.64%, and 6.18–9.53% on the three datasets. The improvements under low SNR conditions are especially significant, reaching 4.89–12.70% (RML2016.10a), 2.62–8.72% (RML2016.10b), and 4.96–11.63% (HisarMod2019.1). These results indicate that explicit modeling of time–transform domain correlations through contrastive learning, combined with a hybrid architecture in which LSTM handles temporal sequence modeling, CNN handles local feature extraction, and KAN handles nonlinear approximation, substantially enhances the noise robustness of the model.
  • Conclusions  This study proposes a two-stage AMR method based on time–transform domain multimodal fusion. Explicit multimodal alignment is achieved through contrastive learning, temporal and local features are extracted by a combination of LSTM and CNN, and a KAN enhances nonlinear modeling, enabling implicit feature-level multimodal fusion (a corresponding hedged sketch of this fusion stage is given below). Experiments on three benchmark datasets demonstrate that, compared with classical methods, the proposed approach improves recognition accuracy by 2.62–11.63% within the SNR range of –20 to 0 dB while maintaining a similar number of parameters. The performance gains are particularly significant under low-SNR conditions, confirming the effectiveness of multimodal joint modeling for robust AMR.
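The second stage can be sketched in the same hedged spirit. In the code below, SubspaceAttention is a simplified, ULSAM-inspired grouped attention (one attention map per channel subspace), and a 1-D convolutional head with global average pooling stands in for the CNN-KAN classifier, since KAN layers require an external implementation; all layer sizes are illustrative rather than the paper's.

import torch
import torch.nn as nn

class SubspaceAttention(nn.Module):
    # Split the channel dimension into groups and learn one attention map
    # over time per group (a simplified stand-in for ULSAM-style attention).
    def __init__(self, channels, groups=4):
        super().__init__()
        self.groups = groups
        self.maps = nn.ModuleList([
            nn.Sequential(nn.Conv1d(channels // groups, 1, kernel_size=1),
                          nn.Softmax(dim=-1))
            for _ in range(groups)])

    def forward(self, x):                                   # x: (B, C, T)
        chunks = x.chunk(self.groups, dim=1)
        return torch.cat([c * m(c) for c, m in zip(chunks, self.maps)], dim=1)

class FusionHead(nn.Module):
    # BiLSTM for temporal structure, a local conv layer, subspace attention,
    # then a pooled conv classifier as a placeholder for the CNN-KAN layers.
    def __init__(self, in_dim=32, hidden=64, n_classes=11, groups=4):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.local = nn.Conv1d(2 * hidden, 2 * hidden, kernel_size=3, padding=1)
        self.attn = SubspaceAttention(2 * hidden, groups)
        self.head = nn.Sequential(nn.Conv1d(2 * hidden, n_classes, kernel_size=1),
                                  nn.AdaptiveAvgPool1d(1))

    def forward(self, tokens):                              # tokens: (B, T, D)
        h, _ = self.bilstm(tokens)                          # (B, T, 2*hidden)
        h = torch.relu(self.local(h.transpose(1, 2)))       # (B, 2*hidden, T)
        h = self.attn(h)
        return self.head(h).squeeze(-1)                     # (B, n_classes) logits

# Example: 4 fused token sequences of length 128 with token dimension 32.
logits = FusionHead()(torch.randn(4, 128, 32))              # shape (4, 11)

Grouping the BiLSTM output into subspaces keeps the attention cost low while letting each subspace reweight the time steps that carry its discriminative high- or low-frequency content, matching the motivation stated in the Methods.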
  • [1]
    MA Jitong, HU Mutian, CHEN Xiao, et al. Few-shot automatic modulation classification via semi-supervised metric learning and lightweight conv-transformer model[J]. IEEE Transactions on Cognitive Communications and Networking. doi: 10.1109/TCCN.2025.3574312.
    [2]
    XU J L, SU Wei, and ZHOU Mengchu. Likelihood-ratio approaches to automatic modulation classification[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2011, 41(4): 455–469. doi: 10.1109/TSMCC.2010.2076347.
    [3]
    IGLESIAS V, GRAJAL J, ROYER P, et al. Real-time low-complexity automatic modulation classifier for pulsed radar signals[J]. IEEE Transactions on Aerospace and Electronic Systems, 2015, 51(1): 108–126. doi: 10.1109/TAES.2014.130183.
    [4]
    郑庆河, 刘方霖, 余礼苏, 等. 基于改进Kolmogorov-Arnold混合卷积神经网络的调制识别方法[J]. 电子与信息学报, 2025, 47(8): 2584–2597. doi: 10.11999/JEIT250161.

    ZHENG Qinghe, LIU Fanglin, YU Lisu, et al. An improved modulation recognition method based on hybrid kolmogorov-arnold convolutional neural network[J]. Journal of Electronics & Information Technology, 2025, 47(8): 2584–2597. doi: 10.11999/JEIT250161.
    [5]
    LI Mingkun, WANG Pengyu, DONG Yuhan, et al. Diffusion model empowered data augmentation for automatic modulation recognition[J]. IEEE Wireless Communications Letters, 2025, 14(4): 1224–1228. doi: 10.1109/LWC.2025.3539821.
    [6]
    李钦, 刘伟, 牛朝阳, 等. 低信噪比下基于分裂EfficientNet网络的雷达信号调制方式识别[J]. 电子学报, 2023, 51(3): 675–686. doi: 10.12263/DZXB.20210656.

    LI Qin, LIU Wei, NIU Chaoyang, et al. Radar signal modulation recognition based on split EfficientNet under low signal-to-noise ratio[J]. Acta Electronica Sinica, 2023, 51(3): 675–686. doi: 10.12263/DZXB.20210656.
    [7]
    CHEN Zhuangzhi, CUI Hui, XIANG Jingyang, et al. SigNet: A novel deep learning framework for radio signal classification[J]. IEEE Transactions on Cognitive Communications and Networking, 2022, 8(2): 529–541. doi: 10.1109/TCCN.2021.3120997.
    [8]
    KE Ziqi and VIKALO H. Real-time radio technology and modulation classification via an LSTM auto-encoder[J]. IEEE Transactions on Wireless Communications, 2022, 21(1): 370–382. doi: 10.1109/TWC.2021.3095855.
    [9]
    SHAO Mingyuan, LI Dingzhao, HONG Shaohua, et al. IQFormer: A novel transformer-based model with multi-modality fusion for automatic modulation recognition[J]. IEEE Transactions on Cognitive Communications and Networking, 2025, 11(3): 1623–1634. doi: 10.1109/TCCN.2024.3485118.
    [10]
    ZHAN Quanhai, ZHANG Xiongwei, SUN Meng, et al. Adversarial robust modulation recognition guided by attention mechanisms[J]. IEEE Open Journal of Signal Processing, 2025, 6: 17–29. doi: 10.1109/OJSP.2025.3526577.
    [11]
    GUO Yuanpu, DAN Zhong, SUN Haixin, et al. SemiAMR: Semi-supervised automatic modulation recognition with corrected pseudo-label and consistency regularization[J]. IEEE Transactions on Cognitive Communications and Networking, 2024, 10(1): 107–121. doi: 10.1109/TCCN.2023.3319530.
    [12]
    MA Jitong, HU Mutian, WANG Tianyu, et al. Automatic modulation classification in impulsive noise: Hyperbolic-tangent cyclic spectrum and multibranch attention shuffle network[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 5501613. doi: 10.1109/TIM.2023.3244798.
    [13]
    WANG Danshi, ZHANG Min, LI Ze, et al. Modulation format recognition and OSNR estimation using CNN-based deep learning[J]. IEEE Photonics Technology Letters, 2017, 29(19): 1667–1670. doi: 10.1109/LPT.2017.2742553.
    [14]
    PENG Shengliang, JIANG Hanyu, WANG Huaxia, et al. Modulation classification based on signal constellation diagrams and deep learning[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(3): 718–727. doi: 10.1109/TNNLS.2018.2850703.
    [15]
    SHI Yunhao, XU Hua, ZHANG Yue, et al. GAF-MAE: A self-supervised automatic modulation classification method based on gramian angular field and masked autoencoder[J]. IEEE Transactions on Cognitive Communications and Networking, 2024, 10(1): 94–106. doi: 10.1109/TCCN.2023.3318414.
    [16]
    ZHANG Zufan, WANG Chun, GAN Chenquan, et al. Automatic modulation classification using convolutional neural network with features fusion of SPWVD and BJD[J]. IEEE Transactions on Signal and Information Processing over Networks, 2019, 5(3): 469–478. doi: 10.1109/TSIPN.2019.2900201.
    [17]
    ZHENG Shilian, ZHOU Xiaoyu, ZHANG Luxin, et al. Toward next-generation signal intelligence: A hybrid knowledge and data-driven deep learning framework for radio signal classification[J]. IEEE Transactions on Cognitive Communications and Networking, 2023, 9(3): 564–579. doi: 10.1109/TCCN.2023.3243899.
    [18]
    SHI Yunhao, XU Hua, QI Zisen, et al. STTMC: A few-shot spatial temporal transductive modulation classifier[J]. IEEE Transactions on Machine Learning in Communications and Networking, 2024, 2: 546–559. doi: 10.1109/TMLCN.2024.3387430.
    [19]
    WANG Feng, YANG Chenlu, HUANG Shanshan, et al. Automatic modulation classification based on joint feature map and convolutional neural network[J]. IET Radar, Sonar & Navigation, 2019, 13(6): 998–1003. doi: 10.1049/iet-rsn.2018.5549.
    [20]
    ZHUANG Long, LUO Kai, and YANG Zhibo. A multimodal gated recurrent unit neural network model for damage assessment in CFRP composites based on lamb waves and minimal sensing[J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73: 3506911. doi: 10.1109/TIM.2023.3348884.
    [21]
    QU Yunpeng, LU Zhilin, ZENG Rui, et al. Enhancing automatic modulation recognition through robust global feature extraction[J]. IEEE Transactions on Vehicular Technology, 2025, 74(3): 4192–4207. doi: 10.1109/TVT.2024.3486079.
    [22]
    ZHANG Shu, ZHENG Dequan, HU Xinchen, et al. Bidirectional long short-term memory networks for relation classification[C]. The 29th Pacific Asia Conference on Language, Information and Computation, Shanghai, China, 2015: 73–78.
    [23]
    ZHU Jinhua, XIA Yingce, WU Lijun, et al. Masked contrastive representation learning for reinforcement learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(3): 3421–3433. doi: 10.1109/TPAMI.2022.3176413.
    [24]
    HUANG Wenhao, GONG Haifan, ZHANG Huan, et al. BCNet: Bronchus classification via structure guided representation learning[J]. IEEE Transactions on Medical Imaging, 2025, 44(1): 489–498. doi: 10.1109/TMI.2024.3448468.
    [25]
    SAINI R, JHA N K, DAS B, et al. ULSAM: Ultra-lightweight subspace attention module for compact convolutional neural networks[C]. 2020 IEEE Winter Conference on Applications of Computer Vision, Snowmass, CO, USA, 2020: 1616–1625. doi: 10.1109/WACV45572.2020.9093341.
    [26]
    LIU Ziming, WANG Yixuan, VAIDYA S, et al. KAN: Kolmogorov-Arnold networks[C]. 13th International Conference on Learning Representations, Singapore, Singapore, 2025.
    [27]
    HUYNH-THE T, HUA C H, PHAM Q V, et al. MCNet: An efficient CNN architecture for robust automatic modulation classification[J]. IEEE Communications Letters, 2020, 24(4): 811–815. doi: 10.1109/LCOMM.2020.2968030.
    [28]
    RAJENDRAN S, MEERT W, GIUSTINIANO D, et al. Deep learning models for wireless signal classification with distributed low-cost spectrum sensors[J]. IEEE Transactions on Cognitive Communications and Networking, 2018, 4(3): 433–445. doi: 10.1109/TCCN.2018.2835460.
    [29]
    ZHANG Fuxin, LUO Chunbo, XU Jialang, et al. An efficient deep learning model for automatic modulation recognition based on parameter estimation and transformation[J]. IEEE Communications Letters, 2021, 25(10): 3287–3290. doi: 10.1109/LCOMM.2021.3102656.
    [30]
    XU Jialang, LUO Chunbo, PARR G, et al. A spatiotemporal multi-channel learning framework for automatic modulation recognition[J]. IEEE Wireless Communications Letters, 2020, 9(10): 1629–1632. doi: 10.1109/LWC.2020.2999453.
    [31]
    ZHANG Jiawei, WANG Tiantian, FENG Zhixi, et al. AMC-Net: An effective network for automatic modulation classification[C]. 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Rhodes Island, Greece, 2023: 1–5. doi: 10.1109/ICASSP49357.2023.10097070.
    [32]
    CHEN Yantao, DONG Binhong, LIU Cuiting, et al. Abandon locality: Frame-wise embedding aided transformer for automatic modulation recognition[J]. IEEE Communications Letters, 2023, 27(1): 327–331. doi: 10.1109/LCOMM.2022.3213523.
    [33]
    O’SHEA T J, CORGAN J, and CHARLES CLANCY T. Convolutional radio modulation recognition networks[C]. 17th International Conference on Engineering Applications of Neural Networks, Aberdeen, UK, 2016: 213–226. doi: 10.1007/978-3-319-44188-7_16.
    [34]
    JAFARIGOL E, ALAGHBAND B, GILANPOUR A, et al. AI/ML-based automatic modulation recognition: Recent trends and future possibilities[J]. arXiv: 2502.05315, 2025. doi: 10.48550/arXiv.2502.05315.(查阅网上资料,未能确认文献类型,请确认).