Citation: | REN Chao, DING Siying, ZHANG Xiaoqi, ZHANG Haijun. Wireless Multimodal Communications for 6G[J]. Journal of Electronics & Information Technology, 2024, 46(5): 1658-1671. doi: 10.11999/JEIT231201 |
[1] |
CHEN Guanghui and ZENG Xiaoping. Multi-modal emotion recognition by fusing correlation features of speech-visual[J]. IEEE Signal Processing Letters, 2021, 28: 533–537. doi: 10.1109/LSP.2021.3055755.
|
[2] |
LIU Shuang, DUAN Linlin, ZHANG Zhong, et al. Multimodal ground-based remote sensing cloud classification via learning heterogeneous deep features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(11): 7790–7800. doi: 10.1109/TGRS.2020.2984265.
|
[3] |
CHEN Haifeng, JIANG Dongmei, and SAHLI H. Transformer encoder with multi-modal multi-head attention for continuous affect recognition[J]. IEEE Transactions on Multimedia, 2021, 23: 4171–4183. doi: 10.1109/TMM.2020.3037496.
|
[4] |
DENG Li, DU Xi and SHEN Ji zhong. Web page classification based on heterogeneous features and a combination of multiple classifiers[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(217): 995–1004. doi: 10.1631/FITEE.1900240.
|
[5] |
FINOMORE V JR, POPIK D JR, CASTLE C JR, et al. Effects of a network-centric multi-modal communication tool on a communication monitoring task[J]. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2010, 54(25): 2125–2129. doi: 10.1177/154193121005402501.
|
[6] |
PORWOL L and OJO A. VR-participation: The feasibility of the virtual reality-driven multi-modal communication technology facilitating e-Participation[C]. The 11th International Conference on Theory and Practice of Electronic Governance, Galway, Ireland, 2018: 269–278. doi: 10.1145/3209415.3209515.
|
[7] |
徐建博, 魏昕, 周亮. 面向跨模态通信的信息恢复技术[J]. 电子学报, 2022, 50(7): 1631–1642. doi: 10.12263/DZXB.20210945.
XU Jianbo, WEI Xin, and ZHOU Liang. Information recovery technology for cross-modal communications[J]. Acta Electronica Sinica, 2022, 50(7): 1631–1642. doi: 10.12263/DZXB.20210945.
|
[8] |
XIE Jiayu, JIN Xin, and CAO Hongkun. SMRD: A local feature descriptor for multi-modal image registration[C]. 2021 International Conference on Visual Communications and Image Processing (VCIP), Munich, Germany, 2021: 1–5. doi: 10.1109/VCIP53242.2021.9675401.
|
[9] |
GERMINIAN J F and TRICYA ESTERINA WIDAGDO ST. Utilizing postGIS extension to process spatial data stored in Neo4j database[C]. IEEE International Conference on Data and Software Engineering (ICoDSE), Toba, Indonesia, 2023: 250–255. doi: 10.1109/ICoDSE59534.2023.10291400.
|
[10] |
王佩瑾, 闫志远, 容雪娥, 等. 数据受限条件下的多模态处理技术综述[J]. 中国图象图形学报, 2022, 27(10): 2803–2834.
WANG Peijin, YAN Zhiyuan, RONG Xuee, et al. Review of multimodal data processing techniques with limited data[J]. Journal of Image and Graphics, 2022, 27(10): 2803–2834.
|
[11] |
BHANDARI D and PAUL S. Perception net: A multimodal deep neural network for machine perception[C]. 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bangalore, India, 2018: 1419–1426. doi: 10.1109/SSCI.2018.8628711.
|
[12] |
LACKEY S, BARBER D, REINERMAN L, et al. Defining next-generation multi-modal communication in human robot interaction[J]. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2011, 55(1): 461–464. doi: 10.1177/1071181311551095.
|
[13] |
AI Tianfang. Application of human-computer interaction technology in mobile interface design for digital media[C]. 2022 International Conference on Electronics and Devices, Computational Science (ICEDCS), Marseille, France, 2022: 152–155. doi: 10.1109/ICEDCS57360.2022.00040.
|
[14] |
SHU Beibei, SZIEBIG G, and PIETERS R. Architecture for safe human-robot collaboration: Multi-modal communication in virtual reality for efficient task execution[C]. 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, Canada, 2019: 2297–2302. doi: 10.1109/ISIE.2019.8781372.
|
[15] |
TODA Y and KUBOTA N. Computational intelligence for human-friendly robot partners based on multi-modal communication[C]. The 1st IEEE Global Conference on Consumer Electronics 2012, Tokyo, Japan, 2012: 309–313. doi: 10.1109/GCCE.2012.6379611.
|
[16] |
CHU Ziyan, WANG Jiacheng, JIANG Xinyue, et al. mind-vr: a utility approach of human-computer interaction in virtual space based on autonomous consciousness[C]. 2022 International Conference on Virtual Reality, Human-Computer Interaction and Artificial Intelligence (VRHCIAI), Changsha, China, 2022: 134–138. doi: 10.1109/VRHCIAI57205.2022.00030.
|
[17] |
WANG Dayan, QU Jue, WANG Wei, et al. The verification system for interface intelligent perception of human-computer interaction[C]. 2020 International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI), Sanya, China, 2020: 43–48. doi: 10.1109/ICHCI51889.2020.00017.
|
[18] |
LIU Weiwei, PANG Yalong and LUAN Shenshen, et al. Building a Spaceborne integrated high-performance processing and computing platform based on SpaceVPX[C]. 2022 International Conference on Computing, Communication, Perception and Quantum Technology (CCPQT), Xiamen, China, 2022: 287-293. doi: 10.1109/CCPQT56151.2022.00056.
|
[19] |
REN Chao and LIU Luchuan. Toward full passive internet of things: Symbiotic localization and ambient backscatter communication[J]. IEEE Internet of Things Journal, 2023, 10(22): 19495–19506. doi: 10.1109/JIOT.2023.3262779.
|
[20] |
REN Chao, HE Zongrui, LI Yingqi, et al. Priority aggregation network with integrated computation and sensation for ultra dense artificial intelligence of things[J]. IEEE Wireless Communications Letters, 2024, 13(2): 270–273. doi: 10.1109/LWC.2023.3323421.
|
[21] |
REN Chao, LIU Luchuan, and ZHANG Haijun. Multimodal interference compatible passive UAV network based on location-aware flexibility[J]. IEEE Wireless Communications Letters, 2023, 12(4): 640–643. doi: 10.1109/LWC.2023.3237637.
|
[22] |
REN Chao, GONG Chao, and LIU Luchuan. Task-oriented multimodal communication based on cloud-edge-UAV collaboration[J]. IEEE Internet of Things Journal, 2024, 11(1): 125–136. doi: 10.1109/JIOT.2023.3295650.
|
[23] |
REN Chao, GONG Chao, CAO Difei, et al. Enhancing reliability in multimodal UAV communication based on opportunistic task space[J]. IEEE Wireless Communications Letters, 2024, 13(2): 284–287. doi: 10.1109/LWC.2023.3326130.
|
[24] |
KASHEVNIK A, LASHKOV I, AXYONOV A, et al. Multimodal corpus design for audio-visual speech recognition in vehicle cabin[J]. IEEE Access, 2021, 9: 34986–35003. doi: 10.1109/ACCESS.2021.3062752.
|
[25] |
KARPOV A, RONZHIN A, and KIPYATKOVA I. Designing a multimodal corpus of audio-visual speech using a high-speed camera[C]. 2012 IEEE 11th International Conference on Signal Processing, Beijing, China, 2012: 519–522. doi: 10.1109/ICoSP.2012.6491539.
|
[26] |
MONTORSI G and BENEDETTO S. Design of spatially coupled turbo product codes for optical communications[C]. 2021 11th International Symposium on Topics in Coding (ISTC), Montreal, Canada, 2021: 1–5. doi: 10.1109/ISTC49272.2021.9594120.
|
[27] |
ÇALIŞKAN E K, GÜLBAZ A, TENGIZLER B, et al. NDC-O-OFDM with channel coding for IM/DD communication systems[C]. 2022 30th Signal Processing and Communications Applications Conference (SIU), Safranbolu, Turkey, 2022: 1–4. doi: 10.1109/SIU55565.2022.9864710.
|
[28] |
KHAN M H and ZHANG Gongxuan. Evaluation of channel coding techniques for massive machine-type communication in 5G cellular network[C]. 2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP), Shanghai, China, 2020: 375–379. doi: 10.1109/ICICSP50920.2020.9232037.
|
[29] |
WANG Yue, SHI Qingwen, XU Li, et al. A study on the construction of traditional Chinese medicine multimodal corpus under the view of self-media[C]. 2022 International Conference on Computers, Information Processing and Advanced Education (CIPAE), Ottawa, Canada, 2022: 373–375. doi: 10.1109/CIPAE55637.2022.00084.
|
[30] |
LI Guangyan, FANG Tian, LIANG Li, et al. Current situation of multimodal corpora and the method of Tibetan corpus construction[C]. 2018 IEEE International Conference of Safety Produce Informatization (IICSPI), Chongqing, China, 2018: 449–453. doi: 10.1109/IICSPI.2018.8690378.
|
[31] |
YEH C H, LIN Y L, LI C C, et al. Compressed-and-forward: Compressive sensing for cooperative communication[C]. 2012 International Symposium on Intelligent Signal Processing and Communications Systems, Tamsui, China, 2012: 319–322. doi: 10.1109/ISPACS.2012.6473503.
|
[32] |
HANECHE H, BOUDRAA B, and OUAHABI A. Compressed sensing investigation in an end-to-end rayleigh communication system: Speech compression[C]. 2018 International Conference on Smart Communications in Network Technologies (SaCoNeT), El Oued, Algeria, 2018: 73–77. doi: 10.1109/SaCoNeT.2018.8585702.
|
[33] |
CHEN Mingkai, LIU Minghao, WANG Wenjun, et al. Cross-modal semantic communications in 6G[C]. 2023 IEEE/CIC International Conference on Communications in China (ICCC), Dalian, China, 2023: 1–6. doi: 10.1109/ICCC57788.2023.10233481.
|
[34] |
YANG Lianxin, WU Dan, and ZHOU Liang. Heterogeneous stream scheduling for cross-modal transmission[J]. IEEE Transactions on Communications, 2021, 69(9): 6037–6049. doi: 10.1109/TCOMM.2021.3086522.
|
[35] |
LU Kangkang, LIANG Meiyu, XUE Zhe, et al. Adversarial guided gradient estimation hashing for cross-modal retrieval[C]. 2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS), Chengdu, China, 2022: 109–113. doi: 10.1109/CCIS57298.2022.10016424.
|
[36] |
ZHAO Wenying, XU Qian, WANG Haoyu, et al. Cross-modal hashing for material surface properties fusion[C]. 2023 International Wireless Communications and Mobile Computing (IWCMC), Marrakesh, Morocco, 2023: 194–198. doi: 10.1109/IWCMC58020.2023.10183090.
|
[37] |
HUANG Yingying, WANG Quan, ZHANG Yipeng, et al. A unified perspective of multi-level cross-modal similarity for cross-modal retrieval[C]. 2022 5th International Conference on Information Communication and Signal Processing (ICICSP), Shenzhen, China, 2022: 466–471. doi: 10.1109/ICICSP55539.2022.10050678.
|
[38] |
WEI Xin, SHI Yingying, and ZHOU Liang. Haptic signal reconstruction for cross-modal communications[J]. IEEE Transactions on Multimedia, 2022, 24: 4514–4525. doi: 10.1109/TMM.2021.3119860.
|
[39] |
王惠琴, 高大庆, 何永强, 等. GPR图像的数据集构建及其DRDU-Net去噪算法[J/OL]. 湖南大学学报: 自然科学版,http://kns.cnki.net/kcms/detail/43.1061.N.20231017.1626.004.html, 2023.
WANG Huiqin, GAO Daqing, HE Yongqiang, et al. DRDU-net based denoising algorithm for GPR image dataset[J/OL]. Journal of Hunan University: Natural Sciences,http://kns.cnki.net/kcms/detail/43.1061.N.20231017.1626.004.html, 2023.
|
[40] |
AKIYAMA H, TANAKA M, and OKUTOMI M. Pseudo four-channel image denoising for noisy CFA raw data[C]. 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, Canada, 2015: 4778–4782. doi: 10.1109/ICIP.2015.7351714.
|
[41] |
LI Jun, YU Menghong, and YUAN Wei. Application of wavelet new threshold denoising in ship data preprocessing and modeling[C]. 2020 Chinese Automation Congress (CAC), Shanghai, China, 2020: 1263–1267. doi: 10.1109/CAC51589.2020.9326954.
|
[42] |
WANG Feng, YANG Bo, WANG Yuqing, et al. Learning from noisy data: An unsupervised random denoising method for seismic data using model-based deep learning[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5913314. doi: 10.1109/TGRS.2022.3165037.
|
[43] |
TIAN Chongpeng, HONG Mei, LI Dongying, et al. Deep recurrent neural network for ground-penetrating radar signal denoising[C]. 2022 4th International Conference on Intelligent Information Processing (IIP), Guangzhou, China, 2022: 85−88. doi: 10.1109/IIP57348.2022.00024.
|
[44] |
GONG Jianhu. Denoising control method of abnormal signals in communication networks based on big data analysis[C]. 2021 13th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), Beihai, China, 2021: 329–334. doi: 10.1109/ICMTMA52658.2021.00077.
|
[45] |
LIU Yuchi, YAO Yue, WANG Zhengjie, et al. Generalized alignment for multimodal physiological signal learning[C]. 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 2019: 1–10. doi: 10.1109/IJCNN.2019.8852216.
|
[46] |
CHEN Liyi, LI Zhi and WANG Yijun, et al. MMEA: Entity alignment for multi-modal knowledge graph[C]. In Knowledge Science, Engineering and Management: 13th International Conference, KSEM 2020, Hangzhou, China, 2020: 134–147. doi: https://doi.org/10.1007/978-3-030-55130-8_12.
|
[47] |
王欢, 宋丽娟, 杜方. 基于多模态知识图谱的中文跨模态实体对齐方法[J]. 计算机工程, 2023, 49(12): 88–95. doi: 10.19678/j.issn.1000-3428.0066938.
WANG Huan, SONG Lijuan, and DU Fang. Chinese cross-modal entity alignment method based on multi-modal knowledge graph[J]. Computer Engineering, 2023, 49(12): 88–95. doi: 10.19678/j.issn.1000-3428.0066938.
|
[48] |
罗俊豪, 朱焱. 用于未对齐多模态语言序列情感分析的多交互感知网络[J]. 计算机应用, 2024, 44(1): 79–85. doi: 10.11772/j.issn.1001-9081.2023060815.
LUO Junhao and ZHU Yan. Multi-dynamic aware network for unaligned multimodal language sequence sentiment analysis[J]. Journal of Computer Applications, 2024, 44(1): 79–85. doi: 10.11772/j.issn.1001-9081.2023060815.
|
[49] |
LIU Ye, QIAO Lingfeng, LU Changchong, et al. OSAN: A one-stage alignment network to unify multimodal alignment and unsupervised domain adaptation[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 2023: 3551–3560. doi: 10.1109/CVPR52729.2023.00346.
|
[50] |
GAO Lei and GUAN Ling. A discriminative vectorial framework for multi-modal feature representation[J]. IEEE Transactions on Multimedia, 2022, 24: 1503–1514. doi: 10.1109/TMM.2021.3066118.
|
[51] |
XUE Zhixiang, YU Xuchu, ZHANG Pengqiang, et al. Self-supervised feature learning and few-shot land cover classification for cross-modal remote sensing images[C]. IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022: 3600–3603. doi: 10.1109/IGARSS46834.2022.9884265.
|
[52] |
AL-HMOUZ R, DAQROUQ K, MORFEQ A, et al. Multimodal biometrics using multiple feature representations to speaker identification system[C]. 2015 International Conference on Information and Communication Technology Research (ICTRC), Abu Dhabi, United Arab Emirates, 2015: 314–317. doi: 10.1109/ICTRC.2015.7156485.
|
[53] |
GAO Lei and GUAN Ling. Interpretable learning-based multi-modal hashing analysis for multi-view feature representation learning[C]. 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR), CA, USA, 2022: 47–52. doi: 10.1109/MIPR54900.2022.00016.
|
[54] |
WANG Shiping and GUO Wenzhong. Sparse multigraph embedding for multimodal feature representation[J]. IEEE Transactions on Multimedia, 2017, 19(7): 1454–1466. doi: 10.1109/TMM.2017.2663324.
|
[55] |
SHEKHAR S, PATEL V M, NASRABADI N M, et al. Joint sparse representation for robust multimodal biometrics recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(1): 113–126. doi: 10.1109/TPAMI.2013.109.
|
[56] |
HOU Xiaofeng, XU Cheng, LIU Jiacheng, et al. Characterizing and understanding end-to-end multi-modal neural networks on GPUs[J]. IEEE Computer Architecture Letters, 2022, 21(2): 125–128. doi: 10.1109/LCA.2022.3215718.
|
[57] |
LI Shuzhen, ZHANG Tong, CHEN Bianna, et al. MIA-Net: Multi-modal interactive attention network for multi-modal affective analysis[J]. IEEE Transactions on Affective Computing, 2023, 14(4): 2796–2809. doi: 10.1109/TAFFC.2023.3259010.
|
[58] |
LIU Bin. Robust dynamic multi-modal data fusion: A model uncertainty perspective[J]. IEEE Signal Processing Letters, 2021, 28: 2107–2111. doi: 10.1109/LSP.2021.3117731.
|
[59] |
TIAN Pinzhuo, LI Wenbin, and GAO Yang. Consistent meta-regularization for better meta-knowledge in few-shot learning[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7277–7288. doi: 10.1109/TNNLS.2021.3084733.
|
[60] |
WANG Ruiqi, ZHANG Xuyao, and LIU Chenglin. Meta-prototypical learning for domain-agnostic few-shot recognition[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(11): 6990–6996. doi: 10.1109/TNNLS.2021.3083650.
|
[61] |
YAN Minghao. Adaptive learning knowledge networks for few-shot learning[J]. IEEE Access, 2019, 7: 119041–119051. doi: 10.1109/ACCESS.2019.2934694.
|
[62] |
WANG Kai. An overview of deep learning based small sample medical imaging classification[C]. 2021 International Conference on Signal Processing and Machine Learning (CONF-SPML), Stanford, USA, 2021: 278–281. doi: 10.1109/CONF-SPML54095.2021.00060.
|
[63] |
LIANG Yong, CHEN Zetao, LIN Daoqian, et al. Three-dimension attention mechanism and self-supervised pretext task for augmenting few-shot learning[J]. IEEE Access, 2023, 11: 59428–59437. doi: 10.1109/ACCESS.2023.3285721.
|