ZHAO Chuanbin, XU Weihua, LIN Bo, ZHANG Tengyu, FENG Yuan, GAO Feifei. Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250685

Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation

doi: 10.11999/JEIT250685 cstr: 32379.14.JEIT250685
  • Received Date: 2025-07-21
  • Accepted Date: 2025-11-05
  • Rev Recd Date: 2025-10-26
  • Available Online: 2025-11-14
Objective: Integrated Sensing And Communications (ISAC) is regarded as a key enabling technology for Sixth-Generation mobile communications (6G), as it simultaneously senses and monitors information in the physical world while maintaining communication with users. The technology supports emerging scenarios such as low-altitude economy, digital twin systems, and vehicle networking. Current ISAC research primarily concentrates on wireless devices that include base stations and terminals. Visual sensing, which provides strong visibility and detailed environmental information, has long been a major research direction in computer science. This study proposes the integration of visual sensing with wireless-device sensing to construct a multimodal ISAC system. In this system, visual sensing captures environmental information to assist wireless communications, and wireless signals help overcome limitations inherent to visual sensing.

Methods: The study first explores the correlation mechanism between environmental vision and wireless communications. Key algorithms for visual-sensing-assisted wireless communication are then discussed, including beam prediction, occlusion prediction, and resource scheduling and allocation methods for multiple base stations and users. These schemes demonstrate that visual sensing, used as prior information, enhances the communication performance of the multimodal ISAC system. The sensing gains provided by wireless devices combined with visual sensors are subsequently explored. A static-environment reconstruction scheme and a dynamic-target sensing scheme based on wireless–visual fusion are proposed to obtain global information about the physical world. In addition, a “vision–communication” simulation and measurement dataset is constructed, establishing a complete theoretical and technical framework for multimodal ISAC.

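To make the idea of vision-assisted beam prediction concrete, the following is a minimal sketch and not the algorithm used in the paper, which the abstract does not detail. It assumes a camera has already produced a 2-D user position, a base station with a half-wavelength uniform linear array laid along the x axis, and a DFT-style codebook; the function names `dft_codebook`, `steering_vector`, and `predict_beam` are hypothetical. A simple geometric rule stands in for whatever learned predictor the authors use.

```python
import numpy as np


def dft_codebook(num_antennas: int, num_beams: int) -> np.ndarray:
    """Return a (num_beams, num_antennas) codebook for a half-wavelength ULA."""
    # Beam directions are spaced uniformly in sin(angle) over [-1, 1).
    spatial_freqs = -1.0 + 2.0 * np.arange(num_beams) / num_beams
    n = np.arange(num_antennas)
    return np.exp(1j * np.pi * np.outer(spatial_freqs, n)) / np.sqrt(num_antennas)


def steering_vector(num_antennas: int, angle_rad: float) -> np.ndarray:
    """Array response of a half-wavelength ULA toward angle_rad from broadside."""
    n = np.arange(num_antennas)
    return np.exp(1j * np.pi * n * np.sin(angle_rad))


def predict_beam(user_xy, bs_xy, codebook):
    """Pick the codebook beam with the largest gain toward the user.

    user_xy is assumed to come from a visual detector (hypothetical input);
    the array lies along the x axis, so broadside is the +y direction.
    """
    dx = user_xy[0] - bs_xy[0]
    dy = user_xy[1] - bs_xy[1]
    angle = np.arctan2(dx, dy)  # angle measured from broadside (+y axis)
    response = steering_vector(codebook.shape[1], angle)
    gains = np.abs(codebook.conj() @ response) ** 2
    return int(np.argmax(gains)), float(angle)


if __name__ == "__main__":
    codebook = dft_codebook(num_antennas=16, num_beams=16)
    beam_index, angle = predict_beam((10.0, 10.0), (0.0, 0.0), codebook)
    print(f"user angle {np.degrees(angle):.1f} deg, selected beam {beam_index}")
```

In this toy setting the visual position replaces an exhaustive beam sweep: only one beam is evaluated over the air instead of all 16, which is the kind of overhead reduction that visual prior information is meant to provide.
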
Results and Discussions: For visual-sensing-assisted wireless communications, the hardware prototype system constructed in this study is shown in Fig. 6 and Fig. 7, and the corresponding hardware test results are presented in Table 1. The results show that visual sensing assists millimetre-wave communications in performing beam alignment and beam prediction more effectively, thereby improving system communication performance. For wireless-communication-assisted sensing, the hardware prototype system is shown in Fig. 8, and the experimental results are shown in Fig. 9 and Table 2. The static-environment reconstruction obtained through wireless–visual fusion shows improved robustness and higher accuracy. Depth estimation based on visual and communication fusion also presents strong robustness in rainy and snowy weather, with the RMSE reduced by approximately 50% compared with pure visual algorithms. These experimental results indicate that vision-enabled multimodal ISAC systems present strong potential for practical application.

Conclusions: A multimodal ISAC system that integrates visual sensing with wireless-device sensing is proposed. In this system, visual sensing captures environmental information to assist wireless communications, and wireless signals help overcome the inherent limitations of visual sensing. Key algorithms for visual-sensing-assisted wireless communication are examined, including beam prediction, occlusion prediction, and resource scheduling and allocation for multiple base stations and users. The sensing gains brought by wireless devices combined with visual sensors are also analysed. Static-environment reconstruction and dynamic-target sensing schemes based on wireless–visual fusion are proposed to obtain global information about the physical world. A “vision–communication” simulation and measurement dataset is further constructed, forming a coherent theoretical and technical framework for multimodal ISAC. Experimental results show that vision-enabled multimodal ISAC systems present strong potential for use in 6G networks.
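
For readers unfamiliar with the depth-estimation metric quoted above, the sketch below shows how a root-mean-square error and its relative reduction are computed. The depth values are synthetic toy numbers chosen for illustration only; they are not measurements from the paper, and the helper names `rmse` and `relative_reduction` are hypothetical.

```python
import numpy as np


def rmse(predicted, reference) -> float:
    """Root-mean-square error between predicted and reference depths."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


def relative_reduction(baseline_rmse: float, fused_rmse: float) -> float:
    """Fraction by which the fused estimate lowers the baseline RMSE."""
    return (baseline_rmse - fused_rmse) / baseline_rmse


# Synthetic toy depths in metres (illustrative assumption, not paper data).
reference_depth = [10.0, 20.0, 30.0, 40.0]
vision_only_depth = [12.0, 18.0, 32.0, 38.0]  # every sample is off by 2 m
fused_depth = [11.0, 19.0, 31.0, 39.0]        # every sample is off by 1 m

baseline = rmse(vision_only_depth, reference_depth)
fused = rmse(fused_depth, reference_depth)
reduction = relative_reduction(baseline, fused)

if __name__ == "__main__":
    print(f"vision-only RMSE {baseline:.2f} m, fused RMSE {fused:.2f} m, "
          f"reduction {reduction:.0%}")
```

A reduction of about 50%, as reported for the fusion scheme, therefore means the fused depth errors are on average roughly half as large as those of the vision-only baseline under this metric.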

    Figures(9)  / Tables(2)
