ZHAO Chuanbin, XU Weihua, LIN Bo, ZHANG Tengyu, FENG Yuan, GAO Feifei. Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250685

Vision Enabled Multimodal Integrated Sensing and Communications: Key Technologies and Prototype Validation

doi: 10.11999/JEIT250685 cstr: 32379.14.JEIT250685
  • Received Date: 2025-07-21
  • Accepted Date: 2025-11-05
  • Revision Received Date: 2025-11-05
  • Available Online: 2025-11-14
  • Objective: Integrated Sensing and Communications (ISAC) has become a key enabling technology for sixth-generation (6G) mobile communication networks. It senses and monitors the physical world while communicating with users, thereby supporting emerging application scenarios such as the low-altitude economy, digital twins, and vehicular networking. Existing ISAC research focuses mainly on wireless devices such as base stations and terminals. Meanwhile, visual sensing, long an active research topic in computer science, offers advantages such as strong visibility and rich detail. This paper proposes combining visual sensing with wireless-device sensing to construct a new multimodal ISAC system, in which vision senses the environment to assist wireless communications, and wireless signals in turn help overcome the limitations of visual sensing.
  • Methods: This paper first explores the inherent correlation between environmental vision and wireless communications. It then discusses key algorithms for visual-sensing-assisted wireless communications, including beam prediction, occlusion prediction, and resource scheduling and allocation for multiple base stations and users. These schemes confirm that visual sensing, used as prior information, can enhance the communication performance of a multimodal ISAC system. Next, the paper discusses the new sensing gains obtained by combining wireless devices with visual sensors. Specifically, static environment reconstruction and dynamic target sensing schemes based on the fusion of wireless signals and vision are proposed, with the aim of obtaining global information about the physical world. In addition, a "vision communication" simulation and measurement dataset is constructed, forming a complete theoretical and technical framework for multimodal ISAC.
  • Results and Discussions: For visual-sensing-assisted wireless communications, the hardware prototype system is shown in Figure 6 and Figure 7, and the hardware test results are given in Table 1. The results show that visual sensing helps millimeter-wave communications perform tasks such as beam alignment and beam prediction more effectively, thereby improving communication performance. For wireless-communication-assisted sensing, the hardware prototype system is shown in Figure 8, and the experimental results are presented in Figure 10 and Table 2. Static environment reconstruction that combines wireless signals and visual sensors is more robust and more accurate. Depth estimation based on the fusion of vision and communications remains robust in rainy and snowy weather, with the root-mean-square error (RMSE) reduced by about 50% compared with vision-only algorithms. These experimental results indicate that vision-enabled multimodal ISAC systems have great application potential.
  • Conclusions: This paper proposes combining visual sensing with wireless-device sensing to construct a new multimodal ISAC system, in which vision senses the environment to assist wireless communications, and wireless signals help overcome the limitations of visual sensing. Key algorithms for visual-sensing-assisted wireless communications are discussed, including beam prediction, occlusion prediction, and resource scheduling and allocation for multiple base stations and users. The new sensing gains obtained by combining wireless devices with visual sensors are also discussed: static environment reconstruction and dynamic target sensing schemes based on the fusion of wireless signals and vision are proposed, with the aim of obtaining global information about the physical world. In addition, a "vision communication" simulation and measurement dataset is constructed, forming a complete theoretical and technical framework for multimodal ISAC. The experimental results indicate that vision-enabled multimodal ISAC systems have great application potential in 6G networks.
    Figures(9)  / Tables(2)
