Citation: LI Bing, HU Weijie, LIU Xia. Research on Segmentation Algorithm of Oral and Maxillofacial Panoramic X-ray Images under Dual-domain Multiscale State Space Network[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250639

Research on Segmentation Algorithm of Oral and Maxillofacial Panoramic X-ray Images under Dual-domain Multiscale State Space Network

doi: 10.11999/JEIT250639 cstr: 32379.14.JEIT250639
Funds:  The National Natural Science Foundation of China (61172167), Heilongjiang Provincial Natural Science Foundation Project (LH00F035)
  • Accepted Date: 2025-12-22
  • Revision Received Date: 2025-12-22
  • Available Online: 2025-12-29
Objective: Panoramic X-ray images of the oral and maxillofacial region show large morphological variation, blurred boundaries between teeth and gums, and overlapping gray levels in periodontal tissues. To address these problems, this paper introduces a state space model based on Mamba, a recent neural network architecture. The model retains the strength of Convolutional Neural Networks (CNNs) in extracting local image features while avoiding the high computational complexity of Transformers. On this basis, the study proposes a segmentation algorithm for oral and maxillofacial panoramic X-ray images built on a dual-domain multiscale state space network (DMSS-Net), with the aim of improving both segmentation accuracy and model efficiency.

Methods: The model adopts an encoder-decoder architecture. It comprises a dual-branch encoder that acquires global and local information, a decoder that restores features, and skip connections that pass the fused feature maps of the encoding path to the decoder. During decoding, the fused features gradually recover resolution and reduce their channel count through deconvolution combined with sampling modules, and the network finally outputs a 2-channel segmentation map.

Results and Discussions: Ablation experiments validate the contribution of each module to model performance, as shown in Table 1. In segmentation accuracy, the Dice score increased by 5.69 percentage points to 93.86%, the HD95 value decreased by 2.97 mm to 18.73 mm, and accuracy reached 94.57%. In model efficiency, the model has a size of only 81.23 MB with 90.1 million parameters, which is significantly lower than that of the baseline model, so accuracy improvement and model compression are achieved together. When compared with seven mainstream medical image segmentation models under identical experimental conditions, as shown in Table 2, DMSS-Net achieves higher segmentation accuracy while keeping its model size comparable to or smaller than that of similarly scaled Transformer models.

Conclusions: This paper proposes a segmentation algorithm for oral and maxillofacial panoramic X-ray images based on a dual-domain multiscale state space network. The algorithm centers on a dual-domain fusion network framework, which significantly enhances the model's ability to capture long-range dependencies in dental images and enables precise segmentation of blurred boundaries. The spatial-domain design helps capture long-range context despite the varying morphology of the dental arch. In addition, the feature-domain enhancement mechanism improves the detection of low-contrast features and increases robustness against interference.
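To make the reported Dice score concrete, the following is a minimal sketch of how a 2-channel segmentation map can be turned into a binary mask and scored against a ground-truth mask. It is not the authors' code: the function names `to_binary_mask` and `dice_score` are hypothetical, and the channel order (channel 0 as background, channel 1 as tooth) is an assumption that the abstract does not state. The Dice coefficient itself is the standard definition, 2|P ∩ T| / (|P| + |T|).

```python
from typing import List, Sequence

Mask = List[List[int]]


def to_binary_mask(two_channel_map: Sequence[Sequence[Sequence[float]]]) -> Mask:
    """Convert a 2-channel score map, indexed [channel][row][col], to a binary mask.

    Assumption (not from the paper): channel 0 is background, channel 1 is tooth.
    A pixel is labeled 1 when the tooth score is strictly larger.
    """
    background, foreground = two_channel_map
    return [
        [1 if f > b else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, foreground)
    ]


def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Standard Dice coefficient between two binary masks of equal shape."""
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")
    intersection = pred_sum = target_sum = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    if pred_sum + target_sum == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy 2x3 example with made-up scores (illustration only, not data from the paper).
scores = [
    [[0.9, 0.2, 0.3], [0.8, 0.1, 0.6]],  # channel 0: background scores
    [[0.1, 0.8, 0.7], [0.2, 0.9, 0.4]],  # channel 1: tooth scores
]
ground_truth = [[0, 1, 0], [0, 1, 1]]

predicted = to_binary_mask(scores)
dice = dice_score(predicted, ground_truth)
```

In this toy case the predicted mask is `[[0, 1, 1], [0, 1, 0]]`, which shares two foreground pixels with the ground truth, so the Dice value is 2·2 / (3 + 3) = 2/3. A reported score such as 93.86% would normally be the average of such values over a test set, although the abstract does not describe the averaging scheme.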