Joint Mask and Multi-Frequency Dual Attention GAN Network for CT-to-DWI Image Synthesis in Acute Ischemic Stroke
Abstract: AI-based cross-modality medical image synthesis offers a new route to rapid multimodal diagnosis and treatment of acute ischemic stroke. Existing medical image synthesis methods rely solely on the statistical features of the image data and ignore the anatomical structure of medical images, which causes blurred lesions and structural deviations. To address this, this paper proposes a novel Joint Mask and Multi-Frequency Dual Attention GAN (JMMDA-GAN) for CT-to-DWI image synthesis in acute ischemic stroke. The model comprises: (1) a mask-guided feature fusion module, which fuses CT images with mask images by convolution to introduce spatial priors of anatomical structure and strengthen the feature representation of brain and lesion regions; (2) a multi-frequency attention encoder, which uses the Discrete Wavelet Transform to separate low-frequency global features from high-frequency edge features and fuses them across scales through dual-path attention, reducing the loss of deep-layer information; (3) an adaptive fusion weighting module, which combines convolutional neural networks with attention to automatically learn an adaptive weight coefficient for each input feature. Experiments were conducted on clinical CT-to-DWI multimodal acute ischemic stroke datasets; performance was evaluated at the global scale with mean squared error, peak signal-to-noise ratio, and the structural similarity index, and at the local scale by correlating superpixel-wise gray-level means after superpixel segmentation. The results show that the proposed model outperforms current state-of-the-art methods on all metrics and reconstructs brain contours and lesion regions with higher accuracy and fidelity.
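The mask-guided fusion in (1) admits a compact realization. The following is a minimal PyTorch sketch, assuming the binary mask is concatenated with the CT slice before convolution; the module name, channel width, normalization, and the residual mask re-weighting are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class MaskGuidedFusion(nn.Module):
    """Fuse a CT slice with a brain/lesion mask to inject spatial priors."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        # CT and binary mask are concatenated on the channel axis, then
        # mixed by convolution so the anatomical prior modulates features.
        self.fuse = nn.Sequential(
            nn.Conv2d(2, out_channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, ct: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # ct, mask: (B, 1, H, W); mask is 1 inside brain/lesion, 0 elsewhere
        feat = self.fuse(torch.cat([ct, mask], dim=1))
        # Emphasize responses inside the masked region (illustrative choice)
        return feat * (1.0 + mask)

ct = torch.randn(1, 1, 256, 256)
mask = (torch.rand(1, 1, 256, 256) > 0.5).float()
print(MaskGuidedFusion()(ct, mask).shape)  # torch.Size([1, 64, 256, 256])
```

Concatenation keeps the prior explicit at the input, while the final re-weighting biases the network toward brain and lesion regions.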
Abstract:
Objective In the clinical management of Acute Ischemic Stroke (AIS), Computed Tomography (CT) and Diffusion-Weighted Imaging (DWI) serve complementary roles at different stages. CT is widely applied for initial evaluation due to its rapid acquisition and accessibility, but it has limited sensitivity in detecting early ischemic changes, which can result in diagnostic uncertainty. In contrast, DWI demonstrates high sensitivity to early ischemic lesions, enabling visualization of diffusion-restricted regions soon after symptom onset. However, DWI acquisition takes longer, is susceptible to motion artifacts, and depends on scanner availability and patient cooperation, which reduces its clinical accessibility. The limited availability of multimodal imaging data remains a major obstacle to timely and accurate AIS diagnosis. Therefore, a method capable of rapidly and accurately generating DWI images from CT scans has clear clinical significance for improving diagnostic precision and guiding treatment planning. Existing medical image translation approaches rely primarily on statistical image features and overlook anatomical structure, which leads to blurred lesion regions and reduced structural fidelity.

Methods This study proposes a Joint Mask and Multi-Frequency Dual Attention Generative Adversarial Network (JMMDA-GAN) for CT-to-DWI image synthesis to assist in the diagnosis and treatment of ischemic stroke. The approach incorporates anatomical priors from brain masks and adaptive multi-frequency feature fusion to improve translation accuracy. JMMDA-GAN comprises three principal modules: a mask-guided feature fusion module, a multi-frequency attention encoder, and an adaptive fusion weighting module. The mask-guided feature fusion module integrates CT images with anatomical masks through convolution, embedding spatial priors to enhance feature representation and texture detail within brain regions and ischemic lesions. The multi-frequency attention encoder applies the Discrete Wavelet Transform (DWT) to decompose images into low-frequency global components and high-frequency edge components; a dual-path attention mechanism then fuses features across scales, reducing high-frequency information loss and improving the reconstruction of structural detail. The adaptive fusion weighting module combines convolutional neural networks with attention mechanisms to dynamically learn the relative importance of input features. By assigning adaptive weights to multi-scale features, the module selectively enhances informative regions and suppresses redundant or noisy information, enabling effective integration of low- and high-frequency features and improving both global contextual consistency and local structural precision.
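To make the multi-frequency attention encoder and the adaptive fusion weighting concrete, the sketch below wires a one-level Haar DWT to a dual-path cross-attention and a learned softmax fusion weight. It is a hedged illustration only: the hand-rolled haar_dwt2, the channel width, the cross-attention layout, and the pooled weight predictor are assumptions, with four attention heads chosen to match the sensitivity analysis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def haar_dwt2(x: torch.Tensor):
    """One-level 2D Haar DWT of a (B, C, H, W) tensor (H, W even).
    Returns the low-frequency band LL and the three high-frequency bands
    (LH, HL, HH) stacked along the channel axis, at half resolution."""
    a = x[:, :, 0::2, 0::2]   # top-left pixel of each 2x2 block
    b = x[:, :, 0::2, 1::2]   # top-right
    c = x[:, :, 1::2, 0::2]   # bottom-left
    d = x[:, :, 1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 2  # low-frequency approximation
    lh = (a - b + c - d) / 2  # horizontal detail
    hl = (a + b - c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail
    return ll, torch.cat([lh, hl, hh], dim=1)

class DualFrequencyAttention(nn.Module):
    """Dual-path attention over DWT bands with adaptive fusion weights."""
    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        self.low_proj = nn.Conv2d(1, channels, kernel_size=1)
        self.high_proj = nn.Conv2d(3, channels, kernel_size=1)
        self.low_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.high_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Adaptive fusion weighting: pooled features predict two scalar
        # weights (softmax-normalized) for the low/high branches.
        self.weight_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ll, high = haar_dwt2(x)               # split frequency bands
        lo = self.low_proj(ll)
        hi = self.high_proj(high)
        B, C, H, W = lo.shape
        lo_t = lo.flatten(2).transpose(1, 2)  # (B, HW, C) tokens
        hi_t = hi.flatten(2).transpose(1, 2)
        # Cross-scale fusion: each path attends to the other's tokens
        lo_t, _ = self.low_attn(lo_t, hi_t, hi_t)
        hi_t, _ = self.high_attn(hi_t, lo_t, lo_t)
        lo = lo_t.transpose(1, 2).reshape(B, C, H, W)
        hi = hi_t.transpose(1, 2).reshape(B, C, H, W)
        w = F.softmax(self.weight_net(torch.cat([lo, hi], dim=1)), dim=1)
        return w[:, 0:1] * lo + w[:, 1:2] * hi  # weighted fusion

x = torch.randn(2, 1, 64, 64)             # two single-channel inputs
print(DualFrequencyAttention()(x).shape)  # torch.Size([2, 64, 32, 32])
```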
Results and Discussions Extensive experiments were performed on two independent clinical datasets collected from different hospitals. JMMDA-GAN achieved Mean Squared Error (MSE) values of 0.0097 and 0.0059 on Clinical Dataset 1 and Clinical Dataset 2, respectively, outperforming state-of-the-art models and reducing MSE by 35.8% and 35.2% relative to ARGAN, the strongest baseline. The proposed network achieved Peak Signal-to-Noise Ratio (PSNR) values of 26.75 dB and 28.12 dB, improvements of 30.7% and 7.9% over the best existing methods. For the Structural Similarity Index (SSIM), JMMDA-GAN achieved 0.753 and 0.844, indicating superior structural preservation and perceptual quality. Visual analysis further demonstrates that JMMDA-GAN restores lesion morphology and fine texture with higher fidelity, producing sharper lesion boundaries and better structural consistency than competing methods. Cross-center generalization and multi-center mixed experiments confirm that the model maintains stable performance across institutions, highlighting its robustness and adaptability in clinical settings. Parameter sensitivity analysis shows that the combination of the Haar wavelet and four attention heads achieves the best balance between global structural retention and local detail reconstruction. Moreover, superpixel-based gray-level correlation experiments demonstrate that JMMDA-GAN outperforms existing models in both local consistency and global image quality, confirming its capacity to generate realistic and diagnostically reliable DWI images from CT inputs.

Conclusions This study proposes a novel JMMDA-GAN designed to enhance lesion and texture detail generation by incorporating anatomical structural information. The method achieves this through three principal modules. (1) The mask-guided feature fusion module effectively integrates anatomical structure information, with particular optimization of the lesion region. The mask-guided network focuses on critical lesion features, ensuring accurate restoration of lesion morphology and boundaries. By combining mask and image data, the method preserves the overall anatomical structure while enhancing lesion areas, preventing the boundary blurring and texture loss common in traditional approaches and thereby improving diagnostic reliability. (2) The multi-frequency feature fusion module jointly optimizes low- and high-frequency features to enhance image detail, preserving global structural integrity while refining local features to produce visually realistic, high-fidelity images. (3) The adaptive fusion weighting module dynamically adjusts the learning strategy for frequency-domain features according to image content, enabling the network to handle texture variations and complex anatomical structures and thereby improving overall image quality. Through the coordinated operation of these modules, the proposed method enhances image realism and diagnostic precision. Experimental results demonstrate that JMMDA-GAN outperforms existing state-of-the-art models across multiple clinical datasets, highlighting its potential to support clinicians in the diagnosis and management of AIS.
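The superpixel-based gray-level correlation used for local-scale evaluation can be sketched as follows; the paper does not name the superpixel algorithm, so SLIC is assumed here, and the segment count and compactness are illustrative.

```python
import numpy as np
from skimage.segmentation import slic
from scipy.stats import pearsonr

def superpixel_gray_correlation(real: np.ndarray, fake: np.ndarray,
                                n_segments: int = 200) -> float:
    """real, fake: 2D arrays in [0, 1] with identical shapes."""
    # Segment the real DWI into superpixels and reuse the same label map
    # for the synthetic image so regions are compared one-to-one.
    labels = slic(real, n_segments=n_segments, compactness=0.1,
                  channel_axis=None)  # grayscale input
    real_means, fake_means = [], []
    for lab in np.unique(labels):
        region = labels == lab
        real_means.append(real[region].mean())
        fake_means.append(fake[region].mean())
    r, _ = pearsonr(real_means, fake_means)
    return r

rng = np.random.default_rng(0)
real = rng.random((256, 256))
fake = np.clip(real + 0.05 * rng.standard_normal(real.shape), 0.0, 1.0)
print(f"superpixel gray-level correlation: "
      f"{superpixel_gray_correlation(real, fake):.3f}")
```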
Table 1 The two clinical datasets used to evaluate the performance of the JMMDA-GAN model
Dataset | Male patients | Female patients | Mean age | Training images | Test images
Clinical Dataset 1 | 148 | 80 | 58 | 6222 | 1556
Clinical Dataset 2 | 365 | 270 | 64 | 6533 | 1634

Table 2 Comparison metrics on Clinical Dataset 1
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
HisGAN | 0.0186±0.0124* | 2×10^-163 | 22.84±2.52* | 7×10^-205 | 0.540±0.112* | 9×10^-255
ARGAN | 0.0151±0.0113* | 4×10^-145 | 24.02±3.20* | 6×10^-179 | 0.752±0.074* | 3×10^-5
MedGAN | 0.0187±0.0133* | 1×10^-193 | 23.14±3.41* | 1×10^-228 | 0.713±0.081* | 4×10^-125
MultiCycleGAN | 0.0240±0.0165* | 9×10^-217 | 21.74±2.50* | 9×10^-240 | 0.636±0.098* | 3×10^-247
ResCycleGAN | 0.0344±0.0270* | 5×10^-232 | 20.44±2.88* | 6×10^-245 | 0.596±0.106* | 4×10^-254
Proposed method | 0.0097±0.0114 | - | 26.75±4.32 | - | 0.753±0.101 | -
Note: * indicates a statistically significant difference between the proposed method and the compared method under the Wilcoxon signed-rank test.

Table 3 Comparison metrics on Clinical Dataset 2
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
HisGAN | 0.0119±0.0076* | 7×10^-223 | 24.69±2.34* | 6×10^-246 | 0.642±0.116* | 1×10^-268
ARGAN | 0.0091±0.0065* | 4×10^-166 | 26.06±2.79* | 2×10^-190 | 0.825±0.063* | 7×10^-159
MedGAN | 0.0112±0.0075* | 6×10^-217 | 25.12±2.81* | 2×10^-240 | 0.796±0.065* | 1×10^-225
MultiCycleGAN | 0.0131±0.0072* | 1×10^-243 | 24.12±2.10* | 6×10^-260 | 0.744±0.077* | 1×10^-263
ResCycleGAN | 0.0283±0.0199* | 1×10^-260 | 21.09±2.60* | 1×10^-265 | 0.660±0.087* | 1×10^-268
Proposed method | 0.0059±0.0053 | - | 28.12±2.85 | - | 0.844±0.072 | -
Note: * indicates a statistically significant difference between JMMDA-GAN and the compared method under the Wilcoxon signed-rank test.

Table 4 Cross-center generalization metrics (trained on Clinical Dataset 1, tested on Clinical Dataset 2)
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
HisGAN | 0.0337±0.0209* | 3×10^-137 | 20.24±2.57* | 3×10^-174 | 0.601±0.112* | 3×10^-106
ARGAN | 0.0296±0.0170* | 1×10^-181 | 20.66±2.25* | 7×10^-220 | 0.611±0.089* | 2×10^-218
MedGAN | 0.0313±0.0206* | 4×10^-146 | 20.54±2.45* | 5×10^-190 | 0.625±0.074* | 4×10^-158
MultiCycleGAN | 0.0462±0.0273* | 7×10^-256 | 18.82±2.44* | 3×10^-263 | 0.556±0.085* | 9×10^-263
ResCycleGAN | 0.0345±0.0240* | 1×10^-160 | 20.19±2.56* | 1×10^-193 | 0.618±0.079* | 5×10^-154
Proposed method | 0.0155±0.0154 | - | 24.17±3.12 | - | 0.678±0.073 | -

Table 5 Cross-center generalization metrics (trained on Clinical Dataset 2, tested on Clinical Dataset 1)
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
HisGAN | 0.0313±0.0211* | 1×10^-174 | 20.55±2.43* | 1×10^-205 | 0.480±0.115* | 4×10^-251
ARGAN | 0.0313±0.0224* | 3×10^-184 | 20.66±2.59* | 1×10^-215 | 0.615±0.090* | 2×10^-236
MedGAN | 0.0294±0.0211* | 1×10^-170 | 20.88±2.49* | 2×10^-205 | 0.620±0.084* | 6×10^-225
MultiCycleGAN | 0.0356±0.0242* | 2×10^-224 | 20.02±2.46* | 7×10^-240 | 0.578±0.092* | 5×10^-253
ResCycleGAN | 0.0394±0.0300* | 7×10^-210 | 19.82±2.88* | 1×10^-224 | 0.585±0.100* | 1×10^-238
Proposed method | 0.0161±0.0153 | - | 23.96±3.10 | - | 0.681±0.084 | -

Table 6 Comparison metrics on the multi-center mixed dataset
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
HisGAN | 0.0156±0.0116* | <0.001 | 23.68±2.59* | <0.001 | 0.592±0.125* | <0.001
ARGAN | 0.0121±0.0095* | 1×10^-302 | 24.95±2.97* | <0.001 | 0.786±0.075* | 3×10^-77
MedGAN | 0.0153±0.0111* | <0.001 | 23.91±3.09* | <0.001 | 0.748±0.079* | <0.001
MultiCycleGAN | 0.0255±0.0187* | <0.001 | 21.50±2.49* | <0.001 | 0.660±0.103* | <0.001
ResCycleGAN | 0.0321±0.0240* | <0.001 | 20.63±2.74* | <0.001 | 0.623±0.105* | <0.001
Proposed method | 0.0077±0.0080 | - | 27.22±3.34 | - | 0.794±0.098 | -
Note: p<0.001 indicates a p-value far below the lower limit of the statistical software's numerical precision (e.g., <1×10^-300).

Table 7 Ablation study metrics
Method | MSE↓ | MSE p-value | PSNR↑ | PSNR p-value | SSIM↑ | SSIM p-value
Pix2PixHD | 0.00969±0.0066* | 8×10^-191 | 25.78±3.00* | 1×10^-206 | 0.813±0.084* | 9×10^-239
Pix2PixHD+MGFF | 0.00681±0.0060* | 2×10^-55 | 27.41±2.78* | 6×10^-69 | 0.830±0.069* | 2×10^-168
Pix2PixHD+MFAB | 0.00809±0.0058* | 2×10^-137 | 26.62±3.02* | 1×10^-153 | 0.831±0.077* | 1×10^-143
Proposed method | 0.00585±0.0053 | - | 28.12±2.85 | - | 0.844±0.072 | -
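For reference, the global metrics and the Wilcoxon signed-rank significance test reported in Tables 2 through 7 can be reproduced with standard tooling. The sketch below computes per-slice MSE/PSNR/SSIM and pairs two methods over the same test slices; the random arrays merely stand in for real data.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from scipy.stats import wilcoxon

def slice_metrics(real: np.ndarray, fake: np.ndarray):
    """Global metrics for one slice; arrays scaled to [0, 1]."""
    mse = float(np.mean((real - fake) ** 2))
    psnr = peak_signal_noise_ratio(real, fake, data_range=1.0)
    ssim = structural_similarity(real, fake, data_range=1.0)
    return mse, psnr, ssim

# Synthetic stand-ins for paired test slices from two methods
rng = np.random.default_rng(1)
reals = rng.random((10, 64, 64))
ours = np.clip(reals + 0.03 * rng.standard_normal(reals.shape), 0.0, 1.0)
base = np.clip(reals + 0.08 * rng.standard_normal(reals.shape), 0.0, 1.0)

ours_mse = [slice_metrics(r, f)[0] for r, f in zip(reals, ours)]
base_mse = [slice_metrics(r, f)[0] for r, f in zip(reals, base)]
stat, p = wilcoxon(ours_mse, base_mse)  # paired non-parametric test
print(f"Wilcoxon signed-rank p-value (MSE, ours vs. baseline): {p:.2e}")
```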
[1] ZHANG Xuting, ZHONG Wansi, XUE Rui, et al. Argatroban in patients with acute ischemic stroke with early neurological deterioration: A randomized clinical trial[J]. JAMA Neurology, 2024, 81(2): 118–125. doi: 10.1001/jamaneurol.2023.5093.
[2] VANDE VYVERE T, PISICĂ D, WILMS G, et al. Imaging findings in acute traumatic brain injury: A National Institute of Neurological Disorders and Stroke common data element-based pictorial review and analysis of over 4000 admission brain computed tomography scans from the Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) study[J]. Journal of Neurotrauma, 2024, 41(19/20): 2248–2297. doi: 10.1089/neu.2023.0553.
[3] ELSHERIF S, LEGERE B, MOHAMED A, et al. Beyond conventional imaging: A systematic review and meta-analysis assessing the impact of computed tomography perfusion on ischemic stroke outcomes in the late window[J]. International Journal of Stroke, 2025, 20(3): 278–288. doi: 10.1177/17474930241292915.
[4] RAPILLO C M, DUNET V, PISTOCCHI S, et al. Moving from CT to MRI paradigm in acute ischemic stroke: Feasibility, effects on stroke diagnosis and long-term outcomes[J]. Stroke, 2024, 55(5): 1329–1338. doi: 10.1161/strokeaha.123.045154.
[5] GHEBREHIWET I, ZAKI N, DAMSEH R, et al. Revolutionizing personalized medicine with generative AI: A systematic review[J]. Artificial Intelligence Review, 2024, 57(5): 128. doi: 10.1007/s10462-024-10768-5.
[6] SHURRAB S, GUERRA-MANZANARES A, MAGID A, et al. Multimodal machine learning for stroke prognosis and diagnosis: A systematic review[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(11): 6958–6973. doi: 10.1109/jbhi.2024.3448238.
[7] ARMANIOUS K, JIANG Chenming, FISCHER M, et al. MedGAN: Medical image translation using GANs[J]. Computerized Medical Imaging and Graphics, 2020, 79: 101684. doi: 10.1016/j.compmedimag.2019.101684.
[8] EKANAYAKE M, PAWAR K, HARANDI M, et al. McSTRA: A multi-branch cascaded swin transformer for point spread function-guided robust MRI reconstruction[J]. Computers in Biology and Medicine, 2024, 168: 107775. doi: 10.1016/j.compbiomed.2023.107775.
[9] DALMAZ O, YURT M, and ÇUKUR T. ResViT: Residual vision transformers for multimodal medical image synthesis[J]. IEEE Transactions on Medical Imaging, 2022, 41(10): 2598–2614. doi: 10.1109/tmi.2022.3167808.
[10] ÖZBEY M, DALMAZ O, DAR S U H, et al. Unsupervised medical image translation with adversarial diffusion models[J]. IEEE Transactions on Medical Imaging, 2023, 42(12): 3524–3539. doi: 10.1109/tmi.2023.3290149.
[11] LUO Yu, ZHANG Shaowei, LING Jie, et al. Mask-guided generative adversarial network for MRI-based CT synthesis[J]. Knowledge-Based Systems, 2024, 295: 111799. doi: 10.1016/j.knosys.2024.111799.
[12] YANG Linlin, SHANGGUAN Hong, ZHANG Xiong, et al. High-frequency sensitive generative adversarial network for low-dose CT image denoising[J]. IEEE Access, 2020, 8: 930–943. doi: 10.1109/access.2019.2961983.
[13] HUTCHINSON E B, AVRAM A V, IRFANOGLU M O, et al. Analysis of the effects of noise, DWI sampling, and value of assumed parameters in diffusion MRI models[J]. Magnetic Resonance in Medicine, 2017, 78(5): 1767–1780. doi: 10.1002/mrm.26575.
[14] DAS S and KUNDU M K. NSCT-based multimodal medical image fusion using pulse-coupled neural network and modified spatial frequency[J]. Medical & Biological Engineering & Computing, 2012, 50(10): 1105–1114. doi: 10.1007/s11517-012-0943-3.
[15] ZHOU Tao, LIU Yuncan, LU Huiling, et al. ResNet and its application to medical image processing: Research progress and challenges[J]. Journal of Electronics & Information Technology, 2022, 44(1): 149–167. doi: 10.11999/JEIT210914.
[16] BARRON J T. A general and adaptive robust loss function[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 4326–4334. doi: 10.1109/CVPR.2019.00446.
[17] LUO Jialin, DAI Peishan, HE Zhuang, et al. Deep learning models for ischemic stroke lesion segmentation in medical images: A survey[J]. Computers in Biology and Medicine, 2024, 175: 108509. doi: 10.1016/j.compbiomed.2024.108509.
[18] WANG Tingchun, LIU Mingyu, ZHU Junyan, et al. High-resolution image synthesis and semantic manipulation with conditional GANs[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 8798–8807. doi: 10.1109/CVPR.2018.00917.
[19] LIU Rui, DING Xiaoxi, SHAO Yimin, et al. An interpretable multiplication-convolution residual network for equipment fault diagnosis via time–frequency filtering[J]. Advanced Engineering Informatics, 2024, 60: 102421. doi: 10.1016/j.aei.2024.102421.
[20] LI Yihao, EL HABIB DAHO M, CONZE P H, et al. A review of deep learning-based information fusion techniques for multimodal medical image classification[J]. Computers in Biology and Medicine, 2024, 177: 108635. doi: 10.1016/j.compbiomed.2024.108635.
[21] PENG Yanjun, SUN Jindong, REN Yande, et al. A histogram-driven generative adversarial network for brain MRI to CT synthesis[J]. Knowledge-Based Systems, 2023, 277: 110802. doi: 10.1016/j.knosys.2023.110802.
[22] LIU Yanxia, CHEN Anni, SHI Hongyu, et al. CT synthesis from MRI using multi-cycle GAN for head-and-neck radiation therapy[J]. Computerized Medical Imaging and Graphics, 2021, 91: 101953. doi: 10.1016/j.compmedimag.2021.101953.
[23] DAI Xianjin, LEI Yang, LIU Yingzi, et al. Intensity non-uniformity correction in MR imaging using residual cycle generative adversarial network[J]. Physics in Medicine & Biology, 2020, 65(21): 215025. doi: 10.1088/1361-6560/abb31f.
[24] DING Bin, LONG Chengjiang, ZHANG Ling, et al. ARGAN: Attentive recurrent generative adversarial network for shadow detection and removal[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, South Korea, 2019: 10212–10221. doi: 10.1109/ICCV.2019.01031.