Citation: | ZHAO Hong, LI Wengai. Text-to-image Generation Model Based on Diffusion Wasserstein Generative Adversarial Networks[J]. Journal of Electronics & Information Technology, 2023, 45(12): 4371-4381. doi: 10.11999/JEIT221400 |
[1] |
GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139–144. doi: 10.1145/3422622
|
[2] |
李云红, 朱绵云, 任劼, 等. 改进深度卷积生成式对抗网络的文本生成图像[J/OL]. 北京航空航天大学学报. http://kns.cnki.net/kcms/detail/11.2625.V.20220207.1115.002.html, 2022.
LI Yunhong, ZHU Mianyun, REN Jie, et al. Text-to-image synthesis based on modified deep convolutional generative adversarial network[J/OL]. Journal of Beijing University of Aeronautics and Astronautics. http://kns.cnki.net/kcms/detail/11.2625.V.20220207.1115.002.html, 2022.
|
[3] |
谈馨悦, 何小海, 王正勇, 等. 基于Transformer交叉注意力的文本生成图像技术[J]. 计算机科学, 2021, 49(2): 107–115. doi: 10.11896/jsjkx.210600085
TAN Xinyue, HE Xiaohai, WANG Zhengyong, et al. Text-to-image generation technology based on transformer cross attention[J]. Computer Science, 2021, 49(2): 107–115. doi: 10.11896/jsjkx.210600085
|
[4] |
赵雅琴, 孙蕊蕊, 吴龙文, 等. 基于改进深度生成对抗网络的心电信号重构算法[J]. 电子与信息学报, 2022, 44(1): 59–69. doi: 10.11999/JEIT210922
ZHAO Yaqin, SUN Ruirui, WU Longwen, et al. ECG reconstruction based on improved deep convolutional generative adversarial networks[J]. Journal of Electronics &Information Technology, 2022, 44(1): 59–69. doi: 10.11999/JEIT210922
|
[5] |
ARJOVSKY M and BOTTOU L. Towards principled methods for training generative adversarial networks[C]. The 5th International Conference on Learning Representations. Toulon, France, 2017.
|
[6] |
JALAYER M, JALAYER R, KABOLI A, et al. Automatic visual inspection of rare defects: A framework based on gp-wgan and enhanced faster R-CNN[C]. 2021 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, Bandung, Indonesia, 2021: 221–227.
|
[7] |
WANG Zhendong, ZHENG Huangjie, HE Pengcheng, et al. Diffusion-GAN: Training GANs with diffusion[J]. arXiv: 2206.02262, 2022.
|
[8] |
RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]. The 38th International Conference on Machine Learning, Westminster, UK, 2021: 8748–8763.
|
[9] |
HO J, JAIN A, and ABBEEL P. Denoising diffusion probabilistic models[C]. The 34th Conference on Neural Information Processing Systems, Vancouver, Canada, 2020: 6840–6851.
|
[10] |
CHONG Minjin and FORSYTH D. Effectively unbiased FID and inception score and where to find them[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA 2020: 6069–6078.
|
[11] |
KYNKÄÄNNIEMI T, KARRAS T, LAINE S, et al. Improved precision and recall metric for assessing generative models[C]. The 33rd Conference on Neural Information Processing Systems, Vancouver, Canada, 2019: 32.
|
[12] |
ZHANG Han, XU Tao, LI Hongsheng, et al. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 5908–5916.
|
[13] |
SOUZA D M, WEHRMANN J, and RUIZ D D. Efficient neural architecture for text-to-image synthesis[C]. 2020 International Joint Conference on Neural Networks, Glasgow, UK, 2020: 1–8.
|
[14] |
XU Tao, ZHANG Pengchuan, HUANG Qiuyuan, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1316–1324.
|
[15] |
ZHU Minfeng, PAN Pingbo, CHEN Wei, et al. Dm-GAN: Dynamic memory generative adversarial networks for text-to-image synthesis[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 5795–5803.
|
[16] |
TAO Ming, TANG Hao, WU Fei, et al. DF-GAN: A simple and effective baseline for text-to-image synthesis[J]. arXiv: 2008.05865, 2020.
|
[17] |
RAMESH A, PAVLOV M, GOH G, et al. Zero-shot text-to-image generation[C]. The 38th International Conference on Machine Learning, Westminster, UK, 2021: 8821–8831.
|
[18] |
CHENG Qingrong, WEN Keyu, and GU Xiaodong. Vision-language matching for text-to-image synthesis via generative adversarial networks[J]. IEEE Transactions on Multimedia, To be published.
|
[19] |
LIAO Wentong, HU Kai, YANG M Y, et al. Text to image generation with semantic-spatial aware GAN[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 18166–18175.
|
[20] |
BAIOLETTI M, DI BARI G, POGGIONI V, et al. Smart multi-objective evolutionary GAN[C]. 2021 IEEE Congress on Evolutionary Computation, Kraków, Poland, 2021: 2218–2225.
|
[21] |
YIN Xusen and MAY J. Comprehensible context-driven text game playing[C]. 2019 IEEE Conference on Games, London, UK, 2019: 1–8.
|