JIA Huizhen, ZHAO Yuxuan, FU Peng, WANG Tonghan. PSAQNet: A Perceptual Structure Adaptive Quality Network for Authentic Distortion Oriented No-reference Image Quality Assessment[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251220

PSAQNet: A Perceptual Structure Adaptive Quality Network for Authentic Distortion Oriented No-reference Image Quality Assessment

doi: 10.11999/JEIT251220 cstr: 32379.14.JEIT251220
Funds:  The National Natural Science Foundation of China (62266001, 62261001)
  • Received Date: 2025-11-19
  • Accepted Date: 2026-01-30
  • Rev Recd Date: 2026-01-30
  • Available Online: 2026-02-12
  •   Objective  No-Reference Image Quality Assessment (NR-IQA) is critical for practical imaging systems in which pristine reference images are unavailable. However, many existing methods face three major challenges: limited robustness under complex distortions; weak generalization when distortion distributions shift (e.g., from synthetic to real-world settings); and insufficient modeling of geometric or structural degradations such as spatially varying blur, misalignment, and texture–structure coupling. These limitations cause models to rely excessively on dataset-specific statistics and reduce their effectiveness on diverse scenes with mixed degradations. To address these issues, the Perceptual Structure Adaptive Quality Network (PSAQNet) is proposed to improve the accuracy and adaptability of NR-IQA under complex distortion conditions.  Methods  PSAQNet is a unified CNN–Transformer framework that preserves hierarchical perceptual cues and supports global context reasoning. Instead of relying on late-stage pooling, distortion evidence is progressively enhanced throughout the network. The architecture contains several key components. The Advanced Distortion Enhanced Module (ADEM) operates on multi-scale features extracted from a pre-trained backbone. It adopts multi-branch gating and a distortion-aware adapter to emphasize degradation-related signals and reduce interference from dominant image content. This mechanism dynamically selects the feature branches that correspond to perceptual degradation patterns, which benefits spatially non-uniform or mixed distortions. To model geometric degradations, PSAQNet integrates Spatial-Guided Convolution (SGC) and Channel-Aware Adaptive Kernel convolution (CA_AK). SGC improves spatial sensitivity by guiding convolutional responses with structure-aware cues, focusing on regions where geometric distortions are prominent. CA_AK further improves geometric modeling by adaptively adjusting the receptive field and recalibrating channels to preserve distortion-sensitive components. In addition, PSAQNet incorporates efficient feature-fusion strategies: the Group Convolutional Block Attention Module (GroupCBAM) enables lightweight attention-based fusion of multi-level CNN features, whereas AttInjector selectively injects local distortion cues into global Transformer representations. This design allows global semantic reasoning to be guided by localized degradation evidence without introducing redundancy or instability.  Results and Discussions  Extensive experiments on six benchmark datasets containing both synthetic and real-world distortions show that PSAQNet achieves strong performance and stable agreement with human subjective judgments. The proposed method outperforms several recent approaches, particularly on real-world distortion datasets. These results indicate that PSAQNet effectively enhances distortion evidence, models geometric degradation, and integrates local distortion cues with global semantic representations; these capabilities improve robustness under distribution shifts and reduce reliance on narrow distortion priors. Ablation studies confirm the contribution of each module: ADEM increases distortion saliency, SGC and CA_AK improve sensitivity to geometric degradations, and GroupCBAM and AttInjector strengthen the interaction between local and global features. Cross-dataset evaluations further demonstrate the generalization of PSAQNet across different content categories and distortion types, and scalability experiments show that the framework benefits from stronger pre-trained backbones without compromising its modular design.  Conclusions  PSAQNet addresses several key limitations of NR-IQA by integrating local distortion enhancement, geometric-aware feature modeling, and global semantic fusion within a unified framework. The modular architecture improves robustness and generalization across diverse distortion conditions and supports practical deployment in real-world scenarios. Future work will explore vision–language pre-training to improve cross-scene adaptability.
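The multi-branch gating idea behind ADEM can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the branch transforms, the gate computation, and all dimensions are assumptions chosen only to show how gates reweight branch responses so that degradation-related evidence is emphasized.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax used to normalize the branch gates.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_branch_gating(feat, branch_weights, gate_weights):
    """Weight several feature branches by learned gates.

    feat:            (C,) pooled feature vector from one backbone stage
    branch_weights:  list of (C, C) matrices, one per branch (illustrative)
    gate_weights:    (num_branches, C) matrix producing one logit per branch
    """
    branches = np.stack([W @ feat for W in branch_weights])   # (B, C)
    gates = softmax(gate_weights @ feat)                      # (B,)
    # Emphasize branches whose responses carry distortion-related
    # evidence; suppress the rest via the normalized gates.
    return (gates[:, None] * branches).sum(axis=0)            # (C,)

rng = np.random.default_rng(0)
C, B = 8, 3
out = multi_branch_gating(rng.normal(size=C),
                          [rng.normal(size=(C, C)) for _ in range(B)],
                          rng.normal(size=(B, C)))
print(out.shape)  # (8,)
```

In the real module the branches would be convolutional paths and the gates would be learned end-to-end; the sketch only captures the selection mechanism.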
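The grouped attention-based fusion attributed to GroupCBAM can likewise be sketched in the style of CBAM's channel attention, applied per channel group. The group count and the use of an identity mapping in place of CBAM's shared MLP are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grouped_channel_attention(x, groups=2):
    """CBAM-style channel attention applied independently per channel group.

    x: (C, H, W) feature map; C must be divisible by `groups`.
    Returns the reweighted feature map of the same shape.
    """
    C, H, W = x.shape
    g = x.reshape(groups, C // groups, H, W)
    # Average- and max-pooled channel descriptors, as in CBAM,
    # combined and squashed into per-channel weights.
    avg = g.mean(axis=(2, 3))
    mx = g.max(axis=(2, 3))
    attn = sigmoid(avg + mx)                      # (groups, C // groups)
    return (g * attn[..., None, None]).reshape(C, H, W)

x = np.random.default_rng(1).normal(size=(4, 5, 5))
y = grouped_channel_attention(x, groups=2)
print(y.shape)  # (4, 5, 5)
```

Grouping keeps the attention lightweight because each group's weights are computed from its own channels only, which matches the abstract's goal of fusing multi-level CNN features without redundancy.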
  • [1]
    韩玉兰, 崔玉杰, 罗轶宏, 等. 基于密集残差和质量评估引导的频率分离生成对抗超分辨率重构网络[J]. 电子与信息学报, 2024, 46(12): 4563–4574. doi: 10.11999/JEIT240388.

    HAN Yulan, CUI Yujie, LUO Yihong, et al. Frequency separation generative adversarial super-resolution reconstruction network based on dense residual and quality assessment[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4563–4574. doi: 10.11999/JEIT240388.
    [2]
    柏园超, 刘文昌, 江俊君, 等. 深度神经网络图像压缩方法进展综述[J]. 电子与信息学报, 2025, 47(11): 4112–4128. doi: 10.11999/JEIT250567.

    BAI Yuanchao, LIU Wenchang, JIANG Junjun, et al. Advances in deep neural network based image compression: A survey[J]. Journal of Electronics & Information Technology, 2025, 47(11): 4112–4128. doi: 10.11999/JEIT250567.
    [3]
    WANG Zhou, BOVIK A C, SHEIKH H R, et al. Image quality assessment: From error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600–612. doi: 10.1109/TIP.2003.819861.
    [4]
    WANG Zhou, SIMONCELLI E P, BOVIK A C. Multiscale structural similarity for image quality assessment[C]. The 37th Asilomar Conference on Signals, Systems & Computers, Pacific Grove, USA, 2003: 1398–1402. doi: 10.1109/ACSSC.2003.1292216.
    [5]
    YANG Jie, LYU Mengjin, QI Zhiquan, et al. Deep learning based image quality assessment: A survey[J]. Procedia Computer Science, 2023, 221: 1000–1005. doi: 10.1016/j.procs.2023.08.080.
    [6]
    MOORTHY A K and BOVIK A C. Blind image quality assessment: From natural scene statistics to perceptual quality[J]. IEEE Transactions on Image Processing, 2011, 20(12): 3350–3364. doi: 10.1109/TIP.2011.2147325.
    [7]
    MITTAL A, MOORTHY A K, and BOVIK A C. No-reference image quality assessment in the spatial domain[J]. IEEE Transactions on Image Processing, 2012, 21(12): 4695–4708. doi: 10.1109/TIP.2012.2214050.
    [8]
    BOSSE S, MANIRY D, MÜLLER K R, et al. Deep neural networks for no-reference and full-reference image quality assessment[J]. IEEE Transactions on Image Processing, 2018, 27(1): 206–219. doi: 10.1109/TIP.2017.2760518.
    [9]
    ZHANG Lin, ZHANG Lei, and BOVIK A C. A feature-enriched completely blind image quality evaluator[J]. IEEE Transactions on Image Processing, 2015, 24(8): 2579–2591. doi: 10.1109/TIP.2015.2426416.
    [10]
    ZHANG Weixia, MA Kede, YAN Jia, et al. Blind image quality assessment using a deep bilinear convolutional neural network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(1): 36–47. doi: 10.1109/TCSVT.2018.2886771.
    [11]
    KE Junjie, WANG Qifei, WANG Yilin, et al. MUSIQ: Multi-scale image quality transformer[C]. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021: 5128–5137. doi: 10.1109/ICCV48922.2021.00510.
    [12]
    CHEON M, YOON S J, KANG B, et al. Perceptual image quality assessment with transformers[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, USA, 2021: 433–442. doi: 10.1109/CVPRW53098.2021.00054.
    [13]
    CHEN Zewen, QIN Haina, WANG Juan, et al. PromptIQA: Boosting the performance and generalization for no-reference image quality assessment via prompts[C]. The 18th European Conference on Computer Vision, Milan, Italy, 2024: 247–264. doi: 10.1007/978-3-031-73232-4_14.
    [14]
    ZHANG Bo, WANG Luoxi, ZHANG Cheng, et al. No-reference image quality assessment based on improved vision transformer and transfer learning[J]. Signal Processing: Image Communication, 2025, 135: 117282. doi: 10.1016/j.image.2025.117282.
    [15]
    郭颖聪, 唐天航, 刘怡光. 基于Transformer与权重令牌引导的双分支无参考图像质量评价网络[J]. 四川大学学报: 自然科学版, 2025, 62(4): 847–856. doi: 10.19907/j.0490-6756.240396.

    GUO Yingcong, TANG Tianhang, and LIU Yiguang. A dual-branch no-reference image quality assessment network guided by Transformer and a weight token[J]. Journal of Sichuan University: Natural Science Edition, 2025, 62(4): 847–856. doi: 10.19907/j.0490-6756.240396.
    [16]
    陈勇, 朱凯欣, 房昊, 等. 基于空间分布分析的混合失真无参考图像质量评价[J]. 电子与信息学报, 2020, 42(10): 2533–2540. doi: 10.11999/JEIT190721.

    CHEN Yong, ZHU Kaixin, FANG Hao, et al. No-reference image quality evaluation for multiply-distorted images based on spatial domain coding[J]. Journal of Electronics & Information Technology, 2020, 42(10): 2533–2540. doi: 10.11999/JEIT190721.
    [17]
    XU Kangmin, LIAO Liang, XIAO Jing, et al. Boosting image quality assessment through efficient transformer adaptation with local feature enhancement[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2024: 2662–2672. doi: 10.1109/CVPR52733.2024.00257.
    [18]
    SHI Jinsong, GAO Pan, and QIN Jie. Transformer-based no-reference image quality assessment via supervised contrastive learning[C]. The 38th AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2024: 4829–4837. doi: 10.1609/aaai.v38i5.28285.
    [19]
    SU Shaolin, YAN Qingsen, ZHU Yu, et al. Blindly assess image quality in the wild guided by a self-adaptive hyper network[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 3664–3673. doi: 10.1109/CVPR42600.2020.00372.
    [20]
    HOSU V, LIN Hanhe, SZIRANYI T, et al. KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment[J]. IEEE Transactions on Image Processing, 2020, 29: 4041–4056. doi: 10.1109/TIP.2020.2967829.
    [21]
    LI Aobo, WU Jinjian, LIU Yongxu, et al. Bridging the synthetic-to-authentic gap: Distortion-guided unsupervised domain adaptation for blind image quality assessment[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2024: 28422–28431. doi: 10.1109/CVPR52733.2024.02685.
    [22]
    GU Liping, LI Tongyan, and HE Jiyong. Classification of diabetic retinopathy grade based on G-ENet convolutional neural network model: Convolutional neural networks are used to solve the problem of diabetic retinopathy grade classification[C]. 2023 7th International Conference on Electronic Information Technology and Computer Engineering, Xiamen, China, 2023: 1590–1594. doi: 10.1145/3650400.3650666.
    [23]
    LI Yuhao and ZHANG Aihua. AKA-MobileNet: A cloud-noise-robust lightweight convolution neural network[C]. 2024 39th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Dalian, China, 2024: 188–193. doi: 10.1109/YAC63405.2024.10598582.
    [24]
    WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. The 15th European Conference on Computer Vision, Munich: Springer, 2018: 3–19. doi: 10.1007/978-3-030-01234-2_1.
    [25]
    SHEIKH H R, BOVIK A C, and DE VECIANA G. An information fidelity criterion for image quality assessment using natural scene statistics[J]. IEEE Transactions on Image Processing, 2005, 14(12): 2117–2128. doi: 10.1109/TIP.2005.859389.
    [26]
    LARSON E C and CHANDLER D M. Most apparent distortion: Full-reference image quality assessment and the role of strategy[J]. Journal of Electronic Imaging, 2010, 19(1): 011006. doi: 10.1117/1.3267105.
    [27]
    PONOMARENKO N, JIN Lina, IEREMEIEV O, et al. Image database TID2013: Peculiarities, results and perspectives[J]. Signal Processing: Image Communication, 2015, 30: 57–77. doi: 10.1016/j.image.2014.10.009.
    [28]
    LIN Hanhe, HOSU V, and SAUPE D. KADID-10k: A large-scale artificially distorted IQA database[C]. 2019 11th International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 2019: 1–3. doi: 10.1109/QoMEX.2019.8743252.
    [29]
    GHADIYARAM D and BOVIK A C. Massive online crowdsourced study of subjective and objective picture quality[J]. IEEE Transactions on Image Processing, 2016, 25(1): 372–387. doi: 10.1109/TIP.2015.2500021.
    [30]
    ZHAO Yongcan, ZHANG Yinghao, XIA Tianfeng, et al. No-reference image quality assessment based on multi-scale dynamic modulation and degradation information[J]. Displays, 2026, 91: 103207. doi: 10.1016/j.displa.2025.103207.
  • 加载中
