Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation

WU Ting; WEN Shulin; YAN Zhaoli; FU Gaoyuan; LI Linfeng; LIU Xudu; CHENG Xiaobin; YANG Jun

doi:10.11999/JEIT250934

WU Ting, WEN Shulin, YAN Zhaoli, FU Gaoyuan, LI Linfeng, LIU Xudu, CHENG Xiaobin, YANG Jun. Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250934

doi: 10.11999/JEIT250934 cstr: 32379.14.JEIT250934

WU Ting^{1, 2, 4},
WEN Shulin^{3, 4},
YAN Zhaoli^{5, 1},
FU Gaoyuan^{3, 4},
LI Linfeng^{3, 4},
LIU Xudu^{3, 4},
CHENG Xiaobin^{1, 2},
YANG Jun^{1, 2}

1. State Key Laboratory of Acoustics and Marine Information, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
2. University of Chinese Academy of Sciences, School of Electronic, Electrical and Communication Engineering, Beijing 100049, China
3. China Yangtze Power Co., Ltd, Wuhan 443000, China
4. Hubei Technology Innovation Center for Smart Hydropower, Wuhan 430000, China
5. College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing 100029, China

Funds: The project of China Yangtze Power Co., Ltd. (Z152302048)

Received Date: 2025-09-19
Revised Date: 2025-11-10
Accepted Date: 2025-11-12

Available Online: 2025-11-18

Abstract

Objective: Hydro-Turbine Generator Units (HTGUs) require reliable early fault detection to maintain operational safety and reduce maintenance costs. Acoustic signals offer a non-intrusive and sensitive means of monitoring, but their use is limited by complex structural acoustics, strong background noise, and the scarcity of abnormal data. This paper presents an unsupervised acoustic anomaly detection framework that integrates a large-scale pretrained audio model with density-based k-nearest neighbors (k-NN) estimation. The framework is designed to detect anomalies using only normal data while remaining robust and generalizing well across different HTGU operating conditions.

Methods: Time-domain signals are Z-score normalized and converted to Fbank (log Mel filterbank) features, and random masking is applied to improve robustness and generalization. A large-scale pretrained BEATs model serves as the feature encoder, and an Attentive Statistical Pooling (ASP) module aggregates frame-level representations into discriminative segment-level embeddings by emphasizing informative frames. To improve class separability, an ArcFace loss is used in place of the conventional classification layer during training, and a warm-up learning rate strategy is adopted to ensure stable convergence. During inference, density-based k-NN estimation is applied to the learned embeddings to detect acoustic anomalies.

Results and Discussions: The framework is evaluated on data collected from eight real-world machines. As shown in Fig. 7 and Table 2, large-scale pretrained audio representations distinguish abnormal sounds better than traditional features. With the FED-KE algorithm, the framework attains high accuracy across six metrics, with Hmean reaching 98.7% in the wind tunnel and exceeding 99.9% in the slip-ring environment, indicating strong robustness under complex industrial conditions. As shown in Table 4, ablation studies confirm the complementary effects of feature enhancement, ASP-based representation refinement, and density-based k-NN inference. Because the framework requires only normal data for training, it reduces dependence on scarce fault labels and is easier to apply in practice. Remaining challenges include the computational cost introduced by the pretrained model and the absence of multimodal fusion, both of which will be addressed in future work.

Conclusions: An unsupervised acoustic anomaly detection framework is proposed for HTGUs to address the scarcity of fault samples and the complexity of industrial acoustic environments. A pretrained large-scale audio foundation model is adopted and fine-tuned with turbine-specific strategies to improve the modeling of normal operational acoustics. During inference, a density-estimation-based k-NN mechanism detects abnormal patterns using only normal data. Experiments on real-world hydropower station recordings show high detection accuracy and strong generalization across different operating conditions, outperforming conventional supervised approaches. The framework introduces foundation-model-based audio representation learning into the hydro-turbine domain, provides an efficient adaptation strategy tailored to turbine acoustics, and integrates a robust density-based anomaly scoring mechanism. Together, these components reduce dependence on labeled anomalies and support practical deployment for intelligent condition monitoring. Future work will examine model compression, such as knowledge distillation, to enable on-device deployment, and will explore semi-/self-supervised learning and multimodal fusion to enhance robustness, scalability, and cross-station adaptability.
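The preprocessing named in the abstract (Z-score normalization followed by random masking) can be illustrated with a minimal sketch. The abstract does not give the masking scheme, so the time-axis masking below, the function names `z_score` and `random_time_mask`, the `max_width` parameter, and the toy data are all assumptions made for illustration and are not the authors' implementation.

```python
import math
import random


def z_score(signal):
    """Scale a 1-D signal to zero mean and unit variance."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    if std == 0.0:
        # A constant signal has no variance; return zeros instead of dividing by 0.
        return [0.0] * n
    return [(x - mean) / std for x in signal]


def random_time_mask(frames, max_width, rng):
    """Zero one random run of consecutive frames (assumed masking scheme).

    Returns a masked copy plus the (start, width) of the masked run.
    """
    masked = [list(frame) for frame in frames]
    width = rng.randint(0, min(max_width, len(frames)))
    start = rng.randint(0, len(frames) - width)
    for t in range(start, start + width):
        masked[t] = [0.0] * len(masked[t])
    return masked, (start, width)


# Toy time-domain signal (hypothetical values).
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
normalized = z_score(signal)

# Stand-in for a sequence of Fbank frames: 10 frames with 3 bins each.
frames = [[1.0, 1.0, 1.0] for _ in range(10)]
masked, (start, width) = random_time_mask(frames, max_width=4, rng=random.Random(0))
```

Masking is typically applied only during training, so the encoder cannot rely on any single stretch of frames; at inference the unmasked features would be used.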
Keywords:
- Pretrained audio models
- Hydropower generating units
- Anomaly detection
- Unsupervised deep learning
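As a rough illustration of the k-NN scoring idea in the abstract, the sketch below scores a query embedding by its mean cosine distance to the k nearest embeddings in a bank of normal recordings. The abstract does not specify the distance metric, the value of k, or the exact density estimator, so the cosine distance, `k=2`, the function names, and the two-dimensional toy embeddings are assumptions for illustration only.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_score(query, bank, k):
    """Mean cosine distance from `query` to its k nearest vectors in `bank`.

    A low score means the query lies in a dense region of normal data;
    a high score suggests an anomaly.
    """
    distances = sorted(cosine_distance(query, ref) for ref in bank)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical bank of embeddings extracted from normal recordings.
normal_bank = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.95, 0.05]]

normal_like = knn_score([0.97, 0.06], normal_bank, k=2)
anomalous = knn_score([0.0, 1.0], normal_bank, k=2)
```

In practice a decision threshold would still have to be chosen, for example from the distribution of scores on held-out normal data; how the paper sets it is not stated in the abstract.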
