WU Ting, WEN Shulin, YAN Zhaoli, FU Gaoyuan, LI Linfeng, LIU Xudu, CHENG Xiaobin, YANG Jun. Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250934

Unsupervised Anomaly Detection of Hydro-Turbine Generator Acoustics by Integrating Pre-Trained Audio Large Model and Density Estimation

doi: 10.11999/JEIT250934 cstr: 32379.14.JEIT250934
Funds:  China Yangtze Power Co., Ltd. - Research and demonstration application of acoustic monitoring system for river basin power station equipment status based on industrial Internet platform (Z152302048)
  • Accepted Date: 2025-11-12
  • Rev Recd Date: 2025-11-12
  • Available Online: 2025-11-18
Objective: Hydro-turbine generator units (HTGUs) require reliable early-stage fault detection to ensure operational safety and reduce maintenance costs. Acoustic signals provide a non-intrusive and sensitive monitoring modality, yet their use is hindered by complex structural acoustics, strong background noise, and the scarcity of abnormal data. This work presents an unsupervised acoustic anomaly detection framework that integrates a large-scale pretrained audio model with k-nearest-neighbor (k-NN) density estimation. The framework detects anomalies using only normal data while remaining robust and generalizing across diverse HTGU conditions.

Methods: The proposed framework is trained on normal data only. Time-domain signals are Z-score normalized and converted to Fbank (log Mel filterbank) features, and random masking is then applied to improve robustness and generalization. A large-scale pretrained BEATs model serves as the feature encoder. An Attentive Statistical Pooling (ASP) module aggregates the frame-level representations into discriminative segment-level embeddings by emphasizing informative frames. To improve class separability, an ArcFace loss replaces the conventional classification layer during training, and a warm-up learning rate strategy is adopted to ensure stable convergence. During inference, k-NN density estimation is performed on the learned embeddings to detect acoustic anomalies.

Results and Discussions: The framework is evaluated on data collected from eight real-world machines. As shown in Fig. 7 and Table 2, large-scale pretrained audio representations significantly outperform traditional features in distinguishing abnormal sounds. With the FED-KE algorithm, the method achieves high accuracy across six metrics, with Hmean reaching 98.7% in the wind tunnel and over 99.9% in the slip-ring environment, demonstrating strong robustness under complex industrial conditions. As shown in Table 4, ablation studies confirm the complementary contributions of feature enhancement, ASP-based representation refinement, and density-based k-NN inference. Because training requires only normal data, the framework reduces dependence on scarce fault labels and is more practical to apply. Remaining challenges include the computational cost of the pretrained model and the lack of multimodal fusion, both of which will be investigated in future work.

Conclusions: This study proposes an unsupervised acoustic anomaly detection framework for HTGUs that addresses the scarcity of fault samples and the complexity of industrial acoustic environments. A pretrained large-scale audio foundation model is fine-tuned with turbine-specific strategies to better model normal operational acoustics. During inference, a k-NN mechanism based on density estimation detects abnormal patterns using only normal data. Experiments on real-world hydropower station recordings demonstrate high detection accuracy and strong generalization across diverse operating conditions, outperforming conventional supervised approaches. The framework introduces foundation-model-based audio representation learning into the hydro-turbine domain, establishes an efficient adaptation strategy tailored to turbine acoustics, and integrates a robust density-based anomaly scoring mechanism. Together, these components reduce reliance on labeled anomalies and enable practical deployment for intelligent condition monitoring. Future work will investigate model compression, such as knowledge distillation, to support on-device deployment, and will explore semi-/self-supervised learning and multimodal fusion to further improve robustness, scalability, and cross-station adaptability.
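
The inference step named in the abstract, k-NN density estimation over embeddings of normal recordings, can be illustrated with a short sketch. This is not the authors' implementation: the cosine distance, the mean over the k nearest neighbors, the value of k, the quantile threshold, and the synthetic 16-dimensional vectors standing in for encoder embeddings are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def knn_anomaly_scores(train_emb, test_emb, k=5):
    """Mean cosine distance from each test embedding to its k nearest
    normal (training) embeddings. A larger score means the sample lies in
    a sparser region of the normal-data embedding space."""
    train = l2_normalize(np.asarray(train_emb, dtype=np.float64))
    test = l2_normalize(np.asarray(test_emb, dtype=np.float64))
    dist = 1.0 - test @ train.T              # shape (n_test, n_train)
    k = min(k, train.shape[0])
    # np.partition places the k smallest distances in the first k columns.
    nearest = np.partition(dist, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


# Synthetic stand-ins for segment-level embeddings (hypothetical data).
rng = np.random.default_rng(0)
dim = 16
center = np.zeros(dim)
center[0] = 1.0                              # direction of "normal" sounds
shifted = np.zeros(dim)
shifted[1] = 1.0                             # direction of "abnormal" sounds

normal_train = center + 0.05 * rng.normal(size=(200, dim))
normal_test = center + 0.05 * rng.normal(size=(20, dim))
anomalous_test = shifted + 0.05 * rng.normal(size=(20, dim))

normal_scores = knn_anomaly_scores(normal_train, normal_test, k=5)
anomalous_scores = knn_anomaly_scores(normal_train, anomalous_test, k=5)

# Assumed decision rule: flag anything above a high quantile of the
# scores observed on held-out normal data.
threshold = np.quantile(normal_scores, 0.99)
flags = anomalous_scores > threshold
```

Because the scorer stores only normal embeddings, no fault labels are needed at any stage. In practice the threshold would be tuned on held-out normal recordings from the target machine rather than fixed in advance.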