YANG Zhenzhen, XU Yi, WANG Chengye, YANG Yongpeng. Multi-scale Frequency Adapter and Dual-path Attention for Time Series Forecasting[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251188

Multi-scale Frequency Adapter and Dual-path Attention for Time Series Forecasting

doi: 10.11999/JEIT251188 cstr: 32379.14.JEIT251188
Funds:  the National Natural Science Foundation of China (No. 62571269), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Nos. KYCX24_1125, SJCX24_0279)
  • Accepted Date: 2026-01-13
  • Rev Recd Date: 2026-01-13
  • Available Online: 2026-03-06
  •   Objective  With the rapid development of big data technology, time series data have been increasingly applied in areas such as meteorology, power systems, and finance. Nonetheless, mainstream methods for time series forecasting face notable challenges in multi-scale modeling and frequency-domain feature extraction, which prevents the comprehensive capture of crucial dynamic properties and periodic patterns in complex datasets. Traditional statistical approaches, including ARIMA, rely on assumptions of linear relationships, resulting in poor performance on nonlinear or high-dimensional time series data. Although deep learning methods, notably those based on convolutional neural networks and Transformers, have improved forecasting accuracy through advanced feature extraction and long-range dependency modeling, limitations remain in their ability to efficiently extract and fuse multi-scale features in both the temporal and frequency domains. These deficiencies lead to instability and suboptimal accuracy, particularly in dynamic and highly variable applications. This paper addresses these challenges by proposing an intelligent forecasting framework that effectively models multi-scale information and enhances prediction accuracy in diverse scenarios.  Methods  The proposed method introduces a multi-scale frequency adapter and dual-path attention (MFADA) framework for time series forecasting. The framework integrates two key modules: the multi-scale frequency adapter (MFA) and the multi-scale dual-path attention (MDA). The MFA module efficiently captures multi-scale frequency features using adaptive pooling and depthwise convolutions, which enhances sensitivity to various frequency components and supports modeling of both short-term and long-term dependencies.
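The multi-scale adaptive pooling idea behind the MFA module can be illustrated with a minimal sketch: a 1-D series is average-pooled to several coarser resolutions, each branch is upsampled back, and the branches are fused. This is only an illustration of multi-scale pooling under assumed scales and simple average fusion; the function names, scale choices, and fusion rule are hypothetical, not the authors' implementation.

```python
def adaptive_avg_pool(series, out_len):
    """Average-pool a 1-D sequence down to out_len buckets (PyTorch-style bins)."""
    n = len(series)
    pooled = []
    for i in range(out_len):
        start = (i * n) // out_len
        end = ((i + 1) * n) // out_len
        bucket = series[start:end] or series[start:start + 1]
        pooled.append(sum(bucket) / len(bucket))
    return pooled

def upsample(pooled, n):
    """Nearest-neighbour upsample back to the original length n."""
    m = len(pooled)
    return [pooled[(i * m) // n] for i in range(n)]

def multiscale_pyramid(series, scales=(4, 8, 16)):
    """Pool the series at several scales, upsample, and fuse by averaging."""
    n = len(series)
    branches = [upsample(adaptive_avg_pool(series, s), n) for s in scales]
    return [sum(vals) / len(branches) for vals in zip(*branches)]

x = [float(i % 8) for i in range(32)]   # toy periodic series
fused = multiscale_pyramid(x)
print(len(fused))  # 32
```

Coarser scales in this sketch emphasize long-term trends while finer scales retain short-term detail, which is the intuition behind modeling both short- and long-term dependencies in one pass.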
The MDA module applies a multi-scale attention mechanism to strengthen fine-grained modeling across both the temporal and feature dimensions, enabling effective extraction and fusion of comprehensive time- and frequency-domain information. The entire framework is designed with computational efficiency in mind to ensure scalability. Experimental validation on 8 public datasets demonstrates superior performance and robustness compared with existing mainstream time series forecasting approaches.  Results and Discussions  Extensive experiments were conducted on 8 publicly available multivariate datasets: ECL, Weather, ETT (ETTm1, ETTm2, ETTh1, ETTh2), Solar-Energy, and Traffic. The evaluation metrics were mean absolute error (MAE) and mean squared error (MSE), with additional consideration given to parameter count, FLOPs, and training time for computational efficiency. Experimental comparisons with state-of-the-art models, including Fredformer, Peri-midFormer, iTransformer, TFformer, PatchTST, MSGNet, TimesNet, and TCM, show that the proposed MFADA consistently achieves superior forecasting performance across most datasets and forecasting horizons (Table 1), with the best average MSE and MAE of 0.163 and 0.261 on ECL, a 13.2% and 17.3% decrease versus TimesNet at forecasting length 96. On the periodic ETTm1 dataset, the average MSE reaches 0.377, outperforming MSGNet by 5.3%. Ablation studies (Table 2) demonstrate the importance of both the MFA and MDA modules: removing MFA or reverting MDA to standard self-attention increases error rates on ECL, Weather, ETTh1, and ETTh2, indicating their synergistic contribution to modeling complex dynamics. Complexity analysis (Fig. 2) reveals that MFADA achieves the best balance among forecasting accuracy, parameter efficiency, and training time, outperforming Fredformer, MSGNet, and TimesNet. Visualization results for ECL and ETTh2 (Fig. 3, Fig. 4) confirm the ability of MFADA to track ground-truth trends, forecast turning points, and outperform baselines in both global and local prediction fidelity. Notably, MFADA's performance lags on the Traffic dataset because of its strong spatial correlations, highlighting spatial structure integration as a future direction.  Conclusions  This paper proposes MFADA, a novel time series forecasting method integrating multi-scale frequency adaptation and dual-path attention mechanisms. MFADA stands out with four key strengths: (1) the MFA module effectively extracts and merges multi-scale frequency-domain features, emphasizing diverse temporal scales through pyramid pooling and channel gating; (2) the MDA module captures multi-scale dependencies along both the temporal and feature dimensions, enabling fine-grained dynamic modeling; (3) the architecture maintains computational efficiency through lightweight convolution and pooling operations; (4) superior results across 8 datasets and various forecasting lengths demonstrate robust generalization, especially for multivariate and long-term forecasting scenarios. The extensive experiments confirm that MFADA advances the state of the art in accurate and efficient time series forecasting, offering promising perspectives for both academic research and practical deployment. Future work will explore spatial correlation integration to further enhance model applicability.
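The dual-path idea described above, attention along both the temporal and the feature axes, can be sketched as plain scaled dot-product self-attention applied to each axis of a (time, channels) window, with the two paths fused by averaging. This is a generic single-head, weight-free sketch of the dual-path concept, not the MDA module itself; the multi-scale branching and learned projections are omitted for brevity.

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over the first axis of x (rows attend to rows)."""
    d = x.shape[-1]
    scores = softmax(x @ x.T / np.sqrt(d), axis=-1)
    return scores @ x

def dual_path_attention(x):
    """Attend along time (rows) and along features (columns), then fuse by averaging.
    x: (T, C) window of a multivariate series."""
    temporal = self_attention(x)        # (T, C): each time step attends to all steps
    feature = self_attention(x.T).T     # (T, C): each channel attends to all channels
    return 0.5 * (temporal + feature)

rng = np.random.default_rng(0)
x = rng.standard_normal((96, 7))        # e.g. a 96-step window with 7 variables
y = dual_path_attention(x)
print(y.shape)  # (96, 7)
```

Running attention over the feature axis lets the model mix information across variables, while the temporal path captures dependencies across time steps; fusing the two gives each output element context from both directions.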
  • [1]
    KONG Xiangjie, CHEN Zhenghao, LIU Weiyao, et al. Deep learning for time series forecasting: A survey[J]. International Journal of Machine Learning and Cybernetics, 2025, 16(5): 5079–5112. doi: 10.1007/s13042-025-02560-w.
    [2]
    ZHONG Weiyi, ZHAI Dengshuai, XU Wenran, et al. Accurate and efficient daily carbon emission forecasting based on improved ARIMA[J]. Applied Energy, 2024, 376: 124232. doi: 10.1016/j.apenergy.2024.124232.
    [3]
    PAN Jinwei, WANG Yiqiao, ZHONG Bo, et al. Statistical feature-based search for multivariate time series forecasting[J]. Journal of Electronics & Information Technology, 2024, 46(8): 3276–3284. doi: 10.11999/JEIT231264.
    [4]
    DA SILVA D G and DE MOURA MENESES A A M. Comparing long short-term memory (LSTM) and bidirectional LSTM deep neural networks for power consumption prediction[J]. Energy Reports, 2023, 10: 3315–3334. doi: 10.1016/j.egyr.2023.09.175.
    [5]
    ZHENG Qinghe, LI Binglin, YU Zhiguo, et al. Research progress of deep learning enabled automatic modulation classification technology[J]. Journal of Electronics & Information Technology, 2025, 47(11): 4096–4111. doi: 10.11999/JEIT250674.
    [6]
    LIU Hui, FENG Haoran, MA Jiani, et al. Spatial self-attention incorporated imputation algorithm for severely missing multivariate time series[J]. Journal of Electronics & Information Technology, 2025, 47(10): 3917–3928. doi: 10.11999/JEIT250220.
    [7]
    RABBANI M B A, MUSARAT M A, ALALOUL W S, et al. A comparison between seasonal autoregressive integrated moving average (SARIMA) and exponential smoothing (ES) based on time series model for forecasting road accidents[J]. Arabian Journal for Science and Engineering, 2021, 46(11): 11113–11138. doi: 10.1007/s13369-021-05650-3.
    [8]
    WU Haixu, HU Tengge, LIU Yong, et al. TimesNet: Temporal 2D-variation modeling for general time series analysis[C]. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda, 2023.
    [9]
    COUTINHO E R, MADEIRA J G F, BORGES D G F, et al. Multi-step forecasting of meteorological time series using CNN-LSTM with decomposition methods[J]. Water Resources Management, 2025, 39(7): 3173–3198. doi: 10.1007/s11269-025-04102-z.
    [10]
    CAI Wanlin, LIANG Yuxuan, LIU Xianggen, et al. MSGNet: Learning multi-scale inter-series correlations for multivariate time series forecasting[C]. Proceedings of the 38th AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2024: 11141–11149. doi: 10.1609/aaai.v38i10.28991.
    [11]
    YUNITA A, PRATAMA M H D I, ALMUZAKKI M Z, et al. Performance analysis of neural network architectures for time series forecasting: A comparative study of RNN, LSTM, GRU, and hybrid models[J]. MethodsX, 2025, 15: 103462. doi: 10.1016/j.mex.2024.103462.
    [12]
    YADAV H and THAKKAR A. NOA-LSTM: An efficient LSTM cell architecture for time series forecasting[J]. Expert Systems with Applications, 2024, 238: 122333. doi: 10.1016/j.eswa.2023.122333.
    [13]
    UBAL C, DI-GIORGI G, CONTRERAS-REYES J E, et al. Predicting the long-term dependencies in time series using recurrent artificial neural networks[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1340–1358. doi: 10.3390/make5040068.
    [14]
    ZENG Ailing, CHEN Muxi, ZHANG Lei, et al. Are transformers effective for time series forecasting?[C]. Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, USA, 2023: 11121–11128. doi: 10.1609/aaai.v37i9.26317.
    [15]
    JIANG Hongwei, LIU Dongsheng, DING Xinyi, et al. TCM: An efficient lightweight MLP-based network with affine transformation for long-term time series forecasting[J]. Neurocomputing, 2025, 617: 128960. doi: 10.1016/j.neucom.2024.128960.
    [16]
    ZHOU Haoyi, ZHANG Shanghang, PENG Jieqi, et al. Informer: Beyond efficient transformer for long sequence time-series forecasting[C]. Proceedings of the 35th AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2021: 11106–11115. doi: 10.1609/aaai.v35i12.17325.
    [17]
    WU Haixu, XU Jiehui, WANG Jianmin, et al. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting[C]. Proceedings of the 35th Conference on Neural Information Processing Systems, Red Hook, USA, 2021: 22419–22430.
    [18]
    ZHOU Tian, MA Ziqing, WEN Qingsong, et al. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting[C]. Proceedings of the International Conference on Machine Learning, Baltimore, USA, 2022: 27268–27286.
    [19]
    NIE Yuqi, NGUYEN N H, SINTHONG P, et al. A time series is worth 64 words: Long-term forecasting with transformers[C]. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda, 2023.
    [20]
    WU Qiang, YAO Gechang, FENG Zhixi, et al. Peri-midFormer: Periodic pyramid transformer for time series analysis[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2024: 415. doi: 10.52202/079017-0415.
    [21]
    LIU Yong, HU Tengge, ZHANG Haoran, et al. iTransformer: Inverted transformers are effective for time series forecasting[C]. Proceedings of the 12th International Conference on Learning Representations, Vienna, Austria, 2024.
    [22]
    ZHAO Tianlong, FANG Lexin, MA Xiang, et al. TFformer: A time-frequency domain bidirectional sequence-level attention based transformer for interpretable long-term sequence forecasting[J]. Pattern Recognition, 2025, 158: 110994. doi: 10.1016/j.patcog.2024.110994.
    [23]
    ZHOU Tian, NIU Peisong, WANG Xue, et al. One fits all: Power general time series analysis by pretrained LM[C]. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, USA, 2023: 1877.
    [24]
    PIAO Xihao, CHEN Zheng, MURAYAMA T, et al. Fredformer: Frequency debiased transformer for time series forecasting[C]. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 2024: 2400–2410. doi: 10.1145/3637528.3671928.
    [25]
    GAO Shixuan, ZHANG Pingping, YAN Tianyu, et al. Multi-scale and detail-enhanced segment anything model for salient object detection[C]. Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 2024: 9894–9903. doi: 10.1145/3664647.3680650.
    [26]
    SI Yunzhong, XU Huiying, ZHU Xinzhong, et al. SCSA: Exploring the synergistic effects between spatial and channel attention[J]. Neurocomputing, 2025, 634: 129866. doi: 10.1016/j.neucom.2025.129866.