Different-region Balance Method for Exploring Varying Causal Relations Between Time Series
Abstract: A common way to discover time-varying causal relations between time series is the sliding-window method, which runs a Granger causality test in every window. Its performance, however, is sensitive to the window size, and an unsuitable size is likely to give poor results. This paper proposes a different-region balance method. The method first computes the variation degree Sw of the series within the current sliding window W, which serves as the variation bound, and the variation degree Su of the series within U, the forward neighbor region of W. It then applies a forward exploring strategy. When Su ≤ Sw, a balance test over regions of different lengths is carried out: causal relations between the time series are tested separately in three regions, namely window W, the combination of W and U, and the combination of W and its backward neighbor region V. When Su > Sw, the same balance test is carried out, but with regions U and V set to the same length. Finally, the results of the multiple tests for window W are combined to give the output. By combining results from regions of different lengths, the method reduces its sensitivity to the window size and helps ensure the accuracy and stability of the final result. Experiments on one simulated data set and four real data sets show that the method effectively reveals time-varying causal relations between time series and outperforms the compared methods in the combined sense of high accuracy and stable performance.

Keywords:
- Time series
- Time-varying causal relations
- Granger causality test
- Different-region balance
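The procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the variation degree is computed (the sketch assumes the larger population standard deviation of the two series), where U and V lie in time (the sketch assumes U directly follows W and V directly precedes it), how the region lengths are chosen (`u_len` and `v_len` are hypothetical parameters), or how the results are combined (a majority vote is assumed). The `test` argument is a placeholder for any pairwise causality test, such as a Granger F-test run on the given region.

```python
from statistics import pstdev
from typing import Callable, List, Sequence, Tuple

Series = Sequence[float]
# Placeholder for any pairwise causality test (e.g. a Granger F-test);
# it returns True when "x causes y" is detected on the given region.
CausalTest = Callable[[Series, Series], bool]


def variation(x: Series, y: Series) -> float:
    """Assumed variation degree: the larger population std. dev. of the two series."""
    if len(x) == 0 or len(y) == 0:
        return 0.0
    return max(pstdev(x), pstdev(y))


def balance_regions(n: int, start: int, width: int, u_len: int, v_len: int,
                    su: float, sw: float) -> List[Tuple[int, int]]:
    """Index ranges (lo, hi) for W, W+U and V+W, clipped to [0, n)."""
    if su > sw:
        # Su exceeds the variation bound Sw: V takes the same length as U.
        v_len = u_len
    end = start + width
    return [
        (start, end),                      # window W
        (start, min(n, end + u_len)),      # W combined with U (assumed to follow W)
        (max(0, start - v_len), end),      # W combined with V (assumed to precede W)
    ]


def balanced_decision(x: Series, y: Series, start: int, width: int,
                      u_len: int, v_len: int, test: CausalTest) -> bool:
    """Combine the three regional test results by majority vote (an assumption)."""
    n = len(x)
    end = start + width
    sw = variation(x[start:end], y[start:end])              # variation bound Sw
    su = variation(x[end:end + u_len], y[end:end + u_len])  # variation degree Su
    regions = balance_regions(n, start, width, u_len, v_len, su, sw)
    votes = [test(x[lo:hi], y[lo:hi]) for lo, hi in regions]
    return sum(votes) >= 2
```

Sliding `start` across the series with a fixed step and calling `balanced_decision` at each position would give one decision per window. Passing the test in as a callable keeps the region logic separate from the choice of causality test.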
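Table 1 Accuracy (%) of different methods in discovering causal relations on the simulated data set

Column groups correspond to noise variances 0.01, 0.2, and 0.5. Conv. = conventional sliding window, F-bound = F-bound test method, Turn. = turning-point method, Bal. = different-region balance method.

| Window width | Sliding step | Conv. (0.01) | F-bound (0.01) | Turn. (0.01) | Bal. (0.01) | Conv. (0.2) | F-bound (0.2) | Turn. (0.2) | Bal. (0.2) | Conv. (0.5) | F-bound (0.5) | Turn. (0.5) | Bal. (0.5) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 5 | 91.32 | 88.18 | 90.99 | 95.40 | 80.82 | 94.56 | 58 | 84.13 | 80.57 | 91.00 | 50.29 | 82.52 |
| 20 | 10 | 88.98 | 88.18 | 91.91 | 95.06 | 80.25 | 94.56 | 53.39 | 83.33 | 80.15 | 91.00 | 46.33 | 82.54 |
| 20 | 15 | 86.78 | 88.18 | 90.00 | 94.43 | 78.1 | 94.56 | 44.86 | 83.26 | 78.52 | 91.00 | 46.22 | 82.03 |
| 20 | 20 | 83.65 | 88.18 | 89.17 | 92.77 | 77.05 | 94.56 | 45.4 | 82.15 | 77.98 | 91.00 | 45.48 | 81.87 |
| 30 | 5 | 95.85 | 88.18 | 90.61 | 95.87 | 85.69 | 94.56 | 60.4 | 92.31 | 83.63 | 91.00 | 47.37 | 87.78 |
| 30 | 10 | 94.43 | 88.18 | 90.15 | 95.49 | 84.95 | 94.56 | 53.01 | 91.95 | 83.69 | 91.00 | 46.4 | 87.41 |
| 30 | 15 | 94.07 | 88.18 | 90.85 | 94.91 | 83.22 | 94.56 | 51.62 | 91.65 | 81.93 | 91.00 | 45.98 | 86.84 |
| 30 | 20 | 92.96 | 88.18 | 90.95 | 95.57 | 81.62 | 94.56 | 53.21 | 91.37 | 81.18 | 91.00 | 46.38 | 86.96 |
| 40 | 5 | 95.56 | 88.18 | 91.83 | 94.85 | 92.46 | 94.56 | 63.52 | 94.65 | 87.62 | 91.00 | 46.43 | 92.17 |
| 40 | 10 | 95.31 | 88.18 | 90.07 | 94.87 | 91.05 | 94.56 | 58.89 | 94.27 | 87.08 | 91.00 | 45.38 | 92.22 |
| 40 | 15 | 94.95 | 88.18 | 90.65 | 94.77 | 90.62 | 94.56 | 58.77 | 94.55 | 86.87 | 91.00 | 46.39 | 91.76 |
| 40 | 20 | 94.59 | 88.18 | 89.9 | 94.31 | 89.5 | 94.56 | 52.31 | 93.93 | 85.86 | 91.00 | 45.80 | 91.03 |

Table 2 Accuracy (%) of discovering causal relations on the Dropoff-tweet data set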
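
| Window width | Sliding step | Conventional sliding window | F-bound test method | Turning-point method | Different-region balance method |
|---|---|---|---|---|---|
| 12 | 4 | 91.95 | 92.62 | 93.56 | 93.42 |
| 12 | 8 | 90.87 | 92.62 | 93.56 | 93.83 |
| 12 | 12 | 89.26 | 92.62 | 94.36 | 90.20 |
| 18 | 4 | 94.09 | 92.62 | 94.90 | 95.30 |
| 18 | 8 | 94.36 | 92.62 | 94.36 | 94.77 |
| 18 | 12 | 92.21 | 92.62 | 94.36 | 94.77 |
| 24 | 4 | 94.36 | 92.62 | 94.09 | 96.51 |
| 24 | 8 | 97.05 | 92.62 | 96.24 | 96.78 |
| 24 | 12 | 91.95 | 92.62 | 91.41 | 95.70 |

Table 3 Accuracy (%) of discovering causal relations on the Tweet-pickup data set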
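
| Window width | Sliding step | Conventional sliding window | F-bound test method | Turning-point method | Different-region balance method |
|---|---|---|---|---|---|
| 12 | 4 | 90.87 | 94.90 | 90.87 | 93.29 |
| 12 | 8 | 91.95 | 94.90 | 94.09 | 94.09 |
| 12 | 12 | 92.48 | 94.90 | 94.09 | 91.01 |
| 18 | 4 | 92.48 | 94.90 | 88.19 | 94.90 |
| 18 | 8 | 92.48 | 94.90 | 93.83 | 94.90 |
| 18 | 12 | 93.02 | 94.90 | 94.09 | 93.02 |
| 24 | 4 | 92.75 | 94.90 | 91.41 | 95.44 |
| 24 | 8 | 93.29 | 94.90 | 82.82 | 95.97 |
| 24 | 12 | 92.21 | 94.90 | 95.44 | 95.44 |

Table 4 Accuracy (%) of discovering causal relations on the Fish-school data set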
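
| Window width | Sliding step | Conventional sliding window | F-bound test method | Turning-point method | Different-region balance method |
|---|---|---|---|---|---|
| 140 | 10 | 89.60 | 54.19 | 69.80 | 90.1 |
| 140 | 20 | 86.24 | 54.19 | 69.80 | 93.29 |
| 140 | 30 | 91.28 | 54.19 | 69.80 | 95.64 |
| 150 | 10 | 89.60 | 54.19 | 69.80 | 84.90 |
| 150 | 20 | 83.22 | 54.19 | 69.80 | 91.28 |
| 150 | 30 | 89.93 | 54.19 | 69.80 | 99.66 |
| 160 | 10 | 83.22 | 54.19 | 69.80 | 86.91 |
| 160 | 20 | 69.80 | 54.19 | 69.80 | 93.62 |
| 160 | 30 | 81.54 | 54.19 | 69.80 | 92.95 |

Table 5 Accuracy (%) of discovering causal relations on the Baboon-troop data set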
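
| Window width | Sliding step | Conventional sliding window | F-bound test method | Turning-point method | Different-region balance method |
|---|---|---|---|---|---|
| 110 | 10 | 80.63 | 35.39 | 59.10 | 80.63 |
| 110 | 20 | 70.62 | 35.39 | 59.10 | 82.30 |
| 110 | 30 | 70.62 | 35.39 | 59.10 | 80.47 |
| 120 | 10 | 80.63 | 35.39 | 59.10 | 81.64 |
| 120 | 20 | 62.27 | 35.39 | 59.10 | 83.31 |
| 120 | 30 | 63.94 | 35.39 | 59.10 | 83.97 |
| 130 | 10 | 80.80 | 35.39 | 59.10 | 82.30 |
| 130 | 20 | 75.79 | 35.39 | 59.10 | 83.97 |
| 130 | 30 | 82.30 | 35.39 | 59.10 | 82.97 |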