Bayesian Variational Inference Algorithm Based on Expectation-Maximization and Simulated Annealing
Abstract: Bayesian variational inference suffers from low convergence precision and easily falls into local optima during the search process. Based on Simulated Annealing (SA) and Expectation-Maximization (EM) theory, and considering both the influence of the initial prior on the final result and the optimization efficiency of the variational free energy, this paper constructs a double EM model to learn the initial priors of the variational parameters, reducing sensitivity to the initial prior, and introduces an inverse temperature parameter into the variational free energy function so that the free energy is effectively controlled during optimization. On this basis, a Bayesian variational inference algorithm based on expectation-maximization and simulated annealing (ES-VBI) is proposed. The convergence of the algorithm is analyzed using convergence-criterion theory, and the algorithm is evaluated in simulation on a Gaussian mixture model instance; the experimental results show that the proposed algorithm achieves better convergence than the compared methods.
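The tempered objective itself is given in the paper's Eqs. (12)–(27), which this excerpt does not reproduce. As a hedged sketch, the standard deterministic-annealing form of the variational free energy consistent with the description above is

$$ \mathcal{F}_{\phi}[q] = \mathbb{E}_{q(z)q(\omega)}\left[ \ln p(x, z, \omega) \right] - \frac{1}{\phi}\,\mathbb{E}_{q(z)q(\omega)}\left[ \ln q(z)q(\omega) \right] $$

where $\phi$ is the inverse temperature and $q(z)q(\omega)$ is the mean-field factorization. For $\phi < 1$ the entropy term is up-weighted, smoothing the objective so that the coordinate updates of $q(z)$ and $q(\omega)$ can escape shallow local optima; as $\phi \to 1$, $\mathcal{F}_{\phi}$ reduces to the ordinary evidence lower bound (ELBO) reported in Table 2.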
Table 1  The ES-VBI algorithm
(1) Construct the double EM model based on maximum-likelihood estimation according to Eqs. (12)–(18) and compute the initial priors $\omega^*$ and $z^*$;
(2) Set the initial inverse temperature parameter $\phi$ of simulated annealing, $0 < \phi < 1$, set $t = 0$, and construct the objective function based on the inverse-temperature variational free energy;
(3) Derive the iterative update formulas for $z$ and $\omega$ by the method of Lagrange multipliers;
(4) Repeat until convergence: update $q(z)$ via Eq. (26); update $q(\omega)$ via Eq. (27); set $t = t + 1$;
(5) $\phi \leftarrow \phi \times \mathrm{const}$;
(6) If $\phi < 1$, return to step (4); otherwise terminate.
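Since Eqs. (12)–(18), (26) and (27) are not reproduced in this excerpt, the following Python sketch illustrates only the control flow of Table 1: plain maximum-likelihood EM stands in for the paper's double EM initialization, and standard tempered mean-field responsibilities stand in for the paper's $q(z)$ and $q(\omega)$ updates. All names here (em_init, es_vbi, phi0, const, n_inner) are hypothetical.

# Illustrative sketch of the ES-VBI loop for a 1-D Gaussian mixture.
# The paper's exact update equations are not shown in this excerpt, so
# the updates below are standard tempered (deterministic-annealing)
# mean-field stand-ins, not the published formulas.
import numpy as np
from scipy.stats import norm

def em_init(x, K, n_iter=20, rng=None):
    """Plain maximum-likelihood EM, used here as a stand-in for the
    paper's double-EM construction of the initial priors (w*, z*)."""
    rng = np.random.default_rng(rng)
    mu = rng.choice(x, K)                    # random initial means
    sigma = np.full(K, x.std())
    w = np.full(K, 1.0 / K)                  # mixture weights
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k]
        r = w * norm.pdf(x[:, None], mu, sigma)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sigma

def es_vbi(x, K, phi0=0.2, const=1.2, n_inner=30, rng=None):
    """Annealed variational loop: raise the inverse temperature phi
    geometrically until it reaches 1 (steps (2)-(6) of Table 1)."""
    w, mu, sigma = em_init(x, K, rng=rng)    # step (1): initial prior
    phi = phi0                               # step (2): 0 < phi < 1
    while True:
        # step (4): fixed inner budget as a stand-in for "until convergence"
        for _ in range(n_inner):
            # tempered responsibilities: q(z) update under phi
            log_r = phi * (np.log(w) + norm.logpdf(x[:, None], mu, sigma))
            log_r -= log_r.max(axis=1, keepdims=True)   # numerical stability
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)
            # q(w) update (a point estimate of the weights in this sketch)
            nk = r.sum(axis=0)
            w = nk / len(x)
            mu = (r * x[:, None]).sum(axis=0) / nk
            sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        phi *= const                         # step (5): anneal, const > 1
        if phi >= 1.0:                       # step (6): stop once phi reaches 1
            return w, mu, sigma

For example, with x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)]), es_vbi(x, K=2) recovers the two components; the geometric factor const > 1 implements step (5) so that the loop terminates once $\phi \geq 1$.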
Table 2  ELBO and runtime comparison of the algorithms

Algorithm   ELBO          t (s)
ES-VBI      −664428.51     9.28
A-VI        −721994.83    15.44
OSBL-VB     −707239.90    27.97
DA-VB       −922489.46    12.83
GDAEM       −790262.55     5.62
MCVI        −894487.22    30.25