Dimensionality Reduction and Reconstruction of Data Based on Autoencoder Network
doi: 10.3724/SP.J.1146.2008.00477
Keywords:
- Autoencoder network; high-dimensional data; dimensionality reduction; reconstruction
Abstract: The curse of dimensionality is a recurring difficulty in machine learning, pattern recognition, data mining, and related fields. Dimensionality reduction of feature data, that is, simplifying high-dimensional feature data by projecting it into a low-dimensional space before further processing, has therefore become one of the current research hotspots in data-driven computational methods. This paper introduces a nonlinear dimensionality reduction method called the autoencoder neural network. The method adopts the network structure of the Continuous Restricted Boltzmann Machine (CRBM) and trains a bidirectional deep neural network with multiple hidden layers to convert high-dimensional data into a low-dimensional embedding and then reconstruct the high-dimensional data from it. In particular, the autoencoder network provides a bidirectional mapping between the high-dimensional data space and the low-dimensional embedded structure, which addresses the lack of an inverse mapping in most nonlinear dimensionality reduction methods. Experiments on synthetic data and real image data show that the autoencoder can not only discover the nonlinear low-dimensional structure embedded in high-dimensional data but also effectively recover the original high-dimensional data from that low-dimensional structure.
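The bidirectional mapping described in the abstract (high-dimensional data to a low-dimensional code, and back again) can be illustrated with a small sketch. This is not the paper's method: it omits the CRBM-based training entirely and instead fits a tiny encoder and decoder with plain gradient descent on the reconstruction error. The toy curve, the layer sizes, the learning rate, and every function name (`make_curve`, `init_params`, `run`, `encode`, `decode`, `train_step`) are assumptions made purely for illustration.

```python
import numpy as np

def make_curve(n):
    """Toy data (an assumption, not the paper's dataset): a 1-D curve in 3-D."""
    t = np.linspace(-1.0, 1.0, n)
    return np.stack([t, t**2, t**3], axis=1)

def init_params(n_in, n_hidden, n_code, seed=0):
    """Layers: input -> hidden -> code -> hidden -> output."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_code, n_hidden, n_in]
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def run(layers, x):
    """Apply layers in order: tanh on all but the last, which is linear."""
    for i, (W, b) in enumerate(layers):
        z = x @ W + b
        x = z if i == len(layers) - 1 else np.tanh(z)
    return x

def encode(params, x):
    """Forward mapping: high-dimensional data -> low-dimensional code."""
    return run(params[:2], x)

def decode(params, code):
    """Inverse mapping: low-dimensional code -> reconstructed data."""
    return run(params[2:], code)

def train_step(params, x, lr):
    """One full-batch gradient step on the mean squared reconstruction error.

    Plain backpropagation only; no CRBM pretraining is performed here.
    Updates params in place and returns the loss before the update.
    """
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        # Even-indexed layers use tanh; odd-indexed (code, output) are linear.
        acts.append(z if i % 2 == 1 else np.tanh(z))
    err = acts[-1] - x
    loss = np.mean(np.sum(err**2, axis=1))
    delta = 2.0 * err / len(x)  # gradient w.r.t. the linear output layer
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            delta = delta @ W.T
            if (i - 1) % 2 == 0:  # the layer below used tanh
                delta = delta * (1.0 - acts[i]**2)
        params[i] = (W - lr * grad_W, b - lr * grad_b)
    return loss

X = make_curve(200)
params = init_params(n_in=3, n_hidden=8, n_code=1)
losses = [train_step(params, X, lr=0.02) for _ in range(2000)]
codes = encode(params, X)     # low-dimensional representation, shape (200, 1)
recon = decode(params, codes) # reconstruction in the original 3-D space
print("first loss:", losses[0], "last loss:", losses[-1])
```

Here `encode` and `decode` play the roles of the forward and inverse mappings, so a point in the low-dimensional code space can be mapped back to the data space without any extra machinery. How well such a network recovers the underlying structure depends heavily on initialization and training, which is the part the paper's CRBM-based approach is meant to address.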