Pedestrian Recognition Method Based on Depth Hierarchical Feature Representation

孙锐; 张广海; 高隽

doi:10.11999/JEIT150982

Publication history
- Received: 2015-09-06
- Revised: 2015-12-25
- Published: 2016-06-19

Pedestrian Recognition Method Based on Depth Hierarchical Feature Representation

Funds:

The National Natural Science Foundation of China (61471154), Scientific Research Foundation for Returned Scholars, Ministry of Education of China

Abstract: To address the feature representation problem in pedestrian recognition, this paper proposes a hybrid hierarchical feature representation method that combines the representational power of the bag-of-words model with the learning adaptability of a deep hierarchical architecture. Local features are first extracted with the gradient-based HOG local descriptor and then encoded by a deep hierarchical coding method built from spatially aggregating Restricted Boltzmann Machines (RBMs). For each coding layer, sparsity and selectivity regularization are used in unsupervised RBM learning, and supervised fine-tuning is then applied to strengthen the visual feature representation for the classification task. Max pooling and the spatial pyramid method then yield the high-level image feature representation. Finally, a linear support vector machine performs pedestrian recognition. The extracted deep hierarchical features naturally separate target-irrelevant parts such as occlusions, which effectively improves the accuracy of the subsequent recognition. Experimental results show that the proposed method achieves a high recognition rate.
- Pedestrian recognition
- Hybrid structure
- Deep learning
- Depth hierarchical coding
- Restricted Boltzmann Machine (RBM)
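
The coding and pooling stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 4x4 grid of 36-dimensional descriptors, the 8 hidden units, and the 1x1/2x2/4x4 pyramid are all hypothetical choices for illustration. The RBM weights are random stand-ins: the unsupervised learning with sparsity and selectivity regularization and the supervised fine-tuning are not shown.

```python
import math
import random


def rbm_encode(descriptor, weights, hidden_bias):
    """Map one local descriptor to RBM hidden-unit activation probabilities."""
    codes = []
    for j, bias in enumerate(hidden_bias):
        total = bias + sum(v * weights[i][j] for i, v in enumerate(descriptor))
        codes.append(1.0 / (1.0 + math.exp(-total)))  # logistic sigmoid
    return codes


def max_pool(codes):
    """Element-wise maximum over a non-empty list of code vectors."""
    return [max(column) for column in zip(*codes)]


def spatial_pyramid_max_pool(code_grid, levels=(1, 2, 4)):
    """Concatenate max-pooled codes over n x n partitions of the grid.

    Assumes the grid has at least max(levels) rows and columns,
    so that no pyramid cell is empty.
    """
    rows, cols = len(code_grid), len(code_grid[0])
    feature = []
    for n in levels:
        for r in range(n):
            for c in range(n):
                cell = [code_grid[y][x]
                        for y in range(r * rows // n, (r + 1) * rows // n)
                        for x in range(c * cols // n, (c + 1) * cols // n)]
                feature.extend(max_pool(cell))
    return feature


random.seed(0)
n_visible, n_hidden = 36, 8  # hypothetical sizes, not taken from the paper

# Random stand-in weights; a trained RBM would supply these.
weights = [[random.gauss(0.0, 0.1) for _ in range(n_hidden)]
           for _ in range(n_visible)]
hidden_bias = [0.0] * n_hidden

# Random stand-ins for a 4x4 grid of HOG-like local descriptors.
descriptors = [[[random.random() for _ in range(n_visible)] for _ in range(4)]
               for _ in range(4)]

code_grid = [[rbm_encode(d, weights, hidden_bias) for d in row]
             for row in descriptors]
feature = spatial_pyramid_max_pool(code_grid)
print(len(feature))  # (1 + 4 + 16) cells x 8 hidden units = 168
```

In a full pipeline, a vector like `feature` would be computed per image and passed to a linear support vector machine. Max pooling keeps only the strongest response of each hidden unit in a region, which is one reason pooled codes tolerate small shifts in where a local pattern appears.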
