A Novel Template Reduction K-Nearest Neighbor Classification Method Based on Weighted Distance
doi: 10.3724/SP.J.1146.2011.00051
Abstract: As a nonparametric classification algorithm, K-Nearest Neighbor (KNN) is simple, effective, and easy to implement. However, traditional KNN assumes that all K nearest neighbors contribute equally, which makes the algorithm sensitive to noise; moreover, for large data sets the computational cost of classifying patterns with KNN can be prohibitive. To address these problems, this paper proposes a new Template reduction KNN algorithm based on Weighted distance (TWKNN). First, a template reduction technique drops the training samples that lie far from the classification boundary. Then, during classification, each of the K nearest neighbors of the test sample is assigned a weight according to its Euclidean distance, which enhances the robustness of the algorithm. Experimental results show that the proposed approach effectively reduces the number of training samples while maintaining the classification accuracy of traditional KNN.
Key words:
- Pattern recognition /
- Weighted distance /
- Template reduction /
- K-Nearest Neighbor (KNN)
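The two-stage procedure described in the abstract, pruning training points far from the class boundary and then voting with distance-dependent weights, can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the inverse-distance weight formula and the "all k neighbors share my label" pruning criterion are simplifying assumptions chosen for clarity.

```python
import numpy as np

def reduce_template(X, y, k=3):
    """Drop training samples far from the class boundary.

    Pruning rule used here (an assumption for illustration, not the
    paper's exact template-reduction criterion): a sample is discarded
    when all of its k nearest neighbors carry the same label as the
    sample itself, i.e. it sits deep inside its own class region.
    """
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]        # k nearest, skipping the point itself
        if not np.all(y[nn] == y[i]):      # a differently-labeled neighbor exists
            keep.append(i)                 # => near the boundary, keep it
    return X[keep], y[keep]

def weighted_knn_predict(X, y, x, k=3, eps=1e-9):
    """Classify x by its k nearest neighbors, each vote weighted by the
    inverse of its Euclidean distance (one common weighting scheme;
    the paper's exact weight formula may differ)."""
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]
    votes = {}
    for i in idx:
        w = 1.0 / (d[i] + eps)             # closer neighbors vote louder
        votes[y[i]] = votes.get(y[i], 0.0) + w
    return max(votes, key=votes.get)
```

On a 1-D toy set with class 0 at positions 0..3 and class 1 at 4..7, `reduce_template` with k=2 keeps only the two samples adjacent to the boundary, and `weighted_knn_predict` then classifies new points against the reduced template.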