A Novel Template Reduction K-Nearest Neighbor Classification Method Based on Weighted Distance
doi: 10.3724/SP.J.1146.2011.00051
Abstract: As a nonparametric classification algorithm, K-Nearest Neighbor (KNN) is simple, effective, and easy to implement. However, traditional KNN assumes that all K nearest neighbors contribute equally, which makes the algorithm sensitive to noise; moreover, for large data sets the computational cost of classifying patterns with KNN can be prohibitive. To address these problems, this paper proposes a new Template reduction KNN algorithm based on Weighted distance (TWKNN). First, a template reduction technique drops the training samples that lie far from the classification boundary. Then, during classification, each of the K nearest neighbors of the test sample is assigned a weight according to its Euclidean distance, which enhances the robustness of the algorithm. Experimental results show that the proposed approach effectively reduces the number of training samples while maintaining the classification accuracy of traditional KNN.
Key words:
- Pattern recognition /
- Weighted distance /
- Template reduction /
- K-Nearest Neighbor (KNN)
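The two-stage procedure described in the abstract, pruning training points far from the class boundary and then voting with distance-dependent weights, can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the inverse-distance weight formula and the "all k neighbors share my label" pruning criterion are simplifying assumptions chosen for clarity.

```python
import numpy as np

def reduce_template(X, y, k=3):
    """Drop training samples far from the class boundary.

    Pruning rule used here (an assumption for illustration, not the
    paper's exact template-reduction criterion): a sample is discarded
    when all of its k nearest neighbors carry the same label as the
    sample itself, i.e. it sits deep inside its own class region.
    """
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]        # k nearest, skipping the point itself
        if not np.all(y[nn] == y[i]):      # a differently-labeled neighbor exists
            keep.append(i)                 # => near the boundary, keep it
    return X[keep], y[keep]

def weighted_knn_predict(X, y, x, k=3, eps=1e-9):
    """Classify x by its k nearest neighbors, each vote weighted by the
    inverse of its Euclidean distance (one common weighting scheme;
    the paper's exact weight formula may differ)."""
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]
    votes = {}
    for i in idx:
        w = 1.0 / (d[i] + eps)             # closer neighbors vote louder
        votes[y[i]] = votes.get(y[i], 0.0) + w
    return max(votes, key=votes.get)
```

On a 1-D toy set with class 0 at positions 0..3 and class 1 at 4..7, `reduce_template` with k=2 keeps only the two samples adjacent to the boundary, and `weighted_knn_predict` then classifies new points against the reduced template.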