摘要:
该文提出一种改进的随机游动模型,并在此模型的基础上,发展了一种数据聚类算法。在此算法中,数据集中的样本点根据改进的随机游动模型,生成有权无向图G(V,E,d),其中每个样本点对应图G的一个顶点,并且假设每个顶点为可以在空间中移动的Agent。随后计算每个顶点向其邻集中顶点转移的概率,在随机选定邻集中的一个顶点作为转移方向后,移动一个单位距离。在所有样本点不断随机游动的过程中,同类的样本点就会逐渐的聚集到一起,而不同类的样本点相互远离,最后使得聚类自动形成。实验结果表明,基于随机游动的聚类算法能使样本点合理有效地被聚类,同时,与其他算法对比也说明了此算法的有效性。
Abstract:
In this paper, a modified model of random walk is proposed, and then a clustering algorithm is developed based on this model. In the algorithm, at first a weighted and undirected graph G(V,E,d) is constructed among data points in a dataset according to the model, where each data point corresponds to a vertex in the graph, and is regarded as an agent who can move randomly in space. Next, the transition probabilities of data points are computed, and then each data point chooses a neighbor randomly in its neighborhood as a transition direction and takes a step to it. As all data points walk in space at random repeatedly, the data points that belong to the same class are located at a same position, whereas those that belong to different classes are away from one another. Consequently, the experimental results demonstrate that data points in datasets are clustered reasonably and efficiently. Moreover, the comparison with other algorithms also provides an indication of the effectiveness of the algorithm.