Volume 37 Issue 11
Nov.  2015
Chen Li-min, Yang Jing, Zhang Jian-pei. A Fast Clustering Algorithm Based on Embedding Technology for Heterogeneous Information Networks[J]. Journal of Electronics & Information Technology, 2015, 37(11): 2634-2641. doi: 10.11999/JEIT150106

A Fast Clustering Algorithm Based on Embedding Technology for Heterogeneous Information Networks

doi: 10.11999/JEIT150106
Funds:

The National Natural Science Foundation of China (61370083, 61073043, 61073041)

  • Received Date: 2015-01-21
  • Revised Date: 2015-07-16
  • Publish Date: 2015-11-19
  • Clustering heterogeneous information networks is an active research topic. Taking advantage of the sparsity of heterogeneous information networks, this paper proposes a fast clustering algorithm based on embedding technology for heterogeneous information networks with a star network schema. First, the heterogeneous information network is transformed into several compatible bipartite graphs from a compatibility point of view. Then, the approximate commute distance embedding of each bipartite graph is computed via random mapping and a linear-time solver, and an indicator subset in each embedding represents the target dataset. Finally, a general model is formulated over all the indicator subsets, and a minimum of the model is obtained by simultaneously clustering all of the indicator subsets, using the sum of the weighted distances over all indicators of the same target object. Theoretical analysis and experimental verification show that the proposed algorithm is effective.
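The embedding step named in the abstract (approximate commute distance embedding of a bipartite graph via random mapping and a linear solver) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `commute_embedding`, the parameter `k`, and the dense NumPy pseudo-inverse are assumptions made for illustration. In particular, the pseudo-inverse stands in for the linear-time solver, so this sketch is only suitable for small graphs.

```python
import numpy as np


def commute_embedding(A, k=200, seed=0):
    """Approximate commute-distance embedding of a weighted bipartite graph.

    A is an (n1, n2) non-negative weight matrix linking the two node types.
    Returns a (k, n1 + n2) matrix Z; the squared Euclidean distance between
    columns i and j approximates the commute distance between nodes i and j.
    Hypothetical helper, written for illustration only.
    """
    A = np.asarray(A, dtype=float)
    n1, n2 = A.shape
    n = n1 + n2
    rows, cols = np.nonzero(A)
    w = A[rows, cols]          # edge weights
    m = len(w)                 # number of edges

    # Signed edge-node incidence matrix B (m x n).
    B = np.zeros((m, n))
    B[np.arange(m), rows] = 1.0
    B[np.arange(m), n1 + cols] = -1.0

    # Graph Laplacian L = B^T W B.
    L = B.T @ (w[:, None] * B)

    # Random mapping: k x m matrix with entries +-1/sqrt(k).
    rng = np.random.default_rng(seed)
    Q = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)

    # Y = Q W^{1/2} B, then solve L z = y for every row y of Y.
    Y = Q @ (np.sqrt(w)[:, None] * B)
    # Assumption: a dense pseudo-inverse replaces the fast solver here.
    Z = Y @ np.linalg.pinv(L)

    volume = 2.0 * w.sum()     # sum of all node degrees
    return np.sqrt(volume) * Z


def embedded_distance(Z, i, j):
    """Squared Euclidean distance between embedded nodes i and j."""
    diff = Z[:, i] - Z[:, j]
    return float(diff @ diff)
```

For a graph with a single unit-weight edge, `embedded_distance(commute_embedding(np.array([[1.0]])), 0, 1)` equals the exact commute distance of 2 up to floating-point error, because the random signs cancel in that special case. For larger graphs the result is an approximation whose accuracy improves as `k` grows.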
