A Hierarchical Clustering Method Based on the Threshold of Semantic Feature in Big Data

Luo En-tao, Wang Guo-jun

Citation: Luo En-tao, Wang Guo-jun. A Hierarchical Clustering Method Based on the Threshold of Semantic Feature in Big Data[J]. Journal of Electronics & Information Technology, 2015, 37(12): 2795-2801. doi: 10.11999/JEIT150422

doi: 10.11999/JEIT150422

Funds: The National Natural Science Foundation of China (60173037, 6272496, 61272151); Scientific Research Project of the Education Department of Hunan Province (2015C0589); Key Discipline Project of Hunan University of Science and Engineering

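  • Abstract: Emerging services such as cloud computing, health care, street-view map services, and recommender systems are driving the variety and volume of data to grow at an unprecedented rate. This surge in data volume raises many common problems, for example the representability, processability, and reliability of data. How to process and analyze the relationships among data effectively, improve the efficiency of data partitioning, and build cluster-analysis models has become an urgent problem for both academia and industry. This paper proposes a hierarchical clustering method based on semantic features. The method first trains on the semantic features of the data, then uses the training results to perform hierarchical clustering on each subset, and finally produces the density center points of the whole dataset, which improves the efficiency and accuracy of clustering. The method has low sampling complexity, analyzes data accurately, is easy to implement, and has good discriminative ability.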
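The pipeline outlined in the abstract (cluster each subset hierarchically under a threshold, then combine the results into overall center points) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: plain numeric vectors stand in for semantic features, the threshold is passed in by hand rather than learned in a training step, the subsets are formed by a simple round-robin split, and centroid-linkage merging is an assumed choice. The function names `threshold_agglomerate` and `cluster_in_subsets` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(points: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty group of points."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def threshold_agglomerate(points: Sequence[Vector], threshold: float) -> List[List[List[float]]]:
    """Bottom-up (agglomerative) clustering with centroid linkage.

    Repeatedly merges the two clusters whose centroids are closest and
    stops as soon as that closest distance exceeds `threshold`.
    """
    clusters = [[list(p)] for p in points]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            ci = centroid(clusters[i])
            for j in range(i + 1, len(clusters)):
                d = math.dist(ci, centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # every remaining pair is farther apart than the threshold
        clusters[i].extend(clusters[j])
        del clusters[j]  # safe because j > i
    return clusters


def cluster_in_subsets(points: Sequence[Vector], n_subsets: int, threshold: float) -> List[List[float]]:
    """Cluster each subset separately, then cluster the local centers.

    Assumption: the data is split round-robin. The local centers are
    merged unweighted, so the final centers are approximate.
    """
    local_centers = []
    for k in range(n_subsets):
        subset = points[k::n_subsets]
        if subset:
            for cluster in threshold_agglomerate(subset, threshold):
                local_centers.append(centroid(cluster))
    return [centroid(c) for c in threshold_agglomerate(local_centers, threshold)]
```

Calling `cluster_in_subsets(points, n_subsets=2, threshold=3.0)` on two well-separated groups of points returns one center per group. Working on subsets keeps each quadratic pairwise search small, at the cost of only approximating the centers that a single pass over all the data would give.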
Publication history
  • Received:  2015-04-10
  • Revised:  2015-09-01
  • Published:  2015-12-19
