Extracting Dimension Hierarchy of Tweeters Interests for On-line Analytical Processing

YU Dongjin, NI Zhiyong, SUN Jingchao

Citation: YU Dongjin, NI Zhiyong, SUN Jingchao. Extracting Dimension Hierarchy of Tweeters Interests for On-line Analytical Processing[J]. Journal of Electronics & Information Technology, 2017, 39(9): 2081-2088. doi: 10.11999/JEIT170030

doi: 10.11999/JEIT170030

Funds: The National Natural Science Foundation of China (61100043, 61472112), The Natural Science Foundation of Zhejiang Province (LY12F02003), The Key Science and Technology Project of Zhejiang Province (2017C01010, 2016F50014)

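  • Abstract: Exploring the distribution patterns and correlations of user interests in massive Twitter data supports accurate personalized recommendation. On-Line Analytical Processing (OLAP) offers an intuitive form for exploring data. The key to applying OLAP to Twitter data is how to mine and construct the interest dimension hierarchy of Twitter users. To address the limitation that existing methods can extract interests at only a single level, this paper proposes a method for extracting the interest dimension hierarchy of Twitter users in support of OLAP. The method first obtains Twitter data through the REST API, then mines users' interests and sub-interests with an improved LDA (Latent Dirichlet Allocation) model, and finally constructs the interest dimension hierarchy on that basis. Experiments evaluate the method's model quality and scalability and show that, compared with LDA and hLDA, it extracts the interest dimension hierarchy of Twitter users more effectively and can be applied to OLAP.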
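To make the idea of an interest dimension hierarchy concrete, the following minimal sketch shows how a two-level hierarchy (interest and sub-interest) could support OLAP-style roll-up and drill-down. It does not reproduce the paper's improved LDA model or its data: the hierarchy, the fact rows, and the names `HIERARCHY`, `FACTS`, `roll_up`, and `drill_down` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical two-level interest dimension: sub-interest -> parent interest.
# In the paper these levels would come from the topic model; here they are
# hard-coded purely for illustration.
HIERARCHY = {
    "football": "sports",
    "tennis": "sports",
    "smartphones": "technology",
    "programming": "technology",
}

# Hypothetical fact rows: (user, sub_interest, tweet_count).
FACTS = [
    ("alice", "football", 12),
    ("alice", "programming", 5),
    ("bob", "tennis", 3),
    ("bob", "smartphones", 7),
    ("carol", "football", 4),
]


def roll_up(facts, hierarchy):
    """Aggregate tweet counts from the sub-interest level to the interest level."""
    totals = defaultdict(int)
    for _user, sub_interest, count in facts:
        totals[hierarchy[sub_interest]] += count
    return dict(totals)


def drill_down(facts, hierarchy, interest):
    """Break one interest back down into per-sub-interest tweet counts."""
    totals = defaultdict(int)
    for _user, sub_interest, count in facts:
        if hierarchy[sub_interest] == interest:
            totals[sub_interest] += count
    return dict(totals)


by_interest = roll_up(FACTS, HIERARCHY)
sports_detail = drill_down(FACTS, HIERARCHY, "sports")
print(by_interest)
print(sports_detail)
```

With a single-level interest list, only the flat view would be available; the parent mapping is what lets the same facts be viewed at either level of the hierarchy.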
Publication history
  • Received:  2017-01-11
  • Revised:  2017-08-16
  • Published:  2017-09-19
