Research of User-Relationship Based Data Acquisition Method on Uyghur Microblog (cited by: 4)
Abstract: At present, most mass-generated data on the Internet is concentrated in social networks such as microblogs and forums. Cross-language public opinion analysis is a research hotspot in intelligent information processing in China, and Uyghur is one of the country's major minority languages. To build a good cross-language public opinion analysis system, acquiring Uyghur microblog data is therefore particularly important. The biggest difficulty in acquiring this data is that the microblog's developers do not provide an API. Taking the "Guduk" microblog, which centers on technology and economics, as its research object, this paper proposes a crawler system scheme that acquires Uyghur microblog data based on user relationships; the scheme resolves the difficulty of acquiring data when no API is provided. The work supplies cross-language public opinion analysis systems with a large volume of Uyghur social network data, together with data acquisition methods and techniques.
Source: Journal of Xinjiang University (Natural Science Edition), indexed in CAS and the Peking University Core Journals list, 2015, No. 1, pp. 74-79 (6 pages)
Funding: National Key Basic Research Program of China (973 Program) (2014cb340506); National Natural Science Foundation of China (61331011)
Keywords: Cross-language; Public Opinion; Data Extraction; User Relationship; Web Crawler; Micro Blog API
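
The abstract does not describe the crawler's internals. As a rough illustration only, acquiring data "based on user relationships" can be pictured as a breadth-first traversal of the follow graph starting from a few seed users, collecting each visited user's posts along the way. The sketch below is not the authors' implementation: `fetch_profile`, the in-memory `FAKE_SITE` data, the user names and the `max_users` limit are all hypothetical stand-ins for downloading and parsing real pages on a site without an API.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for the site: user -> (posts, related users).
FAKE_SITE: Dict[str, Tuple[List[str], List[str]]] = {
    "seed": (["post s1"], ["a", "b"]),
    "a": (["post a1", "post a2"], ["b", "seed"]),
    "b": ([], ["c"]),
    "c": (["post c1"], []),
}


def fetch_profile(user: str) -> Tuple[List[str], List[str]]:
    """Hypothetical fetcher; a real crawler would download and parse HTML pages."""
    return FAKE_SITE.get(user, ([], []))


def crawl(
    seeds: Iterable[str],
    fetch: Callable[[str], Tuple[List[str], List[str]]] = fetch_profile,
    max_users: int = 1000,
) -> Dict[str, List[str]]:
    """Breadth-first crawl over user relationships; returns user -> posts."""
    queue = deque(dict.fromkeys(seeds))  # de-duplicated seeds, order preserved
    seen = set(queue)
    posts: Dict[str, List[str]] = {}
    while queue and len(posts) < max_users:
        user = queue.popleft()
        user_posts, related = fetch(user)
        posts[user] = user_posts
        for other in related:
            if other not in seen:  # visit each user at most once
                seen.add(other)
                queue.append(other)
    return posts


if __name__ == "__main__":
    for user, user_posts in crawl(["seed"]).items():
        print(user, user_posts)
```

The `seen` set keeps the traversal from looping when users follow each other, and `max_users` bounds the crawl; a real crawler would also need request throttling and error handling, which are omitted here.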