Big Data Processing Mode: System Architecture, Method and Develop Trend (cited by: 13)
Abstract: In recent years, the theory and technology of big data processing have attracted growing attention from both industry and academia. On the one hand, scientific research produces large volumes of data, and understanding these data has become an important means of conducting research. On the other hand, as information technology continues to develop, enterprises have accumulated large amounts of structured and unstructured data in the course of informatization. The data that enterprises manage and operate have become a core asset, profoundly shaping their business models and bringing significant changes to decision-making, organization, and business processes. Big data processing technologies have therefore also drawn great interest from industry. According to the time characteristics of the processing, big data processing can be divided into three modes: offline batch processing, query-based processing, and real-time processing. From a technical perspective, this article summarizes the overall architecture of big data processing and discusses its different layers with respect to each processing mode. Because big data processing rests on data storage, the article first discusses big data storage and then describes the three modes in turn, giving readers a preliminary understanding of how big data systems are built.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 4, pp. 641-647 (7 pages)
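Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA012600) and the Ministry of Education-China Mobile Research Fund (MCM20123021)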
Keywords: big data; system architecture; real-time system; data processing; distributed storage