Abstract
In recent years, the theories and techniques of big data processing have received growing attention from both industry and academia. On the one hand, scientific research produces large volumes of data, and understanding these data has become an important means of conducting research. On the other hand, with the continuous development of information technology, enterprises have accumulated large amounts of structured and unstructured data in the course of informatization. The data that enterprises manage and operate have become core assets that profoundly affect their business models and bring significant changes to decision-making, organization, and business processes. Big data processing technologies have therefore also attracted great interest from industry. Based on the time characteristics of data processing, big data processing can be divided into three modes: offline batch data processing, query-based data processing, and real-time data processing. From a technical point of view, this article summarizes the general architecture of big data processing and discusses its different layers according to the processing mode. Because big data processing is built on data storage, we first discuss big data storage and then describe the three modes in turn, so that readers can gain a preliminary understanding of how big data systems are built.
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2015, Issue 4, pp. 641-647 (7 pages)
Journal of Chinese Computer Systems
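Funding
Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA012600)
Supported by the Ministry of Education-China Mobile Research Fund (MCM20123021)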
Keywords
big data
system architecture
real-time system
data processing
distributed storage