225 articles found
Hotshots of Spatio-temporal Behavior of Chinese Residents in the Context of Big Data: Visual Analysis Based on CiteSpace
1
Authors: LIU Tianlong, WANG Fengyu, JI Xiang. 《Journal of Landscape Research》, 2022, No. 5, pp. 47-51 (5 pages)
By using CiteSpace software to create a knowledge map of authors, institutions and keywords, the literature on the spatio-temporal behavior of Chinese residents based on big data in the architectural planning discipline, published in the China Academic Network Publishing Database (CNKI), was analyzed and discussed. It was found that there was a lack of communication and cooperation among research institutions and scholars; the research hotspots involved four main areas: “application in tourism research”, “application in traffic travel research”, “application in work-housing relationship research”, and “application in personal family life research”.
Keywords: big data; spatio-temporal behavior; visual analysis; hot topics; trends
Big Data Stream Analytics for Near Real-Time Sentiment Analysis (Cited by: 1)
2
Authors: Otto K. M. Cheng, Raymond Lau. 《Journal of Computer and Communications》, 2015, No. 5, pp. 189-195 (7 pages)
In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations’ enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers’ product preferences, and hence develop more effective marketing and production strategies.
Keywords: big data; data stream analytics; sentiment analysis; online review
Applying Apache Spark on Streaming Big Data for Health Status Prediction
3
Authors: Ahmed Ismail Ebada, Ibrahim Elhenawy, Chang-Won Jeong, Yunyoung Nam, Hazem Elbakry, Samir Abdelrazek. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 2, pp. 3511-3527 (17 pages)
Big data applications in healthcare have provided a variety of solutions to reduce costs, errors, and waste. This work aims to develop a real-time system based on big medical data processing in the cloud for the prediction of health issues. In the proposed scalable system, medical parameters are sent to Apache Spark to extract attributes from data and apply the proposed machine learning algorithm. In this way, healthcare risks can be predicted and sent as alerts and recommendations to users and healthcare providers. The proposed work also aims to provide an effective recommendation system by using streaming medical data, historical data on a user’s profile, and a knowledge database to make the most appropriate real-time recommendations and alerts based on the sensor’s measurements. This proposed scalable system works by tweeting the health status attributes of users. Their cloud profile receives the streaming healthcare data in real time by extracting the health attributes via a machine learning prediction algorithm to predict the users’ health status. Subsequently, their status can be sent on demand to healthcare providers. Therefore, machine learning algorithms can be applied to stream health care data from wearables and provide users with insights into their health status. These algorithms can help healthcare providers and individuals focus on health risks and health status changes and consequently improve the quality of life.
Keywords: big data; streaming processing; healthcare data; machine learning; IoT data processing; Apache Spark
Incremental Learning Framework for Mining Big Data Stream
4
Authors: Alaa Eisa, Nora E.L-Rashidy, Mohammad Dahman Alshehri, Hazem M.El-bakry, Samir Abdelrazek. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 5, pp. 2901-2921 (21 pages)
Data stream classification currently plays a key role in big data analytics due to its enormous growth. Most existing classification methods use ensemble learning, which is trustworthy, but these methods are not effective at learning from imbalanced big data, and they also assume that all data are pre-classified. Another weakness of current methods is that they take a long evaluation time when the target data stream contains a high number of features. The main objective of this research is to develop a new method for incremental learning based on the proposed ant lion fuzzy-generative adversarial network model. The proposed model is implemented in the Spark architecture. For each data stream, the class output is computed at slave nodes by training a generative adversarial network with the back propagation error based on fuzzy bound computation. This method overcomes the limitations of existing methods, as it can classify data streams that are partially or completely unlabeled while providing high scalability and efficiency. The results show that the proposed model outperforms the state of the art in terms of accuracy (0.861), precision (0.9328), and minimal MSE (0.0416).
Keywords: ant lion optimization (ALO); big data stream; generative adversarial network (GAN); incremental learning; Rényi entropy
Construction of Smart City Spatio-Temporal Information Cloud Platform in Weifang, China
5
Authors: LIU Qianzhong, LIU Xiaojing, ZHAO Pingting. 《Journal of Donghua University(English Edition)》 EI CAS, 2019, No. 6, pp. 615-622 (8 pages)
On the basis of the digital Weifang geospatial framework, the Smart Weifang spatio-temporal information cloud platform (WFCP) integrated legal person information, population, place name and address data, macroeconomic data and so on. It also expanded the data contents, such as the indoor and outdoor data, the overground and underground data, panoramic data and real data. It also introduced historical geographical information from different periods, real-time location information, address information of sensing equipment, and real-time perception and interpretation information. It has overcome the difficulties of real-time access to Internet of Things (IoT) perception, multi-node collaboration, 64-bit support and cluster deployment, and has the characteristics of spatio-temporal management, on-demand service, large data analysis and micro-service architecture. It built a spatio-temporal information big data center and a spatio-temporal information cloud platform, realized the convergence and management of distributed big data, and was applied in depth in five areas (land, transportation, environmental protection, police and subdistrict) by supporting the integrated application of multi-source information and intelligent deep application. In the aspect of hardware environment construction, according to the top-level design and unified arrangement of Smart Weifang, the WFCP was migrated to the Weifang cloud computing center to achieve on-demand computing resources and load-based dynamic scheduling of computing resources, and to support the generalizing load map application.
Keywords: spatio-temporal information; geospatial framework dataset; HTML5 technology; NewMap spatio-temporal data engine; spatio-temporal big data center
Clustered Single-Board Devices with Docker Container Big Stream Processing Architecture
6
Authors: N.Penchalaiah, Abeer S.Al-Humaimeedy, Mashael Maashi, J.Chinna Babu, Osamah Ibrahim Khalaf, Theyazn H.H.Aldhyani. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 12, pp. 5349-5365 (17 pages)
The expanding amounts of information created by Internet of Things (IoT) devices place a strain on cloud computing, which is often used for data analysis and storage. This paper investigates a different approach based on edge cloud applications, which involves data filtering and processing before being delivered to a backup cloud environment. This paper suggests designing and implementing a low-cost, low-power cluster of Single Board Computers (SBC) for this purpose, reducing the amount of data that must be transmitted elsewhere, using Big Data ideas and technology. An Apache Hadoop and Spark cluster that was used to run a test application was containerized and deployed using a Raspberry Pi cluster and Docker. To obtain system data and analyze the setup’s performance, a Prometheus-based stack monitoring and alerting solution in the cloud-based market is employed. This paper assesses the system’s complexity and demonstrates how containerization can improve fault tolerance and ease of maintenance, allowing the suggested solution to be used in industry. An evaluation of the overall performance is presented to highlight the capabilities and limitations of the suggested architecture, taking into consideration the suggested solution’s resource use with respect to device restrictions.
Keywords: big data; edge cloud; cluster architecture; performance engineering; Raspberry Pi; Docker Swarm; container technology; data streaming
Sentiment Drift Detection and Analysis in Real Time Twitter Data Streams
7
Authors: E.Susi, A.P.Shanthi. 《Computer Systems Science & Engineering》 SCIE EI, 2023, No. 6, pp. 3231-3246 (16 pages)
Handling sentiment drifts in real time Twitter data streams is a challenging task while performing sentiment classification, because of the changes that occur in the sentiments of Twitter users with respect to time. The growing volume of tweets with sentiment drifts has led to the need for devising an adaptive approach to detect and handle this drift in real time. This work proposes an adaptive learning algorithm-based framework, Twitter Sentiment Drift Analysis-Bidirectional Encoder Representations from Transformers (TSDA-BERT), which introduces a sentiment drift measure to detect drifts and a domain impact score to adaptively retrain the classification model with domain relevant data in real time. The framework also works on static data by converting them to data streams using the Kafka tool. The experiments conducted on real time and simulated tweets of sports, health care and financial topics show that the proposed system is able to detect sentiment drifts and maintain the performance of the classification model, with accuracies of 91%, 87% and 90%, respectively. Though the results have been provided only for a few topics, as a proof of concept, this framework can be applied to detect sentiment drifts and perform sentiment classification on real time data streams of any topic.
Keywords: sentiment drift; sentiment classification; big data; BERT; real time data streams; Twitter
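DRGs Performance Evaluation Information Analysis and Processing System Based on Big Data Technology
8
Authors: 陈立真. 《自动化技术与应用》, 2025, No. 2, pp. 80-84 (5 pages)
Current information processing and analysis systems lack real-time capability when processing data, which leads to long average system response times. To address this, a DRGs performance evaluation information processing and analysis system based on big data technology is designed. According to the requirements of performance evaluation information processing, an overall software architecture with three layers is established; an improved fireworks algorithm is used to build an information integration model that aggregates DRGs performance evaluation information; discretized data streams are used to perform real-time, fast clustering analysis on the aggregated information; finally, an in-memory storage mode is combined with synchronous replication to achieve distributed storage of the information. Experimental results show that the average response time of the proposed system to operation commands stays at 4 s under different data volumes, indicating good real-time performance.
Keywords: big data technology; performance evaluation system; information integration; stream processing; information storage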
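Real-Time Energy Consumption Sub-Metering System Based on Spark Streaming (Cited by: 9)
9
Authors: 武志学. 《计算机应用》 CSCD, PKU Core, 2017, No. 4, pp. 928-935 (8 pages)
Energy consumption sub-metering can accurately, promptly, and effectively identify energy usage problems and help form and implement the most effective energy-saving measures. An energy sub-metering system must compute statistics on each category of energy use at different granularities; it has real-time requirements and also involves relatively complex statistical operations such as aggregation, deduplication, and joins. Because the data are generated quickly, are highly time-sensitive, and are large in volume, it is difficult to collect and store them all in a database before processing, so traditional data processing architectures cannot meet the requirements. To this end, a real-time energy consumption sub-metering system built on Spark Streaming big data stream technology is proposed. The system architecture and internal structure are described in detail, and the system's real-time data processing capability is analyzed using experimental data. Unlike traditional architectures, the system captures and processes data in real time as the data flow: it promptly reports captured anomalies to the front end as alarms, and it saves the results of the categorized, itemized statistics to a database for offline analysis and data mining, which effectively solves the data processing problems described above.
Keywords: stream computing; energy consumption sub-metering; Spark Streaming; Apache Kafka; big data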
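Density Clustering Method for Imbalanced Big Data Based on Dynamic Grids
10
Authors: 郭清, 李睿, 李宇, 章荣燕, 刘伟, 雷宇. 《电子设计工程》, 2025, No. 3, pp. 162-167 (6 pages)
To address the problem that clustering imbalanced big data is cumbersome and yields results of low accuracy, a density clustering method based on dynamic grids is proposed. The method partitions the data with dynamic grids, sets a threshold on grid density, and generates grids adaptively to achieve density clustering. The algorithm monitors users' abnormal trajectories through sample training and testing, introduces the concept of class similarity to partition different grid clusters, and treats noise as anomalous data to be detected, which ensures the comprehensiveness of data detection. Experiments show that the improved algorithm handles problems such as imbalanced big data more effectively and with higher accuracy.
Keywords: dynamic grid; imbalanced big data; data stream; class similarity; abnormal trajectory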
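Architecture and Application of Power Streaming Big Data Analysis Based on Spark Streaming (Cited by: 13)
11
Authors: 田璐, 齐林海, 李青, 王红, 田世明, 卜凡鹏. 《电力信息与通信技术》, 2019, No. 2, pp. 23-29 (7 pages)
In recent years, big data stream computing has been studied to meet the real-time requirements of many business needs. Using open-source resources such as Apache Hadoop, Spark Streaming, Kafka, and the NoSQL database Cassandra, this paper proposes a general architecture for power streaming big data analysis. By combining high-throughput publish-subscribe messaging, real-time computing, and a distributed storage system, the architecture effectively addresses the collection, storage, and real-time analysis of concurrently accessed data streams, thereby enabling real-time analysis of streaming data in the power industry. Finally, a real-time anomaly detection system for electricity consumption data is built to verify its performance.
Keywords: Spark Streaming; power streaming big data; power data analysis; anomaly detection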
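Research and Implementation of Model-Based Real-Time Stream Data Processing with Spark Streaming (Cited by: 2)
12
Authors: 云惟英, 苟宇, 王京, 王丽莉. 《测绘与空间地理信息》, 2017, No. S1, pp. 48-50, 55 (4 pages)
Through research and analysis, Spark Streaming was selected to process P real-time stream data. At the same time, a model-based approach was developed to dynamically assemble the software's execution process. A concrete example shows that combining the two substantially improves ease of use, performance, and throughput in data processing.
Keywords: Spark Streaming; spatial big data; real-time stream data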
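Real-Time Traffic Data Processing Platform Based on Spark Streaming (Cited by: 13)
13
Authors: 谭亮, 周静. 《计算机系统应用》, 2018, No. 10, pp. 133-139 (7 pages)
Traffic big data is the most basic prerequisite for solving urban traffic problems and an important foundation for formulating macro-level urban transportation development strategies and for micro-level road traffic management and control. Given that data in intelligent transportation systems are generated quickly, are highly time-sensitive, and are large in volume, this paper builds a real-time traffic data processing platform on the combination of Spark Streaming and Apache Kafka to process data collected by dual-base base stations (双基基站). The platform uses a time-window mechanism to pull data from a continuous Kafka distributed message queue, classifies and processes the data according to rules, and saves the results to a database. The system architecture and internal structure of the platform are described in detail, and experiments verify the system's real-time processing capability, showing that it can be applied to large-scale, highly concurrent data streams.
Keywords: big data; stream processing system; dual-base base station data; Spark Streaming; Apache Kafka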
STGI: a spatio-temporal grid index model for marine big data (Cited by: 2)
14
Authors: Tengteng Qu, Lizhe Wang, Jian Yu, Jining Yan, Guilin Xu, Meng Li, Chengqi Cheng, Kaihua Hou, Bo Chen. 《Big Earth Data》 EI, 2020, No. 4, pp. 435-450 (16 pages)
Marine big data are characterized by large volume and complex structures, which bring great challenges to data management and retrieval. Based on the GeoSOT Grid Code and the composite index structure of the MongoDB database, this paper proposes a spatio-temporal grid index model (STGI) for efficient optimized query of marine big data. A spatio-temporal secondary index is created on the spatial code and time code columns to build a composite index in the MongoDB database used for the storage of massive marine data. Multiple comparative experiments demonstrate that the retrieval efficiency of the STGI approach is more than two to three times that of other index models. Through theoretical analysis and experimental verification, it can be concluded that the STGI model is well suited to retrieving large-scale spatial data with low time frequency, such as marine big data.
Keywords: GeoSOT; spatio-temporal grid index model; marine big data; MongoDB
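Design of a Real-Time Processing System for Massive Logs Based on Spark Streaming (Cited by: 7)
15
Authors: 陆世鹏. 《电子产品可靠性与环境试验》, 2017, No. 5, pp. 71-76 (6 pages)
As the volume of network system log information keeps growing, and in light of practical operations and maintenance needs, a real-time processing system for massive logs based on Spark Streaming is proposed using big data technology. The design and implementation of the system's functions are described in detail, including low-level log data collection, transmission, computation, storage, and query. The system not only parses log information accurately and in real time and performs statistical analysis on the data, but also stores historical log data in real time and processes them with offline computation.
Keywords: big data; Spark Streaming; log analysis; distributed computing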
HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems
16
Authors: Rui Ren, Jiechao Cheng, Xi-Wen He, Lei Wang, Jian-Feng Zhan, Wan-Ling Gao, Chun-Jie Luo. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2019, No. 6, pp. 1167-1184 (18 pages)
With tremendously growing interest in Big Data, the performance improvement of Big Data systems becomes more and more important. Among many steps, the first one is to analyze and diagnose performance bottlenecks of the Big Data systems. Currently, there are two major solutions. One is the pure data-driven diagnosis approach, which may be very time-consuming; the other is the rule-based analysis method, which usually requires prior knowledge. For Big Data applications like Spark workloads, we observe that the tasks in the same stages normally execute the same or similar codes on each data partition. On the basis of the stage similarity and distributed characteristics of Big Data systems, we analyze the behaviors of the Big Data applications in terms of both system and micro-architectural metrics of each stage. Furthermore, for different performance problems, we propose a hybrid approach that combines prior rules and machine learning algorithms to detect performance anomalies, such as straggler tasks, task assignment imbalance, data skew, abnormal nodes and outlier metrics. Following this methodology, we design and implement a lightweight, extensible tool, named HybridTune, and measure the overhead and anomaly detection effectiveness of HybridTune using the BigDataBench benchmarks. Our experiments show that the overhead of HybridTune is only 5%, and the accuracy of the outlier detection algorithm reaches up to 93%. Finally, we report several use cases diagnosing Spark and Hadoop workloads using BigDataBench, which demonstrate the potential use of HybridTune.
Keywords: big data system; spatio-temporal correlation; rule-based diagnosis; machine learning
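Design and Implementation of Stream Processing for Surface Automatic Weather Station Data (Cited by: 2)
17
Authors: 肖卫青, 薛蕾, 刘振, 罗兵, 王颖, 张来恩, 郭萍, 霍庆, 韩书丽, 何文春. 《应用气象学报》 CSCD, PKU Core, 2024, No. 3, pp. 373-384 (12 pages)
To handle massive surface automatic weather station data, whose observation density and frequency keep increasing, Storm-based real-time stream processing was designed in the meteorological big data cloud platform (Tianqing, 天擎), exploiting massively parallel processing to improve the timeliness of surface automatic weather station data processing. In the stream processing design, the processing topology directly decodes data messages in standard format; messages are acknowledged manually, with the data decoding component anchored to the data ingestion component, so that every record is processed reliably; byte checks and time checks are performed during decoding to filter out abnormal data; a batch-plus-timer sending strategy solves the problem of sending massive monitoring information to the integrated meteorological operations real-time monitoring system (Tianjing, 天镜); and some spare resources are reserved in the cluster deployment to cope effectively with single-node failures. Application results show that the service latency of hourly data from national weather stations improved from 175 s in the national integrated meteorological information sharing system (CIMISS) to 78 s in Tianqing; the service latency of hourly data from about 6×10^4 regional weather stations improved from 5 min in CIMISS to 2 min in Tianqing; and after the real-time analysis system switched its data source to Tianqing, the number of stations retrievable at the same query time doubled compared with CIMISS. In December 2021, Storm-based stream processing went into operation at national and provincial levels together with Tianqing and has run stably over the long term, providing efficient and stable surface automatic weather station data to users such as MICAPS4, SWAN2.0, and the real-time analysis system.
Keywords: meteorological big data cloud platform; surface automatic weather station; Storm; RabbitMQ; stream processing; BUFR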
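Simulation of Decision Tree Model Construction for Uncertain Big Data Stream Classification (Cited by: 1)
18
Authors: 杨知玲, 谭树杰. 《计算机仿真》, 2024, No. 5, pp. 532-535, 542 (5 pages)
In uncertain big data stream classification, interference from noise and outliers prevents processing quality and classification accuracy from meeting expectations. To solve this problem, a decision-tree-based classification algorithm for uncertain big data streams is proposed. An online dictionary learning algorithm is used to denoise the uncertain big data stream, eliminating the interference of noise with classification. A decision tree is constructed, and a feature filtering algorithm applied during pruning filters out the outliers mixed into the stream. The denoised uncertain big data stream is then fed into the decision tree model to complete classification. Experimental results show that after processing by the proposed algorithm the amplitude of the uncertain big data stream is markedly reduced and classification accuracy is high, giving the algorithm practical value.
Keywords: decision tree model; online dictionary learning algorithm; feature filtering; uncertain big data stream; data classification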
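Design and Implementation of a Real-Time Text Profiling System Based on Structured Streaming
19
Authors: 谢莹庆, 熊义龙, 曹炳尧. 《工业控制计算机》, 2022, No. 11, pp. 114-116, 118 (4 pages)
To address the real-time performance and accuracy of profiling systems in big data environments, the design and implementation of a real-time profiling system based on Structured Streaming is proposed. The canal component provides incremental subscription to the user behavior log system, the Kafka message middleware ingests the real-time data stream, and the Structured Streaming real-time computing framework analyzes and processes users' real-time data to characterize their real-time interests. An improved TF-IDF algorithm improves the accuracy and reliability of the text profiling system, and Structured Streaming's good interoperability with static data relieves real-time computing pressure and improves system response speed.
Keywords: Structured Streaming; big data; profiling system; TF-IDF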
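Research on Difficulties and Key Technologies in Building a Hospital Big Data Platform (Cited by: 4)
20
Authors: 宋雪, 王觅也, 郑涛, 师庆科, 黄勇. 《中国卫生信息管理杂志》, 2024, No. 2, pp. 286-290, 324 (6 pages)
Objective: To resolve the difficulties that hospital big data platforms face in data collection, governance, and application. Methods: Experience in building a big data platform was summarized, the construction difficulties at each stage were analyzed in depth, and key technical solutions were proposed, including unified stream-batch data processing, lakehouse storage, and separation of storage and compute. Results: The platform has ingested data from 34 hospital business systems and more than 3 PB of genomics data, provides more than 2500 TFLOPS of computing resources, and supplies application services for clinical diagnosis and treatment, management decision-making, and clinical research. Conclusion: The application-driven big data platform has gradually achieved unified storage and centralized management of the hospital's data assets, helping to promote the application and development of big data technology in healthcare.
Keywords: medical big data; data collection; data governance; data application; unified stream-batch processing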