
A regional shared and high concurrent storage architecture based on NVMeoF storage pool (一种基于NVMeoF存储池的分域共享并发存储架构)

Cited by: 5
Abstract: In the era of exascale computing and big data, many big data applications run on High Performance Computing (HPC) systems in order to exploit their parallel computing capabilities. As a result, the I/O patterns of supercomputers are becoming more complex and heterogeneous, and the I/O bottleneck is increasingly severe. Flash-based storage arrays and storage servers have gradually been deployed in the parallel storage systems of HPC machines. However, conventional shared storage architectures, I/O software stacks, and storage networks were designed primarily for Hard Disk Drives (HDD). The overhead they add on the I/O path prevents Non-Volatile Memory (NVM) media from delivering their performance benefits, so storage systems still suffer from high I/O access latency and from limited concurrent I/O throughput and burst I/O bandwidth. To address these problems, this paper proposes a regional shared and high concurrent storage architecture based on NVM. We design NV-BSP, a burst I/O buffer storage pool that supports NVMeoF network storage, and implement key techniques such as virtualized storage pool resource management and NVMeoF network storage communication over the Tianhe high-speed interconnect. NV-BSP can scale both horizontally and vertically, and it effectively supports burst I/O acceleration and low-latency remote storage access for specific computing tasks. Based on a performance analysis model for HPC and big data applications running together, we further propose a Quality-of-Service (QoS) control strategy for mixed applications. Results on a small-scale prototype system show that the read and write performance of NV-BSP scales well with the number of concurrent I/O handling threads, and that NV-BSP has a clear performance advantage over the MD-RAID built into Linux, including higher I/O bandwidth. Compared with local I/O access, NVMeoF remote storage over the Tianhe interconnect adds only 59.25 μs of read latency and 54.03 μs of write latency. By disaggregating storage from computation, NV-BSP delivers performance comparable to a local storage pool while improving the flexibility of dynamic storage resource allocation and the reliability of the system.
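To put the reported remote-access overhead in perspective, the following minimal sketch adds the two overhead figures from the abstract to a local latency and derives the resulting upper bound on IOPS for a single thread that issues one request at a time. Only the 59.25 μs and 54.03 μs figures come from the abstract. The function names and the 100 μs local latency are hypothetical and chosen purely for illustration; they are not measurements from the paper.

```python
# Extra latency of NVMeoF remote access over local I/O, as reported in the abstract.
READ_OVERHEAD_US = 59.25
WRITE_OVERHEAD_US = 54.03


def remote_latency_us(local_latency_us: float, overhead_us: float) -> float:
    """Return remote latency as local latency plus the reported overhead."""
    if local_latency_us <= 0:
        raise ValueError("local latency must be positive")
    return local_latency_us + overhead_us


def qd1_iops(latency_us: float) -> float:
    """Upper bound on IOPS for one thread with one outstanding request."""
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    return 1_000_000 / latency_us


if __name__ == "__main__":
    # ASSUMPTION: 100 us is a hypothetical local latency, not a value from the paper.
    local_us = 100.0
    for name, overhead in (("read", READ_OVERHEAD_US), ("write", WRITE_OVERHEAD_US)):
        remote_us = remote_latency_us(local_us, overhead)
        print(
            f"{name}: local {local_us:.2f} us, remote {remote_us:.2f} us, "
            f"single-thread bound {qd1_iops(remote_us):.0f} IOPS"
        )
```

Because the overhead is a fixed additive cost per request, it bounds what a single synchronous thread can achieve. Issuing requests from many concurrent I/O threads overlaps that cost, which is consistent with the abstract's observation that throughput scales with the number of I/O handling threads.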
Authors: LI Qiong (李琼); SONG Zhen-long (宋振龙); YUAN Yuan (袁远); XIE Xu-chao (谢徐超). School of Computer, National University of Defense Technology, Changsha 410073, China
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 10, pp. 1711-1719 (9 pages)
Funding: National Key Research and Development Program of China (2018YFB0204301).
Keywords: storage architecture; burst buffer; NVMe SSD; NVMe over Fabrics (NVMeoF); high performance computing; big data
