Abstract
How to efficiently handle discrete memory access on unstructured grids has long been one of the central issues in parallel algorithms and applications for scientific and engineering computing. The distributed block reconnection optimization algorithm, designed for the domestic Sunway heterogeneous many-core architecture, maintains high computing performance when solving unstructured sparse problems in applications. Based on an in-depth analysis of the on-chip communication mechanism of the many-core architecture, an efficient message grouping strategy is designed to improve the bandwidth utilization of the on-chip slave-core array. This strategy is combined with a barrier-free data distribution algorithm to make full use of the network performance of the domestic heterogeneous many-core architecture. Performance modeling and experimental analysis show that, under different memory access patterns, the algorithm reaches an average memory bandwidth of more than 70% of the theoretical value. Compared with the serial algorithm on the master core, it achieves an average speedup of 10 times and a maximum speedup of 45 times. Tests on applications from several different fields also demonstrate the general applicability of the algorithm.
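As a reading aid, the idea of grouping scattered accesses by the block that owns the data can be illustrated with a toy sketch. It is not the paper's algorithm and does not model the Sunway slave-core array or its on-chip network; the function names and the `block_size` parameter are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_owner(indices, block_size):
    """Bucket requested global indices by the block that owns them.

    Each entry keeps the request's position so results can be written
    back in the original order.
    """
    groups = defaultdict(list)
    for pos, idx in enumerate(indices):
        groups[idx // block_size].append((pos, idx))
    return groups


def grouped_gather(data, indices, block_size):
    """Compute [data[i] for i in indices], visiting one block at a time."""
    out = [None] * len(indices)
    groups = group_by_owner(indices, block_size)
    for owner in sorted(groups):
        lo = owner * block_size
        block = data[lo:lo + block_size]  # one contiguous read per owning block
        for pos, idx in groups[owner]:
            out[pos] = block[idx - lo]
    return out


if __name__ == "__main__":
    data = list(range(100, 120))
    indices = [17, 3, 8, 3, 19, 0]  # irregular, repeated accesses
    print(grouped_gather(data, indices, block_size=4))
```

In this sketch every owning block is read once as a contiguous slice, however many scattered requests fall into it. That is the intuition behind grouping requests before moving data, reduced here to a single-process toy.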
Authors
叶跃进
李芳
陈德训
郭恒
陈鑫
YE Yue-jin; LI Fang; CHEN De-xun; GUO Heng; CHEN Xin (National Supercomputing Center in Wuxi, Wuxi, Jiangsu 214000, China; Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2022, Issue 6, pp. 73-80 (8 pages)
Computer Science
Funding
National Key Research and Development Program of China, Key Special Project on "High Performance Computing" (2020YFB0204804, 2016YFB0201100).
Keywords
Domestic many-core architecture
Unstructured grid
On-chip communication
Message grouping
Barrier-free data distribution