Abstract
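Based on the BSP model, this paper proposes an asynchronous computing model, CSA-BSP. The model describes the performance parameters of parallel machines more accurately and guides users toward writing efficient parallel programs. Under the CSA-BSP model, the positions of two asynchronously executing processes differ by at most p-1 supersteps. Using program execution time as the measure, the authors analyze the efficiency of BSP, A-BSP, and CSA-BSP programs and conclude that CSA-BSP programs are the most efficient. The results were verified on a Dawning (曙光) parallel machine with the red-black method and matrix multiplication: compared with the BSP model, the efficiency of the two CSA-BSP programs improved by 20% and 37%, respectively, and the sum of the process execution times was reduced by up to 8%. Programming according to the CSA-BSP model therefore helps both to raise program efficiency and to improve system throughput.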
Parallel computing models, the bridges between system architecture and applications, have been widely investigated. Many models, such as BSP and LogP, have been proposed, but none has been accepted as the single model for parallel computation. In the BSP model, communication operations are arranged at the end of each superstep. As a result, all processes send or receive data at almost the same time, which increases the likelihood of communication congestion. In this paper, based on the BSP model and the concept of computation-send segments, we propose an asynchronous parallel computing model, CSA-BSP, which describes the performance parameters of parallel computers more accurately and guides programmers toward writing highly efficient programs. The model exploits the overlap of computation and communication and spreads communication across the superstep, which reduces the congestion that occurs in a traditional BSP superstep. Under the CSA-BSP model, we can estimate the execution time of a process and give its performance equation. In this model, two processes can execute in different supersteps, at most p-1 supersteps apart. Using program execution time as the parameter, we analyze the efficiency of parallel programs under the BSP, A-BSP, and CSA-BSP models. Compared with BSP and A-BSP programs, CSA-BSP programs are more efficient. The results are verified with programs for the red-black method and matrix multiplication. In our examples, the efficiencies of the CSA-BSP programs are 20% and 37% higher than those of the corresponding BSP programs. To analyze the throughput of the CSA-BSP model, we propose another parameter, PTS, the total time used by all the processes of one application. The CSA-BSP program for the red-black method reduces PTS by 8% relative to the BSP program. For the time saved, the resources have already been released and can be used by other tasks.
From the theoretical analysis and the experimental results, we conclude that the CSA-BSP model analyzes the performance parameters of parallel computers more accurately. Programming with the CSA-BSP model improves performance both by raising program efficiency and by increasing the throughput of the computer system.
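The effect of overlapping computation and communication on a superstep can be illustrated with a small sketch. It uses the standard BSP superstep cost w + h·g + l (local work, an h-relation at per-word cost g, and barrier cost l). The overlapped variant, max(w, h·g) + l, is an idealized assumption made only for illustration; it is not the CSA-BSP performance equation, which this record does not state. The function names and the sample numbers are hypothetical.

```python
def bsp_superstep_time(w, h, g, l):
    """Standard BSP superstep cost: compute, then communicate, then synchronize."""
    return w + h * g + l


def overlapped_superstep_time(w, h, g, l):
    """Idealized assumption: communication is fully hidden behind computation.

    This is NOT the CSA-BSP equation from the paper, only an upper bound on
    what overlap could save in a single superstep.
    """
    return max(w, h * g) + l


def efficiency(sequential_time, parallel_time, p):
    """Parallel efficiency: speedup divided by the number of processes."""
    return sequential_time / (p * parallel_time)


def total_process_time(per_process_times):
    """Sum of the execution times of all processes (the PTS idea in the abstract)."""
    return sum(per_process_times)


if __name__ == "__main__":
    # Hypothetical parameters, in arbitrary time units.
    w, h, g, l, p = 100, 10, 4, 20, 4
    t_bsp = bsp_superstep_time(w, h, g, l)
    t_overlap = overlapped_superstep_time(w, h, g, l)
    print("BSP superstep time:", t_bsp)
    print("Overlapped superstep time:", t_overlap)
    print("BSP efficiency:", efficiency(400, t_bsp, p))
    print("Overlapped efficiency:", efficiency(400, t_overlap, p))
```

Under these made-up parameters the overlapped superstep is shorter whenever both w and h·g are positive, which is the qualitative effect the abstract attributes to spreading communication across the superstep.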
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University core journal list)
2002, Issue 4, pp. 373-380 (8 pages)
Chinese Journal of Computers
Funding
National Natural Science Foundation of China (69933020)
National High Performance Computing Fund
Keywords
parallel computing model
performance analysis
asynchronous BSP model
program optimization
parallel computer
BSP, CSA-BSP, parallel computing model, overlap of computation and communication, performance analysis