Floorplan, clock network and power plan are crucial steps in deep sub-micron system-on-chip design. A novel di- agonal floorplan is integrated to enhance the data sharing between different cores in system-on-chip. Cus...Floorplan, clock network and power plan are crucial steps in deep sub-micron system-on-chip design. A novel di- agonal floorplan is integrated to enhance the data sharing between different cores in system-on-chip. Custom clock network con- taining hand-adjusted buffers and variable routing rules is constructed to realize balanced synchronization. Effective power plan considering both IR drop and electromigration achieves high utilization and maintains power integrity in our MediaSoC. Using such methods, deep sub-micron design challenges are managed under a fast prototyping methodology, which greatly shortens the design cycle.展开更多
随着集成技术的快速发展,使得单个芯片上集成IP核数目越来越多。然而,晶体管密度和处理器工作频率的不断提升,使得功耗密度持续增加,导致芯片热量的不断上升。因此,MPSoCs面临不可避免的散热问题。提出了一种基于处理器核区域均温(Regio...随着集成技术的快速发展,使得单个芯片上集成IP核数目越来越多。然而,晶体管密度和处理器工作频率的不断提升,使得功耗密度持续增加,导致芯片热量的不断上升。因此,MPSoCs面临不可避免的散热问题。提出了一种基于处理器核区域均温(Regional Mean Temperature,RMT)的初始任务分配策略,该方法充分考虑到处理器核区域温度。通过向量距离计算处理器核温度梯度,使用遗传算法进行初始任务分配。实验结果表明,该策略相比于随机任务分配策略,峰值温度降低率、热点降低率和温度梯度降低率最高分别达到4.69%、42.31%和77.49%。展开更多
This paper proposes an object oriented model scheduling for parallel computing in media MultiProcessors System on Chip(MPSoC).Firstly,the Coarse Grain Data Flow Graph(CGDFG) parallel programming model is used in this ...This paper proposes an object oriented model scheduling for parallel computing in media MultiProcessors System on Chip(MPSoC).Firstly,the Coarse Grain Data Flow Graph(CGDFG) parallel programming model is used in this approach.Secondly,this approach has the feature of unified abstraction for software objects implementing in processor and hardware objects implementing in ASICs,easy for mapping CGDFG programming on MPSoC.This approach cuts down the kernel overhead and reduces the code size effectively.The principle of the oriented object model,the method of scheduling,and how to map a parallel programming through CGDFG to the MPSoC are analyzed in this approach.This approach also compares the code size and execution cycles with conventional control flow scheduling,and presents respective management overhead for one application in me-dia-SoC.展开更多
The application-specific multiprocessor system-on-chip(MPSoC) architecture is becoming an attractive solution to deal with increasingly complex embedded applications,which require both high performance and flexible pr...The application-specific multiprocessor system-on-chip(MPSoC) architecture is becoming an attractive solution to deal with increasingly complex embedded applications,which require both high performance and flexible programmability. As an effective method for MPSoC development,we present a gradual refinement flow starting from a high-level Simulink model to a synthesizable and executable hardware and software specification. The proposed methodology consists of five different abstract levels:Simulink combined algorithm and architecture model(CAAM),virtual architecture(VA),transactional accurate architecture(TA),virtual prototype(VP) and field-programmable gate array(FPGA) emulation. Experimental results of Motion-JPEG and H.264 show that the proposed gradual refinement flow can generate various MPSoC architectures from an original Simulink model,allowing processor,communication and tasks design space exploration.展开更多
The evolvable multiprocessor (EvoMP), as a novel multiprocessor system-on-chip (MPSoC) machine with evolvable task decomposition and scheduling, claims a major feature of low-cost and efficient fault tolerance. Non-ce...The evolvable multiprocessor (EvoMP), as a novel multiprocessor system-on-chip (MPSoC) machine with evolvable task decomposition and scheduling, claims a major feature of low-cost and efficient fault tolerance. Non-centralized control and adaptive distribution of the program among the available processors are two major capabilities of this platform, which remarkably help to achieve an efficient fault tolerance scheme. This letter presents the operational as well as architectural details of this fault tolerance scheme. In this method, when a processor becomes faulty, it will be eliminated of contribution in program execution in remaining run-time. This method also utilizes dynamic rescheduling capability of the system to achieve the maximum possible efficiency after processor reduction. The results confirm the efficiency and remarkable advantages of the proposed approach over common redundancy based techniques in similar systems.展开更多
3-D Networks-on-Chip (NoC) emerge as a potent solution to address both the interconnection and design complexity problems facing future Multiprocessor System-on-Chips (MPSoCs). Effective run-time mapping on such 3...3-D Networks-on-Chip (NoC) emerge as a potent solution to address both the interconnection and design complexity problems facing future Multiprocessor System-on-Chips (MPSoCs). Effective run-time mapping on such 3-D NoC-based MPSoCs can be quite challenging, as the arrival order and task graphs of the target applications are typically not known a priori, which can be further complicated by stringent energy requirements for NoC systems. This paper thus presents an energy-aware run-time incremental mapping algorithm (ERIM) for 3-D NoC which can minimize the energy consumption due to the data communications among processor cores, while reducing the fragmentation effect on the incoming applications to be mapped, and simultaneously satisfying the thermal constraints imposed on each incoming application. Specifically, incoming applications are mapped to cuboid tile regions for lower energy consumption of communication and the minimal routing. Fragment tiles due to system fragmentation can be gleaned for better resource utilization. Extensive experiments have been conducted to evaluate the performance of the proposed algorithm ERIM, and the results are compared against the optimal mapping algorithm (branch-and-bound) and two heuristic algorithms (TB and TL). The experiments show that ERIM outperforms TB and TL methods with significant energy saving (more than 10%), much reduced average response time, and improved system utilization.展开更多
基金Project supported by the Hi-Tech Research and Development Pro-gram (863) of China (No. 2002AA1Z1140)the Fok Ying TongEducation Foundation (No. 94031), China
文摘Floorplan, clock network and power plan are crucial steps in deep sub-micron system-on-chip design. A novel di- agonal floorplan is integrated to enhance the data sharing between different cores in system-on-chip. Custom clock network con- taining hand-adjusted buffers and variable routing rules is constructed to realize balanced synchronization. Effective power plan considering both IR drop and electromigration achieves high utilization and maintains power integrity in our MediaSoC. Using such methods, deep sub-micron design challenges are managed under a fast prototyping methodology, which greatly shortens the design cycle.
文摘随着集成技术的快速发展,使得单个芯片上集成IP核数目越来越多。然而,晶体管密度和处理器工作频率的不断提升,使得功耗密度持续增加,导致芯片热量的不断上升。因此,MPSoCs面临不可避免的散热问题。提出了一种基于处理器核区域均温(Regional Mean Temperature,RMT)的初始任务分配策略,该方法充分考虑到处理器核区域温度。通过向量距离计算处理器核温度梯度,使用遗传算法进行初始任务分配。实验结果表明,该策略相比于随机任务分配策略,峰值温度降低率、热点降低率和温度梯度降低率最高分别达到4.69%、42.31%和77.49%。
基金Supported by National Natural Science Foundation ofChina (No.60873112)
文摘This paper proposes an object oriented model scheduling for parallel computing in media MultiProcessors System on Chip(MPSoC).Firstly,the Coarse Grain Data Flow Graph(CGDFG) parallel programming model is used in this approach.Secondly,this approach has the feature of unified abstraction for software objects implementing in processor and hardware objects implementing in ASICs,easy for mapping CGDFG programming on MPSoC.This approach cuts down the kernel overhead and reduces the code size effectively.The principle of the oriented object model,the method of scheduling,and how to map a parallel programming through CGDFG to the MPSoC are analyzed in this approach.This approach also compares the code size and execution cycles with conventional control flow scheduling,and presents respective management overhead for one application in me-dia-SoC.
文摘The application-specific multiprocessor system-on-chip(MPSoC) architecture is becoming an attractive solution to deal with increasingly complex embedded applications,which require both high performance and flexible programmability. As an effective method for MPSoC development,we present a gradual refinement flow starting from a high-level Simulink model to a synthesizable and executable hardware and software specification. The proposed methodology consists of five different abstract levels:Simulink combined algorithm and architecture model(CAAM),virtual architecture(VA),transactional accurate architecture(TA),virtual prototype(VP) and field-programmable gate array(FPGA) emulation. Experimental results of Motion-JPEG and H.264 show that the proposed gradual refinement flow can generate various MPSoC architectures from an original Simulink model,allowing processor,communication and tasks design space exploration.
文摘The evolvable multiprocessor (EvoMP), as a novel multiprocessor system-on-chip (MPSoC) machine with evolvable task decomposition and scheduling, claims a major feature of low-cost and efficient fault tolerance. Non-centralized control and adaptive distribution of the program among the available processors are two major capabilities of this platform, which remarkably help to achieve an efficient fault tolerance scheme. This letter presents the operational as well as architectural details of this fault tolerance scheme. In this method, when a processor becomes faulty, it will be eliminated of contribution in program execution in remaining run-time. This method also utilizes dynamic rescheduling capability of the system to achieve the maximum possible efficiency after processor reduction. The results confirm the efficiency and remarkable advantages of the proposed approach over common redundancy based techniques in similar systems.
基金This work is supported by the National Natural Science Foundation of China under Grant Nos. 60873112 and 61028004, the National Natural Science Foundation of USA under Grant No. CNS-1126688.
文摘3-D Networks-on-Chip (NoC) emerge as a potent solution to address both the interconnection and design complexity problems facing future Multiprocessor System-on-Chips (MPSoCs). Effective run-time mapping on such 3-D NoC-based MPSoCs can be quite challenging, as the arrival order and task graphs of the target applications are typically not known a priori, which can be further complicated by stringent energy requirements for NoC systems. This paper thus presents an energy-aware run-time incremental mapping algorithm (ERIM) for 3-D NoC which can minimize the energy consumption due to the data communications among processor cores, while reducing the fragmentation effect on the incoming applications to be mapped, and simultaneously satisfying the thermal constraints imposed on each incoming application. Specifically, incoming applications are mapped to cuboid tile regions for lower energy consumption of communication and the minimal routing. Fragment tiles due to system fragmentation can be gleaned for better resource utilization. Extensive experiments have been conducted to evaluate the performance of the proposed algorithm ERIM, and the results are compared against the optimal mapping algorithm (branch-and-bound) and two heuristic algorithms (TB and TL). The experiments show that ERIM outperforms TB and TL methods with significant energy saving (more than 10%), much reduced average response time, and improved system utilization.