Abstract: Deep neural networks have evolved remarkably over the past few years, and they are currently the fundamental tools of many intelligent systems. At the same time, the computational complexity and resource consumption of these networks continue to increase. This poses a significant challenge to the deployment of such networks, especially in real-time applications or on resource-limited devices. Thus, network acceleration has become a hot topic within the deep learning community. As for hardware implementation of deep neural networks, a number of accelerators based on field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs) have been proposed in recent years. In this paper, we provide a comprehensive survey of recent advances in network acceleration, compression, and accelerator design from both the algorithm and hardware points of view. Specifically, we provide a thorough analysis of each of the following topics: network pruning, low-rank approximation, network quantization, teacher–student networks, compact network design, and hardware accelerators. Finally, we introduce and discuss a few possible future directions.
Funding: This work was supported by the Beijing Natural Science Foundation under Grant No. JQ18013; the National Natural Science Foundation of China under Grant Nos. 61925208, 61732007, 61732002, and 61906179; the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) under Grant No. XDB32050200; the Youth Innovation Promotion Association CAS; the Beijing Academy of Artificial Intelligence (BAAI); and the Xplore Prize.
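The survey itself does not include code; as a concrete illustration of two of the techniques it names, the following minimal numpy sketch applies magnitude-based pruning and symmetric 8-bit uniform quantization to a single weight matrix. The sparsity level, bit width, and layer size here are illustrative assumptions, not values taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(256, 256)).astype(np.float32)  # a dense layer's weights

    # Magnitude-based pruning: zero out the 90% of weights with smallest |w|.
    sparsity = 0.9
    threshold = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= threshold
    W_pruned = W * mask

    # Symmetric uniform quantization to signed 8-bit integers.
    scale = np.abs(W_pruned).max() / 127.0
    W_int8 = np.clip(np.round(W_pruned / scale), -127, 127).astype(np.int8)
    W_dequant = W_int8.astype(np.float32) * scale  # reconstruction used at inference

    print(f"kept {mask.mean():.1%} of weights")
    print(f"max quantization error: {np.abs(W_pruned - W_dequant).max():.4f}")

Pruning yields sparsity that specialized hardware can exploit, while quantization shrinks both memory traffic and multiplier width, which is why both techniques feature prominently in the accelerator literature the survey covers.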
Abstract: Dynamic neural network (NN) techniques are increasingly important because they enable deep learning with more complex network architectures. However, existing studies usually focus on optimizing static neural networks in deep neural network (DNN) accelerators, predominantly scheduling static computational graphs with static methods. We analyze the execution process of dynamic neural networks and observe that their dynamic features introduce challenges for efficient scheduling and pipelining in existing DNN accelerators. We propose DyPipe, a holistic approach to optimizing dynamic neural network inference in enhanced DNN accelerators. DyPipe achieves significant performance improvements for dynamic neural networks while introducing negligible overhead for static neural networks. Our evaluation demonstrates that DyPipe achieves a 1.7x speedup on dynamic neural networks and maintains more than 96% of the performance on static neural networks.
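DyPipe's implementation is not reproduced here; to make the scheduling problem concrete, the following hypothetical numpy sketch shows an early-exit network whose depth depends on each input, so the amount of computation per sample is only known at run time. The confidence threshold and layer sizes are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    layers = [rng.normal(size=(64, 64)).astype(np.float32) for _ in range(8)]
    classifier = rng.normal(size=(64, 10)).astype(np.float32)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def early_exit_forward(x, confidence=0.5):
        """Run layers until an intermediate prediction is confident enough.
        The exit point, and hence the work done, depends on the input."""
        for depth, W in enumerate(layers, start=1):
            x = np.maximum(x @ W, 0.0)       # linear layer + ReLU
            p = softmax(x @ classifier)      # intermediate prediction
            if p.max() >= confidence:        # data-dependent branch
                return p, depth
        return p, len(layers)

    x = rng.normal(size=64).astype(np.float32)
    p, depth = early_exit_forward(x)
    print(f"exited after {depth} of {len(layers)} layers")

Because the exit layer differs from input to input, a pipeline that statically assigns stages to layers stalls or sits idle; this is the kind of run-time variability that motivates dynamic-aware scheduling approaches such as DyPipe.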