Bit-serial-based dynamic-precision neural network processor
Abstract: Existing dynamic-precision computing systems for neural networks incur large computation and memory-access overheads from periodic model retraining and from switching between precisions. To address this, the paper proposes DPNN, a dynamic-precision neural network processor based on bit-serial computing that supports neural network models of any scale and any precision. DPNN adjusts the precision of model data at fine granularity without retraining, and it eliminates the repeated computation and memory access that overlapping weight bits cause during dynamic precision switching. Experimental results show that, compared with MinMaxNN, one of the latest advances in self-aware neural network systems (SaNNs), DPNN reduces computation by 1.34x to 2.52x and memory access by 1.16x to 1.93x on average. Compared with Stripes, a representative bit-serial neural network processor, DPNN improves performance by 2.57x, reduces power consumption by 2.87x, and reduces area by 1.95x.
Authors: HAO Yifan (郝一帆), ZHI Tian (支天), DU Zidong (杜子东) (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)
Source: Chinese High Technology Letters (《高技术通讯》), CAS, 2022, No. 9, pp. 881-893 (13 pages)
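Funding: National Key Research and Development Program of China (2018AAA0103300); National Natural Science Foundation of China (62222214, U20A20227, U19B2019, U22A2028); Beijing Academy of Artificial Intelligence; Chinese Academy of Sciences Project for Young Scientists in Basic Research (YSBR-029); Youth Innovation Promotion Association of the Chinese Academy of Sciences.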
Keywords: neural network processor; dynamic precision computing; bit-serial computing
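
The abstract's central idea, avoiding repeated work on overlapping weight bits when the precision changes, can be illustrated with a small software sketch. This is not the DPNN hardware design, which this record does not describe. It is a minimal model under stated assumptions: weights are unsigned 8-bit integers, bit planes are processed from the most significant bit down, and "precision p" means keeping the top p weight bits. The names `WEIGHT_BITS`, `bit_plane_sum`, `IncrementalDot`, and `truncated_dot` are hypothetical.

```python
WEIGHT_BITS = 8  # assumed full weight width; hypothetical, not taken from the paper


def bit_plane_sum(acts, weights, bit):
    """One bit-serial step: add up the activations whose weight has `bit` set."""
    return sum(a for a, w in zip(acts, weights) if (w >> bit) & 1)


class IncrementalDot:
    """Dot product evaluated one weight bit plane at a time, MSB first.

    Raising the precision processes only the bit planes that have not been
    seen yet, so the planes shared by both precisions are never recomputed.
    Lowering the precision is not modeled in this sketch.
    """

    def __init__(self, acts, weights):
        self.acts = list(acts)
        self.weights = list(weights)
        self.bits_done = 0  # number of MSB planes already accumulated
        self.partial = 0    # running dot-product value
        self.steps = 0      # bit-plane passes performed so far

    def refine(self, precision):
        """Raise the precision to `precision` MSBs and return the result."""
        if not 0 <= precision <= WEIGHT_BITS:
            raise ValueError("precision out of range")
        for k in range(self.bits_done, precision):
            bit = WEIGHT_BITS - 1 - k
            self.partial += bit_plane_sum(self.acts, self.weights, bit) << bit
            self.steps += 1
        self.bits_done = max(self.bits_done, precision)
        return self.partial


def truncated_dot(acts, weights, precision):
    """Reference: ordinary dot product with weights truncated to the top bits."""
    shift = WEIGHT_BITS - precision
    return sum(a * ((w >> shift) << shift) for a, w in zip(acts, weights))


if __name__ == "__main__":
    acts = [3, 1, 4]
    weights = [0b10110101, 0b01001100, 0b11110000]
    dot = IncrementalDot(acts, weights)
    low = dot.refine(4)   # 4-bit result, 4 bit-plane passes
    high = dot.refine(8)  # 8-bit result, only 4 additional passes
    print(low, high, dot.steps)
```

In this model, computing the 4-bit result and then the 8-bit result takes 8 bit-plane passes in total. Recomputing each precision from scratch would take 12, and the saved passes correspond to the overlapping bits.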
