Bit-serial-based dynamic-precision neural network processor (基于位串行计算的动态精度神经网络处理器)

Abstract: Existing dynamic-precision neural network computing systems incur substantial computation and memory-access overhead during periodic model retraining and dynamic precision switching. To address this problem, this paper proposes DPNN, a dynamic-precision neural network processor based on bit-serial computing. DPNN supports neural network models of any scale and any precision, allows fine-grained adjustment of model data precision without retraining, and eliminates the repeated computation and memory access caused by overlapping weight bits during dynamic precision switching. Experimental results show that, compared with MinMaxNN, one of the latest advances in self-aware neural network systems (SaNNs), DPNN reduces computation by 1.34 to 2.52 times and memory access by 1.16 to 1.93 times on average. Compared with Stripes, a representative bit-serial neural network processor, DPNN improves performance by 2.57 times, reduces power consumption by 2.87 times, and reduces area by 1.95 times.
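The abstract does not describe DPNN's datapath, so the following is only a minimal sketch of the general idea behind bit-serial computing with precision refinement, not the authors' design. It assumes unsigned 8-bit weights processed one bit plane at a time, most significant bit first; the function name `bit_serial_dot` and all of its parameters are hypothetical. The point it illustrates is that a low-precision partial sum can be carried forward when precision is raised, so the bit planes shared by both precisions are not fetched or computed a second time.

```python
def bit_serial_dot(activations, weights, total_bits, first_plane, num_planes, partial=0):
    """Accumulate a dot product over a range of weight bit planes, MSB first.

    Planes are counted from the most significant bit (plane 0).
    `partial` is a previously accumulated result that this call extends.
    Weights are assumed to be unsigned integers of width `total_bits`.
    """
    acc = partial
    for plane in range(first_plane, first_plane + num_planes):
        shift = total_bits - 1 - plane  # positional weight of this bit plane
        # One "serial step": add up the activations whose weight bit is set.
        plane_sum = sum(a for a, w in zip(activations, weights) if (w >> shift) & 1)
        acc += plane_sum << shift
    return acc


activations = [3, 1, 4, 2]
weights = [0b10110110, 0b01001101, 0b11100001, 0b00011010]

# Coarse pass: only the top 4 bit planes of each weight.
coarse = bit_serial_dot(activations, weights, 8, first_plane=0, num_planes=4)

# Refinement: process only the 4 remaining planes, reusing the coarse result.
refined = bit_serial_dot(activations, weights, 8, first_plane=4, num_planes=4,
                         partial=coarse)

# The coarse result equals a dot product with weights truncated to 4 bits,
# and the refined result equals the full-precision dot product.
assert coarse == sum(a * (w & 0xF0) for a, w in zip(activations, weights))
assert refined == sum(a * w for a, w in zip(activations, weights))
```

In this sketch, switching from 4-bit to 8-bit weights costs four additional plane steps rather than eight, which is the kind of overlap the abstract says DPNN avoids; how the actual hardware handles signed values, activations, and scheduling is not stated in this record.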

Authors: HAO Yifan (郝一帆), ZHI Tian (支天), DU Zidong (杜子东) (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)

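Affiliations: State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences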

Source: Chinese High Technology Letters (《高技术通讯》), CAS, 2022, No. 9, pp. 881-893 (13 pages)

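Funding: National Key Research and Development Program of China (2018AAA0103300); National Natural Science Foundation of China (62222214, U20A20227, U19B2019, U22A2028); Beijing Academy of Artificial Intelligence; Chinese Academy of Sciences Stable Support Program for Young Teams in Basic Research (YSBR-029); Youth Innovation Promotion Association of the Chinese Academy of Sciences.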

Keywords: neural network processor; dynamic precision computing; bit-serial computing

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
