Abstract
Existing dynamic-precision neural-network computing systems introduce substantial computation and memory-access overhead through periodic model retraining and dynamic precision switching. To address this problem, this paper proposes a dynamic-precision neural-network processor (DPNN) based on bit-serial computing, which supports neural networks of arbitrary scale and bit-width, allows fine-grained adjustment of model data precision without retraining, and eliminates the redundant computation and memory accesses caused by overlapping weight bits during dynamic precision switching. Experimental results show that, compared with MinMaxNN, one of the latest advances in self-aware neural network systems (SaNNs), DPNN reduces computation by 1.34-2.52x and memory accesses by 1.16-1.93x on average; compared with Stripes, a representative bit-serial neural-network processor, DPNN improves performance by 2.57x, reduces power consumption by 2.87x, and reduces area by 1.95x.
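The bit-reuse idea behind the abstract's claim can be illustrated with a minimal sketch (not the DPNN implementation; function and variable names are hypothetical). A bit-serial engine processes one weight bit plane per step, so a dot product computed at low precision over the high-order bit planes can be reused when precision is raised: only the newly exposed low-order planes need to be processed, avoiding the repeated computation and memory accesses that recomputing all bit planes would incur.

```python
def bit_serial_dot(weights, acts, hi, lo):
    """Accumulate the contributions of bit positions lo..hi (inclusive)
    of each unsigned weight, one bit plane per step (bit-serial style)."""
    total = 0
    for k in range(hi, lo - 1, -1):  # MSB-first bit planes
        plane = sum(((w >> k) & 1) * a for w, a in zip(weights, acts))
        total += plane << k
    return total

weights = [0b10110101, 0b01101100, 0b11011010]  # 8-bit weights
acts = [3, 5, 2]

# 4-bit precision: process only the high bit planes 7..4.
p4 = bit_serial_dot(weights, acts, 7, 4)

# Raise to 8-bit precision: reuse p4 and process only the new planes 3..0,
# instead of re-reading and re-multiplying the overlapping high bits.
p8 = p4 + bit_serial_dot(weights, acts, 3, 0)
assert p8 == sum(w * a for w, a in zip(weights, acts))
```

Here raising precision from 4 to 8 bits costs only the four new bit planes; a scheme without bit reuse would re-process all eight.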
Authors
HAO Yifan, ZHI Tian, DU Zidong
(State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)
Source
Chinese High Technology Letters (《高技术通讯》)
CAS
2022, No. 9, pp. 881-893 (13 pages)
Funding
National Key Research and Development Program of China (2018AAA0103300)
National Natural Science Foundation of China (62222214, U20A20227, U19B2019, U22A2028)
Beijing Academy of Artificial Intelligence
CAS Project for Young Scientists in Basic Research (YSBR-029)
Youth Innovation Promotion Association, Chinese Academy of Sciences