
SPAM: Spatially Partitioned Attention Module in Deep Convolutional Neural Networks for Image Classification
Abstract: Existing attention mechanisms often use fusion or compression to obtain the required information, which loses a large amount of information in the spatial or channel dimension. To solve this problem, the Spatially Partitioned Attention Module (SPAM), an effective and lightweight attention module that obtains attention without channel fusion or compression, was proposed. For an input intermediate feature map, SPAM first adaptively applies average pooling and maximum pooling for feature extraction; it replaces spatial point features with local block features to reduce computation, and uses an instance normalization (IN) layer with depthwise convolution to capture global spatial attention. Reconstruction of channel-dimension information is completed by group convolution. Finally, an interpolation operation produces the overall attention map, which is used to weight the input feature map. SPAM can be easily embedded into various mainstream convolutional neural network architectures and significantly improves network performance with only a small increase in parameters and computation. To demonstrate the effectiveness of SPAM, extensive experiments were conducted on the ImageNet-1K, CIFAR-100, and Food-101 image classification datasets, and the networks' regions of interest were visualized with Grad-CAM. On ImageNet-1K, CIFAR-100, and Food-101, SPAM improved the accuracy of the baseline networks by up to about 1.08%, 2.46%, and 1.09%, respectively. The results show that networks embedded with SPAM achieve substantially better performance; SPAM consistently outperforms other commonly used lightweight attention mechanisms; and SPAM makes the networks focus more on the regions containing the target objects, improving their representational ability.
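The processing chain described in the abstract (block-wise average/max pooling without channel compression, instance normalization, and interpolation back to full resolution for weighting) can be illustrated with a minimal NumPy sketch. This is a hypothetical reimplementation assuming a simple sigmoid gate in place of the paper's IN-plus-depthwise-convolution and group-convolution layers, whose exact configuration is not given in the abstract; the function name and block size are illustrative only.

```python
import numpy as np

def spam_attention(x, block=2):
    """Hypothetical sketch of the SPAM pipeline (not the paper's exact layers).

    x: feature map of shape (C, H, W); H and W must be divisible by `block`.
    """
    C, H, W = x.shape
    hb, wb = H // block, W // block
    # 1) Partition the map into spatial blocks and extract block features
    #    with both average and max pooling (no channel fusion/compression).
    blocks = x.reshape(C, hb, block, wb, block)
    avg = blocks.mean(axis=(2, 4))   # (C, hb, wb)
    mx = blocks.max(axis=(2, 4))     # (C, hb, wb)
    feat = avg + mx
    # 2) Instance-normalize over the spatial block grid and squash to (0, 1)
    #    (stand-in for the IN layer + depthwise/group convolutions).
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sd = feat.std(axis=(1, 2), keepdims=True) + 1e-5
    attn = 1.0 / (1.0 + np.exp(-(feat - mu) / sd))  # sigmoid gate per block
    # 3) Interpolate the block attention back to full resolution
    #    (nearest neighbour) and weight the input feature map.
    attn_full = np.repeat(np.repeat(attn, block, axis=1), block, axis=2)
    return x * attn_full

# Usage: weight a random 8-channel 4x4 feature map.
out = spam_attention(np.random.rand(8, 4, 4))
```

Because the attention values lie in (0, 1) and each block shares one weight, the output keeps the input's shape and channel count, which is what lets such a module be dropped into existing CNN architectures.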
Authors: WANG Fang; QIAO Ruiping (School of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an 710049, China)
Source: Journal of Xi'an Jiaotong University (EI, CAS, CSCD, PKU Core), 2023, No. 9, pp. 185-192 (8 pages)
Funding: Key Research and Development Program of Shaanxi Province (2020GY-074).
Keywords: convolutional neural networks; attention mechanisms; image classification; feature extraction