Abstract
Existing text detection models struggle to localize text precisely in e-commerce images, where backgrounds are complex and text regions vary widely in shape. To address this, an improved text detection model, Iterative Self-selective Feature Fusion DBNet (iSFF-DBNet), is proposed. First, after the backbone network extracts features, an attention mechanism is introduced while building the Feature Pyramid Network (FPN). Next, an Iterative Self-selective Feature Fusion (iSFF) module is proposed to strengthen the model's feature extraction ability. Finally, a bilateral upsampling module (given as "bilinear upsampling" in the published English abstract) is introduced to improve the adaptive performance of the differentiable binarization module. Experiments on the text detection task of the ICPR MTWI 2018 web image dataset show that, compared with the standard DBNet model, the improved model raises recall and F-score by 6.0% and 2.4%, respectively. Compared with other text detection models, it strikes a balance between precision and recall and detects text more accurately.
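The abstract refers to a differentiable binarization module and reports results as F-score. As a minimal sketch, the snippet below shows the approximate step function used by the standard DBNet (not the modified module proposed in this paper) and the usual harmonic-mean F-score. The function names are hypothetical, and the amplification factor k = 50 is the default from the original DBNet work, assumed here rather than taken from this record.

```python
import math


def differentiable_binarization(p, t, k=50.0):
    """Approximate binary value for one pixel.

    p: predicted text probability in [0, 1]
    t: predicted adaptive threshold in [0, 1]
    k: amplification factor (assumed default from standard DBNet)
    """
    # A steep sigmoid around p == t replaces the hard step function,
    # so the binarization stays differentiable during training.
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

With a large k the output saturates quickly: a pixel whose probability is well above its threshold maps close to 1, and one well below maps close to 0, which is what lets the threshold map be learned jointly with the probability map.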
Authors
LI Zhuo-xuan (李卓璇)
ZHOU Ya-tong (周亚同)
School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401, China
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
PKU Core Journal (北大核心)
2023, Issue 11, pp. 2008-2017 (10 pages)
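Funding
Beijing-Tianjin-Hebei Basic Research Cooperation Special Project (H2021202008, J210008)
Open Project of the Inner Mongolia Autonomous Region Discipline Inspection and Supervision Big Data Laboratory (IMDBD202105)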
Keywords
character detection (文字检测)
multi-scale feature (多尺度特征)
feature fusion (特征融合)
deep learning (深度学习)