Abstract
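Tracking networks based on deep features have achieved impressive performance thanks to their powerful ability to represent the visual features of a target. However, complex tracking scenarios often involve fast motion, illumination changes, and rotation of the target, and deep visual features alone are not enough to represent the target accurately. To address these problems, a single-object video tracking network based on fused features is proposed. The network combines two deep learning models: a convolutional neural network (CNN) and a long short-term memory (LSTM) network. First, the LSTM extracts time-series dynamic features of the target and produces the target state at the current time step, from which an accurate preprocessed bounding box is obtained. Then, based on this preprocessed bounding box, the CNN extracts deep convolutional features of the target and determines its position. During tracking, target samples collected when tracking succeeds are used for short-term and long-term updates of the network parameters, which improves the adaptability of the network. Comparative experiments show that the proposed method delivers excellent tracking performance and robustness when the moving target is partially occluded, motion-blurred, or moving fast.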
Deep visual feature-based methods have demonstrated impressive performance in visual tracking owing to their powerful capability for visual feature representation. However, in complex tracking scenarios involving dramatic appearance change, illumination variation, or rotation, deep visual features alone are insufficient to characterize the target accurately. To solve this problem, we present an integrated tracking framework that combines a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN). First, the LSTM extracts the dynamic features of the target over the time sequence and produces the target state at the current time step, from which an accurate preprocessed bounding box is obtained. Then, based on this preprocessed bounding box, the CNN extracts the deep convolutional features of the target. Finally, the position of the target is determined from the score of these features. During tracking, to improve the adaptability of the network, its parameters are updated using target samples collected when tracking succeeds. Experiments show that the proposed method achieves outstanding tracking performance and robustness under partial occlusion, out-of-view, motion blur, and fast motion.
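The control flow described in the abstract (motion prior, appearance scoring, sample collection on success) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: a constant-velocity extrapolation stands in for the LSTM, a toy distance-based function stands in for the CNN score, and the names `TrackerSketch`, `constant_velocity_prior`, `jitter`, and `toy_score`, the threshold, the candidate count, and the buffer sizes are all hypothetical. The actual parameter updates are omitted; the sketch only collects the samples that short-term and long-term updates would use.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def constant_velocity_prior(history: List[Box]) -> Box:
    """Hypothetical stand-in for the LSTM: extrapolate the last centre displacement."""
    if len(history) < 2:
        return history[-1]
    (px, py, _, _), (cx, cy, w, h) = history[-2], history[-1]
    return (2 * cx - px, 2 * cy - py, w, h)


def jitter(box: Box, rng: random.Random, sigma: float = 2.0) -> Box:
    """Draw one candidate by perturbing the centre of the prior box."""
    cx, cy, w, h = box
    return (cx + rng.gauss(0, sigma), cy + rng.gauss(0, sigma), w, h)


@dataclass
class TrackerSketch:
    score_box: Callable[[dict, Box], float]  # hypothetical stand-in for the CNN scorer
    history: List[Box]                       # past boxes, starting with the initial box
    success_threshold: float = 0.5           # assumed value
    num_candidates: int = 64                 # assumed value
    short_term_size: int = 20                # assumed buffer sizes
    long_term_size: int = 100
    short_term: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def step(self, frame: dict, rng: random.Random):
        # 1) The motion model proposes a preprocessed box for the current frame.
        prior = constant_velocity_prior(self.history)
        # 2) The appearance model scores candidates sampled around that box.
        candidates = [prior] + [jitter(prior, rng) for _ in range(self.num_candidates)]
        scores = [self.score_box(frame, c) for c in candidates]
        best_index = max(range(len(candidates)), key=scores.__getitem__)
        best, best_score = candidates[best_index], scores[best_index]
        # 3) Samples are kept only when tracking is judged successful; a real
        #    tracker would fine-tune on these two buffers at different intervals.
        success = best_score >= self.success_threshold
        if success:
            self.short_term = (self.short_term + [(frame, best)])[-self.short_term_size:]
            self.long_term = (self.long_term + [(frame, best)])[-self.long_term_size:]
        self.history.append(best)
        return best, best_score, success


def toy_score(frame: dict, box: Box) -> float:
    """Toy score: higher when the box centre is closer to the known target centre."""
    tx, ty = frame["target_center"]
    return 1.0 / (1.0 + math.hypot(box[0] - tx, box[1] - ty))


# Usage: follow a synthetic target that moves by (2, 1) pixels per frame.
rng = random.Random(0)
tracker = TrackerSketch(score_box=toy_score, history=[(10.0, 10.0, 20.0, 20.0)])
for t in range(1, 31):
    frame = {"target_center": (10.0 + 2.0 * t, 10.0 + 1.0 * t)}
    box, score, ok = tracker.step(frame, rng)
```

Keeping two buffers of different lengths is one simple way to separate recent samples from a longer history; how the paper schedules its short-term and long-term updates is not stated in the abstract.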
Authors
张博言
钟勇
李振东
ZHANG Boyan; ZHONG Yong; LI Zhendong (Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610041, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《西北工业大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2019, No. 6, pp. 1310-1319 (10 pages)
Journal of Northwestern Polytechnical University
Funding
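Supported by the Science and Technology Achievement Transformation Project of the Sichuan Provincial Department of Science and Technology (2014CC0043)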
Keywords
目标跟踪
卷积神经网络
长短期记忆网络
visual object tracking
convolutional neural network
long short-term memory network