Abstract
Human pose estimation underpins many computer vision tasks. Because of scale variation, earlier pose estimation networks lose pose information during feature extraction, which limits how far accuracy can be improved. To address this, a parallel network is combined with multi-scale feature fusion to extract features. The proposed method optimizes feature extraction in two steps. First, in the multi-scale feature fusion stage, transposed convolution and hybrid dilated convolution are used to reduce the loss of feature information. Second, in the feature map output stage, feature maps of different scales are combined with weights to remove redundant information, retain pose information, and generate higher-quality high-resolution heatmaps. Experiments on the COCO dataset show that the method improves accuracy by 2.1% over the advanced method HRNet (High-Resolution Net), and that it can surpass existing mainstream human pose estimation methods in accuracy. The method copes better with scale variation in pedestrian pose estimation and locates the keypoints of small-scale human bodies in complex scenes more precisely.
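The weighted combination of feature maps at different scales can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the softmax normalization of the weights, and the toy 4x4 and 2x2 maps are all assumptions made for illustration, and nearest-neighbour upsampling stands in for the transposed convolution mentioned in the abstract.

```python
import math

def upsample_nearest(fmap, factor):
    """Repeat each value `factor` times along both axes (nearest-neighbour)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def softmax(logits):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_weighted(fmaps, logits):
    """Fuse square maps into one map at the resolution of fmaps[0].

    Assumes fmaps[0] has the highest resolution and every other map's
    side length divides it exactly (hypothetical simplification).
    """
    size = len(fmaps[0])
    weights = softmax(logits)
    fused = [[0.0] * size for _ in range(size)]
    for fmap, w in zip(fmaps, weights):
        up = upsample_nearest(fmap, size // len(fmap))
        for i in range(size):
            for j in range(size):
                fused[i][j] += w * up[i][j]
    return fused

# Toy inputs: a 4x4 high-resolution map and a 2x2 low-resolution map.
high = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
low = [[1.0, 0.0],
       [0.0, 0.0]]

# Equal scores give each scale a weight of 0.5.
fused = fuse_weighted([high, low], logits=[0.0, 0.0])
```

In this toy case the fused response is strongest where both scales agree (position `[1][1]`) and weaker where only the coarse map responds. In a trained network the per-scale weights would normally be learned rather than fixed.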
Authors
LIU Hongzhe (刘宏哲)
TAO Xiangru (陶相如)
XU Cheng (徐成)
CAO Dongpu (曹东璞)
Affiliations: Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101; School of Vehicle and Mobility, Tsinghua University, Beijing 100084
Source
Journal of Mechanical Engineering (《机械工程学报》)
EI
CAS
CSCD
PKU Core
2024, Issue 16, pp. 306-313 (8 pages)
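Funding
National Natural Science Foundation of China (62171042, 62102033)
Beijing Municipal Key Science and Technology project (KZ202211417048)
Support Program for Building High-Level Research and Innovation Teams in Beijing Municipal Universities (BPHR20220121)
Beijing Natural Science Foundation (4232026, 4242020)
Academic Research Projects of Beijing Union University (ZKZD202302).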
Keywords
pose estimation
multi-scale fusion
human detection
scale adaptation