Dynamic simultaneous localization and mapping (SLAM) in visual scenes is currently a major research area in fields such as robot navigation and autonomous driving. However, in complex real-world environments, current dynamic SLAM systems struggle to achieve precise localization and map construction. With the advancement of deep learning, interest in deep learning-based visual odometry for dynamic SLAM has grown in recent years, and more researchers are turning to deep learning techniques to address the challenges of dynamic SLAM. Compared with dynamic SLAM systems based on deep learning methods such as object detection and semantic segmentation, dynamic SLAM systems based on instance segmentation can not only detect dynamic objects in the scene but also distinguish different instances of the same type of object, thereby reducing the impact of dynamic objects on the SLAM system's positioning. This article not only introduces traditional dynamic SLAM systems based on mathematical models but also provides a comprehensive analysis of existing instance segmentation algorithms and dynamic SLAM systems based on instance segmentation, comparing and summarizing their advantages and disadvantages. Comparisons on datasets show that instance segmentation-based methods have significant advantages in accuracy and robustness in dynamic environments. However, the limited real-time performance of instance segmentation algorithms hinders the widespread application of dynamic SLAM systems. In recent years, the rapid development of single-stage instance segmentation methods has brought hope for the widespread application of dynamic SLAM systems based on instance segmentation. Finally, possible future research directions and improvement measures are discussed for reference by relevant professionals.
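As a rough illustration of why instance-level masks help, the sketch below discards feature points that fall on instances flagged as moving while keeping points on other instances of the same class. It is a minimal sketch, not a method from any of the surveyed systems: the function name `filter_static_keypoints`, the integer-labelled mask layout, and the way `dynamic_ids` is obtained are all assumptions made for the example.

```python
import numpy as np

def filter_static_keypoints(keypoints, instance_masks, dynamic_ids):
    """Keep keypoints that do not fall on any instance flagged as dynamic.

    keypoints:      (N, 2) array of (x, y) pixel coordinates.
    instance_masks: (H, W) integer array, 0 = background, k > 0 = instance k
                    (hypothetical layout chosen for this sketch).
    dynamic_ids:    iterable of instance ids judged to be moving.
    """
    keypoints = np.asarray(keypoints)
    height, width = instance_masks.shape
    # Convert (x, y) to integer (row, col) indices, clamped to the image.
    cols = np.clip(np.round(keypoints[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.round(keypoints[:, 1]).astype(int), 0, height - 1)
    ids_at_keypoints = instance_masks[rows, cols]
    is_static = ~np.isin(ids_at_keypoints, list(dynamic_ids))
    return keypoints[is_static]

# Toy example: two instances of the same class; only instance 2 is moving.
masks = np.zeros((4, 6), dtype=int)
masks[0:2, 0:2] = 1   # e.g. a parked car (static)
masks[2:4, 3:6] = 2   # e.g. a driving car (dynamic)
kps = np.array([[0.0, 0.0], [4.0, 3.0], [5.0, 0.0]])
static_kps = filter_static_keypoints(kps, masks, dynamic_ids={2})
```

With a class-level semantic mask, both cars would share one label and the points on the parked car would be discarded as well; the per-instance ids are what let the static one be kept.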
Error or drift frequently arises in pose estimation with geometric "feature detection and tracking" monocular visual odometry (VO) when the camera moves faster than 1.5 m/s. Meanwhile, in most deep learning-based VO methods, the weight factors take fixed values, which easily leads to overfitting. A new neural network-based measurement system for monocular visual odometry, named Deep Learning Visual Odometry (DLVO), is proposed. In this system, a Convolutional Neural Network (CNN) is used to extract features and perform feature matching, and a Recurrent Neural Network (RNN) is used for sequence modeling to estimate the camera's 6-DOF poses. Instead of fixed CNN weight values, a Bayesian distribution over the weight factors is introduced to address network overfitting. The 18,726 frames of the KITTI dataset are used to train the network. The system increases the generalization ability of the network model during prediction. Compared with the original Recurrent Convolutional Neural Network (RCNN), the method reduces the loss of the test model by 5.33%. It is also more effective than traditional VO methods at improving the robustness of translation and rotation information.
Estimating the global position of a road vehicle without using GPS is a challenge that many scientists look forward to solving in the near future. Normally, inertial and odometry sensors are used to complement GPS measurements in an attempt to maintain vehicle odometry during GPS outages. Nonetheless, recent experiments have demonstrated that computer vision can also be used as a valuable source to provide what can be denoted as visual odometry. For this purpose, vehicle motion can be estimated using a non-linear, photogrammetric approach based on RAndom SAmple Consensus (RANSAC). The results show that the detection and selection of relevant feature points is a crucial factor in the global performance of the visual odometry algorithm. The key issues for further improvement are discussed in this letter.
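The RANSAC idea mentioned in this abstract can be sketched on a deliberately simplified problem. The abstract's approach is non-linear and photogrammetric; the example below swaps in a plain 2D translation between matched feature points so that the hypothesize-and-verify loop stays short. The function name `ransac_translation`, the tolerance, and the iteration count are assumptions chosen for illustration.

```python
import numpy as np

def ransac_translation(src, dst, n_iters=100, inlier_tol=1.0, seed=0):
    """Estimate a 2D translation t with dst ~ src + t despite outlier matches."""
    rng = np.random.default_rng(seed)
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        i = rng.integers(len(src))          # minimal sample: one match
        t = dst[i] - src[i]                 # hypothesis from that sample
        residuals = np.linalg.norm(src + t - dst, axis=1)
        inliers = residuals < inlier_tol    # matches that agree with t
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set (least squares for a translation is the mean).
    t_refined = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t_refined, best_inliers

# Toy data: 8 matches follow t = (2, -1); 2 are gross outliers.
src_pts = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 0],
                    [0, 2], [2, 2], [3, 1], [4, 4], [5, 0]], dtype=float)
dst_pts = src_pts + np.array([2.0, -1.0])
dst_pts[8] += np.array([30.0, 10.0])
dst_pts[9] += np.array([-20.0, 15.0])
t_est, inlier_mask = ransac_translation(src_pts, dst_pts)
```

The quality of the result depends on how many of the input matches are inliers, which is consistent with the abstract's point that detecting and selecting relevant feature points is crucial.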
Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features because dense features occupy excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is generated for the human visual attention mechanism, in which sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of initialization for line features. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in the absolute trajectory error on the KITTI and EuRoC datasets. Although the time consumption of PLWM-VO is higher than that of ORB-VO, online test results indicate that PLWM-VO satisfies the real-time demand. The proposed algorithm not only significantly improves the environmental adaptability of visual odometry, but also quantitatively demonstrates the superiority of the human visual attention mechanism.
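The abstract does not say how the sum of absolute differences (SAD) is applied to line features, so the sketch below only shows the generic technique: scoring candidate patches by SAD and picking the lowest-cost horizontal offset. The helper names `sad` and `best_match_along_row` and the toy image strip are assumptions for the example.

```python
import numpy as np

def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized patches."""
    a = np.asarray(patch_a, dtype=np.int64)   # avoid uint8 wrap-around
    b = np.asarray(patch_b, dtype=np.int64)
    return int(np.abs(a - b).sum())

def best_match_along_row(template, strip):
    """Slide `template` horizontally over a strip of the same height and
    return (column offset with the lowest SAD, that SAD)."""
    _, width = template.shape
    n_offsets = strip.shape[1] - width + 1
    costs = [sad(template, strip[:, c:c + width]) for c in range(n_offsets)]
    best = int(np.argmin(costs))
    return best, costs[best]

strip = np.array([[10, 10, 200, 50, 10, 10],
                  [10, 10, 180, 60, 10, 10]], dtype=np.uint8)
template = strip[:, 2:4].copy()
offset, cost = best_match_along_row(template, strip)
```

In a rectified stereo pair, a search of this kind along the same image rows is one common way to obtain a disparity for a pixel or a line endpoint; whether PLWM-VO uses SAD in exactly this way is not stated in the abstract.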
Robust and efficient vision systems are essential to support different kinds of autonomous robotic behaviors that depend on the capability to interact with the surrounding environment without relying on any a priori knowledge. In space missions, above all those involving rovers that have to explore planetary surfaces, vision can play a key role in improving autonomous navigation functionalities: besides obstacle avoidance and hazard detection during travel, vision can provide accurate motion estimation in order to constantly monitor all paths executed by the rover. The present work concerns the development of an effective visual odometry system, focusing as much as possible on issues such as continuous operating mode, system speed, and reliability.
In this paper, a semi-direct visual odometry and mapping system is proposed for an RGB-D camera, combining the merits of both feature-based and direct methods. The presented system directly estimates the camera motion between two consecutive RGB-D frames by minimizing the photometric error. To tolerate outliers and noise, a robust sensor model built upon the t-distribution and an error function mixing depth and photometric errors are used to enhance accuracy and robustness. Local graph optimization based on key frames is used to reduce the accumulated error and refine the local map. The loop closure detection method, which combines an appearance similarity method with a spatial location constraint method, increases the speed of detection. Experimental results demonstrate that the proposed approach achieves higher accuracy in motion estimation and environment reconstruction than other state-of-the-art methods. Moreover, the proposed approach works in real time on a laptop without a GPU, which makes it attractive for robots equipped with limited computational resources.
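A robust sensor model built on the t-distribution is usually applied as a per-residual weight in an iteratively reweighted least-squares solver. The abstract gives no formulas, so the sketch below uses the commonly used zero-mean Student-t weighting, w = (ν + 1) / (ν + r²/σ²), with the scale σ² re-estimated from the weighted residuals. The degrees of freedom (5), the iteration count, and the function name `t_distribution_weights` are assumptions, not values from the paper.

```python
import numpy as np

def t_distribution_weights(residuals, dof=5.0, n_iters=20):
    """Return (per-residual weights, estimated variance) under a zero-mean
    Student-t residual model with `dof` degrees of freedom."""
    r2 = np.asarray(residuals, dtype=float) ** 2
    sigma2 = max(float(r2.mean()), 1e-12)        # initial scale guess
    for _ in range(n_iters):
        weights = (dof + 1.0) / (dof + r2 / sigma2)
        # Fixed-point update of the variance using the current weights.
        sigma2 = max(float(np.mean(r2 * weights)), 1e-12)
    weights = (dof + 1.0) / (dof + r2 / sigma2)
    return weights, sigma2

# Five small photometric residuals and one gross outlier.
residuals = np.array([0.5, -0.3, 0.2, -0.4, 0.1, 25.0])
weights, sigma2 = t_distribution_weights(residuals)
```

The outlier ends up with a much smaller weight than the other residuals, so it contributes little to the weighted photometric (or depth) error that the motion estimate minimizes.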
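Single-sensor simultaneous localization and mapping (SLAM) algorithms for unmanned vehicles have poor robustness, while existing multi-sensor fusion schemes seldom consider vehicle motion constraints, which leads to lateral localization drift. To address this, a tightly coupled visual-inertial-wheel optimization method based on ORB-SLAM is proposed, in which the constraints from all three sources are jointly incorporated into the back-end bundle adjustment (BA). First, residual models are given for the visual odometry, the inertial measurement unit (IMU), and the wheel odometry based on the Ackermann vehicle model; then an optimization framework is built for a monocular visual-inertial-wheel fusion SLAM system based on ORB-SLAM. Experimental results on the KAIST dataset and in a real campus scene show that, compared with other commonly used SLAM methods, the improved algorithm effectively reduces error accumulation and yields more robust and accurate localization and mapping results.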
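To make the idea of a wheel-odometry residual under an Ackermann vehicle model concrete, the sketch below uses the textbook kinematic bicycle approximation in the plane. The paper's actual residual model is not given in the abstract; the function names, the rear-axle reference point, and the planar (x, y, yaw) state are assumptions for illustration only.

```python
import math

def ackermann_step(x, y, yaw, speed, steering_angle, wheelbase, dt):
    """Propagate a planar pose one step with the kinematic bicycle
    approximation of an Ackermann-steered vehicle (rear-axle reference)."""
    x_new = x + speed * math.cos(yaw) * dt
    y_new = y + speed * math.sin(yaw) * dt
    yaw_new = yaw + speed / wheelbase * math.tan(steering_angle) * dt
    return x_new, y_new, yaw_new

def wheel_odometry_residual(pose_prev, pose_curr, speed, steering_angle,
                            wheelbase, dt):
    """Current pose estimate minus the pose predicted from wheel data."""
    px, py, pyaw = ackermann_step(*pose_prev, speed, steering_angle,
                                  wheelbase, dt)
    dyaw = pose_curr[2] - pyaw
    dyaw = math.atan2(math.sin(dyaw), math.cos(dyaw))  # wrap the angle
    return (pose_curr[0] - px, pose_curr[1] - py, dyaw)

# Driving straight at 2 m/s for 0.5 s moves the vehicle 1 m along x.
pose = ackermann_step(0.0, 0.0, 0.0, speed=2.0, steering_angle=0.0,
                      wheelbase=2.7, dt=0.5)
# A current estimate that has slipped 0.1 m sideways shows up in the y residual.
res = wheel_odometry_residual((0.0, 0.0, 0.0), (1.0, 0.1, 0.0),
                              2.0, 0.0, 2.7, 0.5)
```

Because the model does not allow sideways motion when driving straight, a lateral offset in the estimated pose appears directly as a non-zero residual, which is the kind of constraint that can penalize lateral drift inside a joint optimization.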
There are about 253 million people with visual impairment worldwide. Many of them use a white cane and/or a guide dog as a mobility tool for daily travel. Despite decades of effort, an electronic navigation aid that can replace the white cane is still a work in progress. In this paper, we propose an RGB-D camera-based visual positioning system (VPS) for real-time localization of a robotic navigation aid (RNA) in an architectural floor plan for assistive navigation. The core of the system is the combination of a new 6-DOF depth-enhanced visual-inertial odometry (DVIO) method and a particle filter localization (PFL) method. DVIO estimates the RNA's pose using data from an RGB-D camera and an inertial measurement unit (IMU). It extracts the floor plane from the camera's depth data and tightly couples the floor plane, the visual features (with and without depth data), and the IMU's inertial data in a graph optimization framework to estimate the device's 6-DOF pose. Owing to the use of the floor plane and depth data from the RGB-D camera, DVIO achieves better pose estimation accuracy than the conventional VIO method. To reduce the accumulated pose error of DVIO for navigation in a large indoor space, we developed the PFL method to locate the RNA in the floor plan. PFL leverages geometric information from the architectural CAD drawing of an indoor space to further reduce the error of the DVIO-estimated pose. Based on the VPS, an assistive navigation system is developed for the RNA prototype to assist a visually impaired person in navigating a large indoor space. Experimental results demonstrate that: 1) the DVIO method achieves better pose estimation accuracy than the state-of-the-art VIO method and performs real-time pose estimation (18 Hz pose update rate) on a UP Board computer; 2) PFL reduces the DVIO-accrued pose error by 82.5% on average and allows for accurate wayfinding (endpoint position error ≤ 45 cm) in large indoor spaces.
Visual simultaneous localization and mapping (SLAM) is crucial in robotics and autonomous driving. However, traditional visual SLAM faces challenges in dynamic environments. To address this issue, researchers have proposed semantic SLAM, which combines object detection, semantic segmentation, instance segmentation, and visual SLAM. Despite the growing body of literature on semantic SLAM, there is currently a lack of comprehensive research on the integration of object detection and visual SLAM. Therefore, this study gathers information from multiple databases and reviews relevant literature using specific keywords. It focuses on visual SLAM based on object detection, covering different aspects. First, it discusses the current research status and challenges in this field, highlighting methods for incorporating semantic information from object detection networks into odometry, loop closure detection, and map construction. It also compares the characteristics and performance of various object detection algorithms used in visual SLAM. Last, it provides an outlook on future research directions and emerging trends in visual SLAM. Research has shown that visual SLAM based on object detection offers significant improvements over traditional SLAM in dynamic point removal, data association, point cloud segmentation, and other techniques. It can improve the robustness and accuracy of the entire SLAM system and can run in real time. With the continuous optimization of algorithms and improvements in hardware, visual SLAM based on object detection has great potential for development.
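With the continuous development of mobile robot technology, odometry has become a key technology for environment perception in mobile robots, and its level of development is of great significance for improving robot autonomy and intelligence. First, recent developments in lidar SLAM and visual SLAM within simultaneous localization and mapping (SLAM) are systematically reviewed, the classical SLAM framework and its mathematical description are presented, and the camera models of three common types of camera and the mathematical description of their visual odometry are briefly introduced. Second, research progress in traditional visual odometry and deep learning-based odometry is systematically reviewed, and the strengths and weaknesses of the various odometry algorithms of the past 10 years are compared and analyzed. In addition, the performance of seven commonly used datasets is compared and analyzed. Finally, the problems facing odometry are summarized in terms of accuracy, robustness, datasets, and multimodality, and, from the perspective of improving real-time performance and robustness, the following development trends for visual odometry are anticipated: the development of more intelligent, miniaturized new sensors; integration with unsupervised learning; improved semantic representation; and the development of cooperative techniques for robot swarms.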
Funding (review of dynamic SLAM based on instance segmentation): the National Natural Science Foundation of China (No. 62063006); the Natural Science Foundation of Guangxi Province (No. 2023GXNS-FAA026025); the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); the Research Project for Young and Middle-Aged Teachers in Guangxi Universities (ID: 2020KY15013); the Special Research Project of Hechi University (ID: 2021GCC028); financially supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Funding (DLVO): supported by the National Key R&D Plan (2017YFB1301104), NSFC (61877040, 61772351), Sci-Tech Innovation Fundamental Scientific Research Funds (025195305000) (19210010005), and the Academy for Multidisciplinary Study of Capital Normal University.
Funding (PLWM-VO): supported by the Tianjin Municipal Natural Science Foundation of China (Grant No. 19JCJQJC61600) and the Hebei Provincial Natural Science Foundation of China (Grant Nos. F2020202051, F2020202053).
Funding (semi-direct RGB-D visual odometry and mapping): supported by the National Natural Science Foundation of China (61501034).
Funding (RGB-D visual positioning system for a robotic navigation aid): supported by the NIBIB and the NEI of the National Institutes of Health (R01EB018117).
Funding (review of visual SLAM based on object detection): the National Natural Science Foundation of China (No. 62063006); the Natural Science Foundation of Guangxi Province (No. 2023GXNS-FAA026025); the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); the Research Project for Young and Middle-aged Teachers in Guangxi Universities (ID: 2020KY15013); the Special Research Project of Hechi University (ID: 2021GCC028); supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.