Visual SLAM in Human Populated Environments: Exploring the Trade-off between Accuracy and Speed of YOLO and Mask R-CNN
Abstract: Simultaneous Localization and Mapping (SLAM) is a fundamental problem in mobile robotics. However, most Visual SLAM algorithms assume a static scene, which limits their applicability in real-world environments. Dealing with dynamic content in Visual SLAM is still an open problem, with solutions usually relying on direct or feature-based methods. Deep learning techniques can improve the SLAM solution in environments with a priori dynamic objects by providing high-level information about the scene. This paper presents a new approach to SLAM in human-populated environments using deep-learning-based techniques. The system is built on ORB-SLAM2, a state-of-the-art SLAM system. The proposed methodology is evaluated on a benchmark dataset, where it outperforms other Visual SLAM methods in highly dynamic scenarios.