Computer Vision News - February 2019

Visual SLAM approaches

Fundamental Pipeline: The term Visual SLAM covers all approaches to self-localization and mapping that take images as their input. The main difference between Visual SLAM and SLAM based on other sensors is the need to derive depth data from camera images. There are two approaches to this challenge: 1) depth estimation based on extracted features, such as SIFT or ORB, and 2) depth estimation directly from the image’s pixels. Both approaches use the same pipeline.

Basic Visual SLAM pipeline: Tracking → Mapping → Global Optimization → Relocalization

Tracking -- Tracking across consecutive images is used to estimate the camera trajectory and depth information, usually by non-linear optimization. Most approaches use key frames as the basis for tracking, acquiring a new key frame whenever the tracking algorithm detects that the overlap between the current camera image and the key frame is no longer sufficient.

Mapping -- The process of producing a map from the sensor data. At this stage there is a significant difference between feature-based and direct methods: the former create sparse feature maps, while the latter create semi-dense point maps. Some approaches use key frames that include depth and scale data.

Global Optimization -- The tracking process accumulates errors, and the Global Optimization step is used to correct the World Map. The optimization step is computationally intensive and is therefore invoked only every couple of minutes. This step usually relies on recognizing previously seen locations or objects, and so inherently detects loop closures. Some approaches also use 3D data for the optimization step.

Relocalization -- The process of placing a sensor at an unknown pose on the Map and then trying to assess that pose. This is most often done by attempting to match the actual sensor data against the Map. Many approaches use descriptive image features.
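The key-frame rule described under Tracking can be illustrated with a minimal sketch. It is not taken from any particular SLAM system: it assumes each frame is reduced to a set of tracked feature IDs, and the names `overlap_ratio`, `select_keyframes` and the `min_overlap` threshold of 0.5 are hypothetical choices made only for illustration.

```python
from typing import Iterable, List, Set


def overlap_ratio(keyframe_ids: Set[int], frame_ids: Set[int]) -> float:
    """Fraction of the key frame's features still visible in the current frame."""
    if not keyframe_ids:
        return 0.0
    return len(keyframe_ids & frame_ids) / len(keyframe_ids)


def select_keyframes(frames: Iterable[Set[int]], min_overlap: float = 0.5) -> List[int]:
    """Return the indices of frames promoted to key frames.

    Assumption: a frame is a set of tracked feature IDs, and min_overlap
    is an illustrative threshold rather than a recommended value.
    """
    keyframe_indices: List[int] = []
    keyframe_ids: Set[int] = set()
    for index, frame_ids in enumerate(frames):
        # The first frame always becomes a key frame; later frames are
        # promoted once their overlap with the current key frame drops
        # below the threshold.
        if not keyframe_indices or overlap_ratio(keyframe_ids, frame_ids) < min_overlap:
            keyframe_indices.append(index)
            keyframe_ids = set(frame_ids)
    return keyframe_indices


# Toy sequence: the camera moves, so feature IDs gradually leave the view.
frames = [
    {1, 2, 3, 4},    # first frame, becomes the key frame
    {2, 3, 4, 5},    # overlap 3/4 with frame 0, no new key frame
    {4, 5, 6, 7},    # overlap 1/4 with frame 0, new key frame
    {5, 6, 7, 8},    # overlap 3/4 with frame 2, no new key frame
    {8, 9, 10, 11},  # overlap 0/4 with frame 2, new key frame
]
print(select_keyframes(frames))  # [0, 2, 4]
```

A real tracker would measure overlap from matched features or from photometric consistency rather than from shared IDs, but the decision rule has the same shape: compare against the current key frame and replace it once too little of it remains in view.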
… how Deep Learning methods and their specific capabilities can be used to replace individual elements / stages of the Visual SLAM pipeline […] combining those solutions into an overall Deep Learning Visual SLAM