Computer Vision News

Is Mapping Necessary for Realistic...

- The effect of training with a larger dataset, which substantially improves performance.
- The effect of train-time and test-time augmentations (Flip and Swap), which are found to be more effective with a larger training set.
- The effect of a deeper encoder (ResNet-50 vs ResNet-18).
- The effect of dataset transfer, which unsurprisingly shows poor performance when the agent is trained with visual odometry, and which leaves open the question of whether a universal (cross-dataset) VO module will be needed in the future.

The experiments lead to an attempt to deploy the learned agent on a real-world challenge. Across 9 episodes, the LoCoBot equipped with this agent achieves 11% Success and 71% SoftSPL, and makes it 92% of the way to the goal (SoftSuccess).

This work investigates the link between mapping and navigation. It concludes that this link is weak, and shows that the only performance bottleneck in the PointNav task is the agent’s ability to self-localize. The authors finally hint at some exciting future work: analysis of indirect links (from mapping to localization to navigation), and invariance of the approach to dataset and embodiment specificity. Looking forward already!

Computer Vision News - June 2022