ECCV 2020 Daily - Thursday

Poster Presentation

Looking ahead, Tony says he is currently working on incorporating camera poses and some geometry into the descriptor, so that it is optimized for camera pose rather than only for retrieving images from a database. He describes his work on the geometry part so far as “tricky” because it is hard to find representations of geometry that are invariant to the many variations present in this kind of image retrieval dataset.

He originally assumed that a model that worked well on image retrieval would also work well on camera localization, but this is not necessarily the case, as the two tasks are very different.

“In camera localization you have lots of different parameters to think about,” he explains. “You have the camera models. They can have different focal lengths. Images can be taken in different conditions and from extreme viewpoints. To get the global descriptor from one image to just a single vector, you have to do some sort of averaging across the feature map, so you lose all spatial context. Imagine for normal human perception, for human localization, you have to know exactly which part of the building belongs to the top-left of the image or the bottom-right of the image. You have to know those things, but when you turn them into numbers, they’re all lost. We need to find a way to make sure that the output descriptor that we have includes that information, so that we actually know whether these two images come from similar poses or not.”
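
To make the point about averaging concrete, here is a minimal sketch of how pooling a feature map into one global descriptor discards spatial layout. It is an illustration only and not Tony's model: it assumes plain global average pooling over a random NumPy array standing in for CNN activations, and the function name global_average_pool and the array sizes are hypothetical choices made for this example.

```python
import numpy as np


def global_average_pool(feature_map):
    """Collapse a (channels, height, width) feature map into a (channels,) descriptor."""
    return feature_map.mean(axis=(1, 2))


# Hypothetical stand-in for CNN activations: 4 channels on a 6x8 spatial grid.
rng = np.random.default_rng(0)
feature_map = rng.random((4, 6, 8))

# Mirror the map left-to-right, so what was at the top-left is now at the top-right.
flipped = feature_map[:, :, ::-1]

descriptor = global_average_pool(feature_map)
descriptor_flipped = global_average_pool(flipped)

# The two layouts are different arrays, yet their pooled descriptors agree.
layouts_differ = not np.array_equal(feature_map, flipped)
descriptors_match = np.allclose(descriptor, descriptor_flipped)
print(layouts_differ, descriptors_match)
```

Because the average runs over every spatial position, any rearrangement of those positions yields the same descriptor. That is the sense in which knowing which part of the building sits at the top-left or bottom-right is lost once the image is turned into numbers.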