Computer Vision News - December 2019

Marc Pollefeys

…mostly in the Cambridge lab. There's also eye tracking. First, we look at the eyes to do high-quality biometric authentication by having cameras read your iris. At the same time, those cameras can also track your eyes. The most important direct outcome of that is that we can calibrate the display to where your eyes are and optimise the viewing quality for the position of your eyes. A practical use for this is an auto-scrolling feature: as you read text, it naturally keeps scrolling, following your gaze. Beyond that, gaze can also be used in combination with the hands as an input modality, telling the device where you're looking. There are many things like this built into HoloLens.

As we go forward, we want to go even further. We want HoloLens to be able to understand what the user is doing in the scene at the semantic level. So, not only track the hands in detail, but go beyond that and recognise the scene and the objects that are in front of them. Those are all really important things to make the device even better suited to assist the user in solving whatever problem or task they're trying to address.

In Zurich, one of the key things my team focuses on is being able to share experiences across multiple users and devices. Both at the same time, but also so that I can place information in the world and then later either retrieve it myself or have other people with their devices retrieve it. In the same way that computers became a lot more useful once they were networked and once I could place something on the internet that somebody else could read - think Wikipedia - this is the same thing, but for annotating and placing information in the context of the real world. That becomes a lot more interesting when I place information somewhere in the world and another person who is entitled to do so can access it.

That requires that my device always builds up a 3D SLAM map of the environment I traverse, and I need to be able to share that map with other devices. The natural way to do that is to build up this map in the cloud. If I map part of the space, and then somebody else walks through the space with their device, and so on, all of us together can share and build it up. Essentially, each of those is a small puzzle piece describing the world. If I assemble all those puzzle pieces, I can build and maintain an up-to-date map of the world that allows everybody to share information that's aligned and attached to the world. That's really the ambition.

The first step in doing that is a mixed reality service called Azure Spatial Anchors. That's about being able

"Think of it like Post-its that you can attach to the world."
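To make the "Post-its attached to the world" idea concrete, here is a minimal sketch of the concept: a payload pinned to a pose in a shared world frame, published to a cloud store by one device and queried by another. All names here (Anchor, CloudAnchorStore, publish, query_near) are hypothetical illustrations and are not the Azure Spatial Anchors API; the sketch assumes both devices are already localised in the same world coordinate frame by their SLAM maps.

```python
# Conceptual sketch only - hypothetical types, not the Azure Spatial Anchors SDK.
from dataclasses import dataclass, field
from typing import Dict, List
import numpy as np


@dataclass
class Anchor:
    """A payload pinned to a 6-DoF pose expressed in a shared world frame."""
    anchor_id: str
    pose_world: np.ndarray   # 4x4 rigid transform: anchor frame -> world frame
    payload: dict            # e.g. a note, a hologram reference, metadata


@dataclass
class CloudAnchorStore:
    """Stand-in for a cloud service that persists anchors across devices."""
    anchors: Dict[str, Anchor] = field(default_factory=dict)

    def publish(self, anchor: Anchor) -> None:
        self.anchors[anchor.anchor_id] = anchor

    def query_near(self, pose_world: np.ndarray, radius_m: float) -> List[Anchor]:
        """Return anchors whose origin lies within radius_m of the query pose."""
        centre = pose_world[:3, 3]
        return [a for a in self.anchors.values()
                if np.linalg.norm(a.pose_world[:3, 3] - centre) <= radius_m]


# Device A, localised against the shared map, leaves a "Post-it" in the world.
store = CloudAnchorStore()
device_a_pose = np.eye(4)   # pose from device A's SLAM, in the shared world frame
store.publish(Anchor("note-1", device_a_pose, {"text": "Inspect this valve"}))

# Device B, localised in the same world frame, later retrieves nearby anchors.
device_b_pose = np.eye(4)
for anchor in store.query_near(device_b_pose, radius_m=5.0):
    print(anchor.payload["text"])
```

The key design point the sketch tries to capture is that the anchor stores a pose in a frame shared by all devices, so anyone who can localise against the common map (and is entitled to the data) can recover the information exactly where it was placed.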
