Computer Vision News - February 2022
Congrats, Doctor!

Artificial intelligence agents operating in the real world need to create internal representations of objects to reason about them. For instance, self-driving cars need to reason about pedestrians, cyclists, and other cars, and anticipate their actions so they can prevent collisions. These objects can be represented using structural representations such as object keypoints, parts, or full 3D shape models such as meshes. Deep learning has demonstrated great success in learning these structural representations from images. This success, however, relies heavily on large datasets of annotated images that are expensive to collect, which complicates the deployment of AI in novel environments.

Self-supervised learning aims to reduce this annotation bottleneck by using the input images themselves as the source of supervision. A simple example is an autoencoder tasked with reconstructing an input image that has been passed through a tight information bottleneck. We extend this autoencoding framework by engineering bottlenecks that, given a collection of images or videos of an object category, disentangle the object structure from other factors, for example the object pose from the appearance. Moreover, the bottlenecks and regularizers are designed to enforce that the structural representation takes the form of 2D [1, 2] or 3D object landmarks [3], or a 3D mesh [4].

Tomas Jakab has recently completed his PhD at the Visual Geometry Group (VGG), University of Oxford. His research focuses on self-supervised learning of structural representations of objects. He developed self-supervised learning methods for object keypoint detectors, 3D keypoints for shape control, and 3D object category reconstruction from a single image. During his PhD, he also worked as a research intern at Google Research New York. Currently, he continues his work at VGG, improving self-supervised 3D object reconstruction. Congrats, Doctor Tomas!
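To make the autoencoding idea concrete, here is a minimal sketch (not the author's actual models, which operate on images with learned disentangling bottlenecks): a tiny linear autoencoder in NumPy that compresses 16-dimensional inputs through a 2-dimensional bottleneck and is trained to reconstruct them. The input itself is the supervision signal; all dimensions, learning rates, and variable names are illustrative assumptions.

```python
import numpy as np

# Self-supervision in miniature: the input is its own reconstruction target.
# A tight bottleneck (k << d) forces the code to keep only the dominant
# structure of the data. All sizes and hyperparameters here are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))               # stand-in for flattened images

d, k = 16, 2                                 # input dim, bottleneck dim
W_enc = rng.normal(scale=0.1, size=(d, k))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(k, d))   # decoder weights
lr = 0.5

def loss(X, W_enc, W_dec):
    Z = X @ W_enc          # latent code passed through the bottleneck
    X_hat = Z @ W_dec      # reconstruction of the input
    return np.mean((X - X_hat) ** 2)

initial = loss(X, W_enc, W_dec)
for _ in range(500):
    Z = X @ W_enc
    X_hat = Z @ W_dec
    err = X_hat - X                          # reconstruction error, shape (n, d)
    # Gradients of the mean-squared reconstruction error w.r.t. each weight matrix.
    grad_dec = Z.T @ err * (2.0 / X.size)
    grad_enc = X.T @ (err @ W_dec.T) * (2.0 / X.size)
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

final = loss(X, W_enc, W_dec)
```

With no labels at all, gradient descent drives the reconstruction loss down; the 2-D code ends up spanning the data's dominant directions, which is the linear analogue of the structural codes (keypoints, landmarks, meshes) the bottlenecks above are engineered to produce.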