Computer Vision News - October 2018

Poster Presentation: Neural Graph Matching Networks for Fewshot 3D Action Recognition

Michelle Guo is a graduate student at Stanford University under the supervision of Fei-Fei Li. She speaks to us about the poster she presented at ECCV on fewshot learning for 3D action recognition.

Michelle tells us that the work provides a valuable solution for an important problem: there are very few labelled 3D video datasets. Fewshot learning has recently become a prominent topic in computer vision. However, most of the work has focused on RGB and 2D domains, where labelled datasets are far more plentiful than in 3D.

She says they are taking advantage of the inherent structure of 3D video data. 3D data has a lot of spatial structure, which they leverage to learn what they call modular graph structures: components that can be present across several different actions. By learning the essential modular components of different actions, they can further address the problem of fewshot learning, in which a model trains with very few examples and must then generalise to novel, unseen classes at test time.

Michelle explains more about the central methods used: “This work basically has two important steps. The first step is graph generation. Our model is able to automatically generate graphs from input 3D videos. The second step from the generated graphs is to learn a deep distance metric to perform a graph matching. This graph matching is a prevalent method for fewshot learning methods in RGB. We wanted to leverage that same method and extend it towards 3D and also towards graphs. What you have is an end-to-end neural network that is able to learn how to generate and match graphs to be able to perform this fewshot action recognition in 3D.”
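To make the two steps concrete, here is a minimal Python sketch of the general idea. It is not the authors' network. It assumes each clip has already been reduced to a single set of 3D points, and it swaps both learned components for hand-written stand-ins: a distance threshold in place of learned graph generation, and a fixed two-number summary in place of the deep distance metric. The function names, the `radius` parameter and the toy "spread" and "compact" classes are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def generate_graph(points, radius=1.5):
    """Hand-written stand-in for learned graph generation (assumption):
    nodes are 3D points, edges join pairs closer than `radius`."""
    edges = [(i, j)
             for i in range(len(points))
             for j in range(i + 1, len(points))
             if euclidean(points[i], points[j]) < radius]
    return points, edges

def embed(graph):
    """Fixed stand-in for a learned graph embedding (assumption):
    (mean node degree, mean edge length)."""
    points, edges = graph
    if not edges:
        return (0.0, 0.0)
    mean_degree = 2 * len(edges) / len(points)
    mean_length = sum(euclidean(points[i], points[j]) for i, j in edges) / len(edges)
    return (mean_degree, mean_length)

def classify(query_points, support):
    """Graph matching as nearest neighbour: return the label of the
    support example whose graph embedding is closest to the query's."""
    q = embed(generate_graph(query_points))
    return min(support,
               key=lambda label: euclidean(q, embed(generate_graph(support[label]))))

# One labelled example per novel class (a one-shot episode with toy data).
support = {
    "spread": [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)],
    "compact": [(0, 0, 0), (0.5, 0, 0), (0, 0.5, 0), (0.5, 0.5, 0)],
}
query = [(0, 0, 0), (1.1, 0, 0), (2.0, 0.1, 0), (3.2, 0, 0)]
prediction = classify(query, support)
print(prediction)  # spread
```

In an end-to-end system of the kind Michelle describes, both `generate_graph` and `embed` would be learned jointly by a neural network rather than fixed by hand; only the final nearest-match step over a handful of labelled examples carries over from this sketch.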
