EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
ICCV Daily, Wednesday – Oral Presentation

Imagine asking your AI assistant to find your misplaced mobile phone or keys. The task at the heart of this paper, VQ3D (Visual Queries with 3D Localization), holds the promise of helping people locate their belongings and other objects of interest in their daily lives. Last year, the Ego4D benchmark introduced multiple video-understanding tasks, including VQ3D, which fuses 3D geometric comprehension with egocentric video understanding. Jinjie's new take on the problem is a multi-stage, multi-module solution that aims to improve on the baseline.

Jinjie Mai is a master's student in Bernard Ghanem's lab at KAUST. His paper on 3D object localization in egocentric videos has been accepted as both an oral and a poster this year. He speaks to us ahead of his presentation this morning.