sets and 2D data sets like COCO and ADE20K. These are standard 2D segmentation data sets that have been used for a long time”, he explains. “And I think we found quite an elegant way to use these data sets, even though our model is 3D, to still be able to get the diversity from these 2D data sets!”

This project opens new directions. In fact, Lojze was almost stressed by the number of different things they could do after this. The clearest one for him stems from the need for much more data to really have a general model, similar to how the foundation models in 3D operate. “My hope,” he describes, “is that we can use the entirety of 3D data sets that we already have in some sort of unsupervised way to still train panoptic segmentation on top of already existing data. The long-term goal for us is to use these models to enable reasoning in 3D. Today there are some developments in 3D reasoning, but the field is very fractured. Everybody develops their own data set for the kinds of use cases they want to cover and so on. So I think we're still waiting for the big breakthrough in 3D reasoning!”

This work is deeply connected to computer vision. “I would say that 3D vision and semantic understanding”, he confirms, “are two important pillars of computer vision that have been researched since the very start of computer vision. I think merging these two concepts into a unified model is one more example of trying to unify all the sub-fields.”

To learn more about Lojze’s work, visit Poster Session 2 (Exhibit Hall I) from 15:00 to 17:00 [Poster 79].

DAILY ICCV Tuesday Poster Presentation