Computer Vision News

remove the cup, its shadow and other elements should be removed as well. It helps to do this in the attention space and update the image latents accordingly, so that the change is also reflected in the surrounding environment.

These findings did not come without major challenges! The team's earlier direction was not geometry-based image editing. They originally wanted to do novel view synthesis for regular everyday objects: if you have seen a photo of a cup from the front, how would the cup look from the back? Rahul wanted to do it without training. Datasets exist for objects, but for scenes, such as seeing the same door from a different viewpoint, you want something that can be done without training a large model.

When they followed this first direction, they ran into issues: these models cannot give a precise view from a novel direction. They could handle only small changes in angle, up to about 45 degrees. This is what led the team to merge some attributes of novel view synthesis and geometry with image editing. That's how GeoDiffuser came into existence. On top of that, Rahul added loss optimizations that make it possible to remove objects, move them and more.

Best Student Paper Award