Computer Vision News - April 2025

Why exactly did the jury pick this paper as the winner, out of hundreds of student papers? Did Rahul think about this? “I feel one main strength is that our paper is very general,” is Rahul’s reply, “so it can be applied to a wider audience. Any model that has attention blocks can use some ideas from this work to update or tweak the outputs. Another strength is that visually our results were really pleasing! Finally, in our supplement we have a fine analysis of why, if I remove an object, the shadows should be removed too. We have more analysis of how other works do it and where they fail and even where we fail as well!”

The most immediate future direction for GeoDiffuser is to translate this effort from image models to video. “We were seeing how video diffusion models process these latents,” concludes Rahul, “and how we can maybe edit them. The concern is that video diffusion models are compressing their latents very much, so there is a disconnect between explainability of how these models are compressing and which regions are important for what types of edits. Our next work is going deeper into this and also looking into scene generation and 3D reconstruction - a fusion of these two!”

