Computer Vision News - October 2023

function, we can learn embedding spaces that are better suited for K-nearest neighbor retrieval. And as retrieval is one of the cornerstones for multimodal learning, this is something that's actually pretty cool. We have a lot of interesting papers on the usage of language together with video or how we can actually make text better for video. We have some interesting ones, which will hopefully be available soon, but I cannot talk about them at the moment. [laughs]

Let's tell the ICCV people that they should come to the posters of Nina Shvetsova, Sirnam Swetha and Wei Lin. Come to the three posters of these young and fine people and ask questions. You might, by chance, find Hilde there. Three posters are no mean feat!

Absolutely! Actually, there are four posters.

Which is the fourth?

Nina has two. Nina has Sorting and In-Style.

Okay, so you will have to tell me about Nina, who was able to get two first-author posters at the same conference… What is special about her way of working?

[laughs] Well, let's first say Nina is great! Nina is also working with me. Nina is my first PhD, so it's always something special. And first, just to not overstate, the Sorting paper was a lot of hard work, and it got rejected twice or even three times. Whoever gets rejected always resubmits and makes it better. At some point, it will work. But the second thing is this In-Style paper, which is a bit more about this research on how to use language models to make video annotations better. So that's generally Nina's idea; it's all hers. I think it's super cool work, and hopefully, it helps the video community to solve a few of our problems.

What is the most difficult thing that you have done in this field until now?

Oh, my God. That's a good question!

Thank you. [laughs]

Um, I don't know, actually, because every project is kind of unique!

Hilde Kuehne
