ECCV 2016 Daily - Thursday

model work?

Ranjay: There is one very hard challenge in this model: a lot of visual concepts occur in a long-tail distribution. Some things, like cars on the street, people walking with their dogs, or cat images, occur very frequently. We don’t see a lot of examples of, let’s say, an elephant drinking milk or a dog riding a surfboard. The question is how do you learn to predict those things if you have barely seen any examples of them. I think that was the biggest challenge that this paper tries to address. What we do is we use language priors. Even if we haven’t seen something in the visual world, say, examples of elephants drinking milk, maybe we have read about it in some text somewhere. Maybe we can leverage the stories we have read to improve the model that we have. That’s what we do.

ECCV Daily: What problems will this solve in the real world?

Ranjay: I like to work on problems that I can see having an immediate impact and application that can help people. We’ve seen a lot of work by Facebook in being able to identify different objects in an image to help blind people, for example. When someone who is visually impaired looks at an image on Facebook, they can see that it has three people with maybe food, a glass, etc. I want to be able to take that a step further. I want it to be able to say that it’s a professional scene where people are wearing suits and holding glasses that contain wine, versus a more party-like scene where they are at a restaurant, eating pizza and holding beer bottles instead. All of these contexts change the way that people react to different images. It would give people a better sense of what is happening in the image. Those are the kinds of applications I see moving forward.
I would like to see smarter self-driving car models. If they see a man on the sidewalk heading towards the road, and they can identify whether the man is running with someone else or running on the street, then maybe they should stop, because the person is less likely to slow down.
