Computer Vision News - August 2018

With Devi Parikh

The Visual Dialog Challenge 2018 is built on the belief that an Artificial Intelligence agent can hold a meaningful dialogue with a human about visual content. In order to do so, the AI agent must have some understanding of the content of the image and, at the same time, remember the previous exchanges in the dialogue so that it can continue it in a meaningful way. Something like the exchange in the image here on the left: first, the agent describes the image by way of a simple caption, and then it answers specific questions. In this example, proposed by the organizers of the challenge, the AI agent is able to answer the first few questions correctly, while it is less sure about the answers to the following ones. The challenge is meant to help push the state of the art in this ability.

The challenge is conducted on a dataset based on COCO images. This dataset (VisDial v1.0) contains as many as 1.3 million question-answer pairs (starting from an image caption) on about 130,000 images (10 pairs per image). The challenge is currently open and the deadline for submissions is August 15. Winners will be announced at ECCV 2018, which will take place in Munich, Germany, in September.

We asked Devi Parikh, Assistant Professor at the School of Interactive Computing, Georgia Tech, and Research Scientist at Facebook AI Research (FAIR), to tell us how the challenge is going. Here is her answer:

“The challenge is going well so far! One thing we are excited about in particular is how we've set up the evaluation! Evaluating dialog models -- or any natural language generation task for that matter -- is an open problem. But the way we've set up the evaluation for this challenge helps make the evaluation more meaningful than it was so far.”