CVPR Daily - 2018 - Thursday

Originally from Russia, Ksenia Konyushkova is a final-year PhD student at CVLab at EPFL in Switzerland, working with Pascal Fua. She speaks to us ahead of her spotlight and poster today. The work she is presenting is about reducing the annotation load for bounding box annotations and was completed during her four-month internship at Google with Jasper Uijlings, Christoph Lampert and Vittorio Ferrari.

Ksenia explains that they wanted to train an object detector and, to do this, needed to annotate bounding boxes. If humans have to draw bounding boxes manually, it can be a time-consuming task. However, previous literature has proposed a solution called box verification series. The idea is that a detector is trained (maybe a weak detector that is trained only with image-level labels), then this detector is applied to an image and generates some box proposals. Humans can then quickly verify the box proposals by looking at them and saying whether they are correct or not.

The problem with that, Ksenia goes on to say, is that it doesn’t always work. Maybe the object is not detected at all, or maybe, as you show the boxes in order of decreasing detector score, the real box comes as the hundredth box, so it will take a very long time to reach the box that you want.

In this work, she is trying to choose which annotation method is best for a given image. For example, a clear image with just one distinct object in the middle is very likely to be detected even by a weak detector, so a box verification series will take very little time. However, for a small object in a crowded scene, a weak detector is unlikely to find the object, or the correct box will come very late in the series. Ksenia is training agents that try to figure out which of these annotation modalities should be used: either verification or manual drawing.

Ksenia tells us there are two ways to solve this problem.
The first one is a model-based approach: “I try to predict the probability that a given bounding box is going to be accepted by the user. Then if we know this probability, we can compute the time that it will take to annotate an image with any sequence of actions. For example, we’ll say we’ll verify the first bounding box and if it is rejected, we will ask the user to draw. Since we know
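
To make the model-based idea concrete, here is a minimal Python sketch of how an expected annotation time could be computed once acceptance probabilities are available. It is an illustration under stated assumptions, not the method from the work itself: the function names, the fixed per-action times, and the treatment of each box's acceptance as independent of earlier rejections are all hypothetical choices made for this example.

```python
from typing import Sequence, Tuple


def expected_annotation_time(
    accept_probs: Sequence[float],
    n_verify: int,
    verify_time: float,
    draw_time: float,
) -> float:
    """Expected time of the strategy: verify the first n_verify boxes in order,
    stop at the first accepted one, and draw manually if all are rejected.

    Assumption (hypothetical): accept_probs[i] is the probability that box i is
    accepted given that it is shown, and verify_time / draw_time are constants.
    """
    if not 0 <= n_verify <= len(accept_probs):
        raise ValueError("n_verify must be between 0 and len(accept_probs)")

    expected = 0.0
    p_reached = 1.0  # probability that every earlier box was rejected
    for p in accept_probs[:n_verify]:
        expected += p_reached * verify_time  # we pay for this verification
        p_reached *= 1.0 - p                 # and continue only if rejected
    expected += p_reached * draw_time        # fall back to manual drawing
    return expected


def best_strategy(
    accept_probs: Sequence[float], verify_time: float, draw_time: float
) -> Tuple[int, float]:
    """Pick how many boxes to verify before drawing (0 means draw right away)."""
    costs = [
        expected_annotation_time(accept_probs, k, verify_time, draw_time)
        for k in range(len(accept_probs) + 1)
    ]
    best_k = min(range(len(costs)), key=costs.__getitem__)
    return best_k, costs[best_k]


if __name__ == "__main__":
    # Illustrative numbers only: a confident detection versus an unlikely one.
    print(best_strategy([0.9, 0.3], verify_time=2.0, draw_time=7.0))
    print(best_strategy([0.05, 0.02], verify_time=2.0, draw_time=7.0))
```

With a high acceptance probability, verifying first comes out cheaper in expectation; with a very low one, the sketch prefers drawing straight away, which mirrors the easy-image versus crowded-scene contrast described above.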
