Computer Vision News - November 2016

As a baseline, the performance of MCG and of the newer HED [2] (denoted MCG-HED) was evaluated. VGGNet-Side and ResNet50-Side were trained by naïvely taking the 5 side-outputs directly as the contour detections (without the and ). Evaluating the two variants of the “base CNN”, Ours(VGGNet) and Ours(ResNet50), reveals that the deeper architecture of ResNet translates into better boundaries and regions. Lastly, using Ours(ResNet50) to test the four possible combinations (with and without normalized cut globalization and orientation) revealed that, when coupled with trained orientations, globalization actually decreases performance, so it can quite safely be removed, allowing a significant speed-up.

More in-depth results for the Contour Orientation, Generic Image Segmentation and Fast Hierarchical Regions methods, as well as comparisons against more recent techniques (POISE, MCG and SCG, LPO, GOP, SeSe, GLS, and RIGOR), can be found in the original paper.

Interview with the authors:

Computer Vision News: What is the main novelty in your approach?

Jordi: The first main novelty is to bring multiscale image segmentation to a single pass of a CNN. We proved in previous papers on multiscale combinatorial grouping (MCG) that segmenting at different scales works very well and gives better results. Other approaches run the same algorithm and the same CNN on different scales of the image, but that’s suboptimal. One of the novelties here is that we have multiscale contour detection with a single pass of the CNN. The second novelty is that, apart from assessing whether there is a boundary or not in each pixel, we also learn the orientation of that boundary. We say whether or not there is a contour here, and at which orientation that contour is. With these three things, the multiscale, the contours, and the orientation, we do the same trick that we were doing in multiscale combinatorial grouping. From these contours we get to regions.
In this step from contours to regions, having the orientation helps us in closing these contours into a region. That also gives us a boost. All of that is done in less than a second on a CNN, a large improvement over the previous MCG, which took about 25 seconds per image. With this, we have state-of-the-art object proposals in segmentation.

CVN: Before you found this solution, did you try any algorithms that took you in the wrong direction?

Jordi: We tried the classical approach of downsampling the images and applying the same thing every time.

Photo: Kevis and Ralph at MICCAI 2016, discussing common passions: coffee and computer vision…
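
The classical multiscale approach Jordi describes (downsample the image, apply the same detector at every scale, then combine) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the authors' code: the detector is a plain gradient magnitude standing in for a CNN, and the function names, the scale factors (1, 2, 4) and the averaging step are all hypothetical choices made for this sketch.

```python
import numpy as np


def gradient_contours(image):
    """Stand-in contour detector (assumption): gradient magnitude in [0, 1]."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude


def downsample(image, factor):
    """Keep every `factor`-th pixel (no anti-aliasing, for brevity)."""
    return image[::factor, ::factor]


def upsample(image, shape):
    """Nearest-neighbour resize of `image` back to `shape`."""
    rows = np.arange(shape[0]) * image.shape[0] // shape[0]
    cols = np.arange(shape[1]) * image.shape[1] // shape[1]
    return image[np.ix_(rows, cols)]


def multiscale_contours(image, factors=(1, 2, 4)):
    """Run the same detector once per scale, then average the maps."""
    maps = [
        upsample(gradient_contours(downsample(image, f)), image.shape)
        for f in factors
    ]
    return np.mean(maps, axis=0)


# Toy input: a bright square on a dark background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
contours = multiscale_contours(image)
```

Note that the detector runs once per scale here. That repeated work is what the single-pass design described in the interview avoids.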