Computer Vision News - June 2018

Methods: First, we’ll give an overview of the network’s structure and then go into further detail about each component:

1. A CNN encoder [in green in the figure below] extracts features from the image cropped around the user-marked bounding box.
2. Next, another network [in blue] takes these features as input and predicts the first vertex, from which construction of the object’s polygon will start.
3. The image features and the first vertex are the input to an RNN Decoder [orange]. Each iteration of the RNN produces the next vertex of the polygon [marked by a red square position within a gray box]; the iterations are represented by a broken line.
4. The RNN decoder network includes a Visual Attention mechanism [white], which uses weights to focus the RNN decoder on a certain area in which to search for the next vertex.
5. An Evaluator Network [pink] takes a set of candidate polygons proposed by the RNN decoder as input and selects the best polygon from among them.
6. Finally, a network called GGNN (gated graph neural network) [in yellow] works at a higher resolution to refine the polygon produced, by adding vertices and adjusting the overall polygon.

Now, let’s look more closely at each component of the network:

1. First, let’s describe the structure of the CNN Encoder [green]: The Encoder network is based on a ResNet-50 architecture, with the following modifications:
(1) The stride of the network was reduced and dilation factors were introduced.
(2) The original average pooling and FC layers were removed.
(3) A skip-layer architecture was added to certain convolutional layers in the network, and all skip-layer outputs are concatenated (the skip layers capture both low-level features such as edges and corners and high-level semantic features).
(4) Finally, a combination of conv layers and max-pooling operations was used to obtain the final feature map.
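To make modification (1) more concrete, here is a minimal sketch of the standard convolution output-size arithmetic. It shows why swapping a strided convolution for a dilated one keeps the feature map at a higher resolution while each filter still spans a wider area. The 28x28 input size, the 3x3 kernel and the padding values are illustrative assumptions, not figures taken from the paper, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def kernel_span(kernel, dilation=1):
    """Number of input positions spanned by one (possibly dilated) kernel."""
    return dilation * (kernel - 1) + 1


# Assumed 28x28 feature map and 3x3 kernel (illustrative values only).
strided = conv_output_size(28, kernel=3, stride=2, padding=1)
dilated = conv_output_size(28, kernel=3, stride=1, padding=2, dilation=2)

print("stride 2, dilation 1:", strided)  # resolution is halved
print("stride 1, dilation 2:", dilated)  # resolution is preserved
print("span of dilated 3x3 kernel:", kernel_span(3, dilation=2))
```

Under these assumptions the strided layer halves the map, whereas the dilated layer keeps all 28 positions and still looks at a 5-wide neighbourhood, which is the usual motivation for trading stride for dilation in an encoder that must localise polygon vertices.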
In the figure below the 112x112 blue tensor is fed directly to
