Computer Vision News - October 2016

The DeepPyramid DPM model overview (figure below): (1) an image pyramid is built from a color input image; (2) each pyramid level is fed through a truncated SuperVision/AlexNet that ends at convolutional layer 5 (conv5); (3) this produces a pyramid of conv5 feature maps, each with a depth of 256 channels, as in SuperVision/AlexNet; (4) each conv5 level is fed into a DPM-CNN (more details in the next paragraph), which (5) produces a pyramid of DPM detection scores.

A single-component DPM-CNN operates on one feature pyramid level: (1) the pyramid level is convolved with the root filter and the P part filters, generating P+1 convolution maps; (2) these maps are processed with a distance transform (DT) pooling layer and then (3) with a sparse object geometry filter. The output is a single-channel score map for the DPM component.

Training and testing summary:

Input
  Training: set of images
  Testing: trained CNN network (SuperVision/AlexNet); color image pyramid
Output
  Training: optimized hyperparameters: distance transform (DT), geometry filter
  Testing: DPM score map for the input pyramid level
Dataset: PASCAL VOC 2007, PASCAL VOC 2010-2012
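The three DPM-CNN steps described above (filter convolution, DT pooling, geometry filter) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`correlate`, `shift`, `dt_pool`, `dpm_component_score`) are hypothetical, DT pooling is done by brute force over a small displacement window rather than with a linear-time distance transform, only the part maps are pooled (as in a classical DPM), parts and root are assumed to share one resolution, and the geometry filter is modeled as a sum of part maps shifted by their anchor offsets.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate(feat, filt):
    """Cross-correlate a (C, H, W) feature map with a (C, h, w) filter.

    The map is zero-padded at the bottom/right so the output is (H, W),
    with each response anchored at the filter's top-left corner (assumption).
    """
    _, h, w = filt.shape
    padded = np.pad(feat, ((0, 0), (0, h - 1), (0, w - 1)))
    windows = sliding_window_view(padded, (h, w), axis=(1, 2))  # (C, H, W, h, w)
    return np.einsum("cyxij,cij->yx", windows, filt)


def shift(m, dy, dx, fill):
    """Return out with out[y, x] = m[y + dy, x + dx], or `fill` if out of bounds."""
    H, W = m.shape
    out = np.full((H, W), fill, dtype=float)
    ys, ye = max(0, -dy), min(H, H - dy)
    xs, xe = max(0, -dx), min(W, W - dx)
    if ys < ye and xs < xe:
        out[ys:ye, xs:xe] = m[ys + dy:ye + dy, xs + dx:xe + dx]
    return out


def dt_pool(score, w, radius=2):
    """Brute-force DT pooling of one part score map.

    out[y, x] = max over (dy, dx) of score[y + dy, x + dx] - cost(dy, dx),
    with cost = w[0]*dx + w[1]*dx^2 + w[2]*dy + w[3]*dy^2 and |dy|, |dx| <= radius.
    """
    out = np.full(score.shape, -np.inf)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cost = w[0] * dx + w[1] * dx * dx + w[2] * dy + w[3] * dy * dy
            out = np.maximum(out, shift(score, dy, dx, -np.inf) - cost)
    return out


def dpm_component_score(feat, root, parts, anchors, def_weights, radius=2):
    """Single-channel score map for one DPM component on one pyramid level.

    anchors[p] = (ay, ax) is the offset of part p relative to the root.
    Part placements that fall outside the map contribute 0 (assumption).
    """
    score = correlate(feat, root)                       # root response
    for filt, (ay, ax), w in zip(parts, anchors, def_weights):
        pooled = dt_pool(correlate(feat, filt), w, radius)
        score = score + shift(pooled, ay, ax, 0.0)      # sparse geometry filter
    return score
```

With a conv5 level stored as an array of shape (256, H, W), a 256-channel root filter and P part filters, `dpm_component_score` returns an (H, W) score map; repeating the call for every pyramid level gives a pyramid of scores. The brute-force pooling costs O(radius²) per location, which is acceptable for a sketch but slower than a true distance transform.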