Computer Vision News - December 2018

The images at the bottom of the previous page are an example from the ablation study: using TF (temporal fusion) or MR (mask refinement) alone achieved only limited improvement, as seen in (b) and (c). Combining TF and MR, however, significantly improved performance (d).

Above, sample results from the DAVIS 2017 test-dev set. The left-most column shows the ground-truth mask input for the first frame. The remaining columns show segmentation results for subsequent frames. Different colors highlight different objects. These samples show the highly challenging nature of the dataset. The authors didn't employ specific object detectors.

Above, sample results from the DAVIS 2016 dataset.

Conclusion: This paper is an attempt to mine the higher-order potentials of combining MRF/CRF with CNNs, by embedding a feed-forward pass of a CNN inside the inference of an MRF model. The authors implement an innovative spatio-temporal MRF model for video object segmentation. Their algorithm performs inference in the MRF model by alternating between temporal fusion and a feed-forward pass of a mask refinement CNN, incrementally inferring the video object segmentation. The algorithm achieved state-of-the-art results on the DAVIS 2017 public benchmark.
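
The alternation described in the conclusion can be illustrated with a toy sketch. It is not the authors' implementation: `temporal_fusion` here is a plain average over neighbouring frames with no motion compensation, and `refine_stub` is a hypothetical stand-in (a 3x3 box filter plus threshold) for the mask refinement CNN. All function names and the `n_iters` parameter are assumptions made for illustration.

```python
from typing import Callable, List

Mask = List[List[float]]  # per-pixel foreground probability in [0, 1]


def temporal_fusion(masks: List[Mask], t: int) -> Mask:
    """Average frame t's mask with its temporal neighbours.

    Assumption: a plain average with no motion compensation, which only
    imitates the temporal dependencies of the spatio-temporal MRF.
    """
    neighbours = [masks[k] for k in (t - 1, t, t + 1) if 0 <= k < len(masks)]
    h, w = len(masks[t]), len(masks[t][0])
    return [[sum(m[y][x] for m in neighbours) / len(neighbours)
             for x in range(w)] for y in range(h)]


def refine_stub(mask: Mask) -> Mask:
    """Hypothetical stand-in for the refinement CNN: 3x3 box filter + threshold."""
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [mask[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(1.0 if sum(vals) / len(vals) >= 0.5 else 0.0)
        out.append(row)
    return out


def alternate_inference(masks: List[Mask],
                        refine: Callable[[Mask], Mask] = refine_stub,
                        n_iters: int = 3) -> List[Mask]:
    """Alternate temporal fusion and mask refinement over all frames.

    Frame 0 holds the ground-truth mask and is never modified.
    """
    current = [[row[:] for row in m] for m in masks]  # copy the input
    for _ in range(n_iters):
        for t in range(1, len(current)):
            fused = temporal_fusion(current, t)   # temporal fusion step
            current[t] = refine(fused)            # mask refinement step
    return current
```

In this sketch, a frame whose mask has a single spurious hole is repaired when its neighbours agree on the foreground, which loosely mirrors why the two steps help each other in the ablation study: fusion brings in evidence from other frames, and refinement cleans up the fused estimate.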
