Computer Vision News

BEST OF CVPR - Bowen Cheng

“We try to reduce this annotation time by introducing a different form of annotation,” Bowen tells us. “We’re not trying to annotate the mask. Instead, for each object, we first annotate its bounding box. Doing this can take only seven seconds, which is more than five times faster than annotating the mask of every object. Then within this bounding box, we randomly sample a few points. We present each point to a human annotator and ask whether it is on the object or the background.”

This reduction in annotation time provides the opportunity to collect more data for training these instance segmentation models. Also, this point-based annotation is compatible with existing instance segmentation algorithms. You can replace the mask annotation with this point-based annotation without changing the architecture, the training algorithm, or the loss.

When he started working on this, Bowen says another paper, NeRF: Neural Radiance Fields – which received a Best Paper Honorable Mention at ECCV 2020 – had just come out.

“NeRF is used for 3D representation, and at that time, we were wondering if NeRF can represent an object in 3D, can we use NeRF to also represent an object in 2D and the mask of every object in an image?” he ponders. “We found that when we trained this model, we didn’t need the entire mask; we can sample several points inside this object to train the model, which is how NeRF works. We tried to take this one step further. If it works for this NeRF-style model, can these points be used to supervise any instance segmentation model? Surprisingly, it works well for any arbitrary instance segmentation model.”

There are two ways to describe the
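
To make the annotation scheme Bowen describes more concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy 64x64 image, the choice of 10 points, and the use of a ground-truth mask to stand in for the human annotator are all hypothetical. It also reads predictions at the nearest pixel for simplicity. The idea it shows is that a standard binary cross-entropy mask loss can stay unchanged and simply be evaluated at the labeled points instead of at every pixel.

```python
import numpy as np


def sample_points_in_box(box, num_points, rng):
    """Draw integer pixel coordinates uniformly at random inside a box.

    box is (x0, y0, x1, y1), with x1 and y1 exclusive (an assumed convention).
    """
    x0, y0, x1, y1 = box
    xs = rng.integers(x0, x1, size=num_points)
    ys = rng.integers(y0, y1, size=num_points)
    return np.stack([xs, ys], axis=1)


def simulate_annotator(points, gt_mask):
    """Stand-in for the human annotator: 1 if the point is on the object, else 0."""
    return gt_mask[points[:, 1], points[:, 0]].astype(np.float64)


def point_bce_loss(pred_probs, points, labels, eps=1e-7):
    """Binary cross-entropy evaluated only at the annotated points."""
    p = np.clip(pred_probs[points[:, 1], points[:, 0]], eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))


rng = np.random.default_rng(0)

# Toy data (assumed for illustration): a square object and a box around it.
gt_mask = np.zeros((64, 64), dtype=bool)
gt_mask[20:40, 24:44] = True
box = (16, 12, 52, 48)

# Sample a few points in the box and collect one binary label per point.
points = sample_points_in_box(box, num_points=10, rng=rng)
labels = simulate_annotator(points, gt_mask)

# Two hypothetical mask predictions: one mostly right, one mostly wrong.
good_pred = np.where(gt_mask, 0.9, 0.1)
bad_pred = np.where(gt_mask, 0.1, 0.9)

loss_good = point_bce_loss(good_pred, points, labels)
loss_bad = point_bce_loss(bad_pred, points, labels)
```

In this sketch the loss on the mostly right prediction is lower than on the mostly wrong one even though only 10 pixels carry labels, which is the property that lets point labels substitute for a full mask in the loss.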

Computer Vision News - July 2022