Computer Vision News - January 2022

Resolution-robust Large Mask Inpainting

The proposed method is called large mask inpainting (LaMa) and is based on the following:

• A new inpainting network architecture that uses fast Fourier convolutions, which have an image-wide receptive field;
• A high receptive field perceptual loss;
• Large training masks, which unlock the potential of the first two components.

A good architecture should have units with as wide a receptive field as possible, as early as possible in the pipeline. ResNet and other conventional fully convolutional models suffer from slow growth of the effective receptive field, which might be insufficient, especially in the first layers of the network. Fast Fourier convolution (FFC) is a recently proposed operator that makes it possible to use global context in early layers. FFCs have been shown to suit the capture of periodic structures, which are common in human-made environments.

The inpainting problem is inherently ambiguous, with many plausible fillings for the same missing areas. To design the components of the proposed loss, the following ideas were taken into account.

The image shows the transfer of inpainting models to higher resolutions. The LaMa models were trained on 256x256 crops from 512x512 images, and MADF was trained directly on 512x512 images. As the resolution increases, the models with regular convolutions produce critical artifacts, while the FFC-based models generate finer details and are more structurally consistent.
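The receptive-field argument above can be illustrated with a small toy example. The sketch below is not the LaMa architecture or an actual FFC layer; it is a 1D illustration, in plain Python, of why a pointwise operation in the Fourier domain sees the whole signal in a single layer while a 3-tap convolution sees only its immediate neighbours. The function names (`local_conv3`, `spectral_layer`, `affected_outputs`), the signal length and the spectral weights are all assumptions chosen for illustration.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def local_conv3(x, taps=(0.25, 0.5, 0.25)):
    """3-tap circular convolution: each output depends on 3 adjacent inputs."""
    n = len(x)
    return [taps[0] * x[(i - 1) % n] + taps[1] * x[i] + taps[2] * x[(i + 1) % n]
            for i in range(n)]


def spectral_layer(x):
    """Toy spectral layer: transform, scale each frequency, transform back.

    The weights are hypothetical (fixed, not learned). They are symmetric in
    frequency, so a real input gives a real output.
    """
    n = len(x)
    weights = [1.0 / (1 + min(k, n - k)) for k in range(n)]
    mixed = [xk * wk for xk, wk in zip(dft(x), weights)]
    return [value.real for value in idft(mixed)]


def affected_outputs(layer, n, pos, tol=1e-9):
    """Indices of outputs that change when a single input sample is perturbed."""
    impulse = [0.0] * n
    impulse[pos] = 1.0
    baseline = layer([0.0] * n)
    response = layer(impulse)
    return [i for i in range(n) if abs(response[i] - baseline[i]) > tol]


local_reach = affected_outputs(local_conv3, 16, 8)
global_reach = affected_outputs(spectral_layer, 16, 8)
print(local_reach)        # [7, 8, 9]
print(len(global_reach))  # 16
```

Perturbing one sample changes only three outputs of the local convolution, but every output of the spectral layer. Stacking many local layers eventually widens the reach, which is the slow growth of the effective receptive field described above; the Fourier-domain operation gets signal-wide context in one step.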
