Computer Vision News - January 2019

Mask R-CNN training for the COCO human keypoint detection task: for this task, the randomly initialized network learns even more quickly than the pre-trained then fine-tuned network, without requiring additional training time. Keypoint detection is a task that is more sensitive to spatially localized predictions. This is evidence that ImageNet pre-training provides only weak training for spatially localized predictions and that, for such tasks, training from scratch can be fully equivalent (a minimal code sketch of the two initialization setups follows the summary below).

Summary of the paper's insights

● Training networks on their target tasks from scratch (random initialization) is viable, even with no changes to architecture.
● Training from scratch requires more iterations to converge; pre-trained networks converge much faster.
● In many different configurations and circumstances, training from scratch can achieve performance on par with that of pre-trained then fine-tuned networks, even when training on COCO with just 10k images.
● ImageNet pre-training does not necessarily help reduce overfitting, except in the case of very small datasets.
● ImageNet pre-training is less useful when the target task depends more on object localization than on classification.
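To make the comparison concrete, here is a minimal sketch of the two setups discussed above. The paper's own experiments used Detectron; this illustration instead assumes torchvision's Keypoint R-CNN implementation, and the flag names are torchvision's, not the authors' code:

```python
import torchvision

# Training from scratch: every weight is randomly initialized,
# including the ResNet-50 backbone (no ImageNet weights anywhere).
scratch_model = torchvision.models.detection.keypointrcnn_resnet50_fpn(
    pretrained=False,           # no COCO-trained detection weights
    pretrained_backbone=False,  # backbone also randomly initialized
    num_keypoints=17,           # the 17 COCO human keypoints
)

# Fine-tuning baseline: the backbone starts from ImageNet
# pre-trained weights and is then fine-tuned on the target task.
finetune_model = torchvision.models.detection.keypointrcnn_resnet50_fpn(
    pretrained=False,
    pretrained_backbone=True,   # ImageNet pre-trained ResNet-50 backbone
    num_keypoints=17,
)

# Per the paper's findings, the scratch model needs a longer training
# schedule to converge on most tasks, but on keypoint detection it can
# match (or catch up to) the fine-tuned model without extra training time.
```

In practice the two models would then be trained with identical data and hyperparameters, differing only in initialization, which is exactly the controlled comparison the paper makes.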
