
The Method and the Experiment: To efficiently optimize in the joint space of architectures and hyperparameters, the authors chose BOHB, a recent combination of Bayesian Optimization (with its guarantees of convergence) and Hyperband. Like Hyperband, BOHB accelerates the optimization by evaluating configurations under different training-time limits, known as budgets, so the search can be run under different resource constraints. Instead of sampling new configurations at random, BOHB uses kernel density estimators to select promising candidates (a minimal code sketch of such a run is given at the end of this section).

The baseline networks for the experiment are PreAct ResNet-18 (He et al., 2016) and WideResNet-28-10 (Zagoruyko and Komodakis, 2016), and the dataset is CIFAR-10. For the optimization phase, CIFAR-10 was split into three subsets: 45k data points for training, 5k for validation and 10k for testing.

The authors evaluated the following hyperparameters:
- The parameters of each model are initialized as described by He et al. (2016) and trained using SGD with an initial learning rate of 0.1.
- Batch size was 32.
- L2 regularization was applied with a factor of 10^-4 for 3-branch networks and 5 * 10^-4 for 2-branch networks.
- Momentum (Nesterov's) was 0.9.
- MixUp (Zhang et al., 2018) was applied with its alpha parameter set to 0.2.
- CutOut (DeVries and Taylor, 2017) was applied with a mask length of 16.
- ShakeDrop (Yamada et al., 2018) was used with its alpha parameter set to 0.

All networks were trained on a single Nvidia GTX 1080Ti GPU. The models were trained for 3 hours, with the initial learning rate annealed by a cosine function with T0 = 720s and Tmult = 2 (Loshchilov and Hutter, 2017); a PyTorch sketch of these settings follows below.

So far we have been dealing with “classic” hyperparameters, the ones usually handled in the second stage of NAS once a specific architecture has been selected -- these are listed in the top part of the table below. The architecture parameters are found in the second part of the table and deal, as you might expect, with the number of blocks, the number of filters and the filter size.
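The BOHB procedure described above has a publicly available reference implementation, hpbandster, released by the BOHB authors; the article itself does not name a library, so the sketch below is only an illustration of how such a joint search could be set up. The search-space entries, the smaller budget value and the train_and_eval helper are placeholders, not the authors' actual configuration; only the 3h (10800s) maximum budget comes from the article.

```python
# Minimal BOHB sketch using hpbandster (the BOHB reference implementation).
# The search space and the training routine are illustrative placeholders.
import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH
import hpbandster.core.nameserver as hpns
from hpbandster.core.worker import Worker
from hpbandster.optimizers import BOHB


class CIFARWorker(Worker):
    def compute(self, config, budget, **kwargs):
        # `budget` is the training-time limit BOHB assigns to this configuration.
        # train_and_eval is a hypothetical helper that trains a model with the
        # given hyperparameters for `budget` seconds and returns validation error.
        val_error = train_and_eval(config, time_budget=budget)
        return {'loss': val_error, 'info': {}}


def get_configspace():
    # Placeholder search space mixing a training hyperparameter and an
    # architecture parameter, to mirror the joint search described above.
    cs = CS.ConfigurationSpace()
    cs.add_hyperparameter(CSH.UniformFloatHyperparameter(
        'learning_rate', lower=1e-3, upper=1e0, log=True))
    cs.add_hyperparameter(CSH.UniformIntegerHyperparameter(
        'num_filters', lower=8, upper=64, log=True))
    return cs


if __name__ == '__main__':
    ns = hpns.NameServer(run_id='bohb_example', host='127.0.0.1', port=None)
    ns.start()
    worker = CIFARWorker(nameserver='127.0.0.1', run_id='bohb_example')
    worker.run(background=True)

    # Budgets in seconds: 10800 s matches the article's 3 h maximum; the
    # minimum budget here is an assumption.
    bohb = BOHB(configspace=get_configspace(), run_id='bohb_example',
                nameserver='127.0.0.1', min_budget=400, max_budget=10800)
    result = bohb.run(n_iterations=10)
    bohb.shutdown(shutdown_workers=True)
    ns.shutdown()
```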
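As a rough PyTorch sketch of the fixed training settings listed above, assuming a tiny placeholder model and random stand-in data: note that the article gives the restart period T0 in seconds (720s), whereas PyTorch's CosineAnnealingWarmRestarts counts epochs, so the schedule below only approximates the shape of the annealing.

```python
# Sketch of the training hyperparameters described above (PyTorch).
# The model and the random data are placeholders for the actual networks and CIFAR-10.
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder "network" and random stand-in for CIFAR-10 (3x32x32 images, 10 classes).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
data = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
train_loader = DataLoader(data, batch_size=32, shuffle=True)   # batch size 32

optimizer = optim.SGD(
    model.parameters(),
    lr=0.1,              # initial learning rate 0.1
    momentum=0.9,        # Nesterov momentum 0.9
    nesterov=True,
    weight_decay=5e-4,   # L2 factor for 2-branch networks (1e-4 for 3-branch)
)

# Cosine annealing with warm restarts (Loshchilov & Hutter, 2017),
# with the restart period doubling after each restart (T_mult = 2).
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=1, T_mult=2)
criterion = nn.CrossEntropyLoss()

for epoch in range(8):   # illustrative; the article trains for a 3 h wall-clock budget
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```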
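MixUp and CutOut are simple enough to show directly. The functions below are generic re-implementations with the article's values (alpha = 0.2, mask length 16) as defaults; they are not the authors' code.

```python
# Simplified sketches of the MixUp and CutOut augmentations named above.
import numpy as np
import torch


def mixup(x, y, alpha=0.2):
    """Blend a batch with a shuffled copy of itself (Zhang et al., 2018).

    Returns the mixed inputs, both label tensors and the mixing weight lambda;
    the loss is then lam * loss(pred, y_a) + (1 - lam) * loss(pred, y_b).
    """
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    return lam * x + (1.0 - lam) * x[perm], y, y[perm], lam


def cutout(x, length=16):
    """Zero out a random length x length square in each image of the batch
    (DeVries and Taylor, 2017)."""
    _, _, h, w = x.shape
    for img in x:  # each img is a (C, H, W) view, so the batch is modified in place
        cy, cx = np.random.randint(h), np.random.randint(w)
        y1, y2 = max(cy - length // 2, 0), min(cy + length // 2, h)
        x1, x2 = max(cx - length // 2, 0), min(cx + length // 2, w)
        img[:, y1:y2, x1:x2] = 0.0
    return x
```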
