Computer Vision News - November 2018
Research: Towards Automated Deep Learning
by Assaf Spanier

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search. We are indebted to the authors (Arber Zela, Aaron Klein, Stefan Falkner and Frank Hutter) for allowing us to use their images to illustrate our review. Their article is here.

One of the key questions in the deep neural network field is how to find the optimal architecture among the seemingly infinite possibilities. In particular: what is the optimal number of layers for the network? What are the optimal filter sizes and numbers of filters? Neural Architecture Search (NAS) is the emerging field of research that attempts to answer these questions.

Traditionally, this search is conducted in two stages: first, the optimal architecture is found by training candidates for a small number of steps (or by using a partially pre-trained network); then, in a separate stage, the optimal hyperparameters for training the selected architecture are searched for. In this paper the authors demonstrate that a one-stage approach, which combines architecture selection and hyperparameter optimization, is preferable to the two-stage approach. Furthermore, they demonstrate that the common practice of training for only a small number of steps during the initial NAS stage and for a much larger number of steps in the second stage is inefficient, because performance after the short initial training correlates poorly with performance after the longer, realistic training.

Novelty: the paper's innovation and contributions to the field:
1. Combination of Bayesian optimization and Hyperband to perform efficient joint neural architecture and hyperparameter search (a simplified sketch of this kind of joint search follows this list).
2. Demonstration that the results of short training runs on an architecture are poorly correlated with the results that longer (realistic) training will produce, and demonstration of how the combined one-stage approach deals with this challenge.
3. Demonstration that, given an "uncompromising" limit of just 3 hours of training, the one-stage combined architecture and hyperparameter optimization achieves competitive results.
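To make the joint-search idea concrete, here is a minimal sketch of a Hyperband-style successive-halving loop over a single search space that mixes architectural choices and training hyperparameters. It is an illustration under our own assumptions, not the authors' implementation: random sampling stands in for the Bayesian model that BOHB would use, a toy surrogate stands in for real network training, and all names (sample_config, evaluate, successive_halving) and parameter choices are hypothetical.

```python
import math
import random

def sample_config(rng):
    # Joint search space: architecture and training hyperparameters are
    # sampled together, as in the one-stage approach described above.
    return {
        "num_layers":    rng.choice([2, 3, 4, 5]),        # architecture
        "num_filters":   rng.choice([16, 32, 64, 128]),   # architecture
        "filter_size":   rng.choice([3, 5]),              # architecture
        "learning_rate": 10 ** rng.uniform(-4, -1),       # hyperparameter
        "batch_size":    rng.choice([32, 64, 128]),       # hyperparameter
    }

def evaluate(config, budget):
    """Stand-in for training the network described by `config` for
    `budget` epochs and returning its validation error. A toy surrogate
    replaces the real (expensive) training run; short budgets are noisier,
    mimicking the weak correlation between short and long training."""
    ideal = {"num_layers": 4, "num_filters": 64, "filter_size": 3}
    mismatch = sum(config[k] != v for k, v in ideal.items())
    lr_penalty = abs(math.log10(config["learning_rate"]) + 2.5)
    noise = random.gauss(0, 1.0 / budget)
    return mismatch + lr_penalty + noise

def successive_halving(n_configs=27, min_budget=1, eta=3, seed=0):
    """One Hyperband-style bracket: start many configurations on a small
    training budget, keep the best 1/eta at each rung, and retrain the
    survivors with a larger budget. BOHB would replace the random sampling
    with a Bayesian model fitted to the results observed so far."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = ranked[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]

if __name__ == "__main__":
    best = successive_halving()
    print("best joint configuration:", best)
```

The point of the sketch is the single loop: because architecture and hyperparameters live in one configuration, there is no separate second stage, and the growing budget lets the search spend long training time only on configurations that survived the cheap early rungs.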