Computer Vision News - November 2018
Results:

Parameter importance plots for three hyperparameters. The top row shows results after 400 sec of training and the bottom row shows results after 3600 sec (1 hour). The left column evaluates different learning rates, the middle column evaluates different numbers of residual blocks, and the right column plots the number of residual blocks against the CutOut length (a kind of regularizer that, as the name suggests, “cuts out” different portions of the image). Importance indicates the fraction of the variance that the individual choice(s) explain. The dashed gray line indicates the value of the best configuration found.

Correlations between the different training time limits (budgets). The correlation in error rate between adjacent budgets is very high. In the bottom left-hand graph you can see the high correlation between 1200 and 3600 sec of training, only a 3-fold difference. In the top right-hand graph, on the other hand, you can see the significant lack of correlation in error rate between 400 sec and 10,800 sec (3 hours) of training, a 27-fold difference. Hence, there is no correlation in error rate between the shortest and longest training times.

To summarize, the correlation in error rate decreases as the difference in length of training time increases, making the short training time uninformative about the best configurations for the longest training time.
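To make the budget comparison concrete, here is a minimal sketch of how a correlation between budgets could be computed. It assumes a Spearman rank correlation (the text above only says "correlation"), and the error rates for the five configurations are made up purely for illustration; only the four budgets (400, 1200, 3600 and 10,800 sec) come from the text. The names `average_ranks`, `spearman` and `errors_by_budget` are hypothetical.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Illustrative (made-up) error rates of the same five configurations,
# keyed by training budget in seconds. These are NOT the article's results.
errors_by_budget = {
    400: [0.30, 0.25, 0.40, 0.35, 0.28],
    1200: [0.20, 0.18, 0.26, 0.22, 0.17],
    3600: [0.12, 0.13, 0.14, 0.11, 0.10],
    10800: [0.055, 0.070, 0.065, 0.060, 0.050],
}

# Rank correlation for every pair of budgets.
correlations = {
    (short, long): spearman(errors_by_budget[short], errors_by_budget[long])
    for short, long in combinations(sorted(errors_by_budget), 2)
}

for (short, long), rho in correlations.items():
    print(f"{short:>5} s vs {long:>5} s ({long // short}-fold): rho = {rho:.2f}")
```

With these made-up numbers, the adjacent budget pairs correlate more strongly than the 400 sec and 10,800 sec pair, which mirrors the qualitative pattern described above. With real runs, the lists would hold the measured error rate of each configuration at each budget.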