Computer Vision News - June 2018

1. Review the overall structure of the network, and make sure the output dimensions of each layer are appropriate to the input dimensions of the next layer (a model.summary() sketch of this check appears after this list).

2. Make sure the loss value is reasonable. There are a number of tests you can run for this with the network in its preliminary state and all weights initialized to very small values (see the loss-check sketch below):
   a. The loss value produced should be the expected chance-level loss, as determined by the loss function you have chosen to work with and the number of categories; for categorical cross-entropy over N categories, that is ln(N), about 2.3 for 10 categories.
   b. Verify that the network is functioning properly by checking that when you add a regularization term to the above loss function, the loss value goes up a little.

3. Run an overfit test; that is, run the network on a simple, clear-cut dataset. There are two types of datasets that might be relevant here: 1) a set of synthesized data that are very easy to tell apart, such as images that are all either clearly red or clearly blue; 2) the other extreme, a set of unclassifiable data where you expect no correlation between input and output, such as synthesized images where each pixel has a randomly determined value. If the network fails these very simple classification tests, there is a problem with its structure, or it suffers from a very flawed weight initialization (a sketch covering this and the next step follows the list).

4. Continue the overfit testing: train the network on a very small subset of your real-world data samples. If the network fails to learn on this small set then, again, there is likely a problem with the code or the network structure.

5. If the network managed to learn successfully on the small dataset, you can start adding more and more data, while at the same time monitoring the activations of the hidden units with histograms to see that they are “healthy” (see the TensorBoard sketch below).

6. Tune the learning-rate hyper-parameter: if the network isn’t learning (that is, its loss isn’t decreasing), you can lower the learning rate. You won’t reach convergence at these rates, but the important thing is to verify that accuracy is going up and loss is going down as training progresses. In the same context, you can lower the batch size to 1 (a single image), just to get very quick feedback and make sure the general trajectory is that the network is learning (see the learning-rate sketch below).

7. Set the regularization hyper-parameter: in general, you should start with a small regularization term. Then, as you tune the learning rate as described in (6) above, your loss term should be going down. If it doesn’t, you can slowly increase your regularization term and re-run the training process (see the regularizer sketch below).

8. Hyper-parameters in general should be tuned as follows: first, test a range of values (on a logarithmic scale) by running each for a limited number of epochs (see the search sketch below).
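A minimal sketch of the dimension check in step 1, assuming tf.keras (the architecture below is an arbitrary placeholder): model.summary() prints each layer's output shape, so a mismatch between consecutive layers shows up immediately.

    from tensorflow import keras

    # Placeholder architecture; substitute your own network.
    model = keras.Sequential([
        keras.layers.Conv2D(8, 3, activation="relu", input_shape=(28, 28, 1)),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax")])

    # Lists every layer's output shape and parameter count, making it easy
    # to confirm that each output feeds the next layer's expected input.
    model.summary()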
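To make step 2 concrete, here is a minimal loss-check sketch, assuming a softmax classifier trained with categorical cross-entropy (the input size, class count, and dummy data are placeholder assumptions): with near-zero weights the loss should land near the chance-level value ln(num_classes), and adding an L2 term should raise it slightly.

    import numpy as np
    from tensorflow import keras

    num_classes = 10                                 # assumption: 10-way classifier
    x = np.random.rand(32, 64).astype("float32")     # dummy batch, 64 features
    y = keras.utils.to_categorical(
            np.random.randint(num_classes, size=32), num_classes)

    def build(l2=None):
        reg = keras.regularizers.l2(l2) if l2 else None
        model = keras.Sequential([
            keras.layers.Dense(
                num_classes, activation="softmax", input_shape=(64,),
                kernel_initializer=keras.initializers.RandomNormal(stddev=1e-3),
                kernel_regularizer=reg)])
        model.compile(optimizer="sgd", loss="categorical_crossentropy")
        return model

    plain, regularized = build(), build(l2=10.0)
    regularized.set_weights(plain.get_weights())     # same weights, fair comparison
    print("chance-level loss: %.4f" % np.log(num_classes))            # 2.3026
    print("initial loss:      %.4f" % plain.evaluate(x, y, verbose=0))
    print("with L2 penalty:   %.4f" % regularized.evaluate(x, y, verbose=0))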
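Steps 3 and 4 can be scripted the same way. Below is a sketch of the “clearly red versus clearly blue” variant of step 3 (the synthetic data and tiny model are placeholders; for step 4, swap in a small slice of your real dataset). A healthy network should reach roughly 100% training accuracy here.

    import numpy as np
    from tensorflow import keras

    # Synthetic clear-cut data: 8x8 images that are either all red or all blue.
    n = 20
    labels = np.random.randint(2, size=n)
    images = np.zeros((n, 8, 8, 3), dtype="float32")
    images[labels == 0, :, :, 0] = 1.0   # class 0: pure red
    images[labels == 1, :, :, 2] = 1.0   # class 1: pure blue

    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(8, 8, 3)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(2, activation="softmax")])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    model.fit(images, labels, epochs=50, verbose=0)
    loss, acc = model.evaluate(images, labels, verbose=0)
    print("training accuracy: %.2f" % acc)   # far below 1.0 indicates a problem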
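For the histogram monitoring in step 5, Keras ships a TensorBoard callback that writes per-layer weight and activation histograms; here is a sketch with stand-in data and model (the log directory name is arbitrary):

    import numpy as np
    from tensorflow import keras

    # Stand-ins; substitute your own model and (growing) dataset.
    x = np.random.rand(256, 32).astype("float32")
    y = np.random.randint(2, size=256)
    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        keras.layers.Dense(2, activation="softmax")])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # histogram_freq=1 records histograms every epoch (requires validation data).
    tb = keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1)
    model.fit(x, y, epochs=5, validation_split=0.2, callbacks=[tb], verbose=0)
    # Inspect the Histograms tab with:  tensorboard --logdir ./logs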
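The quick-feedback loop of step 6 might look like the following learning-rate sketch (the toy data and threshold rule simply stand in for a learnable signal): a deliberately low learning rate and batch_size=1, checking only that the loss trends downward.

    import numpy as np
    from tensorflow import keras

    x = np.random.rand(64, 32).astype("float32")
    y = (x.sum(axis=1) > 16).astype("int64")    # a simple learnable rule

    model = keras.Sequential([
        keras.layers.Dense(16, activation="relu", input_shape=(32,)),
        keras.layers.Dense(2, activation="softmax")])

    # Very low learning rate: we do not expect convergence, only a trend.
    model.compile(optimizer=keras.optimizers.SGD(lr=1e-4),
                  loss="sparse_categorical_crossentropy")

    # batch_size=1 gives very fast, if noisy, feedback on the trajectory.
    history = model.fit(x, y, batch_size=1, epochs=3, verbose=0)
    print(history.history["loss"])              # should be (noisily) decreasing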
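In Keras, the regularization term of step 7 is usually expressed as a per-layer weight penalty; a minimal regularizer sketch (the 1e-5 starting strength is an illustrative assumption, not a value from the article):

    from tensorflow import keras

    # Start with a weak L2 penalty; per step 7, increase it gradually and
    # re-run training only if the loss refuses to drop after tuning the
    # learning rate.
    dense = keras.layers.Dense(64, activation="relu",
                               kernel_regularizer=keras.regularizers.l2(1e-5))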
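Finally, a sketch of the logarithmic sweep in step 8 (the data, the value range, and the build_model helper are all placeholder assumptions): sample candidate values on a log scale and train each for only a couple of epochs to compare trends cheaply.

    import numpy as np
    from tensorflow import keras

    x = np.random.rand(128, 32).astype("float32")
    y = np.random.randint(2, size=128)

    def build_model(lr):
        model = keras.Sequential([
            keras.layers.Dense(16, activation="relu", input_shape=(32,)),
            keras.layers.Dense(2, activation="softmax")])
        model.compile(optimizer=keras.optimizers.SGD(lr=lr),
                      loss="sparse_categorical_crossentropy")
        return model

    # Coarse search: learning rates spaced logarithmically, each run briefly.
    for lr in 10.0 ** np.random.uniform(-6, -1, size=5):
        history = build_model(lr).fit(x, y, epochs=2, verbose=0)
        print("lr=%.2e  final loss=%.4f" % (lr, history.history["loss"][-1]))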
