Computer Vision News - March 2020
It can be seen that these two metrics are indeed informative. Convexity of the graph means that we lose accuracy faster than we lose weights; concavity, on the other hand, means the opposite, i.e. that we prune weights while the accuracy is preserved.

It is often desirable to prune entire filter outputs instead of pruning a filter at specific locations. To this end, the authors suggest analyzing the interaction of the l-th layer's filters with the (l+1)-th layer. Specifically, they rank the F_k filters in the l-th layer by averaging the kernel-wise distance across the input channels in the (l+1)-th layer. They observed that this criterion leads to good filter selection.

The second part of the method is to choose the per-layer pruning ratio. To set this ratio automatically, the authors suggest Bayesian optimization. The proposed objective of this optimization for each layer is of the form:

L(x) = E(x) + λ₁P(x) + λ₂S(x)

Above, the three loss terms are the classification error, the number of parameters and the network size, respectively; x denotes the pruning ratio to be determined. In the Bayesian optimization process, multiple pruning ratios are tested for each layer and the evaluation is done on a small subset of the data. This problem can be solved in a closed-form manner based on the previous observation (more details can be found in the paper).

Results

To test the method, the authors examined two different architectures, VGG and ResNet, on the task of image classification. The experiments were done on CIFAR-10 and ImageNet with four different quantizations: BinaryConnect (full-precision activations with binary weights), Binarized Network (binary activations and weights), XNOR Network (binary scaled activations and weights) and DoReFa Network (2-bit activations and weights). The tables below compare the top-1 accuracy for VGG-11 (upper table) and ResNet-14 (lower table) on CIFAR-10.
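The filter-ranking idea above can be sketched in a few lines. The sketch below is an illustration, not the paper's exact criterion: it assumes the "kernel-wise distance" is a mean pairwise L1 distance between the (l+1)-th layer kernels that consume a given input channel, and that a channel whose downstream kernels are nearly identical carries little discriminative information and is a good pruning candidate. The function name and distance choice are assumptions for illustration.

```python
import numpy as np

def rank_filters(next_layer_weights):
    """Rank filters of layer l by the averaged kernel-wise distance
    in layer l+1 (illustrative sketch, not the paper's exact metric).

    next_layer_weights: array of shape (F_out, F_in, k, k), the
    (l+1)-th layer weights; input channel i corresponds to the
    output of filter i in layer l.
    Returns filter indices sorted from most to least prunable.
    """
    F_out, F_in, kh, kw = next_layer_weights.shape
    scores = np.zeros(F_in)
    for i in range(F_in):
        # all (l+1)-layer kernels acting on input channel i, flattened
        kernels = next_layer_weights[:, i].reshape(F_out, -1)
        # mean pairwise L1 distance across those F_out kernels
        diffs = np.abs(kernels[:, None, :] - kernels[None, :, :]).sum(axis=-1)
        scores[i] = diffs.mean()
    # low score -> downstream kernels nearly identical -> prune first
    return np.argsort(scores)
```

Given such a ranking per layer, the Bayesian-optimized pruning ratio then decides how many of the lowest-ranked filters to remove.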
The two tables show the original accuracy, the final accuracy, the pruning ratio, the model memory size and the speed-up from using the method. It can be seen that, at a small cost in accuracy, the method can prune up to 53% of the network weights.