Computer Vision News

ConvNets using Julia and the Flux machine learning framework

The main idea of a convolutional network for computer vision and recognition is that a ConvNet is composed of layers. Every layer follows the same simple protocol: it transforms an input 3D volume into an output 3D volume using a differentiable function, which may or may not have parameters.

[Figure: The input volume in red, with size 32x32x3, and an example of how neurons are arranged in the first layer of the network. Image from the Stanford CS231n course.]

Tutorial with Flux

This demo tutorial creates a Convolutional Neural Network (ConvNet) to classify the MNIST dataset. The architecture consists of three feature detection stages (Conv -> ReLU -> MaxPool) followed by a final dense layer that classifies the MNIST handwritten digits. After 20 epochs of training, the model reaches about 99% accuracy, which is impressive. The tutorial shows not only how to create a model but also how to train it, save it (to the file mnist_conv.bson) and exit early if needed.

The following packages are needed:

    using Flux, Flux.Data.MNIST, Statistics
    using Flux: onehotbatch, onecold, logitcrossentropy
    using Base.Iterators: partition
    using Printf, BSON
    using Parameters: @with_kw
    using CUDA
    CUDA.allowscalar(false)   # disallow slow scalar indexing of GPU arrays

The arguments are collected in a struct with default values for the learning rate, batch size, number of epochs, and the path for saving the file mnist_conv.bson:

    @with_kw mutable struct Args
        lr::Float64 = 3e-3        # learning rate
        epochs::Int = 20          # number of training epochs
        batch_size = 128          # number of images per mini-batch
        savepath::String = "./"   # directory where mnist_conv.bson is saved
    end
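To make the "3D volume in, 3D volume out" protocol concrete, the sketch below traces how the shape of a single MNIST image (28x28 pixels, one channel) would change through three Conv -> MaxPool stages. It uses plain Julia only, so it does not depend on Flux. The 3x3 kernels with padding 1, the 2x2 pooling windows and the filter counts (16, 32, 32) are assumptions chosen for illustration; the article does not state them. The helper names conv_out, pool_out and trace_shapes are hypothetical as well.

```julia
# Output side length of a convolution along one spatial dimension.
conv_out(n; k, pad = 0, stride = 1) = fld(n + 2 * pad - k, stride) + 1

# Output side length of max pooling with a square window whose stride equals its size.
pool_out(n; window) = fld(n - window, window) + 1

# Trace an input volume (height, width, channels) through the feature stages.
# `channels` holds the ASSUMED number of filters per stage (not given in the article).
function trace_shapes(input::NTuple{3,Int}; channels = (16, 32, 32),
                      k = 3, pad = 1, window = 2)
    h, w, c = input
    shapes = [input]
    for c_out in channels
        # Conv stage: with k = 3 and pad = 1 the spatial size is preserved.
        h = conv_out(h; k = k, pad = pad)
        w = conv_out(w; k = k, pad = pad)
        # ReLU is applied elementwise and does not change the shape.
        # MaxPool stage: halves the spatial size, rounding down.
        h = pool_out(h; window = window)
        w = pool_out(w; window = window)
        c = c_out
        push!(shapes, (h, w, c))
    end
    return shapes
end

shapes = trace_shapes((28, 28, 1))   # one MNIST image as a 3D volume
flat = prod(shapes[end])             # number of inputs to the final dense layer

for s in shapes
    println(s)
end
println("dense layer inputs: ", flat)
```

Under these assumptions the volume shrinks from 28x28x1 to 14x14x16, then 7x7x32, then 3x3x32, so the final dense layer would map 288 features to the 10 digit classes.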

Computer Vision News - June 2021