Computer Vision News - May 2018
The basic idea at the heart of adversarial attack methods is to use this exact same training mechanism, but this time, rather than updating the weights used by the network, we update the data. Yes, the image itself (you read that correctly)!

We will see this in Code 2 below. The first difference from Code 1 is that, rather than training a network from scratch, we load the parameters of an existing trained network. The dataset we shall use is a single image, which we initialize to random numbers. The rest of the difference comes in the last 3 lines. First, we change the loss we shall be using, setting it to the loss corresponding to the label of the image we are trying to portray (falsify). Let’s say we want to mimic a cat image. In each iteration the code returns the loss for the cat label selected for falsification (more on this in the next paragraph). Then the same backpropagation process is followed, only this time, rather than using the gradient of the weights, we take the gradient of the data itself.

Note that backpropagation can calculate both gradients: one with respect to the weights and one with respect to the data. In the usual DNN training process we are interested in the weights gradient. In this case, because we want to purposefully create examples that will fool the network, we take the data gradient and update the data itself rather than the weights (in the last line).

But what does this change in loss represent? There are several possibilities and we shall demonstrate two of them. For this demonstration we’ll use the MNIST handwritten digit database, which consists of 28x28 pixel 2D grayscale images of handwritten digits, from 0 to 9, like the ones below: As our example, we’ll take the simplest network representing the digit “1” on the right.
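To make the remark about the two gradients concrete, here is a short worked case for a single linear layer. The symbols are assumptions introduced only for illustration and do not appear in the article: $z = Wx + b$ is the layer output, $L$ the loss, $\delta = \partial L / \partial z$ the gradient arriving from the layers above, and $\eta$ the learning rate.

```latex
% One backward pass through z = Wx + b yields both gradients from the same delta:
\frac{\partial L}{\partial W} = \delta \, x^{\top},
\qquad
\frac{\partial L}{\partial x} = W^{\top} \delta .

% Usual training: update the weights, keep the data fixed.
W \leftarrow W - \eta \, \frac{\partial L}{\partial W}

% Adversarial image generation: keep the weights fixed, update the data.
x \leftarrow x - \eta \, \frac{\partial L}{\partial x}
```

Both quantities are built from the same $\delta$, which is why switching from training the network to modifying the image changes only which gradient is kept and which variable is updated.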
For creating a mimic image, we’ll use a small, simple network which can be

Code 2: Pseudo code of training neural network

    network.load_weights()
    dataset = random((784, 1))
    data_ = dataset.sample_data_batch()
    while True:
        Loss, Wh, h = forward(data_)
        Loss' = Adversarial(Loss)
        dData_ = network.backward(Wh, Loss')
        data_ += learning_rate * dData_

Adversarial Attacks on Deep Learning
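The loop in Code 2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the "network" is a single linear layer with softmax, its weights are fixed random numbers standing in for `network.load_weights()` (no trained model is loaded), the loss is the cross-entropy for the chosen target label, and all function names are hypothetical. The sketch subtracts the loss gradient explicitly and clips pixels to [0, 1]; the sign convention of `network.backward` in Code 2 is not stated in the source.

```python
import math
import random

N_PIXELS, N_CLASSES = 784, 10  # 28x28 MNIST-sized input, digits 0-9


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(W, b, x):
    """Class probabilities of a one-layer linear classifier for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def data_gradient(W, probs, target):
    """Gradient of the cross-entropy loss for `target` w.r.t. the input pixels."""
    # dL/dlogit_k = p_k - 1[k == target]; dL/dx = W^T (dL/dlogits)
    dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [sum(W[k][i] * dlogits[k] for k in range(len(W)))
            for i in range(len(W[0]))]


def make_adversarial(W, b, target, steps=100, learning_rate=0.5, seed=1):
    """Start from random pixels and update the data, never the weights."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(len(W[0]))]
    for _ in range(steps):
        probs = forward(W, b, x)
        grad = data_gradient(W, probs, target)
        # Step against the loss gradient and keep pixels in the valid range.
        x = [min(1.0, max(0.0, xi - learning_rate * g)) for xi, g in zip(x, grad)]
    return x


# Assumption: fixed random weights stand in for a loaded, trained network.
weight_rng = random.Random(0)
W = [[weight_rng.gauss(0.0, 0.05) for _ in range(N_PIXELS)]
     for _ in range(N_CLASSES)]
b = [0.0] * N_CLASSES

TARGET = 1  # the label we want the network to report
adv = make_adversarial(W, b, TARGET)
final_probs = forward(W, b, adv)
print("predicted class:", max(range(N_CLASSES), key=lambda k: final_probs[k]))
print("probability of target:", round(final_probs[TARGET], 3))
```

The weights `W` and `b` are never modified inside the loop; only `x` changes, which is the one structural difference from an ordinary training loop.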