Computer Vision News - July 2019
The fMRI input is a sample $x = (x_1, x_2, \ldots, x_d)$, where $x_i$ is a scalar value specifying the signal amplitude of the i'th voxel in the brain. Using the measurements of d voxels, the regression function is represented as:

$$y(x) = \sum_{i=1}^{d} w_i x_i + w_0$$

where $w_i$ are learnable weights and $w_0$ is the bias term. Given an input image, a forward pass through the DNN decoder (described in the previous section) produces the visual feature vector with entries $t_1, \ldots, t_L$, where $t_l$ is the l'th entry of this vector. Then, each $t_l$ is fitted using the regression function above. The objective of the regression is to maximize the likelihood of $t_l$ given the voxel amplitudes X, the weights W, and a fixed parameter $\beta$. This likelihood function has the form:

$$P(t_l \mid X, W, \beta) = \prod_{n=1}^{N} \left(\frac{\beta}{2\pi}\right)^{0.5} \exp\left\{ -\frac{1}{2}\,\beta\,\bigl(t_{ln} - y(x_n)\bigr)^2 \right\}$$

where N is the number of samples, $x_n$ is the n'th fMRI sample, and $t_{ln}$ is the n'th sample of the $t_l$ entry. To obtain a sparse estimator, the authors used Bayesian parameter estimation and adopted the automatic relevance determination prior.

Image reconstruction from DNN layer

Remember that, at test time, we have neither the input image nor the visual feature vector. What we do have is the feature vector generated from the brain signal, which is supposed to be a good approximation of the visual feature vector (that was generated from the image). To reconstruct the image, the authors optimize in image space, searching for the image whose DNN features are closest to the feature vector decoded from the fMRI signal. They formulated the optimization problem as:

$$v^* = \arg\min_{v} \left\{ \frac{1}{2} \sum_{l=1}^{L} \bigl(\Phi_l(v) - y_l\bigr)^2 \right\}$$

where v is a vector whose elements are the pixel values of an image (of size 224x224x3), $v^*$ is the reconstructed image, $\Phi_l(v)$ is the l'th visual feature vector entry computed from v, and $y_l$ is the corresponding translated signal generated from the fMRI signal. The goal of this optimization is to find the image that best fits the brain signal.

DGN constraint

In order to improve the 'naturalness' of the reconstructed images, the authors use a generative adversarial network (GAN). Using a pre-trained deep generator network (DGN), denoted by G(z), they modified the optimization objective so that the candidate image is the generator output G(z) instead of a free pixel vector v.
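To make the regression step concrete, here is a minimal sketch of the linear regression function and the Gaussian log-likelihood it is fitted under. It is an illustration only, not the authors' code: the names `predict` and `log_likelihood`, the toy data, and the reading of the fixed parameter as a noise precision called `beta` are all assumptions, and the automatic relevance determination prior is deliberately left out.

```python
import math


def predict(x, w, w0):
    """Linear regression y(x) = sum_i w_i * x_i + w_0 for one fMRI sample x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def log_likelihood(X, t, w, w0, beta):
    """Log of prod_n (beta / 2pi)^0.5 * exp(-0.5 * beta * (t_n - y(x_n))^2).

    X is a list of N samples (each a list of d voxel amplitudes) and
    t holds the N target values of one feature entry t_l.
    beta is assumed here to act as a fixed noise precision.
    """
    n = len(X)
    sq_err = sum((tn - predict(xn, w, w0)) ** 2 for xn, tn in zip(X, t))
    return 0.5 * n * math.log(beta / (2.0 * math.pi)) - 0.5 * beta * sq_err


# Hypothetical toy data: 3 samples, 2 voxels, generated with w = (2, -1), w0 = 0.5.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
t = [2.5, -0.5, 1.5]

exact = log_likelihood(X, t, [2.0, -1.0], 0.5, beta=1.0)  # generating weights
worse = log_likelihood(X, t, [1.0, -1.0], 0.5, beta=1.0)  # perturbed weights
print(exact > worse)
```

With beta held fixed, the first term of the log-likelihood does not depend on the weights, so maximizing the likelihood over the weights amounts to minimizing the squared error. This is why the generating weights score higher than the perturbed ones.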
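The pixel-space reconstruction described earlier, which searches for the image whose features are closest to the features decoded from fMRI, can be sketched on a toy problem. Everything in this sketch is an assumption made for illustration: the DNN feature map is replaced by a small fixed linear map `A`, the "image" has three pixels, and plain gradient descent stands in for whatever optimizer the authors used, which this passage does not specify.

```python
def features(A, v):
    """Toy stand-in for the DNN feature map Phi: a fixed linear map A applied to v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def loss(A, v, y):
    """Objective 0.5 * sum_l (Phi_l(v) - y_l)^2."""
    return 0.5 * sum((f - yl) ** 2 for f, yl in zip(features(A, v), y))


def reconstruct(A, y, steps=500, lr=0.1):
    """Gradient descent over the pixel vector v, starting from an all-zero image."""
    v = [0.0] * len(A[0])
    for _ in range(steps):
        resid = [f - yl for f, yl in zip(features(A, v), y)]
        # For a linear Phi, the gradient of the loss with respect to v is A^T @ resid.
        grad = [sum(A[l][j] * resid[l] for l in range(len(A)))
                for j in range(len(v))]
        v = [vj - lr * gj for vj, gj in zip(v, grad)]
    return v


# Hypothetical toy setup: 3 "pixels", 3 feature entries.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0]]
true_v = [0.2, 0.7, 0.4]
y = features(A, true_v)   # plays the role of the features decoded from fMRI

v_star = reconstruct(A, y)
print(loss(A, v_star, y) < loss(A, [0.0, 0.0, 0.0], y))
```

Under the same assumptions, the generator-constrained variant would run the same loop over a latent vector z and evaluate `features(A, G(z))`, with the gradient passed back through the generator. Here `G` stands for a pre-trained generator that this sketch does not include.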