Computer Vision News - December 2020

6 Reinforcement learning techniques can be used for a variety of tasks. Training deep neural network models to grasp objects is among these, and it requires both the ability to learn visual representations and to change the environment accordingly. Research The combination of a CycleGAN with a reinforcement learning technique is employed in order to inform the GAN about which components of the image are relevant and avoid hiding information in the adapted image that might be fundamental. This is done by enforcing RL consistency losses on all the inputs and generated images. More on Q-learning This is a reinforcement learning technique which learns a Q-function, using the below equations, to define the best policy to apply in order to maximize the total expected future reward. The definitions are reported below. Environment of states Actions Rewards Next states Estimate of next state’s value Discount factor Distance metric Q-function (updated to minimise TD loss) Temporal difference (TD) loss Policy s = input image a = candidate action r s’ V(s’) d ( , ) ( ( , ), + ( ′ )) ( | ) = ( , )