Computer Vision News - December 2020
RL-CycleGAN

The RL-CycleGAN jointly trains the RL model with the CycleGAN. Q-learning estimates a Q-value for each of the six images in the CycleGAN model: x, G(x), F(G(x)), y, F(y) and G(F(y)). These Q-values are denoted $(q_x, q_x', q_x'')$ (the triple for the first, originally simulated scene) and $(q_y, q_y', q_y'')$ (the triple for the second, originally real scene). Similar Q-values are encouraged within each triple by the RL-scene consistency loss:

$$\mathcal{L}_{RL\text{-}scene}(G, F) = d(q_x, q_x') + d(q_x, q_x'') + d(q_x', q_x'') + d(q_y, q_y') + d(q_y, q_y'') + d(q_y', q_y'')$$

where d is a distance between two Q-values.

Two different Q-networks $(Q_{sim}, Q_{real})$ are trained using the standard TD-loss: one for simulation-like images (x, F(G(x)), F(y)) and one for real-like images (G(x), y, G(F(y))). To compute the Q-function loss, the generator or pair of generators is first applied to both the current image x and the next image x′, and the resulting images are then used in the TD-loss.
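The RL-scene consistency loss can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the distance d is a mean squared error, and that the generators and Q-networks are plain callables on image batches. The action input to the Q-networks is omitted for brevity. All names (`mse`, `triple_distance`, `rl_scene_consistency_loss`, `q_sim`, `q_real`) are hypothetical.

```python
import numpy as np


def mse(a, b):
    """Mean squared difference between two batches of Q-values (assumed choice of d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))


def triple_distance(q, q1, q2, d=mse):
    """Sum of the three pairwise distances within one triple of Q-values."""
    return d(q, q1) + d(q, q2) + d(q1, q2)


def rl_scene_consistency_loss(x, y, G, F, q_sim, q_real, d=mse):
    """RL-scene consistency loss for a simulated batch x and a real batch y.

    G maps sim -> real and F maps real -> sim. q_sim scores simulation-like
    images (x, F(G(x)), F(y)); q_real scores real-like images (G(x), y, G(F(y))).
    """
    gx = G(x)      # simulated scene rendered as real-like
    fgx = F(gx)    # cycled back to simulation-like
    fy = F(y)      # real scene rendered as simulation-like
    gfy = G(fy)    # cycled back to real-like

    # Triple for the originally simulated scene.
    loss_x = triple_distance(q_sim(x), q_real(gx), q_sim(fgx), d)
    # Triple for the originally real scene.
    loss_y = triple_distance(q_real(y), q_sim(fy), q_real(gfy), d)
    return loss_x + loss_y


# Toy stand-ins, chosen only so that the two domains agree on every Q-value.
rng = np.random.default_rng(0)
x = rng.random((4, 8, 8))
y = rng.random((4, 8, 8))
G = lambda img: img + 1.0                          # hypothetical sim -> real generator
F = lambda img: img - 1.0                          # hypothetical real -> sim generator
q_sim = lambda img: img.mean(axis=(1, 2))          # hypothetical Q-network for sim-like images
q_real = lambda img: img.mean(axis=(1, 2)) - 1.0   # hypothetical Q-network for real-like images

loss = rl_scene_consistency_loss(x, y, G, F, q_sim, q_real)
print(loss)
```

With these stand-ins all three Q-values in each triple coincide, so the loss is zero up to floating-point rounding. Once the two Q-functions disagree about the same scene, the loss grows.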