ICCV Daily 2021 - Wednesday

texture reconstruction. In contrast to existing methods for this task, transformers are able to effectively exploit the global information of the input image. “It is important to use global information for the task of 3D human texture reconstruction because the input and output are not strictly aligned in the spatial space,” he tells us. “You need the global context for a good prediction about the 3D human texture. CNNs are by design very good at grasping local information, but not so good at grasping global information. That is the main reason we use transformers here and is the most important contribution of this work.”

However, Xiangyu Xu found that the common transformer architecture did not work so well for this task. This was confusing at first, until he worked out why. “In previous architectures, like natural language processing, the input and output are in the same space, but here, they are different,” Xiangyu explains. “The input is a query map defined by us, and the output is in the pixel space.