ICCV Daily 2021 - Wednesday

That’s the reason we couldn’t use the common transformer architecture for our task and had to propose a new design.” The team propose a transformer-based network called Texformer, a play on words, as it is a transformer for texture reconstruction. The network differs from previous transformer-based architectures and works really well for this task.

Another problem Xiangyu and his colleagues faced was that the transformer structure relies on a mechanism called attention, which is memory-heavy and computationally expensive. That is fine for a task like natural language processing, where the input is a 1D signal, but in this task the input image is a 2D signal, which raises the stakes considerably. To solve this, he proposes a low-rank attention layer that reduces the memory and computational cost.

The novel contributions to the task of 3D human reconstruction don’t stop there. “We also combine the output of RGB values and texture flow,” he adds. “In previous works, people use either RGB or optical flow as the output. In this work, we combine those two outputs and get the best of both worlds. We also have new innovations in terms of the loss function that are beneficial for this task. We have a part-style loss and a face-structure loss

“I believe this is the first work to use transformers for 3D human texture reconstruction…”
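To give a feel for why attention is so costly on a 2D input, and how a low-rank variant can cut that cost, here is a minimal NumPy sketch. It is not the Texformer layer itself: the article does not describe how the low-rank attention is built, so this example swaps in a generic approach that shrinks the keys and values with a fixed projection before attention is computed. The function names, the average-pooling projection, and the sizes (a 16×16 feature map, 8 channels, rank 32) are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(q, k, v):
    """Standard scaled dot-product attention; the score matrix is (n, n)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def low_rank_attention(q, k, v, proj):
    """Attention against r projected keys/values; the score matrix is (n, r).

    `proj` has shape (r, n). This generic projection is an assumption for
    illustration, not the formulation used in the paper.
    """
    k_small = proj @ k  # (r, d)
    v_small = proj @ v  # (r, d)
    scores = q @ k_small.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v_small


def average_pool_projection(n, r):
    """Hypothetical projection: each row averages a block of n // r tokens."""
    if n % r != 0:
        raise ValueError("n must be divisible by r")
    block = n // r
    proj = np.zeros((r, n))
    for i in range(r):
        proj[i, i * block:(i + 1) * block] = 1.0 / block
    return proj


# A 16x16 feature map flattened into n = 256 tokens with d = 8 channels.
rng = np.random.default_rng(0)
height, width, d, r = 16, 16, 8, 32
n = height * width
q = rng.standard_normal((n, d))
k = rng.standard_normal((n, d))
v = rng.standard_normal((n, d))

out = low_rank_attention(q, k, v, average_pool_projection(n, r))

# Number of attention scores each variant has to store.
full_scores = n * n      # 256 * 256
low_rank_scores = n * r  # 256 * 32
```

With these illustrative sizes, the full score matrix holds 65,536 entries while the low-rank one holds 8,192. The gap widens quickly as the image resolution grows, because the number of tokens scales with height times width and the full score matrix scales with the square of that.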