ICCV Daily 2021 - Friday
Han says one of the motivations for this paper is to unify the architectures of NLP and computer vision, and he and the team hope Swin Transformer’s strong performance on various vision problems can drive this belief deeper in the community and encourage modeling convergence between the two. The next step is whether vision can have very big models, like GPT-3 in NLP, and whether NLP and computer vision signals can be better joined due to converged modeling.

“In previous years, NLP and computer vision progressed in parallel, but now because Transformers are used in both NLP and computer vision, they have the potential to be really joined together,” he tells us. “In previous years, computer vision tasks handled, for example, 100 categories, but now because NLP and computer vision can be better joined together, we can deal with almost all concepts for visual signals. I think this could be a big change and we want to study in this direction.”

Han’s work at Microsoft Research Asia involves research on how to build vision for artificial general intelligence (AGI).