ICCV Daily 2021 - Tuesday

Xi Yin is a Research Scientist at Facebook AI Research. She graduated from Michigan State University.

Xi, can you tell us about your work at Facebook?

I’m generally interested in computer vision and machine learning. My research background is in two different areas. One is the face. I worked on facial recognition for my thesis. After graduation, I spent two years at Microsoft and one year at Facebook. In recent years, I started to expand my research area to multimodal understanding of vision and language. I find that very exciting. That’s an area that I’d like to focus on more in the future.

What about this new field attracted you?

Vision and language are two very fundamental human capabilities. Humans use the interaction between vision and language in almost everything we do in our daily life. Just imagine when kids start to learn. They read storybooks with pictures and words. They learn to interact with language and how they visualize the world. For a machine to really have intelligence, it is crucial to understand this multimodal interaction. From a research perspective, vision and language data are easy to acquire. A lot of the images available online have associated alt-text. That can easily scale to a lot of data, compared to supervised vision training, where annotated data is very challenging to obtain. In recent years, we have seen that vision-language data can help with vision tasks. That is very promising research.

What is the most challenging part of this field for you?

There are many challenges. For example, when we think of images