HSSLAB Inspur and Xidian University. The winning team’s solution combined the BLIP model with an answer grounding algorithm to predict answers, improving accuracy by 6 percentage points over the previous year’s winning submission.

The Answer Grounding for VQA Challenge, newly introduced this year, builds on the VQA Challenge by inviting teams to build algorithms that can additionally locate where in an image a VQA model ‘looks’ when answering a question about that image. Of the 16 participating teams, the team from ByteDance and Tianjin University secured first place, and the team from HSSLAB Inspur took second place in the challenge.

The ORBIT Few-Shot Object Recognition Challenge, also newly introduced this year, invited teams to build a teachable object recognizer using the ORBIT dataset. Unlike a generic object recognizer, a teachable object recognizer can be ‘taught’ by a user to recognize their specific personal objects from just a few short clips of those objects. The Challenge’s two winning teams, based at HUAWEI Canada and the Australian National University respectively, drew on the latest advances in few-shot learning. Their solutions improved recognition accuracy on users’ personal objects by 8-10 percentage points.

Winning teams from all three challenges were invited to present their approaches at the workshop. In addition, all teams will be awarded financial prizes sponsored by Microsoft.

The panel discussions swept more broadly across the current state of computer vision research and the broader assistive technology ecosystem. Perspectives were drawn from each group of experts - computer vision researchers, access tech advocates who are blind, and industry specialists - as well as from interdisciplinary discussions amongst all three.