ICCV Daily 2023 - Wednesday

This problem becomes particularly significant when deploying computer vision models in real-world scenarios where unseen situations may arise. Before shipping these models to the field, it is important to test that they also perform well on these corner cases. Jan highlighted two critical motivations for this: safety and fairness. "In terms of safety, if we have an autonomous system, it must behave safely on corner cases," he asserts. "In terms of fairness, if we have demographic minority groups underrepresented in training and test data, we want the system to perform well on those groups."

Figure: PromptAttack identifies the subgroup "rear views of small orange minivans in front of snowy forest" as a systematic error of a VGG16, which misclassifies 25% of the corresponding samples as snowplows (not as minivans). A ConvNeXt-B classifies the same samples with 99% accuracy.

What sets this research apart is its innovative approach to identifying systematic errors by introducing the concept of an operational design domain, which encompasses all the scenarios where the system should perform well. Using a text-to-image model like Stable Diffusion, it synthesizes images within this domain and rigorously tests systems against them, which has not been done before.

However, relying on text-to-image models presents a challenge because they occasionally produce images that do not align with the intended text prompts. Ensuring the faithfulness of these generated images is vital, since they are used to validate downstream systems and it would not be feasible to screen thousands of images manually. Jan had to perform moderate prompt engineering and assign specific classifiers to address this.
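To make the described pipeline concrete, the following is a minimal sketch, not the authors' implementation: it synthesizes images for one subgroup prompt inside the operational design domain with Stable Diffusion, drops generations that a zero-shot CLIP check deems unfaithful to the prompt (standing in for the faithfulness filtering mentioned above), and measures the error rate of a pretrained VGG16 on the remaining samples. The model checkpoints, prompt wording, CLIP threshold, and ImageNet class index are illustrative assumptions.

```python
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor
from torchvision.models import vgg16, VGG16_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Generate candidate images for one subgroup of the operational design domain.
sd = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
prompt = "a photo of the rear view of a small orange minivan in front of a snowy forest"
images = sd(prompt, num_images_per_prompt=4).images  # list of PIL images

# 2) Faithfulness check: keep only images whose CLIP image-text similarity to the
#    prompt exceeds an assumed, hand-tuned threshold (replaces manual screening).
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
inputs = proc(text=[prompt], images=images, return_tensors="pt", padding=True).to(device)
with torch.no_grad():
    sims = clip(**inputs).logits_per_image.squeeze(-1)
faithful = [img for img, s in zip(images, sims) if s.item() > 25.0]  # assumed threshold

# 3) Evaluate the classifier under test on the faithful samples.
weights = VGG16_Weights.IMAGENET1K_V1
model = vgg16(weights=weights).eval().to(device)
preprocess = weights.transforms()
MINIVAN_IDX = 656  # ImageNet-1k class "minivan" (assumed target label for this subgroup)

errors = 0
for img in faithful:
    with torch.no_grad():
        pred = model(preprocess(img).unsqueeze(0).to(device)).argmax(dim=1).item()
    errors += int(pred != MINIVAN_IDX)

if faithful:
    print(f"subgroup error rate: {errors / len(faithful):.2%} on {len(faithful)} samples")
```

In practice, one would sweep many such subgroup prompts over the operational design domain and flag the subgroups with the highest error rates as candidate systematic errors.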
