CVPR Daily - Friday

Highlight Presentation

The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding

Lorenzo Bianchi is a PhD student at the University of Pisa and CNR-ISTI, supervised by Giuseppe Amato, Fabio Carrara, Nicola Messina, and Fabrizio Falchi. He is working on multimodal deep learning, focusing on image-text interaction in deep learning models. Before his poster session this morning, Lorenzo talks to us about his highlight paper on open-vocabulary object detection.

Unlike traditional object detection, where the objects are predefined during training, open-vocabulary models can recognize objects described by natural language sentences defined at inference time. “I find these open-vocabulary models very interesting because they offer flexibility,” Lorenzo begins. “They allow end users who may only be interested in recognizing a specific set of objects to use these models without training.”

However, despite their promise, current open-vocabulary models struggle with recognizing fine-grained properties of objects, such as colors or materials. Lorenzo tells us he was surprised by this, considering recent advancements in generative AI. “It’s quite outstanding that we struggle with discerning fine-grained properties in object detection, which we might assume is an easier task than image generation,” he points out. “We searched the scientific literature on
