Computer Vision News - July 2024

Workshop Presentation

Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero-shot Medical Image Segmentation

Sidra Aleem is a final-year PhD researcher at ML-Labs, Dublin City University, focusing on domain adaptation for biomedical imaging using foundation models. Following her oral presentation on Monday at the CVPR 2024 Workshop on Domain Adaptation, Explainability, and Fairness in AI for Medical Image Analysis (DEFAI-MIA), she speaks to us about her paper on test-time adaptation with foundation models.

Medical image segmentation, critical for clinicians in diagnosis and prognosis, is the focus of Sidra's innovative work. She proposes a novel cascade of two foundation models, Meta's Segment Anything Model (SAM) and OpenAI's CLIP, leveraging their complementary capabilities to enhance zero-shot organ segmentation accuracy in medical imaging.

"These foundation models have completely revolutionized the world around us," she tells us. "While they've been predominant in natural imaging, their effective application has yet to be explored in medical image segmentation."

Her approach uses SAM to generate region proposals covering the different structures in a medical image. She then employs CLIP, a multimodal model designed to process text and images, to identify which of those proposals corresponds to the organ to be segmented. While CLIP has been extensively tested on natural images, Sidra has successfully adapted it to the distinct challenges of medical imaging.

"As we know, one of the widely used applications of CLIP is image retrieval," she explains. "My objective was to utilize CLIP to get the region of interest from all these region proposals. For the text part, in medical imaging, we need domain knowledge. To mitigate that issue, I generated text prompts using ChatGPT." Regarding lung segmentation, Sidra used ChatGPT to generate 20 attributes
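
For readers who want to see the idea in code, below is a minimal sketch of the SAM-then-CLIP retrieval step described above, not Sidra's actual implementation. It assumes Meta's segment-anything package and OpenAI's clip package are installed; the checkpoint path, the organ prompt list, and the simple mean-similarity scoring are illustrative placeholders, and the paper's full pipeline (including how the selected region is turned into a final segmentation prompt for SAM) goes beyond this snippet.

import torch
import clip
from PIL import Image
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

device = "cuda" if torch.cuda.is_available() else "cpu"

# SAM proposes class-agnostic region masks (checkpoint path is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth").to(device)
mask_generator = SamAutomaticMaskGenerator(sam)

# CLIP scores each proposed region against text prompts for the target organ.
clip_model, preprocess = clip.load("ViT-B/32", device=device)

def segment_organ(image_rgb, prompts):
    """image_rgb: HxWx3 uint8 numpy array; prompts: e.g. a list of
    ChatGPT-generated attribute descriptions of the organ (illustrative)."""
    proposals = mask_generator.generate(image_rgb)
    if not proposals:
        return None

    # Crop each region proposal so CLIP can treat selection as retrieval.
    crops = []
    for p in proposals:
        x, y, w, h = (int(v) for v in p["bbox"])  # bbox is in XYWH format
        crop = Image.fromarray(image_rgb[y:y + h, x:x + w])
        crops.append(preprocess(crop))
    crops = torch.stack(crops).to(device)

    text = clip.tokenize(prompts).to(device)
    with torch.no_grad():
        img_feat = clip_model.encode_image(crops)
        txt_feat = clip_model.encode_text(text)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        # Average similarity over all prompts; pick the best-matching region.
        sims = (img_feat @ txt_feat.T).mean(dim=-1)

    best = sims.argmax().item()
    return proposals[best]["segmentation"]  # boolean mask of the chosen region

Averaging similarity over many attribute prompts, rather than relying on a single organ name, is one plausible way to inject the domain knowledge Sidra obtains from ChatGPT without hand-crafting medical text prompts.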
