WACV 2025 Daily – Saturday
Winter Conference on Applications of Computer Vision
Cagla’s Picks of the Day

Cagla Deniz Bahadir is a final-year PhD candidate in Biomedical Engineering at Cornell University. Her research focuses on the intersection of machine learning and medical imaging, with an emphasis on enhancing the reliability and robustness of medical vision-language models.

Cagla also has an accepted paper here at WACV. Don’t miss her oral and poster presentations today: LLM-generated Rewrite and Context Modulation for Enhanced Vision Language Models in Digital Pathology. Her paper introduces a novel approach to improving vision-language models (VLMs) for digital pathology, addressing key challenges such as limited large-scale datasets and the sensitivity of zero-shot classification tasks to prompt variations. To overcome these limitations, the study leverages large language models (LLMs) to generate enriched language rewrites for a public pathology dataset, demonstrating that this augmentation enhances performance in tasks like zero-shot classification and text-to-image and image-to-text retrieval. Additionally, the paper presents a context modulation layer that refines image embeddings to better align with paired text, further improving model accuracy. As part of this work, the study constructs the largest publicly available pathology caption dataset to date, comprising 8 million captions. These advancements demonstrate the value of carefully leveraged synthetic data in building more robust and reliable multimodal models for medical imaging.

Cagla’s picks for today, Saturday:

Orals
2.3.4 From Visual Explanations to Counterfactual Explanations with Latent Diffusion
3.2.5 Uncertainty-based Data-wise Label Smoothing for Calibrating Multiple Instance Learning in Histopathology Image Classification

Posters
2.13 MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training
2.27 CusConcept: Customized Visual Concept Decomposition with Diffusion Models
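One way to picture the prompt-sensitivity problem Cagla’s paper tackles: in zero-shot classification, each class embedding depends entirely on how the class prompt is worded. A common mitigation, which LLM-generated rewrites make practical, is to ensemble the text embeddings of several rewrites per class. The sketch below is purely illustrative: the embeddings are random stand-ins, not outputs of any real pathology VLM, and the function name is invented.

```python
import numpy as np

def zero_shot_probs(img_emb, class_prompt_embs):
    """Zero-shot classification with prompt-rewrite ensembling.

    For each class, the embeddings of several prompt rewrites are
    averaged into one class prototype, reducing sensitivity to any
    single prompt's wording, then softmaxed against the image.
    """
    logits = []
    for embs in class_prompt_embs:  # one (n_rewrites, dim) array per class
        proto = embs.mean(axis=0)
        proto = proto / np.linalg.norm(proto)
        logits.append(img_emb @ proto)
    logits = np.array(logits)
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
img = rng.normal(size=8)
img /= np.linalg.norm(img)
classes = [rng.normal(size=(5, 8)) for _ in range(3)]  # 3 classes, 5 rewrites each
p = zero_shot_probs(img, classes)
print(p.sum())  # probabilities sum to 1
```

The averaging step is the whole point: a single badly worded prompt moves the prototype far less once it is one of five rewrites.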
Keynote today – Hannah Kerner

What is Hannah looking at? Probably at the hidden world of satellite remote sensing data and the treasures it holds for computer vision researchers. Come to her keynote at 9:00 to find out!
Oral Presentation
GeoGuide: Geometric guidance of diffusion models

Przemysław Spurek and Jacek Tabor are professors at the Jagiellonian University in Krakow. Alongside first author Mateusz Poleski (who prepared this publication as part of his master’s degree thesis), they are co-authors of a fascinating paper accepted as both a poster and an oral this year, and they are here to speak to us about their work.

Diffusion models are a highly effective tool for image generation, often producing photorealistic results when conditioned on specific characteristics. However, guiding a pre-trained diffusion model to generate realistic images from a class not included in the original conditioning process remains a challenging problem. Previous solutions, such as the ADM-G guidance approach, although successful, provide minimal guidance during the final stage of the denoising process, leading to lower-quality outputs. To address this, the team proposes GeoGuide, a novel tool designed to guide diffusion models effectively. “It’s not obvious how to generate high-quality images, since directly sampling elements can produce results that are not exactly in the manifold of the data,” Przemysław explains. Jacek elaborates: “When you look at the details, for example, you can have a dog with five legs or two heads! Diffusion models sometimes switch off from realistic situations to something strange. We wanted to understand why that was happening and how to force the model to produce results in the true data manifold, which satisfies the prompt!” The team proposes a guidance strategy using a neural network to
condition the sampling procedure to force the diffusion model to produce elements from a specific class. One of the key innovations in their work is the normalization of classifier gradients. Controlling these gradients is crucial – if they are too weak or too strong, they can distort the generated image. Their solution balances this influence, optimizing the model’s output. Jacek highlights two significant challenges they faced: conducting extensive, time-consuming experiments to find the correct model, and understanding and building the theoretical foundation. “We observed that if you train the diffusion model and want to force it to present elements of a given class, then often the resulting image was non-realistic and elements of the class were too big or too artificial,” he tells us. “This suggested that maybe something was wrong with the underlying theory. We started investigating this problem and observed that you can switch from the probabilistic to the geometric perspective.” ADM-G is based on a probabilistic approach, but the team found that adopting a geometric perspective led to significantly better results. By analyzing the diffusion model in terms of the distance of its trajectory from the data manifold, they developed a model that is easy to implement and outperforms the probabilistic approach. With diffusion models being one of the hottest topics in modern computer vision research, Jacek believes they were chosen for a coveted oral presentation this year because their work delves deeper into understanding what is happening behind the model.
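A toy sketch of the gradient-normalization idea: instead of adding the raw classifier gradient to the denoising update (its magnitude can vary wildly along the trajectory), only its direction is kept and rescaled by a fixed guidance strength. This is a hedged illustration of the principle, not the actual GeoGuide update rule; `guided_step` and its arguments are invented for the example.

```python
import numpy as np

def guided_step(eps_pred, class_grad, guidance_scale=0.5):
    """Steer a predicted noise vector with a *normalized* classifier gradient.

    Normalizing keeps the guidance term's magnitude constant, so a very
    weak or very strong raw gradient cannot distort the generated image.
    (Toy illustration of the balancing idea, not the paper's exact rule.)
    """
    norm = np.linalg.norm(class_grad)
    if norm > 0:
        class_grad = class_grad / norm  # keep only the direction
    return eps_pred - guidance_scale * class_grad

# a tiny raw gradient and a huge one now produce the same-sized nudge
eps = np.zeros(4)
weak = guided_step(eps, np.array([1e-3, 0.0, 0.0, 0.0]))
strong = guided_step(eps, np.array([1e3, 0.0, 0.0, 0.0]))
print(np.allclose(weak, strong))  # True
```

Without the normalization, the second call would push the sample a million times harder than the first, which is exactly the kind of imbalance the article describes.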
Comparing their approach to working on a car, he explains that whilst you could slightly modify the car’s design to make it go faster, real improvement comes from understanding what is happening inside and working on the engine. “We’re trying to get to the heart of the problem and understand what’s happening there,” he clarifies. “After understanding the problem, we want to reinvent the engine from the beginning without modifying the details of the model.” GeoGuide lays the groundwork for further advancements in diffusion model guidance, and the team intends to continue this research direction with more papers to come. “We think our solution can be used in many other diffusion model applications,” Przemysław says. “We use the classifier for guidance, but another important task is to guide the diffusion model by another diffusion model or some version of another diffusion model. This is called classifier-free guidance. We believe that our solution can also be applied to this task.” We look forward to seeing where the future takes this work, which is backed by the strong academic tradition of the team’s institution, Jagiellonian University, the oldest university in Poland. “It was established in 1364,” Jacek reveals. “I think it’s the ninth oldest in Europe and is very nice. If you want to come to Krakow, we invite you!” I visited a few months ago and it is definitely worth a visit! To learn more about GeoGuide and meet the team, visit Poster Session 1 from 11:15 to 13:00 and Oral Session 2.1: 3D Computer Vision II from 14:00 to 15:00.
Oral Presentation
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Gonzalo Martin Garcia is a Master’s student and a research assistant at the RWTH Aachen Computer Vision Group, under the supervision of Karim Abou Zeid and Christian Schmidt. Gonzalo is also the first author of an exceptional paper that was accepted as a poster and oral at WACV 2025.

This work is about fine-tuning diffusion models to perform geometry tasks such as depth and surface normal estimation. Depth estimation and surface normal estimation are very important for robotic navigation, 3D reconstruction, image and video editing, and many other computer vision tasks. “The novelty of this work,” Gonzalo reveals, “is about repurposing Stable Diffusion as an efficient deterministic geometry estimation model.” But what was the biggest challenge in doing this? One of the challenges was working with large models. Stable Diffusion is a big model in general, and inference with diffusion models carries some computational overhead as well: you have to do multi-step inference, so evaluating these models and running certain ablations or tests also takes a lot of time. “If you evaluate a model for 50 steps, evaluating on even a single dataset may take a lot of time,” Gonzalo shares. “You cannot really quickly iterate over your work. And I think that’s what’s nice about our work, where now you can use diffusion models in an end-to-end manner and generate predictions in a single neural pass very quickly!” In general, we would like to use very powerful models to estimate 3D scenes and obtain new knowledge and new predictions without, for example, the use of sensors and the like. “In 3D, I’m particularly interested in 3D reconstruction of scenes,” Gonzalo
says. “In this case, it’s monocular, which is a pretty ill-posed problem, making it very difficult. And in that sense, you do require a strong prior from a model. And thus, our work focuses on utilizing a very strong prior such as Stable Diffusion for our predictions.” People coming to Gonzalo’s poster and oral presentations will discover that you can use diffusion models for these geometry tasks in an efficient manner, with single- to few-step inference. They will also find out that you can directly fine-tune these models on a desired downstream task – in this case, depth and surface normal estimation. But for any geometric task that someone is interested in, they can now train it end-to-end and get very fast and reliable inference. There are not many Master’s students who are first authors on papers accepted at major conferences, so congratulations to Gonzalo. What makes him the proudest? “I think having gone through the whole research process, reading all the papers, gathering knowledge, trying everything out and finding kind of the light, finding an idea and keep pursuing it and then getting a breakthrough, writing the paper,” is Gonzalo’s extensive reply. “And the whole process was very, very interesting and very much fun!”
And since he read all these papers on this subject, which work made the biggest impression on him? “When I began research, I was really impressed by Marigold, which is the original model behind our work. And that’s why I really kind of pursued it and really tried to understand it,” Gonzalo explains. “I really like the idea of fine-tuning these diffusion models to perform a non-generative task – more geometric or classical computer vision related tasks. For me, it’s definitely Marigold!” A major lesson from this paper is that end-to-end fine-tuned diffusion models can be relatively simple. You can fine-tune them on just one very standard loss function and get very good results. Regarding future work, there is a lot of potential to introduce new loss functions and new tasks and really push this line of research.
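The “one standard loss function” point can be made concrete with a scale-and-shift-invariant regression loss, a common choice in monocular depth training (popularized by MiDaS-style pipelines). This is a sketch of the kind of loss meant, not necessarily the exact one used in Gonzalo’s paper.

```python
import numpy as np

def scale_shift_invariant_l1(pred, target):
    """L1 loss after least-squares alignment of scale and shift.

    Monocular depth predictions are often only defined up to an affine
    transform, so we first fit s, b minimizing ||s*pred + b - target||
    and then measure the remaining error.
    """
    p = np.ravel(pred).astype(float)
    t = np.ravel(target).astype(float)
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    return float(np.mean(np.abs(s * p + b - t)))

# a prediction that is correct up to scale and shift incurs ~zero loss
depth = np.array([1.0, 2.0, 3.0, 4.0])
loss = scale_shift_invariant_l1(2.0 * depth + 5.0, depth)
print(loss)  # ~0.0
```

Training end-to-end then just means backpropagating one such loss through a single deterministic forward pass, which is what makes the fine-tuned model fast at inference.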
To conclude, Gonzalo shares with us a curious moment in this work that will make our readers smile. “At one point, I was training Marigold for different tasks, like infilling or segmentation. For segmentation, I was looking at the single-step and multi-step predictions. And I was noticing how the single-step predictions are generally blurrier. For example, the shelf was the color lilac. And with more steps, I was seeing the color lilac kind of slowly derailing into a different color. For me, that was the realization that there is an accumulation of error. It’s something weird that’s happening here. I find it really funny because I didn’t even know: is it my bug? Is it what is causing this drift? And then I went back to all the other models. It turns out multi-step inference for these types of models does produce certain artifacts which do negatively impact the performance, and with more iterations, they become more pronounced.” To find out how fine-tuning image-conditional diffusion models is easier than you think, visit Poster Session 1 today (Saturday) from 11:15 to 13:00 and Oral Session 4.1: 3D Computer Vision IV tomorrow (Sunday) from 10:15 to 11:15.
Poster Presentation
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation

Hugo Porta is a PhD student at EPFL in Lausanne. Hugo is also the first author of a paper that was accepted as a poster at WACV 2025.

The authors are interested in the prototype learning concept for interpretability in the case of semantic segmentation. They saw a model for interpretable semantic image segmentation called ProtoSeg. When they read it, they realized that one of its main issues was not integrating scale into the interpretability process. How can we bring scale into prototype learning? Does it help performance, and does it help interpretability? “The whole paper is about this,” reveals Hugo, “and building an architecture around this principle. I brought several ideas on the topic of prototypes for semantic segmentation to my PI, and discussing with him we realized that this would be the most interesting direction to explore!” Of course, we need interpretability because we need to understand the decision process of black-box models. This is all the more true if we want to use AI for critical tasks where human lives are at stake, like medical tasks. In this specific case, Hugo is using it for wildfires. If we want to use this, we need to be able to understand the decision process of the model, without being worried about errors that could impact the users. The evaluation process of the method was the major challenge in this work. Hugo thinks the method performs somewhat better in terms of IoU accuracy, but it was hard to convey to the reviewers why it would be better in terms of interpretability. “I think this was the most difficult part,” claims Hugo, “because there
are no clear metrics or quantitative evaluations of prototype methods, except for a few papers that introduced this for classification. We were adding scale as a dimension, but is this really useful, and how does it help?” The solution came via one of those papers that performed a quantitative analysis for classification. In the second part of the paper, the authors were able to extend this to semantic segmentation. Why should we visit Hugo’s poster today? “I think you will discover the main interest of this method, like why scale and what it brings,” is Hugo’s reply. “I hope that this will encourage you to come, if you’re interested in prototype learning or in interpretability at large, to see that there are still new directions to discover for specific tasks, outside of the realm of classic tasks like classification, that still need to be
studied. And for semantic segmentation, we are only starting, I think – there are only about two papers on prototype learning for semantic segmentation.” Can we imagine a specific use case for this work? Hugo thinks of one use case showing how this would work, which he added to the appendix of the paper at the request of one of the reviewers during the WACV review process. It shows a basic example of why scale helps detect confounding factors in the process. For the classification of a cow instead of a sheep, sometimes at a lower scale we look at the texture, and the hair pattern on a given cow looks like a sheep. “But if you look at a broader scale,” concludes Hugo, “then you will see more focus on the shape of the different parts. And through this process, we can understand through our scale analysis why it was classified as a sheep and not as a cow: because there was too strong a focus on the lower-scale information and not the higher scale. This is a concrete example that we put in the appendix of the paper to help people understand in a real use case how this would work. And obviously this could be broadened to medical and other applications. As I mentioned quickly, we work on satellite data, so scale is very important!” Hugo works in the lab of Earth Observation and Computational Environmental Science at EPFL.

WACV Daily Editor: Ralph Anzarouth. Publisher & Copyright: Computer Vision News. All rights reserved. Unauthorized reproduction is strictly forbidden. Our editorial choices are fully independent from IEEE, WACV and the conference organizers.
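Hugo’s cow-versus-sheep example can be caricatured in a few lines: a pixel whose fine-scale (texture) feature matches the sheep prototype, but whose coarse-scale (shape) feature matches the cow prototype, is classified differently depending on how the scales are weighted. All prototypes, features, and weights below are invented for illustration and are not from the paper.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# hypothetical class prototypes in a 2-D embedding space
protos = {"sheep": np.array([1.0, 0.0]), "cow": np.array([0.0, 1.0])}

# hypothetical features of the same pixel at two scales:
# fine scale sees woolly-looking texture (sheep-like),
# coarse scale sees overall body shape (cow-like)
feats = {"fine": np.array([0.9, 0.1]), "coarse": np.array([0.1, 0.9])}

def classify(scale_weights):
    """Score each class by a weighted sum of prototype similarities across scales."""
    scores = {cls: sum(w * cosine(feats[s], p) for s, w in scale_weights.items())
              for cls, p in protos.items()}
    return max(scores, key=scores.get)

print(classify({"fine": 1.0, "coarse": 0.0}))  # sheep: texture alone misleads
print(classify({"fine": 0.3, "coarse": 0.7}))  # cow: coarse shape corrects it
```

Inspecting the per-scale similarity terms is exactly the kind of scale analysis that explains *why* a pixel was misclassified when only the fine scale is trusted.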
His research focuses on interpretability and wildfire prediction. This is one theoretical paper within a broader thesis, in which he wants to see whether an interpretable model can match what we know from science about wildfire patterns, focusing on the boreal ecosystem in Canada and other countries. Hugo is building a huge benchmark for this wildfire prediction task in Canada, then going back to more theoretical work on interpretability to see how these methods can be applied to non-RGB image data. To ask questions about Hugo’s work on interpretable semantic segmentation, visit Poster Session 1 today (Saturday) from 11:15 to 13:00.
Women in Computer Vision: Devi Parikh

Devi Parikh is a co-founder at Yutori. Until last year, she was an Associate Professor at the Georgia Institute of Technology and a Senior Director for Generative AI at Meta.

Devi, let’s start with the most difficult question. You knew that this question would arrive. How could you abandon a fantastic tenure at Georgia Tech and a brilliant position at Meta to enter into this new adventure?

Yeah, yeah. I’m happy to share that. I also talk about it a little bit in my ICLR 2024 keynote. If anyone’s interested in a fuller response, they can also check that out. I think it’s a few things. One is that I value new experiences. I was having a great time at Georgia Tech. I was having a great time at Meta. I was in the GenAI organization, where a lot of exciting work was happening at a fast pace. It was high-priority work. The leadership was fully invested. There were resources, awesome people. All of that was great. But I was starting to miss driving things myself. I was in a role where I had multiple teams reporting to me. They had their amazing team leads. It didn’t seem like they needed much from me on a day-to-day basis. I was more making sure the right people were talking to each other. It was more reacting to whatever was coming my way and
making sure everything’s going smoothly, as opposed to feeling like I am driving something where what I left off the night before, I’m picking it up again the next morning and pushing on it. And yeah, I gave it a few months. I talked to my various mentors. Everyone was very supportive of whatever changes I wanted to make. But I felt like I might as well go do something entirely new that I haven’t done before.

Tell me first what is the most exciting thing about Yutori. And then, you can explain what it is, why we need it, and why it will be successful.

We very recently came out of stealth, and we haven’t yet shared the mission and the vision and what we are going after and things of that sort. It was on the same day that OpenAI had released Operator, and we have internally been building a lot of the web agents technology. That had seemed like a good time to share with the world what we’ve been up to in this time.

What can we know about it?

We put out these web agent capabilities where we can have multiple agents that are out there on the web, navigating the web, taking actions to complete tasks on behalf of the user. And so the things that we focused on there include this multi-agent capability. You can have multiple of these agents in parallel doing the work that needs to happen, and setting it up in a way where we are surfacing the right level of information back to the user so that there isn’t an information overload. But at the same time, it’s not a black box: they feel like they know what’s going on and they know what progress is being made.

Are you guys the only ones doing it, or will there be plenty of others doing it?

There is quite a bit happening in the web agent space right now, both at larger companies and at startups. At startups, a lot of it tends to be enterprise-facing. We are imagining helping consumers on a day-to-day basis getting their tasks done. But it is a crowded space. It is early. We’ll have to see how things go...
You are certainly doing it because
there is something that is tempting you. So what is the most exciting part of this?

Yeah, I think there are a few different dimensions, and the answer would be different for each. On the technical front, I think what’s exciting is that we are at a stage where these web agents that are reliable, that can get things done for you, are within grasp, but it’s not already solved. It feels like that sweet spot where it makes sense to go after the problem. And if we go after it, we can make very real progress and have something useful out there in a short enough term. But it’s not solved yet. It’s not like you can just take an API, build an app around it, have it out there and be sufficiently reliable. So from the technical perspective, that’s what’s exciting. On the personal front, like I was saying, I value new experiences. And this is amazing. It’s been exciting to start from scratch. We are a small team. I enjoy working with these people. They’re very strong in what they do. And anything that we do right now is progress, because we’re starting from zero. We haven’t even hit the wall yet. One of my co-founders is Dhruv Batra, who is my spouse, whom I’ve worked with a lot in the past couple of decades. And Abhishek Das, who is a very good friend of both of us. Doing this together with people who you like spending time with and who you trust is another element of excitement as well.

This is quite a gamble for you. Why do you think that you guys will succeed?

I think if you look at the web agent ecosystem of all the efforts that are happening in this space, we are probably one of the few teams that have a background in having trained these models. And even in the team that we’ve put together, we have the multimodal post-training lead of Llama 3, who joined us from Meta, someone who’s worked on Gemini at Google, and someone who was at a startup called Minion that was also working on web agents.
Soon we have someone joining from Llama 4 who was leading multimodal post-training for Llama 4. We have this set of expertise that we’ve put together
that we think is very valuable for pushing the technology to get these web agents to a reliable place. I think that intersection of wanting there to be a product out there that people are using and having the AI muscle to push the technology to make it happen is not as common, as far as we can tell. But like I said, it is a crowded space, and there are a lot of smart people out there thinking about this problem. We’ll just have to see.

Did you shed a little tear when you quit Georgia Tech?

Not so much when I quit Georgia Tech, because I had already started spending more time at Meta. It was a little bit of a slow progression. But the one point that I do remember is when the first batch of my students were graduating. There were some new students coming in, but I wasn’t recruiting quite as actively. So as my first batch of students graduated, I just felt like, wait, this is just not my lab. My sense of what my lab is was tied to that first batch of students that came in. And yeah, I had gotten quite attached to them. In hindsight, I am now realizing that it was because that is who I thought of as my lab. And it was hard for me to picture how my lab could not have these people. And so that may be a more emotional slash meaningful moment.

How many students whom you supervised have graduated?

That’s an interesting question. I’ll have to go back and count. I actually haven’t counted. There are not that many across PhD students and Master’s students. It’s probably been dozens.

Tell me, what is the most surprising thing that you have learned from one of them?

I don’t know if this is surprising, but I guess it is, because it’s not something I expected. This was technically Dhruv’s student, but I was collaborating very closely with him. This is Abhishek Das, our third co-founder. Even when he was a student, like his first or second year in grad school, he had this sense of wanting to push the field forward, regardless of who made it go forward.
So even if he was working on a problem, for example, if I put
myself in his position: if I’m working on a problem and somebody else publishes a paper in that space, I get a little bit nervous, like, oh wait, I was going to be doing this. Now they have this other thing, and how are we going to compete, and will we outperform them or not? I get a little bit competitive in that sense.

It’s a bummer.

Yeah. And I think that is how most people react to a situation like that. But for him, it seemed like that was just not a thing. He was just genuinely glad that somebody else had pushed the field forward. And at first, when he said something like this, I thought, I don’t know, you can’t genuinely feel that. Maybe he’s saying it, but he doesn’t really mean it. But over the years, I got to learn that this is his perspective: he just thinks it’s good overall for the ecosystem for things to move forward. He wants to do well. He wants to win. He is competitive, but not in this sort of feeling of, oh, this other person did this thing before us, or things like that. And that, especially when he was a first- or second-year student, I thought was a very mature way of looking at it, which I hadn’t encountered before.

Did you shed a little tear when leaving Meta?

Yeah [laughs], I had spent a lot of time there. I was quite close to my teams and my peers and the rest of the organization. GenAI was a new organization that had come together very quickly, and we were pushing very aggressively on a lot of goals. There was just a lot happening and a lot of opportunities to connect with people and get to know them better. Yeah, I do miss them, both my peers and my teams.

This is something that all our community knows: the AI group at Meta is very impressive and we are doing fantastic… they are doing fantastic things! Devi, where were you born?

I was born in the U.S., but I grew up in India.

At what age did you come back from India?

I came back when I was 17, after high school.
Every month, two million Indian children are born, which means that probably almost a couple of million young people are entering the workforce every month. They all want to stand out, to create their own space in the work world. How can someone out of this cohort become Devi Parikh? Your advice could be very precious for them.

Yeah, I don’t know if I have anything useful to say there. What has worked for me has been pursuing things that I find interesting, that I feel excited about, because I think that is what makes it sustainable. And that is what allows me to put in a lot of time and effort and keep pushing for things. My general philosophy tends to be that I’ll have the best shot at succeeding if the thing that I’m doing is something I like. And so that’s what I’ve tended to optimize for. That’s not necessarily the most strategic career advice: there are things that I enjoy doing that aren’t necessarily very impactful in the traditional way of thinking about impact. And so that may not be the most strategic thing for a career. But yeah, that’s how I’ve operated, and it has worked out.

How do you define today what your work is about, Devi?

Yeah, yeah. It’s been a while since I’ve had to describe it. I’ve worked in AI for a couple of decades at this point, and my background was in computer vision. Over time, I
got more and more interested in finding ways for people to be able to interact with machines in natural ways and making that interaction deeper and richer. That is what moved me from computer vision to the intersection of vision and language, and a lot of the multimodal capabilities in these AI models, looking at how we can use AI for enhancing human creativity. That is what got me interested in AI for creativity – generative models for images and for video – which is what led to this intersection with the GenAI excitement. A lot of the work that we’re doing now feels like a continuation of that story.

Your word for the little Devi who arrives in the US at 17, a little word of positive advice.

I think back then, I wouldn’t have even considered the possibility that I could be in the positions that I’ve been in, that I could be doing the kind of work that I’ve done. That was not even within the realm of imagination! If at the time someone had asked me what the best-case scenario was, I probably wouldn’t have thought of this. So maybe just telling that to my younger self: to try and broaden the horizon of possibilities, that I have a very narrow view of the world and there’s much more beyond it that’s possible!

Read 160 FASCINATING interviews with Women in Computer Vision!
UKRAINE CORNER
Russian Invasion of Ukraine

Our sister conference CVPR condemns in the strongest possible terms the actions of the Russian Federation government in invading the sovereign state of Ukraine and engaging in war against the Ukrainian people. We express our solidarity and support for the people of Ukraine and for all those who have been adversely affected by this war.
Congrats, Doctor Iris!

Iris Vos defended her PhD on January 28th at Utrecht University. During her time at the Image Sciences Institute (UMC Utrecht), she worked on developing advanced computational methods to study brain blood vessels on a large scale. Under the supervision of Ynte Ruigrok, Birgitta Velthuis, Hugo Kuijf, and Jelmer Wolterink, her research focused on using artificial intelligence and image analysis to predict the development of intracranial aneurysms. After completing her PhD, she moved into industry and now works as a Data Scientist at Datacation. Congrats, Doctor Iris!

A brain aneurysm – also known as an intracranial aneurysm – is a weak spot in the wall of a brain blood vessel that starts to bulge out. Think of it like an inflating balloon. While most aneurysms do not cause problems and remain undiagnosed, some can grow over time, increasing the risk of rupture and leading to life-threatening bleeding. The problem is that we don’t yet have a reliable way to predict who is most at risk. As a result, current screening protocols are only moderately effective. If we could identify specific features in images that indicate higher risk, we could make screening more efficient. High-risk individuals could be monitored closely, while screening for those at low risk could be reduced or even stopped. Identifying these features, or imaging markers, requires analyzing large datasets of patient images. However, brain blood vessels have highly complex structures with significant anatomical variability: some vessels may be fused, underdeveloped, duplicated, or even missing entirely. Their sizes can also range widely, from less than 0.5 mm to over 3 mm in diameter. Some of these variations can influence blood flow dynamics and potentially contribute to aneurysm development.
Manual evaluation of blood vessels by radiologists is time-consuming and subjective, which is why we need (semi-)automated methods that can handle the complex anatomy and structural variability of blood vessels.
In her research, Iris developed user-friendly tools for standardized and reproducible blood vessel analysis (Figure 1). These tools enabled the study of over 3,000 brain images, leading to the identification of several potential risk factors associated with aneurysms. She also used graph neural networks, a subfield of AI, to automatically detect and label important blood vessels and bifurcations (branching points). Unlike traditional methods, graph neural networks can capture complex relationships between vascular structures. These techniques can be used to recognize subtle patterns in data that could help identify new imaging markers for aneurysm development. One of the biggest obstacles in this field is the detection of very small blood vessels, some of which are less than 1 mm in diameter. Standard image analysis techniques often struggle to accurately extract these vessels, leading to incomplete data. To address this issue, Iris and her colleague Diewertje Alblas developed a deep-learning-based method that uses path optimization techniques guided by artery orientation. This approach successfully extracts even the smallest blood vessels, including those under 1 mm.

Figure 1. Analysis of brain blood vessels. Upper: automated measurement of blood vessel diameter. Lower: automated measurement of bifurcation angle.
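For a flavor of what “path optimization” means in vessel tracking, here is a generic minimal-cost-path search (Dijkstra) over a 2-D cost map, where low cost marks likely vessel pixels. This is only a classical baseline sketch of that family of techniques; Iris’s actual method additionally learns artery orientations with deep learning to guide the path, which is not modeled here.

```python
import heapq

def shortest_path_cost(cost, start, goal):
    """Cheapest 4-connected path cost between two cells of a cost grid.

    Vessel trackers of this flavor build such a cost map (low inside
    vessels, high outside) and extract the centerline as the minimal path.
    """
    h, w = len(cost), len(cost[0])
    dist = {start: cost[start[0]][start[1]]}
    pq = [(dist[start], start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            return d
        if d > dist.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(pq, (nd, (nr, nc)))
    return float("inf")

# low cost along the 'vessel' (top row and right column), high elsewhere
grid = [[1, 1, 1],
        [9, 9, 1],
        [9, 9, 1]]
print(shortest_path_cost(grid, (0, 0), (2, 2)))  # 5: the path hugs the vessel
```

The weakness the article alludes to is visible here too: if a sub-millimeter vessel leaves only faint image evidence, its cost ridge is noisy, and the path drifts; a learned orientation prior is one way to keep it on track.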
Don’t miss the BEST OF WACV 2025 in Computer Vision News of March. Subscribe for free and get it in your mailbox! Click here
Workshops and Tutorials

Awesome Abby Stylianou (top) moderating a panel at CV4EO, the workshop on Computer Vision for Earth Observation Applications. The case for automated insulin delivery systems (bottom) brought by Ayan Banerjee at the Data Protection and Privacy in Biomedical, Healthcare and Medicine tutorial, run by the World Privacy Forum.
Satya Mallick (top) and Babak Taati (bottom) at the CV4Smalls Workshop – Computer Vision with Small Data: A Focus on Infants, Toddlers, and the Elderly. The workshop was co-organized by awesome Sarah Ostadabbas (top of the next page). Exceptional guest Paula Ramos (bottom of the next page) displayed machine learning wisdom for all.