Computer Vision News - April 2025

Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers

How did Alex work to close this gap? He lists three contributions. The first is a domain adaptation framework based on domain adversarial learning that works not only on image-level features (which is common, for example in object detection) but also on graph-level features. For this, the output of the transformer network is viewed as a kind of abstract graph representation, which is then aligned between the two domains using domain adversarial learning. The second contribution tackles the challenge of different image dimensionalities. The source domain consists of 2D images, namely satellite images, but extracting graphs from 3D medical scans requires bridging this gap in dimensionality, which is done by introducing a very simple projection function. “This projection function,” explains Alex, “works by just taking a 2D image, which is a satellite image, and center it inside of a 3D zero-initialized volume and then we are randomly rotating the whole volume and in that way we have some kind of synthetic 3D volume that is then automatically aligned with our first contribution, namely the domain adaptation framework.” The third contribution, which the authors call regularized edge sampling loss, tackles the challenge that edge distributions differ greatly between the two domains. For example, some
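The projection step that Alex describes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `random_rotation_matrix` and `project_2d_to_3d`, the cubic volume, the nearest-neighbour resampling and the way the random rotation is drawn are all assumptions made for illustration.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Draw a random 3D rotation matrix (hypothetical helper, QR-based)."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the QR factor
    if np.linalg.det(q) < 0:         # force a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q


def project_2d_to_3d(image, size, rng, rotation=None):
    """Center a 2D image in a zero-initialized cube, then rotate the cube.

    Assumption: nearest-neighbour resampling around the cube center.
    """
    h, w = image.shape
    volume = np.zeros((size, size, size), dtype=image.dtype)
    y0, x0 = (size - h) // 2, (size - w) // 2
    volume[size // 2, y0:y0 + h, x0:x0 + w] = image   # image sits in the middle slice

    rot = random_rotation_matrix(rng) if rotation is None else rotation
    center = (size - 1) / 2.0
    # Inverse mapping: for every output voxel, look up the source voxel it came from.
    grid = np.indices(volume.shape).reshape(3, -1) - center
    src = np.rint(rot.T @ grid + center).astype(int)
    inside = np.all((src >= 0) & (src < size), axis=0)

    rotated = np.zeros(volume.size, dtype=volume.dtype)
    rotated[inside] = volume[src[0, inside], src[1, inside], src[2, inside]]
    return rotated.reshape(volume.shape)


rng = np.random.default_rng(0)
satellite_patch = rng.random((16, 16)).astype(np.float32)  # stand-in for a 2D satellite image
volume = project_2d_to_3d(satellite_patch, size=32, rng=rng)
print(volume.shape)  # (32, 32, 32)
```

The sketch uses inverse mapping (looking up a source voxel for each output voxel) so that the rotated plane has no holes; a real pipeline would more likely use an interpolating resampler.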
