Computer Vision News - July 2022
BEST OF CVPR 2022

Damien Robert

Point clouds and images are the typical data produced by 3D acquisition systems. This work proposes a simple set-up to merge the two without requiring meshing or depth sensors, but simply raw point clouds, images, and the corresponding camera poses.

When you have multiple images of a 3D scene, each object may be seen in a variable number of images. Some images may be close-up, some far away, while some may be occluded or see things from slanted angles. This situation is called the multi-view problem.

“Acquisition systems often produce 3D point clouds and images, and we want to be able to combine the information from each modality because they are complementary,” he explains. “The point clouds describe the scene’s geometry, and the images describe other things like the texture and the context. We want to use information from both. We’re not the first ones to do it, but we’re the first to learn multi-view aggregation for large scenes.”

The team wanted to work at a large scale, with scans of cities and buildings rather than a small point cloud of an isolated chair, for example, where you only have pictures of that object. However, manipulating the information that connects the point cloud and images at a large scale is difficult.

“I had to spend a lot of time coding to efficiently manipulate the link between images and point clouds and never break this connection through the whole pipeline,” Damien tells us. “You must ensure that each 3D point is properly connected to the corresponding pixels. It’s very tricky.”
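The point-to-pixel link Damien describes comes down to projecting each raw 3D point into every image whose camera pose is known, then keeping only the points that land inside the frame and in front of the camera. The sketch below illustrates this with a plain pinhole camera model; it is not the authors’ pipeline, and the function and parameter names (project_points, K, R, t) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): project 3D points into one image
# given its camera pose, assuming a standard pinhole model. K, R, t and the
# image size are hypothetical inputs for illustration.
import numpy as np

def project_points(points, K, R, t, img_h, img_w):
    """Map N x 3 world points to pixel coordinates in a single image.

    points : (N, 3) world coordinates
    K      : (3, 3) camera intrinsics
    R, t   : world-to-camera rotation (3, 3) and translation (3,)
    Returns (N, 2) pixel coordinates and a boolean mask of points that
    fall inside the image and in front of the camera.
    """
    cam = points @ R.T + t                      # world frame -> camera frame
    in_front = cam[:, 2] > 1e-6                 # drop points behind the camera
    uvw = cam @ K.T                             # homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3].clip(1e-6)    # perspective division by depth
    in_image = (
        (uv[:, 0] >= 0) & (uv[:, 0] < img_w)
        & (uv[:, 1] >= 0) & (uv[:, 1] < img_h)
    )
    return uv, in_front & in_image
```

With a mapping like this, each 3D point ends up visible in zero, one, or many images, which is exactly the multi-view problem the article describes; a full pipeline would also have to resolve occlusions (for instance with a depth buffer) before aggregating image features per point.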