13 DAILY ICCV Thursday SceneRF

To address these challenges, he extends the PixelNeRF method so that it can reconstruct 3D geometry from a single image and operate on large-scale scenes. He proposes a novel encoder-decoder architecture designed to expand the image's field of view, allowing features to be extracted for points outside the immediate view.

"We also propose an efficient ray sampling technique," he continues. "In neural radiance fields, we need to project the rays into the views and then sample points on the rays. In our case, we need very long rays, around 100 meters. Therefore, we propose an efficient technique to sample a small number of points on each ray. For example, we only need 60 points for 100 meters, reducing the computation required."

Anh-Quan believes this is the first instance of a large-scale, self-supervised 3D reconstruction method that operates solely from a single image. The work also builds on neural radiance fields, a highly popular and award-winning method, and demonstrates its ability to generalize to unseen nuScenes images. Advisor Raoul de Charette told us that he finds SceneRF particularly interesting because it alleviates the need for 3D ground truth, a step towards arbitrary 3D reconstruction from a video stream.

"It's a very challenging project because of the setting," Anh-Quan adds. "That's the thing I like about it. It's tough to solve this problem, and I remember we only solved it several weeks before the deadline!"

The practical applications of this research in the real world are far-reaching. In autonomous driving, where predicting 3D scenes is essential, this method eliminates the need for 3D ground-truth supervision when training the computer vision network, enabling training on larger image datasets.
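The long-ray sampling Anh-Quan describes, only about 60 points over a 100-meter ray, can be illustrated with a simple non-uniform spacing that places samples densely near the camera and sparsely far away. This is a hedged sketch of the general idea, not SceneRF's exact sampling strategy; the function name and the log-spaced scheme are illustrative assumptions:

```python
import numpy as np

def sample_points_on_ray(origin, direction, near=0.5, far=100.0, n_samples=60):
    """Sparse sampling of a very long ray (~100 m) with only a few
    dozen points. Log spacing concentrates samples near the camera
    and widens the gaps with distance. Illustrative only; not the
    paper's exact sampler."""
    direction = direction / np.linalg.norm(direction)
    # Geometrically increasing depths between near and far.
    depths = np.logspace(np.log10(near), np.log10(far), n_samples)
    # 3D sample positions along the ray.
    points = origin[None, :] + depths[:, None] * direction[None, :]
    return depths, points

depths, points = sample_points_on_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
```

With 60 such samples, the spacing between consecutive points grows from a few centimeters near the camera to several meters at the far end, which is why so few points can cover such a long ray.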
RkJQdWJsaXNoZXIy NTc3NzU=