Computer Vision News - June 2019
BA-Net: Dense Bundle Adjustment Network

convolutional layer of Laina et al. (2016) to generate the basis depth maps, denoted by B. The final depth map is generated as a linear combination of these basis depth maps. The desired depth estimate is $d_j = \mathrm{ReLU}(w^\top B[j])$, where w is a vector of learnable weights (learned at run time) and B[j] is the j-th column of the feature maps produced by the encoder-decoder network.

Bundle Adjustment Layer: the main advantage of the BA layer lies in how it handles the update of the Levenberg-Marquardt (LM) algorithm. LM iteratively solves non-linear least squares problems of the form $\sum_i \left(y_i - f(x_i, \chi)\right)^2$, where χ is the optimization variable. Using the Jacobian matrix J of f with respect to χ, LM defines the update Δχ by the equation:

$$(J^\top J + \lambda I)\,\Delta\chi = J^\top \left(y - f(x, \chi)\right)$$

where λ is the damping factor. The main challenge is that, in the LM algorithm, the damping factor is determined by non-smooth thresholding, which makes the update non-differentiable. To cope with this, the authors use a network to predict the optimal value of λ. This makes the update rule differentiable and allows the damping factor itself to be optimized. The network takes as input the feature pyramid F, the camera parameters T and the depth values d (computed from the depth maps) from the previous iteration, and returns the damping factor (or, equivalently, the current update). It is trained in a supervised manner using the ground truth camera parameters. The following figure explains this process:

The three components described above give a differentiable pipeline that can be trained end to end using back-propagation.

Results

To demonstrate the effectiveness of their method, the authors compared it to DeMoN, a depth and motion network for learning monocular stereo, as well as to conventional BA. You can appreciate their results by looking at the following table:
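For readers who want to experiment with the two formulas from the method description, the depth combination and the damped Levenberg-Marquardt step can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' implementation: the damping factor `lam` is fixed by hand rather than predicted by a network, the exponential curve-fitting problem is a toy stand-in for the bundle adjustment objective, and all function names (`combine_basis_depth`, `lm_step`, `fit`) are hypothetical.

```python
import numpy as np


def combine_basis_depth(w, B):
    """Depth values d_j = ReLU(w^T B[j]), where B[j] is the j-th column of B."""
    return np.maximum(w @ B, 0.0)


def lm_step(residual_fn, jacobian_fn, chi, lam):
    """One damped step: solve (J^T J + lam * I) delta = J^T r, with r = y - f(chi)."""
    r = residual_fn(chi)
    J = jacobian_fn(chi)
    A = J.T @ J + lam * np.eye(chi.size)
    delta = np.linalg.solve(A, J.T @ r)
    return chi + delta


# Toy problem (an assumption for illustration): fit y = a * exp(b * x).
x = np.linspace(0.0, 1.0, 20)
true_chi = np.array([2.0, -1.0])
y = true_chi[0] * np.exp(true_chi[1] * x)


def residual(chi):
    """Residual r = y - f(x, chi) for the toy exponential model."""
    return y - chi[0] * np.exp(chi[1] * x)


def jacobian(chi):
    """Jacobian of the model f (not of the residual) with respect to chi."""
    e = np.exp(chi[1] * x)
    return np.stack([e, chi[0] * x * e], axis=1)


def fit(chi0, lam=1e-2, iters=50):
    """Repeat the damped step with a hand-picked, fixed damping factor."""
    chi = np.asarray(chi0, dtype=float)
    for _ in range(iters):
        chi = lm_step(residual, jacobian, chi, lam)
    return chi


if __name__ == "__main__":
    print(fit([1.0, 0.0]))
```

Because `lm_step` consists only of matrix products and a linear solve, it is differentiable with respect to `lam`; the classical LM heuristic of raising or lowering the damping factor by thresholding is the non-smooth part that the text says is replaced by a learned predictor.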