Computer Vision News

A 2D U-Net architecture is employed for the segmentation module, with a specific choice of hyper-parameters. Modifications to the original network include Exponential Linear Unit (ELU) activations, a batch normalization layer after each non-linear activation, and a combined loss function (Cross Entropy + Soft Dice Loss) with a weighting factor of 0.5 for the CE loss.

The transformation module is built from three parts: a localisation network, which predicts the parameters of a transformation matrix, M; a grid generator, which implements the transform, T; and a sampler, which is responsible for the interpolation. The localisation network outputs three rotation parameters φ, θ, ψ and three translation parameters t = [tx, ty, tz]. These are fed to the Euler2Affine layer, which converts them into an affine matrix representation that allows valid inverse transformations, so that cycle constraints can be imposed.

An MSE loss is always employed to compare the output of the transformation with the original AX/SAX images. The cycle consistency, added to the network to improve stability and ensure correct rotational parameters, takes the form of a loss composed of two MSE terms that measure the quality of the target-to-source transformation and of the corresponding inverted output transformation, both produced by the Euler2Affine layer.

The other novel element, the task-oriented guidance, is added to fine-tune the translation parameters, because most of the predicted transformations excluded relevant lower slices of the volume. The domain-transformed image is fed into a branched-out sub-network (the Segmentation Module), whose gradients are back-propagated into the same localisation network. This allows the network branch to maximize the number of predicted foreground voxels and therefore to reduce the cutting of relevant structures due to sub-optimal translation.
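To make the role of the Euler2Affine layer more concrete, here is a minimal pure-Python sketch of how three rotation angles and a translation vector can be turned into a homogeneous affine matrix with a closed-form inverse. It is an illustration under stated assumptions, not the authors' implementation: the function names `euler_to_affine` and `invert_rigid` are hypothetical, and the rotation order (x, then y, then z) is assumed because the article does not specify it.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def euler_to_affine(phi, theta, psi, t):
    """Build a 4x4 homogeneous rigid transform from Euler angles (radians)
    and a translation t = (tx, ty, tz).

    Assumption: rotation about x (phi), then y (theta), then z (psi),
    i.e. R = Rz * Ry * Rx. The article does not state the convention.
    """
    cx, sx = math.cos(phi), math.sin(phi)
    cy, sy = math.cos(theta), math.sin(theta)
    cz, sz = math.cos(psi), math.sin(psi)
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    r = matmul(rz, matmul(ry, rx))
    # Append the translation as the fourth column, then the homogeneous row.
    return [r[0] + [t[0]], r[1] + [t[1]], r[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


def invert_rigid(m):
    """Invert a rigid transform in closed form: R_inv = R^T, t_inv = -R^T t."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t = [m[i][3] for i in range(3)]
    t_inv = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [t_inv[0]], rt[1] + [t_inv[1]], rt[2] + [t_inv[2]],
            [0.0, 0.0, 0.0, 1.0]]


# Example values chosen only for illustration.
M = euler_to_affine(0.3, -0.2, 1.1, (4.0, -2.0, 7.5))
M_inv = invert_rigid(M)
identity = matmul(M, M_inv)  # numerically close to the 4x4 identity
```

Because a rotation matrix is orthogonal, the inverse is always available in closed form, so a transformed volume can be mapped back and compared with the original. That property is what makes a cycle constraint of the kind described above possible.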
Finally, the overall loss of the network is a weighted sum of two terms, with a weighting factor of 0.1 on the second.

Two main datasets are employed to train this network: the TOF dataset, made up of two sub-cohorts, and the ACDC dataset, which includes healthy individuals and patients affected by one of four different conditions. The segmentation module was trained using

Computer Vision News - April 2021