MICCAI 2021 Daily

[...] Cranial Vault (BTCV) dataset and spleen and brain tumor segmentation using the Medical Segmentation Decathlon (MSD) dataset. On the BTCV dataset, UNETR is currently the state-of-the-art method on both the Standard (training only with challenge data) and Free Competition (training with additional data) public leaderboards. In addition, UNETR has so far been shown to be more efficient than other transformer-based models (e.g. TransUNet) and CNN-based baselines in terms of number of FLOPs and inference time. See Table 1 for a comparison of the number of parameters, FLOPs and averaged inference time for various models in the BTCV experiments.

Table 1: Comparison of number of parameters, FLOPs and averaged inference time for various models in BTCV using a sliding window approach.

Figure: Overview of the UNETR architecture. A 3D input volume is divided into a sequence of uniform non-overlapping patches and projected into an embedding space using a linear layer. A position embedding is added to the sequence, which is then used as input to a transformer model. The encoded representations from different layers of the transformer are extracted and merged with a decoder via skip connections to predict the final segmentation.

In the spirit of open innovation and to accelerate research in this emerging field, NVIDIA has open-sourced UNETR via the MONAI GitHub public repository. In addition, a standalone UNETR repository is available in the MONAI research contributions repository. Furthermore, two UNETR tutorials (pure MONAI and MONAI + PyTorch Lightning) for multi-organ segmentation using the BTCV dataset are available in the MONAI tutorials for researchers to further explore this methodology in practice.

Two notable approaches that have leveraged transformers for medical image segmentation are TransUNet and CoTr. These approaches will be discussed in detail in the following sections.
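
To make the first stage of the architecture overview concrete, here is a minimal sketch of how a 3D volume can be cut into uniform non-overlapping patches, projected with a linear layer, and combined with a position embedding. It is not the MONAI implementation: the function names `patchify_3d` and `embed`, the toy 4x4x4 single-channel volume, the patch size of 2, the embedding size of 3 and the constant placeholder weights are all assumptions chosen for illustration. In a real model the projection weights and position embedding would be learned.

```python
from itertools import product


def patchify_3d(volume, patch):
    """Split a single-channel 3D volume (nested lists, shape H x W x D) into
    non-overlapping cubic patches and flatten each patch into a 1D vector."""
    h, w, d = len(volume), len(volume[0]), len(volume[0][0])
    if h % patch or w % patch or d % patch:
        raise ValueError("each dimension must be divisible by the patch size")
    tokens = []
    # Visit the corner of every patch; the step equals the patch size,
    # so patches never overlap.
    for x0, y0, z0 in product(range(0, h, patch),
                              range(0, w, patch),
                              range(0, d, patch)):
        tokens.append([
            volume[x][y][z]
            for x in range(x0, x0 + patch)
            for y in range(y0, y0 + patch)
            for z in range(z0, z0 + patch)
        ])
    return tokens


def embed(tokens, weight, pos):
    """Apply a linear projection to each flattened patch and add a
    position embedding.

    weight: patch_dim x K matrix (hypothetical, would be learned)
    pos:    N x K matrix, one row per patch (hypothetical, would be learned)
    """
    k = len(weight[0])
    sequence = []
    for i, token in enumerate(tokens):
        projected = [sum(t * weight[j][c] for j, t in enumerate(token))
                     for c in range(k)]
        sequence.append([p + e for p, e in zip(projected, pos[i])])
    return sequence


# Toy 4x4x4 volume whose voxel value encodes its own coordinates.
volume = [[[x * 16 + y * 4 + z for z in range(4)]
           for y in range(4)]
          for x in range(4)]

tokens = patchify_3d(volume, patch=2)          # 8 patches of 2*2*2 = 8 voxels
weight = [[0.1] * 3 for _ in range(8)]         # placeholder projection, K = 3
pos = [[0.0] * 3 for _ in range(len(tokens))]  # placeholder position embedding
sequence = embed(tokens, weight, pos)

print(len(tokens), len(tokens[0]), len(sequence[0]))
```

The resulting `sequence` has one K-dimensional vector per patch, which is the form a transformer encoder expects as input. The skip connections and decoder described in the overview are not covered by this sketch.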

MICCAI 2021 Daily - Tuesday