Computer Vision News - February 2024

While initially unsure of its applicability to medical images, Bo and Jun immediately recognized the potential impact of adapting SAM for this task. Foundation models were still in their infancy, so there were many uncertainties, but deciding they were onto something, they got to work. In the end, through meticulous hyperparameter tuning and engineering efforts, they discovered it could effectively handle diverse medical image modalities.

“Nowadays, it’s very common for us to use deep learning for medical image segmentation, but before the deep learning era, mathematical models were the most popular method for this task,” Jun recalls. “These models are inherently transparent but have limited usability because they require much hyperparameter tuning when segmenting a new image.”

There has since been a paradigm shift from those mathematical models to today’s popular deep learning models. In 2015, with the emergence of fully convolutional neural networks, such as U-Net, FCN, and W-Net, features could be learned end-to-end, making segmentation far more automatic than with the earlier mathematical models. The inference process can be fully automatic, too, without additional parameter tuning. However, these models are usually customized for specific images or modalities, with limited generalizability and adaptability.

“The SAM paper was released in early April last year, and I looked at its demo on natural images and realized it was a great breakthrough,” Jun tells us. “I experienced the traditional mathematical segmentation methods, which were kind of painful because they required a lot of hyperparameter tuning. We tried SAM on our medical images and found that performance wasn’t very good since its training set mainly contained natural images.”

MedSAM represents another paradigm shift, from specialist models to generalist or foundation models. Jun identifies three key components for developing a generalized model.
“First, a great network architecture; second, a

Nature Communications Paper

MedSAM is developed on a large-scale medical image dataset with over 1.5M image-mask pairs.
