Computer Vision News - August 2024

Argho Sarkar

for saving lives, providing medical assistance, and conducting evacuation efforts. On the other hand, the integration of machine learning models into smart decision support systems raises concerns about model explanation. In remote sensing, limited contextual information can lead to shortcut learning, in which a model produces accurate results backed by false explanations.

Addressing these challenges, Argho's thesis focuses on two key aspects. Firstly, it develops a question-answering framework for efficient damage assessment using remote sensing imagery. Secondly, it aims to enhance the trustworthiness of model outcomes by developing novel machine learning frameworks tailored for remote sensing in both multi-modal and uni-modal contexts.

To achieve these goals, Argho and BINA Lab introduce two large-scale benchmark visual question-answering datasets for damage assessment, named FloodNet-VQA and RescueNet-VQA. These are the only existing datasets for developing visual question-answering frameworks for damage assessment, providing new opportunities for the AI research community.

In the later work, a supervised attention-based framework named SAM-VQA (Supervised Attention Module for Visual Question Answering for Post-Disaster Damage Assessment on Remote Sensing Imagery) is proposed. This framework models the image and the question jointly to provide accurate answers with rational visual explanations. To supervise the attention-obtaining process, it uses manually annotated visual masks that highlight the image regions needed to answer a given question. SAM-VQA offers improved explanations and achieves higher accuracy than state-of-the-art VQA algorithms.

In the last work, Argho proposes a novel learning strategy for consistent and robust visual explanations in image classification tasks for remote sensing. The strategy introduces two distinct loss functions, one designed to ensure consistency and the other robustness in visual explanations. Integrating these losses enables the model to obtain improved visual features compared to baseline convolutional architectures, resulting in higher accuracy and enhanced visual explanations.

In summary, Argho's research makes significant contributions to damage assessment and enhances the reliability of model outcomes in remote sensing applications. You can find Argho here. Congrats, Doctor Argho!
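For readers curious about what supervising attention with an annotated mask can look like, here is a minimal sketch. It is not the SAM-VQA implementation: the article does not specify the loss, so the choice of a KL divergence between the normalized mask and a softmax attention map, along with the function names `softmax` and `attention_supervision_loss`, are assumptions made purely for illustration.

```python
import numpy as np


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def attention_supervision_loss(attention_logits, mask, eps=1e-8):
    """Hypothetical supervision term (an assumption, not the thesis's loss).

    Computes KL(target || attention), where the target is the annotated
    binary mask normalized to sum to 1 and the attention is a softmax over
    all image regions. The value is small when attention mass falls on the
    annotated regions and large when it falls elsewhere.
    """
    attention = softmax(np.asarray(attention_logits, dtype=float).ravel())
    target = np.asarray(mask, dtype=float).ravel()
    target = target / (target.sum() + eps)
    relevant = target > 0  # only annotated regions contribute to the KL sum
    ratio = target[relevant] / (attention[relevant] + eps)
    return float(np.sum(target[relevant] * np.log(ratio)))


# Toy 4x4 grid of image regions; the annotator marked the central 2x2 block.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

aligned_logits = mask * 10.0             # attention concentrated on the mask
misaligned_logits = (1.0 - mask) * 10.0  # attention concentrated elsewhere

loss_aligned = attention_supervision_loss(aligned_logits, mask)
loss_misaligned = attention_supervision_loss(misaligned_logits, mask)
print(loss_aligned < loss_misaligned)
```

In a full training setup, a term like this would typically be added to the answer-prediction loss with a weighting factor, so that the model is rewarded both for answering correctly and for attending to the regions a human annotator considered relevant. That combination is again an assumption about a common pattern, not a detail given in the article.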
