ECCV 2020 Daily - Wednesday

Ultimately, deep learning will replace classical techniques for the majority of medical imaging applications. Deep learning can offer better accuracy (e.g. for image segmentation) or faster run times (e.g. for image reconstruction). There is particular excitement around AI for disease diagnosis. However, for AI to be trusted by medical users and to gain regulatory approval, algorithms must be proven safe and effective. Currently, this requires a high level of input from the human medical expert: for selecting the data cohort, for defining the task, for data pre-processing and model design, and especially for creating the detailed pixel-level annotations needed to train and evaluate the model.

An alternative, more scalable approach is to mine data and labels from the large repositories of images and associated radiology reports that already exist at health institutions. Several chest X-ray datasets built in this way have recently been released into the public domain. However, problems arise from potential biases in the data and the labelling, which can lead to the learning of spurious correlations and to optimistic estimates of clinical task performance; this is compounded by the fact that algorithms trained with image-level supervision are not straightforward to interpret.

Beyond Imaging: Multimodal AI using the whole patient record

As more medical data goes digital within the patient’s electronic health record (EHR), multimodal AI is a tantalising possibility. Some EHR data is unstructured, such as images and text, whilst other EHR data is structured, such as medications and lab test results. Multimodal learning is a challenging area, in which we need to make decisions about the fusion stage (early, middle, or late?) and the fusion method (additive or multiplicative?), to develop methods that bring the different modalities into the same semantic space, and to understand and build robustness to domain-transfer issues for data types other than imaging (a minimal fusion sketch follows at the end of this section).

In her talk, Alison describes some first examples of mixing imaging with other modalities in the medical domain – imaging with (clinical) metadata, imaging with text, imaging with genetics – concluding that this is a promising area but that we have not yet seen a convincing step change: for image-oriented tasks, adding non-imaging data has had mixed results (see Fig. 6 for a successful example), and for clinically oriented tasks, adding imaging data seems to be low-value. Going forward, we need better alignment between modalities.
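To make the fusion design choices above concrete, here is a minimal sketch of a late-fusion classifier combining an image embedding with structured EHR features, contrasting additive and multiplicative fusion. It is not taken from the talk: the encoder dimensions, class count, and module names are illustrative assumptions.

```python
# Minimal PyTorch sketch of late fusion with additive vs. multiplicative
# combination. All dimensions and names below are illustrative assumptions.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, img_dim=512, ehr_dim=32, hidden=128, n_classes=2,
                 fusion="additive"):
        super().__init__()
        self.fusion = fusion
        # Project each modality into the same semantic space before fusing.
        self.img_proj = nn.Linear(img_dim, hidden)
        self.ehr_proj = nn.Linear(ehr_dim, hidden)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, img_feat, ehr_feat):
        zi = torch.relu(self.img_proj(img_feat))
        ze = torch.relu(self.ehr_proj(ehr_feat))
        if self.fusion == "additive":
            z = zi + ze   # additive fusion: sum of the two projections
        else:
            z = zi * ze   # multiplicative fusion: elementwise gating
        return self.head(z)

# Usage with dummy per-modality features from pretrained encoders.
model = LateFusionClassifier(fusion="multiplicative")
img_feat = torch.randn(4, 512)  # e.g. pooled CNN features of a chest X-ray
ehr_feat = torch.randn(4, 32)   # e.g. normalised labs and medications
logits = model(img_feat, ehr_feat)
print(logits.shape)  # torch.Size([4, 2])
```

An early-fusion variant would instead concatenate raw or low-level features before a shared encoder; the late-fusion form shown here keeps the per-modality encoders independent, which makes it easier to handle missing modalities at inference time.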

RkJQdWJsaXNoZXIy NTc3NzU=