uncertainty heat maps at the pixel level, and then we aggregate them into one score so that we have an uncertainty for the whole image. For failure detection, where we want whole images to be flagged as failures, we need an automatic system that says, at the image level, this is a failure.”

The question arises: what significance does a single glowing pixel of uncertainty hold when determining the uncertainty for a lesion? Here, the aggregation process becomes vital to facilitate human interaction with the model’s output.

“It’s a similar thing for active learning, where you select data points to be annotated,” Carsten points out. “Just one glowing pixel does not have an inherent meaning. It has to be aggregated. Aggregation is a crucial part of uncertainty methods, which we found in our study is also a very large factor in the final performance.”

Looking ahead, Kim highlights ValUES’s potential to serve as a foundational framework rather than an endpoint, opening avenues for the wider community to benchmark and refine new uncertainty methods. She says: “We argue that our benchmark should be used for newly developed methods to inform practitioners how they can use these methods.”

Carsten agrees and sees the work as bridging the gap between theory and practice. “There are a lot of great developments in theory that never make it into practical applications because they’re very complex, and practitioners are unsure about the benefits for their downstream tasks,” he tells us. “Developments can be benchmarked in our setting, and then practitioners can look at the results and make an informed decision to use some newly developed method or not.”

Computer Vision News — ValUES: A Framework for Systematic …
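The pixel-to-image aggregation Carsten describes can be sketched in a few lines. The sketch below is illustrative only: the three strategies (`mean`, `max`, and mean of the top 1% of pixels) are common choices we assume for demonstration, not necessarily the exact aggregation strategies benchmarked in ValUES.

```python
import numpy as np

def aggregate_uncertainty(pixel_uncertainty: np.ndarray,
                          strategy: str = "mean") -> float:
    """Collapse a per-pixel uncertainty heat map into one image-level score.

    A single 'glowing pixel' carries little meaning on its own; the
    aggregation strategy decides how pixel-level evidence becomes an
    image-level signal for failure detection or active learning.
    """
    if strategy == "mean":          # average uncertainty over all pixels
        return float(pixel_uncertainty.mean())
    if strategy == "max":           # most uncertain single pixel
        return float(pixel_uncertainty.max())
    if strategy == "top_fraction":  # mean of the 1% most uncertain pixels
        k = max(1, int(0.01 * pixel_uncertainty.size))
        flat = np.sort(pixel_uncertainty, axis=None)
        return float(flat[-k:].mean())
    raise ValueError(f"unknown strategy: {strategy!r}")

# A heat map with one hot pixel illustrates how much the choice matters:
heat = np.zeros((100, 100))
heat[0, 0] = 1.0
print(aggregate_uncertainty(heat, "mean"))  # tiny score: hot pixel averaged away
print(aggregate_uncertainty(heat, "max"))   # maximal score: driven by one pixel
```

The same heat map yields a near-zero score under `mean` but the maximum possible score under `max`, which is precisely why the aggregation step has such a large effect on downstream performance.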