Computer Vision News - May 2023

Annika Reinke's analysis of metric pitfalls did not stop there. When comparing algorithms against each other, she demonstrated that rankings are typically unstable, meaning that an algorithm can come out on top simply because of how the ranking is computed, not because it is actually the best fit for the research task at hand. Finally, she found that challenges and the algorithms submitted to them are typically not reproducible.

Improving validation

Uncovering problems is good; solving them is even better! To overcome the described issues, Annika and her team proposed several improvements. A structured challenge submission system now collects comprehensive information about challenge designs, which can be critically reviewed by independent referees. To promote the selection of validation metrics based on their suitability to the underlying research problem rather than their popularity, she proposed a problem-driven metric recommendation framework that empowers researchers to make educated decisions while being made aware of the pitfalls to avoid. To enable uncertainty-based ranking analysis, she presented an open-source toolkit that includes several advanced visualization techniques for benchmarking experiments. To facilitate and enhance challenge transparency, she presented a guideline for challenge reporting and introduced challenge registration, i.e. publishing the complete challenge design before execution. Finally, she showed that challenge results can be used for a dedicated strength-weakness analysis of participating algorithms, from which future algorithm development could benefit greatly when addressing unsolved issues.

We hope that Annika's work paves the way for high-quality, thorough algorithm validation, which is crucial to avoid translating ineffective or clinically useless algorithms into clinical practice.
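To make the ranking-instability point concrete, here is a minimal, hypothetical sketch in Python (not the toolkit presented by Annika's team; all scores are invented for illustration). It shows how two common ranking schemes, aggregate-then-rank and rank-then-aggregate, can crown different winners on exactly the same per-case results, and how bootstrapping over test cases can be used to estimate how stable a ranking actually is.

```python
# Illustrative sketch only: how the choice of ranking scheme can change the
# "winner" of a challenge, and how bootstrapping over test cases exposes
# ranking uncertainty. All numbers are made up for demonstration purposes.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-case scores (higher = better), shape: (n_algorithms, n_cases)
scores = np.array([
    [0.95, 0.90, 0.20, 0.92, 0.91],   # Algorithm A: strong, but fails on one case
    [0.85, 0.86, 0.84, 0.85, 0.86],   # Algorithm B: consistently decent
    [0.80, 0.88, 0.70, 0.83, 0.89],   # Algorithm C
])
names = ["A", "B", "C"]

def aggregate_then_rank(s):
    """Rank algorithms by their mean score across cases (rank 1 = best)."""
    means = s.mean(axis=1)
    return np.argsort(np.argsort(-means)) + 1

def rank_then_aggregate(s):
    """Rank algorithms per case, then rank the mean of those per-case ranks."""
    per_case_ranks = np.argsort(np.argsort(-s, axis=0), axis=0) + 1
    mean_ranks = per_case_ranks.mean(axis=1)
    return np.argsort(np.argsort(mean_ranks)) + 1

print("aggregate-then-rank:", dict(zip(names, aggregate_then_rank(scores))))
print("rank-then-aggregate:", dict(zip(names, rank_then_aggregate(scores))))

# Bootstrap over test cases: how often does each algorithm end up ranked first?
n_boot, n_cases = 1000, scores.shape[1]
wins = np.zeros(len(names), dtype=int)
for _ in range(n_boot):
    idx = rng.integers(0, n_cases, size=n_cases)   # resample cases with replacement
    ranks = aggregate_then_rank(scores[:, idx])
    wins[np.argmin(ranks)] += 1
print("winning frequency over", n_boot, "bootstraps:",
      dict(zip(names, (wins / n_boot).round(2))))
```

In this toy example, the consistently decent algorithm wins under mean-score aggregation, while the algorithm with one catastrophic failure case wins once per-case ranks are averaged. That scheme-dependent outcome is exactly the kind of instability the uncertainty-based ranking analysis described above is meant to expose.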
