Computer Vision News - August 2020

Challenge: SARAS

three cases. In the end, we have invited the top three participants in the average mAP and the AP at 50 ranking to speak at the challenge.”

At the challenge event tomorrow, five invited teams will describe their methods. Juan Pablo Wachs, Associate Professor at Purdue University, will also speak about his work on action detection in surgical robotics.

Fabio points out that this is the first challenge on action detection in surgical robotics, and possibly even in medical imaging. There have been previous MICCAI challenges on action recognition in the surgical domain, but in action recognition you only have to estimate the label of the action for each video frame. You do not have to detect it, meaning you do not have to localize it with a bounding box in the image frame. Recognizing the action is good, but you need to detect it and localize it in the surgical scene to allow the SARAS arms to react to it and achieve full autonomy.

Looking ahead, Fabio says they plan to write a journal paper about the challenge and the top methods from the participants and submit it to Medical Image Analysis. They also plan to expand the challenge by adding further tasks and increasing the size of the dataset. They are considering submitting it to MICCAI next year, including data from seven videos produced by the SARAS platform that they are currently annotating. These videos are based on phantoms rather than real patients, but they capture operations on 3D-printed anatomies created by Dundee University in Scotland, which is another SARAS partner.

Fabio adds: “This challenge task was pure single-frame detection, whereas here in Brookes we are world leaders in action tube detection. So, not just detecting the bounding boxes frame by frame, but detecting the entire action instance described as a series of bounding boxes linked up in time, which people call action tubes in the literature.
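To make the idea of an action tube concrete, here is a minimal Python sketch that links per-frame bounding boxes into tubes by greedy overlap matching. It is only an illustration under stated assumptions: the helper names `iou` and `link_tubes`, the 0.5 overlap threshold, and the greedy strategy are hypothetical choices for this example, not the method used at Brookes or by any challenge participant, and action labels and detection scores are left out for brevity.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tubes(frames: List[List[Box]],
               min_iou: float = 0.5) -> List[List[Tuple[int, Box]]]:
    """Greedily link per-frame boxes into tubes (hypothetical helper).

    Each tube is a list of (frame_index, box) pairs. A tube is extended
    with the unused box in the next frame that overlaps its last box the
    most, provided the overlap reaches min_iou (an assumed threshold).
    """
    finished: List[List[Tuple[int, Box]]] = []
    active: List[List[Tuple[int, Box]]] = []
    for t, boxes in enumerate(frames):
        unused = list(boxes)
        still_active = []
        for tube in active:
            last_box = tube[-1][1]
            best = max(unused, key=lambda b: iou(last_box, b), default=None)
            if best is not None and iou(last_box, best) >= min_iou:
                tube.append((t, best))
                unused.remove(best)
                still_active.append(tube)
            else:
                finished.append(tube)  # no match in this frame: tube ends
        # Every box that extended no tube starts a new one.
        still_active.extend([(t, b)] for b in unused)
        active = still_active
    return finished + active


if __name__ == "__main__":
    detections = [[(0, 0, 10, 10)], [(1, 1, 11, 11)], [(50, 50, 60, 60)]]
    for tube in link_tubes(detections):
        print(tube)
```

In this toy input the first two boxes overlap strongly and form one tube, while the distant box in the third frame starts a second tube. A frame-level detector, by contrast, would simply report three unrelated boxes.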
The baseline that we used in this challenge is based on the RetinaNet object detector because the task was frame-level detection. It was not about
Best of MIDL 2020