Computer Vision News - June 2020

```python
import time

import torch

model.eval()
out_image = []

# The rolling starts patch_size away from the border so that
# every extracted patch fits inside the image.
start_y, end_y = PATCH_SIZE[0], x.shape[0] - PATCH_SIZE[0]
start_x, end_x = PATCH_SIZE[1], x.shape[1] - PATCH_SIZE[1]

for i in range(start_y, end_y):
    start_time = time.time()
    for j in range(start_x, end_x):
        # patch extraction
        center = [i, j]
        patch = x[center[0] - PATCH_SIZE[0] // 2 : center[0] + PATCH_SIZE[0] // 2,
                  center[1] - PATCH_SIZE[1] // 2 : center[1] + PATCH_SIZE[1] // 2]

        # preprocess and forward pass
        patch = patch.reshape(1, 1, PATCH_SIZE[0], PATCH_SIZE[1])
        patch = torch.from_numpy(patch).float()
        with torch.no_grad():
            out = model(patch)
        out_image.append(out.numpy())
    print(f'Rows processed: {i + 1 - PATCH_SIZE[0]}\t'
          f'Time taken: {time.time() - start_time}')
```

Network rolling

The image is raster-scanned from location (0, 0) to (img_size, img_size); this sweep is what we call the network rolling. The edge pixels are not processed, so the output is smaller than the input image; to compensate, zero-padding is used. For every row, the rolling starts at patch_size and ends at image_shape - patch_size.

Post-processing and final output

The output tensor has a shape of [N, 1, 2], where N is the number of patches processed during raster scanning ("rolling"). Applying np.argmax along the class dimension returns a binary value for every patch. Once all the patches are binarized, the tensor is reshaped into a 128x128 2D array.
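The size bookkeeping behind the zero-padding can be sketched as follows. The concrete sizes here (a 160x160 input and 16x16 patches) are assumptions for illustration, chosen to be consistent with the 128x128 output mentioned above:

```python
import numpy as np

PATCH_SIZE = (16, 16)          # assumed patch size
image = np.zeros((160, 160))   # assumed input image size

# Rolling covers rows/cols from patch_size to image_size - patch_size,
# so the raw output loses a patch_size-wide border on every side.
out_h = image.shape[0] - 2 * PATCH_SIZE[0]
out_w = image.shape[1] - 2 * PATCH_SIZE[1]
raw_output = np.zeros((out_h, out_w))   # 128 x 128 here

# Zero-pad the result back to the input resolution.
padded = np.pad(raw_output,
                ((PATCH_SIZE[0], PATCH_SIZE[0]),
                 (PATCH_SIZE[1], PATCH_SIZE[1])),
                mode='constant')
```

With these sizes, `raw_output` is 128x128 and `padded` recovers the original 160x160 shape.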
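The post-processing step can be sketched with NumPy alone. The random logits below are a hypothetical stand-in for the outputs collected during rolling; only the shapes and the argmax/reshape logic reflect the text above:

```python
import numpy as np

# Stand-in for the collected network outputs: N patches, each a
# [1, 2] pair of class scores (N = 128 * 128 patches were rolled).
N = 128 * 128
outputs = np.random.randn(N, 1, 2).astype(np.float32)

# argmax over the class dimension yields a binary label per patch.
labels = np.argmax(outputs, axis=2).squeeze(1)   # shape (N,)

# Reshape the flat patch labels back into the 128x128 output image.
segmentation = labels.reshape(128, 128)
```

Each entry of `segmentation` is 0 or 1, the predicted class of the pixel at the corresponding patch center.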
