Computer Vision News - March 2018
The results table demonstrates the main goal the authors set out to achieve: maximizing recall while preserving at least 91% precision. The proposed model is the only one that uses local training in parallel with batch-normalized class embedding. It is also the only one showing a significant improvement over the competing models' results, especially compared with the pairwise-statistics model.

A deeper analysis of the low-dimensional embedding matrix R: each column of the neighbor embedding matrix R is a vector representing the corresponding class. Using t-SNE to reduce the 32-dimensional space to a 2D space, the embedded classes can be seen clustered according to shelf-'semantic' similarity and relations.

Conclusion: the authors' model evolves a context-less classifier into a context-sensitive one. The network learns deep contextual and visual feature vectors for neighboring classes, enabling precise, structured prediction of object categories and meeting the need for both input and context data with a large learning capacity. The authors tested their method on a dataset containing spatial sequences of objects and a large number of visually similar classes: it outperformed all the other tested models. They demonstrate that the Markovity and stationarity assumptions make it sufficient to train on individual objects as samples, which enriches the diversity of the training data, allows for a simple embedding batch normalization, and boosts the non-convex optimization process in terms of both time and performance.
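As a minimal sketch of the visualization step described above: each column of the embedding matrix R is treated as one class vector, and t-SNE projects the 32-dimensional vectors to 2D. The matrix here is random for illustration only (in the paper, R comes from the trained network), and the class count is an assumption.

```python
# Hedged sketch: t-SNE visualization of the columns of a class-embedding matrix R.
# The matrix R below is random placeholder data; in the authors' work it is
# learned by the network, and each column represents one product class.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
dim, n_classes = 32, 50          # 32-D embedding, illustrative class count
R = rng.normal(size=(dim, n_classes))

# t-SNE expects one sample per row, so transpose: one row per class embedding.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(R.T)
print(coords.shape)              # one 2-D point per class, ready for a scatter plot
```

Plotting `coords` with a scatter plot (coloring points by class group) would reproduce the kind of cluster picture the authors describe, where related shelf classes land near each other.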