[Special Seminar] One-Shot Object Detection by Matching Anchor Features
Bio: Anton Osokin is currently a researcher at the Samsung lab, CS HSE, Moscow, Russia, where he works with the Bayesian Methods research group and teaches at CS HSE. He completed both his undergraduate and Ph.D. studies at the Department of Computational Mathematics and Cybernetics (CMC) at Lomonosov Moscow State University, Russia, under the supervision of Dmitry Vetrov. In 2014, he moved to France to work with Simon Lacoste-Julien and Francis Bach as a member of the SIERRA project-team (machine learning) at INRIA/École Normale Supérieure in Paris. In 2016, he moved next door to the WILLOW project-team (computer vision) to work with Ivan Laptev. In 2017, he returned to Moscow to join CS HSE. His research interests lie in machine learning, discrete optimization, and computer vision. Most of all, he likes it when knowledge accumulated in different areas comes together and resonates. His current projects focus mostly on structured-output prediction: practical methods using deep learning and a theoretical understanding of the specific challenges of structured prediction.
Abstract: In this paper, we consider the task of one-shot object detection, which consists in detecting objects defined by a single demonstration. Unlike in standard object detection systems, the classes of objects used for training and the classes of objects used for testing do not overlap. Our model is based on matching local features instead of global representations. We use dense correlation matching to find correspondences, a feed-forward geometric transformation model to align features, and bilinear resampling of the correlation tensor to compute the detection score of the aligned features. All the components are differentiable, which allows end-to-end training. Experimental evaluation on the challenging task of retail product detection shows that our method can detect unseen classes (e.g., toothpaste, when trained on other types of retail products) and outperforms a strong baseline by a large margin.
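
As a rough illustration of the dense correlation matching mentioned in the abstract, the following minimal NumPy sketch correlates every local feature of a query (the single demonstration) with every local feature of a target image. It is not the authors' implementation: the function names, the feature-map shapes, the use of cosine similarity, and the random toy features are all assumptions made for this example.

```python
import numpy as np


def l2_normalize(features, eps=1e-8):
    """Scale each local descriptor (last axis) to approximately unit length."""
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / (norms + eps)


def dense_correlation(query_feats, target_feats):
    """Cosine similarity between all pairs of query and target locations.

    query_feats:  array of shape (Hq, Wq, C), hypothetical query feature map
    target_feats: array of shape (Ht, Wt, C), hypothetical target feature map
    returns:      correlation tensor of shape (Hq, Wq, Ht, Wt)
    """
    q = l2_normalize(query_feats)
    t = l2_normalize(target_feats)
    # Dot product over the channel axis for every pair of spatial locations.
    return np.einsum("ijc,klc->ijkl", q, t)


# Toy data (assumption): random features, with the query pasted into the target.
rng = np.random.default_rng(0)
query = rng.standard_normal((3, 3, 8))
target = rng.standard_normal((5, 6, 8))
target[1:4, 2:5] = query  # query location (i, j) now sits at target (i + 1, j + 2)

corr = dense_correlation(query, target)

# Best-matching target location for the central query feature.
best_match = np.unravel_index(corr[1, 1].argmax(), corr[1, 1].shape)
```

In this toy setup, `corr` holds one similarity score per pair of locations, and the strongest response for each query feature falls where the query was pasted. In the model described in the abstract, such a tensor would then feed the geometric transformation model and the bilinear resampling step, neither of which is sketched here.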