Deep Neural Network Models for Video Story Understanding



Gunhee Kim has been an assistant professor in the Department of Computer Science and Engineering at Seoul National University since 2015. Before that, he was a postdoctoral researcher at Disney Research for one and a half years. He received his PhD in 2013 from the Computer Science Department of Carnegie Mellon University under the supervision of Eric P. Xing. Before starting his PhD studies in 2009, he earned a master’s degree at the Robotics Institute, CMU, under the supervision of Martial Hebert. His research interests lie in solving computer vision and web mining problems that emerge from big image data shared online, by developing scalable and effective machine learning and optimization techniques. He is a recipient of the 2014 ACM SIGKDD Doctoral Dissertation Award and the 2015 Naver New Faculty Award.



Recently, there has been a surge of interest in computer vision and machine learning research in leveraging deep learning techniques to jointly understand visual data and natural language text. In this presentation, I will briefly introduce three of our recent papers in this direction, published at CVPR 2017 and ICCV 2017. First, I will present TGIF-QA, a new dataset for video-based question answering. Next, I will describe an approach that won three tracks of the LSMDC 2016-2017 video-to-language challenge. Finally, I will discuss a memory network model with CNN-based memory read/write operations for MovieQA tasks.