The Rise and Fall of Continuous Learning with Emerging Vision Transformers

2024-11-12

[Abstract]

Continual Learning (CL) is an emerging machine learning paradigm in which a model learns from a continuous stream of data. CL presents significant challenges: it must preserve accuracy while handling continuously drifting data (i.e., non-IID data) and maintain high energy efficiency to be deployable on real-world devices. Traditionally, systems researchers have focused primarily on CNN-based CL, which excels in resource efficiency during inference, and have explored how to effectively schedule GPU resources for concurrent inference and training tasks. Such online scheduling algorithms are complex because they must solve a multi-dimensional optimization problem: using minimal resources to maximize the accuracy improvements gained from training, especially in the face of irregular data drifts with varying intensities.

Our research group has recently looked into the possibility of strengthening the offline stage of CL (i.e., before deploying the model) instead of the online stage. Our high-level idea is to use emerging Vision Transformers such as DINO as the backbone model and exploit a huge archive of real-world data to build a robust inference pipeline before deploying it on a device. Our goal, where feasible, is to eliminate cumbersome and expensive model training entirely in some CL scenarios. Through initial explorations in a few video analytics scenarios, we have observed great potential for this approach, though it poses tricky accuracy issues for certain types of classes, which we have been tackling over the past few weeks. I am taking this seminar as an opportunity to introduce our WIP system, LITE (Live It To Expert), and I look forward to deeper and broader conversations during the seminar.

[Biography]

I am an Associate Professor in GSAI/CSE at POSTECH. Previously, I was an Associate Professor at UNIST and a visiting professor in the DeepSpeed team at Microsoft. Before joining UNIST in fall 2018, I spent several years in industry with the Systems Research Group at Microsoft Research (2015-2018) and the Systems Research Group at ARM Research (2014-2015). My research interests center on building system platforms for AI and big data processing, cloud computing, and various emerging hardware. I received my PhD in CS from Rice University, my MS in CS from KAIST, and my BE in CE from Kwangwoon University. I won Best Paper Awards at ICDE 2022 and SYSTOR 2016 and was twice selected as a Meta Faculty Research Award Finalist, in 2020 and 2022.
