Seminar Announcement
Building Robust Foundation Models: Insights on Efficiency
[Abstract]
This talk introduces my recent work at NAVER AI Lab on cost-efficient methods for building foundation models. While the term foundation model has become widely adopted across domains such as language modeling and multi-modal modeling, adaptation after pre-training remains necessary for targeted downstream use cases. In particular, pre-trained models often lack robustness to unseen or out-of-distribution queries, which makes it important not only to tailor them to specific in-distribution tasks through fine-tuning, but also to improve their overall robustness. With efficiency in mind, the proposed methods address this challenge through model merging, grounded in the principle of linear mode connectivity observed across a wide range of pre-trained weights. The resulting models exhibit improved performance on both in-distribution and out-of-distribution tasks, demonstrating that robustness has been successfully imbued into the foundation models. These results suggest that simple and practical techniques can play a meaningful role in building more robust foundation models. Lastly, I will briefly introduce some of NAVER AI Lab’s recent work to highlight our current research directions.
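As background, the model-merging idea mentioned above can be illustrated by a minimal sketch: linearly interpolating the weights of a pre-trained checkpoint and a fine-tuned checkpoint, which linear mode connectivity suggests can trade off in-distribution accuracy against out-of-distribution robustness. The function name, the interpolation coefficient, and the toy scalar "checkpoints" below are illustrative assumptions, not the speaker's actual method.

```python
def merge_weights(pretrained, finetuned, alpha=0.5):
    """Linearly interpolate two checkpoints sharing the same parameter names.

    alpha = 0.0 returns the pre-trained weights; alpha = 1.0 the fine-tuned
    ones. Linear mode connectivity suggests that intermediate points can
    retain in-distribution gains while recovering robustness.
    """
    assert pretrained.keys() == finetuned.keys()
    return {name: (1 - alpha) * pretrained[name] + alpha * finetuned[name]
            for name in pretrained}

# Toy checkpoints as flat dicts of scalars for illustration; real
# checkpoints would map parameter names to weight tensors instead.
theta_pre = {"layer.weight": 1.0, "layer.bias": 0.0}
theta_ft  = {"layer.weight": 3.0, "layer.bias": 2.0}
merged = merge_weights(theta_pre, theta_ft, alpha=0.5)
```

Because merging happens purely in weight space, it adds no inference-time cost, which is one reason such approaches are attractive from an efficiency standpoint.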
[Biography]
Dongyoon Han is a Research Scientist at NAVER AI Lab and an Adjunct Professor at KAIST GSAI. He actively explores advancements in large language models and multi-modal models from a machine learning perspective. He is particularly passionate about designing efficient and robust deep neural networks, along with effective training and evaluation methods, regardless of the specific application area. Before his current role, he received his Ph.D. from KAIST in 2018, after earning his B.S. from KAIST in 2011. He has served as an Area Chair for the NeurIPS Datasets & Benchmarks track since 2023. For more details, please visit his personal page: https://dongyoonhan.github.io/