Seminar Announcement
Explaining Neural Networks Through the Lens of Vision-Language Models
[Abstract]
Deep learning has made remarkable progress in various fields, yet its black-box nature remains a significant challenge. Understanding how these models learn and represent features is crucial for improving trustworthiness and real-world applicability, particularly in mission-critical applications. This presentation introduces the potential of Vision-Language Models (VLMs), which bridge vision and language, as a new direction for explaining neural networks. First, I will introduce recent approaches to neuron-concept association through the lens of VLMs, which uncover the internal representations of models. Then, I will present recent studies on generating natural language explanations with VLM-based cross-modal retrieval to justify the decisions of models. Finally, I will discuss my future research direction toward trustworthy AI for the surgical domain.
[Biography]
Seong Tae Kim is an Assistant Professor in the School of Computing at Kyung Hee University, where he leads the Augmented Intelligence Lab. Prior to joining KHU, he was a Senior Research Scientist at the Chair for Computer Aided Medical Procedures at the Technical University of Munich. His research interests span various topics in trustworthy AI, including the interpretability of AI models, natural language explanation, and uncertainty analysis. He has served as an Area Chair for MICCAI (2022-2024), an Associate Editor of IEEE Transactions on Circuits and Systems for Video Technology (2022-present), and a member of the Organizing Committee of MICCAI 2025 and the MICCAI Workshop on Graphs in Biomedical Image Analysis in 2025.