Text-to-3D Generation, Manipulation, and Analysis

2023-05-25

[Abstract]

We are witnessing a remarkable revolution in AI, with models such as ChatGPT writing text like humans and Stable Diffusion and Midjourney creating images that look like real photos. The next step for AI models is to achieve 3D understanding, which poses challenges since obtaining billion-scale 3D datasets is impossible. In this talk, I will delve into the current progress of AI models in 3D generation and analysis and discuss potential roadmaps toward human-level 3D perception, sharing details of our research in this field. To begin, I will introduce our new dataset specifically designed for language-to-3D tasks and explain how it can be effectively utilized in various shape analysis and manipulation tasks. I will also present our novel 3D generative model, a versatile tool that enables shape generation, editing, and text-based manipulation. Lastly, I will discuss the use of 2D image generative models for 3D generation and editing, the challenges involved, and our approaches to addressing them.

[Biography]

Minhyuk Sung is an assistant professor in the School of Computing at KAIST, affiliated with the Graduate School of AI and the Metaverse Program. Before joining KAIST, he was a Research Scientist at Adobe Research. He received his Ph.D. from Stanford University under the supervision of Professor Leonidas J. Guibas. His research interests lie in vision, graphics, and machine learning, with a focus on 3D geometric data processing. His academic service includes serving as a program committee member for Eurographics 2022, SIGGRAPH Asia 2022 and 2023, and AAAI 2023.
