Prof. Niloy J. Mitra, University College London
Bio: Niloy J. Mitra leads the Smart Geometry Processing group in the Department of Computer Science at University College London. He received his PhD from Stanford University under the guidance of Leonidas Guibas. His current research focuses on developing machine learning frameworks for generating high-quality geometric and appearance models for CG applications. Niloy received the 2019 Eurographics Outstanding Technical Contributions Award, the 2015 British Computer Society Roger Needham Award, and the 2013 ACM SIGGRAPH Significant New Researcher Award. He also leads the Adobe Research London Lab. For more details, please visit http://geometry.cs.ucl.ac.uk/index.php. Besides research, Niloy is an active DIYer and loves reading, bouldering, and cooking.
Talk Title: Deep 3D Generative Modeling
Abstract: Deep learning has taken the Computer Graphics world by storm. While remarkable progress has been reported in the context of supervised learning, the state of unsupervised learning, in contrast, remains quite primitive. In this talk, we will discuss recent advances where we have combined knowledge from traditional computer graphics and image formation models to enable deep generative modeling workflows. We will describe how we have combined modeling and rendering, in the unsupervised setting, to enable controllable and realistic image and animation production. The work is done in collaboration with various students and research colleagues.
Prof. Matthias Nießner, Technical University of Munich
Bio: Dr. Matthias Nießner is a Professor at the Technical University of Munich, where he leads the Visual Computing Lab. Previously, he was a Visiting Assistant Professor at Stanford University. Prof. Nießner’s research lies at the intersection of computer vision, graphics, and machine learning, where he is particularly interested in cutting-edge techniques for 3D reconstruction, semantic 3D scene understanding, video editing, and AI-driven video synthesis. In total, he has published over 70 academic publications, including 22 papers in the prestigious ACM Transactions on Graphics (SIGGRAPH / SIGGRAPH Asia) journal and 43 works at the leading vision conferences (CVPR, ECCV, ICCV); several of these works won best paper awards, including at SIGCHI’14, HPG’15, SPG’18, and the SIGGRAPH’16 Emerging Technologies Award for the best Live Demo. Prof. Nießner’s work enjoys wide media coverage, with many articles featured in mainstream media including the New York Times, Wall Street Journal, Spiegel, MIT Technology Review, and many more. His work has led to several TV appearances, such as on Jimmy Kimmel Live, where Prof. Nießner demonstrated the popular Face2Face technique; Prof. Nießner’s academic YouTube channel currently has over 5 million views. For his work, Prof. Nießner received several awards: he is a TUM-IAS Rudolf Moessbauer Fellow (2017 – ongoing), and he won the Google Faculty Award for Machine Perception (2017), the Nvidia Professor Partnership Award (2018), and the prestigious ERC Starting Grant 2018, which comes with 1.5 million euros in research funding; in 2019, he received the Eurographics Young Researcher Award honoring the best up-and-coming graphics researcher in Europe. In addition to his academic impact, Prof. Nießner is a co-founder and director of Synthesia Inc., a brand-new startup backed by Mark Cuban, whose aim is to empower storytellers with cutting-edge AI-driven video synthesis.
Talk Title: The Revolution of Neural Rendering
Abstract: In this talk, I will present our research vision of how to create a photo-realistic digital replica of the real world, and how to make holograms become a reality. Eventually, I would like to see photos and videos evolve to become interactive, holographic content indistinguishable from the real world. Imagine taking such 3D photos to share with friends, family, or social media; the ability to fully record historical moments for future generations; or to provide content for upcoming augmented and virtual reality applications. AI-based approaches, such as generative neural networks, are becoming more and more popular in this context since they have the potential to transform existing image synthesis pipelines. I will specifically talk about an avenue towards neural rendering where we can retain the full control of a traditional graphics pipeline but at the same time exploit modern capabilities of deep learning, such as handling the imperfections of content from commodity 3D scans. While the capture and photo-realistic synthesis of imagery open up unbelievable possibilities for applications ranging from entertainment to communication industries, there are also important ethical considerations that must be kept in mind. Specifically, in the context of fabricated news (e.g., fake news), it is critical to highlight and understand digitally manipulated content. I believe that media forensics plays an important role in this area, both from an academic standpoint, to better understand image and video manipulation, and even more importantly from a societal standpoint, to create and raise awareness of the possibilities and to highlight potential avenues and solutions regarding trust in digital content.
Prof. Heng Tao Shen, University of Electronic Science and Technology of China
Bio: Professor Heng Tao Shen, ACM Fellow and OSA Fellow, is Dean of the School of Computer Science and Engineering and Executive Dean of the AI Research Institute at the University of Electronic Science and Technology of China (UESTC). He obtained his BSc with First Class Honours and his PhD from the Department of Computer Science at the National University of Singapore in 2000 and 2004, respectively. He was a professor at the University of Queensland before joining UESTC. His research has made contributions to the field of hashing big multimedia data, from hashing theory to algorithms and applications, and he has led the charge to address the challenging problem of cross-media understanding and retrieval. He has published 300+ peer-reviewed papers, including 110+ IEEE/ACM Transactions papers and 200+ CCF-A ranked papers. He has received 8 Best Paper Awards, including ACM Multimedia 2017, ACM SIGIR 2017, and IEEE Transactions on Multimedia 2020. He is General Co-Chair of ACM Multimedia 2021, former TPC Co-Chair of ACM Multimedia 2015, and an Associate Editor of ACM Transactions on Data Science, IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, IEEE Transactions on Knowledge and Data Engineering, Pattern Recognition, and Journal of Software.
Talk Title: Cross-Media Intelligence
Abstract: It has been shown that heterogeneous multimedia data gathered from different sources in different media types can often be correlated and linked to the same knowledge space. On the path towards cross-media intelligence, cross-media understanding, retrieval, and interaction have attracted a huge amount of attention due to their significance in both research communities and industry. In this talk, we will introduce the state of the art on this topic and discuss its future trends.
Prof. Jun-Yan Zhu, Carnegie Mellon University
Bio: Jun-Yan is an Assistant Professor with The Robotics Institute in the School of Computer Science of Carnegie Mellon University. He also holds affiliated faculty appointments in the Computer Science Department and Machine Learning Department. Prior to joining CMU, he was a Research Scientist at Adobe Research and a postdoctoral researcher at MIT CSAIL. He obtained his Ph.D. from UC Berkeley and his B.E. from Tsinghua University. He studies computer vision, computer graphics, computational photography, and machine learning. He is the recipient of the Facebook Fellowship, ACM SIGGRAPH Outstanding Doctoral Dissertation Award, and UC Berkeley EECS David J. Sakrison Memorial Prize for outstanding doctoral research. His co-authored work has received the NVIDIA Pioneer Research Award, SIGGRAPH 2019 Real-time Live! Best of Show Award and Audience Choice Award, and The 100 Greatest Innovations of 2019 by Popular Science.
Talk Title: GANs for Everyone
Abstract: The power and promise of deep generative models such as StyleGAN, CycleGAN, and GauGAN lie in their ability to synthesize endless realistic, diverse, and novel content with user controls. Unfortunately, the creation and deployment of these large-scale models demand high-performance computing platforms, large-scale annotated datasets, and sophisticated knowledge of deep learning methods. This makes the process infeasible for many visual artists, content creators, small business entrepreneurs, and everyday users.
In this talk, I describe our recent efforts in making GANs more accessible to a broad audience, through improved computational and data efficiency as well as better interfaces between humans and models. First, we reduce the inference time and model size of recent GAN models by 6-20x, allowing their easy deployment on consumer laptops and mobile devices. Second, we present a data-efficient training method that can learn a model with only one hundred photos, bypassing the necessity of collecting large-scale datasets. Finally, we introduce new human-centered model creation interfaces that allow a user to directly customize new models with minimal effort.