Speakers

Prof. Niloy J. Mitra, University College London

Bio: Niloy J. Mitra leads the Smart Geometry Processing group in the Department of Computer Science at University College London. He received his PhD from Stanford University under the guidance of Leonidas Guibas. His current research focuses on developing machine learning frameworks for generating high-quality geometric and appearance models for CG applications. Niloy received the 2019 Eurographics Outstanding Technical Contributions Award, the 2015 British Computer Society Roger Needham Award, and the 2013 ACM SIGGRAPH Significant New Researcher Award. He also leads the Adobe Research London Lab. For more details, please visit http://geometry.cs.ucl.ac.uk/index.php. Besides research, Niloy is an active DIYer and loves reading, bouldering, and cooking.

Talk Title: Deep 3D Generative Modeling

Abstract: Deep learning has taken the Computer Graphics world by storm. While remarkable progress has been reported in the context of supervised learning, the state of unsupervised learning remains quite primitive. In this talk, we will discuss recent advances that combine knowledge from traditional computer graphics and image formation models to enable deep generative modeling workflows. We will describe how we have combined modeling and rendering, in the unsupervised setting, to enable controllable and realistic image and animation production. This work was done in collaboration with various students and research colleagues.


Prof. Zhouchen Lin, Peking University

Bio: Zhouchen Lin is a professor with the Key Lab. of Machine Perception, School of Artificial Intelligence, Peking University. His research areas include machine learning, numerical optimization, computer vision, and image processing. He has published more than 230 peer-reviewed papers in top journals and conferences, as well as three books. His Google Scholar citation count is over 22,000. He has served many times as an area chair of CVPR, ICCV, NIPS/NeurIPS, ICML, ICLR, AAAI, and IJCAI. He is currently an ICPR 2022 Program Co-chair and an ICML 2022 Senior Area Chair. He was an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and is currently an associate editor of the International Journal of Computer Vision. He is a fellow of IEEE, IAPR, and CSIG.

Talk Title: Equivariant Deep Networks

Abstract: To date, designing deep networks remains ad hoc, and few theories are available to guide it. However, many image processing and computer vision tasks exhibit rotation and translation equivariance: when the input is rotated or translated by some amount, the output is rotated or translated by the same amount. This provides useful guidance in designing deep networks. In this talk, I will introduce our latest work along this line. I will show that equivariant convolutions can be designed in a principled way, on both Euclidean spaces and manifolds, and that parameter efficiency can be significantly improved.
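
As background for the talk (the sketch below is illustrative and not drawn from the speaker's work), translation equivariance of a map f under a shift operator T_t can be written as f(T_t x) = T_t f(x). The following NumPy snippet checks this identity numerically for a circular 1-D convolution; the function names and toy sizes are assumptions made for the example:

    import numpy as np

    def circular_conv1d(x, w):
        # Circular 1-D cross-correlation: out[i] = sum_j w[j] * x[(i + j) mod n].
        n = len(x)
        return np.array([sum(w[j] * x[(i + j) % n] for j in range(len(w)))
                         for i in range(n)])

    def translate(x, t):
        # Translation operator T_t: circularly shift the signal t steps to the right.
        return np.roll(x, t)

    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)   # a toy 1-D "image"
    w = rng.standard_normal(3)   # a small filter
    t = 3                        # shift amount

    # Equivariance check: filtering a shifted input equals shifting the
    # filtered output, i.e. f(T_t x) == T_t f(x), up to floating-point error.
    assert np.allclose(circular_conv1d(translate(x, t), w),
                       translate(circular_conv1d(x, w), t))
    print("translation equivariance holds")

A circular shift and circular cross-correlation are used so the identity holds exactly; with zero padding, an ordinary convolution is equivariant only away from the boundary.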


Prof. Heng Tao Shen, University of Electronic Science and Technology of China

Bio: Professor Heng Tao Shen, ACM Fellow and OSA Fellow, is Dean of the School of Computer Science and Engineering and Executive Dean of the AI Research Institute at the University of Electronic Science and Technology of China (UESTC). He obtained his BSc with First Class Honours and his PhD from the Department of Computer Science at the National University of Singapore in 2000 and 2004, respectively. He was a professor at the University of Queensland before joining UESTC. His research has made contributions to the field of hashing big multimedia data, from hashing theory to algorithms and applications, and he has led the charge to address the challenging problem of cross-media understanding and retrieval. He has published 300+ peer-reviewed papers, including 110+ in IEEE/ACM Transactions and 200+ in CCF-A ranked venues. He has received 8 Best Paper Awards, including ACM Multimedia 2017, ACM SIGIR 2017, and IEEE Transactions on Multimedia 2020. He is General Co-Chair of ACM Multimedia 2021, former TPC Co-Chair of ACM Multimedia 2015, and an Associate Editor of ACM Transactions on Data Science, IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, IEEE Transactions on Knowledge and Data Engineering, Pattern Recognition, and the Journal of Software.

Talk Title: Cross-Media Intelligence

Abstract: It has been shown that heterogeneous multimedia data gathered from different sources in different media types can often be correlated and linked to the same knowledge space. On the path towards cross-media intelligence, cross-media understanding, retrieval, and interaction have attracted a huge amount of attention due to their significance in both research communities and industry. In this talk, we will introduce the state of the art on this topic and discuss its future trends.


Prof. Jun-Yan Zhu, Carnegie Mellon University

Bio: Jun-Yan Zhu is an Assistant Professor with The Robotics Institute in the School of Computer Science of Carnegie Mellon University. He also holds affiliated faculty appointments in the Computer Science Department and Machine Learning Department. Prior to joining CMU, he was a Research Scientist at Adobe Research and a postdoctoral researcher at MIT CSAIL. He obtained his Ph.D. from UC Berkeley and his B.E. from Tsinghua University. He studies computer vision, computer graphics, computational photography, and machine learning. He is the recipient of the Facebook Fellowship, the ACM SIGGRAPH Outstanding Doctoral Dissertation Award, and the UC Berkeley EECS David J. Sakrison Memorial Prize for outstanding doctoral research. His co-authored work has received the NVIDIA Pioneer Research Award and the SIGGRAPH 2019 Real-time Live! Best of Show Award and Audience Choice Award, and was named one of The 100 Greatest Innovations of 2019 by Popular Science.

Talk Title: GANs for Everyone

Abstract: The power and promise of deep generative models such as StyleGAN, CycleGAN, and GauGAN lie in their ability to synthesize endless realistic, diverse, and novel content with user controls. Unfortunately, the creation and deployment of these large-scale models demand high-performance computing platforms, large-scale annotated datasets, and sophisticated knowledge of deep learning methods. This makes the process infeasible for many visual artists, content creators, small business entrepreneurs, and everyday users.

In this talk, I describe our recent efforts to make GANs more accessible to a broad audience through improved computational and data efficiency, as well as better interfaces between humans and models. First, we reduce the inference time and model size of recent GAN models by 6-20x, allowing their easy deployment on consumer laptops and mobile devices. Second, we present a data-efficient training method that can learn a model from only one hundred photos, bypassing the necessity of collecting large-scale datasets. Finally, we introduce new human-centered model creation interfaces that allow a user to directly customize new models with minimal effort.