ICIG Summit Forum: The First CSIG Summit Forum on Annual Progress in Image and Graphics

Date & Time: 13:30-17:30, December 28, 2021
Location: Meeting Room 3

Introduction: With the upsurge of deep learning and artificial intelligence, the field of image and graphics technology has seen great progress and wide application over the past few years. To better promote the development of the field, the China Society of Image and Graphics (CSIG) solicits from its professional committees, and publishes in the Journal of Image and Graphics, the “Annual Report on the Development of Image and Graphics”. The report aims to systematically analyze the development status, cutting-edge topics and development trends of important research directions in image and graphics technology, so as to better serve the scientific and technical community and further promote the development and application of the technology. From the 15 annual reports on the development of image and graphics in 2020, this summit forum invites six speakers (on behalf of the professional committees) to share their systematic surveys and analyses of the development status, cutting-edge topics and development trends of six important research directions in the field.

Organizer: Yongfei Zhang (Beihang University)

Agenda:
13:30-14:00
Recent Advances in Video Processing and Coding
Siwei Ma

14:00-14:30
The road towards immersive experience: perception, quality assessment and compression for 360° video
Mai Xu

14:30-15:00
Recent Progress of Biometrics
Zhenan Sun

15:00-15:30
Traffic video structural analysis
Shikui Wei

15:30-16:00
Deep Learning for Scene Text Detection and Recognition
Lianwen Jin

16:00-16:30
Study on Graphical Cue Generation and Virtual Object Rendering in Virtual and Real Fusion Environment
Yue Liu

16:30-17:00
Overview of the development and application of 3D vision measurement technology
Zonghua Zhang

Title: Recent Advances in Video Processing and Coding
Speaker: Siwei Ma
Abstract: Video processing and compression are among the most fundamental research areas in multimedia computing technology. They play a significant role in bridging imaging and streaming devices with visual analysis and understanding. Today, “5G+UHD+AI” is driving a new wave of technological revolution in multimedia computing, and video processing and compression techniques face challenging changes of their own. Demand is growing for theoretical and applied research on highly efficient compact representation and high-performance real-time processing. This talk systematically reviews and analyzes cutting-edge research in video processing and compression, including video data representation and processing methods based on statistical prior models, deep-network-based video processing and compression, and the video compression standardization process. State-of-the-art methods and development trends are also comprehensively surveyed. In addition, research in these areas by the international and domestic communities is compared extensively to show the differences and similarities in the current situation. Finally, future theoretical and applied work in video processing and compression is envisioned.

Biography: Siwei Ma is currently a Boya Distinguished Professor of the Institute of Digital Media, School of Electronics Engineering and Computer Science, Peking University, Beijing, China. His research interests include video coding, processing and transmission. He has published more than 300 papers in refereed journals and conferences. He has served as an Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology and the Journal of Visual Communication and Image Representation. He is the chair of the AVS video sub-group.

Title: The road towards immersive experience: perception, quality assessment and compression for 360° video
Speaker: Mai Xu
Abstract: Immersive media is developing rapidly and is expected to dominate future multimedia services. In recent years, 360° video has gradually gained popularity and become the main carrier of immersive communication, providing an immersive viewing experience. However, 360° video covers a wide range at high resolution, resulting in explosive growth in data. Transmitting this visual data is bandwidth-hungry, creating a bottleneck between the supply of and demand for communication bandwidth. On the other hand, the human field of view when watching 360° video occupies only about 1/10 of the 360° video area, so a large amount of perceptual redundancy exists. To break through the bandwidth bottleneck of immersive communication, a number of works have been proposed on perception, quality evaluation and compression for 360° video. This report therefore covers three topics: (1) Deep imitation learning based attention models on 360° video; (2) Data-driven approaches for visual quality assessment on 360° video; (3) Perception-inspired optimization for 360° video compression.

Biography: Mai Xu is a full professor at the School of Electronic Information Engineering, Beihang University. He is also a Changjiang Scholar of the Ministry of Education (Youth program) and Deputy Director of the Youth Scholar Committee of the China Society of Image and Graphics. His research interests include video compression and image processing. In the past five years, he has published more than 100 papers in prestigious journals such as IJCV, IEEE TPAMI, TIP, JSAC and TMM, and at well-known conferences such as IEEE CVPR, ICCV, ECCV, ACM MM, AAAI and DCC. Many of these papers were selected as ESI highly cited papers or highlight papers. As PI, he has been supported by many projects, e.g., the Excellent Young Scholar Funding of the National Natural Science Foundation of China and the Distinguished Young Scholar Funding of the National Natural Science Foundation of Beijing.

Title: Recent Progress of Biometrics
Speaker: Zhenan Sun
Abstract: Biometric recognition empowers a machine to automatically detect, capture, process, analyze, and recognize digital physiological or behavioral signals with advanced intelligence. Biometrics is a typical and complex pattern recognition problem and a frontier research direction of artificial intelligence. This talk reviews the research progress, emerging directions, existing problems, and feasible ideas for the main biometric modalities, including iris, face, fingerprint, finger/palm vein, gait, voiceprint, person re-identification, etc. Three major challenges in the field of biometric recognition are analyzed: acquisition, recognition and security. In particular, security risks of biometric systems such as spoofing attacks, adversarial attacks and deepfake attacks have attracted considerable attention recently. Trustworthy biometrics is therefore a research focus for the future.

Biography: Zhenan Sun received the B.E. degree in industrial automation from the Dalian University of Technology, Dalian, China, in 1999, the M.S. degree in system engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2002, and the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China, in 2006. Since 2006, he has been a Faculty Member with CASIA. He is currently a Professor with the Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, CASIA. Dr. Sun is a Senior Member of the IEEE and an Associate Editor of the IEEE Transactions on Biometrics, Behavior, and Identity Science. He is the chair of the Technical Committee on Biometrics, International Association for Pattern Recognition (IAPR), and an IAPR Fellow. He has authored/coauthored over 300 technical papers. His current research interests include biometrics, pattern recognition, and computer vision.

Title: Traffic video structural analysis
Speaker: Shikui Wei
Abstract: Traffic video structural analysis is one of the core techniques in smart transportation. It aims to use artificial intelligence algorithms to parse unstructured traffic video data into structured semantic information. Conducting traffic video structural analysis efficiently and accurately will be a focus of research in the next few years. Traffic video structural analysis includes vehicle structural analysis, personnel structural analysis, and behavior analysis. In this talk, we discuss the related work in detail from these three aspects, i.e., vehicle, personnel, and behavior analysis. Moreover, we summarize these research works and suggest some reasonable directions for future work.

Biography: Shikui Wei is a professor at the Institute of Information Science at Beijing Jiaotong University. He received the Ph.D. degree from the Institute of Information Science at Beijing Jiaotong University. His research interests include computer vision, multimedia content analysis, and machine learning. He has published more than 80 research papers in academic journals and conferences, such as IEEE TKDE, IEEE TIP, CVPR, AAAI, and ACM MM. He has won an excellent doctoral dissertation award from the China Computer Federation (CCF).

Title: Deep Learning for Scene Text Detection and Recognition
Speaker: Lianwen Jin
Abstract: As one of the most fundamental and influential inventions of humanity, text has played an important role in human life. The rich and precise semantic information carried by text is important in a wide range of vision-based application scenarios, such as image search, image understanding, industrial automation, information security, instant translation, intelligent finance, robot navigation and so on. In recent years, with the fast development of deep learning theory and technology, optical character recognition (OCR) has achieved great progress in many aspects, such as unconstrained handwritten text recognition, camera-based printed document analysis and recognition, and scene text detection and recognition in the wild. In this talk, I will briefly introduce the state of the art of various deep learning methods in the field of OCR, and specifically introduce the main research progress in scene text detection and recognition. I will also discuss some unsolved problems, new research topics and challenges, and future research trends.

Biography: Lianwen Jin received the B.S. degree from the University of Science and Technology of China, Anhui, China, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 1991 and 1996, respectively. He is currently a Professor with the School of Electronic and Information Engineering, South China University of Technology. He is the author of more than 200 scientific papers. He has served as AC/SPC/PC member for many international conferences, including ICDAR, ICFHR, ICPR, CVPR, ICCV, IJCAI, AAAI, etc. Dr. Jin was a recipient of the award of New Century Excellent Talent Program of MOE in 2006 and the Guangdong Pearl River Distinguished Professor Award in 2011. His research interests include optical character recognition, handwriting analysis and recognition, machine learning, deep learning, and computer vision.

Title: Study on Graphical Cue Generation and Virtual Object Rendering in Virtual and Real Fusion Environment
Speaker: Yue Liu
Abstract: Mixed reality systems can provide a virtual and real fusion environment in which virtual objects are added to the real world in real time. However, several problems remain: the lack of visual principles and perception theories that can guide the rendering of virtual and real fusion scenes, the lack of graphical cues that can provide absolute depth information, and the lack of rendering features for virtual objects. In this talk I will present a review of, as well as our research results on, graphical cue generation and virtual object rendering in virtual and real fusion environments.

Biography: Yue Liu received his Ph.D. degree in Telecommunication and Information System from Jilin University in 2000. He is currently a professor of optics at Beijing Institute of Technology (BIT). He serves as the deputy director of the Beijing Engineering Research Center of Mixed Reality and Advanced Display. His research interests include computer vision, human-computer interaction, 3D display, image processing, and virtual reality (VR) and augmented reality (AR) technologies and applications. Dr. Liu serves as an expert on culture technology for the National Key Research and Development Program of China. He is also deputy secretary general and a member of the board of directors of the China Society of Image and Graphics (CSIG).

Title: Overview of the development and application of 3D vision measurement technology
Speaker: Zonghua Zhang
Abstract: 3D vision measurement is a new and advanced technology combining computer vision and precision measurement. It is a basic support of Industry 4.0 and a core technology of the advanced manufacturing industry, which is characterized by networked and intelligent manufacturing. After decades of development, 3D vision measurement technology has advanced rapidly in both basic and applied research. It has formed a relatively complete system of four parts: theoretical method, technical process, system development and product application. 3D vision measurement technology shows a trend towards systematic theory, multi-dimensional methods, high precision and high speed, and has become an indispensable technology for intelligent manufacturing process control, product quality inspection and assurance, and in-service testing of complete equipment. This report focuses on typical 3D vision measurement technologies such as single-camera, double-camera and structured-light methods, briefly introduces the key technologies, and summarizes their development status, frontier trends and hot issues. The 3D measurement techniques of fringe projection and phase measuring deflectometry are discussed in particular. Finally, the development trend and future prospects of 3D vision measurement are given.

Biography: Dr. Zonghua Zhang is a full professor at Hebei University of Technology, China, and a visiting professor at the University of Huddersfield, UK. He obtained his Ph.D. degree from Tianjin University, China, in 2001. He has worked at Ruhr University Bochum in Germany, Queen’s University in Canada, and Heriot-Watt University, the University of Leeds, and the University of Huddersfield in the UK. His main research interests include 3D optical measurement, fringe projection profilometry, and phase measuring deflectometry. From 2016 to July 2018, supported by an EU2020 Marie-Curie Individual Fellowship, he studied direct phase measuring deflectometry at the University of Huddersfield, UK. He has authored more than 190 papers and 40 patents. He is a Marie Sklodowska-Curie Individual Fellow and a recipient of the New Century Excellent Talents in University program supported by the Ministry of Education of China. He is currently an Associate Editor for Optics Express.