A Closer Look at Few-shot Image Generation
Yunqing Zhao
Henghui Ding
Houjing Huang
Ngai-Man Cheung
Singapore University of Technology and Design (SUTD)
ETH Zürich      ByteDance Ltd.
[Paper]
[GitHub]
[Data Repository]

Abstract

Modern GANs excel at generating high-quality and diverse images. However, when transferring a pretrained GAN to small target data (e.g., 10-shot), the generator tends to replicate the training samples. Several methods have been proposed to address this few-shot image generation task, but there has been little effort to analyze them under a unified framework. As our first contribution, we propose a framework to analyze existing methods during adaptation. Our analysis discovers that while some methods place a disproportionate focus on diversity preservation, which impedes quality improvement, all methods achieve similar quality after convergence. Therefore, the better methods are those that slow down diversity degradation. Furthermore, our analysis reveals that there is still plenty of room to further slow down diversity degradation. Informed by our analysis, and to slow down the diversity degradation of the target generator during adaptation, our second contribution proposes to apply mutual information (MI) maximization to retain the source domain’s rich multi-level diversity information in the target generator. We perform MI maximization via contrastive loss (CL), leveraging the generator and discriminator as two feature encoders to extract multi-level features for computing the CL. We refer to our method as Dual Contrastive Learning (DCL). Extensive experiments on several public datasets show that, while slowing down diversity degradation during adaptation, our proposed DCL achieves visually pleasing quality and state-of-the-art quantitative performance.
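For context on the MI-maximization idea above: contrastive losses of the InfoNCE family are known to lower-bound mutual information (van den Oord et al., 2018). In our setting, where $(x_i, y_i)$ denote source and target images generated from the same latent code, a standard form of the bound is as follows (the critic $f$ and temperature $\tau$ are generic placeholders, not the paper's exact instantiation):

$$
I(X;Y) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}}, \qquad
\mathcal{L}_{\mathrm{NCE}} = -\,\mathbb{E}\!\left[\log \frac{\exp\big(f(x_i, y_i)/\tau\big)}{\sum_{j=1}^{N} \exp\big(f(x_i, y_j)/\tau\big)}\right],
$$

so minimizing the contrastive loss over a batch of $N$ latent codes maximizes a lower bound on the MI between the two generators' outputs.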



Overview and Our Contributions

- 1: We tackle few-shot image generation (FSIG) by adapting a pretrained source GAN to a small target domain.
- 2: Our work makes two main contributions:

  • We discover that existing FSIG methods achieve high quality of generated images during adaptation. However, while different methods can achieve similar quality on the target domain, their rates of diversity degradation vary drastically. These observations are shown in the figure below.
  • We propose a new method that slows down diversity degradation during adaptation and achieves state-of-the-art FSIG performance.
- 3: Schematic diagram of our method: We propose Dual Contrastive Learning (DCL), a mutual-information-based method that maximizes the mutual information between source and target images generated from the same latent code; a minimal code sketch is given after this list.
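Below is a minimal PyTorch-style sketch of the dual contrastive idea, not the official implementation: the helpers `info_nce`, `feat_G`, `feat_D`, `G_src`, `G_tgt`, and the temperature `tau` are illustrative assumptions (see the GitHub repository for the actual code).

```python
# Minimal sketch of a dual contrastive loss (assumed PyTorch implementation).
# feat_G / feat_D stand in for feature extractors built on the frozen source
# generator and the discriminator; both are hypothetical helper callables.
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, tau=0.07):
    """InfoNCE over a batch: anchor[i] should match positive[i];
    positive[j] (j != i) serve as negatives. Both inputs: shape (B, D)."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / tau                  # (B, B) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)

def dual_contrastive_loss(z, G_src, G_tgt, feat_G, feat_D, tau=0.07):
    """For each latent code z[i], the source image G_src(z[i]) is the positive
    for the target image G_tgt(z[i]); source images from other latent codes act
    as negatives, encouraging the adapted generator to keep the source domain's
    sample-to-sample variation (diversity)."""
    with torch.no_grad():
        x_src = G_src(z)          # frozen source generator
    x_tgt = G_tgt(z)              # target generator being adapted
    loss_g = info_nce(feat_G(x_tgt), feat_G(x_src), tau)  # generator features
    loss_d = info_nce(feat_D(x_tgt), feat_D(x_src), tau)  # discriminator features
    return loss_g + loss_d
```

In practice, a term like this would be added to the adversarial objective during adaptation; the exact feature levels and weighting are design choices detailed in the paper.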

Experimental Results

Comparison with other methods for 10-shot adaptation on FFHQ-Babies and Sketches. Additional results are provided in the Supplement.

Left: Transferring a GAN pretrained on FFHQ to 10-shot target samples. We fix the noise input for each column to observe the correspondence between generated images before and after adaptation. Middle: We observe that most existing methods lose diversity quickly, before their quality improvement converges, and tend to replicate the training data. Our method, in contrast, slows down the loss of diversity and preserves more details. For example, in the red frames (upper), the hairstyle and hat are better preserved; in the pink frames (bottom), the smile and teeth are inherited from the source domain. Our method also outperforms the others in the quantitative evaluation (Right).



Paper Information

Yunqing Zhao, Henghui Ding, Houjing Huang, and Ngai-Man Cheung.
A Closer Look at Few-shot Image Generation.
In CVPR, 2022.
(hosted on arXiv)


If you find our work useful in your research, please consider citing our paper: [BibTeX]


Acknowledgements

This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.