Few-shot Image Generation via
Adaptation-Aware Kernel Modulation

Singapore University of Technology and Design (SUTD)

Abstract

Few-shot image generation (FSIG) aims to learn to generate new and diverse samples given an extremely limited number of samples from a domain, e.g., 10 training samples. Recent work has addressed the problem with a transfer-learning approach: leverage a GAN pretrained on a large-scale source-domain dataset and adapt it to the target domain using very limited target-domain samples. Central to recent FSIG methods are knowledge-preserving criteria, which aim to select the subset of the source model's knowledge to be preserved in the adapted model.

However, a major limitation of existing methods is that their knowledge-preserving criteria consider only the source domain/source task and ignore the target domain/adaptation task when selecting the source model's knowledge, casting doubt on their suitability for setups with different degrees of proximity between the source and target domains. Our work makes two contributions. As our first contribution, we revisit recent FSIG works and their experiments. Our important finding is that, under setups in which the assumption of close proximity between the source and target domains is relaxed, existing state-of-the-art (SOTA) methods that consider only the source domain/source task in knowledge preservation perform no better than a baseline fine-tuning method. To address this limitation, as our second contribution, we propose adaptation-aware kernel modulation to handle general FSIG across source-target domains of different proximity. Extensive experimental results show that the proposed method consistently achieves SOTA performance across source/target domains of different proximity, including challenging setups where the source and target domains are further apart.
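For intuition, here is a minimal PyTorch sketch of kernel modulation. This is not the authors' released code: the ModulatedConv2d name and the simplified per-output-channel scaling are our own illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulatedConv2d(nn.Module):
    """Wraps a pretrained conv layer: the source-domain kernel is frozen,
    and a small learnable modulation tensor rescales it for the target."""
    def __init__(self, base_conv: nn.Conv2d):
        super().__init__()
        # Freeze the pretrained (source) kernel to preserve its knowledge.
        self.weight = nn.Parameter(base_conv.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = (nn.Parameter(base_conv.bias.detach().clone())
                     if base_conv.bias is not None else None)
        self.stride, self.padding = base_conv.stride, base_conv.padding
        # One modulation scalar per output-channel kernel, initialized to 1
        # so adaptation starts exactly from the source model's behavior.
        self.modulation = nn.Parameter(torch.ones(base_conv.out_channels))

    def forward(self, x):
        # Rescale each kernel by its learned modulation before convolving.
        w = self.weight * self.modulation.view(-1, 1, 1, 1)
        return F.conv2d(x, w, self.bias, self.stride, self.padding)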

Overview

Contributions

1: We consider the problem of FSIG with transfer learning using very limited target samples (e.g., 10-shot).
2: Our work makes two contributions:
  • We discover that when the close-proximity assumption between the source and target domains is relaxed, SOTA FSIG methods, e.g., EWC (Li et al.), CDC (Ojha et al.), and DCL (Zhao et al.), which consider only the source domain/source task in knowledge preservation, perform no better than a baseline fine-tuning method, e.g., TGAN (Wang et al.).
  • We propose a novel adaptation-aware kernel modulation for FSIG that achieves SOTA performance across source / target domains with different proximity.
3: Schematic diagram of our proposed Importance Probing mechanism: we measure the importance of each kernel for the target domain after probing and preserve the source-domain knowledge that is important for target-domain adaptation; the same operations are applied to the discriminator. A code sketch of this probing step is given below.
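Below is a minimal sketch of how such an importance-probing step could be implemented, using accumulated squared gradients of the modulation parameters as a Fisher-information proxy; the function name, interface, and step count are hypothetical.

import torch

def probe_kernel_importance(model, batches, compute_loss, steps=500):
    # Collect the modulation parameter attached to each kernel.
    mod_params = {n: p for n, p in model.named_parameters()
                  if 'modulation' in n}
    fisher = {n: torch.zeros_like(p) for n, p in mod_params.items()}
    seen = 0
    for batch in batches:
        if seen >= steps:
            break
        model.zero_grad()
        compute_loss(model, batch).backward()  # short probing/adaptation run
        for n, p in mod_params.items():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2  # squared-gradient proxy
        seen += 1
    # Higher average importance means the kernel matters more for the
    # target domain; those kernels' source knowledge is then preserved.
    return {n: f / max(seen, 1) for n, f in fisher.items()}

Kernels whose importance exceeds a chosen threshold (e.g., a quantile) would then be modulated, while the remaining kernels are fine-tuned normally.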

Experiment Results

Qualitative and quantitative comparison of 10-shot image generation with different FSIG methods. Images in each column are generated from the same noise input (last row). Left: 10-shot real target samples for adaptation. Mid, Right: For a target domain in close proximity to the source (e.g., Babies, top), our method generates high-quality images with more refined details and greater diversity, achieving the best FID and Intra-LPIPS scores. For a distant target domain (e.g., Cat, bottom), TGAN/FreezeD overfit to the 10-shot samples and the other methods fail. In contrast, our method preserves meaningful semantic features at different levels (e.g., posture and color) from the source, achieving a good trade-off between quality and diversity. In particular, our Intra-LPIPS approaches that of EWC, while our generated images have much better quality, both qualitatively and quantitatively.
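For reference, here is one possible implementation of the Intra-LPIPS diversity metric, following the cluster-based protocol of Ojha et al.; the lpips package, the VGG backbone, and the simplified batching are our assumptions, and images are expected as NCHW tensors in [-1, 1].

import torch
import lpips  # pip install lpips

@torch.no_grad()
def intra_lpips(generated, shots, net='vgg'):
    metric = lpips.LPIPS(net=net)
    # Assign each generated image to its LPIPS-nearest few-shot sample.
    dists = torch.stack(
        [metric(generated, s.unsqueeze(0).expand_as(generated)).flatten()
         for s in shots], dim=1)                     # [N, k]
    assign = dists.argmin(dim=1)
    cluster_scores = []
    for c in range(len(shots)):
        members = generated[assign == c]
        # Average pairwise LPIPS within the cluster (needs >= 2 members).
        pairs = [metric(members[i:i + 1], members[j:j + 1]).item()
                 for i in range(len(members))
                 for j in range(i + 1, len(members))]
        if pairs:
            cluster_scores.append(sum(pairs) / len(pairs))
    # Higher intra-LPIPS indicates more diverse generations per cluster.
    return sum(cluster_scores) / len(cluster_scores)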

BibTeX

If you find our research useful in your work, please consider citing our paper:

@inproceedings{zhao2022fewshot,
  title     = {Few-shot Image Generation via Adaptation-Aware Kernel Modulation},
  author    = {Zhao, Yunqing and Chandrasegaran, Keshigeyan and Abdollahzadeh, Milad and Cheung, Ngai-Man},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
  year      = {2022},
  url       = {https://openreview.net/forum?id=Z5SE9PiAO4t}
}

Meanwhile, we also present related research on few-shot image generation via Removing In-Compatible Knowledge (RICK, CVPR 2023), which fine-tunes pretrained GANs for few-shot transfer:

@inproceedings{zhao2023exploring,
  title     = {Exploring Incompatible Knowledge Transfer in Few-shot Image Generation},
  author    = {Zhao, Yunqing and Du, Chao and Abdollahzadeh, Milad and Pang, Tianyu and Lin, Min and Yan, Shuicheng and Cheung, Ngai-Man},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {7380--7391},
  year      = {2023}
}