Masked Generative Adversarial Networks are Data-Efficient Generation Learners
Advances in Neural Information Processing Systems
This paper shows that masked generative adversarial networks (MaskedGAN) are robust image generation learners with limited training data. The idea of MaskedGAN is simple: it randomly masks out certain image information for effective GAN training with limited data. We develop two masking strategies that work along orthogonal dimensions of training images: a shifted spatial masking that masks the images in spatial dimensions with random shifts, and a balanced spectral masking that masks certain image spectral bands with self-adaptive probabilities. The two masking strategies complement each other and together encourage more challenging holistic learning from limited training data, ultimately suppressing trivial solutions and failures in GAN training. Although MaskedGAN is simple, extensive experiments show that it achieves superior performance consistently across different network architectures (e.g., CNNs including BigGAN and StyleGAN-v2, and Transformers including TransGAN and GANformer) and datasets (e.g., CIFAR-10, CIFAR-100, ImageNet, 100-shot, AFHQ, FFHQ and Cityscapes).
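To make the two masking dimensions concrete, here is a minimal NumPy sketch. It is an illustration only, not the authors' implementation. The function names, the patch size, the radial band layout, and the fixed `keep_probs` are all assumptions made for this example. In particular, the self-adaptive probabilities of the balanced spectral masking are not modeled here, and a fixed per-band keep probability stands in for them.

```python
import numpy as np

def shifted_spatial_mask(h, w, patch=8, keep_prob=0.5, rng=None):
    """Binary patch-grid mask whose grid origin is randomly shifted (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # Random grid offset; uniform in [0, patch) is an assumption of this sketch.
    dy, dx = rng.integers(0, patch, size=2)
    # Two extra patches per axis so the shifted crop always stays inside the grid.
    gh, gw = h // patch + 2, w // patch + 2
    grid = (rng.random((gh, gw)) < keep_prob).astype(np.float32)
    # Expand each grid cell to a patch x patch block, then crop at the shift.
    full = np.kron(grid, np.ones((patch, patch), dtype=np.float32))
    return full[dy:dy + h, dx:dx + w]

def spectral_band_mask(h, w, n_bands=4, keep_probs=None, rng=None):
    """Mask over the centered 2D spectrum; each radial band is kept or dropped as a whole."""
    rng = np.random.default_rng() if rng is None else rng
    # Fixed probabilities are a placeholder for the paper's self-adaptive ones.
    keep_probs = np.full(n_bands, 0.5) if keep_probs is None else np.asarray(keep_probs)
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    # Map every frequency to a radial band index in [0, n_bands).
    band = np.minimum((radius / radius.max() * n_bands).astype(int), n_bands - 1)
    keep = rng.random(n_bands) < keep_probs
    return keep[band].astype(np.float32)

def apply_masks(img, spatial, spectral):
    """Apply the spectral mask in the Fourier domain, then the spatial mask in pixel space.

    img is a float array of shape (H, W) or (H, W, C).
    """
    spec = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    if img.ndim == 3:
        # Broadcast both masks over the channel axis.
        spectral = spectral[..., None]
        spatial = spatial[..., None]
    filtered = np.fft.ifft2(
        np.fft.ifftshift(spec * spectral, axes=(0, 1)), axes=(0, 1)
    ).real
    return filtered * spatial
```

For a 32x32 RGB image `img`, a call such as `apply_masks(img, shifted_spatial_mask(32, 32), spectral_band_mask(32, 32))` returns an array of the same shape with random patches zeroed and random frequency bands removed. Where in the GAN training loop such masks are applied is not stated in the abstract and is left open here.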
J. Huang et al., "Masked Generative Adversarial Networks are Data-Efficient Generation Learners", in 36th Conf. on Neural Info. Processing Systems (NeurIPS 2022), in Advances in Neural Information Processing Systems, vol. 35, 2022. Available: https://proceedings.neurips.cc/paper_files/paper/2022/hash/0efcb1885b8534109f95ca82a5319d25-Abstract-Conference.html