Computer Vision Faculty Publications

Create Your World: Lifelong Text-to-Image Diffusion

Gan Sun, South China University of Technology
Wenqi Liang, Shenyang Institute of Automation Chinese Academy of Sciences
Jiahua Dong, Mohamed Bin Zayed University of Artificial Intelligence
Jun Li, Nanjing University of Science and Technology
Zhengming Ding, Tulane University School of Science and Engineering
Yang Cong, South China University of Technology

Document Type

Article

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract

Text-to-image generative models can produce diverse, high-quality images of concepts from a text prompt and have demonstrated excellent ability in image generation, image translation, etc. In this work, we study the problem of synthesizing instantiations of a user's own concepts in a never-ending manner, i.e., create your world, where new concepts from the user are quickly learned from a few examples. To achieve this goal, we propose a Lifelong text-to-image Diffusion Model (L$^{2}$DM), which aims to overcome knowledge “catastrophic forgetting” of previously encountered concepts and semantic “catastrophic neglecting” of one or more concepts in the text prompt. With respect to knowledge “catastrophic forgetting”, our L$^{2}$DM framework devises a task-aware memory enhancement module and an elastic-concept distillation module, which respectively safeguard the knowledge of prior concepts and of each past personalized concept. When generating images from a user text prompt, semantic “catastrophic neglecting” is addressed by a concept attention artist module, which alleviates semantic neglecting at the concept level, and an orthogonal attention module, which reduces semantic binding at the attribute level. As a result, our model generates more faithful images across a range of continual text prompts, in terms of both qualitative and quantitative metrics, than related state-of-the-art models. The code will be released at https://wenqiliang.github.io/.

DOI

10.1109/TPAMI.2024.3382753

Publication Date

1-1-2024

Keywords

Computational modeling, Continual Learning, Dogs, Electronic mail, Image Generation, Lifelong Machine Learning, Neural networks, Semantics, Stable Diffusion, Task analysis, Training

Recommended Citation

G. Sun et al., "Create Your World: Lifelong Text-to-Image Diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, Jan 2024.

The definitive version is available at https://doi.org/10.1109/TPAMI.2024.3382753

This document is currently not available here.
