Computer Vision Faculty Publications

DoodleFormer: Creative Sketch Drawing with Transformers

Ankan Kumar Bhunia, Mohamed bin Zayed University of Artificial Intelligence
Salman Khan, Mohamed bin Zayed University of Artificial Intelligence & Australian National University
Hisham Cholakkal, Mohamed bin Zayed University of Artificial Intelligence
Rao Muhammad Anwer, Mohamed bin Zayed University of Artificial Intelligence & Aalto University
Fahad Shahbaz Khan, Mohamed bin Zayed University of Artificial Intelligence & Linköping University
Jorma Laaksonen, Aalto University
Michael Felsberg, Linköping University

Document Type

Conference Proceeding

Publication Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of coarse sketch composition followed by the incorporation of fine details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in Fréchet inception distance (FID) over the state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for related applications of text to creative sketch generation, sketch completion and house layout generation. Code is available at: https://github.com/ankanbhunia/doodleformer. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

First Page

338

Last Page

355

DOI

10.1007/978-3-031-19790-1_21

Publication Date

10-24-2022

Keywords

Body parts, Coarse to fine, Creatives, Global dynamics, Image generations, Sketchings, State of the art, Vision problems, Visual objects, Visual world, Drawing (graphics), Computer Vision and Pattern Recognition (cs.CV), Graphics (cs.GR)

Comments

IR conditions: non-described

Recommended Citation

A.K. Bhunia, et al., "DoodleFormer: Creative sketch drawing with transformers", in Computer Vision (ECCV 2022), Lecture Notes in Computer Science, vol. 13677, pp. 338-355, October 2022, doi: 10.1007/978-3-031-19790-1_21

Additional Links

Preprint version available at arXiv: https://arxiv.org/abs/2112.03258
