MMM: Manga Motion Model

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Hao Li

Second Advisor

Prof. Timothy Baldwin

Abstract

Manga motion is an emerging field in animation that integrates the distinctive style of Japanese manga into animated sequences. Our interest in the artistry and narrative depth of manga motivated this research. Because no dedicated manga motion dataset exists, and such a dataset is crucial for training machine learning models, we curated our own. We collected it by downloading a series of animation videos from \href{https://youtu.be/I5lLQza7Mew}{YouTube}. We then processed the videos manually to ensure precision, using DaVinci Resolve \cite{blackmagic} to center the desired motion and saving the result at a high resolution of 1080×1080 pixels. We developed a specialized Manga Dedicated Adapter, a novel model designed to replicate the dynamic and stylistic movements typical of manga characters. The adapter generates manga-like motion from textual descriptions, bridging the gap between static manga panels and full-motion video. We demonstrate that our Manga Motion Adapter preserves the stylistic consistency of the manga genre without compromising the original model's structural integrity. The adapter is a simple sequence of a convolution layer, a transformer, and a second convolution layer, integrated into the AnimateDiff architecture \cite{animatediff}, and it shows promising results. Furthermore, our method captures the essence of manga motion and its domain more effectively than previous methods such as AnimateDiff \cite{animatediff} and Text2Video-Zero \cite{khachatryan2023text2videozero}.
As outlined in Section \ref{sec:Future_work} on future work, we show that our method is also applicable to static images, although training a LoRA \cite{DBLP:journals/corr/abs-2106-09685} or DreamBooth \cite{ruiz2023dreambooth} model for each image may not be feasible due to computational demands. We are actively exploring more viable and scalable solutions for future developments in manga motion animation.

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Hao Li, Timothy Baldwin

Online access available for MBZUAI patrons
