Natural Language Processing Faculty Publications

ArTST: Arabic Text and Speech Transformer

Hawau Olamide Toyin, Mohamed bin Zayed University of Artificial Intelligence
Amirbek Djanibekov, Mohamed bin Zayed University of Artificial Intelligence
Ajinkya Kulkarni, Mohamed bin Zayed University of Artificial Intelligence
Hanan Aldarmaki, Mohamed bin Zayed University of Artificial Intelligence

Document Type

Article

Publication Title

arXiv

Abstract

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language. The model architecture follows SpeechT5, the unified-modal framework recently released for English, and focuses on Modern Standard Arabic (MSA), with plans to extend the model to dialectal and code-switched Arabic in future editions. We pre-trained the model from scratch on MSA speech and text data and fine-tuned it for three tasks: Automatic Speech Recognition (ASR), Text-To-Speech synthesis (TTS), and spoken dialect identification. In our experiments comparing ArTST with SpeechT5, as well as with previously reported results on these tasks, ArTST performs on a par with or exceeds the current state of the art in all three tasks. Moreover, we find that our pre-training is conducive to generalization, which is particularly evident in the low-resource TTS task. The pre-trained model and the fine-tuned ASR and TTS models are released for research use. Copyright © 2023, The Authors. All rights reserved.

DOI

10.48550/arXiv.2310.16621

Publication Date

10-25-2023

Keywords

Arabic languages, Arabic speech, Arabic texts, Automatic speech recognition, Modeling architecture, Modern standards, Open-source, Speech data, Speech technology, Standard arabics

Comments

Preprint: arXiv

Archived with thanks to arXiv

Uploaded 30 November 2023

Recommended Citation

H.O. Toyin, A. Djanibekov, A. Kulkarni, and H. Aldarmaki, "ArTST: Arabic Text and Speech Transformer", arXiv, Oct 2023. doi:10.48550/arXiv.2310.16621

Additional Links

arXiv Link: https://doi.org/10.48550/arXiv.2310.16621

Included in

Artificial Intelligence and Robotics Commons
