Building Automatic Speech Recognition Models for Emirati Dialect

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Natural Language Processing

Department

Natural Language Processing

First Advisor

Hanan Aldarmaki

Second Advisor

Muhammad Abdul-Mageed

Abstract

This thesis explores the development of Automatic Speech Recognition (ASR) models for the Emirati dialect, a dialect that, until now, has seen limited representation in speech technology applications. With the Emirati dialect’s distinct phonological and linguistic traits, the study aims to fill a significant gap in the current ASR landscape by creating systems that can accurately recognize and transcribe speech from this specific dialectal group. This work is driven by the potential for such technologies to revolutionize interactions with digital devices and services for Emirati Arabic speakers, from improving accessibility to enhancing communication and information retrieval. Central to the research effort was the compilation of a comprehensive dataset representative of the Emirati dialect’s lexical diversity, which involved meticulous collection and annotation processes. The development approach spanned the creation of detailed lexicon dictionaries and the application of both traditional and advanced modeling techniques, including Hidden Markov Models (HMM), Gaussian Mixture Models (GMM), and state-of-the-art deep learning methods. Special attention was given to dialect-specific features and the challenge of dialectal variability, addressing both through innovative model training strategies and the integration of dialect-aware components into the ASR systems. Comparative analysis across different ASR models highlights the enhanced performance achieved through these tailored approaches, marking a significant advancement in the recognition accuracy of the Emirati dialect. The findings of this study not only contribute valuable insights into the field of computational linguistics and speech processing for Arabic dialects but also pave the way for the development of more inclusive and effective ASR technologies.

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc. degree in Natural Language Processing

Advisors: Hanan Aldarmaki, Muhammad Abdul-Mageed

Online access available for MBZUAI patrons
