Student Publications

The Employability of End-to-End Automatic Speech Recognition on Impaired Speech: An Investigation

Karima Kadaoui, Mohamed bin Zayed University of Artificial Intelligence

Document Type

Dissertation

Abstract

Dysarthria is a speech impairment that prevents patients from interacting with their surroundings and engaging with others. Dysarthric individuals could benefit from Automatic Speech Recognition (ASR) systems, but adoption is hindered by these systems’ low accuracy on impaired speech, which stems from high speech variability and the scarcity of data. Although the current state-of-the-art (SOTA) results in the field are achieved by hybrid ASR systems (around 22% word error rate (WER)), these models are outperformed by end-to-end systems on healthy speech. We therefore investigate the applicability of several end-to-end deep neural networks (DNNs) to impaired speech. We conduct various experiments on the UASpeech dataset to gauge the suitability of different models for this objective. The Conformer CTC and Jasper models achieve 47.54% and 46.9% WER, respectively, without the use of an external language model (LM). We highlight their advantages and disadvantages, and we believe that with additional techniques similar to those currently used on hybrid models, these architectures could greatly challenge their hybrid counterparts.
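For readers unfamiliar with the metric, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate`, the whitespace tokenization, and the sample sentences are assumptions made for illustration and are not taken from the thesis or its evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and applies no text
    normalization (casing, punctuation), which real evaluations usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a six-word reference.
    score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {score:.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.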

Publication Date

12-30-2022

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc degree in Machine Learning

Advisors: Dr. Mohammad Yaqub, Dr. Shady Shehata

2 years embargo period

Recommended Citation

K. Kadaoui, "The Employability of End-to-End Automatic Speech Recognition on Impaired Speech: An Investigation", M.S. Thesis, Machine Learning, MBZUAI, Abu Dhabi, UAE, 2022.

This document is currently not available here.
