AAD-Net: Advanced end-to-end signal processing system for human emotion detection & recognition using attention-based deep echo state network

Publication Title

Knowledge-Based Systems


Speech signals are the most convenient means of communication between human beings and an essential medium for Human–Computer Interaction (HCI) to exchange emotions and information. Recognizing emotions from speech signals is a challenging task due to the sparse nature of emotional data and features. In this article, we propose a Deep Echo State Network (DeepESN) system for emotion recognition that uses a dilated convolutional neural network and a multi-headed attention mechanism. To reduce model complexity, we incorporate a DeepESN that exploits reservoir computing for higher-dimensional mapping. We also use fine-tuned Sparse Random Projection (SRP) to reduce dimensionality, adopt an early-fusion strategy to fuse the extracted cues, and pass the joint feature vector through a classification layer to recognize emotions. The proposed model is evaluated on two public speech corpora, EMO-DB and RAVDESS, under both speaker-dependent and speaker-independent protocols. The results show that the proposed system achieves high recognition rates: 91.14% and 85.57% on EMO-DB, and 82.01% and 77.02% on RAVDESS, for the speaker-dependent and speaker-independent experiments, respectively. The proposed system outperforms the State-of-The-Art (SOTA) while requiring less computational time.
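The SRP dimensionality reduction and early-fusion steps mentioned in the abstract can be sketched as follows. This is only a minimal illustration: the feature dimensions, variable names, and projection size are hypothetical placeholders, not values from the paper.

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

rng = np.random.default_rng(0)

# Hypothetical extracted cues for 32 utterances (dims are placeholders,
# standing in for the dilated-CNN and attention feature streams).
cnn_feats = rng.standard_normal((32, 1024))
attn_feats = rng.standard_normal((32, 512))

# Sparse Random Projection: map the high-dimensional CNN cues to a
# lower-dimensional space while approximately preserving distances.
srp = SparseRandomProjection(n_components=128, random_state=0)
cnn_reduced = srp.fit_transform(cnn_feats)

# Early fusion: concatenate the cues into one joint feature vector
# that would then be passed to a classification layer.
joint = np.concatenate([cnn_reduced, attn_feats], axis=1)
print(joint.shape)  # (32, 640)
```

The joint vector here is simply the concatenation of the two streams; in the paper's pipeline this fused representation is what the classifier consumes.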




Keywords: Affective computing, Attention mechanism, Audio speech signals, Convolution neural network, Echo state networks, Emotion recognition, Human–computer interaction


IR Deposit conditions:

OA version (pathway c) Accepted version

24-month embargo

License: CC BY-NC-ND 4.0

Must link to publisher version with DOI