AAD-Net: Advanced end-to-end signal processing system for human emotion detection & recognition using attention-based deep echo state network
Document Type
Article
Publication Title
Knowledge-Based Systems
Abstract
Speech signals are the most convenient way for humans to communicate and, ultimately, a method of Human–Computer Interaction (HCI) for exchanging emotions and information. Recognizing emotions from speech signals is challenging because emotional data and features are sparse. In this article, we propose a Deep Echo-State-Network (DeepESN) system for emotion recognition that incorporates a dilated convolutional neural network and a multi-headed attention mechanism. To reduce model complexity, the DeepESN uses reservoir computing for higher-dimensional mapping. We also use a fine-tuned Sparse Random Projection (SRP) to reduce dimensionality, adopt an early fusion strategy to fuse the extracted cues, and pass the joint feature vector through a classification layer to recognize emotions. The proposed model is evaluated on two public speech corpora, EMO-DB and RAVDESS, in both speaker-dependent and speaker-independent settings. It achieves recognition rates of 91.14 and 85.57 on EMO-DB and 82.01 and 77.02 on RAVDESS in the speaker-dependent and speaker-independent experiments, respectively. The proposed system outperforms the State-of-The-Art (SOTA) while requiring less computational time.
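The dimensionality-reduction and fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors, their sizes, the target dimension, the density rule (1/sqrt of the input size), and the choice to project each stream before fusing are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random


def sparse_projection_matrix(n_features, n_components, density=None, seed=0):
    """Build a sparse random projection matrix.

    Each entry is 0 with probability 1 - density, and
    +/- sqrt(1 / (density * n_components)) with probability density / 2 each.
    The default density 1 / sqrt(n_features) is an assumption, not a value
    taken from the article.
    """
    if density is None:
        density = 1.0 / math.sqrt(n_features)
    rng = random.Random(seed)
    scale = math.sqrt(1.0 / (density * n_components))
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < density / 2:
                row.append(-scale)
            elif u < density:
                row.append(scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(matrix, vector):
    """Map a feature vector to the lower-dimensional space (matrix @ vector)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def early_fusion(*feature_vectors):
    """Early fusion here means concatenating feature vectors into one vector."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


# Hypothetical stand-ins for the two feature streams (values are made up).
conv_features = [0.1 * i for i in range(16)]
reservoir_features = [math.sin(i) for i in range(16)]

# Reduce each 16-dimensional stream to 8 dimensions, then fuse.
conv_matrix = sparse_projection_matrix(16, 8, seed=1)
res_matrix = sparse_projection_matrix(16, 8, seed=2)
reduced_conv = project(conv_matrix, conv_features)
reduced_res = project(res_matrix, reservoir_features)
fused = early_fusion(reduced_conv, reduced_res)

print(len(fused))
```

In this sketch the fused vector would then be passed to a classifier; the classification layer itself is omitted because the abstract does not describe it.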
DOI
10.1016/j.knosys.2023.110525
Publication Date
6-21-2023
Keywords
Affective computing, Attention mechanism, Audio speech signals, Convolution neural network, Echo state networks, Emotion recognition, Human–computer interaction
Recommended Citation
M. Khan et al., "AAD-Net: Advanced end-to-end signal processing system for human emotion detection & recognition using attention-based deep echo state network," Knowledge-Based Systems, vol. 270, Jun 2023.
The definitive version is available at https://doi.org/10.1016/j.knosys.2023.110525
Comments
IR Deposit conditions:
OA version (pathway c) Accepted version
24-month embargo
License: CC BY-NC-ND 4.0
Must link to publisher version with DOI