Proceedings of International Conference on Applied Innovation in IT
2025/08/29, Volume 13, Issue 4, pp. 135-141
A Hybrid Spiking-Attention Transformer Model for Robust and Efficient Speech Emotion Recognition on Multi-Dataset Benchmarks
Samah Abbas Ali and Jamal Mustafa Abbas
Abstract: This study introduces a novel method for Speech Emotion Recognition (SER) that combines Spiking Neural Networks (SNNs), temporal attention, and Transformer encoders in a single hybrid model. SER improves human-computer interaction by enabling intelligent systems to recognize emotions from speech. Unlike traditional methods, which typically rely on shallow classifiers and manually engineered features, our deep learning-based approach exploits the energy efficiency of SNNs, the selective focus of temporal attention, and the long-range temporal modeling of Transformer architectures. We evaluated the model on a multi-dataset corpus comprising TESS, SAVEE, RAVDESS, and CREMA-D. The model achieved a consistent accuracy of 98% across all emotion classes. These results demonstrate the model’s effectiveness and highlight its potential for use in real-time, resource-limited environments. The reported results surpass those of existing state-of-the-art SER techniques, and the hybrid approach offers a reliable foundation for real-world affective computing applications.
Keywords: Speech Emotion Recognition (SER), Spiking Neural Networks (SNN), Temporal Attention, Transformer Encoders, Deep Learning, TESS, SAVEE, RAVDESS, CREMA-D.
DOI: 10.25673/122078
Download: PDF
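The abstract names two of the model's building blocks, a spiking layer and temporal attention, without giving details. The following is a minimal NumPy sketch of those two ideas only, not the authors' implementation: the leaky integrate-and-fire dynamics, the hard reset, the decay `beta`, the threshold, the scoring vector `w`, and the random stand-in features are all assumptions made for illustration, and the Transformer encoder stage is omitted.

```python
import numpy as np

def lif_spikes(currents, beta=0.9, threshold=1.0):
    """Leaky integrate-and-fire layer applied over time (illustrative assumption).

    currents: array of shape (T, D), input current per frame and feature.
    Returns a binary spike array of the same shape.
    """
    T, D = currents.shape
    membrane = np.zeros(D)
    spikes = np.zeros((T, D))
    for t in range(T):
        membrane = beta * membrane + currents[t]    # leak, then integrate input
        fired = membrane >= threshold
        spikes[t] = fired
        membrane = np.where(fired, 0.0, membrane)   # hard reset after a spike
    return spikes

def temporal_attention_pool(frames, w):
    """Softmax attention over the time axis.

    frames: (T, D); w: (D,) scoring vector (hypothetical; learned in practice).
    Returns (weights of shape (T,), pooled vector of shape (D,)).
    """
    scores = frames @ w
    scores = scores - scores.max()                  # numerical stability
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()
    return weights, weights @ frames

rng = np.random.default_rng(0)
features = rng.random((50, 8))   # stand-in for 50 frames of 8 acoustic features
spikes = lif_spikes(features)
weights, pooled = temporal_attention_pool(spikes, rng.normal(size=8))
```

In a full pipeline along the lines the abstract describes, the attention-weighted spike representation would feed a Transformer encoder and a classifier head; how the authors connect these stages is not stated on this page.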