BSER-Bengali Speech Emotion Recognition using MFCC Features Based on SUBESCO and BanglaSER: A Comparative Study of Machine Learning Models

Nripendro Biswas; Ahmed Rijvee; Munaem Ahmed Mahdi; Alif Azam; Md. Takoat Hossain

doi:10.2991/978-94-6463-884-4_77

Authors

Nripendro Biswas¹,*, Ahmed Rijvee², Munaem Ahmed Mahdi¹, Alif Azam¹, Md. Takoat Hossain¹

¹Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh

²University of Rajshahi, Motihar, Kajla, Rajshahi, 6205, Bangladesh

*Corresponding author. Email: nripendro4200@gmail.com

Corresponding Author

Nripendro Biswas

Available Online 18 November 2025.

DOI: 10.2991/978-94-6463-884-4_77
Keywords: SER; MFCC; Data augmentation; KNN; Machine learning; sampling rate
Abstract: Speech emotion recognition (SER) is an important area of human-computer interaction because it enables a machine to comprehend and interpret human emotions. It seeks to determine the categories of emotion expressed through speech signals during social communication. This study examines the impact of feature extraction techniques and machine learning models on SER performance, using the SUBESCO and BanglaSER datasets both individually and combined. Data augmentation through noise injection is applied to enhance model robustness. Mel-Frequency Cepstral Coefficients (MFCCs) are extracted as the primary features, and a sensitivity analysis that varies the number of MFCCs from 1 to 50 is conducted to determine the optimal value for SER accuracy. The impact of different sampling rates on feature extraction and subsequent model performance is also investigated. Several machine learning models, including K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Random Forest, and Logistic Regression, are trained and evaluated. The sensitivity analysis reveals that 30 MFCCs yield the highest accuracy and that higher-order MFCCs significantly improve model performance in Bengali SER (BSER). The optimal sampling rate for the dataset is found to be 44,100 Hz. Among the evaluated models, KNN performs best, with an accuracy of 92.10%. This study provides insights into the practical application of machine learning for SER in the Bengali language.
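The feature side of the pipeline described in the abstract (noise injection followed by MFCC extraction) can be sketched in plain NumPy. This is an illustrative sketch only, not the authors' implementation: the function names, the 20 dB signal-to-noise ratio, the FFT size, the hop length, the 64-filter mel bank, and the use of a synthetic sine tone in place of a speech recording are all assumptions. Only the 30 coefficients and the 44,100 Hz sampling rate are taken from the abstract.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Inject white Gaussian noise at a target signal-to-noise ratio (assumed scheme)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc_mean(signal, sr, n_mfcc=30, n_fft=2048, hop=512, n_mels=64):
    """Return the time-averaged MFCC vector of a mono signal (n_mfcc <= n_mels)."""
    # Slice the signal into overlapping, Hann-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum -> mel-band energies -> log compression.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel axis; keep the first n_mfcc coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return (log_mel @ dct.T).mean(axis=0)

# Synthetic one-second tone standing in for a speech clip (assumption).
rng = np.random.default_rng(0)
sr = 44_100
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)
noisy = add_noise(clean, snr_db=20.0, rng=rng)
features = mfcc_mean(noisy, sr, n_mfcc=30)
print(features.shape)  # (30,)
```

Sweeping `n_mfcc` from 1 to 50 with a fixed classifier would mirror the sensitivity analysis the abstract describes; each resulting vector could then be passed to a KNN, SVM, Random Forest, or Logistic Regression model.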
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the 8th International Conference on Engineering Research, Innovation, and Education 2025 (ICERIE 2025)
Series: Advances in Engineering Research
Publication Date: 18 November 2025
ISBN: 978-94-6463-884-4
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Nripendro Biswas
AU  - Ahmed Rijvee
AU  - Munaem Ahmed Mahdi
AU  - Alif Azam
AU  - Md. Takoat Hossain
PY  - 2025
DA  - 2025/11/18
TI  - BSER-Bengali Speech Emotion Recognition using MFCC Features Based on SUBESCO and BanglaSER: A Comparative Study of Machine Learning Models
BT  - Proceedings of the 8th International Conference on Engineering Research, Innovation, and Education 2025 (ICERIE 2025)
PB  - Atlantis Press
SP  - 636
EP  - 644
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-884-4_77
DO  - 10.2991/978-94-6463-884-4_77
ID  - Biswas2025
ER  -
