Proceedings of 3rd International Conference on Multimedia Technology(ICMT-13)

NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints

Authors
Tu Ming, Xie Xiang, Jiao Yishan
Corresponding Author
Tu Ming
Available Online November 2013.
DOI
10.2991/icmt-13.2013.67
Keywords
non-negative matrix factorization·speech and music separation·sparse coding·temporal continuity·semi-supervised learning
Abstract

This paper proposes a semi-supervised approach to speech and music separation in monaural speech recordings based on non-negative matrix factorization (NMF). Assuming that the genre of the background music is known, music basis vectors are randomly picked from the magnitude of the short-time Fourier transform (STFT) of training music, while speech basis vectors are estimated by running NMF on the STFT magnitude of the polluted speech signal. Moreover, we apply sparseness and temporal continuity constraints to speech and music respectively and evaluate how the different constraints influence separation performance. The test set contains 10 Mandarin speech utterances from 10 speakers, mixed with music at different speech-music ratios (SMR). The baseline is a semi-supervised separation system with no constraints. The results reveal that adding the temporal continuity constraint can improve separation performance compared with both the baseline and a separation system with only the sparseness constraint.
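The unconstrained baseline described in the abstract (music bases fixed, speech bases learned from the mixture) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean cost with multiplicative updates, the soft-mask reconstruction, and all function and parameter names (`pick_music_basis`, `semi_supervised_nmf`, `soft_mask_separate`, `n_speech`, `n_iter`) are assumptions, and the sparseness and temporal continuity constraints are omitted.

```python
import numpy as np

def pick_music_basis(train_mag, n_music, seed=0):
    """Randomly pick n_music frames (columns) of a training-music STFT
    magnitude matrix (freq x frames) to serve as fixed music basis vectors."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(train_mag.shape[1], size=n_music, replace=False)
    return train_mag[:, idx].copy()

def semi_supervised_nmf(V, W_music, n_speech, n_iter=200, eps=1e-9, seed=0):
    """Factor the mixture magnitude V (freq x frames) as [W_s, W_music] @ H.
    W_music stays fixed; only the speech bases W_s and all activations H
    are updated (assumed Euclidean cost, multiplicative update rules)."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_music = W_music.shape[1]
    W_s = rng.random((n_freq, n_speech)) + eps
    H = rng.random((n_speech + n_music, n_frames)) + eps
    for _ in range(n_iter):
        W = np.hstack([W_s, W_music])
        # Update all activations (speech rows first, then music rows).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update the speech bases only; the music bases are never touched.
        H_s = H[:n_speech]
        W_s *= (V @ H_s.T) / (W @ H @ H_s.T + eps)
    return W_s, H[:n_speech], H[n_speech:]

def soft_mask_separate(V, W_s, H_s, W_music, H_music, eps=1e-9):
    """Split V into speech and music magnitude estimates with a soft mask
    built from the two partial reconstructions."""
    S = W_s @ H_s
    M = W_music @ H_music
    mask = S / (S + M + eps)
    return mask * V, (1.0 - mask) * V
```

In practice `V` would be the STFT magnitude of the mixed recording, and the separated magnitudes would be combined with the mixture phase before inverting the STFT; those steps are left out here.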

Copyright
© 2013, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


Volume Title
Proceedings of 3rd International Conference on Multimedia Technology(ICMT-13)
Series
Advances in Intelligent Systems Research
Publication Date
November 2013
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Tu Ming
AU  - Xie Xiang
AU  - Jiao Yishan
PY  - 2013/11
DA  - 2013/11
TI  - NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints
BT  - Proceedings of 3rd International Conference on Multimedia Technology(ICMT-13)
PB  - Atlantis Press
SP  - 541
EP  - 548
SN  - 1951-6851
UR  - https://doi.org/10.2991/icmt-13.2013.67
DO  - 10.2991/icmt-13.2013.67
ID  - Ming2013/11
ER  -