NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints

Tu Ming; Xie Xiang; Jiao Yishan

doi:10.2991/icmt-13.2013.67

Corresponding Author

Tu Ming

Available Online November 2013.

DOI: 10.2991/icmt-13.2013.67
Keywords: non-negative matrix factorization; speech and music separation; sparse coding; temporal continuity; semi-supervised learning
Abstract: This paper proposes a semi-supervised approach to speech and music separation in monaural speech recordings based on non-negative matrix factorization (NMF). Assuming that the genre of the background music is known, music basis vectors are randomly picked from the short-time Fourier transform (STFT) magnitude of training music, while speech basis vectors are estimated by running NMF on the STFT magnitude of the contaminated speech signal. Moreover, we apply sparseness and temporal continuity constraints to speech and music, respectively, and evaluate how the different constraints influence separation performance. The test set contains 10 Mandarin speech utterances from 10 speakers, mixed with music at different speech-music ratios (SMR). The baseline is a semi-supervised separation system with no constraints. The results reveal that adding the temporal continuity constraint improves separation performance compared with both the baseline and the separation system with only the sparseness constraint.
Copyright: © 2013, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
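
The semi-supervised setup described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the cost function or update rules, so the Euclidean cost, the multiplicative updates, the optional L1 sparseness penalty on speech activations, and the soft-mask reconstruction are all assumptions. The temporal continuity constraint is omitted. The function names (`pick_music_bases`, `semi_supervised_nmf`, `separate`) are hypothetical, and random non-negative matrices stand in for real STFT magnitudes.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def pick_music_bases(V_music, n_bases, seed=0):
    """Randomly pick STFT magnitude frames of training music as music bases."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(V_music.shape[1], size=n_bases, replace=False)
    W = V_music[:, idx]
    # Unit-norm columns (an assumption, for numerical convenience).
    return W / (np.linalg.norm(W, axis=0, keepdims=True) + EPS)


def semi_supervised_nmf(V, W_music, n_speech, n_iter=200, sparsity=0.0, seed=0):
    """Factor V ~ [W_speech, W_music] @ [H_speech; H_music].

    W_music stays fixed; W_speech and all activations are learned from the
    mixture itself. `sparsity` is an assumed L1 weight on H_speech only.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_music = W_music.shape[1]
    W_s = rng.random((n_freq, n_speech)) + EPS
    H = rng.random((n_speech + n_music, n_frames)) + EPS
    penalty = np.zeros_like(H)
    penalty[:n_speech] = sparsity
    for _ in range(n_iter):
        W = np.hstack([W_s, W_music])
        # Multiplicative update of all activations (Euclidean cost).
        H *= (W.T @ V) / (W.T @ W @ H + penalty + EPS)
        # Multiplicative update of the speech bases only.
        H_s = H[:n_speech]
        W_s *= (V @ H_s.T) / (W @ H @ H_s.T + EPS)
        # Renormalize speech bases; rescale activations so W_s @ H_s is unchanged.
        norms = np.linalg.norm(W_s, axis=0, keepdims=True) + EPS
        W_s /= norms
        H[:n_speech] *= norms.T
    return W_s, H[:n_speech], H[n_speech:]


def separate(V, W_s, H_s, W_m, H_m):
    """Split the mixture magnitude with a soft mask built from both models."""
    S, M = W_s @ H_s, W_m @ H_m
    mask = S / (S + M + EPS)
    return mask * V, (1.0 - mask) * V


# Synthetic stand-ins for STFT magnitudes (frequency bins x frames).
rng = np.random.default_rng(1)
F = 20
V_music_train = rng.random((F, 50))
V = rng.random((F, 30))

W_m = pick_music_bases(V_music_train, n_bases=8)
W_s, H_s, H_m = semi_supervised_nmf(V, W_m, n_speech=5)
speech_est, music_est = separate(V, W_s, H_s, W_m, H_m)
```

With real audio, `V` and `V_music_train` would be STFT magnitudes, and the time-domain estimates would be recovered by applying the mixture phase to `speech_est` and `music_est` before the inverse STFT.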

Volume Title: Proceedings of 3rd International Conference on Multimedia Technology (ICMT-13)
Series: Advances in Intelligent Systems Research
Publication Date: November 2013
ISBN: 978-90-78677-89-5
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Tu Ming
AU  - Xie Xiang
AU  - Jiao Yishan
PY  - 2013/11
DA  - 2013/11
TI  - NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints
BT  - Proceedings of 3rd International Conference on Multimedia Technology(ICMT-13)
PB  - Atlantis Press
SP  - 541
EP  - 548
SN  - 1951-6851
UR  - https://doi.org/10.2991/icmt-13.2013.67
DO  - 10.2991/icmt-13.2013.67
ID  - Ming2013/11
ER  -