Query-by-example spoken term detection based on phonetic posteriorgram

Beili Song; Weiqiang Zhang; Meng Cai; Jia Liu; Michael T. Johnson

doi:10.2991/icemct-15.2015.256

Corresponding Author

Beili Song

Available Online June 2015.

DOI: 10.2991/icemct-15.2015.256
Keywords: query-by-example; spoken term detection; softmax output features; dynamic time warping.
Abstract: Spoken term detection in low-resource situations is a challenging problem, because traditional large vocabulary continuous speech recognition (LVCSR) approaches are often unusable. This paper introduces a method that uses deep neural network (DNN) softmax outputs as input features in a query-by-example (QBE) spoken term detection (STD) system. Matches between queries and test utterances are located using a modified dynamic time warping (DTW) search approach. Subsystems are built with unsupervised Gaussian mixture model (GMM) and DNN monophone models trained on Chinese and English, and are evaluated on the SWS 2013 multilingual database of low-resource languages. The score-level fusion of these different subsystems is shown to improve performance significantly over the baseline results.
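The abstract does not spell out the modified DTW search, so the following is only a minimal sketch of the general idea: a plain subsequence DTW over posteriorgrams, not the authors' variant. It assumes each posteriorgram is a NumPy array of shape (frames, classes) whose rows are softmax outputs, uses the negative log inner product as the frame distance (a common choice for posterior features, assumed here), and normalizes the path cost by query length for simplicity. The function names `frame_distance` and `subsequence_dtw` are hypothetical.

```python
import numpy as np


def frame_distance(query, utterance, eps=1e-10):
    """Pairwise -log inner product between posterior vectors.

    query: (n_q, n_classes), utterance: (n_u, n_classes).
    Returns an (n_q, n_u) distance matrix. The distance choice is an assumption.
    """
    return -np.log(np.clip(query @ utterance.T, eps, None))


def subsequence_dtw(query, utterance):
    """Find the utterance region that best matches the whole query.

    Returns (score, start_frame, end_frame), where score is the accumulated
    path cost divided by the query length (a simplified normalization).
    """
    dist = frame_distance(query, utterance)
    n_q, n_u = dist.shape
    cost = np.full((n_q, n_u), np.inf)
    start = np.zeros((n_q, n_u), dtype=int)

    # The match may begin at any utterance frame at no extra cost.
    cost[0] = dist[0]
    start[0] = np.arange(n_u)

    for i in range(1, n_q):
        # First column: the only possible predecessor is the cell above.
        cost[i, 0] = cost[i - 1, 0] + dist[i, 0]
        start[i, 0] = start[i - 1, 0]
        for j in range(1, n_u):
            predecessors = [(i - 1, j), (i, j - 1), (i - 1, j - 1)]
            values = [cost[p] for p in predecessors]
            best = int(np.argmin(values))
            cost[i, j] = values[best] + dist[i, j]
            # Carry the start frame along the best path.
            start[i, j] = start[predecessors[best]]

    # The match may end at any utterance frame.
    end = int(np.argmin(cost[-1]))
    return float(cost[-1, end] / n_q), int(start[-1, end]), end
```

A lower score indicates a closer match. In a system with several subsystems, each one would produce such a score per query and utterance pair, and those scores would then be combined at the score level.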
Copyright: © 2015, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title: Proceedings of the 2015 International Conference on Education, Management and Computing Technology
Series: Advances in Social Science, Education and Humanities Research
Publication Date: June 2015
ISBN: 978-94-62520-82-0
ISSN: 2352-5398

Cite this article

TY  - CONF
AU  - Beili Song
AU  - Weiqiang Zhang
AU  - Meng Cai
AU  - Jia Liu
AU  - Michael T. Johnson
PY  - 2015/06
DA  - 2015/06
TI  - Query-by-example spoken term detection based on phonetic posteriorgram
BT  - Proceedings of the 2015 International Conference on Education, Management and Computing Technology
PB  - Atlantis Press
SP  - 1251
EP  - 1256
SN  - 2352-5398
UR  - https://doi.org/10.2991/icemct-15.2015.256
DO  - 10.2991/icemct-15.2015.256
ID  - Song2015/06
ER  -