Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)

Selection of Optimal Solution for Example and Model of Retrieval Based Voice Conversion

Authors
Zhongxi Ren (1, *)
(1) Faculty of Data Science, City University of Macau, Macau, 999078, China
(*) Corresponding author.
Available Online 14 February 2024.
DOI
10.2991/978-94-6463-370-2_48
Keywords
Timbre conversion; Mel cepstral distortion; Model training; Objective evaluation; Subjective evaluation
Abstract

Since 2010, computer-based speech conversion has developed steadily. Speech-to-text technology is now mature, but timbre conversion and imitation remain imperfect. A new timbre imitation program has recently become a focus of attention, yet guidance on its model training options is still lacking. This paper trains models through hands-on use of the program, varying the custom values available in its model training step. Several training runs of the Retrieval Based Voice Conversion (RVC) model are carried out, and the timbre produced by models trained for different numbers of rounds is compared with the source sound. After training, two evaluation methods are used to measure how similar each model's output is to the source. One is an objective method based on Mel cepstral distortion, implemented in software. The other is a subjective method based on directly collecting human perceptual judgments. Similarity statistics are obtained from each method, and from them the paper derives selection criteria for a generally optimal model and provides reference training values for users.
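For readers unfamiliar with the objective measure named in the abstract, Mel cepstral distortion (MCD) is commonly defined per frame as (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2) and averaged over frames, with lower values indicating closer timbre. The following is a minimal sketch of that common definition, not the software used in the paper. The function name is hypothetical, and the sketch assumes the two sequences of mel-cepstral vectors are already time-aligned and that the 0th (energy) coefficient has been removed.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Return the mean MCD in dB between two sequences of mel-cepstral vectors.

    Assumptions (not taken from the paper): both sequences have the same
    number of frames, are already time-aligned, and exclude the 0th coefficient.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("sequences must be non-empty and equally long")

    scale = 10.0 / math.log(10.0)  # converts the distance to decibels
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance between the two cepstral vectors.
        squared = sum((r - c) ** 2 for r, c in zip(ref, conv))
        total += scale * math.sqrt(2.0 * squared)

    # Average the per-frame distortion over all frames.
    return total / len(ref_frames)


if __name__ == "__main__":
    # Toy values for illustration only; real input would come from a
    # mel-cepstral analysis of the source and converted recordings.
    source = [[0.5, -0.2, 0.1], [0.4, -0.1, 0.0]]
    converted = [[0.45, -0.25, 0.12], [0.38, -0.05, 0.02]]
    print(f"MCD: {mel_cepstral_distortion(source, converted):.3f} dB")
```

In practice the alignment step (for example, dynamic time warping) matters as much as the distance itself, so scores are only comparable across models when the same alignment and feature settings are used.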

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
Series
Advances in Intelligent Systems Research
Publication Date
14 February 2024
ISBN
978-94-6463-370-2
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Zhongxi Ren
PY  - 2024
DA  - 2024/02/14
TI  - Selection of Optimal Solution for Example and Model of Retrieval Based Voice Conversion
BT  - Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
PB  - Atlantis Press
SP  - 468
EP  - 475
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-370-2_48
DO  - 10.2991/978-94-6463-370-2_48
ID  - Ren2024
ER  -