[WIP] Zero-Shot Multi-Speaker Tacotron2#2120
mravanelli merged 36 commits into speechbrain:develop
Conversation
Thank you @pradnya-git-dev for submitting this PR! Your contribution is greatly appreciated as it adds a valuable feature to SpeechBrain. Below are my comments and suggestions:

Readme Updates:

Recipe Test Failures:
Running

```
python -c 'from tests.utils.recipe_tests import run_recipe_tests; print("TEST FAILED!") if not(run_recipe_tests(filters_fields=["Dataset"], filters=[["LibriTTS"]], do_checks=True, run_opts="--device=cuda")) else print("TEST PASSED")'
```

fails with:

```
ERROR: Error in LibriTTS_row_03 (recipes/LibriTTS/TTS/mstacotron2/hparams/train.yaml). Check tests/tmp/LibriTTS_row_03/stderr.txt and tests/tmp/LibriTTS_row_03/stdout.txt for more info.
TEST FAILED!
```

The stderr log points to a missing speaker-embedding entry:

```
spk_emb = speaker_embeddings[raw_batch[idx]["uttid"]]
KeyError: 'LJ050-0131'
```
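A more defensive lookup would make this easier to diagnose. Below is a minimal sketch, assuming `speaker_embeddings` is a plain dict keyed by utterance ID; `get_speaker_embedding` and its `fallback` hook are hypothetical helpers, not existing recipe or SpeechBrain code:

```python
# Hypothetical guard around the failing lookup above (not part of the recipe).
# Assumes speaker_embeddings is a dict mapping utterance IDs to embeddings.
def get_speaker_embedding(speaker_embeddings, uttid, fallback=None):
    """Return the precomputed embedding for uttid, or delegate to fallback.

    fallback could recompute the embedding from the waveform; leaving it as
    None turns the bare KeyError into a more informative error message.
    """
    if uttid in speaker_embeddings:
        return speaker_embeddings[uttid]
    if fallback is not None:
        return fallback(uttid)
    raise KeyError(
        f"No precomputed speaker embedding for utterance '{uttid}'. "
        "Check that the embeddings file covers the evaluation split."
    )
```

Either way, the underlying issue seems to be that the precomputed embeddings file does not cover utterance IDs such as 'LJ050-0131' used by the recipe test.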
Script Redundancy:

Code Optimization:
Thank you @pradnya-git-dev for working on this PR! It is an important first step toward zero-shot TTS in SpeechBrain. The quality of the generated speech can still be improved, but we will address that in a follow-up PR.
Contribution in a nutshell
Hey, this could help our community work with zero-shot multi-speaker text-to-speech.
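To make the intended workflow concrete, here is a rough sketch of the zero-shot flow, assuming a multi-speaker Tacotron2 that can be conditioned on a speaker embedding. The `ms_tacotron2` call is only a commented placeholder, since its inference interface is not shown in this excerpt; `EncoderClassifier` (ECAPA speaker encoder) and `HIFIGAN` are existing SpeechBrain pretrained interfaces, and the reference wav path is made up:

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier, HIFIGAN

# 1) Extract a speaker embedding from a short reference recording
#    (the file path here is a placeholder).
spk_encoder = EncoderClassifier.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb"
)
ref_wav, sample_rate = torchaudio.load("reference_speaker.wav")
spk_emb = spk_encoder.encode_batch(ref_wav)

# 2) Condition the multi-speaker Tacotron2 on that embedding
#    (hypothetical call; the real interface may differ):
# mel_outputs, mel_lengths, alignments = ms_tacotron2.encode_text(
#     "Mary had a little lamb.", spk_emb
# )

# 3) Vocode the predicted mel spectrogram with an existing HiFi-GAN vocoder:
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech")
# waveforms = hifi_gan.decode_batch(mel_outputs)
```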
Scope