Implement Mimic 3 engine (TTS) #30
Comments
As a side note regarding installation: although the voice models should be automatically downloaded by the CLI app and stored in See also https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#downloading-voices and the |
Already working (not fully implemented) at 7fadf45. @rsantos88 in case you want to use this in the upcoming demos, assuming you have installed the Spanish voices, launch it with:
On the teo-self-presentation side, pass |
Done at 618cb83, see speechSynthesis.py. All IDL commands have been implemented except the pitch accessors, pause and resume. I'd consider expanding the API with volume commands and renaming "language" to "voice", which might or might not include speaker information. There are two caveats to this implementation/engine:
|
Author of Mimic 3 here. I'm continuing my TTS work elsewhere: https://github.com/rhasspy/larynx2/ |
Thank you for the heads-up and your great work! We have migrated from Mimic 3 to Piper (current name) at #33. |
@rsantos88 has found a "fast, privacy-focused, open-source, neural Text to Speech (TTS) engine" that looks great:
https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3
https://github.com/MycroftAI/mimic3
https://github.com/MycroftAI/mimic3-voices
It is lightweight, runs offline, and features a human-like voice (as opposed to the more robotic one we currently use via eSpeak). It is written in Python and can be installed through pip (mycroft-mimic3-tts package). There are four Spanish voices available (3 male, 1 female), spread across two datasets; the "tux" voice from the "m-ailabs" dataset sounds quite appealing.
Sample invocation:
mimic3 --voice es_ES/m-ailabs#tux "hola, me llamo teo y tengo 10 años"
(or simply --voice es_ES/m-ailabs, since tux is the default voice)
Pro tip: add --cuda to enable GPU acceleration (requires the "onnxruntime-gpu" pip package).
I'm thinking of a Python client implementation of our TextToSpeech IDL service similar to speechRecognition.py.
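As a starting point for such a client, here is a minimal sketch that wraps the CLI invocation shown above. It is not the eventual speechSynthesis.py: the helper names build_mimic3_command and synthesize are hypothetical, only the --voice and --cuda options quoted in this issue are used, and it assumes the mimic3 command writes the synthesized WAV audio to standard output.

```python
import subprocess
from typing import List, Optional


def build_mimic3_command(text: str, voice: str = "es_ES/m-ailabs",
                         speaker: Optional[str] = None,
                         cuda: bool = False) -> List[str]:
    """Assemble the argument list for the mimic3 CLI (hypothetical helper)."""
    # A speaker is appended to the voice key after '#', as in es_ES/m-ailabs#tux.
    voice_key = f"{voice}#{speaker}" if speaker else voice
    command = ["mimic3", "--voice", voice_key]
    if cuda:
        # GPU acceleration; requires the onnxruntime-gpu pip package.
        command.append("--cuda")
    command.append(text)
    return command


def synthesize(text: str, **options) -> bytes:
    """Run mimic3 and return its standard output (assumed here to be WAV data)."""
    result = subprocess.run(build_mimic3_command(text, **options),
                            check=True, capture_output=True)
    return result.stdout
```

For instance, build_mimic3_command("hola", speaker="tux", cuda=True) yields ["mimic3", "--voice", "es_ES/m-ailabs#tux", "--cuda", "hola"]. Passing the arguments as a list rather than a shell string avoids quoting problems with the "#" in the voice key and with accented characters in the text.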