First, love the project!
I have a robotic/virtual agent project where I'm trying to get responses as close to real-time as possible.
I use the following to generate speech:
python3 fastVoice.py | larynx -v ek --interactive --ssml --raw-stream --cuda --half --max-thread-workers 8 --stdin-format lines --process-on-blank-line | aplay -r 22050 -c 1 -f S16_LE
Here, fastVoice.py just dumps the SSML from a socket onto larynx's stdin (remember to flush properly ...):
fastVoice.txt
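For reference, a minimal sketch of what such a socket-to-stdin shim could look like. This is an assumption, not the contents of the attached script; the host, port, and overall structure are placeholders:

```python
#!/usr/bin/env python3
# Hypothetical sketch only -- the real script is in the attached fastVoice.txt.
# Reads SSML from a TCP socket and relays it to stdout (larynx's stdin),
# flushing eagerly so --process-on-blank-line fires without buffering delay.
import socket
import sys

HOST, PORT = "127.0.0.1", 5000  # assumed placeholder values

def main():
    with socket.create_server((HOST, PORT)) as server:
        while True:
            conn, _addr = server.accept()
            with conn:
                for line in conn.makefile("r", encoding="utf-8"):
                    sys.stdout.write(line)
                    sys.stdout.flush()  # "remember to flush properly"
                # a blank line tells larynx (--process-on-blank-line) to synthesize
                sys.stdout.write("\n")
                sys.stdout.flush()

if __name__ == "__main__":
    main()
```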
This all works very well: audio generally starts <1 s after receiving a message. The question is how to get a phoneme/viseme sequence synced with the audio output.
I can manage level-0-ish lipsync by looking at the amplitude of the audio output, but that gives enough info for just the jaw, not the visemes of the lips.
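For illustration, here is roughly what that amplitude-only jaw estimate looks like. It's a hedged sketch: the ~20 ms window and the 8000 loudness ceiling are arbitrary assumptions, and the stream format (S16_LE, mono, 22050 Hz) is just what the aplay flags above imply:

```python
#!/usr/bin/env python3
# Hypothetical sketch of the amplitude-only approach described above.
# Reads raw PCM (S16_LE, mono, 22050 Hz, matching the aplay flags) from
# stdin and prints a rough 0..1 jaw-openness value per ~20 ms window.
import math
import sys
from array import array

SAMPLE_RATE = 22050
WINDOW = SAMPLE_RATE // 50  # samples per ~20 ms analysis window

def jaw_openness(samples):
    """Map the RMS of one window of 16-bit samples to a rough 0..1 value."""
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return min(1.0, rms / 8000.0)  # 8000 is an arbitrary loudness ceiling

def main():
    while True:
        chunk = sys.stdin.buffer.read(WINDOW * 2)  # 2 bytes per S16_LE sample
        if not chunk:
            break
        samples = array("h")
        samples.frombytes(chunk[: len(chunk) // 2 * 2])  # drop odd trailing byte
        print(f"jaw {jaw_openness(samples):.3f}", flush=True)

if __name__ == "__main__":
    main()
```

One way to feed something like this without interrupting playback is to split the raw stream, e.g. with bash process substitution: `larynx ... --raw-stream ... | tee >(python3 jaw.py) | aplay -r 22050 -c 1 -f S16_LE`.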
Do you have any ideas/pointers on how to maintain the responsiveness of "--raw-stream" while getting real-time timing info to generate the matching visemes?