Light Speed ⚡

Light Speed ⚡ is an open-source text-to-speech model based on VITS, with some modifications:

It uses ground-truth phoneme durations, obtained from an external forced aligner (such as Montreal Forced Aligner), to upsample phoneme-level information to frame-level information. The result is a more robust model, at a slight cost in speech quality.
It employs dilated convolutions to expand the receptive field of the WaveNet flow module, enhancing its ability to capture long-range interactions.
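
As a rough illustration of the two modifications above, the sketch below repeats each phoneme's feature vector once per frame of its duration, and computes the receptive field of a stack of stride-1 dilated 1-D convolutions. It is not taken from this repository's code: the function names, the toy features and durations, the kernel size of 5, the four layers, and the doubling dilation schedule are all assumptions chosen for the example.

```python
def upsample_to_frames(phoneme_features, durations):
    """Repeat each phoneme's feature vector for as many frames as it lasts."""
    if len(phoneme_features) != len(durations):
        raise ValueError("need exactly one duration per phoneme")
    frames = []
    for feature, n_frames in zip(phoneme_features, durations):
        frames.extend([feature] * n_frames)
    return frames


def receptive_field(kernel_size, dilations):
    """Receptive field (in frames) of stacked stride-1 dilated 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Three phonemes with 2-dimensional features (toy values, assumed).
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Frames per phoneme, as a forced aligner might report them (made-up numbers).
durations = [2, 1, 3]

frames = upsample_to_frames(features, durations)
print(len(frames))  # 6 frames in total, one row per frame

# Four layers with kernel size 5 (assumed): no dilation vs. doubling dilation.
print(receptive_field(5, [1, 1, 1, 1]))  # 17
print(receptive_field(5, [1, 2, 4, 8]))  # 61
```

With the same number of layers and parameters, the dilated stack in this toy setting sees 61 frames instead of 17, which is the sense in which dilation expands the receptive field.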

Pretrained models and demos

We provide two pretrained models and demos:

VN - Male voice: https://huggingface.co/spaces/ntt123/Vietnam-male-voice-TTS
VN - Female voice: https://huggingface.co/spaces/ntt123/Vietnam-female-voice-TTS

FAQ

Q: How do I train on a custom dataset?
A: See the ./prepare_audio_1_tfdata.ipynb notebook for instructions on preparing the training data.

Q: How can I train the model with 1 GPU?
A: Run: python train.py --tfdata tfdata --rm-old-ckpt > logs/run.log

Q: How can I train the model on specific GPUs?
A: Run: CUDA_VISIBLE_DEVICES=2,3 torchrun --standalone --nnodes=1 --nproc-per-node=2 train.py --tfdata tfdata --rm-old-ckpt --batch-size 16

Q: How can I train a model to predict phoneme durations?
A: See the ./train_duration_model.ipynb notebook.

Q: How can I generate speech with a trained model?
A: See the ./inference.ipynb notebook.

Credits

Most of the code in this repository is based on the official VITS repository.

Bug fixes and notes

MFA aligner bug: use a newer version of MFA (see here).

VBX dataset bug: install Git LFS:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs

When training on a custom dataset, each .txt file and its .wav file must be in the same folder.
Inference requires a separately trained duration model.
Run mfa train -h to see how to configure MFA training (it trains on the CPU; the reason is unknown).
To view the losses with TensorBoard in VS Code:
- Ctrl + Shift + P -> Python: Launch TensorBoard -> choose the logs folder
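
The note above about keeping transcripts and audio in the same folder can be checked before training with a small helper like the one below. This is a sketch, not part of the repository: the function name find_unpaired is hypothetical, and it assumes that a transcript and its recording share a base name (for example utt1.txt and utt1.wav).

```python
from pathlib import Path


def find_unpaired(data_dir):
    """Return (wav stems without a .txt, txt stems without a .wav), both sorted."""
    folder = Path(data_dir)
    wav_stems = {p.stem for p in folder.glob("*.wav")}
    txt_stems = {p.stem for p in folder.glob("*.txt")}
    return sorted(wav_stems - txt_stems), sorted(txt_stems - wav_stems)


# Example usage (the folder name is a placeholder):
# missing_txt, missing_wav = find_unpaired("my_dataset")
# print("wav files without a transcript:", missing_txt)
# print("transcripts without a wav file:", missing_wav)
```

Both returned lists being empty means every recording has a transcript next to it, and vice versa.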

