Training model using only 1 instead of 5 dataset pairs #154

Open
maxmusterm4nn opened this issue Nov 19, 2023 · 2 comments

maxmusterm4nn commented Nov 19, 2023

Hello tsurumeso,

First of all, I’m very pleased with your vocal-remover project!

I would like to train my own model using only 1 dataset pair (instrumental + mixture) instead of the default 5 pairs.

Could you please advise which settings I should change to do so?

I have football matches that are about 90-100 minutes long and contain multiple audio tracks, with and without commentary. I’d like to use those sources to train my model on one match at a time.

Do you think that would work for audio of this length?

Furthermore, I’d like to buy an AMD Radeon RX 7900 XTX graphics card. Do you have any experience with training models on an AMD GPU?

Thank you for your help, in advance! :-)

aufr33 commented Nov 22, 2023

You can split the big pair into segments:

ffmpeg -i big_mix.wav -f segment -segment_time 300 -c copy %03d_mix.wav
ffmpeg -i big_inst.wav -f segment -segment_time 300 -c copy %03d_inst.wav

You definitely need validation pairs, about 20% of the entire dataset. The validation data must be different from the training data.
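
As a rough illustration of the 20% hold-out, here is a minimal Python sketch that pairs the segments produced by the ffmpeg commands above and sets aside about one fifth of them for validation. It assumes the files are named `NNN_mix.wav` and `NNN_inst.wav`, as in those commands; the helper names `pair_segments` and `split_pairs` are hypothetical and are not part of vocal-remover. It only decides which pairs go where, so moving the files into whatever layout the training script expects is left to you.

```python
import random
import re


def pair_segments(filenames):
    """Group '<index>_mix.wav' / '<index>_inst.wav' names into (mix, inst) pairs."""
    pattern = re.compile(r"^(\d+)_(mix|inst)\.wav$")
    found = {}
    for name in filenames:
        match = pattern.match(name)
        if match:
            index, kind = match.groups()
            found.setdefault(index, {})[kind] = name
    # Keep only indices that have both a mixture and an instrumental segment.
    return [
        (parts["mix"], parts["inst"])
        for index, parts in sorted(found.items())
        if "mix" in parts and "inst" in parts
    ]


def split_pairs(pairs, val_fraction=0.2, seed=0):
    """Return (train_pairs, val_pairs); no pair appears in both lists."""
    shuffled = list(pairs)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one pair whenever there is more than one to choose from.
    n_val = max(1, round(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    # Hypothetical example: 20 five-minute segments, roughly one 100-minute match.
    names = [f"{i:03d}_{kind}.wav" for i in range(20) for kind in ("mix", "inst")]
    train, val = split_pairs(pair_segments(names))
    print(len(train), "training pairs;", len(val), "validation pairs")
```

The random split is only one option. Because neighbouring segments of the same match sound alike, holding out one contiguous block, or a separate match, may give a more honest validation score.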

maxmusterm4nn (Author) commented

Thank you for your advice!

By “different”, do you mean that in my case (where the football stadium crowd noise plays the role of the instrumental, and crowd noise plus commentary is the mixture) the validation data should contain, for example, music + vocals?

How can I tell which pairs of the dataset will be used as validation data?
