# Music source separation using WaveUNet

An implementation of [Wave-U-Net](https://arxiv.org/pdf/1806.03185.pdf) that separates music into 4 audio stems: vocals, drums, bass, and others.
## Installation

- Clone this repository.
- Install the requirements:

  ```
  python -m pip install -r requirements.txt
  ```

- Install ffmpeg.

  For Linux:

  ```
  sudo apt install ffmpeg
  ```

  For Windows: go to the [ffmpeg download page](https://ffmpeg.org/download.html#build-windows), download the ffmpeg build, and add the directory containing the executables (`path\ffmpeg\bin`) to your `Path` environment variable.
## Training

You can train the WaveUNet model with the MUSDB18 dataset.

- Download the [MUSDB18 dataset](https://sigsep.github.io/datasets/musdb.html#musdb18-compressed-stems).
- Unzip `musdb18.zip` or `musdb18hq.zip`.
- Process the mp4 MUSDB18 dataset into numpy arrays. In the `data` directory, you can use `process.py` to separate the mp4 audio data into mp3 audio stems and convert them into equal-length array segments.

  ```
  python process.py path/to/musdb18/dataset path/to/save
  ```
- Train the model using `train.py`. If you have processed the MUSDB18 dataset with `process.py`, the train dataset directory will be `path/to/save/train/data_split` and the test dataset directory will be `path/to/save/test/data_split`.

  ```
  python train.py path/train/data_split path/test/data_split
  ```

- The best model (selected by validation loss, with early stopping) will be saved as `model.pt`.
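As a sketch of the model-selection logic described above (not the actual `train.py` code; the patience value and class name are assumptions), early stopping by validation loss can look like:

```python
# Minimal early-stopping bookkeeping: keep the best validation loss seen so far,
# stop after `patience` epochs without improvement.
class EarlyStopper:
    def __init__(self, patience: int = 3):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0
        self.should_save = False  # True when the current model is the best so far

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
            self.should_save = True   # caller would save the model here (e.g. to model.pt)
        else:
            self.bad_epochs += 1
            self.should_save = False
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=2)
for loss in [1.0, 0.8, 0.9, 0.95]:
    if stopper.step(loss):
        break
print(stopper.best_loss)  # 0.8
```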
Alternatively, you can train the WaveUNet model on your own dataset, but you may need to write your own data processing script.
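Whichever processing script you use, the key step is slicing each decoded stem into equal-length array segments, as `process.py` does. A minimal sketch (the function name and segment length are illustrative, not the script's actual parameters):

```python
import numpy as np

def segment_audio(samples: np.ndarray, segment_len: int) -> np.ndarray:
    """Split a 1-D audio array into equal-length segments, dropping the remainder."""
    n_segments = len(samples) // segment_len
    trimmed = samples[: n_segments * segment_len]
    return trimmed.reshape(n_segments, segment_len)

audio = np.arange(10, dtype=np.float32)  # stand-in for decoded audio samples
segments = segment_audio(audio, 4)
print(segments.shape)  # (2, 4) -- the last 2 samples are dropped
```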
## Prediction

You can separate a song into 4 stems (vocals, drums, bass, and others) using `predict.py`. The results will be saved as `output/song_out_n.wav`, where `n` is the stem number from 0 to 3, meaning drums, bass, others, and vocals respectively.

```
python predict.py --path_model model.pt --path_song path/to/song.mp3
```
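The index-to-stem mapping of the output files can be captured in a small helper (hypothetical, not part of the repo):

```python
from pathlib import Path

# Stem order used by the output files, per the README: 0=drum, 1=bass, 2=others, 3=vocal.
STEM_NAMES = ["drum", "bass", "others", "vocal"]

def output_paths(out_dir: str = "output") -> dict:
    """Map each stem name to the file predict.py writes for it."""
    return {name: Path(out_dir) / f"song_out_{i}.wav" for i, name in enumerate(STEM_NAMES)}

print(output_paths()["vocal"])  # output/song_out_3.wav
```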
## Results

| Model | Test SDR (accompaniment) | Test SDR (vocals) | File |
|---|---|---|---|
| baseline | 1.374 | 0.988 | `checkpoint/baseline.pt` |
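For reference, SDR (signal-to-distortion ratio) measures how close an estimated stem is to the true one, in dB; higher is better. A plain per-signal version can be sketched as follows (evaluation suites such as `museval` use the fuller BSS Eval variant, so numbers may differ):

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Signal-to-distortion ratio in dB: 10 * log10(||s||^2 / ||s - s_hat||^2)."""
    noise = reference - estimate
    return float(10 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2)))

ref = np.array([1.0, 0.0, -1.0, 0.5])
est = ref + 0.1  # estimate with a small constant error
print(round(sdr(ref, est), 2))  # 17.5
```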
This is just a trial implementation of WaveUNet; the models may not match the results of the original paper or of other implementations.