Dataset generation tool for language identification systems that use deep convolutional recurrent neural networks.
Datasets are an integral part of the field of machine learning. The tool generates a human-labelled dataset from recordings made in noisy or clear environments, for language identification (LID) systems that use deep convolutional recurrent neural networks (CRNN). The system takes as input a recording received from a microphone and stored as a ".wav" file. The recording is then split into segments of 10 seconds each. After that, the system generates spectrograms from the segments and checks for bad images. The final result is a dataset of spectrogram images with description files for the training, validation and testing parts of the dataset. Description files have a ".csv" extension and contain the paths of the spectrograms in local storage and their corresponding indexes (the data labels). The steps described are shown in the figure below.
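The segmentation step can be sketched with the Python standard library. This is a minimal illustration and not necessarily how the tool does it: the function name `split_wav`, the output naming scheme, and the choice to drop a trailing piece shorter than 10 seconds are all assumptions.

```python
import wave

SEGMENT_SECONDS = 10  # segment length stated in the text


def split_wav(path, out_prefix, segment_seconds=SEGMENT_SECONDS):
    """Split a WAV file into fixed-length segments and return the paths written."""
    written = []
    with wave.open(path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames_per_segment = frame_rate * segment_seconds
        expected_bytes = frames_per_segment * sample_width * channels
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            # Assumption: a trailing piece shorter than a full segment is dropped.
            if len(frames) < expected_bytes:
                break
            out_path = f"{out_prefix}_{index:04d}.wav"  # hypothetical naming scheme
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(channels)
                dst.setsampwidth(sample_width)
                dst.setframerate(frame_rate)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1
    return written
```

For example, a 25-second recording would yield two 10-second segments under this sketch, with the last 5 seconds discarded.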
Configurations for recording:
- Sample format is 16-bit integer.
- Number of channels is 1.
- Sampling rate is 44100 Hz.
- The number of frames per buffer is 1024.
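As an illustration of how these values fit together, the sketch below writes buffers of raw audio to a ".wav" file using only the standard library. The microphone capture itself is not shown: `silent_buffers` is a hypothetical stand-in for it, and `save_recording` is a name chosen for this example, not part of the tool.

```python
import wave

# Values taken from the configuration list above.
SAMPLE_WIDTH_BYTES = 2     # 16-bit integer samples
CHANNELS = 1
SAMPLE_RATE_HZ = 44100
FRAMES_PER_BUFFER = 1024


def save_recording(path, buffers):
    """Write an iterable of raw PCM buffers to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        for chunk in buffers:
            wav.writeframes(chunk)


def silent_buffers(count):
    """Hypothetical stand-in for microphone input: `count` buffers of silence."""
    buffer_bytes = FRAMES_PER_BUFFER * CHANNELS * SAMPLE_WIDTH_BYTES
    return (bytes(buffer_bytes) for _ in range(count))
```

With these settings each buffer holds 1024 frames of 2 bytes, so one second of audio takes roughly 43 buffers (44100 / 1024).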
The spectrogram generation process can be done in two ways:
- Generation from a noisy environment.
- Generation from a clear environment.
Configurations for generating spectrograms from noisy and/or clear environments:
- Spectrogram image contains 50 pixels per second.
- Spectrogram image is 129x500.
- Number of channels is 1.
- Channel is mono.
- Sampling rate is 10 kHz.
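The settings above map naturally onto a SoX command line, although the text does not say which program produces the images. The sketch below assumes SoX and only builds the argument list without running it. The function names are chosen for this example, and reading "mono" as SoX's monochrome flag `-m` is also an assumption.

```python
SEGMENT_SECONDS = 10       # segment length from the description above
PIXELS_PER_SECOND = 50
IMAGE_HEIGHT = 129
SAMPLE_RATE = "10k"


def image_width(seconds=SEGMENT_SECONDS, pixels_per_second=PIXELS_PER_SECOND):
    """Width in pixels of a spectrogram covering `seconds` of audio."""
    return seconds * pixels_per_second


def sox_spectrogram_command(wav_path, png_path):
    """Build (but do not run) a SoX command for one segment. Assumes SoX is used."""
    return [
        "sox", wav_path, "-n",          # -n: no audio output, only the image
        "remix", "1",                   # keep a single channel
        "rate", SAMPLE_RATE,            # resample to 10 kHz
        "spectrogram",
        "-y", str(IMAGE_HEIGHT),        # image height in pixels
        "-X", str(PIXELS_PER_SECOND),   # horizontal pixels per second
        "-m",                           # monochrome image (assumed meaning of "mono")
        "-r",                           # raw image without axes or legend
        "-o", png_path,
    ]
```

A 10-second segment at 50 pixels per second gives `image_width()` of 500, which matches the 129x500 image size listed above. The resulting list could be passed to `subprocess.run` if SoX is installed.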
The creation of spectrograms is followed by a check for bad spectrograms. The steps are as follows:
- The image is flattened into a vector of pixel values.
- The arithmetic mean of the pixel values is calculated.
- The arithmetic mean of the pixel values calculated in the previous step is subtracted from each pixel value of the image.
- The image is removed from the dataset if the number of non-zero elements in the vector obtained by subtraction is equal to zero.
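The steps above can be sketched in plain Python as follows. The check amounts to testing whether every pixel has the same value, which is the case for a blank image. The function name `is_bad_spectrogram`, the nested-list image representation, and the treatment of an empty image as bad are assumptions made for this example.

```python
def is_bad_spectrogram(image):
    """Return True if the image (rows of pixel values) carries no information."""
    # Step 1: flatten the image into a single vector of pixel values.
    pixels = [value for row in image for value in row]
    if not pixels:
        return True  # assumption: an empty image is treated as bad

    # Step 2: arithmetic mean of the pixel values.
    mean = sum(pixels) / len(pixels)

    # Step 3: subtract the mean from every pixel value.
    centered = [value - mean for value in pixels]

    # Step 4: the image is bad if no element of the result is non-zero.
    nonzero_count = sum(1 for value in centered if value != 0)
    return nonzero_count == 0
```

A uniformly grey image such as `[[128, 128], [128, 128]]` is flagged as bad, while `[[0, 255], [128, 64]]` is kept.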
Install the dependencies with the "pip3 install -r requirements.txt" command.
Then go into the "Source" folder and run the "python3 main.py" command.