Tickle.py Dev Diary #4
We finished training our model in a Colab notebook (thank you, Google, and your Tesla T4) and we have audio files! Below is the first generated example we got when my second epoch finished training (4,500 samples at a batch size of 128 per epoch).
As you can hear, it's nothing too special and mostly silence; we only started getting explicit ASMR triggers in our examples much later, at epoch 8.
Here we are getting clear ASMR triggers, and I am very happy with how this checkpoint sounds. Out of curiosity, however, I continued training for 12 more epochs.
The last good click you can hear is at epoch 12; after that the generated examples are very quiet and filled with noise, as in the epoch 20 example below.
This could be a classic case of overfitting the model, but to be sure I will generate multiple 12-second examples from each epoch from 8 onwards. At the moment I only have one example per stage, so this should give me a clearer picture of how each training stage sounds.
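A minimal sketch of how I might queue up those per-epoch generations. Note that the script name, flag names, and checkpoint layout here are placeholders I made up for illustration, not prism-samplernn's actual CLI, so check the project's own docs before running anything like this:

```python
# Build (but don't yet run) one generation command per example per checkpoint.
# "generate.py", the flags, and the checkpoint paths are all hypothetical.

def build_generate_commands(first_epoch=8, last_epoch=20,
                            num_examples=3, duration_secs=12):
    commands = []
    for epoch in range(first_epoch, last_epoch + 1):
        for i in range(num_examples):
            commands.append([
                "python", "generate.py",                       # assumed script name
                "--checkpoint", f"ckpt/model.ckpt-{epoch}",    # assumed layout
                "--dur", str(duration_secs),                   # assumed flag
                "--output_path", f"out/epoch{epoch}_{i}.wav",  # assumed flag
            ])
    return commands

cmds = build_generate_commands()
print(len(cmds))  # 13 epochs x 3 examples = 39 commands
```

Each command list could then be handed to `subprocess.run` once the real flag names are filled in.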
The next step is to analyse each training checkpoint and find one that I am happy with and that sounds great, then write a wrapper for the prism-samplernn generate script that creates audio from my chosen checkpoint. (I also want to normalise the audio, as most of these generated examples are very quiet.)
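For the normalisation step, a simple peak normalise could look like the sketch below. It assumes the generated WAV has already been decoded to floats in [-1, 1] (e.g. with a library like soundfile); everything here is illustrative, not the wrapper itself:

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale samples so the loudest one lands at target_peak.

    samples: sequence of floats in [-1.0, 1.0] (assumed already decoded
    from the generated WAV file).
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # all-silent example: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

quiet = [0.0, 0.01, -0.02, 0.015]   # a very quiet generated clip
loud = peak_normalize(quiet)
print(max(abs(s) for s in loud))    # ~0.9
```

Peak normalisation just rescales by the loudest sample, which suits these quiet clips; a loudness-based normalise (e.g. RMS) would be the fancier alternative.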
The SampleRNN instance running locally on my computer is still halfway through its first epoch at a batch size of 4, so it's unlikely I will have suitable examples from it. But if it finishes before I am done with the script, it would be nice to compare how the different batch sizes sound. (Special thanks to Kevin for introducing me to Colab notebooks; otherwise I would still be in development hell.)
Anyway, almost done!