Tickle.py Dev Diary #7 - fin
I have managed to get my model working in a standalone audio generation notebook! More importantly, I have also come across a Stack Overflow answer that might solve my worries about the missing vertical audio artefacts.
It seems that some of my generated WAV files contain clipped values in their sample arrays, which causes issues when reading the file. If I can replace these values with silence or noise, I might be able to salvage these outputs.
The downside of this approach is that we might have to deal with audio discontinuities, which might ruin speakers. But this could also be a secondary way to generate ASMR content. That said, I think this might also be partly a training issue, from not supervising and tweaking the model's output in a more hands-on manner. This is, however, beyond the scope of my current project.
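As a rough illustration of the salvage idea, here is a minimal sketch. It assumes the audio has already been loaded as a list of float samples that should sit in [-1.0, 1.0]. The function name `silence_bad_samples` and the `limit` parameter are made up for this example and are not part of tickle.py or of the Stack Overflow answer.

```python
import math


def silence_bad_samples(samples, limit=1.0):
    """Return (cleaned, replaced_count) for a list of float samples.

    Assumption: valid samples lie in [-limit, limit]. Anything outside
    that range, or any NaN/inf value, is treated as damaged and is
    replaced with silence (0.0).
    """
    cleaned = []
    replaced = 0
    for s in samples:
        if not math.isfinite(s) or abs(s) > limit:
            cleaned.append(0.0)  # swap the bad value for silence
            replaced += 1
        else:
            cleaned.append(s)
    return cleaned, replaced


if __name__ == "__main__":
    demo = [0.1, float("nan"), 1.5, -0.3]
    print(silence_bad_samples(demo))
```

Swapping in low-level noise instead of 0.0 would be a one-line change, but either way the hard jump to the replacement value is where a discontinuity can creep in.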
Right now I’m experimenting with turning the temperature up from 0.95 to 5.0 to see if that does anything to my outputs! Generating audio is also much faster: it now takes 3 minutes to generate 8 seconds of content!
Let’s take a listen!
As you can hear, this is (1) much louder than the other outputs and (2) much noisier. Is this reflected in the plotted graphs, however?
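For anyone wondering what the temperature knob does, here is a minimal sketch of the usual temperature-scaled softmax sampling. This is the textbook version and an assumption on my part, not necessarily how the model's generation code applies it internally. The function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, flattened or sharpened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature=1.0, rng=random):
    """Draw one index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    for t in (0.95, 1.1, 5.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

The higher the temperature, the closer the probabilities get to uniform, so at 5.0 the sampler picks almost at random among the possible values.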
Yes. This is literally just noise.
Let me repeat this with a more trained model to see if this changes anything.
Yeah, no, still noise. (What surprises me, however, is that both of these examples run fine and there are no sample discontinuities. Could the high-temperature random noise be successfully masking these?)
Let’s try lowering the temperature to something like 1.5 for e8, shall we?
Still noise, but less noisy than 5.0. Looking carefully at the spectrogram, we can see vertical sections forming that look similar to ASMR impulse responses; however, the audio is too noisy to perceive them clearly.
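One crude way to check for sample discontinuities is to look for large jumps between neighbouring samples. The sketch below assumes float samples in [-1.0, 1.0]. The function name and the `max_jump` threshold of 0.5 are arbitrary choices for illustration, not values taken from the project.

```python
def find_discontinuities(samples, max_jump=0.5):
    """Return the indices i where samples[i] jumps by more than max_jump
    relative to the previous sample."""
    return [
        i
        for i in range(1, len(samples))
        if abs(samples[i] - samples[i - 1]) > max_jump
    ]


if __name__ == "__main__":
    print(find_discontinuities([0.0, 0.1, 0.9, 0.8, -0.2]))
```

Loud broadband noise produces large sample-to-sample jumps everywhere, so a simple check like this may not be able to tell a real discontinuity apart from the noise itself.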
Let’s go down to t = 1.1:
This sounds and looks much better! The ASMR clicks are clearly audible and visible in the spectrogram, and the temperature noise is generating a soothing background noise.
Let me run this 4 more times to see if this is a fluke. If these settings can reliably generate ASMR content, I am happy to use them as my defaults for the tickle.py model!
Yay, we have a reliable working ASMR model now! Future improvements could look into noise suppression to get rid of the white noise in the background, but for now I am happy with the results: white noise is often found in the background of ASMR content, so it is more a feature than an issue.
Also, whilst making this I have put the final touches on the Google Colab Generator and am ready to release the model to the public! HAVE FUN! Thank you prismRNN + Dr Christopher Melen, this would not be possible without you!
I think I have achieved everything I wanted to do in this project, so I am happy to leave it parked here for the time being now that there is a demonstrable outcome. I may come back to it later, as there are a few features that I would be interested in adding. (In addition, I intend to use the hidden seed audio generation capabilities as an audio effect in one of my future projects!)