Archive for the 'sound' Category

the effect of slope

23 January 2008

did some tests with the linear outputs to see how the slope affected the training of the network. all networks were trained with 20000 random training patterns and had an output target multiplier of 10. networks with output slopes of 5 and 10 failed: the error jumped very high very quickly, and the output nodes always output either positive or negative 490000. networks with output slopes of 1 and 0.1 worked, with 0.1 working just a tiny bit better than a slope of 1 and achieving a slightly lower minimum error. both networks still failed to recognize the first note. i am not going to try the same tests with even lower slopes.
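a rough illustration of why a large output slope can blow up, assuming plain gradient descent on squared error (the actual update rule isn't stated here, so this is only a sketch). for a single linear output y = slope * w * x, each update multiplies the error by (1 - lr * slope^2 * x^2), so with a fixed learning rate the step effectively grows with the square of the slope. the learning rate, input, target and function name below are all made up for the example.

```python
def train_linear_unit(slope, lr=0.1, steps=200):
    """train one linear output y = slope * w * x on a single pattern.

    returns the absolute error after training (or at the point of blow-up).
    x, target and lr are hypothetical values chosen for illustration.
    """
    w = 0.0
    x, target = 1.0, 10.0
    for _ in range(steps):
        err = slope * w * x - target
        # gradient of 0.5 * err^2 with respect to w is err * slope * x
        w -= lr * err * slope * x
        if abs(err) > 1e12:  # stop once the error has clearly diverged
            break
    return abs(slope * w * x - target)


if __name__ == "__main__":
    for s in (0.1, 1, 5, 10):
        print(s, train_linear_unit(s))
```

in this toy setup slopes of 0.1 and 1 shrink the error on every step while 5 and 10 diverge, which is at least consistent with the failures above; it says nothing about why 0.1 did slightly better than 1 in the real network.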

linear output nodes

23 January 2008

yesterday i modified my network so that a different activation function could be used in each layer, and set it so that the output layer had linear activation functions rather than the typical sigmoid. the network was trained overnight on about 563000 randomly generated training patterns. results don’t seem as good as with the sigmoidal outputs… here is the network’s output for the four-note scale:
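for reference, a minimal sketch of a forward pass where each layer carries its own activation function. this is python rather than the mathematica notebook, and the layer layout, weights and function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(z, slope=1.0):
    # linear activation; slope is the output slope
    return slope * z


def forward(inputs, layers):
    """layers is a list of (weights, biases, activation) tuples.

    weights[j] holds the incoming weights of node j in that layer.
    """
    values = inputs
    for weights, biases, activation in layers:
        values = [
            activation(sum(w * v for w, v in zip(row, values)) + b)
            for row, b in zip(weights, biases)
        ]
    return values


# hypothetical 2-2-1 network: sigmoid hidden layer, linear output layer
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], sigmoid),
    ([[2.0, -2.0]], [0.5], linear),
]

if __name__ == "__main__":
    print(forward([0.0, 0.0], example_layers))
```

one consequence of the linear output layer is that outputs are no longer squashed into the 0 to 1 range, so targets and errors can take any magnitude.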

glocklinear.png

the last three notes were hit fairly well, but it looks like it completely missed the first one. i think it might be best to go back to the sigmoid outputs for now.

today i need to figure out a better way of checking the network’s error. i am thinking of generating 100 training patterns, finding the error for each of those, then averaging. i also need to look more into how to write to files. the way i’m using now (writing as “Package”) outputs all the data without rounding anything off, but i don’t know if it can easily be imported back into mathematica.
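the averaging idea could look roughly like this. it is a sketch only: `predict` and `make_pattern` are hypothetical stand-ins for the network and the random pattern generator, and mean squared error per output is an assumed choice of error measure.

```python
import random


def mean_error(predict, make_pattern, n=100, rng=None):
    """average the per-pattern mean squared error over n fresh random patterns.

    predict(inputs) -> list of outputs
    make_pattern(rng) -> (inputs, targets)
    """
    rng = rng or random.Random()
    total = 0.0
    for _ in range(n):
        inputs, targets = make_pattern(rng)
        outputs = predict(inputs)
        total += sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)
    return total / n
```

because the patterns are generated fresh each time, the number wobbles a little from call to call; passing in a seeded `rng` makes successive checks comparable.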

training glockenspiel over the long weekend

22 January 2008

trained the four-note glockenspiel network setup over the long weekend… unfortunately i was unable to tell how many training iterations it went through, but it seems to fit randomly generated training patterns pretty well. here is the output for the sound that the training patterns were based on:

glockscale.png

the network’s weights are saved in sndnn03.nb.

training sound

18 January 2008

finished updating the old-style network today. current file is sndnn03.nb. started training on a simple set of four notes played on glockenspiel (network has four outputs). one note’s major peak was outside the 64 point range i usually use for the fourier transform, so i extended it to 72 points for this problem. noticed that training was very slow, and that what was causing the slowness was the “detune” function i had written yesterday. so instead of using it on every single new training pattern, i decided to just run detune 21 times on every note of every instrument (multipliers from 0.99 to 1.01), store those sounds, and then randomly select one of the tunings for each training set. this sped training up considerably, probably by at least a factor of 100.
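the caching trick, sketched in python. the `detune` here is a crude stand-in (linear-interpolation resampling, which also changes the duration) and not the mathematica function from the notebook; `build_cache` and `pick_tuning` are made-up names.

```python
import random

# 21 multipliers from 0.99 to 1.01 in steps of 0.001
MULTIPLIERS = [0.99 + 0.001 * k for k in range(21)]


def detune(samples, multiplier):
    """crude pitch shift: resample by `multiplier` with linear interpolation."""
    n_out = int(len(samples) / multiplier)
    out = []
    for i in range(n_out):
        pos = i * multiplier
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def build_cache(notes):
    """notes: dict name -> list of samples. runs the slow detune once per tuning."""
    return {name: [detune(s, m) for m in MULTIPLIERS] for name, s in notes.items()}


def pick_tuning(cache, name, rng):
    """cheap per-pattern step: choose one of the stored tunings at random."""
    return rng.choice(cache[name])
```

the trade-off is memory for time: every note is stored 21 times, but generating a training pattern becomes a lookup instead of a resampling pass.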

sound

17 January 2008

worked on looking back into my sound stuff today, transferring functions over to my new network program, making improvements and completely rewriting some things so that they can be used more easily. while looking into my old program i saw that i actually hadn’t been pitchshifting/detuning the sound files (i only did that with generated sounds), so i wrote a function that can pitchshift the files within mathematica. i also greatly improved the random training pattern generation function… now it will always output at least one instrument playing one note (but now that i think about it, there should be at least a small chance that no instrument is playing), and instruments can be specified to be polyphonic or not. another idea i might try to implement in the future is adding random noise to the training patterns. the file is sndnn01.nb and it is backed up on euclid.
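a sketch of what such a pattern generator might look like, including the two ideas that are only mentioned above as possibilities (a small chance of silence, and added noise). everything here is hypothetical: the data layout, the names, the 5% silence default and the gaussian noise are assumptions, not what the notebook does.

```python
import random


def random_pattern(instruments, rng, silence_prob=0.05):
    """instruments: dict name -> (list of notes, polyphonic flag).

    returns (active, targets): active is a list of (instrument, note) pairs,
    targets maps every (instrument, note) pair to 0 or 1.
    """
    targets = {
        (name, note): 0
        for name, (notes, _) in instruments.items()
        for note in notes
    }
    active = []
    if rng.random() >= silence_prob:  # otherwise: nothing is playing
        names = list(instruments)
        # at least one instrument plays at least one note
        for name in rng.sample(names, rng.randint(1, len(names))):
            notes, polyphonic = instruments[name]
            count = rng.randint(1, len(notes)) if polyphonic else 1
            for note in rng.sample(notes, count):
                active.append((name, note))
                targets[(name, note)] = 1
    return active, targets


def add_noise(samples, rng, amount=0.01):
    """add gaussian noise with standard deviation `amount` to each sample."""
    return [s + rng.gauss(0.0, amount) for s in samples]
```

setting `silence_prob=0` recovers the "always at least one note" behaviour described above.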

also, the three other computers in here have been set up to not reset, so in the future i’ll be able to train overnight on four machines.