messing around with the quickprop-cascade net some more… it has been training a single candidate node’s weight set for about 24 hours now… maybe i should try to implement quickprop on the candidate training as well? here is the output after 4 nodes were added; it is currently working on adding the fifth node.

Archive for December, 2007
31 december 2007
31 December 2007
problems with quickprop…
28 December 2007
did some testing of the quickprop-cascade network today, it seems to act really weird… after adding 44 hidden nodes (on a problem that could probably be solved with 2 or 3) it was only able to do this:

more quickprop + cascade
26 December 2007
worked on fixing some of the problems described in the last post. the network can now solve simple problems every time, but some very quick tests suggest it may still have problems with anything more complicated. file cascnet12.nb backed up on euclid.
still working on quickprop with cascade
23 December 2007
just made a few small changes to the file today, added a new function OnceThruTS that will make future changes easier. still need to add checks for a nonzero denominator and nonzero weight changes, which means looking at every weight on every output node individually. file cascade11.nb backed up on euclid.
quickprop
22 December 2007
i think i’ve gotten a rough implementation of quickprop halfway working… one problem i was having previously was understanding the terms used in the paper. i am using α as the weight adjustment coefficient and λ as the momentum term, while the fahlman paper uses ε as the weight adjustment coefficient and α as the momentum term. once i fixed a few problems coming from that misunderstanding, the training seems to work sometimes. i am using the cascade net, but only training with no hidden nodes being added, on a simple problem that can be solved without hidden nodes. when it works the network error decreases rapidly, though the amount it decreases by each time through the training set jumps around a lot more than with the standard delta rule (which is expected). there are two different ways in which it can fail. the first is when it gives a divide by zero ComplexInfinity error. the “maximum growth factor” described on page 11 of the paper might be a solution to this. the second is when it keeps adjusting the weights by zero while there is still a large error. this is probably because at some point the weight change was zero, and using the quickprop formula with no modifications then multiplies all future weight adjustments by zero. a solution for this problem may also be discussed on page 11. modifications were made to the file cascnet10.nb, backed up on euclid.
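the two failure cases above can be guarded per weight. below is a minimal sketch in python (not the notebook code), assuming a simplified form of the quickprop rule: `slope` is the error derivative for one weight, `mu` is the maximum growth factor, and `alpha` follows the naming used in this post for the weight adjustment coefficient. the function names, the default values and the `tiny` threshold are made up for illustration.

```python
def quickprop_step(slope, prev_slope, prev_step, alpha=0.1, mu=1.75, tiny=1e-12):
    """Return the change to apply to one weight (simplified quickprop sketch).

    slope, prev_slope: dE/dw for this weight at the current and previous epoch.
    prev_step: the change applied to this weight at the previous epoch.
    alpha, mu, tiny: illustrative values, not taken from the notebook.
    """
    # Guard for the "stuck at zero" case: a zero previous step would zero out
    # every later quickprop step, so fall back to plain gradient descent.
    if abs(prev_step) < tiny:
        return -alpha * slope

    denom = prev_slope - slope
    same_sign = slope * prev_slope > 0
    # Guard for the divide-by-zero case: a (near) zero denominator, or a slope
    # that did not shrink, would give an infinite or uphill jump, so take
    # mu times the previous step instead.
    if abs(denom) < tiny or (same_sign and abs(slope) >= abs(prev_slope)):
        return mu * prev_step

    # Ordinary case: jump toward the minimum of the fitted parabola,
    # but never grow the step by more than the factor mu.
    factor = min(slope / denom, mu)
    return factor * prev_step


def minimize(grad, w=0.0, epochs=20):
    """Apply quickprop_step repeatedly to a single weight w."""
    prev_slope, prev_step = 0.0, 0.0
    for _ in range(epochs):
        slope = grad(w)
        step = quickprop_step(slope, prev_slope, prev_step)
        w += step
        prev_slope, prev_step = slope, step
    return w
```

on a one-dimensional quadratic error such as E(w) = (w - 3)^2, calling `minimize(lambda w: 2 * (w - 3))` should settle close to 3 within a few epochs. in a real network the same check would have to run separately for every weight on every output node.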
testing
21 December 2007
did a little bit more testing of the cascade net (not using quickprop) overnight, and saw behavior similar to the first test. after one node was added the network gave output similar to the output when five nodes were added (seen below). also modified it so that after training each hidden node the program displays a chart of how “S” changed over time. it seems to be constantly increasing. backed up the file cascnet8b.nb on euclid.
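for reference, here is a minimal python sketch of how such a score could be computed, assuming “S” is the candidate correlation score from cascade-correlation: the sum over output nodes of the absolute covariance between the candidate node’s value and the remaining output error. the function and argument names are hypothetical, not from the notebook. if that assumption holds, a steadily rising S is what candidate training is supposed to produce, since it adjusts the candidate weights to maximize S.

```python
def candidate_score(values, errors):
    """Correlation score S for one candidate node (illustrative sketch).

    values: candidate node output for each training pattern.
    errors: for each training pattern, a list with the residual error
            at each output node.
    """
    n = len(values)
    v_mean = sum(values) / n
    n_outputs = len(errors[0])
    score = 0.0
    for o in range(n_outputs):
        e_mean = sum(row[o] for row in errors) / n
        # Covariance (not normalized) between candidate value and error at output o
        cov = sum((v - v_mean) * (row[o] - e_mean) for v, row in zip(values, errors))
        # The sign does not matter: a strong negative correlation is just as useful
        score += abs(cov)
    return score
```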

implementing quickprop
8 December 2007
did some work on implementing the quickprop algorithm but it doesn’t seem to be working properly right now. the change in network error decreases very rapidly without the network’s error actually changing that much at all. the file cascnet9.nb with the quickprop has been backed up on euclid. i also made the network make use of the “momentum term” (as described in gurney’s book) starting with the file cascnet8.nb, which is also backed up on euclid.
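as a reminder of what the momentum term does, here is a minimal python sketch of a delta-rule update with momentum for a single weight, assuming the usual form (new change = gradient step plus a fraction of the previous change). `alpha` and `lam` follow the α and λ naming from the 22 December post; the function name and default values are hypothetical.

```python
def momentum_step(slope, prev_step, alpha=0.1, lam=0.9):
    """Weight change for one weight using gradient descent with momentum.

    slope: dE/dw for this weight.
    prev_step: the change applied to this weight on the previous update.
    alpha: weight adjustment coefficient; lam: momentum term (illustrative values).
    """
    # Move downhill along the gradient, plus a fraction of the previous move
    return -alpha * slope + lam * prev_step
```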
use of the cascade network on very simple patterns
7 December 2007
the cascade network is able to match patterns divided by a single line without adding any nodes.


still not right
2 December 2007
training of the cascade network still doesn’t seem to be right. even with the same general “circle” pattern it very rarely gets as close as my first try did. i might look into this paper by fahlman, which describes problems that seem similar to some of the ones i’m having. i think i will try to implement the “quickprop” algorithm described there.
after training overnight
2 December 2007
the cascade network added 104 nodes and its output is almost exactly the same as when there were only 4 nodes. it looks like 4 red points were missed and 0 blue points were missed. network error was 0.0852 (i had set it up to stop training at 0.05). the problem, i believe, is that training is ending prematurely for both the hidden and output nodes. i observed some behavior in which the “change in error” first drops to a very low number, then starts to rise again. if it drops so low that it falls below the minimum error change level, training of the weights will stop.
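one way to keep a temporary dip in the “change in error” from ending training is to require the change to stay small for several passes in a row. a minimal python sketch, assuming the network error after each pass through the training set is kept in a list; `should_stop`, `min_change` and `patience` are hypothetical names and values, not from the notebook.

```python
def should_stop(errors, min_change=1e-4, patience=5):
    """Decide whether training has stalled.

    errors: network error after each pass through the training set, oldest first.
    Returns True only if the error changed by less than min_change
    on each of the last `patience` passes.
    """
    if len(errors) <= patience:
        return False  # not enough history yet
    recent = errors[-(patience + 1):]
    changes = [abs(b - a) for a, b in zip(recent, recent[1:])]
    return all(change < min_change for change in changes)
```

with `patience=1` this reduces to the single-step check that seems to be stopping training too early; a larger value lets training continue through a brief flat stretch.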