the method described in the previous post doesn’t seem to work as well on other problems. i have been trying it on the “circle” problem and the “two circle” problem, and it does not do well on either. the approach that works best in general right now is the old standard backprop/gradient descent method that looks at each training pattern individually.
the new method didn’t work very well on the cascade spiral problem, either.
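for reference, here is a rough sketch of what i mean by the old per-pattern backprop/gradient descent: the weights get updated after every single training pattern instead of once per batch. this is not the code from my experiments. the “circle” task definition here (label 1 inside a circle of radius sqrt(0.5) on the square [-1, 1] x [-1, 1]), the network size, the learning rate, the cross-entropy loss, and all the function names are assumptions made up for illustration.

```python
import math
import random

def make_circle_data(n, rng):
    # assumed "circle" task: label 1.0 inside a circle of radius sqrt(0.5), else 0.0
    data = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x, y), 1.0 if x * x + y * y < 0.5 else 0.0))
    return data

def init_net(n_hidden, rng):
    # each hidden unit holds [w_x, w_y, bias]; the output unit holds n_hidden weights + bias
    hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, point):
    hidden, output = net
    x, y = point
    h = [math.tanh(w[0] * x + w[1] * y + w[2]) for w in hidden]
    z = sum(wo * hi for wo, hi in zip(output, h)) + output[-1]
    z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
    return h, 1.0 / (1.0 + math.exp(-z))

def train_pattern(net, point, target, lr):
    # one backprop + gradient descent step using a single training pattern
    hidden, output = net
    x, y = point
    h, out = forward(net, point)
    delta_out = out - target  # cross-entropy gradient w.r.t. the output pre-activation
    for j, w in enumerate(hidden):
        # backpropagate through the not-yet-updated output weight and the tanh
        delta_h = delta_out * output[j] * (1.0 - h[j] ** 2)
        w[0] -= lr * delta_h * x
        w[1] -= lr * delta_h * y
        w[2] -= lr * delta_h
        output[j] -= lr * delta_out * h[j]
    output[-1] -= lr * delta_out

def mean_loss(net, data):
    total = 0.0
    for point, target in data:
        _, out = forward(net, point)
        out = min(max(out, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total -= target * math.log(out) + (1.0 - target) * math.log(1.0 - out)
    return total / len(data)

def accuracy(net, data):
    hits = sum((forward(net, p)[1] > 0.5) == (t > 0.5) for p, t in data)
    return hits / len(data)

rng = random.Random(0)
data = make_circle_data(200, rng)
net = init_net(8, rng)
loss_before = mean_loss(net, data)

for epoch in range(200):
    rng.shuffle(data)  # visit the patterns in a new order each epoch
    for point, target in data:
        train_pattern(net, point, target, lr=0.1)  # update after every pattern

loss_after = mean_loss(net, data)
print("loss before:", loss_before)
print("loss after: ", loss_after)
print("training accuracy:", accuracy(net, data))
```

the only part that matters for the comparison is the inner loop: `train_pattern` changes the weights right away, so each pattern sees the weights left behind by the one before it.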