batch learning

Fri Nov 1 17:00:02 EST 1991

I think my question concerning batch learning needs some amplification.

The question was whether to add up the error contributions from all
patterns before taking derivatives. The point is that the batch dE/dW can
be estimated (given limited precision, noise, etc.) more accurately than
the sum of many small dE(pat)/dW components. This will be particularly true
towards the end of learning, when errors are small.
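
As a rough illustration of the precision argument, here is a minimal sketch
in which "limited precision" is modelled as rounding each measured derivative
to a fixed step. The step size, the per-pattern values, and the quantize
helper are all assumptions made up for this example, and the sketch assumes
the batch error can be accumulated at full precision before its derivative
is measured.

```python
def quantize(value, step):
    """Round value to the nearest multiple of step (a crude limited-precision model)."""
    return round(value / step) * step


# Hypothetical numbers, chosen only for illustration.
step = 0.01                  # assumed resolution of a derivative measurement
per_pattern = [0.003] * 100  # small dE(pat)/dW values, as late in learning

# Per-pattern approach: each component is measured at limited precision, then summed.
sum_of_quantized = sum(quantize(g, step) for g in per_pattern)

# Batch approach: the total derivative is measured once at the same precision.
quantized_batch = quantize(sum(per_pattern), step)

print("sum of quantized per-pattern derivatives:", sum_of_quantized)
print("quantized batch derivative:", quantized_batch)
```

Under these assumptions every per-pattern component falls below half the
step and rounds to zero, so their sum carries no information, while the
batch derivative (about 0.3) survives the same quantization.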

There may be several ways to determine the weight error derivative. The
batch dE/dW would be most directly determined by perturbing the weights
individually and rerunning the training set. I know this is expensive, but
the issue is accuracy, not efficiency.
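
A minimal sketch of that direct approach, assuming a single linear unit with
sum-squared error and a central difference for the perturbation. The network,
the training patterns, the delta value, and all function names are
hypothetical choices for illustration, not anything specified above.

```python
def batch_error(weights, patterns):
    """Sum-squared error of one linear unit over the whole training set."""
    total = 0.0
    for inputs, target in patterns:
        output = sum(w * x for w, x in zip(weights, inputs))
        total += 0.5 * (target - output) ** 2
    return total


def perturbation_gradient(weights, patterns, delta=1e-4):
    """Estimate batch dE/dW by perturbing one weight at a time
    and rerunning the full training set for each perturbation."""
    grad = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += delta
        down[i] -= delta
        diff = batch_error(up, patterns) - batch_error(down, patterns)
        grad.append(diff / (2.0 * delta))
    return grad


def summed_pattern_gradient(weights, patterns):
    """Reference value: sum of the analytic per-pattern dE(pat)/dW terms."""
    grad = [0.0] * len(weights)
    for inputs, target in patterns:
        output = sum(w * x for w, x in zip(weights, inputs))
        for i, x in enumerate(inputs):
            grad[i] += -(target - output) * x
    return grad


# Hypothetical training set: (inputs, target) pairs.
patterns = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.5)]
weights = [0.2, -0.3]

estimated = perturbation_gradient(weights, patterns)
reference = summed_pattern_gradient(weights, patterns)
print("perturbation estimate:", estimated)
print("summed per-pattern   :", reference)
```

The cost is two passes through the training set per weight, which is the
expense mentioned above. In floating point the two results agree closely;
the difference Card is pointing at only shows up when each measurement is
noisy or of limited precision.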

Howard Card