batch-mode parallel implementations

Fri Oct 18 14:30:16 EDT 1991

A couple of clarifications with regard to Yann's post:

i) The dataset used in the comparison had a high degree of redundancy.

ii) The "batch-mode" back-prop was vanilla fixed-step gradient descent, not
    a second-order method.

The issue of "batch" versus "on-line" is still a very open one. For relatively
small problems (for me, fewer than ~5000 cases) I prefer conjugate gradient
because it is accurate and needs no parameter tuning. Batch techniques such as
this are also very easy to parallelize over cases.
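
As a rough illustration of the point about parallelizing over cases (a
sketch, not the code used in the comparison): the batch gradient is a sum of
per-case terms, so the cases can be split into shards, each shard's gradient
computed independently, and the partial results added. The single linear
unit, the squared-error loss, the toy data and the names partial_gradient and
batch_gradient are all assumptions made for the example. Python threads are
used only to show the structure; they do not give a real speed-up here.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_gradient(weights, cases):
    """Sum of squared-error gradients for one linear unit over `cases`."""
    grad = [0.0] * len(weights)
    for inputs, target in cases:
        output = sum(w * x for w, x in zip(weights, inputs))
        error = output - target
        for i, x in enumerate(inputs):
            grad[i] += error * x
    return grad

def batch_gradient(weights, cases, n_workers=4):
    """Split the cases into shards, compute each shard's gradient, add them."""
    shards = [cases[k::n_workers] for k in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(
            pool.map(lambda shard: partial_gradient(weights, shard), shards)
        )
    # Element-wise sum of the per-shard gradients.
    return [sum(column) for column in zip(*partials)]

# Toy data (assumed): 8 cases, 2 inputs each, targets from t = 2*x0 - x1.
cases = [((float(a), float(b)), 2.0 * a - b)
         for a in range(4) for b in range(2)]
weights = [0.5, 0.5]

serial = partial_gradient(weights, cases)   # all cases in one pass
sharded = batch_gradient(weights, cases)    # same gradient, computed by shard
```

Either result can be handed to a fixed-step update or to a conjugate gradient
routine; only the gradient evaluation is split across cases.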

I have also implemented a BP simulator on a Cray that vectorized over
connections rather than cases, and so could run either on-line or batch
techniques with ease. My experience here suggested that speed-ups could be
obtained with networks of as few as a few thousand connections.
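
One way to picture vectorizing over connections is sketched below. It is a
hypothetical layout, not the Cray simulator itself: the flat arrays src, dst
and w, the single layer of linear units, the squared-error loss and the
function online_step are all assumptions. Each connection's weight update
depends only on that connection's own endpoints, so the update loop runs over
connections and works the same way for a single case.

```python
# Hypothetical connection-major layout for one layer of linear units:
# connection c runs from input src[c] to output dst[c] with weight w[c].
src = [0, 1, 0, 1]
dst = [0, 0, 1, 1]
w = [0.1, 0.2, 0.3, 0.4]

def online_step(w, inputs, targets, lr=0.1):
    """One on-line squared-error update for a single case."""
    # Forward pass: add each connection's contribution to its output unit.
    outputs = [0.0] * len(targets)
    for c in range(len(w)):
        outputs[dst[c]] += w[c] * inputs[src[c]]
    deltas = [o - t for o, t in zip(outputs, targets)]
    # Weight update: one independent element-wise operation per connection.
    return [w[c] - lr * deltas[dst[c]] * inputs[src[c]]
            for c in range(len(w))]

new_w = online_step(w, inputs=[1.0, 2.0], targets=[1.0, 0.0])
```

A batch variant would accumulate the per-connection term deltas[dst[c]] *
inputs[src[c]] over all cases before applying it, leaving the loop over
connections unchanged.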

		- Steve