distributed/local

Terry Sejnowski tsejnowski at UCSD.EDU
Fri Jun 7 22:48:11 EDT 1991


A nice paper that compares ID3 decision trees with backprop
on NETtalk and other data sets:

Shavlik, J. W., Mooney, R. J., and Towell, G. G.
Symbolic and neural learning algorithms: An experimental
comparison (revised).
Univ. of Wisconsin, Dept. of Computer Sciences, Tech. Report #955
(to appear in Machine Learning, vol. 6).

Overall, backprop performed slightly better than ID3 but took
longer to train.  Backprop was also more effective in using
distributed coding schemes for the inputs and outputs.
An error-correcting code, or even a random code, works
better than a local code or hand-crafted features.
(Ghulum Bakiri and Tom Dietterich reached the same conclusion).
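
To make the output-coding comparison concrete, here is a minimal
numpy sketch of the idea (my own illustration; the class count, code
length, and random codebook are arbitrary stand-ins, not the paper's
actual setup):

    # Sketch of output coding: each of K classes gets an L-bit codeword.
    # A "local" code is one-hot (L = K); a random or error-correcting
    # code spreads the classes over the bits so that a few wrong output
    # bits can still decode to the right class.
    import numpy as np

    rng = np.random.default_rng(0)
    K, L = 26, 15                               # 26 classes, 15 output bits
    codebook = rng.integers(0, 2, size=(K, L))  # rows are codewords

    def encode(label):
        # Target bit vector the network is trained to produce.
        return codebook[label]

    def decode(output_bits):
        # Pick the class whose codeword is nearest in Hamming distance,
        # so isolated output-bit errors can be corrected.
        return int(np.abs(codebook - output_bits).sum(axis=1).argmin())

    noisy = encode(3).copy()
    noisy[[1, 7]] ^= 1        # flip two output bits
    print(decode(noisy))      # still 3 when codewords are well separated

A designed error-correcting codebook guarantees a minimum Hamming
distance between codewords, which is what buys the robustness; a
random codebook gets good separation only with high probability.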

The code developed by the hidden units is also an
interesting issue.  In NETtalk, the intermediate code was
semidistributed -- around 15% of the hidden units were
used to represent each letter-to-sound correspondence.
The vowels and the consonants were fairly well
segregated, arguing for local coding at the gross population level
but distributed coding at the level of single units -- both
patterns that are also observed in the brain.
The degree of coarseness clearly depends on
the grain of the problem.
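
The 15% figure comes from measuring hidden-unit activity across
input patterns.  The flavor of the measurement is easy to sketch
(stand-in weights and data below, not the trained NETtalk network;
the 203-input/80-hidden sizes match the NETtalk architecture, but
the 0.8 activity threshold is an arbitrary choice):

    # For each input pattern, count the fraction of hidden units that
    # are strongly active; a semidistributed code gives a small fraction.
    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_hid, n_pat = 203, 80, 1000      # NETtalk-sized layers
    W = rng.normal(0, 0.3, (n_in, n_hid))   # stand-in for trained weights
    b = rng.normal(0, 0.3, n_hid)
    X = rng.integers(0, 2, (n_pat, n_in)).astype(float)

    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # logistic hidden units
    active = (H > 0.8).mean(axis=1)         # per-pattern active fraction
    print(active.mean())  # ~0.15 for trained NETtalk; random weights differ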

In the original study, Charlie Rosenberg and I showed that
backprop with hidden units outperformed perceptrons -- and hence
26 independent linear discriminants, since with no hidden layer
each of the 26 output units learns on its own (a small illustration
appears at the end of this message).  The NETtalk
database is available to anyone who wants to benchmark
their learning algorithm.  For ftp access contact

Scott.Fahlman at b.gp.cs.cmu.edu
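
As promised above, a small illustration of why a perceptron with 26
output units and no hidden layer amounts to 26 independent linear
discriminants (made-up weights and input; squared error and logistic
units assumed for the gradient):

    # With no hidden layer, the error gradient for output unit j touches
    # only column j of the weight matrix, so the 26 units train as 26
    # separate linear problems.  Shared hidden units couple the outputs,
    # because every output's error flows back through the same layer.
    import numpy as np

    rng = np.random.default_rng(2)
    n_in, n_out = 203, 26
    W = rng.normal(0, 0.1, (n_in, n_out))
    x = rng.integers(0, 2, n_in).astype(float)
    t = np.zeros(n_out); t[3] = 1.0          # one-hot target

    y = 1.0 / (1.0 + np.exp(-(x @ W)))       # logistic outputs
    delta = (y - t) * y * (1 - y)            # per-output error term
    grad = np.outer(x, delta)                # d(squared error)/dW

    # Column j depends only on output j's own error: independence.
    print(np.allclose(grad[:, 3], x * delta[3]))  # True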

Terry


