Combining connectionism with the GA
Rik Belew
rik%cs at ucsd.edu
Thu Nov 9 15:53:09 EST 1989
There's been too much Email ink spilled lately about potential interactions
between evolutionary algorithms and connectionist networks for me not to
throw my two cents in. I announced a new tech report, "Evolution, learning
and culture: Computational metaphors for adaptive algorithms" on this list a
couple of weeks ago, and I don't know whether the recent storm of messages
has anything to do with that report or not. Anyway, some of the things I'll
say below are better said there.
Let me begin with Hammerstrom's analysis, related in [rudnick at cse.ogc.edu
26 Oct 89]. His basic point seems to be that connectionist algorithms (he
uses NetTalk as his example) take a long time, and putting an evolutionary
outer loop around them can only make matters worse. And if we are
satisfied with the results of a single backpropagation (BP) search and the
evolutionary loop is doing nothing more than randomly repeating this
experiment, he may be right.
But only if. First, who today is satisfied with vanilla BP in feed-forward
networks? (And I'm not just BP bashing; almost all of what I say applies
equally to other gradient descent connectionist learning algorithms.) A
great deal of current research is concerned both with simplifying such nets
to more minimal structures and with elaborating them (e.g., with recurrence)
to solve more difficult problems. Also, BP performance is known to be highly
sensitive to stochastic variation such as the choice of random initial
weights. Consequently, most investigators do use an 'outer-loop' iteration,
making multiple restarts to improve their confidence in the solution found.
Genetic Algorithms (GAs) can help with these connectionist problems. (To
clarify, when I talk of 'evolutionary
algorithms' I have some variant of John Holland's Genetic Algorithm in mind
because I believe it to be the most powerful. But there are other evolution-
like algorithms (e.g., Ackley, Fogel Jr. & Sr.), and these may also prove useful
to connectionists.)
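A minimal sketch of the restart practice, assuming nothing more than a toy
one-dimensional, multimodal loss in place of a real BP error surface (all
names and constants here are illustrative, not drawn from any of the
systems mentioned):

    import math, random

    def loss(w):
        # toy multimodal 'error surface' standing in for a BP error function
        return math.sin(3.0 * w) + 0.1 * w * w

    def descend(w, steps=200, rate=0.05, eps=1e-5):
        # plain gradient descent, with a numeric estimate of the gradient
        for _ in range(steps):
            w -= rate * (loss(w + eps) - loss(w - eps)) / (2.0 * eps)
        return w

    # the 'outer loop': ten restarts from random initial weights,
    # keeping whichever local minimum turns out best
    best = min((descend(random.uniform(-5.0, 5.0)) for _ in range(10)),
               key=loss)
    print("best restart: w = %.3f, loss = %.3f" % (best, loss(best)))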
Second, there is a considerable body of work that shows that evolutionary
search is far from random. GAs are extremely effective sampling
procedures, at least on some kinds of problems. (See Goldberg's book or the
most recent GA proceedings for characterizations of what makes a problem
'GA-hard'.) Further, there are reasons to believe that connectionist nets are
a good problem for the GA: the GLOBAL SAMPLING performed by the GA is
very complementary to the LOCAL SEARCH performed by gradient descent
procedures like BP. Bridges complains that we are compounding ignorance
when we try to consider hybrids of connectionist and GA algorithms
[clay at cs.cmu.edu 7 Nov 89]. But we are beginning to understand the basic
features of connectionist search (as function approximators, via analysis of
internal structure, etc.), and there are substantial things known about the
GA, too (e.g., Holland's Schema Theorem and its progeny). These foundations
do suggest deliberate research strategies and immediately eliminate others
(e.g., some of the naive ways in which we might encode a network onto a GA
string).
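To picture the complementarity: in the sketch below the population provides
the GLOBAL SAMPLING, and a short burst of gradient descent provides the
LOCAL SEARCH that scores each sample. It reuses the toy loss from the
sketch above, and the real-valued individuals with arithmetic crossover are
one simplification among many; Holland's GA proper works on bit strings.

    import math, random

    def loss(w):                      # same toy surface as the earlier sketch
        return math.sin(3.0 * w) + 0.1 * w * w

    def refine(w, steps=20, rate=0.05, eps=1e-5):
        for _ in range(steps):        # brief LOCAL search by gradient descent
            w -= rate * (loss(w + eps) - loss(w - eps)) / (2.0 * eps)
        return w

    def fitness(w):
        return -loss(refine(w))       # fitness = quality after refinement

    pop = [random.uniform(-5.0, 5.0) for _ in range(20)]   # GLOBAL sample
    for generation in range(30):
        pop.sort(key=fitness, reverse=True)
        parents, children = pop[:10], []
        while len(children) < 10:
            a, b = random.sample(parents, 2)
            # arithmetic 'crossover' of two parents, then Gaussian mutation
            children.append((a + b) / 2.0 + random.gauss(0.0, 0.5))
        pop = parents + children
    print("best point found: %.3f" % refine(max(pop, key=fitness)))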
There are a tremendous number of ways the techniques can be combined,
with the GA as simply an outer loop around a conventional BP simulation
being one of the least imaginative. For example, when BP is used in
conjunction with the GA, there is a very good question as to how long each
BP training cycle must run in order to provide a useful fitness measure for
the GA. Preliminary results of ours suggest much shorter training times are
possible. Similarly, use of the GA seems to tolerate quicker search (i.e.,
higher learning rate and momentum values) than is typical. Another rich
dimension is just how a GA bit string is related to a net, from a literal
encoding of every real number weight at one extreme to complex
developmental programs that build nets at the other. One feature of these
hybrids that should not be underestimated is that the GA offers a very
natural mechanism for introducing PARALLELISM into connectionist
algorithms, since each individual in a population can be evaluated
independently. We have had some success exploiting this parallelism in two
implementations, one in a SIMD (Connection Machine) environment and one
using a heterogeneous mess of distributed computers.
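As a sketch of two of these points together, under illustrative
assumptions: a bit string literally encoding six weights (the simple
extreme of the encoding question), and a population whose members are
scored in separate processes. The decode() and fitness() functions below
are hypothetical stand-ins; in particular, fitness() takes the place of a
brief BP training run on the decoded net.

    import random
    from multiprocessing import Pool

    BITS_PER_WEIGHT, N_WEIGHTS = 8, 6

    def decode(bits):
        # literal encoding: each 8-bit group maps to one weight in [-2, 2]
        weights = []
        for i in range(0, len(bits), BITS_PER_WEIGHT):
            n = int("".join(map(str, bits[i:i + BITS_PER_WEIGHT])), 2)
            weights.append(-2.0 + 4.0 * n / (2 ** BITS_PER_WEIGHT - 1))
        return weights

    def fitness(bits):
        # stand-in for a short BP training run on the decoded net: here,
        # just closeness of the decoded weights to a fixed target vector
        target = [0.5, -1.0, 1.5, 0.0, -0.5, 1.0]
        return -sum((a - b) ** 2 for a, b in zip(decode(bits), target))

    if __name__ == "__main__":
        pop = [[random.randint(0, 1)
                for _ in range(BITS_PER_WEIGHT * N_WEIGHTS)]
               for _ in range(16)]
        with Pool() as workers:       # each individual scored independently
            scores = workers.map(fitness, pop)
        print("best fitness this generation: %.3f" % max(scores))

Whether the worker processes land on a SIMD machine or on a heterogeneous
collection of workstations, the essential property is the same: nothing in
one individual's evaluation depends on another's.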
Finally, we shouldn't let computer science drive the entire agenda.
Theoretical biology and evolutionary neurophysiology have some extremely
important and unresolved questions that models like these can help to
address, for example concerning complex, Lamarckian-like interactions
between the learning of individuals and the evolution of species. (I think
the Harp et al. simulation may be particularly useful to evolutionary
neurophysiologists.) One of the things that makes connectionism most
exciting is that the same class of systems that interests (some)
mathematicians as new statistical techniques also interests neuroscientists
as a theory to coordinate their data collection. I think GA/connectionist
hybrids are important for similar reasons: they make sense as algorithms
AND they make sense as models of natural phenomena.
This note has been long on enthusiasm and short on specifics. But some
results have already been reported, by me and others. And I expect there to
be many new results reported at the upcoming sessions (at IJCNN in
Washington and at the NIPS workshop) devoted to this topic. So watch this
space.
Richard K. Belew
rik%cs at ucsd.edu
Assistant Professor
CSE Department (C-014)
UCSD
San Diego, CA 92093
619 / 534-2601 or 534-5288 (messages)