posting of Siegelman and Sontag

David Wolpert dhw at t13.Lanl.GOV
Thu May 16 11:33:47 EDT 1991


Drs. Siegelman and Sontag,


You might be interested in an article of mine which appeared in Complex
Systems last year ("A mathematical theory of generalization: part II",
vol. 4, pp. 201-249). In it I describe experiments using essentially
genetic algorithms to train recurrent nets whose output is signaled
when a certain predetermined node exceeds a threshold (I call this
"output flagging"). This sounds very similar to the work you describe.
In my work, the training was done so as to minimize a cross-validation
error (called "self-guessing error" in the paper) while automatically
achieving zero learning error. This strategy was followed to try to
achieve good generalization off the learning set.
Also, the individual nodes in the net weren't neurons in the
conventional sense, but rather parameterized input-output surfaces;
the training involved changing not only the architecture of the whole
net but also the surfaces at the nodes. An interesting advantage of
this technique is that it lets a node represent environmental
information: one of the input-output surfaces can be "hard-wired" to
represent something in the environment (e.g., visual data). This
allows you to train on one environment and then simply "slot in"
another one later; the recurrent net "probes" the environment by
sending various input values into this environment node and seeing
what comes out. With this technique you don't have to worry about
having input neurons represent the environmental data.
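In case it helps to see the core of the scheme made concrete, here is a
rough Python sketch of output flagging with an environment node. The
synchronous update rule, the names, and the threshold value below are
just my illustration, not details fixed by the paper:

import numpy as np

def run_flagged_net(W, node_fns, state, flag_node, out_node,
                    threshold=0.5, max_steps=1000):
    """Iterate the recurrent net; the answer is read off out_node the
    first time flag_node exceeds the threshold ("output flagging")."""
    for _ in range(max_steps):
        # Each node applies its own parameterized input-output surface
        # to its weighted net input. For an "environment node",
        # node_fns[i] is hard-wired to the environment, and the net
        # probes it by driving that node's input and reading the result.
        state = np.array([f(x) for f, x in zip(node_fns, W @ state)])
        if state[flag_node] > threshold:
            return state[out_node]
    return None  # the net never flagged an output

# Example: a 3-node net whose middle node plays the environment role.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))
node_fns = [np.tanh, np.sin, np.tanh]  # node_fns[1] = "environment"
print(run_flagged_net(W, node_fns, np.ones(3), flag_node=0, out_node=2))

To retrain for a new environment, you would swap in a different
function at node_fns[1] and leave the rest of the net alone.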

The paper is part of what is essentially a thesis dump; in
hindsight, it is not as well written as I'd like. You can probably
safely skip most of the verbiage leading up to the description of the
experiments. If you find the paper interesting but confusing, I'd be
more than happy to discuss it with you.

Finally, as an unpublished extension of the work in the paper, I've
proved that, with output flagging, a continuous-limit recurrent net
consisting entirely of linear (!) neurons can mimic a universal
computer. Again, this sounds very similar to the work you
describe.

Please send me a copy of your paper when it's finished. Thanks.




			David Wolpert (dhw at tweety.lanl.gov)

