Connectionist Learning - Some New Ideas

Asim Roy ATAXR at asuvm.inre.asu.edu
Tue May 14 16:20:58 EDT 1996


We have recently published a set of principles for learning in neural
networks/connectionist models that is different from classical
connectionist learning (Neural Networks, Vol. 8, No. 2; IEEE
Transactions on Neural Networks, to appear; see references
below). Below is a brief summary of the new learning theory and
why we think classical connectionist learning, which is
characterized by pre-defined nets, local learning laws and
memoryless learning (no storing of training examples for learning),
is not brain-like at all. Since vigorous and open debate is very
healthy for a scientific field, we invite comments for and against our
ideas from all sides.
 
 
"A New Theory for Learning in Connectionist Models"
 
We believe that a good rigorous theory for artificial neural
networks/connectionist models should include learning methods
that perform the following tasks or adhere to the following criteria:
 
A. Perform Network Design Task: A neural network/connectionist
learning method must be able to design an appropriate network for
a given problem, since, in general, it is a task performed by the
brain. A pre-designed net should not be provided to the method as
part of its external input, since it never is an external input to the
brain. From a neuroengineering and neuroscience point of view, this
is an essential property for any "stand-alone" learning system - a
system that is expected to learn "on its own" without any external
design assistance.
 
B. 	Robustness in Learning: The method must be robust, so that it
does not suffer from the local minima problem, oscillation,
catastrophic forgetting, problems of recall and lost memories, and
similar learning difficulties. Some might argue that ordinary
brains, and particularly those with learning disabilities, do exhibit
such problems, and that these requirements describe only a "super"
brain. But the goal of neuroengineers and neuroscientists is to
design and build learning systems that are robust, reliable and
powerful. They have no interest in creating weak and problematic
learning devices that need constant attention and intervention.
 
C. 	Quickness in Learning: The method must learn rapidly from
only a few examples, much as humans do. For example, a method
that learns from only 10 examples learns faster than one that
requires 100 or 1,000 examples. We have shown that on-line
learning (see references below), when not allowed to store training
examples in memory, can be extremely
slow in learning - that is, would require many more examples to
learn a given task compared to methods that use memory to
remember training examples. It is not desirable that a neural
network/connectionist learning system be similar in characteristics
to learners characterized by such sayings as "Told him a million
times and he still doesn't understand." On-line learning systems
must learn rapidly from only a few examples.
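To make this contrast concrete, here is a minimal sketch (my own illustration, not the algorithms of the papers cited below) comparing a memoryless on-line perceptron, which sees each example exactly once, with a learner that stores the same examples and replays them until none is misclassified. The function names and the toy 1-D data are hypothetical:

```python
def predict(w, b, x):
    # simple linear unit with a hard threshold
    return 1 if w * x + b > 0 else -1

def memoryless_online(stream):
    """Sees each example exactly once; nothing is ever stored."""
    w, b = 0.0, 0.0
    for x, y in stream:
        if predict(w, b, x) != y:  # mistake-driven update
            w += y * x
            b += y
    return w, b

def memory_based(examples, max_epochs=100):
    """Stores the examples and replays them until none is misclassified."""
    w, b = 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if predict(w, b, x) != y:
                w += y * x
                b += y
                mistakes += 1
        if mistakes == 0:  # converged on the stored examples
            break
    return w, b

def training_errors(w, b, data):
    return sum(predict(w, b, x) != y for x, y in data)

# 1-D points labeled by sign(x - 0.5); the ordering is chosen so that
# a single memoryless pass ends with a hypothesis that is stale with
# respect to the earlier examples.
DATA = [(0.4, -1), (3.0, 1), (0.45, -1), (2.0, 1)]
```

On this data the single-pass, memoryless learner ends with weights that misclassify examples it saw earlier, and the only remedy is to present more examples; the memory-based learner replays the same four stored examples to zero training error.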
 
D. 	Efficiency in Learning: The method must be
computationally efficient in its learning when provided with a finite
number of training examples (Minsky and Papert [1988]). It must be
able to both design and train an appropriate net in polynomial time.
That is, given P examples, the learning time (i.e. both design and
training time) should be a polynomial function of P. This, again, is a
critical computational property from a neuroengineering and
neuroscience point of view.  This property has its origins in the
belief that  biological systems (insects, birds for example) could not
be solving NP-hard problems, especially when efficient, polynomial
time learning methods can conceivably be designed and developed.
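As a toy illustration of what polynomial-time design-plus-training means (an illustrative sketch, not the algorithms of the referenced papers), consider fitting a 1-D threshold classifier: sort the P examples, then scan the candidate cut points. The whole procedure is bounded by a low-order polynomial in P, with no restarts and no tuning:

```python
def learn_threshold(examples):
    """Design-plus-training in polynomial time for a 1-D threshold
    classifier: sort the points (O(P log P)), then evaluate each
    candidate cut (O(P) candidates, O(P) per evaluation -> O(P^2))."""
    pts = sorted(examples)
    xs = [x for x, _ in pts]
    # candidate thresholds: one below all points, the midpoints
    # between consecutive points, and one above all points
    candidates = [xs[0] - 1.0]
    candidates += [(xs[i] + xs[i + 1]) / 2 for i in range(len(xs) - 1)]
    candidates += [xs[-1] + 1.0]
    best_t, best_err = candidates[0], len(pts) + 1
    for t in candidates:
        err = sum((1 if x > t else -1) != y for x, y in pts)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err
```

Given P = 4 points such as [(0.1, -1), (0.3, -1), (0.7, 1), (0.9, 1)], this returns a zero-error threshold (here 0.5) after work bounded by a fixed polynomial in P. The point is not this particular toy problem but the computational property: learning time grows polynomially, not exponentially, with the number of examples.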
 
E. 	Generalization in Learning: The method must be able to
generalize reasonably well so that only a small amount of network
resources is used. That is, it must try to design the smallest possible
net, although it might not be able to do so every time. This must be
an explicit part of the algorithm. This property is based on the
notion that the brain could not be wasteful of its limited resources,
so it must be trying to design the smallest possible net for every
task.
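A minimal sketch of this "smallest net" idea (hypothetical, in the spirit of constructive methods generally, not the published RBF algorithms cited below): grow a nearest-prototype net by adding a hidden unit only when the current net misclassifies an example, so network size is driven by problem complexity rather than fixed in advance:

```python
def classify(prototypes, x):
    """Nearest-prototype readout (RBF-like, winner-take-all)."""
    if not prototypes:
        return None  # an empty net classifies nothing
    nearest = min(prototypes, key=lambda p: abs(p[0] - x))
    return nearest[1]

def grow_prototype_net(examples, max_passes=10):
    """Add a prototype 'hidden unit' only when the current net
    misclassifies a stored example, so the net stays as small as the
    data allows; replay until a full pass adds nothing."""
    prototypes = []  # list of (center, label) units
    for _ in range(max_passes):
        added = 0
        for x, y in examples:
            if classify(prototypes, x) != y:
                prototypes.append((x, y))
                added += 1
        if added == 0:
            break
    return prototypes
```

On five 1-D examples with two well-separated classes, the grown net needs only two prototype units; a harder, more interleaved labeling would force it to allocate more. Note that the algorithm, not an external designer, decides the network's size.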
 
 
General Comments
 
This theory defines algorithmic characteristics that are obviously
much more brain-like than those of classical connectionist theory,
which is characterized by pre-defined nets, local learning laws and
memoryless learning (no storing of actual training examples for
learning). Judging by the above characteristics, classical
connectionist learning is not very powerful or robust. First of all, it
does not even address the issue of network design, a task that
should be central to any neural network/connectionist learning
theory. It is also plagued by efficiency problems (lack of
polynomial-time complexity, the need for an excessive number of
teaching examples) and robustness problems (local minima,
oscillation, catastrophic forgetting, lost memories), problems that
partly stem from its attempt to learn without using memory. Classical connectionist
learning, therefore, is not very brain-like at all.
 
As far as I know, there is no biological evidence for any of the
premises of classical connectionist learning. Without having to
reach into biology, simple common sense arguments can show that
the ideas of local learning, memoryless learning and predefined nets
are impractical even for the brain! For example, the idea of local
learning presupposes a predefined network. Classical connectionist
learning forgot to ask a very fundamental question - who designs
the net for the brain? The answer is very simple: Who else, but the
brain itself! So, who should construct the net for a neural net
algorithm? The answer again is very simple: Who else, but the
algorithm itself! (By the way, this is not a criticism of constructive
algorithms that do design nets.) Under classical connectionist
learning, a net has to be constructed (by someone, somehow - but
not by the algorithm!) prior to having seen a single training
example! I cannot imagine any system, biological or otherwise,
being able to construct a net with zero information about the
problem to be solved and with no knowledge of the complexity of
the problem. (Again, this is not a criticism of constructive
algorithms.)
 
A good test for a so-called "brain-like" algorithm is to imagine it
actually being part of a human brain. Then examine the learning
phenomenon of the algorithm and compare it with that of the
human's. For example, pose the following question: If an algorithm
like back propagation is "planted" in the brain, how will it behave?
Will it be similar to human behavior in every way? Look at the
following simple "model/algorithm" phenomenon when the back-
propagation algorithm is "fitted" to a human brain. You give it a
few learning examples for a simple problem and after a while this
"back prop fitted" brain says: "I am stuck in a local minimum. I
need to relearn this problem. Start over again." And you ask:
"Which examples should I go over again?" And this "back prop
fitted" brain replies: "You need to go over all of them. I don't
remember anything you told me." So you go over the teaching
examples again. And let's say it gets stuck in a local minimum again
and, as usual, does not remember any of the past examples. So you
provide the teaching examples again and this process is repeated a
few times until it learns properly. The obvious questions are as
follows: Is "not remembering" any of the learning examples a brain-
like phenomenon? Are the interactions with this so-called "brain-
like" algorithm similar to what one would actually encounter with a
human in a similar situation? If the interactions are not similar, then
the algorithm is not brain-like. A so-called brain-like algorithm's
interactions with the external world/teacher cannot be different
from those of a human.
 
In the context of this example, it should be noted that
storing/remembering relevant facts and examples is very much a
natural part of the human learning process. Without the ability to
store and recall facts/information and discuss, compare and argue
about them, our ability to learn would be in serious jeopardy.
Information storage facilitates mental comparison of facts and
information and is an integral part of rapid and efficient learning. It
is not biologically justified when "brain-like" algorithms disallow
usage of memory to store relevant information.
 
Another typical phenomenon of classical connectionist learning is
the "external tweaking" of algorithms. How many times do we
"externally tweak" the brain (e.g. adjust the net, try a different
parameter setting) to make it learn? Interactions with a brain-like
algorithm have to be brain-like in all respects.
 
The learning scheme postulated above does not specify how
learning is to take place - that is, whether memory is to be used or
not to store training examples for learning, or whether learning is to
be through local learning at each node in the net or through some
global mechanism. It merely defines broad computational
characteristics and tasks (i.e. fundamental learning principles) that
are brain-like and that all neural network/connectionist algorithms
should follow. But there is complete freedom otherwise in
designing the algorithms themselves. We have shown that robust,
reliable learning algorithms can indeed be developed that satisfy
these learning principles (see references below). Many constructive
algorithms satisfy many of the learning principles defined above.
They can, perhaps, be modified to satisfy all of the learning
principles.
 
The learning theory above defines computational and learning
characteristics that have always been desired by the neural
network/connectionist field. It is difficult to argue that these
characteristics are not "desirable," especially for self-learning, self-
contained systems.  For neuroscientists and neuroengineers, it
should open the door to development of brain-like systems they
have always wanted - those that can learn on their own without any
external intervention or assistance, much like the brain. It essentially
tries to redefine the nature of algorithms considered to be
brain-like. And it defines the foundations for developing truly
self-learning systems - ones that would not require constant
intervention and tweaking by external agents (human experts) in
order to learn.
 
It is perhaps time to reexamine the foundations of the neural
network/connectionist field. This mailing list/newsletter provides an
excellent opportunity for participation by all concerned throughout
the world. I am looking forward to a lively debate on these matters.
That is how a scientific field makes real progress.
 
 
Asim Roy
Arizona State University
Tempe, Arizona 85287-3606, USA
Email: ataxr at asuvm.inre.asu.edu
 
 
References
 
1.  Roy, A., Govil, S. & Miranda, R. 1995. A Neural Network
Learning Theory and a Polynomial Time RBF Algorithm. IEEE
Transactions on Neural Networks, to appear.
 
2.  Roy, A., Govil, S. & Miranda, R. 1995. An Algorithm to
Generate Radial Basis Function (RBF)-like Nets for Classification
Problems. Neural Networks, Vol. 8, No. 2, pp. 179-202.
 
3.  Roy, A., Kim, L.S. & Mukhopadhyay, S. 1993. A Polynomial
Time Algorithm for the Construction and Training of a Class of
Multilayer Perceptrons. Neural Networks, Vol. 6, No. 4, pp. 535-
545.
 
4.  Mukhopadhyay, S., Roy, A., Kim, L.S. & Govil, S. 1993. A
Polynomial Time Algorithm for Generating Neural Networks for
Pattern Classification - its Stability Properties and Some Test
Results. Neural Computation, Vol. 5, No. 2, pp. 225-238.