CONNECTIONIST LEARNING: IS IT TIME TO RECONSIDER THE FOUNDATIONS?

Asim Roy ataxr at IMAP1.ASU.EDU
Wed Apr 23 21:12:46 EDT 1997


1997 International Conference on Neural Networks (ICNN'97)
Houston, Texas (June 8 -12, 1997)
----------------------------------------------------------------
Further information on the conference is available on the
conference web page:

http://www.eng.auburn.edu/department/ee/ICNN97
------------------------------------------------------------------
PANEL DISCUSSION ON

"CONNECTIONIST LEARNING: IS IT TIME TO RECONSIDER THE FOUNDATIONS?"

-------------------------------------------------------------------
This is to announce that a panel will discuss the above question at
ICNN'97 on Monday afternoon (June 9). Below is the abstract for the
panel discussion broadly outlining the questions to be addressed. I
am also attaching a slightly modified version of a subsequent note
sent to the panelists. I think the issues are very broad and the
questions are simple. The questions are not tied to any specific
"algorithm" or "network architecture" or "task to be performed."
However, the answers to these simple questions may have an enormous
effect on the "nature of algorithms" that we would call
"brain-like" and for the design and construction of autonomous
learning systems and robots. I believe these questions also have a
bearing on other brain related sciences such as neuroscience,
neurobiology and cognitive science.

Please send any comments on these issues directly to me
(asim.roy at asu.edu). I will post the collection of responses to the
newsgroups in a few weeks. All comments/criticisms/suggestions are
welcome. All good science depends on vigorous debate.

Asim Roy
Arizona State University

-------------------------
PANEL MEMBERS

1. Igor Aleksander
2. Shun-ichi Amari
3. Eric Baum
4. Jim Bezdek
5. Rolf Eckmiller
6. Lee Giles
7. Geoffrey Hinton
8. Dan Levine
9. Robert Marks
10. Jean-Jacques Slotine
11. John G. Taylor
12. David Waltz
13. Paul Werbos
14. Nicolaos Karayiannis (Panel Moderator, ICNN'97 General Chair)
15. Asim Roy


Six of the above members are plenary speakers at the meeting.
-------------------------

PANEL TITLE: 

"CONNECTIONIST LEARNING: IS IT TIME TO RECONSIDER THE FOUNDATIONS?"


ABSTRACT

Classical connectionist learning is based on two key ideas. First,
no training examples are to be stored by the learning algorithm in
its memory (memoryless learning). It can use and perform whatever
computations are needed on any particular training example, but
must forget that example before examining others. The idea is to
obviate the need for large amounts of memory to store a large
number of training examples. The second key idea is that of local
learning - that the nodes of a network are autonomous learners.
Local learning embodies the viewpoint that simple, autonomous
learners, such as the single nodes of a network, can in fact
produce complex behavior in a collective fashion. In its purest
form, this second idea implies that a predefined network is
supplied to the learning algorithm, as in multilayer perceptrons.
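To make the memoryless restriction concrete, here is a minimal
sketch (not from the original announcement; the function names, the
least-mean-squares update, and the toy data are all illustrative
assumptions). The memoryless learner touches each example once and
discards it, while a memory-based learner stores the examples and
revisits them:

```python
def memoryless_learning(stream, lr=0.1):
    """Memoryless learning: each (x, y) pair updates the weights
    once and is then forgotten -- nothing is stored."""
    w = [0.0, 0.0]
    for x, y in stream:                 # single pass; no example kept
        pred = w[0] * x[0] + w[1] * x[1]
        err = y - pred
        w[0] += lr * err * x[0]         # local, per-example update
        w[1] += lr * err * x[1]
    return w

def memory_based_learning(stream, lr=0.1, epochs=50):
    """Memory-based learning: first store all examples, then revisit
    them as often as needed."""
    data = list(stream)                 # memory: keep every example
    w = [0.0, 0.0]
    for _ in range(epochs):             # many passes over stored data
        for x, y in data:
            pred = w[0] * x[0] + w[1] * x[1]
            err = y - pred
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
    return w

# Toy target y = 2*x0 + 1*x1, noise-free.
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), 1.0), ((1.0, 1.0), 3.0)]
w_once = memoryless_learning(examples)
w_mem = memory_based_learning(examples)
```

With these three examples, the single memoryless pass leaves the
weights far from the target (2, 1), while repeated passes over the
stored examples converge close to it, which is the computational
point being argued.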

Recently, some questions have been raised about the validity of
these classical ideas. The arguments against them are simple and
compelling. First, it is well established that humans remember and
recall information that is provided to them as part of learning,
and the task of learning is considerably easier when one remembers
relevant facts and information than when one does not. Second,
strict local learning (e.g. back propagation type learning) is not
a feasible idea for any system, biological or otherwise. It implies
that the system predefines a network without having seen a single
training example and without any knowledge at all of the
complexity of the problem. There is no system that can do that in
a meaningful way. A further fallacy of the local learning idea is
that it tacitly presupposes a "master" system that supplies the
network design so that the autonomous learners can learn.

Recent work has shown that learning algorithms with much better
computational properties (e.g. designing and training a network in
polynomial time) can be developed if we do not constrain them with
the restrictions of classical learning. It is, therefore, perhaps
time to reexamine the ideas behind what we call "brain-like
learning."

This panel will attempt to address some of the following questions
on classical connectionist learning:

1.  Should memory be used for learning? Is memoryless learning an
unnecessary restriction on learning algorithms?
2.  Is local learning a sensible idea? Can better learning
algorithms be developed without this restriction?
3.  Who designs the network inside an autonomous learning system
such as the brain?

-------------------------

A SUBSEQUENT NOTE SENT TO THE PANELISTS


The panel abstract was written to question the two pillars of
classical connectionist learning - memoryless learning and pure
local learning. With regards to memoryless learning, the basic
argument against it is that humans do store information (remember
facts/information) in order to learn. So memoryless learning, as
far as I understand, cannot be justified by any behavioral or
biological observations/facts. That does not mean that humans store
any and all information provided to them. They are definitely
selective and parsimonious in the choice of information/facts to
collect and store.

We have been arguing that it is the "combination" of memoryless
learning and pure local learning that is not feasible for any
system, biological or otherwise. Pure local learning, in this
context, implies that the system somehow puts together a set of
"local learners" that start learning with each learning example
given to it (e.g. in back propagation) without having seen a single
training example before and without knowing anything about the
complexity of the problem. Such a system can be demonstrated to do
well in some cases, but would not work in general.

Note that not all existing neural network algorithms are of this
pure local learning type. For example, if I understand correctly,
in constructive algorithms such as ART, RBF, RCE/hypersphere and
others, a "decision" to create a new node is made by a "global
decision-maker" based on evidence about the performance of the
existing system. So there is quite a bit of global coordination
and "decision-making" in those algorithms beyond simple "local
learning".

Anyway, if we "accept" the idea that memory can indeed be used for
the purpose of learning (Paul Werbos indicated so in one of his
notes), the terms of the debate/discussion change dramatically. We
then open the door to the development of far more robust and
reliable learning algorithms with much nicer properties than
before. We can then start to develop algorithms that are closer to
"normal human learning processes". Normal human learning includes
processes such as (1) collection and storage of information about a
problem, (2) examination of the information at hand to determine
the complexity of the problem, (3) development of trial solutions
(nets) for the problem, (4) testing of trial solutions (nets), (5)
discarding such trial solutions (nets) if they are not good enough,
and (6) repetition of these processes until an acceptable solution
is found. And these learning processes are implemented within the
brain, without doubt, using local computing mechanisms of different
types. But these learning processes cannot exist without allowing
for storage of information about the problem.
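The six processes above can be sketched as a simple generate-and-
test loop. Everything in the sketch below is an illustrative
assumption (the helper names, the error tolerance, the choice of
trial models); step (2), examining the stored information, is
folded into proposing trial solutions of increasing complexity:

```python
def collect(stream):
    """(1) Collect and store training examples in memory."""
    return list(stream)

def trial_models(data):
    """(3) Propose trial solutions of increasing complexity:
    first a constant model, then a straight line."""
    n = len(data)
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    # Constant model: predict the mean of y.
    mean_y = sum(ys) / n
    yield ("constant", lambda x, c=mean_y: c)
    # Linear model: ordinary least squares in closed form.
    mx, my = sum(xs) / n, mean_y
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in data) / sxx
    a = my - b * mx
    yield ("linear", lambda x, a=a, b=b: a + b * x)

def learn(stream, tolerance=0.01):
    data = collect(stream)                    # (1) store examples
    for name, model in trial_models(data):    # (3) trial solutions
        err = sum((model(x) - y) ** 2         # (4) test the trial
                  for x, y in data) / len(data)
        if err <= tolerance:                  # (6) stop when acceptable
            return name, model
        # (5) otherwise discard this trial and try a richer one
    return name, model                        # fall back to last tried

examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # y = 2x+1
name, model = learn(examples)
```

On this toy data the constant trial is discarded (its error is far
above the tolerance) and the linear trial is accepted, mirroring the
store / propose / test / discard / repeat cycle in the text. None of
this would be possible without step (1), storing the examples.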

One of the "large" missing pieces in the neural network field is
the definition or characterization of an autonomous learning system
such as the brain. We have never defined the external behavioral
characteristics of our learning algorithms. We have largely pursued
algorithm development from an "internal mechanisms" point of view
(local learning, memoryless learning) rather than from the point of
view of "external behavior or characteristics" of these resulting
algorithms. Some of these external characteristics of our learning
algorithms might be: (1) the capability to design the net on their
own, (2) polynomial time complexity of the algorithm in design and
training of the net, (3) generalization capability, and (4)
learning from as few examples as possible (quickness in learning).

It is perhaps time to define a set of desirable external
characteristics for our learning algorithms. We need to define
characteristics that are "independent of": (1) a particular
architecture, (2) the problem to be solved (function approximation,
classification, memory, etc.), (3) local/global learning issues, and
(4) issues of whether or not to use memory to learn. We should
argue about these external properties rather than about issues of
global/local learning and of memoryless learning.

With best regards,
Asim Roy
Arizona State University

