On the structure of connectionist models
Lev Goldfarb
goldfarb at unb.ca
Thu Feb 22 16:56:11 EST 1996
Dear connectionists:
Since posting the workshop announcement (What is inductive
learning?) several days ago, I have been asked to clarify what I meant
when I said that "one can show that inductive class representations (in
other words, representations of concepts and categories) cannot be
adequately specified within the classical (numeric) mathematical
models", including, of course, connectionist models. Here are some
general ideas from the paper that will be presented at the workshop.
The following observations about the STRUCTURE of inductive learning
models strongly suggest why the classical (numeric) mathematical models
can support only "weak" inductive learning models, i.e. models that
perform reasonably only in VERY rigidly delineated environments.
The questions I am going to address in this posting lie, on the one
hand, at the very foundations of connectionism and are, on the other
hand, relatively simple, provided one keeps in mind that we are
discussing the overall FORMAL STRUCTURE of the learning models (which
requires a relatively high level of abstraction).
Let's look at the structure of connectionist models through the very basic
problem of inductive learning. In order to arrive at a useful formulation
of the inductive learning problem and, at the same time, at a useful
framework for solving the problem, I propose to proceed as follows.
First and foremost, inductive learning involves a finite set of data
(objects from the class C), labeled either as positive and negative
examples (C+, C-) or, more generally, simply as examples C'. Since we
want to compare quite different classes of models (e.g. symbolic and
numeric), let us focus only on very general assumptions about the
nature of the object representation (input) space:
Postulate 1. Input space S satisfies a finite set A of axioms.
             (S, in fact, provides a formal specification of all
             the necessary data properties; compare with the
             concept of an abstract data type in computer science.)
Thus, for example, the vector (linear) space is defined by means of the
well-known set of axioms for vector addition and scalar multiplication.
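
(To make Postulate 1 concrete: it can be read exactly like the
specification of an abstract data type. Below is a minimal Python
sketch, purely my own illustration; the class Vec and its method names
are not part of any standard. The point is that the input space exposes
ONLY the operations licensed by the axioms in A, and nothing else.)

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Vec:
        """An element of the input space S.  The ONLY operations
        available are those licensed by the vector space axioms."""
        coords: Tuple[float, ...]

        def __add__(self, other):            # vector addition
            return Vec(tuple(a + b
                             for a, b in zip(self.coords, other.coords)))

        def scale(self, c):                  # scalar multiplication
            return Vec(tuple(c * a for a in self.coords))

    u, v = Vec((1.0, 2.0)), Vec((3.0, -1.0))
    print((u + v).scale(0.5))                # Vec(coords=(2.0, 0.5))
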
Next, let us attach the name "inductive class representation" (ICR) to
the formal description (specification) of the class C obtained in a
chosen model as a result of an inductive learning process:
Postulate 2. In a learning model, ICR is specified in some (which?)
formal manner.
---------------------------------------------------------------------
| My first main point connects Postulate 2 to Postulate 1: ICR |
| should be expressed in the "language" of the axioms from set A. |
---------------------------------------------------------------------
For example, in a vector space the ICR should be specified only in
terms of the given data set plus the operations of the vector space,
i.e. we are restricted to the spanned affine subspace or an
approximation of it.
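
To see what this restriction amounts to computationally, here is a
small numpy sketch (again my own illustration; the function name
affine_projection is not from any library). Given a finite data set,
the affine subspace it spans is the largest ICR expressible using only
the data and the vector space operations; anything outside it must be
projected back in.

    import numpy as np

    def affine_projection(data, x):
        """Project x onto the affine subspace spanned by the rows of
        data (assumes at least two data points)."""
        base = data[0]                        # shift so the subspace contains 0
        directions = data[1:] - base          # spanning directions
        _, s, vt = np.linalg.svd(directions, full_matrices=False)
        basis = vt[: int(np.sum(s > 1e-10))]  # orthonormal basis of the span
        y = x - base
        return base + basis.T @ (basis @ y)   # orthogonal projection + shift

    data = np.array([[0., 0., 1.], [1., 0., 1.], [0., 1., 1.]])
    print(affine_projection(data, np.array([2., 3., 5.])))  # [2. 3. 1.]

All three data points lie in the plane z = 1, and so, therefore, does
every vector this ICR can legitimately contain.
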
The reason is quite simple: the only relationships that can be
(mathematically) legitimately extracted from the input data are those that
are expressible in the language of the input space S. Otherwise, we are,
in fact, IMPLICITLY postulating some other relationships not specified in
the input space by Postulate 1, and, therefore, the "discovery" of such
implicit relationships in the data during the learning process is an
illusion: such relationships are not "visible" in S.
Thus, for example, "non-linear" relationships cannot be discovered from
a finite data set in a vector space, simply because a non-linear
relationship is not part of the linear structure and, therefore, cannot
be (mathematically) legitimately extracted from a finite input set of
vectors in the vector space.
What happens (of necessity) in a typical connectionist model is that,
in addition to the set A of vector space axioms, some non-linear
structure (determined by the class of non-linear functions chosen for
the internal nodes of the NN) is postulated IMPLICITLY from the very
beginning.
Question: What does this additional non-linear structure have to do
with the finite input set of vectors?
(In fact, there are uncountably many such non-linear structures
and, typically, none of them is directly related to the
structure of the vector space or the input set of vectors.)
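
(To make the question vivid, here is a toy sketch in Python/numpy; it
is my illustration, not anyone's actual model. The non-linearity tanh
is fixed BEFORE any data is seen; replacing it by, say, np.sin gives
one of the uncountably many alternative structures, and nothing in the
finite input set of vectors selects among them.)

    import numpy as np

    rng = np.random.default_rng(0)
    nonlinearity = np.tanh                 # postulated a priori, not learned

    W1 = rng.standard_normal((4, 2))       # input  -> hidden weights
    W2 = rng.standard_normal((1, 4))       # hidden -> output weights

    def net(x):
        """Learning adjusts only the numbers in W1 and W2; the form
        W2 @ nonlinearity(W1 @ x) is committed to in advance."""
        return W2 @ nonlinearity(W1 @ x)

    print(net(np.array([0.5, -1.0])))
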
-----------------------------------------------------------------------
| My second main point is this: if S is a vector space, then whether  |
| or not we postulate some non-linear structure (for the internal     |
| nodes) in addition to the vector space axioms, we are faced with    |
| the following important question: what are we learning during the   |
| learning process? Certainly, we are not learning any interesting    |
| ICR: the entire STRUCTURE is fixed before the learning process      |
| begins.                                                             |
-----------------------------------------------------------------------
It appears that this situation is inevitable if we choose one of the
classical (numeric) mathematical structures to model the input space S.
However, in an appropriately defined symbolic setting (i.e. one with an
appropriate dynamic metric structure; see my home page) the situation
changes fundamentally.
To summarize (though not all of the argument is before your eyes), the
"strong" (symbolic) inductive learning models offer ICRs that are much
more flexible than those offered by the classical (numeric) models. In
other words, the appropriate symbolic models offer true INDUCTIVE class
representations. [The latter is given by a subset of objects + a
constructed finite set of (weighted) operations that can transform
objects into objects; see the sketch below.]
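
To give the bracketed description some shape, here is a deliberately
tiny sketch (strings as objects, insertions as operations; all names
and weights are mine, chosen for illustration only; for the actual
formalism see my home page). The learned component is the set of
weights: they determine which transformations keep an object inside
the class.

    # A toy symbolic ICR: a stored subset of class objects plus a
    # finite set of weighted operations transforming objects into objects.
    OPERATIONS = {"insert_a": 0.2,   # cheap: 'a' varies freely in the class
                  "insert_b": 5.0}   # expensive: 'b' is foreign to the class
    CLASS_OBJECTS = {"ca", "caa"}

    def insert_only_cost(obj, s):
        """Weight of turning obj into s by appending characters;
        inf if s does not extend obj (illustration only)."""
        if not s.startswith(obj):
            return float("inf")
        return sum(OPERATIONS.get("insert_" + ch, float("inf"))
                   for ch in s[len(obj):])

    def membership_cost(s):
        return min(insert_only_cost(obj, s) for obj in CLASS_OBJECTS)

    print(membership_cost("caaaa"))   # 0.4 -- deep inside the class
    print(membership_cost("cab"))     # 5.0 -- effectively outside it

Unlike the fixed architecture above, here the distance structure itself
(the set of operations and their weights) is what the learning process
constructs.
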
Lev Goldfarb
http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm