Thagard model. (1) It is not practical to build a new network on the fly
for every problem encountered, especially when the mechanism that builds
the networks sidesteps a lot of really hard problems by being fed input
as neatly pre-digested symbolic structures. (2) The task of finding a
mapping from a given base to a given target is a party trick.  In real
life you are presented with a target (a structure with some gaps) and
you have to find the best base (or bases) in long-term memory to support
the mapping that best fills the gaps.
 
Another paper on connectionist analogical inference is:
 
	Halford, G.S., Wilson, W.H., Guo, J., Wiles, J., & Stewart, J.E.M.
		(In preparation). Connectionist implications for processing
		capacity limitations in analogies.
 
It wouldn't be proper for me to comment on that paper as it is in preparation.
 
I am not aware of any other direct attempts at connectionist implementation
of analogical inference (in the sense that interests me: general, dynamic,
and practical), but I don't get much time to keep up with the literature
and would be pleased to be corrected on this.
 
- References to other work on analogical inference
 
There is a large-ish literature on analogy in psychology, AI, and philosophy.
 
In psychology try:
 
	Gentner, D. (1989) The mechanisms of analogical learning. In
		Vosniadou, S., & Ortony, A. (Eds.), Similarity and
		analogical reasoning (pp. 199-241). Cambridge:
		Cambridge University Press.
 
In Artificial Intelligence try:
 
	Kedar-Cabelli, S. (1988). Analogy - from a unified perspective. In
		D.H. Helman (Ed.), Analogical reasoning (pp. 65-103).
		Dordrecht: Kluwer Academic Publishers.
 
Sorry, I haven't followed the philosophy literature.
 
- References to related connectionist work
 
One of the really hard parts about trying to do connectionist analogical
inference is that you need to be able to represent structures, and this
is still a wide-open research area.  A book that is worth looking at in
this area is:
 
	Barnden, J.A., & Pollack, J.B. (Eds.) (1991). High-level
		connectionist models. Norwood NJ: Ablex.
 
For specific techniques to represent structured data you might try to
track down the work of Paul Smolensky, Jordan Pollack, and Tony Plate
(to name a few).
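
To give a concrete flavour of one such technique, here is a toy sketch
(in Python-style array code, obviously not anything from the work cited
above) of role-filler binding by circular convolution, in the spirit of
Plate's holographic reduced representations.  The dimensionality and the
random vectors are illustrative assumptions, and the clean-up against
known items is only gestured at in a comment.

import numpy as np

def cconv(a, b):
    # Circular convolution: binds a role vector to a filler vector.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    # Circular correlation: approximate inverse of cconv, used to unbind.
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

n = 512                                              # illustrative dimensionality
rng = np.random.default_rng(0)
role, filler = rng.normal(0, 1/np.sqrt(n), (2, n))   # random role and filler vectors

trace = cconv(role, filler)       # the bound pair has the same dimensionality as its parts
recovered = ccorr(role, trace)    # noisy reconstruction of the filler

# The reconstruction is only approximate; a full system would clean it up
# against the set of known item vectors.
print(np.dot(recovered, filler) /
      (np.linalg.norm(recovered) * np.linalg.norm(filler)))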
 
- Lonce Wyse (lwyse at park.bu.edu) says:
 
LW> I was strongly advised against going for such a "high level"
LW> cognitive phenomenon [on starting grad school].
 
I think that is good advice.  Connectionist analogical inference is a
*really hard* problem (at least if you are aiming for something realistically
useful).  The solution involves solving a bunch of other problems that are
hard in their own right.  Doctoral candidates and untenured academics
can't afford the risk of attacking something like this because they have to
crank out the publications.  If you want to get into this area, either keep
it as a hobby or carve out an extremely circumscribed subset of the problem
(and lose the fun).
 
- Lonce Wyse also says:
 
LW> I think intermodal application of learning in neural networks
LW> is a future hot topic.
 
In classical symbolic AI the relationship between a concept and its
corresponding symbol is arbitrary and the 'internal structure' of the
symbol does not have any effect on the dynamics of processing.
 
In a connectionist symbol processor the symbol<->referent relationship
should still be arbitrary (because we need to be able to reconceive the
same referent at whim) but the internal structure of the symbol (a vector)
DOES affect the dynamics of processing.  The tricky part is to pick a
symbol that has the correct dynamic effect.  The possibility that I am
pursuing is to pick a pattern that is analogically consistent with what
is already in Long-Term Memory.  Extending the theme requires that the
pattern be consistent with information about the same referent
obtained via other sensory modes.
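
As a toy illustration of what 'picking a pattern that is consistent with
what is already in Long-Term Memory' could amount to (the item labels,
the dimensionality, the desired similarity profile and the least-squares
recipe are all my own simplifications, not a proposal):

import numpy as np

rng = np.random.default_rng(1)
n = 256
# Three hypothetical LTM item vectors (random, hence roughly orthogonal).
ltm = rng.normal(0, 1/np.sqrt(n), (3, n))        # e.g. DOG, CAT, STONE
target_profile = np.array([0.7, 0.6, 0.1])       # desired relations for a new symbol

# Choose the new symbol so that its dot products with the LTM items
# approximate the desired profile (a least-squares solution), i.e. the
# internal structure of the symbol is anything but arbitrary.
new_symbol, *_ = np.linalg.lstsq(ltm, target_profile, rcond=None)

print(ltm @ new_symbol)    # close to [0.7, 0.6, 0.1]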
 
Some while back I used up some bandwidth on the mailing list asking about
the role of intermodal learning in symbol grounding.  For my purposes the
crucial aspect about symbol grounding is that it concerns linking a
perceptual icon to a symbol with the desired symbolic linkages.
My intuitive belief is that a system with only one perceptual mode and
no ability to interact with its environment can learn to approximate that
environment but not to build a genuine model of the environment as
distinct from the perceiver.  So intermodal learning and the ability to
interact with the environment are important.
 
- In answer to my assertion that: "Generalisation can occur without
	interpolation in a data space that you can observe, but it
	may involve interpolation in some other space that is
	constructed internally and dynamically"
 
	- Lev Goldfarb (goldfarb at unbmvs1.csd.unb.ca ?) says
 
LG> an *intelligent* system must have the capacity to generate new metrics
LG> based on the structural properties of the object classes.
 
	- and Thomas Hildebrandt (thildebr at athos.csee.lehigh.edu) says
 
TH> Generalization is interpolation in the *Right* space.
 
Psychological scaling studies have shown that the similarity metric over
a group of objects depends on the composition of the group.  For example,
the perceived similarity of an apple and an orange depends on whether they
are with other fruit, or other roughly spherical objects.
 
Mike Humphreys at the University of Queensland has stated that items in LTM
are essentially orthogonal (and I am sure he would have the studies to back
it up).
 
The point is that the metric used to relate objects is induced by the demands
of the current task.  I like to think of all the items in LTM as
(approximately) mutually orthogonal vectors in a very high dimensional
space.  The STM representation is a projection of the high-D space onto a
low-D space of those LTM items that are currently active.  The exact
projection that is used is dependent on the task demands.
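
A toy rendering of that picture, with everything in it (the item labels,
the dimensionality, the use of raw dot products as the 'projection')
invented purely for illustration:

import numpy as np

rng = np.random.default_rng(2)
n = 1000
ltm = rng.normal(0, 1/np.sqrt(n), (50, n))   # 50 hypothetical LTM items, nearly orthogonal

# Different tasks activate different subsets of LTM items.
active_fruit  = ltm[[0, 1, 2]]               # say: APPLE, ORANGE, BANANA
active_sphere = ltm[[0, 3, 4]]               # say: APPLE, BALL, GLOBE

probe = ltm[1] + 0.2 * rng.normal(0, 1/np.sqrt(n), n)   # a noisy ORANGE

def stm(active, x):
    # STM representation: coordinates of x in the low-D space spanned by
    # the currently active items.  The relations the probe enters into
    # depend on which items the task has activated.
    return active @ x

print(stm(active_fruit, probe))    # the ORANGE coordinate dominates
print(stm(active_sphere, probe))   # the same probe is nearly invisible here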
 
Classic back-prop type connectionism attempts to generate a new
representation on the hidden layer such that interpolation is also
(correct) generalisation.  This is done by learning the correct weights
into the hidden layer.  Unfortunately, the weights are essentially fixed
with respect to the time scale of outside events.  What is required for
analogical inference (in my sense) is that the weights be dynamic and able
to vary at least as fast as the environmental input.  For this to have even
a hope of working without degenerating into an unstable mess there must be
lots of constraint: from the architecture, from cross-modal input and
from previous knowledge in LTM.
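
One device that has appeared in the connectionist literature for getting
weights that change on the time scale of the input is a 'fast weight'
component overlaid on the slowly learned weights.  A minimal sketch, with
all the constants and the Hebbian update rule chosen purely for
illustration:

import numpy as np

rng = np.random.default_rng(3)
n = 64
W_slow = rng.normal(0, 0.1, (n, n))   # learned over an extended period; fixed here
W_fast = np.zeros((n, n))             # rewritten on the time scale of the input

def present(x, decay=0.5, lr=0.5):
    # Process one input; the fast component is updated at presentation time.
    global W_fast
    y = np.tanh((W_slow + W_fast) @ x)
    # Hebbian update: the fast weights decay quickly and track the current
    # input, giving an effective weight matrix that varies as fast as the
    # environment does.
    W_fast = decay * W_fast + lr * np.outer(y, x)
    return y

for _ in range(5):
    present(rng.normal(0.0, 1.0, n))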
 
- Marek (marek at iuvax.cs.indiana.edu) says
 
M> Would Pentti Kanerva's model of associative memory fit in with
M> your definition of analogical inference?
 
- and Geoff Hinton (geoff at ai.toronto.edu) says
 
GH> It seems to me that my family trees example is exactly an example
GH> of analogical mapping.
 
Well, my memory of both is rather patchy, but I think not.  At least, not
in the sense that I am using analogical inference.  The reason that I say
this goes back to my previous paragraph.  Connectionist work, to date, is very
static: the net learns *a* mapping and then you use it.  A net may learn to
generalise on one problem domain in a way that looks like analogy, but
I want it to be able to generalise to others on the fly. In order to
perform true analogical inference the network must search in real-time
for the correct transformation weights instead of learning them over
an extended period.
 
Hinton's 1981 network for assigning canonical object-based frames of
reference is probably closer in spirit to analogical retrieval.  In that
model, objects must be recognised from arbitrary viewpoints, and the
network settles simultaneously on the object class and the
transformation that maps the perceptual image onto that class.
 
- Jim Franklin (jim at hydra.maths.unsw.oz.au) says
 
JF> What is the 'argument that analogical inference is the basic mode of
JF> retrieval from memory'?
 
I thought I'd be able to slip that one by, but I forgot you were out there
Jim. OK, here goes.
 
There is a piece of paper you occasionally find tacked on the walls of labs
that gives the translations for phrases used in scientific papers.  Amongst
others it contains:
 
'It is believed that ...' => 'In my last paper I said that ...'
'It is generally believed that ...' => 'I asked the person in the next office
and she thought so too.'
 
In other words, I can't quote you relevant papers but it appears to have some
currency among my academic colleagues in psychology.  As you would expect
the belief is most strongly held by people who study analogy.  People
studying other phenomena generally try to structure their experiments so
that analogical inference can't happen.
 
The strongest support for the notion probably comes from natural language
processing.  The AI people have been stymied by unconstrained language
being inherently metaphorical.  If you read even a technical news report
you find metaphorical usage: the market was <brittle>, trading was <heavy>.
Words don't have meanings, they are gestures towards meanings.
Wittgenstein pointed out the impossibility of precise definition.
Attempts to make natural language understanding software systems by bolting
on a metaphor-processor after syntax and semantics just don't work.
Metaphor has to be in from ground level.
 
Similarly, perceptual events don't have unambiguous meanings, they are
gestures towards meanings.  They must be interpreted in the context of
the rest of the perceptual field and the intentions of the perceiver.
One of the hallmarks of intelligent behaviour is to be able to perceive
things in a context dependent way: usually a filing cabinet is a device for
storage of papers, sometimes it is a platform for standing on while
replacing a light bulb.
 
Now suppose you have a very incomplete and ambiguous input to a memory
device.  You want the input to be completed and 'interpreted' in a way
that is consistent with the input fragment and with the intentional
context.  You also have a lot of hard-earned prior knowledge that you
should take advantage of.  Invoke a fairly standard auto-associator for
pattern completion.  If your input data is a literal match for a learned
item it can be simply completed.  If your input data is not a literal
match then find a transformation such that it *can* be completed via
the auto-associator and re-transformed back into the original pattern.
If the transformed-associated-untransformed version matches the
original and you have also filled in some of the gaps then you have
performed an analogical inference/retrieval.  The literal matching case
can be seen as a special case of this where the transform and its inverse
are the identity mapping.  So, if you have a memory mechanism that
performs analogical retrieval then you automatically get literal retrieval
but if your standard mechanism is literal retrieval then you have to
have some other mechanism for analogical inference.
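
To make that loop concrete, here is a toy sketch in which permutations of
the vector components stand in for the transformations, a one-shot linear
associator stands in for the memory, and the winning transform is found by
brute-force enumeration rather than by any settling process.  None of
these choices is a claim about the real mechanism:

import numpy as np

rng = np.random.default_rng(4)
n = 64
items = np.sign(rng.normal(size=(5, n)))     # a handful of stored +/-1 patterns (LTM)
M = items.T @ items / n                      # linear auto-associative memory

def complete(x):
    # One step of completion against memory, cleaned up to +/-1.
    return np.sign(M @ x)

# Candidate transforms: a small pool of permutations of the components.
# The identity is included, so literal retrieval is just the special case
# where the identity wins.
perms = [np.arange(n)] + [rng.permutation(n) for _ in range(20)]

# Build a probe: stored item 0, re-coded by one of the permutations, with gaps.
p = perms[3]
original = items[0][np.argsort(p)]           # so that original[p] == items[0]
known = rng.random(n) < 0.6                  # which components are actually observed
probe = np.where(known, original, 0.0)       # unobserved components left as gaps

best_score, best_fill = -1, None
for q in perms:
    c = complete(probe[q])                   # transform, complete against memory ...
    back = c[np.argsort(q)]                  # ... then transform back
    score = np.sum(back[known] == probe[known])   # consistency with the known fragment
    if score > best_score:
        best_score, best_fill = score, back

# Fraction of the gap components correctly filled in by the winning transform.
print(np.mean(best_fill[~known] == original[~known]))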
 
I believe that if you can do analogical retrieval you have achieved the
crucial step on the way to symbol grounding, natural language understanding,
common sense reasoning and genuine artificial intelligence.  I shall now
step down from the soap box.
 
Ross Gayler
ross at psych.psy.uq.oz.au
     ^^^^^^^^^^^^^^^^^^ <- My mailer lives here, but I live 2,000km south.
 
Any job offers will be gratefully considered - I have a mortgage & dependents.
 

