selective attention

John.Hampshire@SPEECH2.CS.CMU.EDU
Wed Mar 14 21:21:31 EST 1990


Depending on your perspective, the following may or
may not seem relevant to the topic of attention
in connectionist structures.  You decide:

We have built a hierarchical MLP classifier that recognizes
the speech of many speakers by learning to integrate
speaker-dependent modules.  The integration is carried out
by a supervisory net that learns to tell when a particular
speaker-dependent module is relevant to the speech
recognition process.
This dynamic focusing of attention on particular
modules or groups of modules (depending on the
specific input speech signal) is learned indirectly
via what we have previously described on this net
as the "Meta-Pi" connection.  The training error
signal is derived from the global objective of
"correct phoneme classification", so there is no
explicit attention training of the supervisor net.
Nevertheless, it learns to direct its attention
to relevant modules pretty well (98.4% recognition rate).
Attention evolves as a by-product of the global objective.
The operation of the whole contraption is explained in
terms of Bayesian probability theory.
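
For the curious, here is a minimal sketch of the kind of
combination the Meta-Pi connection performs, written in
Python/PyTorch purely for illustration; the layer sizes,
names, and single-layer stand-in modules are all assumptions,
not the implementation described in the TR:

    # A minimal sketch of the Meta-Pi combination -- NOT the code
    # from the TR.  Sizes and single-layer "modules" are assumptions.
    import torch
    import torch.nn as nn

    N_SPEAKERS, N_PHONEMES, N_FEATURES = 4, 10, 16

    class MetaPi(nn.Module):
        def __init__(self):
            super().__init__()
            # One speaker-dependent module per speaker (stand-ins
            # here; in the real system these are pre-trained nets).
            self.experts = nn.ModuleList(
                nn.Linear(N_FEATURES, N_PHONEMES)
                for _ in range(N_SPEAKERS))
            # Supervisory net: one mixing coefficient per module.
            self.supervisor = nn.Linear(N_FEATURES, N_SPEAKERS)

        def forward(self, x):
            # Each module's phoneme posteriors: (batch, spk, phn).
            posteriors = torch.softmax(
                torch.stack([m(x) for m in self.experts], dim=1),
                dim=-1)
            # Softmax over modules = the dynamic focus of attention.
            attention = torch.softmax(self.supervisor(x), dim=1)
            # Meta-Pi combination: attention-weighted mixture of
            # the modules' posterior estimates.
            return (attention.unsqueeze(-1) * posteriors).sum(dim=1)

    net = MetaPi()
    x = torch.randn(8, N_FEATURES)               # speech-frame features
    target = torch.randint(0, N_PHONEMES, (8,))  # correct phonemes
    # The only training signal is global phoneme classification;
    # the supervisor's attention is shaped indirectly through it.
    loss = nn.functional.nll_loss(torch.log(net(x) + 1e-12), target)
    loss.backward()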

The original work describing this is

"The Meta-Pi Network:  Building Distributed
Knowledge Representations for Robust Pattern
Recognition", CMU tech report CMU-CS-89-166,
August, 1989.

THIS TR IS BEING SUPERSEDED AS WE SPEAK
by CMU-CS-89-166-R (note the "-R" suffix), so the new
version is a few weeks away.

A part of the TR (unaffected by the revision)
will appear in the forthcoming NIPS proceedings.


Cheers,

John Hampshire & Alex Waibel


