CRG-TR-90-5 request
Carol Plathan
carol at ai.toronto.edu
Fri Oct 12 07:22:50 EDT 1990
PLEASE DO NOT FORWARD TO OTHER NEWSGROUPS OR MAILING LISTS
**********************************************************
The following technical report is now available. To request a copy, send your
physical mailing address (and no other information in your message) to
carol at ai.toronto.edu.
---------------------------------------------------------------------------
COMPETING EXPERTS:
AN EXPERIMENTAL INVESTIGATION OF
ASSOCIATIVE MIXTURE MODELS
Steven J. Nowlan
Department of Computer Science
University of Toronto
Toronto, Canada M5S 1A4
CRG-TR-90-5
Supervised algorithms, such as back-propagation, have proven capable of
discovering clever internal representations of the information necessary for
performing some task, while ignoring irrelevant detail in the input. However,
such supervised algorithms suffer from problems of scale and interference
between tasks when used to perform more than one task, or a complex task which
is a disjunction of many simple subtasks. To address these problems, several
authors have proposed modular systems consisting of multiple networks
(Hampshire & Waibel 1989, Jacobs, Jordan & Barto 1990, Jacobs, Jordan, Nowlan,
& Hinton, 1990). In this paper, we discuss experimental investigations of the
model introduced by Jacobs, Jordan, Nowlan and Hinton (1990), in which a number of
simple expert networks compete to solve distinct pieces of a large task; each
expert has the power of a supervised algorithm to allow it to discover clever
task specific internal representations, while an unsupervised competitive
mechanism decomposes the task into easily computable subtasks. The competitive
mechanism is based on the mixture view of competition discussed in Nowlan
(1990) and Nowlan and Hinton (1990), and the entire system may be viewed as an
associative
extension of a model consisting of a mixture of simple probability generators.
The task decomposition and training of individual experts are performed in
parallel, leading to an interesting non-linear interaction between these two
processes. Experiments on a number of simple tasks illustrate resistance to
task interference, the ability to discover the "appropriate" number of
subtasks, and good parallel scaling performance. The system of competing
experts is also compared with an alternative formulation, suggested by the work
of Jacobs, Jordan, and Barto (1990), which allows cooperation rather than
competition among a number of simple expert networks. Results are also
described for a phoneme discrimination task; these reveal the ability of a
system of competing experts to uncover interesting subtask structure in a
complex task.
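
For readers who want a concrete picture of the competitive mechanism, the
sketch below (not code from the report) shows one way the associative mixture
could be written in Python with NumPy. It assumes linear experts, a linear
softmax gating network, and gradient ascent on the Gaussian mixture
log-likelihood of the adaptive-mixtures model; the toy task, sizes, and
learning rate are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_experts, lr = 4, 2, 3, 0.05

# One linear expert per weight matrix, plus a linear softmax gating network.
W_experts = 0.1 * rng.standard_normal((n_experts, n_out, n_in))
W_gate = 0.1 * rng.standard_normal((n_experts, n_in))

# Toy disjunctive task: the target mapping depends on the sign of x[0],
# so a good decomposition assigns each regime to a different expert.
A = rng.standard_normal((n_out, n_in))
B = rng.standard_normal((n_out, n_in))

for step in range(20000):
    x = rng.standard_normal(n_in)
    d = (A if x[0] > 0 else B) @ x

    outputs = W_experts @ x                       # (n_experts, n_out)
    logits = W_gate @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()                                  # gating probabilities

    # Mixture likelihood: each expert is a unit-variance Gaussian
    # centred on its own output, mixed with proportions p.
    sq_err = np.sum((d - outputs) ** 2, axis=1)
    lik = p * np.exp(-0.5 * sq_err)
    h = lik / lik.sum()                           # posterior responsibilities

    # Gradient ascent on log sum_i p_i exp(-0.5 ||d - o_i||^2):
    # each expert moves toward the target in proportion to h_i, so it
    # specialises on the cases it already handles well, while the gate
    # learns to predict which expert will take responsibility.
    W_experts += lr * h[:, None, None] * (d - outputs)[:, :, None] * x[None, None, :]
    W_gate += lr * np.outer(h - p, x)

print("final gating entropy:", float(-(p * np.log(p)).sum()))
print("final negative log-likelihood:", float(-np.log(lik.sum())))

The per-case weighting of each expert's update by its responsibility h_i is
what makes the experts compete; roughly speaking, the cooperative alternative
mentioned above would instead compute the error of the blended output
sum_i p_i o_i and propagate it through all experts at once.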
References:
J. Hampshire and A. Waibel, "The Meta-Pi network: Building distributed
knowledge representations for robust pattern recognition." Technical
Report CMU-CS-89-166, School of Computer Science, Carnegie Mellon University,
1989.
R. A. Jacobs, M. I. Jordan, and A. G. Barto, "Task decomposition through
competition in a modular connectionist architecture." Cognitive Science, 1990.
In Press.
R. A. Jacobs, M. I. Jordan, S. J. Nowlan and G. E. Hinton, "Adaptive mixtures
of local experts." Neural Computation, 1990. In Press.
S. J. Nowlan, "Maximum Likelihood Competitive Learning." In Neural Information
Processing Systems 2, D. Touretzky (ed.), Morgan Kaufmann, 1990.
S. J. Nowlan and G. E. Hinton, "The bootstrap Widrow-Hoff rule as a cluster
formation algorithm." Neural Computation, 2:3, 1990.
------------------------------------------------------------------------------